* fix: compaction config for small context windows (≤32K)
Raise COMPACTION_SMALL_CONTEXT_WINDOW from 16K to 32K so models like
Haiku 4.5 (30K context) use proportional 50% reserve instead of the
fixed 20K reserve. Also scale fixedOverhead for small contexts (capped
at 40% of context window) to prevent the doom loop where overhead alone
triggers compaction on every step.
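
A minimal sketch of the reserve and overhead rules described above, not the project's actual code. The `computeConfig` signature, the `CompactionConfig` shape, `COMPACTION_FIXED_RESERVE`, the percent constants, and the 12K overhead value are assumptions for illustration.

```typescript
// Constants below are illustrative; only COMPACTION_SMALL_CONTEXT_WINDOW
// and COMPACTION_FIXED_OVERHEAD are names taken from this log.
const COMPACTION_SMALL_CONTEXT_WINDOW = 32_000;
const COMPACTION_FIXED_RESERVE = 20_000; // hypothetical name for the fixed 20K reserve
const COMPACTION_FIXED_OVERHEAD = 12_000; // assumed value
const SMALL_CONTEXT_RESERVE_PERCENT = 50; // proportional reserve for small windows
const MAX_OVERHEAD_PERCENT = 40; // overhead cap for small windows

interface CompactionConfig {
  reserveTokens: number;
  fixedOverhead: number;
}

function computeConfig(contextWindow: number): CompactionConfig {
  const isSmall = contextWindow <= COMPACTION_SMALL_CONTEXT_WINDOW;
  if (!isSmall) {
    return {
      reserveTokens: COMPACTION_FIXED_RESERVE,
      fixedOverhead: COMPACTION_FIXED_OVERHEAD,
    };
  }
  // Integer math avoids floating-point surprises on round window sizes.
  const reserveTokens = Math.floor(
    (contextWindow * SMALL_CONTEXT_RESERVE_PERCENT) / 100,
  );
  const overheadCap = Math.floor((contextWindow * MAX_OVERHEAD_PERCENT) / 100);
  return {
    reserveTokens,
    fixedOverhead: Math.min(COMPACTION_FIXED_OVERHEAD, overheadCap),
  };
}
```

Under these assumptions a 30K window gets a 15K reserve and a 12K overhead, rather than a 20K reserve that would leave only 10K for messages.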
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add compaction tuning guidance to limits constants
Explain the relationship between SMALL_CONTEXT_WINDOW and
FIXED_OVERHEAD so devs know the 24K minimum constraint when
tweaking these values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add window focus listener in ChatFooter that focuses the textarea when
the side panel receives focus. Handles both initial open (via
document.hasFocus check on mount) and re-focus scenarios (via window
focus event). Guards against stealing focus from other interactive
elements.
Companion Chromium fix: side_panel_coordinator.cc now always calls
RequestFocus() in PopulateSidePanel(), not just when there's no
previous entry — ensuring the side panel WebContents receives focus
on every open/toggle.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add 2-stage pruning to compaction pipeline before LLM summarization
Add two new lightweight stages to the compaction prepareStep pipeline that
recover context tokens cheaply before falling back to expensive LLM
summarization:
- Stage 2: Use AI SDK's pruneMessages to remove old tool call/result
pairs beyond the last 6 messages entirely
- Stage 3: Replace remaining tool output values with short placeholders
("[Cleared — N chars]") while preserving tool call structure and IDs
Both stages re-estimate tokens from message content (not stale step
usage) after modifying messages. The existing LLM summarization and
sliding window fallback remain as Stage 4.
Also adds estimateTokensForThreshold() helper, clearToolOutputs()
function, and COMPACTION_PRUNE_KEEP_RECENT_MESSAGES /
COMPACTION_CLEAR_OUTPUT_MIN_CHARS constants.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: reorder compaction pipeline — truncate before clear, protect recent tools
- Stage 0: Check threshold, return untouched when under (no data loss)
- Stage 1: Prune old tool call/result pairs beyond last 6 messages
- Stage 2: Truncate large tool outputs to 15K chars (keeps partial content)
- Stage 3: Clear old tool outputs with placeholders, protect last 2
- Stage 4: LLM-based compaction with sliding window fallback
clearToolOutputs now accepts keepRecentCount parameter (default 2) to
skip the N most recent tool messages from clearing.
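
A rough sketch of how `clearToolOutputs` with `keepRecentCount` might behave. The `Message` type is a simplified stand-in (not the AI SDK message type), and the value of `COMPACTION_CLEAR_OUTPUT_MIN_CHARS` is assumed.

```typescript
// Simplified message shape for illustration only.
type Message =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; toolCallId: string; output: string };

const COMPACTION_CLEAR_OUTPUT_MIN_CHARS = 200; // assumed value

function clearToolOutputs(messages: Message[], keepRecentCount = 2): Message[] {
  // Collect indices of tool messages, oldest first.
  const toolIndices: number[] = [];
  messages.forEach((m, i) => {
    if (m.role === "tool") toolIndices.push(i);
  });

  // The N most recent tool messages are protected from clearing.
  // (slice(-0) would return everything, hence the explicit guard.)
  const protectedIndices = new Set<number>(
    keepRecentCount > 0 ? toolIndices.slice(-keepRecentCount) : [],
  );

  return messages.map((m, i): Message => {
    if (m.role !== "tool" || protectedIndices.has(i)) return m;
    if (m.output.length < COMPACTION_CLEAR_OUTPUT_MIN_CHARS) return m;
    // Keep the tool call ID and structure; replace only the output value.
    return { ...m, output: `[Cleared \u2014 ${m.output.length} chars]` };
  });
}
```

Returning a new array leaves the input untouched, so a caller could still hand the truncated (not cleared) messages to a later stage.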
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: limits fixes
* fix: address review — preserve toKeep context, derive test values from constants
- When Stage 3 (clearToolOutputs) doesn't resolve overflow, pass
truncated (not cleared) messages to Stage 4 so toKeep retains
meaningful tool outputs for the agent's immediate context
- Add comment explaining intentional conservatism in post-prune
token estimation (step usage is stale, must re-estimate safely)
- Refactor computeConfig tests to derive expected values from
AGENT_LIMITS constants instead of hardcoding magic numbers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The system prompt referenced `browser_open_tab` which was renamed to
`new_page`. This caused models to infer a `browser_*` naming convention
and call non-existent tools like `browser_navigate`, resulting in
MCP error -32602.
Fixes TKT-540
* feat: new tools for breadcrumbs
* feat: setup scheduled task card
* feat: added dismiss cooldown
* chore: update prompt
* fix: support api key tool
* fix: prompt text to limit nudges
* fix: scheduled tasks card
* fix: update nudges prompt
* feat: skip nudges when user dismisses nudge
* fix: ensure nudges only show if they are not dismissed
* Revert "fix: ensure nudges only show if they are not dismissed"
This reverts commit d825254698829b8e9941aae7873bd440027d0c74.
* Revert "feat: skip nudges when user dismisses nudge"
This reverts commit 12b552b454d10ec4209b88668fc48681423ff6fc.
* Revert "fix: update nudges prompt"
This reverts commit 80b7520b953b4d3cbed2ed477b9e508e39938dca.
* feat: update agent with mcp when new mcp connection is added
* feat: created connect apps option as a blocking card system
* feat: schedule tasks passive without dismiss
* fix: nudges and prompt texts
* fix: biome lint errors
* fix: review comments
* fix: resolve comments
* fix: review comments
* fix: review comments
* fix: auto resolve state
* fix: eliminate the race where the async delete could resolve after the
new session
* feat: track ignored apps list
* fix: empty response text object on message reply
* feat: sync previously connected mcps
* feat: sync integrations with klavis
* feat: account for unauthenticated connections
* fix: analytics events
* fix: typescript issues
* fix: klavis client issue
* fix: prevent invalid mcps from causing entire responses to fail
* fix: prompt with card for integrations when the integration fails
* fix: prompt structure to support declined apps
* fix: refresh session on mcp changes
* feat: add agent skills system with catalog, loader, and UI
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: return 500 for server errors in PUT/DELETE skill routes
Previously both handlers returned 404 for all errors, masking filesystem
failures (disk full, permission denied) as "not found". Now only
"not found" errors return 404; everything else returns 500.
* fix: align SKILL.md format with agentskills.io spec
- Move `enabled` and `version` into `metadata` field (spec only allows
name, description, license, compatibility, metadata, allowed-tools)
- Frontmatter `name` now matches directory name (lowercase kebab-case)
- Human-readable name stored in `metadata.display-name`
- Add index signature to SkillMetadata for arbitrary string keys
- Validate frontmatter with type guard in getSkill (remove unsafe cast)
- updateSkill now preserves existing frontmatter fields (license, etc.)
- Tighten buildSkillMd param from Record<string, unknown> to SkillFrontmatter
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- truncateToolOutputs: handle all output.type variants (text, json,
content) by checking output.value directly instead of branching on
type. The old code missed type 'content' (array of content parts),
causing 1M+ char tool results to pass through untouched.
- estimateTokens: change chars/4 to chars/3 — HTML/Markdown content
tokenizes at ~3.14 chars/token empirically, not 4.
- COMPACTION_FIXED_OVERHEAD: 5K → 12K to account for system prompt
(~2.5K tokens) + tool definitions as JSON Schema (~8-9K tokens).
- Apply truncateToolOutputs in prepareStep (Stage 0) before token
estimation, not just during summarization.
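
A hedged sketch of the two changes above: truncating based on `output.value` regardless of `output.type`, and estimating tokens at 3 chars per token. The `ToolOutput` union is a simplified assumption about the output shapes, the truncation marker text is made up, and the 15K limit is borrowed from the truncation stage mentioned elsewhere in this log.

```typescript
// Simplified stand-in for the three output variants named above.
type ToolOutput =
  | { type: "text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "content"; value: unknown[] };

const MAX_TOOL_OUTPUT_CHARS = 15_000;

// chars/3 rather than chars/4, per the empirical ~3.14 chars/token figure.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3);
}

function truncateToolOutput(
  output: ToolOutput,
  maxChars: number = MAX_TOOL_OUTPUT_CHARS,
): ToolOutput {
  // Inspect the value itself instead of branching on output.type, so
  // 'content' arrays are measured the same way as text and json.
  const serialized =
    typeof output.value === "string"
      ? output.value
      : JSON.stringify(output.value) ?? "";
  if (serialized.length <= maxChars) return output;

  const dropped = serialized.length - maxChars;
  return {
    type: "text",
    value: `${serialized.slice(0, maxChars)}\n[Truncated ${dropped} chars]`,
  };
}
```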
* fix: robust compaction with Pi-style token counting + overflow middleware
Root cause: getCurrentTokenCount() returned stale inputTokens from the
previous step, ignoring new tool results added to messages since that
step. A large tool output (DOM snapshot, page content) caused a token
jump that bypassed the compaction threshold check, leading to
context_length_exceeded errors (322K tokens sent, model max 262K).
Layer 1 — Accurate token counting (proactive):
- Adopt Pi coding agent's additive approach: base(inputTokens) +
outputTokens + estimate(trailing tool results)
- Trailing tool results are estimated by walking backwards from end of
messages array until a non-tool message is found
- Falls back to full estimation with safety multiplier when no real
usage data is available (first step of a turn)
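
The additive approach in Layer 1 could be sketched roughly as follows. `StepUsage`, `Msg`, and `SAFETY_MULTIPLIER_PERCENT` are hypothetical names, and the 120% multiplier is an assumed value.

```typescript
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const SAFETY_MULTIPLIER_PERCENT = 120; // assumed value

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3);
}

// Walk backwards from the end until a non-tool message is found.
function estimateTrailingToolTokens(messages: Msg[]): number {
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role !== "tool") break;
    total += estimateTokens(messages[i].content);
  }
  return total;
}

function getCurrentTokenCount(messages: Msg[], lastUsage?: StepUsage): number {
  if (!lastUsage) {
    // No real usage yet (first step of a turn): estimate everything,
    // padded by a safety multiplier.
    const full = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
    return Math.ceil((full * SAFETY_MULTIPLIER_PERCENT) / 100);
  }
  // Real usage covers everything sent in the previous step; only tool
  // results appended since then need to be estimated.
  return (
    lastUsage.inputTokens +
    lastUsage.outputTokens +
    estimateTrailingToolTokens(messages)
  );
}
```

The point of the additive form is that a large tool result appended after the last step is counted before the next model call, instead of being discovered as an overflow error.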
Layer 2 — Context overflow middleware (reactive):
- LanguageModelV3Middleware that wraps doGenerate/doStream
- Catches context_length_exceeded errors at the model call level
- Truncates prompt (keeps system messages + most recent non-system
messages targeting 60% of context window)
- Retries the model call once
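
As a sketch of the truncation step only (the middleware wiring is omitted): keep system messages, then add non-system messages from newest to oldest until roughly 60% of the context window is used. `PromptMessage` and the chars/3 estimate are simplifying assumptions, and the final guard reflects the "always keep at least the most recent non-system message" behavior.

```typescript
interface PromptMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 3);
}

function truncatePrompt(
  prompt: PromptMessage[],
  contextWindow: number,
  targetPercent = 60,
): PromptMessage[] {
  const system = prompt.filter((m) => m.role === "system");
  const rest = prompt.filter((m) => m.role !== "system");

  const systemTokens = system.reduce(
    (sum, m) => sum + estimateTokens(m.content),
    0,
  );
  const budget =
    Math.floor((contextWindow * targetPercent) / 100) - systemTokens;

  // Walk from the newest message backwards while the budget allows.
  let used = 0;
  let keepFrom = rest.length;
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > budget) break;
    used += cost;
    keepFrom = i;
  }

  // If even the newest message exceeds the budget, keep it anyway so the
  // retried call is never sent with zero non-system messages.
  if (rest.length > 0 && keepFrom === rest.length) {
    keepFrom = rest.length - 1;
  }

  return [...system, ...rest.slice(keepFrom)];
}
```

This sketch cuts purely by position; a real implementation would likely also need to avoid splitting a tool call from its result.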
Verified end-to-end with real model (Gemini Flash Lite via OpenRouter)
on 16K context window: 4 compactions triggered correctly across 8
steps, no context_length_exceeded errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: adopt Pi-style overflow detection patterns + fix truncation edge case
- Replace 6 generic substring matches with 17 provider-specific regex
patterns from Pi coding agent (Anthropic, OpenAI, Google, xAI, Groq,
OpenRouter, Bedrock, Copilot, llama.cpp, LM Studio, MiniMax, Kimi,
Mistral, z.ai)
- Fix truncatePrompt edge case: when the last message alone exceeds the
target, keepFrom was never updated → empty non-system messages. Now
always keeps at least the most recent non-system message.
- Add runtime guard for LanguageModelV3 cast in ai-sdk-agent.ts
- Add tests for false-positive rejection and truncation edge case
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The Kimi K2.5 model supports a 256,000 token context window, not
128,000. Updated the provider template and model config to reflect
the correct value.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: return element coordinates in tool responses and DPR in screenshots
- click, hover, fill, drag now return resolved coordinates in response text
- take_screenshot returns devicePixelRatio for mapping coordinates to pixels
- Coordinates are in CSS pixels; multiply by DPR to get screenshot pixels
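
For illustration, the CSS-pixel to screenshot-pixel mapping is a single multiplication. The `Point` type and function name here are hypothetical.

```typescript
interface Point {
  x: number;
  y: number;
}

// Convert coordinates reported by click/hover/fill/drag (CSS pixels)
// into pixel positions within a screenshot taken at the given DPR.
function cssToScreenshotPixels(point: Point, devicePixelRatio: number): Point {
  return {
    x: Math.round(point.x * devicePixelRatio),
    y: Math.round(point.y * devicePixelRatio),
  };
}
```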
* fix: use Promise.allSettled in screenshot to prevent DPR eval from aborting capture
Runtime.evaluate for devicePixelRatio can fail on PDF pages or
chrome-extension pages. Using Promise.allSettled ensures the screenshot
still succeeds, falling back to DPR=1.
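
A minimal sketch of the pattern, assuming the capture and the DPR evaluation are two independent promises. `captureWithDpr`, `dprOrDefault`, and the callback parameters are hypothetical names; only `Promise.allSettled` and the DPR=1 fallback come from the description above.

```typescript
// Fall back to 1 when the DPR evaluation was rejected.
function dprOrDefault(result: PromiseSettledResult<number>): number {
  return result.status === "fulfilled" ? result.value : 1;
}

async function captureWithDpr(
  capture: () => Promise<string>,
  evaluateDpr: () => Promise<number>,
): Promise<{ data: string; devicePixelRatio: number }> {
  // allSettled waits for both; a rejected DPR eval no longer aborts capture,
  // as it would with Promise.all.
  const [shot, dpr] = await Promise.allSettled([capture(), evaluateDpr()]);

  // The screenshot itself is still required.
  if (shot.status === "rejected") throw shot.reason;

  return { data: shot.value, devicePixelRatio: dprOrDefault(dpr) };
}
```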
* feat: gate Moonshot AI provider behind VITE_PUBLIC_KIMI_LAUNCH flag
Hide all Moonshot/Kimi provider UI when the launch flag is off:
- Filter moonshot from provider templates and type dropdown
- Gate Kimi flare badges in HubProviderRow
- Gate Kimi auto-insertion in LLM hub storage
- Add analytics events for Kimi API key configuration and guide clicks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: allow editing existing moonshot providers when launch flag is off
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add search provider settings page with 5 engine options
Allow users to select their preferred search engine (Google, DuckDuckGo,
Bing, Brave Search, Yahoo) from a new settings page. The selected provider
drives search suggestions, search URL navigation, placeholder text, and
analytics tracking. Replaces all hardcoded Google references with the
stored preference. Adds Brave Search support, replacing Yandex.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add error handling for search provider storage writes
Write to storage before updating React state so UI never diverges from
persisted value on failure. Add try/catch in the settings page to show
an error toast if the write fails.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: migrate stale 400k context window for browseros provider
Existing installations cached the old 400k default in extension storage.
Always normalize the browseros provider's contextWindow to 200k on load,
matching the current default and preventing compaction from failing.
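
Roughly, the normalize-on-load step could look like this. `ProviderConfig` and the function name are hypothetical; the provider type string and the 200k value come from the description above.

```typescript
interface ProviderConfig {
  type: string;
  contextWindow: number;
}

const BROWSEROS_CONTEXT_WINDOW = 200_000;

// Applied to every provider read from extension storage.
function normalizeProvider(provider: ProviderConfig): ProviderConfig {
  if (provider.type !== "browseros") return provider;
  if (provider.contextWindow === BROWSEROS_CONTEXT_WINDOW) return provider;
  // Overwrite any stale cached value (e.g. the old 400k default).
  return { ...provider, contextWindow: BROWSEROS_CONTEXT_WINDOW };
}
```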
* fix: add browseros-auto model with 200k context length
* fix: setup migrations using the migrations api for context window size
---------
Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>
* fix: anchor agent to active tab page ID from browser context
Generalize the scheduled-task page anchoring instruction to all tasks.
The agent now always uses the page ID from Browser Context instead of
calling get_active_page or list_pages, preventing it from operating
on the wrong tab.
* fix: add chatMode guard and scope windowLine to scheduled tasks
- Skip page-context section in chat mode where list_pages is allowed
- Only show windowId instruction for scheduled tasks (hidden window)
* feat: integrate models.dev registry for auto-populated model defaults
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: fall back to upstream provider for model registry lookup
When the browseros meta-provider is used, the registry lookup now
also tries the upstream provider (e.g., openrouter, anthropic) so
that BrowserOS-hosted models get correct context window and image
support defaults.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add Object.hasOwn guards to prevent prototype chain lookup
Addresses Greptile review: bracket notation on the registry object
could return prototype-chain properties for keys like __proto__ or
constructor, bypassing the 404 guard in the route handler.
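
To illustrate the issue: plain bracket access on an object literal finds inherited properties, so a key like `constructor` looks like a hit. The registry shape and `lookupModel` are hypothetical. The sketch uses `Object.prototype.hasOwnProperty.call`, which performs the same own-property check as `Object.hasOwn` but also compiles on older TypeScript targets.

```typescript
interface ModelDefaults {
  contextWindow: number;
}

type Registry = Record<string, ModelDefaults>;

function lookupModel(registry: Registry, id: string): ModelDefaults | undefined {
  // Only own properties count; inherited ones such as "constructor" or
  // "toString" are treated as missing so the caller can return 404.
  if (!Object.prototype.hasOwnProperty.call(registry, id)) return undefined;
  return registry[id];
}
```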
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add browseros-cli Go CLI for browser automation
Implements a full-featured CLI that communicates with the BrowserOS MCP
server over JSON-RPC 2.0 / StreamableHTTP. Covers all 54 MCP tools across
10 categories with a hybrid command structure (flat verbs for hot-path
commands, grouped noun-verb for resource management).
- MCP client with initialize + tools/call pattern, thread-safe request IDs
- Dual output: human-readable default, --json for structured/piped usage
- Implicit active page resolution with --page override
- 21 command files: open, nav, snap, click, fill, scroll, eval, ss, pdf,
dom, wait, dialog, pages, window, bookmark, history, group, health, info
- Cobra CLI framework with fatih/color for terminal formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test: add end-to-end integration tests for browseros-cli
Go integration tests gated by `//go:build integration` that exercise the
CLI binary against a running BrowserOS server. Tests build the binary,
run commands via exec.Command, and verify JSON output.
Covers: health, version, page lifecycle (open → text → snap → eval →
screenshot → nav → reload → close), active page, info, error handling,
and invalid page ID rejection. Skips gracefully when no server is running.
Run with: go test -tags integration -v ./...
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add init command and fix MCP client bugs
- Add `browseros-cli init` command that prompts for the server URL,
verifies connectivity, and saves to ~/.config/browseros-cli/config.json
- Config priority: --server flag > BROWSEROS_URL env > config file > default
- Fix Accept header: include text/event-stream (required by StreamableHTTPTransport)
- Fix nil args: send empty object {} instead of null for tools with no params
- Update error messages to suggest `browseros-cli init` on connection failure
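
The config priority above amounts to "first non-empty source wins". The CLI itself is written in Go; this TypeScript sketch only illustrates the resolution order, and `ServerUrlSources`, `resolveServerUrl`, and the default URL are hypothetical.

```typescript
interface ServerUrlSources {
  flag?: string; // --server
  env?: string; // BROWSEROS_URL
  configFile?: string; // value saved by `browseros-cli init`
}

const DEFAULT_SERVER_URL = "http://127.0.0.1:9000"; // placeholder, not the real default

function resolveServerUrl(sources: ServerUrlSources): string {
  // Candidates in priority order: flag > env > config file.
  const candidates = [sources.flag, sources.env, sources.configFile];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  return DEFAULT_SERVER_URL;
}
```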
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add README for browseros-cli with setup, usage, and testing guide
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: always send arguments object in MCP tools/call
Go's json omitempty omits empty maps, causing the arguments field to be
missing from tools/call requests. The MCP SDK requires arguments to be
an object (even empty {}), not undefined. Remove omitempty from the tag.
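
The fix itself is in Go, but the same failure mode can be shown in TypeScript, where a field set to `undefined` is dropped during JSON serialization. `buildToolCallParams` is a hypothetical helper.

```typescript
// Always emit an arguments object, even when the tool takes no parameters.
function buildToolCallParams(
  name: string,
  args?: Record<string, unknown>,
): { name: string; arguments: Record<string, unknown> } {
  return { name, arguments: args ?? {} };
}

// Field omitted entirely, which is what the server rejected:
const missing = JSON.stringify({ name: "list_pages", arguments: undefined });
// → {"name":"list_pages"}

// Field present as an empty object:
const present = JSON.stringify(buildToolCallParams("list_pages"));
// → {"name":"list_pages","arguments":{}}
```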
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: update help menu to have groups
* refactor: replace hand-rolled MCP client with official Go SDK
Switch from custom JSON-RPC implementation to the official
github.com/modelcontextprotocol/go-sdk. This removes all hand-rolled
protocol types (jsonrpcRequest, jsonrpcResponse, RPCError, etc.) and
uses the SDK's StreamableClientTransport with DisableStandaloneSSE
for clean CLI process lifecycle.
Also adds URL normalization/validation, config command, and
updates init/README to reference YAML config.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add server-level instructions that get injected into the LLM system
prompt when external MCP clients (Claude Desktop, Cursor, Gemini CLI)
connect. Covers browser automation workflow, Klavis integration
discovery, and auth flow guidance.
* feat: add inline chat experience to new tab page
Bring the full sidepanel chat experience to the new tab page. When
users select an AI suggestion from the search bar, the page transitions
inline to a full chat view instead of opening the sidepanel.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove unnecessary comments from NewTab.tsx
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address PR review comments
- Move NEWTAB_CHAT_STARTED_EVENT tracking to startInlineChat where it
actually fires (was dead code in NewTabChat handleSubmit)
- Add NEWTAB_CHAT_RESET_EVENT tracking to handleNewConversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: gate newtab chat behind NEWTAB_CHAT_SUPPORT feature flag
When the flag is off (BrowserOS < 0.40.0), falls back to opening the
sidepanel via openSidePanelWithSearch (previous behavior). In dev mode
all features are enabled, so inline chat works during development.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add newtab origin context to chat system prompt
When chatting from the new tab page, the AI is instructed to open
content in new tabs rather than navigating the current tab, keeping
the user's new tab page accessible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The AI SDK agent (v2) was allowing all 54 browser tools in chat mode,
while the Gemini agent correctly restricted to 6 read-only tools.
Extract CHAT_MODE_ALLOWED_TOOLS to a shared constant and filter
browser tools in AiSdkAgent.create() when chatMode is true.
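
A sketch of the filtering step, assuming tools are held in a name-keyed record. The three tool names in the set are placeholders; the actual six read-only tools are not listed here, and `filterToolsForMode` is a hypothetical helper.

```typescript
// Placeholder membership; the real constant holds 6 read-only tools.
const CHAT_MODE_ALLOWED_TOOLS: ReadonlySet<string> = new Set<string>([
  "list_pages",
  "get_active_page",
  "take_screenshot",
]);

function filterToolsForMode<T>(
  tools: Record<string, T>,
  chatMode: boolean,
): Record<string, T> {
  if (!chatMode) return tools;
  const allowed: Record<string, T> = {};
  for (const name of Object.keys(tools)) {
    if (CHAT_MODE_ALLOWED_TOOLS.has(name)) allowed[name] = tools[name];
  }
  return allowed;
}
```

Sharing one constant between agents keeps the allow-list from drifting between the two implementations.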
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: expose Klavis MCP tools to external MCP clients
Connect to Klavis Strata at server startup and register discovered tools
on each per-request McpServer instance. This lets external MCP clients
(Claude Code, Gemini CLI) access Klavis-proxied integrations (Gmail,
Slack, GitHub, etc.) alongside browser tools.
- Add register-klavis-mcp.ts with connectKlavisProxy() and registerKlavisTools()
- Wire KlavisProxyHandle through server.ts -> mcp routes -> mcp-server
- Use structured logging and proper type imports
* fix: forward Klavis tool schemas and add shutdown cleanup
- Use zod-from-json-schema to convert Strata's JSON Schema to Zod,
so MCP clients see proper parameter names, types, and required fields
- Close Klavis proxy transport on server shutdown
- Move per-request Klavis tool registration logging to debug level
- Use proper type imports instead of inline import() types
- Fix connectKlavisProxy return type (never returns null)
* fix: add timeout to Klavis MCP connect/listTools and log shutdown errors
* fix: clear timeout timer and pre-compute Klavis tool schemas at startup
* fix: use client.close() instead of transport.close() for proper cleanup
Add SOUL_SUPPORT feature flag to capabilities system requiring
minServerVersion 0.0.67. Hides "Agent Soul" nav item in settings
sidebar for older servers that lack the /soul endpoint.
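
A minimal sketch of version gating, assuming plain dotted numeric versions. `isVersionAtLeast`, `isFeatureEnabled`, and the `FEATURES` table shape are hypothetical; only the flag name and `minServerVersion` value come from the description above.

```typescript
const FEATURES = {
  SOUL_SUPPORT: { minServerVersion: "0.0.67" },
} as const;

// Compare dotted versions numerically, segment by segment.
// Pre-release suffixes are not handled in this sketch.
function isVersionAtLeast(current: string, minimum: string): boolean {
  const a = current.split(".").map(Number);
  const b = minimum.split(".").map(Number);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}

function isFeatureEnabled(
  feature: keyof typeof FEATURES,
  serverVersion: string,
): boolean {
  return isVersionAtLeast(serverVersion, FEATURES[feature].minServerVersion);
}
```

Numeric comparison matters here: a string comparison would rank "0.0.100" below "0.0.67".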
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
- Add `VITE_PUBLIC_KIMI_LAUNCH` feature flag controlling Kimi partnership branding
- BrowserOS provider card shows "Powered by Kimi K2.5 from Moonshot AI" badge and "Extended usage limits for the next 2 weeks!" when flag is on
- Moonshot/Kimi highlighted as "Recommended" in provider templates
- LLM Hub defaults to Kimi, ChatGPT, Claude, Gemini (with legacy defaults migration)
- Kimi hub row shows "Powered by Moonshot AI" flare
- Model selector locked to kimi-k2.5
- "How to get a Kimi API key" link in provider dialog
- Moonshot provider fully integrated across frontend and backend
* fix: refactor SDK BrowserService to use Browser class directly
The tools system was completely rewritten with new tool names and response
formats. BrowserService was calling non-existent MCP tools (browser_get_active_tab,
browser_navigate, etc.) that returned structuredContent which no longer exists.
Replaced MCP HTTP client calls with direct Browser class method calls:
- getActiveTab → browser.getActivePage() / browser.listPages()
- getPageContent → browser.contentAsMarkdown()
- getScreenshot → browser.screenshot()
- navigate → browser.goto() with tabId/windowId resolution
- getPageLoadStatus → browser.listPages() with isLoading check
- getInteractiveElements → browser.snapshot() / browser.enhancedSnapshot()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address PR review — consistent tabId guard and remove dead PageContent type
- Change `if (tabId)` to `if (tabId !== undefined)` in navigate() to match
the guard style used for windowId and elsewhere in the file
- Remove orphaned PageContent interface no longer imported after refactor
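
The difference between the two guards only shows up for falsy IDs such as 0. Whether a tab ID of 0 can occur in practice is not stated here; the sketch below just shows why the explicit check is the safer style. `describeTarget` is a hypothetical helper.

```typescript
function describeTarget(tabId?: number): string {
  // `if (tabId)` would treat 0 as "not provided"; comparing against
  // undefined only skips the branch when the argument is truly absent.
  if (tabId !== undefined) return `tab:${tabId}`;
  return "active";
}
```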
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>