* fix: remove filesystem tools when no workspace is selected

  - Make workingDir optional on ResolvedAgentConfig
  - Remove resolveSessionDir() fallback that always created a session dir, masking the no-workspace state and keeping filesystem tools available
  - Gate buildFilesystemToolSet() on workingDir being defined
  - Add workspace change detection mid-conversation: rebuilds the agent session when workspace is added, removed, or switched (same pattern as existing MCP server change detection)
  - download_file falls back to tmpdir() when no workspace is set
  - Memory/soul tools are unaffected; they use ~/BrowserOS/ paths

* fix: sanitize message history when session rebuilds with different tools

  When a session is rebuilt due to workspace or MCP changes, the carried-over message history may contain tool parts for tools that no longer exist in the new session. The AI SDK validates messages against the current toolset and rejects parts with no matching schema.

  - Add toolNames getter to AiSdkAgent exposing registered tool names
  - Add sanitizeMessagesForToolset() to strip tool parts referencing removed tools from carried-over messages
  - Apply sanitization in both MCP and workspace session rebuilds

* fix: prepend tool-change context to user message on session rebuild

  When workspace or MCP integrations change mid-conversation, prepend a [Context: ...] block to the user's message explaining what changed. This prevents the LLM from hallucinating tool usage based on patterns in the carried-over conversation history.

  Context messages vary by change type:
  - Workspace removed: lists unavailable filesystem tools, suggests selecting a working directory
  - Workspace added: confirms filesystem tools are available with path
  - Workspace switched: notes the new working directory
  - MCP changed: notes that some integration tools may have changed

  Only fires on the first message after a rebuild. Invisible in the UI.

* fix: make MCP change context specific about which apps were added/removed

  Diff the old and new MCP server keys to produce specific context like:
  - "The following app integrations were disconnected: Gmail, Slack."
  - "The following app integrations were connected: Linear."

  instead of a generic "some tools may no longer be available" message.

* refactor: extract shared rebuildSession helper in ChatService

  Eliminates the duplicated 20-line dispose→create→sanitize→store flow that existed separately in both the MCP and workspace change-detection blocks.

  Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>

* test: add sanitizeMessagesForToolset test suite

  Tests for the message sanitization that runs when a session rebuilds with a different toolset (workspace or MCP change mid-conversation):
  - Preserves messages with no tool parts
  - Preserves tool parts when tool is in the toolset
  - Strips tool parts when tool is NOT in the toolset
  - Strips multiple removed tool parts from same message
  - Keeps browser tools while removing filesystem tools
  - Removes messages that become empty after stripping
  - Preserves non-tool parts (reasoning, step-start, file)
  - Returns same references when no filtering needed
  - Handles empty message array and empty toolset

* style: fix biome formatting in chat-service.ts

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
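The sanitization behavior listed in the commit message can be illustrated with a minimal sketch. This is not the repository's implementation: the `ChatMessage` and `MessagePart` types are simplified stand-ins, and the assumption that tool parts are typed `tool-<toolName>` is mine. Only the function name `sanitizeMessagesForToolset` comes from the commit message.

```typescript
// Simplified stand-ins for the AI SDK message shapes (assumption: tool parts
// are typed "tool-<toolName>"; other parts use types such as "text").
type MessagePart = { type: string; [key: string]: unknown };
type ChatMessage = {
  id: string;
  role: "user" | "assistant" | "system";
  parts: MessagePart[];
};

const TOOL_PREFIX = "tool-";

// True when the part refers to a tool that is absent from the current toolset.
function isRemovedToolPart(part: MessagePart, toolNames: ReadonlySet<string>): boolean {
  if (!part.type.startsWith(TOOL_PREFIX)) return false; // text, reasoning, file, ...
  return !toolNames.has(part.type.slice(TOOL_PREFIX.length));
}

function sanitizeMessagesForToolset(
  messages: ChatMessage[],
  toolNames: ReadonlySet<string>,
): ChatMessage[] {
  const result: ChatMessage[] = [];
  for (const message of messages) {
    const kept = message.parts.filter((part) => !isRemovedToolPart(part, toolNames));
    if (kept.length === message.parts.length) {
      result.push(message); // nothing stripped: keep the same reference
    } else if (kept.length > 0) {
      result.push({ ...message, parts: kept }); // copy with surviving parts only
    }
    // Otherwise the message became empty after stripping and is dropped.
  }
  return result;
}
```

Returning the original object when nothing is stripped keeps reference equality for untouched messages, which matches the "returns same references when no filtering needed" test case named above.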
BrowserOS Server
MCP server and AI agent loop powering BrowserOS browser automation. This is the core backend — it connects to Chromium via CDP, exposes 53+ MCP tools, and runs the AI agent that interprets natural language into browser actions.
Runtime: Bun · Framework: Hono · AI: Vercel AI SDK · License: AGPL-3.0
Architecture
┌──────────────────────────────────────────────────────────────────────┐
│                             MCP Clients                              │
│          (Agent UI, Claude Code, Gemini CLI, browseros-cli)          │
└──────────────────────────────────────────────────────────────────────┘
                                    │
                                    │ HTTP / SSE / StreamableHTTP
                                    ▼
┌──────────────────────────────────────────────────────────────────────┐
│                        BrowserOS Server (Bun)                        │
│                                                                      │
│  /mcp ─────── MCP tool endpoints (53+ tools)                         │
│  /chat ────── Agent streaming (AI SDK)                               │
│  /health ──── Health check                                           │
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                           Agent Loop                           │  │
│  │  ├── Multi-provider AI SDK (OpenAI, Anthropic, Google, ...)    │  │
│  │  ├── Session & conversation management                         │  │
│  │  ├── Context overflow handling + compaction                    │  │
│  │  └── MCP client for external tool servers                      │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌────────────────────┐    ┌──────────────────────────────────────┐  │
│  │  CDP Tools         │    │  Controller Tools                    │  │
│  │  (screenshots,     │    │  (tabs, bookmarks, history,          │  │
│  │   DOM, network,    │    │   navigation, tab groups)            │  │
│  │   console, input)  │    │                                      │  │
│  └────────────────────┘    └──────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
             │                                  │
             │ Chrome DevTools Protocol         │ WebSocket
             ▼                                  ▼
  ┌─────────────────────┐      ┌─────────────────────────────────┐
  │  Chromium CDP       │      │  Controller Extension           │
  │  (port 9000)        │      │  (port 9300)                    │
  │                     │      │                                 │
  │  DOM, network,      │      │  chrome.tabs, chrome.history,   │
  │  input, screenshots │      │  chrome.bookmarks               │
  └─────────────────────┘      └─────────────────────────────────┘
MCP Tools
53+ tools organized by category:
| Category | Tools |
|---|---|
| Navigation | new_page, navigate, go_back, go_forward, reload |
| Input | click, type, press_key, hover, scroll, drag, fill, clear, focus, check, uncheck, select_option, upload_file |
| Observation | take_snapshot, take_enhanced_snapshot, extract_text, extract_links |
| Screenshots | take_screenshot, save_screenshot |
| Evaluation | evaluate_script |
| Pages | list_pages, active_page, close_page, new_hidden_page |
| Windows | window_list, window_create, window_close, window_activate |
| Bookmarks | bookmark_list, bookmark_create, bookmark_remove, bookmark_update, bookmark_move, bookmark_search |
| History | history_search, history_recent, history_delete, history_delete_range |
| Tab Groups | group_list, group_create, group_update, group_ungroup, group_close |
| Filesystem | ls, read, write, edit, find, grep, bash |
| Memory | read_core, update_core, read_soul, update_soul, search_memory, write_memory |
| DOM | dom, dom_search |
| Console | get_console_messages |
| Other | browseros_info, handle_dialog, wait_for, download, export_pdf, output_file, nudges |
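As a rough illustration of how a client might invoke one of these tools, the sketch below builds an MCP `tools/call` request in JSON-RPC 2.0 form. The `buildToolCall` helper is hypothetical, and the `url` argument for `navigate` is an assumption; the actual input schema for each tool is whatever the server reports when tools are listed.

```typescript
// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in the envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumed argument name; the real `navigate` schema may differ.
const navigateRequest = buildToolCall(1, "navigate", { url: "https://example.com" });

// Serialized body a client could send to the /mcp endpoint.
const navigateBody = JSON.stringify(navigateRequest);
```

In practice an MCP client library would handle session initialization and transport details; the sketch only shows the shape of the call itself.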
Agent Loop
The agent loop uses the Vercel AI SDK to orchestrate multi-step browser automation:
- Multi-provider support — OpenAI, Anthropic, Google, Azure, Bedrock, OpenRouter, Ollama, LM Studio, and any OpenAI-compatible endpoint
- Session management — conversations persist in a local SQLite database
- Context overflow handling — automatic message compaction when context windows fill up
- MCP client — connects to external MCP servers for additional tool access (40+ app integrations)
- Tool adapter — bridges MCP tool definitions to AI SDK tool format
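The tool adapter idea from the last bullet can be sketched as a mapping from MCP tool definitions to agent-callable tools. The types here are simplified stand-ins rather than the AI SDK's own, and `adaptMcpTools` and `McpCaller` are hypothetical names; the contents of `src/agent/tool-adapter.ts` are not shown in this README.

```typescript
// Shape of a tool as listed by an MCP server: name, description, JSON Schema input.
interface McpToolDefinition {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// Simplified stand-in for an AI SDK tool (assumption, not the SDK's actual type).
interface AgentTool {
  description: string;
  inputSchema: Record<string, unknown>;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

// Hypothetical callback that forwards a call to the MCP server.
type McpCaller = (name: string, args: Record<string, unknown>) => Promise<unknown>;

function adaptMcpTools(
  definitions: McpToolDefinition[],
  callTool: McpCaller,
): Record<string, AgentTool> {
  const tools: Record<string, AgentTool> = {};
  for (const definition of definitions) {
    tools[definition.name] = {
      description: definition.description ?? "",
      inputSchema: definition.inputSchema,
      // Each adapted tool delegates execution back to the MCP server by name.
      execute: (args) => callTool(definition.name, args),
    };
  }
  return tools;
}
```

Keying the result by tool name gives the agent a single lookup table, regardless of whether a tool is built in or comes from an external MCP server.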
Provider Factory
The provider factory (src/agent/provider-factory.ts) creates AI SDK providers from runtime configuration, supporting hot-swapping between providers without restart.
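One way to picture this is a registry that resolves a provider from the configuration passed on each call, so that a changed configuration takes effect on the next request. The sketch below is illustrative only: `ProviderConfig`, `ModelHandle`, and the `ProviderFactory` class are hypothetical and do not reflect the actual contents of `src/agent/provider-factory.ts`.

```typescript
type ProviderId = "openai" | "anthropic" | "ollama";

// Hypothetical runtime configuration; field names are illustrative.
interface ProviderConfig {
  provider: ProviderId;
  model: string;
  apiKey?: string;
}

// Stand-in for the model object an AI SDK provider would return.
interface ModelHandle {
  provider: ProviderId;
  model: string;
}

type ProviderCreator = (config: ProviderConfig) => ModelHandle;

class ProviderFactory {
  private readonly creators = new Map<ProviderId, ProviderCreator>();

  register(id: ProviderId, creator: ProviderCreator): void {
    this.creators.set(id, creator);
  }

  // Resolved per call, so switching providers needs no restart.
  create(config: ProviderConfig): ModelHandle {
    const creator = this.creators.get(config.provider);
    if (!creator) throw new Error(`Unknown provider: ${config.provider}`);
    return creator(config);
  }
}

const providerFactory = new ProviderFactory();
providerFactory.register("openai", (c) => ({ provider: "openai", model: c.model }));
providerFactory.register("ollama", (c) => ({ provider: "ollama", model: c.model }));
```

In a real setup each creator would presumably wrap the matching AI SDK provider package instead of returning a plain object.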
Skills System
Skills are custom instruction sets that shape agent behavior:
- Catalog (src/skills/catalog.ts): registry of available skills
- Defaults (src/skills/defaults/): built-in skill definitions
- Loader (src/skills/loader.ts): loads skills from local and remote sources
- Remote sync (src/skills/remote-sync.ts): syncs skills from the BrowserOS cloud
Graph Executor (Workflows)
The graph executor (src/graph/executor.ts) runs visual workflow graphs built in the BrowserOS workflow editor. Each node in the graph maps to agent actions, conditionals, or data transformations.
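To make the node types concrete, here is a minimal sketch of walking such a graph. The node schema (`WorkflowNode`, `apply`, `test`, `whenTrue`, `whenFalse`) and the `runGraph` function are assumptions for illustration; the real workflow format and executor are not described in this README.

```typescript
// Hypothetical node shapes: actions and transforms update the data,
// conditionals choose which node runs next.
type WorkflowNode<T> =
  | { id: string; kind: "action" | "transform"; apply: (data: T) => T; next?: string }
  | { id: string; kind: "conditional"; test: (data: T) => boolean; whenTrue?: string; whenFalse?: string };

function runGraph<T>(
  nodes: WorkflowNode<T>[],
  startId: string,
  input: T,
  maxSteps = 1000,
): T {
  const byId = new Map<string, WorkflowNode<T>>();
  for (const node of nodes) byId.set(node.id, node);

  let data = input;
  let currentId: string | undefined = startId;
  let steps = 0;

  while (currentId !== undefined) {
    steps += 1;
    if (steps > maxSteps) throw new Error("Step limit exceeded (possible cycle)");
    const node = byId.get(currentId);
    if (!node) throw new Error(`Unknown node: ${currentId}`);

    if (node.kind === "conditional") {
      // Branch without changing the data.
      currentId = node.test(data) ? node.whenTrue : node.whenFalse;
    } else {
      data = node.apply(data);
      currentId = node.next;
    }
  }
  return data; // reached a node with no successor
}
```

The step limit is a simple guard against cyclic graphs; a production executor would likely also need async node execution and error handling per node.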
Directory Structure
apps/server/
├── src/
│   ├── index.ts                 # Server entry point
│   ├── main.ts                  # Server initialization
│   ├── api/                     # HTTP route handlers
│   ├── agent/                   # Agent loop
│   │   ├── ai-sdk-agent.ts      # Main agent implementation
│   │   ├── provider-factory.ts  # LLM provider factory
│   │   ├── session-store.ts     # Conversation persistence
│   │   ├── compaction.ts        # Context window management
│   │   ├── mcp-builder.ts       # External MCP client setup
│   │   └── tool-adapter.ts      # MCP → AI SDK tool bridge
│   ├── browser/                 # Browser connection layer
│   ├── tools/                   # MCP tool implementations
│   │   ├── navigation.ts
│   │   ├── input.ts
│   │   ├── snapshot.ts
│   │   ├── memory/
│   │   ├── filesystem/
│   │   └── ...
│   ├── skills/                  # Skills system
│   ├── graph/                   # Workflow graph executor
│   ├── lib/                     # Shared utilities
│   └── rpc.ts                   # JSON-RPC type definitions
├── tests/
│   ├── tools/                   # Tool-level tests
│   ├── sdk/                     # SDK integration tests
│   └── server.integration.test.ts
├── graph/                       # Workflow graph definitions
└── package.json
Development
Prerequisites
- Bun runtime
- A running BrowserOS instance (for CDP and controller connections)
Setup
# Copy environment files
cp .env.example .env.development
# Start the server (with hot reload)
bun run start
See the agent monorepo README for the full environment variable reference and process-compose setup.
Testing
bun run test:tools # Tool-level tests
bun run test:integration # Full integration tests (requires running BrowserOS)
bun run test:sdk # SDK integration tests
Building
# Build cross-platform server binaries
bun run build
# Build for specific targets
bun scripts/build/server.ts --target=darwin-arm64,linux-x64
# Build without uploading to R2
bun scripts/build/server.ts --target=all --no-upload
Ports
| Port | Env Variable | Purpose |
|---|---|---|
| 9100 | BROWSEROS_SERVER_PORT | HTTP server (MCP, chat, health) |
| 9000 | BROWSEROS_CDP_PORT | Chromium CDP (server connects as client) |
| 9300 | BROWSEROS_EXTENSION_PORT | WebSocket for controller extension |
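The following sketch shows one way these variables could be read, treating the listed port numbers as defaults (an assumption; the table does not say whether they are defaults or fixed values). `resolvePorts` and `parsePort` are hypothetical helpers, not functions from the server.

```typescript
interface ServerPorts {
  server: number;
  cdp: number;
  extension: number;
}

// Assumed defaults, taken from the ports table.
const DEFAULT_PORTS: ServerPorts = { server: 9100, cdp: 9000, extension: 9300 };

// Parse a port from an environment value, falling back when it is unset.
function parsePort(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

function resolvePorts(env: Record<string, string | undefined>): ServerPorts {
  return {
    server: parsePort(env["BROWSEROS_SERVER_PORT"], DEFAULT_PORTS.server),
    cdp: parsePort(env["BROWSEROS_CDP_PORT"], DEFAULT_PORTS.cdp),
    extension: parsePort(env["BROWSEROS_EXTENSION_PORT"], DEFAULT_PORTS.extension),
  };
}
```

Under Bun, passing `process.env` as the argument would pick up values from the shell or from the `.env.development` file.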