Files
Dani Akash dde403962f fix(server): tighten CORS allowlist for the agent server (#966)
* fix(server): tighten CORS allowlist for the agent server

Replace the permissive `origin || '*'` reflection in
`defaultCorsConfig` with an explicit allowlist composed of:

- a static list (empty by default)
- comma-separated origins from `BROWSEROS_TRUSTED_ORIGINS`

Add a small `requireTrustedOrigin` middleware that actively
rejects (403) any request whose `Origin` header is present and
not in the allowlist. The middleware is permissive when the
`Origin` header is absent — CLI tools, internal Node clients,
and some service-worker fetches legitimately omit it; the
threat model only covers cross-origin browser fetches, which
always carry `Origin` (it's on the Forbidden Header List, so
JS cannot suppress it).

Mount the middleware globally in `createHttpServer` after the
existing `cors()` layer. Document the new env var in
`.env.example`.

Tests cover allowlist parsing (empty, single, multi, trims,
case sensitivity, port match) and middleware behaviour
(missing Origin allowed, allowlisted Origin allowed, unknown
Origin rejected, "null" rejected, port mismatch rejected,
disallowed Origin doesn't reach the handler).

* fix(server): include published extension origin in default allowlist

Pin the published BrowserOS extension origin in the static
allowlist so the default install accepts the legitimate
extension without requiring `BROWSEROS_TRUSTED_ORIGINS` to be
populated. Additional origins (dev / alpha) keep working
through the env override.

* chore(server): trim .env.example comments

* chore(server): drop redundant comments from cors helpers
2026-05-08 11:22:54 +05:30
..
2026-03-17 19:01:10 +05:30

BrowserOS Server

MCP server and AI agent loop powering BrowserOS browser automation. This is the core backend — it connects to Chromium via CDP, exposes 53+ MCP tools, and runs the AI agent that interprets natural language into browser actions.

Runtime: Bun · Framework: Hono · AI: Vercel AI SDK · License: AGPL-3.0

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                         MCP Clients                                  │
│           (Agent UI, Claude Code, Gemini CLI, browseros-cli)         │
└──────────────────────────────────────────────────────────────────────┘
                                │
                                │ HTTP / SSE / StreamableHTTP
                                ▼
┌──────────────────────────────────────────────────────────────────────┐
│                    BrowserOS Server (Bun)                             │
│                                                                      │
│   /mcp ─────── MCP tool endpoints (53+ tools)                       │
│   /chat ────── Agent streaming (AI SDK)                              │
│   /health ─── Health check                                           │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  Agent Loop                                                  │   │
│   │  ├── Multi-provider AI SDK (OpenAI, Anthropic, Google, ...) │   │
│   │  ├── Session & conversation management                       │   │
│   │  ├── Context overflow handling + compaction                  │   │
│   │  └── MCP client for external tool servers                    │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  CDP-backed browser tools                                   │   │
│   │  (tabs, bookmarks, history, navigation, tab groups,         │   │
│   │   screenshots, DOM, network, console, input)                │   │
│   └─────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘
                                │
                                │ Chrome DevTools Protocol
                                ▼
                     ┌─────────────────────┐
                     │   Chromium CDP      │
                     │  (port 9000)        │
                     │                     │
                     │  DOM, network,      │
                     │  input, screenshots │
                     └─────────────────────┘

MCP Tools

53+ tools organized by category:

Category Tools
Navigation new_page, navigate, go_back, go_forward, reload
Input click, type, press_key, hover, scroll, drag, fill, clear, focus, check, uncheck, select_option, upload_file
Observation take_snapshot, take_enhanced_snapshot, extract_text, extract_links
Screenshots take_screenshot, save_screenshot
Evaluation evaluate_script
Pages list_pages, active_page, close_page, new_hidden_page
Windows window_list, window_create, window_close, window_activate
Bookmarks bookmark_list, bookmark_create, bookmark_remove, bookmark_update, bookmark_move, bookmark_search
History history_search, history_recent, history_delete, history_delete_range
Tab Groups group_list, group_create, group_update, group_ungroup, group_close
Filesystem ls, read, write, edit, find, grep, bash
Memory read_core, update_core, read_soul, update_soul, search_memory, write_memory
DOM dom, dom_search
Console get_console_messages
Other browseros_info, handle_dialog, wait_for, download, export_pdf, output_file, nudges

Agent Loop

The agent loop uses the Vercel AI SDK to orchestrate multi-step browser automation:

  • Multi-provider support — OpenAI, Anthropic, Google, Azure, Bedrock, OpenRouter, Ollama, LM Studio, and any OpenAI-compatible endpoint
  • Session management — conversations persist in a local SQLite database
  • Context overflow handling — automatic message compaction when context windows fill up
  • MCP client — connects to external MCP servers for additional tool access (40+ app integrations)
  • Tool adapter — bridges MCP tool definitions to AI SDK tool format

Provider Factory

The provider factory (src/agent/provider-factory.ts) creates AI SDK providers from runtime configuration, supporting hot-swapping between providers without restart.

Skills System

Skills are custom instruction sets that shape agent behavior:

  • Catalog (src/skills/catalog.ts) — registry of available skills
  • Defaults (src/skills/defaults/) — built-in skill definitions
  • Loader (src/skills/loader.ts) — loads skills from local and remote sources
  • Remote sync (src/skills/remote-sync.ts) — syncs skills from the BrowserOS cloud

Directory Structure

apps/server/
├── src/
│   ├── index.ts               # Server entry point
│   ├── main.ts                # Server initialization
│   ├── api/                   # HTTP route handlers
│   ├── agent/                 # Agent loop
│   │   ├── ai-sdk-agent.ts    # Main agent implementation
│   │   ├── provider-factory.ts# LLM provider factory
│   │   ├── session-store.ts   # Conversation persistence
│   │   ├── compaction.ts      # Context window management
│   │   ├── mcp-builder.ts     # External MCP client setup
│   │   └── tool-adapter.ts    # MCP → AI SDK tool bridge
│   ├── browser/               # Browser connection layer
│   ├── tools/                 # MCP tool implementations
│   │   ├── navigation.ts
│   │   ├── input.ts
│   │   ├── snapshot.ts
│   │   ├── memory/
│   │   ├── filesystem/
│   │   └── ...
│   ├── skills/                # Skills system
│   ├── lib/                   # Shared utilities
│   └── rpc.ts                 # JSON-RPC type definitions
├── tests/
│   ├── tools/                 # Tool-level tests
│   ├── sdk/                   # SDK integration tests
│   └── server.integration.test.ts
└── package.json

Development

Prerequisites

  • Bun runtime
  • A running BrowserOS instance (for CDP connectivity)

Setup

# Copy environment files
cp .env.example .env.development

# Start the server (with hot reload)
bun run start

See the agent monorepo README for full environment variable reference and process-compose setup.

Testing

bun run test:tools          # Tool-level tests
bun run test:integration    # Full integration tests (requires running BrowserOS)
bun run test:sdk            # SDK integration tests

Building

# Build cross-platform server binaries
bun run build

# Build for specific targets
bun scripts/build/server.ts --target=darwin-arm64,linux-x64

# Build without uploading to R2
bun scripts/build/server.ts --target=all --no-upload

Ports

Port Env Variable Purpose
9100 BROWSEROS_SERVER_PORT HTTP server (MCP, chat, health)
9000 BROWSEROS_CDP_PORT Chromium CDP (server connects as client)
9300 BROWSEROS_EXTENSION_PORT Legacy BrowserOS launch arg kept for compatibility