mirror of https://github.com/browseros-ai/BrowserOS.git synced 2026-05-13 15:46:22 +00:00

Files

Dani Akash dde403962f fix(server): tighten CORS allowlist for the agent server (#966)

* fix(server): tighten CORS allowlist for the agent server

Replace the permissive `origin || '*'` reflection in
`defaultCorsConfig` with an explicit allowlist composed of:

- a static list (empty by default)
- comma-separated origins from `BROWSEROS_TRUSTED_ORIGINS`

Add a small `requireTrustedOrigin` middleware that actively
rejects (403) any request whose `Origin` header is present and
not in the allowlist. The middleware is permissive when the
`Origin` header is absent — CLI tools, internal Node clients,
and some service-worker fetches legitimately omit it; the
threat model only covers cross-origin browser fetches, which
always carry `Origin` (it is a forbidden header name in the Fetch
specification, so page JavaScript cannot set or remove it).

Mount the middleware globally in `createHttpServer` after the
existing `cors()` layer. Document the new env var in
`.env.example`.

Tests cover allowlist parsing (empty, single, multi, trims,
case sensitivity, port match) and middleware behaviour
(missing Origin allowed, allowlisted Origin allowed, unknown
Origin rejected, "null" rejected, port mismatch rejected,
disallowed Origin doesn't reach the handler).

* fix(server): include published extension origin in default allowlist

Pin the published BrowserOS extension origin in the static
allowlist so the default install accepts the legitimate
extension without requiring `BROWSEROS_TRUSTED_ORIGINS` to be
populated. Additional origins (dev / alpha) keep working
through the env override.

* chore(server): trim .env.example comments

* chore(server): drop redundant comments from cors helpers
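
The allowlist behaviour described in the commit message above can be sketched without any framework. This is a minimal illustration only: `parseTrustedOrigins`, `checkOrigin`, and the example origins are hypothetical names, and the repository's actual helpers (`defaultCorsConfig`, `requireTrustedOrigin`) are mounted as server middleware and may be structured differently.

```typescript
// Hypothetical helper: merge a static list with comma-separated origins
// (for example the value of BROWSEROS_TRUSTED_ORIGINS) into one allowlist.
function parseTrustedOrigins(
  raw: string | undefined,
  staticList: string[] = [],
): Set<string> {
  const fromEnv = (raw ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  return new Set([...staticList, ...fromEnv]);
}

// Hypothetical check: returns the status a middleware would produce.
// 200 means "let the request through", 403 means "reject".
function checkOrigin(
  origin: string | undefined,
  allowlist: Set<string>,
): 200 | 403 {
  // No Origin header: CLI tools and internal clients legitimately omit it.
  if (origin === undefined) return 200;
  // Exact, case-sensitive match, so a different port or the literal
  // string "null" is rejected.
  return allowlist.has(origin) ? 200 : 403;
}
```

Exact string matching keeps the check easy to audit: an origin is either listed verbatim or refused, with no wildcard or prefix logic to reason about.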

2026-05-08 11:22:54 +05:30

graph

Add 'packages/browseros-agent/' from commit '90bd4be3008285bf3825aad3702aff98f872671a'

2026-03-13 21:22:09 +05:30

src

fix(server): tighten CORS allowlist for the agent server (#966)

2026-05-08 11:22:54 +05:30

tests

fix(server): tighten CORS allowlist for the agent server (#966)

2026-05-08 11:22:54 +05:30

.env.example

fix(server): tighten CORS allowlist for the agent server (#966)

2026-05-08 11:22:54 +05:30

.env.production.example

Add 'packages/browseros-agent/' from commit '90bd4be3008285bf3825aad3702aff98f872671a'

2026-03-13 21:22:09 +05:30

.gitignore

feat: improved system prompt (#466)

2026-03-17 19:01:10 +05:30

package.json

chore: bump server and extension version (#659)

2026-04-08 10:18:24 -07:00

README.md

chore(agent): remove workflows feature (#656)

2026-04-08 08:42:22 +05:30

tsconfig.json

Add 'packages/browseros-agent/' from commit '90bd4be3008285bf3825aad3702aff98f872671a'

2026-03-13 21:22:09 +05:30

README.md

BrowserOS Server

MCP server and AI agent loop powering BrowserOS browser automation. This is the core backend: it connects to Chromium via CDP, exposes 53+ MCP tools, and runs the AI agent that translates natural-language instructions into browser actions.

Runtime: Bun · Framework: Hono · AI: Vercel AI SDK · License: AGPL-3.0

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                         MCP Clients                                  │
│           (Agent UI, Claude Code, Gemini CLI, browseros-cli)         │
└──────────────────────────────────────────────────────────────────────┘
                                │
                                │ HTTP / SSE / StreamableHTTP
                                ▼
┌──────────────────────────────────────────────────────────────────────┐
│                    BrowserOS Server (Bun)                             │
│                                                                      │
│   /mcp ─────── MCP tool endpoints (53+ tools)                       │
│   /chat ────── Agent streaming (AI SDK)                              │
│   /health ─── Health check                                           │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  Agent Loop                                                  │   │
│   │  ├── Multi-provider AI SDK (OpenAI, Anthropic, Google, ...) │   │
│   │  ├── Session & conversation management                       │   │
│   │  ├── Context overflow handling + compaction                  │   │
│   │  └── MCP client for external tool servers                    │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  CDP-backed browser tools                                   │   │
│   │  (tabs, bookmarks, history, navigation, tab groups,         │   │
│   │   screenshots, DOM, network, console, input)                │   │
│   └─────────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────────┘
                                │
                                │ Chrome DevTools Protocol
                                ▼
                     ┌─────────────────────┐
                     │   Chromium CDP      │
                     │  (port 9000)        │
                     │                     │
                     │  DOM, network,      │
                     │  input, screenshots │
                     └─────────────────────┘

MCP Tools

53+ tools organized by category:

| Category | Tools |
| --- | --- |
| Navigation | `new_page`, `navigate`, `go_back`, `go_forward`, `reload` |
| Input | `click`, `type`, `press_key`, `hover`, `scroll`, `drag`, `fill`, `clear`, `focus`, `check`, `uncheck`, `select_option`, `upload_file` |
| Observation | `take_snapshot`, `take_enhanced_snapshot`, `extract_text`, `extract_links` |
| Screenshots | `take_screenshot`, `save_screenshot` |
| Evaluation | `evaluate_script` |
| Pages | `list_pages`, `active_page`, `close_page`, `new_hidden_page` |
| Windows | `window_list`, `window_create`, `window_close`, `window_activate` |
| Bookmarks | `bookmark_list`, `bookmark_create`, `bookmark_remove`, `bookmark_update`, `bookmark_move`, `bookmark_search` |
| History | `history_search`, `history_recent`, `history_delete`, `history_delete_range` |
| Tab Groups | `group_list`, `group_create`, `group_update`, `group_ungroup`, `group_close` |
| Filesystem | `ls`, `read`, `write`, `edit`, `find`, `grep`, `bash` |
| Memory | `read_core`, `update_core`, `read_soul`, `update_soul`, `search_memory`, `write_memory` |
| DOM | `dom`, `dom_search` |
| Console | `get_console_messages` |
| Other | `browseros_info`, `handle_dialog`, `wait_for`, `download`, `export_pdf`, `output_file`, `nudges` |
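
MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a request body; `buildToolCall` is a hypothetical helper, and the `url` argument for `navigate` is an assumption, since each tool's real input schema is what the server reports through `tools/list`.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wrap a tool name and its arguments in a request body.
function buildToolCall(
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The `url` argument name is an assumption made for illustration.
const request = buildToolCall("navigate", { url: "https://example.com" });
console.log(JSON.stringify(request, null, 2));
```

A body like this would be sent to the `/mcp` endpoint. A real MCP client also performs the protocol's `initialize` handshake first, so using an existing MCP client library is usually simpler than hand-rolling requests.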

Agent Loop

The agent loop uses the Vercel AI SDK to orchestrate multi-step browser automation:

- Multi-provider support: OpenAI, Anthropic, Google, Azure, Bedrock, OpenRouter, Ollama, LM Studio, and any OpenAI-compatible endpoint
- Session management: conversations persist in a local SQLite database
- Context overflow handling: automatic message compaction when context windows fill up
- MCP client: connects to external MCP servers for additional tool access (40+ app integrations)
- Tool adapter: bridges MCP tool definitions to AI SDK tool format
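
To make the context overflow item concrete, here is a minimal sketch of one possible compaction strategy: a drop-oldest sliding window that always keeps system messages. It is not taken from `compaction.ts`, which may summarise messages or use a real tokenizer; the `Message` type, the 4-characters-per-token estimate, and the function names are all assumptions.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Crude estimate (about 4 characters per token), used only for illustration.
function estimateTokens(message: Message): number {
  return Math.ceil(message.content.length / 4);
}

// Keep every system message, then keep the newest other messages that
// still fit inside the token budget. Older messages are dropped.
function compact(messages: Message[], budget: number): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  let used = system.reduce((sum, m) => sum + estimateTokens(m), 0);
  const kept: Message[] = [];

  // Walk from newest to oldest and stop at the first message that does not fit.
  for (const message of [...rest].reverse()) {
    const cost = estimateTokens(message);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(message); // restore chronological order
  }
  return [...system, ...kept];
}
```

Stopping at the first message that does not fit, rather than skipping it and continuing, keeps the retained history contiguous, which tends to matter for multi-step tool conversations.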

Provider Factory

The provider factory (`src/agent/provider-factory.ts`) creates AI SDK providers from runtime configuration, supporting hot-swapping between providers without restart.
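
The factory pattern described here can be sketched as a lookup from provider name to a builder function. Everything below is hypothetical: `ProviderConfig`, `LanguageModelHandle`, and `createModel` are stand-ins, and the real module returns AI SDK provider objects rather than the plain records used here.

```typescript
// Hypothetical runtime configuration, for example sent with each chat request.
interface ProviderConfig {
  provider: string;
  model: string;
  apiKey?: string;
  baseUrl?: string;
}

// Stand-in for an AI SDK model object.
interface LanguageModelHandle {
  id: string;
}

type ProviderBuilder = (config: ProviderConfig) => LanguageModelHandle;

// One builder per supported provider name.
const builders = new Map<string, ProviderBuilder>([
  ["openai", (c) => ({ id: `openai:${c.model}` })],
  ["anthropic", (c) => ({ id: `anthropic:${c.model}` })],
  ["ollama", (c) => ({ id: `ollama:${c.model}@${c.baseUrl ?? "http://localhost:11434"}` })],
]);

// Build a model from configuration at call time. Because nothing is cached
// at startup, a changed configuration takes effect on the next call.
function createModel(config: ProviderConfig): LanguageModelHandle {
  const build = builders.get(config.provider);
  if (!build) {
    throw new Error(`Unsupported provider: ${config.provider}`);
  }
  return build(config);
}
```

Resolving the provider per call instead of once at startup is one simple way to get the hot-swapping behaviour without a restart.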

Skills System

Skills are custom instruction sets that shape agent behavior:

- Catalog (`src/skills/catalog.ts`): registry of available skills
- Defaults (`src/skills/defaults/`): built-in skill definitions
- Loader (`src/skills/loader.ts`): loads skills from local and remote sources
- Remote sync (`src/skills/remote-sync.ts`): syncs skills from the BrowserOS cloud

Directory Structure

apps/server/
├── src/
│   ├── index.ts               # Server entry point
│   ├── main.ts                # Server initialization
│   ├── api/                   # HTTP route handlers
│   ├── agent/                 # Agent loop
│   │   ├── ai-sdk-agent.ts    # Main agent implementation
│   │   ├── provider-factory.ts# LLM provider factory
│   │   ├── session-store.ts   # Conversation persistence
│   │   ├── compaction.ts      # Context window management
│   │   ├── mcp-builder.ts     # External MCP client setup
│   │   └── tool-adapter.ts    # MCP → AI SDK tool bridge
│   ├── browser/               # Browser connection layer
│   ├── tools/                 # MCP tool implementations
│   │   ├── navigation.ts
│   │   ├── input.ts
│   │   ├── snapshot.ts
│   │   ├── memory/
│   │   ├── filesystem/
│   │   └── ...
│   ├── skills/                # Skills system
│   ├── lib/                   # Shared utilities
│   └── rpc.ts                 # JSON-RPC type definitions
├── tests/
│   ├── tools/                 # Tool-level tests
│   ├── sdk/                   # SDK integration tests
│   └── server.integration.test.ts
└── package.json

Development

Prerequisites

- Bun runtime
- A running BrowserOS instance (for CDP connectivity)

Setup

# Copy environment files
cp .env.example .env.development

# Start the server (with hot reload)
bun run start

See the agent monorepo README for the full environment variable reference and the process-compose setup.

Testing

bun run test:tools          # Tool-level tests
bun run test:integration    # Full integration tests (requires running BrowserOS)
bun run test:sdk            # SDK integration tests

Building

# Build cross-platform server binaries
bun run build

# Build for specific targets
bun scripts/build/server.ts --target=darwin-arm64,linux-x64

# Build without uploading to R2
bun scripts/build/server.ts --target=all --no-upload

Ports

| Port | Env Variable | Purpose |
| --- | --- | --- |
| 9100 | `BROWSEROS_SERVER_PORT` | HTTP server (MCP, chat, health) |
| 9000 | `BROWSEROS_CDP_PORT` | Chromium CDP (server connects as client) |
| 9300 | `BROWSEROS_EXTENSION_PORT` | Legacy BrowserOS launch arg kept for compatibility |
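
As an illustration of how these variables might be consumed, the sketch below reads the three ports from an environment object and validates them. Treating the table's port numbers as fallback defaults is an assumption, and `readPort` and `resolvePorts` are hypothetical names rather than functions from this repository.

```typescript
interface Ports {
  server: number;
  cdp: number;
  extension: number;
}

// Parse one port value, falling back when the variable is unset or blank.
function readPort(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${value}`);
  }
  return port;
}

// Fallback values mirror the table above (assumed to be the defaults).
function resolvePorts(env: Record<string, string | undefined>): Ports {
  return {
    server: readPort(env.BROWSEROS_SERVER_PORT, 9100),
    cdp: readPort(env.BROWSEROS_CDP_PORT, 9000),
    extension: readPort(env.BROWSEROS_EXTENSION_PORT, 9300),
  };
}
```

Under Bun or Node this could be called as `resolvePorts(process.env)`. Failing fast on a malformed value avoids the server silently binding to an unexpected port.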