BrowserOS Server

MCP server and AI agent loop powering BrowserOS browser automation. This is the core backend — it connects to Chromium via CDP, exposes 53+ MCP tools, and runs the AI agent that interprets natural language into browser actions.

Runtime: Bun · Framework: Hono · AI: Vercel AI SDK · License: AGPL-3.0

Architecture

┌──────────────────────────────────────────────────────────────────────┐
│                         MCP Clients                                  │
│           (Agent UI, Claude Code, Gemini CLI, browseros-cli)         │
└──────────────────────────────────────────────────────────────────────┘
                                │
                                │ HTTP / SSE / StreamableHTTP
                                ▼
┌──────────────────────────────────────────────────────────────────────┐
│                    BrowserOS Server (Bun)                             │
│                                                                      │
│   /mcp ─────── MCP tool endpoints (53+ tools)                       │
│   /chat ────── Agent streaming (AI SDK)                              │
│   /health ─── Health check                                           │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │  Agent Loop                                                  │   │
│   │  ├── Multi-provider AI SDK (OpenAI, Anthropic, Google, ...) │   │
│   │  ├── Session & conversation management                       │   │
│   │  ├── Context overflow handling + compaction                  │   │
│   │  └── MCP client for external tool servers                    │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│   ┌────────────────────┐    ┌────────────────────────────────────┐  │
│   │  CDP Tools          │    │  Controller Tools                  │  │
│   │  (screenshots,      │    │  (tabs, bookmarks, history,        │  │
│   │   DOM, network,     │    │   navigation, tab groups)          │  │
│   │   console, input)   │    │                                    │  │
│   └────────────────────┘    └────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
          │                                         │
          │ Chrome DevTools Protocol                │ WebSocket
          ▼                                         ▼
┌─────────────────────┐              ┌─────────────────────────────────┐
│   Chromium CDP       │              │   Controller Extension          │
│  (port 9000)         │              │  (port 9300)                    │
│                      │              │                                 │
│  DOM, network,       │              │  chrome.tabs, chrome.history,   │
│  input, screenshots  │              │  chrome.bookmarks               │
└─────────────────────┘              └─────────────────────────────────┘
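The diagram shows three HTTP routes on the server. As a minimal sketch of how a client might address them, the helpers below build the route URLs and probe /health. They assume the server is reachable on localhost at port 9100 (the value listed under Ports) and that /health answers with a 2xx status when the server is up; the response body is not specified in this README, so the sketch ignores it. The names serverEndpoints and isHealthy are hypothetical and do not come from the BrowserOS codebase.

```typescript
// Hypothetical helpers for illustration; not part of the BrowserOS codebase.

type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Build the three route URLs shown in the diagram for a given host and port.
function serverEndpoints(port = 9100, host = "localhost") {
  const base = `http://${host}:${port}`;
  return { mcp: `${base}/mcp`, chat: `${base}/chat`, health: `${base}/health` };
}

// Return true if /health answers with a 2xx status, false on any other
// status or on a network error. `fetchImpl` is injectable so the function
// can be exercised without a running server.
async function isHealthy(
  port = 9100,
  fetchImpl: FetchLike = (url) => fetch(url),
): Promise<boolean> {
  try {
    const res = await fetchImpl(serverEndpoints(port).health);
    return res.ok;
  } catch {
    return false;
  }
}
```

Injecting the fetch function keeps the probe testable offline; in normal use, isHealthy() with no arguments falls back to the global fetch.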

MCP Tools

53+ tools organized by category:

Category     Tools
Navigation   new_page, navigate, go_back, go_forward, reload
Input        click, type, press_key, hover, scroll, drag, fill, clear, focus, check, uncheck, select_option, upload_file
Observation  take_snapshot, take_enhanced_snapshot, extract_text, extract_links
Screenshots  take_screenshot, save_screenshot
Evaluation   evaluate_script
Pages        list_pages, active_page, close_page, new_hidden_page
Windows      window_list, window_create, window_close, window_activate
Bookmarks    bookmark_list, bookmark_create, bookmark_remove, bookmark_update, bookmark_move, bookmark_search
History      history_search, history_recent, history_delete, history_delete_range
Tab Groups   group_list, group_create, group_update, group_ungroup, group_close
Filesystem   ls, read, write, edit, find, grep, bash
Memory       read_core, update_core, read_soul, update_soul, search_memory, write_memory
DOM          dom, dom_search
Console      get_console_messages
Other        browseros_info, handle_dialog, wait_for, download, export_pdf, output_file, nudges
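MCP clients invoke these tools with a JSON-RPC 2.0 request whose method is tools/call. The sketch below only builds that request body. It assumes the navigate tool takes a url argument, which this README does not confirm, so check the tool's input schema before relying on it. The helper buildToolCall is a hypothetical name. A real client would also complete the MCP initialize handshake first, which an MCP client library normally handles.

```typescript
// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Monotonic request id so responses can be matched to requests.
let nextId = 1;

// Hypothetical helper: wrap a tool name and its arguments in the envelope.
function buildToolCall(
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: `navigate` accepts a `url` argument.
const request = buildToolCall("navigate", { url: "https://example.com" });
```

The resulting object can be serialized with JSON.stringify and sent as the body of a POST to the /mcp route.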

Agent Loop

The agent loop uses the Vercel AI SDK to orchestrate multi-step browser automation:

  • Multi-provider support — OpenAI, Anthropic, Google, Azure, Bedrock, OpenRouter, Ollama, LM Studio, and any OpenAI-compatible endpoint
  • Session management — conversations persist in a local SQLite database
  • Context overflow handling — automatic message compaction when context windows fill up
  • MCP client — connects to external MCP servers for additional tool access (40+ app integrations)
  • Tool adapter — bridges MCP tool definitions to AI SDK tool format
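To make the context overflow bullet concrete, here is one possible compaction strategy as a rough sketch. It is not the code in src/agent/compaction.ts: the Message shape, the four-characters-per-token estimate, and the choice to fold older messages into a single placeholder are all assumptions made for illustration.

```typescript
// Hypothetical sketch of context compaction; the real logic may differ.

interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// ASSUMPTION: roughly 4 characters per token. Real tokenizers differ by model.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Keep system messages and the most recent `keepRecent` messages verbatim,
// and replace everything in between with one placeholder summary message.
function compact(messages: Message[], maxTokens: number, keepRecent = 4): Message[] {
  if (estimateTokens(messages) <= maxTokens) return messages;

  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  if (rest.length <= keepRecent) return messages; // nothing old enough to fold

  const older = rest.slice(0, rest.length - keepRecent);
  const recent = rest.slice(rest.length - keepRecent);

  // A real implementation would likely ask the model to summarize `older`;
  // this sketch only records how many messages were folded away.
  const summary: Message = {
    role: "assistant",
    content: `[summary of ${older.length} earlier messages]`,
  };
  return [...system, summary, ...recent];
}
```

A single pass like this does not guarantee the result fits the budget; if the recent messages alone exceed maxTokens, further trimming would be needed.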

Provider Factory

The provider factory (src/agent/provider-factory.ts) creates AI SDK providers from runtime configuration, so providers can be swapped without restarting the server.
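The idea of building a provider from configuration at call time can be sketched as a small registry. This is an illustration only: the real factory returns Vercel AI SDK providers, whereas the ProviderConfig and ModelHandle types and the registerProvider and createModel functions below are hypothetical stand-ins. The model names in the usage note are example values.

```typescript
// Hypothetical sketch; not the code in src/agent/provider-factory.ts.

interface ProviderConfig {
  provider: string; // e.g. "openai", "anthropic", "ollama"
  model: string;
  apiKey?: string;
  baseUrl?: string;
}

// Stand-in for the model object an AI SDK provider would return.
interface ModelHandle {
  id: string; // "<provider>:<model>"
  baseUrl?: string;
}

type Builder = (config: ProviderConfig) => ModelHandle;

const builders = new Map<string, Builder>();

function registerProvider(name: string, builder: Builder): void {
  builders.set(name, builder);
}

// Resolve a model from the config passed in at call time. Nothing is cached
// across calls, so a later call with a different config switches providers
// without a restart.
function createModel(config: ProviderConfig): ModelHandle {
  const builder = builders.get(config.provider);
  if (!builder) throw new Error(`Unknown provider: ${config.provider}`);
  return builder(config);
}

registerProvider("openai", (c) => ({ id: `openai:${c.model}`, baseUrl: c.baseUrl }));
registerProvider("ollama", (c) => ({
  id: `ollama:${c.model}`,
  baseUrl: c.baseUrl ?? "http://localhost:11434", // Ollama's default local port
}));
```

Calling createModel({ provider: "ollama", model: "llama3" }) and then createModel({ provider: "openai", model: "gpt-4o" }) yields two independent handles, which is the property that makes per-request provider switching possible.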

Skills System

Skills are custom instruction sets that shape agent behavior:

  • Catalog (src/skills/catalog.ts) — registry of available skills
  • Defaults (src/skills/defaults/) — built-in skill definitions
  • Loader (src/skills/loader.ts) — loads skills from local and remote sources
  • Remote sync (src/skills/remote-sync.ts) — syncs skills from the BrowserOS cloud
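As a rough illustration of how a catalog and loader could fit together, the sketch below merges skills from several sources and appends the selected ones to a system prompt. The Skill shape, the rule that later sources override earlier ones, and the function names are all assumptions; the actual definitions and precedence rules live under src/skills/ and are not documented here.

```typescript
// Hypothetical shapes; the real definitions live under src/skills/.

interface Skill {
  id: string;
  name: string;
  instructions: string;
}

// Merge skills from several sources into one catalog keyed by id.
// ASSUMPTION: later sources override earlier ones with the same id.
function buildCatalog(...sources: Skill[][]): Map<string, Skill> {
  const catalog = new Map<string, Skill>();
  for (const source of sources) {
    for (const skill of source) catalog.set(skill.id, skill);
  }
  return catalog;
}

// Append the selected skills' instructions to a base system prompt.
// Unknown ids are skipped silently.
function applySkills(
  basePrompt: string,
  catalog: Map<string, Skill>,
  ids: string[],
): string {
  const sections = ids
    .map((id) => catalog.get(id))
    .filter((s): s is Skill => s !== undefined)
    .map((s) => `## ${s.name}\n${s.instructions}`);
  return [basePrompt, ...sections].join("\n\n");
}
```

With this shape, passing the built-in defaults first and the remotely synced skills last would let a remote update replace a built-in skill of the same id.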

Graph Executor (Workflows)

The graph executor (src/graph/executor.ts) runs visual workflow graphs built in the BrowserOS workflow editor. Each node in the graph maps to an agent action, a conditional, or a data transformation.
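A graph of this kind can be walked with a simple loop. The sketch below is not the code in src/graph/executor.ts: the node shapes, field names, and the runGraph function are hypothetical, and nodes run synchronously for brevity even though real browser actions would presumably be asynchronous.

```typescript
// Hypothetical node shapes for illustration only.

type Data = Record<string, unknown>;

type GraphNode =
  | { kind: "action"; run: (data: Data) => Data; next?: string }
  | { kind: "transform"; map: (data: Data) => Data; next?: string }
  | { kind: "conditional"; test: (data: Data) => boolean; ifTrue?: string; ifFalse?: string };

// Walk the graph from `startId`, threading `data` through each node, and
// return the final data plus the ids of the nodes visited in order.
function runGraph(
  nodes: Record<string, GraphNode>,
  startId: string,
  initial: Data = {},
  maxSteps = 1000, // guard against cycles that never terminate
): { data: Data; path: string[] } {
  let data = initial;
  let current: string | undefined = startId;
  const path: string[] = [];

  while (current !== undefined) {
    if (path.length >= maxSteps) throw new Error("Step limit exceeded");
    const node: GraphNode | undefined = nodes[current];
    if (!node) throw new Error(`Unknown node: ${current}`);
    path.push(current);

    switch (node.kind) {
      case "action":
        data = node.run(data);
        current = node.next;
        break;
      case "transform":
        data = node.map(data);
        current = node.next;
        break;
      case "conditional":
        current = node.test(data) ? node.ifTrue : node.ifFalse;
        break;
    }
  }
  return { data, path };
}
```

Execution stops when a node has no successor, and the step limit keeps an accidental loop in a user-built graph from running forever.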

Directory Structure

apps/server/
├── src/
│   ├── index.ts               # Server entry point
│   ├── main.ts                # Server initialization
│   ├── api/                   # HTTP route handlers
│   ├── agent/                 # Agent loop
│   │   ├── ai-sdk-agent.ts    # Main agent implementation
│   │   ├── provider-factory.ts # LLM provider factory
│   │   ├── session-store.ts   # Conversation persistence
│   │   ├── compaction.ts      # Context window management
│   │   ├── mcp-builder.ts     # External MCP client setup
│   │   └── tool-adapter.ts    # MCP → AI SDK tool bridge
│   ├── browser/               # Browser connection layer
│   ├── tools/                 # MCP tool implementations
│   │   ├── navigation.ts
│   │   ├── input.ts
│   │   ├── snapshot.ts
│   │   ├── memory/
│   │   ├── filesystem/
│   │   └── ...
│   ├── skills/                # Skills system
│   ├── graph/                 # Workflow graph executor
│   ├── lib/                   # Shared utilities
│   └── rpc.ts                 # JSON-RPC type definitions
├── tests/
│   ├── tools/                 # Tool-level tests
│   ├── sdk/                   # SDK integration tests
│   └── server.integration.test.ts
├── graph/                     # Workflow graph definitions
└── package.json

Development

Prerequisites

  • Bun runtime
  • A running BrowserOS instance (for CDP and controller connections)

Setup

# Copy the example environment file
cp .env.example .env.development

# Start the server (with hot reload)
bun run start

See the agent monorepo README for the full environment variable reference and the process-compose setup.

Testing

bun run test:tools          # Tool-level tests
bun run test:integration    # Full integration tests (requires running BrowserOS)
bun run test:sdk            # SDK integration tests

Building

# Build cross-platform server binaries
bun run build

# Build for specific targets
bun scripts/build/server.ts --target=darwin-arm64,linux-x64

# Build without uploading to R2
bun scripts/build/server.ts --target=all --no-upload

Ports

Port  Env Variable              Purpose
9100  BROWSEROS_SERVER_PORT     HTTP server (MCP, chat, health)
9000  BROWSEROS_CDP_PORT        Chromium CDP (server connects as client)
9300  BROWSEROS_EXTENSION_PORT  WebSocket for controller extension
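A script that talks to these services could resolve the ports from the environment along the following lines. The variable names come from the table above. Treating the listed port numbers as the fallback when a variable is unset is an assumption, as are the validation rules and the resolvePort helper itself.

```typescript
// ASSUMPTION: the ports in the table are the defaults when a variable is unset.
const PORT_DEFAULTS = {
  BROWSEROS_SERVER_PORT: 9100,
  BROWSEROS_CDP_PORT: 9000,
  BROWSEROS_EXTENSION_PORT: 9300,
} as const;

type PortVar = keyof typeof PORT_DEFAULTS;

// Read one port from an environment map, falling back to the table value.
// Throws if the variable is set but is not a valid TCP port number.
function resolvePort(
  name: PortVar,
  env: Record<string, string | undefined>,
): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return PORT_DEFAULTS[name];
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be an integer between 1 and 65535, got "${raw}"`);
  }
  return port;
}
```

In a Bun or Node script this would typically be called as resolvePort("BROWSEROS_SERVER_PORT", process.env).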