Dani Akash b3003542d8 docs: overhaul READMEs across all major packages (#594)
2026-03-27 11:59:04 +05:30

browseros-cli

License: AGPL v3

Command-line interface for controlling BrowserOS — launch and automate the browser from the terminal or from AI coding agents like Claude Code and Gemini CLI.

The CLI communicates with the BrowserOS MCP server using JSON-RPC 2.0 over Streamable HTTP. All 53+ MCP tools are mapped to CLI commands.

Install

macOS / Linux

curl -fsSL https://cdn.browseros.com/cli/install.sh | bash

Windows

irm https://cdn.browseros.com/cli/install.ps1 | iex

Build from Source

Requires Go 1.25+.

make            # Build binary
make install    # Install to $GOPATH/bin

Setup

# First run — configure server connection
browseros-cli init

The init command prompts for your MCP server URL. You can find it under BrowserOS → Settings → BrowserOS MCP → Server URL.

The port varies per installation (e.g., http://127.0.0.1:9004/mcp).

Config is saved to ~/.config/browseros-cli/config.yaml.

Usage

# Check connection
browseros-cli health
browseros-cli status

# Pages
browseros-cli pages                 # List all tabs
browseros-cli active                # Show active tab
browseros-cli open https://example.com
browseros-cli close 42

# Navigation
browseros-cli nav https://example.com
browseros-cli back
browseros-cli forward
browseros-cli reload

# Observation
browseros-cli snap                  # Accessibility tree snapshot
browseros-cli snap -e               # Enhanced snapshot
browseros-cli text                  # Extract page as markdown
browseros-cli links                 # Extract all links
browseros-cli eval "document.title" # Run JavaScript

# Input
browseros-cli click e5              # Click element by ref
browseros-cli click-at 100 200      # Click at coordinates
browseros-cli fill e12 "hello"      # Type into input
browseros-cli key Enter             # Press key
browseros-cli hover e3
browseros-cli scroll down 500

# Screenshots & export
browseros-cli ss                    # Screenshot (saves to screenshot.png)
browseros-cli ss -o shot.png        # Screenshot to specific file
browseros-cli pdf -o page.pdf       # Export as PDF

# Resource management (grouped commands)
browseros-cli window list
browseros-cli bookmark search "github"
browseros-cli history recent
browseros-cli group list

Use as MCP Server

BrowserOS exposes an MCP server that AI coding agents can connect to directly. The CLI is the easiest way to verify the connection and interact with tools from the terminal.

To connect Claude Code, Gemini CLI, or any MCP client, see the MCP setup guide.

Global Flags

| Flag          | Env Var        | Description                             |
|---------------|----------------|-----------------------------------------|
| --server, -s  | BROWSEROS_URL  | Server URL (default: from config)       |
| --page, -p    | BROWSEROS_PAGE | Target page ID (default: active page)   |
| --json        | BOS_JSON=1     | JSON output (outputs structuredContent) |
| --debug       | BOS_DEBUG=1    | Debug output                            |
| --timeout, -t |                | Request timeout (default: 2m)           |

Priority for server URL: --server flag > BROWSEROS_URL env > config file

If no server URL is configured, the CLI exits with setup instructions instead of assuming a localhost port.
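
The precedence rule above can be sketched as a first-non-empty lookup. This is a minimal illustration, not the code in cmd/root.go: the function name resolveServerURL, the error text, and the sample config value are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoServerURL mirrors the documented behavior of exiting with setup
// instructions rather than guessing a localhost port. The wording is illustrative.
var errNoServerURL = errors.New("no server URL configured: run `browseros-cli init`")

// resolveServerURL (hypothetical name) applies the documented priority:
// --server flag, then BROWSEROS_URL, then the config file value.
func resolveServerURL(flagValue, envValue, configValue string) (string, error) {
	for _, candidate := range []string{flagValue, envValue, configValue} {
		if candidate != "" {
			return candidate, nil
		}
	}
	return "", errNoServerURL
}

func main() {
	// The flag is left empty here; the config value is a sample from this README.
	url, err := resolveServerURL("", os.Getenv("BROWSEROS_URL"), "http://127.0.0.1:9004/mcp")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(url)
}
```

Returning an error when all three sources are empty keeps the caller in charge of printing setup instructions.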

Testing

Integration tests require a running BrowserOS server with the dev build (for structured content support).

# 1. Start the dev server from the monorepo root
bun run dev:watch:new

# 2. Configure the CLI to point at the dev server
./browseros-cli init
# Enter the Server URL shown in BrowserOS settings

# 3. Run integration tests
make test

# Or with a custom server URL
BROWSEROS_URL=http://127.0.0.1:9105 go test -tags integration -v ./...

Tests skip gracefully if no server is reachable — they won't fail in environments without BrowserOS.

The integration tests (integration_test.go) cover:

  • Health check and version
  • Page lifecycle: open → text → snap → eval → screenshot → nav → reload → close
  • Active page query
  • Info command
  • Error handling (invalid page ID, JS errors)

Build

make                    # Build binary
make vet                # Run go vet
make test               # Run integration tests
make install            # Install to $GOPATH/bin
make clean              # Remove binary
VERSION=1.0 make        # Build with version

Architecture

apps/cli/
├── main.go             # Entry point
├── Makefile            # Build targets
├── config/
│   └── config.go       # Config file (~/.config/browseros-cli/config.yaml)
├── cmd/
│   ├── root.go         # Root command, global flags
│   ├── init.go         # Server URL configuration
│   ├── open.go         # open (new_page / new_hidden_page)
│   ├── nav.go          # nav, back, forward, reload
│   ├── pages.go        # pages, active, close
│   ├── snap.go         # snap (take_snapshot / take_enhanced_snapshot)
│   ├── text.go         # text, links
│   ├── screenshot.go   # ss (take_screenshot / save_screenshot)
│   ├── eval.go         # eval (evaluate_script)
│   ├── click.go        # click, click-at
│   ├── fill.go         # fill, clear, key
│   ├── interact.go     # hover, focus, check, uncheck, select, drag, upload
│   ├── scroll.go       # scroll
│   ├── dialog.go       # dialog (handle_dialog)
│   ├── wait.go         # wait (wait_for)
│   ├── file_actions.go # pdf, download
│   ├── dom.go          # dom, dom-search
│   ├── window.go       # window {list,create,close,activate}
│   ├── bookmark.go     # bookmark {list,create,remove,update,move,search}
│   ├── history.go      # history {search,recent,delete,delete-range}
│   ├── group.go        # group {list,create,update,ungroup,close}
│   ├── health.go       # health, status (REST endpoints)
│   └── info.go         # info (browseros_info)
├── mcp/
│   ├── client.go       # MCP JSON-RPC 2.0 client (initialize + tools/call)
│   └── types.go        # JSON-RPC and MCP type definitions
└── output/
    └── printer.go      # Human-readable and JSON output formatting

For each tool command, the CLI communicates with BrowserOS via two HTTP POST requests (health and status use REST endpoints instead):

  1. initialize — MCP handshake
  2. tools/call — execute the actual tool
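
The two request bodies can be sketched as JSON-RPC 2.0 envelopes following the public MCP specification. The type and function names, the client name and version, and the protocol version string are assumptions for illustration and are not taken from mcp/client.go or mcp/types.go; the tool name take_snapshot comes from the file tree above, and its arguments are left empty because they are not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a generic JSON-RPC 2.0 request envelope (hypothetical type).
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// newInitialize builds the MCP handshake request. The protocolVersion and
// clientInfo values are placeholders, not what the real CLI sends.
func newInitialize(id int) rpcRequest {
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "initialize",
		Params: map[string]any{
			"protocolVersion": "2025-03-26",
			"capabilities":    map[string]any{},
			"clientInfo":      map[string]any{"name": "example-client", "version": "0.0.0"},
		},
	}
}

// newToolCall builds a tools/call request for the named tool.
func newToolCall(id int, tool string, args map[string]any) rpcRequest {
	if args == nil {
		args = map[string]any{}
	}
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  map[string]any{"name": tool, "arguments": args},
	}
}

func main() {
	// Print both bodies in the order they would be POSTed.
	for _, req := range []rpcRequest{newInitialize(1), newToolCall(2, "take_snapshot", nil)} {
		body, err := json.Marshal(req)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

Each body would be sent as the payload of one POST to the configured server URL.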