Files
BrowserOS/apps/cli/README.md
Nikhil b7e63a4a1f feat: add browseros-cli Go CLI for browser automation (#421)
* feat: add browseros-cli Go CLI for browser automation

Implements a full-featured CLI that communicates with the BrowserOS MCP
server over JSON-RPC 2.0 / StreamableHTTP. Covers all 54 MCP tools across
10 categories with a hybrid command structure (flat verbs for hot-path
commands, grouped noun-verb for resource management).

- MCP client with initialize + tools/call pattern, thread-safe request IDs
- Dual output: human-readable default, --json for structured/piped usage
- Implicit active page resolution with --page override
- 21 command files: open, nav, snap, click, fill, scroll, eval, ss, pdf,
  dom, wait, dialog, pages, window, bookmark, history, group, health, info
- Cobra CLI framework with fatih/color for terminal formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add end-to-end integration tests for browseros-cli

Go integration tests gated by `//go:build integration` that exercise the
CLI binary against a running BrowserOS server. Tests build the binary,
run commands via exec.Command, and verify JSON output.

Covers: health, version, page lifecycle (open → text → snap → eval →
screenshot → nav → reload → close), active page, info, error handling,
and invalid page ID rejection. Skips gracefully when no server is running.

Run with: go test -tags integration -v ./...

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add init command and fix MCP client bugs

- Add `browseros-cli init` command that prompts for the server URL,
  verifies connectivity, and saves to ~/.config/browseros-cli/config.json
- Config priority: --server flag > BROWSEROS_URL env > config file > default
- Fix Accept header: include text/event-stream (required by StreamableHTTPTransport)
- Fix nil args: send empty object {} instead of null for tools with no params
- Update error messages to suggest `browseros-cli init` on connection failure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add README for browseros-cli with setup, usage, and testing guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: always send arguments object in MCP tools/call

Go's json omitempty omits empty maps, causing the arguments field to be
missing from tools/call requests. The MCP SDK requires arguments to be
an object (even empty {}), not undefined. Remove omitempty from the tag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: update help menu to be have groups

* refactor: replace hand-rolled MCP client with official Go SDK

Switch from custom JSON-RPC implementation to the official
github.com/modelcontextprotocol/go-sdk. This removes all hand-rolled
protocol types (jsonrpcRequest, jsonrpcResponse, RPCError, etc.) and
uses the SDK's StreamableClientTransport with DisableStandaloneSSE
for clean CLI process lifecycle.

Also adds URL normalization/validation, config command, and
updates init/README to reference YAML config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 14:49:45 -08:00

167 lines
5.6 KiB
Markdown

# browseros-cli
Command-line interface for controlling BrowserOS via MCP. Talks to the BrowserOS MCP server over JSON-RPC 2.0 / StreamableHTTP.
## Setup
Requires Go 1.25+.
```bash
# Build
make
# First run — configure server connection
./browseros-cli init
```
The `init` command prompts for your MCP server URL. Find it in:
**BrowserOS → Settings → BrowserOS MCP → Server URL**
The port varies per installation (e.g., `http://127.0.0.1:9004/mcp`).
Config is saved to `~/.config/browseros-cli/config.yaml`.
## Usage
```bash
# Check connection
browseros-cli health
browseros-cli status
# Pages
browseros-cli pages # List all tabs
browseros-cli active # Show active tab
browseros-cli open https://example.com
browseros-cli close 42
# Navigation
browseros-cli nav https://example.com
browseros-cli back
browseros-cli forward
browseros-cli reload
# Observation
browseros-cli snap # Accessibility tree snapshot
browseros-cli snap -e # Enhanced snapshot
browseros-cli text # Extract page as markdown
browseros-cli links # Extract all links
browseros-cli eval "document.title" # Run JavaScript
# Input
browseros-cli click e5 # Click element by ref
browseros-cli click-at 100 200 # Click at coordinates
browseros-cli fill e12 "hello" # Type into input
browseros-cli key Enter # Press key
browseros-cli hover e3
browseros-cli scroll down 500
# Screenshots & export
browseros-cli ss # Screenshot (saves to screenshot.png)
browseros-cli ss -o shot.png # Screenshot to specific file
browseros-cli pdf -o page.pdf # Export as PDF
# Resource management (grouped commands)
browseros-cli window list
browseros-cli bookmark search "github"
browseros-cli history recent
browseros-cli group list
```
## Global Flags
| Flag | Env Var | Description |
|------|---------|-------------|
| `--server, -s` | `BROWSEROS_URL` | Server URL (default: from config) |
| `--page, -p` | `BROWSEROS_PAGE` | Target page ID (default: active page) |
| `--json` | `BOS_JSON=1` | JSON output (outputs structuredContent) |
| `--debug` | `BOS_DEBUG=1` | Debug output |
| `--timeout, -t` | | Request timeout (default: 2m) |
Priority for server URL: `--server` flag > `BROWSEROS_URL` env > config file
If no server URL is configured, the CLI exits with setup instructions instead of assuming a localhost port.
## Testing
Integration tests require a running BrowserOS server with the dev build (for structured content support).
```bash
# 1. Start the dev server from the monorepo root
bun run dev:watch:new
# 2. Configure the CLI to point at the dev server
./browseros-cli init
# Enter the Server URL shown in BrowserOS settings
# 3. Run integration tests
make test
# Or with a custom server URL
BROWSEROS_URL=http://127.0.0.1:9105 go test -tags integration -v ./...
```
Tests skip gracefully if no server is reachable — they won't fail in environments without BrowserOS.
The integration tests (`integration_test.go`) cover:
- Health check and version
- Page lifecycle: open → text → snap → eval → screenshot → nav → reload → close
- Active page query
- Info command
- Error handling (invalid page ID, JS errors)
## Build
```bash
make # Build binary
make vet # Run go vet
make test # Run integration tests
make install # Install to $GOPATH/bin
make clean # Remove binary
VERSION=1.0 make # Build with version
```
## Architecture
```
apps/cli/
├── main.go # Entry point
├── Makefile # Build targets
├── config/
│ └── config.go # Config file (~/.config/browseros-cli/config.yaml)
├── cmd/
│ ├── root.go # Root command, global flags
│ ├── init.go # Server URL configuration
│ ├── open.go # open (new_page / new_hidden_page)
│ ├── nav.go # nav, back, forward, reload
│ ├── pages.go # pages, active, close
│ ├── snap.go # snap (take_snapshot / take_enhanced_snapshot)
│ ├── text.go # text, links
│ ├── screenshot.go # ss (take_screenshot / save_screenshot)
│ ├── eval.go # eval (evaluate_script)
│ ├── click.go # click, click-at
│ ├── fill.go # fill, clear, key
│ ├── interact.go # hover, focus, check, uncheck, select, drag, upload
│ ├── scroll.go # scroll
│ ├── dialog.go # dialog (handle_dialog)
│ ├── wait.go # wait (wait_for)
│ ├── file_actions.go # pdf, download
│ ├── dom.go # dom, dom-search
│ ├── window.go # window {list,create,close,activate}
│ ├── bookmark.go # bookmark {list,create,remove,update,move,search}
│ ├── history.go # history {search,recent,delete,delete-range}
│ ├── group.go # group {list,create,update,ungroup,close}
│ ├── health.go # health, status (REST endpoints)
│ └── info.go # info (browseros_info)
├── mcp/
│ ├── client.go # MCP JSON-RPC 2.0 client (initialize + tools/call)
│ └── types.go # JSON-RPC and MCP type definitions
└── output/
└── printer.go # Human-readable and JSON output formatting
```
The CLI communicates with BrowserOS via two HTTP POST requests per command:
1. `initialize` — MCP handshake
2. `tools/call` — execute the actual tool
All 54 MCP tools are mapped to CLI commands.