Compare commits

...

65 Commits

Author SHA1 Message Date
Nikhil Sonti
d63fa91a3b chore: pin release.linux.yaml to arm64-only for sysroot bootstrap test
x64 already builds cleanly — the failing leg is arm64 cross-compile from
an x64 host. Pin the config to arm64 to exercise the new
install-sysroot.py path in configure without burning time on x64.
Flip back to [x64, arm64] once arm64 is green.
2026-04-06 18:14:19 -07:00
Nikhil Sonti
c0889577ba fix: install linux sysroot in configure, not via gclient hook
`gn gen` was failing on the arm64 leg with `Missing sysroot
(//build/linux/debian_bullseye_arm64-sysroot)`. The previous design
relied on `git_setup` writing `target_cpus` to `.gclient` so that
`gclient sync`'s DEPS hook would download the cross-arch sysroot. That
chain breaks for any chromium_src that was synced before cross-arch
support landed (the hook is gated on .gclient state at sync time) and
for partial pipeline runs that skip git_setup entirely. Nothing in
configure declared or verified its sysroot precondition.

Make configure self-healing: on Linux, invoke
`build/linux/sysroot_scripts/install-sysroot.py --arch=<target>`
directly before `gn gen`. install-sysroot.py is idempotent (stamp file
+ SHA check), fast when already installed, and decoupled from .gclient
— it's exactly what the failing assertion's error message recommends.
The script accepts our arch names directly: `x64` translates to `amd64`
internally via ARCH_TRANSLATIONS, and `arm64` is a valid pass-through.

Also temporarily pin release.linux.yaml to x64 only while we validate
the sysroot bootstrap end-to-end. Flip back to `[x64, arm64]` once
arm64 is green.
2026-04-06 18:14:19 -07:00
Nikhil
8de2bf984f feat: build linux x64 + arm64 in a single invocation (#652)
`release.linux.yaml` now declares `architecture: [x64, arm64]` and the
runner loops the entire pipeline once per architecture. depot_tools
fetches both Linux sysroots automatically — `git_setup` idempotently
ensures `target_cpus = ['x64', 'arm64']` is in `.gclient` before
`gclient sync`, so cross-compiling arm64 from an x64 host just works.

The resolver returns `List[Context]` (single-element for the common
single-arch case), and `build/cli/build.py` loops `execute_pipeline` over
the per-arch contexts. Modules stay 100% arch-agnostic — no new
orchestration module, no new YAML schema beyond the list form.

Also fix a cross-compile bug in `build/modules/package/linux.py`: the
appimagetool binary must match the BUILD machine's arch (it executes
locally), not the target arch. Split into a host-keyed
`LINUX_HOST_APPIMAGETOOL` lookup vs the existing target-keyed
`LINUX_ARCHITECTURE_CONFIG`. Target arch is still passed to appimagetool
via the `ARCH` env var.

- build/common/resolver.py: scalar OR list `architecture` -> List[Context]
- build/cli/build.py: loop pipeline per arch, log multi-arch headers
- build/config/release.linux.yaml: `architecture: [x64, arm64]`
- build/modules/setup/git.py: idempotent `target_cpus` edit on Linux
- build/modules/package/linux.py: host vs target appimagetool split
- build/modules/package/linux_test.py: cover the host/target split
2026-04-06 13:08:06 -07:00
Nikhil
1b8720740c feat: add linux arm64 release support (#651)
* feat: support linux arm64 release artifacts

* fix: address PR review comments for 0406-linux_arm64_support
2026-04-06 10:20:38 -07:00
Nikhil
91be726381 refactor: remove --compile-only flag, consolidate into --ci (#646)
The --compile-only and --ci flags served overlapping purposes for CI
builds. Remove --compile-only entirely since --ci already handles the
CI use case (skip R2, skip prod env validation, local zip packaging)
and --no-upload covers the upload-skipping use case for full builds.
2026-04-03 14:58:52 -07:00
Nikhil
ff5386a24a fix: agent storage issue on update (#643)
* fix: agent storage erase issue fix

* fix: remove the guard against remote
2026-04-03 14:50:14 -07:00
Nikhil
a5f3c4da65 fix: skip windows exe patching in ci mode to avoid wine dependency (#645)
The server release CI workflow fails on ubuntu-latest because
patch-windows-exe.ts requires Wine to run rcedit. Thread the existing
--ci flag through compileServerBinaries so Windows PE metadata patching
is skipped in CI mode with a warning log.
2026-04-03 14:46:33 -07:00
Nikhil
e5a852dd3d chore: update server version (#644) 2026-04-03 14:29:07 -07:00
Felarof
aee30ce8e1 Update README.md (#638) 2026-04-02 13:00:11 -07:00
Nikhil
0833c8d42d fix: windows app-data location fix (#637) 2026-04-02 08:53:04 -07:00
Nikhil
036c7f280b fix: tab-grouping cdp crash (#635)
* fix: tab group crash + history fix

* fix: tab group crash + history fix
2026-04-01 15:06:41 -07:00
Nikhil
000429277d fix: isolate server release packaging to ci mode (#629)
* fix: relax compile-only release env requirements

* refactor: add ci mode for server release builds
2026-03-31 20:57:44 -07:00
Nikhil
f8535fd96d fix: exclude eval framework from language stats via gitattributes (#630) 2026-03-31 20:44:06 -07:00
Nikhil
f0cbf77924 feat: add server release workflow (#627)
* feat: add server release workflow

* fix: address PR review comments for 0331-add_server_release_workflow

* refactor: rework 0331-add_server_release_workflow based on feedback

* refactor: rework 0331-add_server_release_workflow based on feedback
2026-03-31 17:37:06 -07:00
Nikhil
17be06eb2f fix: report release cli version correctly (#626) 2026-03-31 16:17:57 -07:00
Nikhil
0e90785500 fix: accept port-only input in CLI init command (#625)
Users can now run `browseros-cli init 9000` in addition to the full URL.
Updated default example port from 9004 to 9000.
2026-03-31 16:16:30 -07:00
Nikhil
2bb432b0f2 feat: use hidden pages for scheduled tasks (#624)
* feat: use hidden pages for scheduled tasks

* refactor: rework 0331-use_hidden_pages_for_scheduled_tasks based on feedback
2026-03-31 16:02:47 -07:00
shivammittal274
565ce18eba feat: add npm/npx distribution for BrowserOS CLI (#618)
* feat(cli): skip self-update prompts for package manager installs

Checks BROWSEROS_INSTALL_METHOD env var (npm, brew) and skips automatic
update checks. Users should use their package manager's update mechanism.
FormatNotice now shows the appropriate upgrade command based on install method.

* feat(cli): add npm bin wrapper for browseros-cli

* feat(cli): add npm postinstall script to download platform binary

Downloads the correct platform binary from GitHub releases during npm
install, verifies SHA256 checksums, and extracts to .binary directory.

* feat(cli): add npm package metadata and README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move npm package files to correct monorepo path

The bin wrapper and postinstall were created at apps/cli/npm/ instead of
packages/browseros-agent/apps/cli/npm/. Moves them to the correct location.

* style: use node: protocol for builtin module imports

* feat(cli): add Makefile npm targets and release workflow npm publish step

Adds npm-version and npm-publish Makefile targets for version sync.
Adds Node.js setup and npm publish step to the release workflow.
Adds npm/npx install instructions to release notes template.

* fix(cli): fail on missing checksum entry and limit redirect depth

- Abort if checksums.txt downloaded but archive entry is missing
- Warn if checksums.txt itself failed to download
- Cap redirect depth at 5 to prevent stack overflow on circular redirects

* fix(cli): match install.sh checksum behavior — warn instead of abort

The existing shell installer (install.sh) warns and continues when the
checksum entry is missing from checksums.txt. Match that behavior in the
npm postinstall to avoid unnecessary install failures. Both files come
from the same GitHub release, so the checksum is a corruption check,
not a strong security boundary.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 22:30:58 +05:30
shivammittal274
81350c0d7f feat: replace model picker with shadcn Combobox + fuse.js fuzzy search (#617)
The model picker in NewProviderDialog rendered inline, causing dialog
resizing and lacked keyboard navigation. Replace it with a Popover +
Command (shadcn Combobox) pattern and add fuse.js for fuzzy search.

- Replace custom ModelPickerList with Popover + Command dropdown
- Add fuse.js for fuzzy model search (replaces string.includes)
- Add MODEL_SELECTED_EVENT and AI_PROVIDER_UPDATED_EVENT analytics
- Enrich PROVIDER_SELECTED_EVENT with model_id in chat sessions
2026-03-30 16:38:21 +05:30
Nikhil
9bdb2413ec feat: clean-up - remove obsolete controller extension (#610)
* refactor(server): remove obsolete controller extension backend

* fix: address review feedback for PR #610
2026-03-27 17:01:04 -07:00
Nikhil
ace9307878 feat: add browseros-cli self-updater (#605)
* feat: add browseros-cli self-updater

* fix: address review comments for 0327-cli_self_updater

* fix: address PR review comments for 0327-cli_self_updater

* fix: replace goreleaser with Makefile-based release build

Remove .goreleaser.yml (required Pro license for monorepo field) and
consolidate cross-compilation into `make release`. CI now uses the same
Makefile target, fixing a bug where POSTHOG_API_KEY was missing from
release ldflags.

* fix: address critical self-updater bugs from code review

- Fix SHA256 checksum mismatch: verify archive checksum before extraction
  instead of verifying extracted binary against archive hash (was always
  failing). Add VerifyChecksum() and integration test.
- Fix JSON field name mismatch: TypeScript was emitting camelCase
  (publishedAt, archiveFormat) but Go expected snake_case
  (published_at, archive_format). Manifest parsing was silently broken.
- Add decompression size limit (256 MB) to prevent zip/gzip bombs.
- Don't update LastCheckedAt on transient errors so retry happens on
  next CLI invocation instead of waiting 24h.
2026-03-27 14:52:54 -07:00
Nikhil
83a25ad301 fix: make SDK navigation tolerate unfocused startup tabs (#607) 2026-03-27 14:34:36 -07:00
github-actions[bot]
4b191a759c docs: update agent extension changelog for v0.0.98 (#609)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-27 14:34:02 -07:00
Nikhil
d02b3f74e6 chore: update agent version (#608) 2026-03-27 13:58:42 -07:00
Nikhil
86c62f14a5 chore: fix version number for extension (#606) 2026-03-27 13:18:10 -07:00
Nikhil
42c3e8fe01 fix: standardize release names to "BrowserOS <Product> - vX.Y.Z" format (#604)
Update workflow release titles for Extension, Agent SDK, and CLI to use
consistent branding. Existing GitHub releases also renamed via gh CLI.
2026-03-27 13:17:56 -07:00
Nikhil
517750e880 feat: add PostHog to CLI (#603)
* feat: add PostHog usage analytics to CLI

Add anonymous command-level analytics to browseros-cli using the PostHog
Go SDK. Tracks which commands are executed, their success/failure status,
and duration — no PII or person profiles.

- New analytics package with Init/Track/Close singleton
- Distinct ID resolves from server's browseros_id (server.json), falls
  back to CLI-generated UUID (~/.config/browseros-cli/install_id)
- API key injected at build time via ldflags (dev builds = silent no-op)
- Server now writes browseros_id into server.json for cross-surface
  identity correlation

* fix: address PR review feedback for #603

- Return "unknown" for unrecognized args in commandName to avoid
  sending arbitrary user input to PostHog
- Revert goreleaser to {{ .Env.POSTHOG_API_KEY }} (intentional hard
  fail — release builds must have the key set)
- go mod tidy to fix posthog-go direct/indirect marker
- Add POSTHOG_API_KEY to .env.production.example
2026-03-27 12:05:34 -07:00
Nikhil
6c053a5f29 feat: upload CLI binaries to CDN and gate release to core team (#602)
* feat: upload CLI binaries to CDN during release and gate workflow to core team

- Extend scripts/build/cli/upload.ts with uploadCliRelease() that pushes
  archives + checksums to R2 under versioned (cli/v{VERSION}/) and latest
  (cli/latest/) paths, plus a version.txt for lightweight latest resolution
- Update scripts/build/cli.ts entry point with --release/--version/--binaries-dir
  flags (existing no-args behavior preserved for upload:cli-installers)
- Rewrite install.sh and install.ps1 to fetch from cdn.browseros.com instead of
  GitHub releases API — eliminates rate limits and API dependency
- Add environment: release-core to release-cli.yml for core-team gating via
  GitHub environment protection rules
- Add Bun setup + CDN upload step to the workflow between build and GitHub release

* fix: address review feedback for PR #602

- Make loadProdEnv return empty map when .env.production is absent so
  pickEnv falls through to process.env in CI (Greptile P1)
- Add semver format validation for version string in install.sh and
  install.ps1 to guard against malformed CDN responses
- Pass inputs.version via env var instead of inline ${{ }} interpolation
  to prevent command injection in workflow shell
2026-03-27 11:47:31 -07:00
Nikhil
1c5ffdf878 fix: harden cli installer bootstrap (#601)
* fix: harden cli installer bootstrap

* refactor: rework 0327-harden_cli_installers based on feedback
2026-03-27 11:24:16 -07:00
Nikhil
39a7d49c25 feat: add workspace-centric bdev cli (#585)
* fix: clean-up bdev

* feat: add workspace-centric bdev cli

* fix: address review comments for 0326-bdev_cli_redesign

* fix: address review feedback for PR #585

* fix: address review feedback for PR #585
2026-03-27 08:48:23 -07:00
shivammittal274
ed948f4b59 Feat/cli launch ready v2 (#600)
* test: temporarily allow release workflow on any branch

* fix(cli): restore main-only guard, remove goreleaser dependency

Replaces GoReleaser (Pro-only monorepo feature) with plain go build.
Tested: RC release created successfully on branch with all 6 binaries.

* fix(cli): fix hdiutil mount detection, update README with install/launch/init flow
2026-03-27 20:20:17 +05:30
shivammittal274
aad5bc16fd Feat/cli launch ready v2 (#599)
* test: temporarily allow release workflow on any branch

* fix(cli): restore main-only guard, remove goreleaser dependency

Replaces GoReleaser (Pro-only monorepo feature) with plain go build.
Tested: RC release created successfully on branch with all 6 binaries.

* fix(cli): remove -quiet from hdiutil so mount point is detected
2026-03-27 20:17:13 +05:30
Dani Akash
cee318a40b fix: improve chat history freshness and reduce query payload (#598)
* fix: add refresh indicator to chat history when fetching latest conversations

Show a non-blocking "Fetching latest conversations" indicator at the top
of the history list while the cached data is being refreshed. Users can
still interact with the cached conversation list during the refresh.

* perf: reduce chat history query payload — fetch last 2 messages instead of 5

The conversation list only displays the last user message as a preview.
Fetching 5 messages per conversation was wasteful — each message contains
the full UIMessage object (tool calls, reasoning, etc.) multiplied by
50 conversations per page. Reduced to last 2 which is sufficient to
find the last user message in a user→assistant exchange.

* perf: use first+DESC instead of last+ASC to push LIMIT down to SQL

PostGraphile's `last: N` doesn't map to SQL LIMIT — it uses a padded
LIMIT 10 and slices in application code. Changing to `first: 2` with
ORDER_INDEX_DESC generates a true SQL LIMIT 2, reducing rows scanned
from 500 to 100 per page (50 conversations × 2 vs 10 messages each).

No UX impact — extractLastUserMessage() filters by role regardless
of message order.

* chore: update react query packages

* feat: replace localforage with idb-keyval
2026-03-27 19:49:47 +05:30
Dani Akash
febaf58f91 fix: guard filesystem tools behind workspace selection and handle mid-conversation changes (#595)
* fix: remove filesystem tools when no workspace is selected

- Make workingDir optional on ResolvedAgentConfig
- Remove resolveSessionDir() fallback that always created a session dir,
  masking the no-workspace state and keeping filesystem tools available
- Gate buildFilesystemToolSet() on workingDir being defined
- Add workspace change detection mid-conversation — rebuilds the agent
  session when workspace is added, removed, or switched (same pattern
  as existing MCP server change detection)
- download_file falls back to tmpdir() when no workspace is set
- Memory/soul tools are unaffected — they use ~/BrowserOS/ paths

* fix: sanitize message history when session rebuilds with different tools

When a session is rebuilt due to workspace or MCP changes, the carried-over
message history may contain tool parts for tools that no longer exist in
the new session. The AI SDK validates messages against the current toolset
and rejects parts with no matching schema.

- Add toolNames getter to AiSdkAgent exposing registered tool names
- Add sanitizeMessagesForToolset() to strip tool parts referencing
  removed tools from carried-over messages
- Apply sanitization in both MCP and workspace session rebuilds

* fix: prepend tool-change context to user message on session rebuild

When workspace or MCP integrations change mid-conversation, prepend a
[Context: ...] block to the user's message explaining what changed.
This prevents the LLM from hallucinating tool usage based on patterns
in the carried-over conversation history.

Context messages vary by change type:
- Workspace removed: lists unavailable filesystem tools, suggests
  selecting a working directory
- Workspace added: confirms filesystem tools are available with path
- Workspace switched: notes the new working directory
- MCP changed: notes that some integration tools may have changed

Only fires on the first message after a rebuild. Invisible in the UI.

* fix: make MCP change context specific about which apps were added/removed

Diff the old and new MCP server keys to produce specific context like:
- "The following app integrations were disconnected: Gmail, Slack."
- "The following app integrations were connected: Linear."
instead of a generic "some tools may no longer be available" message.

* refactor: extract shared rebuildSession helper in ChatService

Eliminates the duplicated 20-line dispose→create→sanitize→store flow
that existed separately in both the MCP and workspace change-detection
blocks.

Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>

* test: add sanitizeMessagesForToolset test suite

Tests for the message sanitization that runs when a session rebuilds
with a different toolset (workspace or MCP change mid-conversation):

- Preserves messages with no tool parts
- Preserves tool parts when tool is in the toolset
- Strips tool parts when tool is NOT in the toolset
- Strips multiple removed tool parts from same message
- Keeps browser tools while removing filesystem tools
- Removes messages that become empty after stripping
- Preserves non-tool parts (reasoning, step-start, file)
- Returns same references when no filtering needed
- Handles empty message array and empty toolset

* style: fix biome formatting in chat-service.ts

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
2026-03-27 18:30:25 +05:30
Dani Akash
aacb47f7ee feat: isolate new-tab agent navigation from origin tab (#593)
* feat: isolate new-tab agent navigation from origin tab

Add origin-aware navigation isolation so the agent never navigates
away from the new-tab chat UI. This is a two-layer defense:

1. Prompt adaptation: When origin is 'newtab', the system prompt's
   execution and tool-selection sections are rewritten to prohibit
   navigating the active tab and default all lookups to new_page.

2. Tool-level guards: navigate_page and close_page reject attempts
   to act on the origin tab when in newtab mode, returning an error
   that teaches the agent to self-correct.

The client now sends an `origin` field ('sidepanel' | 'newtab')
instead of injecting a soft NEWTAB_SYSTEM_PROMPT that LLMs could
ignore. Backwards compatible — defaults to 'sidepanel'.

Closes TKT-592, addresses TKT-564

* test: add newtab origin navigation guard tests

- 14 new prompt tests verifying the system prompt adapts correctly
  for newtab vs sidepanel origin (execution rules, tool selection table,
  absence of conflicting single-tab guidance)
- 6 new integration tests for navigate_page and close_page guards:
  rejects origin tab in newtab mode, allows non-origin tabs, allows
  all tabs in sidepanel mode, backwards compatible with no session
2026-03-27 12:06:32 +05:30
Dani Akash
b3003542d8 docs: overhaul READMEs across all major packages (#594)
* docs: overhaul READMEs across all major packages

- Root README: restructure with feature table, LLM provider table,
  comparison matrix, architecture map, and docs link
- New: packages/browseros/README.md (Chromium fork build system)
- New: apps/server/README.md (MCP server + agent loop)
- New: packages/cdp-protocol/README.md (CDP type bindings)
- Polish: agent-sdk (badges, prerequisites, multi-step example, links)
- Polish: cli (badges, install section, MCP server section, links)
- Polish: agent extension (badges, WXT mention, architecture context)
- Polish: eval (badges, paper links)

* fix: address review — consistent tool count and correct default port

- CLI README: "54 MCP tools" → "53+ MCP tools" to match root and server docs
- Agent SDK README: localhost:3000 → localhost:9100 to match documented default

* docs: add detailed comparison links to How We Compare section

* docs: update comparison table with verified competitor data

Research all 5 competitors via official websites and docs:
- Chrome: no AI agent, Gemini Nano only, MV3 weakening ad blocking
- Brave: BYOM feature, local models via BYOM, Shields ad blocking, MV2+MV3
- Dia: Skills-based AI, no BYOK, cloud AI, acquired by Atlassian
- Comet: full cloud-based agent, built-in ad blocking, extensions on desktop
- Atlas: standalone Chromium browser with Agent Mode, 30-day cloud memory

Renamed Arc/Dia column to just Dia (Arc is sunset).

* docs: simplify comparison table with clean checkmarks and key differentiators

* docs: update browseros-agent README — remove submodule note, add missing packages
2026-03-27 11:59:04 +05:30
Nikhil
aba7a10430 chore: server release (#592) 2026-03-26 19:13:56 -07:00
Nikhil
b7462aa042 fix(cli): move install instructions below What's Changed in release notes (#591)
The installer block was appearing above the changelog. Reorder so
What's Changed comes first and install instructions follow.
2026-03-26 18:16:23 -07:00
Nikhil
883bcc9670 fix: clean up README CLI wording and add Vertical Tabs feature (#590)
- Simplify CLI section: remove confusing MCP jargon, clarify it works
  from terminal and AI coding agents
- Replace "point the CLI at your MCP server" with plain language
- Add Vertical Tabs to the features list
2026-03-26 18:05:54 -07:00
Nikhil
279b41fdc4 feat(cli): add install commands to GitHub release notes (#589)
* feat(cli): add install commands to release notes

* fix(cli): add install header to release workflow
2026-03-26 18:04:58 -07:00
Nikhil
220577b41c feat: add CDN-hosted CLI installer flow (#588)
* feat: add CDN upload flow for cli installers

* fix: move cli install docs to top-level readme

* fix: bun.lock update
2026-03-26 17:41:03 -07:00
Nikhil
03b45013a6 feat(cli): add install scripts for macOS, Linux, and Windows (#587)
* feat(cli): add install scripts for macOS, Linux, and Windows

Bash script (install.sh) for macOS/Linux and PowerShell script
(install.ps1) for Windows. Both download the correct platform binary
from GitHub Releases with checksum verification, version resolution,
and PATH setup.

* fix(cli): address PR review comments for install scripts

- Add checksum verification to install.ps1 using Get-FileHash
- Add warnings on all checksum skip paths in install.sh
- Use grep -F (fixed-string) instead of regex for filename matching
- Add ?per_page=100 to GitHub API call in install.ps1
- Use random temp directory name in install.ps1 to avoid collisions

* fix(cli): address installer review feedback
2026-03-26 17:05:21 -07:00
shivammittal274
aa85907212 Feat/cli launch ready v2 (#582)
* fix(cli): use full path for dist artifacts in release step

* test: temporarily allow release workflow on any branch

* fix(cli): restore main-only guard, remove goreleaser dependency

Replaces GoReleaser (Pro-only monorepo feature) with plain go build.
Tested: RC release created successfully on branch with all 6 binaries.
2026-03-27 01:28:04 +05:30
Nikhil
085352a6f0 fix(ui): resolve MCP promo banner dismiss button overlapping with text (#581)
Move dismiss button from absolute positioning to inline flex child,
preventing it from overlapping with the "Set up" button.
2026-03-26 12:54:00 -07:00
shivammittal274
c0578d0e53 Feat/cli launch ready v2 (#580)
* fix(cli): update goreleaser tag_prefix to match browseros-cli-v* format

* fix(cli): replace goreleaser with plain go build for releases

GoReleaser free version cannot parse prefixed tags (browseros-cli-v*).
monorepo.tag_prefix is a Pro-only feature.

Replaced with direct go build + gh release create:
- Builds all 6 targets with go build (verified locally)
- Creates tar.gz/zip archives with checksums
- Uses gh release create to publish
- No external tool dependency
2026-03-27 01:12:25 +05:30
shivammittal274
663c18ee97 fix(cli): update goreleaser tag_prefix to match browseros-cli-v* format (#579) 2026-03-27 01:07:36 +05:30
Dani Akash
48727750b4 fix: change CLI tag format from cli/v* to browseros-cli-v* (#578)
GoReleaser free cannot parse slash-prefixed tags (cli/v0.0.1) as semver.
Switch to browseros-cli-v0.0.1 format which is valid semver after
stripping the prefix. Remove the monorepo config (GoReleaser Pro only).
2026-03-27 00:58:13 +05:30
Dani Akash
30a3a96a57 fix: add monorepo tag prefix for goreleaser to parse cli/ tags (#576) 2026-03-27 00:50:38 +05:30
shivammittal274
6773ce39da ci(cli): manual dispatch release workflow (#574)
* ci(cli): change release workflow to manual dispatch from main

- Trigger via Actions UI with a version input (e.g. "0.1.0")
- Only runs on main branch
- Creates git tag cli/v<version> automatically
- Then GoReleaser builds all 6 binaries and creates the GitHub Release

* feat: add scoped release notes, changelog PR, and idempotent tags to CLI workflow

- Add concurrency group to prevent parallel releases
- Add scoped release notes from commits touching the CLI directory
- Pass release notes to goreleaser via --release-notes flag
- Make tag creation idempotent for safe re-runs
- Tag the saved release SHA, not HEAD after branching
- Add CHANGELOG.md and auto-update via PR with auto-merge
- Add pull-requests: write permission

---------

Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>
2026-03-27 00:41:08 +05:30
github-actions[bot]
342a3e4a07 docs: update agent extension changelog for v0.0.52 (#573)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-26 19:01:46 +00:00
Dani Akash
09406ea794 feat: add release workflow for agent extension (#572)
* feat: add release workflow for agent extension

Adds a workflow_dispatch workflow that builds the WXT extension,
creates a .zip for sideloading, generates scoped release notes with
contributors and PR links, creates a GitHub release with the zip
attached, and opens an auto-merge PR to update CHANGELOG.md.

* fix: correct API URL to api.browseros.com

* fix: remove duplicate PR numbers and contributors from extension release notes

Apply the same fixes from the agent-sdk workflow:
- Skip PR number if already in commit subject (squash merges)
- Remove custom Contributors section (GitHub auto-generates one)
- Clean up unused variables

* fix: use absolute path for extension zip in release upload

* fix: wxt zip already builds, use correct output path

- Remove separate build step since wxt zip runs the build internally
- Fix zip path from .output/*.zip to dist/*-chrome.zip

* fix: run codegen before wxt zip to generate graphql types
2026-03-27 00:29:47 +05:30
Dani Akash
1f00cbc9cc feat: add release workflow for agent extension (#566)
* feat: add release workflow for agent extension

Adds a workflow_dispatch workflow that builds the WXT extension,
creates a .zip for sideloading, generates scoped release notes with
contributors and PR links, creates a GitHub release with the zip
attached, and opens an auto-merge PR to update CHANGELOG.md.

* fix: correct API URL to api.browseros.com

* fix: remove duplicate PR numbers and contributors from extension release notes

Apply the same fixes from the agent-sdk workflow:
- Skip PR number if already in commit subject (squash merges)
- Remove custom Contributors section (GitHub auto-generates one)
- Clean up unused variables

* fix: use absolute path for extension zip in release upload

* fix: wxt zip already builds, use correct output path

- Remove separate build step since wxt zip runs the build internally
- Fix zip path from .output/*.zip to dist/*-chrome.zip
2026-03-27 00:23:04 +05:30
Dani Akash
422a829f5e fix: remove duplicate PR numbers and contributors from release notes (#571)
- Skip adding PR number if already present in the commit subject
  (squash merges include "(#123)" automatically)
- Remove custom Contributors section since GitHub auto-generates one
  with avatars at the bottom of every release
2026-03-27 00:07:13 +05:30
github-actions[bot]
ed109fcedf docs: update agent-sdk changelog for v0.0.7 (#570)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-26 18:31:39 +00:00
Dani Akash
19af96d08e chore: bump @browseros-ai/agent-sdk to 0.0.7 (#569) 2026-03-27 00:00:35 +05:30
Dani Akash
e0304b203c chore: bump @browseros-ai/agent-sdk to 0.0.6 (#568) 2026-03-26 23:53:35 +05:30
Nikhil
af65bdbcfb feat(build): add build:server:ci script with --compile-only flag (#567)
Add a compile-only mode to the server build pipeline for CI/CD
environments that don't have R2 credentials. The --compile-only flag
skips resource staging and upload, producing only compiled binaries.
2026-03-26 11:21:39 -07:00
Dani Akash
d79c2a4123 feat: create GitHub release with changelog on agent-sdk publish (#564)
* feat: create GitHub release with changelog on agent-sdk publish

After publishing to npm, the workflow now:
- Tags the commit as agent-sdk-v<version>
- Generates release notes from commits that modified the agent-sdk
  directory since the last agent-sdk release tag
- Creates a GitHub release with those notes

First release will show "Initial release" since no previous tag exists.

* feat: update CHANGELOG.md on agent-sdk release

Add a CHANGELOG.md for @browseros-ai/agent-sdk and update the release
workflow to prepend a versioned entry with the release notes before
creating the GitHub release. The changelog is committed to main
automatically.

* fix: address review issues in agent-sdk release workflow

- Add explicit permissions: contents: write
- Replace sed with head/tail for safe CHANGELOG insertion (fixes
  double-quote and backslash corruption in commit messages)
- Handle empty release notes with "No notable changes." fallback
- Make git tag idempotent for workflow reruns (2>/dev/null || true)

* fix: use PR with auto-merge for changelog updates

Direct push to main fails due to branch protection requiring PRs.
Instead, create a branch, open a PR, and auto-merge via squash.

* feat: add contributors and PR links to agent-sdk release notes

Release notes now include PR numbers (linked automatically by GitHub),
GitHub usernames for each commit author, and a contributors section
at the bottom. All scoped to commits that modified the agent-sdk path.

* fix: reorder release steps and fix tag/idempotency issues

- Capture release SHA before any branching so the tag always points
  to the main commit that was built and published to npm
- Reorder: generate notes → publish → tag/release → changelog PR
  (changelog is lowest-stakes, runs last)
- Make tag push and release create idempotent for safe re-runs
  (fall back to gh release edit if release already exists)
- Add || true to gh pr merge --auto in case auto-merge is not enabled
- Explicit git checkout main before creating changelog branch

* fix: explicit error handling for tag/release and contributor dedup

- Replace silent || true guards with explicit checks that log what's
  happening (tag exists, remote tag exists, release exists) so errors
  are visible instead of swallowed
- Fix contributor dedup: use grep -qw (word match) instead of grep -qF
  (substring match) so "dan" isn't excluded when "dansmith" exists

* fix: exclude current version tag when finding previous release

On re-runs, the current version's tag already exists on the remote, so
PREV_TAG resolves to it and git log produces empty output. Filter it
out so release notes are generated against the actual previous version.

* ci: prevent concurrent agent-sdk release runs

Add concurrency group so multiple dispatches queue instead of racing
on the same tag/release/PR.
2026-03-26 23:38:14 +05:30
shivammittal274
e3d57e5347 feat(cli): production-ready CLI with auto-launch, install, and cross-platform builds (#555)
* feat(cli): production-ready CLI with auto-launch, install, and cross-platform builds

- init: accept URL argument and --auto flag for non-interactive setup
- install: new command to download BrowserOS app for current platform
- launch: auto-detect and launch BrowserOS when server is not running
- discovery: prefer server.json (live) over config.yaml (may be stale)
- errors: actionable messages guiding users to init/install
- goreleaser: cross-platform builds for 6 targets (darwin/linux/windows × amd64/arm64)
- ci: GitHub Actions workflow to release CLI binaries on cli/v* tag push

* fix(cli): check health status code and add progress dots during launch

- Health check in newClient() now verifies HTTP 200, not just no error
- waitForServer prints dots during the 30s poll so users know it's working

* refactor(cli): make launch an explicit command, remove auto-launch from newClient

- launch: new explicit command to find and open BrowserOS app
- launch: probes server.json, config, and common ports before launching
- launch: if already running, reports URL instead of launching again
- init --auto: uses port probing to find running servers
- install --deb: errors on non-Linux instead of silently downloading DMG
- error messages: guide users to launch/install/init explicitly
- removed: auto-launch from newClient() — CLI never does something surprising

* fix(cli): platform-native detection, launch, and install for all OSes

Detection (isBrowserOSInstalled):
- macOS: uses `open -Ra` to query Launch Services (no hardcoded paths)
- Linux: checks /usr/bin/browseros (.deb), browseros.desktop, AppImage search
- Windows: checks %LOCALAPPDATA%\BrowserOS\Application\BrowserOS.exe
  and HKCU/HKLM uninstall registry keys

Launch (startBrowserOS):
- macOS: `open -b com.browseros.BrowserOS` (bundle ID, not path)
- Linux: `browseros` binary, AppImage, or `gtk-launch browseros`
  (fixed: was using xdg-open which opens by MIME type, not desktop files)
- Windows: runs BrowserOS.exe from known Chromium per-user install path
  (fixed: was using `cmd /c start BrowserOS` which doesn't resolve)

Install (runPostInstall):
- macOS: hdiutil attach → cp -R to /Applications → hdiutil detach
- Linux: chmod +x for AppImage, dpkg -i instruction for .deb
- Windows: launches installer exe
- --deb flag now errors on non-Linux platforms

Removed auto-launch from newClient() — CLI never does surprising things.

Sources verified from:
- packages/browseros/build/common/context.py (binary names per platform)
- packages/browseros/build/modules/package/linux.py (.deb structure, .desktop file)
- packages/browseros/chromium_patches/chrome/install_static/chromium_install_modes.h
  (Windows base_app_name="BrowserOS", registry GUID, install paths)
- /Applications/BrowserOS.app/Contents/Info.plist (bundle ID)
2026-03-26 23:12:55 +05:30
Dani Akash
392312f203 ci: only run PR title validation on open and edit (#565)
Remove synchronize and reopened triggers since this workflow only
validates the PR title, which doesn't change on new commits or reopen.
2026-03-26 23:06:11 +05:30
Dani Akash
0f193055c7 fix: broaden connection error detection for main page and sidepanel (#563)
* fix: broaden connection error detection for main page and sidepanel

The connection error check required both "Failed to fetch" AND
"127.0.0.1" in the error message. On the main page, the browser
only produces "Failed to fetch" without the IP, so users saw a
generic "Something went wrong" instead of the troubleshooting link.

Broaden detection to also match "localhost" and bare "Failed to fetch"
errors that don't contain an external URL. Also pass providerType in
NewTabChat so provider-specific errors render correctly.

Closes #526

* fix: simplify connection error detection

All chat requests go through the local BrowserOS agent server, so any
"Failed to fetch" error is always a local connection issue. Remove the
unnecessary 127.0.0.1/localhost/URL checks.

* fix: pass providerType to agentUrlError ChatError instances
2026-03-26 20:55:40 +05:30
Dani Akash
f45cb58889 fix: stop sending port-in-use errors to Sentry (#558)
Port conflicts are expected — Chromium retries with a different port.
These errors were flooding Sentry (14k+ events) without user impact.

- handleStartupError: move Sentry.captureException below the
  port-in-use check so it only fires for unexpected startup errors
- handleControllerStartupError: skip Sentry capture for port errors
- index.ts: exit early for port errors before Sentry capture
2026-03-26 09:32:18 +05:30
shivammittal274
37ead6d129 fix: add cursor-pointer to credit badge in sidepanel (#554) 2026-03-26 00:09:58 +05:30
Nikhil
5ea9463030 fix: widen scheduled task results dialog and add horizontal scroll for tables (#549)
- Change dialog width from sm:max-w-2xl (672px) to sm:w-[70vw] sm:max-w-4xl
  so it takes 70% of viewport width, capped at 896px
- Add overflow-x-auto on table wrappers so wide tables scroll horizontally
  instead of being clipped

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:27:46 -07:00
shivammittal274
dde35ccbd5 feat: integrate models.dev for dynamic LLM provider/model data (#547)
* feat: integrate models.dev for dynamic LLM provider/model data (#TKT-657)

Replace hardcoded model lists with data sourced from models.dev so new
providers and models appear automatically when the community adds them.

- Add build script (scripts/generate-models.ts) that fetches models.dev/api.json
  and outputs a compact JSON with 10 providers and 520 models
- Replace hardcoded MODELS_DATA (50 models) with dynamic models.dev lookups
- Add searchable model combobox (Popover + Command) replacing plain Select dropdown
- Enrich provider templates with models.dev metadata (context window, image support)
- Keep chatgpt-pro, qwen-code, browseros, openai-compatible as hardcoded providers

* fix: address review — remove ollama-cloud mapping, fix default models, remove dead code

- Remove ollama from PROVIDER_MAP (ollama-cloud has cloud models, not local)
- Add ollama to CUSTOM_PROVIDER_MODELS with empty list (users type custom IDs)
- Update defaultModelIds to ones that exist in models.dev data:
  openrouter → anthropic/claude-sonnet-4.5
  lmstudio → openai/gpt-oss-20b
  bedrock → anthropic.claude-sonnet-4-6
- Remove dead isCustomModel export
- Regenerate models-dev-data.json (9 providers, 486 models)

* fix: model suggestion list focus/dismiss behavior

- List only opens when input is focused or user types
- Clicking a model selects it and closes the list
- Clicking outside (blur) dismisses the list
- onMouseDown preventDefault on list items prevents blur race condition

* refactor: extract ModelPickerList component with proper open/close UX

- Collapsed state: Select-like trigger showing selected model + chevron
- Expanded state: search input + scrollable filtered list, inline
- Click outside or Escape to close, Enter to submit custom model
- Extracted as separate component (reduces dialog nesting, testable)
- No more setTimeout hacks for blur handling

* chore: remove plan doc from repo
2026-03-25 02:41:07 +05:30
318 changed files with 17428 additions and 14447 deletions

2
.gitattributes vendored
View File

@@ -9,4 +9,6 @@ packages/browseros/chromium_patches/**/*.py linguist-generated
scripts/*.py linguist-generated
# Mark build directories as generated
build/* linguist-generated
# Mark eval/test framework as vendored so it's excluded from language stats
packages/browseros-agent/apps/eval/** linguist-vendored
docs/videos/** filter=lfs diff=lfs merge=lfs -text

View File

@@ -2,7 +2,7 @@ name: PR Conventional Commit Validation
on:
pull_request:
types: [opened, synchronize, reopened, edited]
types: [opened, edited]
permissions:
pull-requests: write

View File

@@ -0,0 +1,148 @@
name: Release BrowserOS Extension
on:
workflow_dispatch:
concurrency:
group: release-agent-extension
cancel-in-progress: false
jobs:
release:
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
defaults:
run:
working-directory: packages/browseros-agent/apps/agent
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: oven-sh/setup-bun@v2
- name: Install dependencies
run: bun ci
working-directory: packages/browseros-agent
- name: Build and zip extension
run: bun run codegen && bun run zip
env:
VITE_PUBLIC_BROWSEROS_API: https://api.browseros.com
- name: Get version and zip path
id: version
run: |
echo "version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"
echo "release_sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
ZIP_FILE=$(ls "$(pwd)/dist/"*-chrome.zip | head -n 1)
echo "zip_path=$ZIP_FILE" >> "$GITHUB_OUTPUT"
echo "zip_name=$(basename "$ZIP_FILE")" >> "$GITHUB_OUTPUT"
- name: Generate release notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
AGENT_PATH="packages/browseros-agent/apps/agent"
CURRENT_TAG="agent-extension-v${{ steps.version.outputs.version }}"
PREV_TAG=$(git tag -l "agent-extension-v*" --sort=-v:refname | grep -v "^${CURRENT_TAG}$" | head -n 1)
if [ -z "$PREV_TAG" ]; then
echo "Initial release" > /tmp/release-notes.md
else
COMMITS=$(git log "$PREV_TAG"..HEAD --pretty=format:"%H" -- "$AGENT_PATH")
if [ -z "$COMMITS" ]; then
echo "No notable changes." > /tmp/release-notes.md
else
echo "## What's Changed" > /tmp/release-notes.md
echo "" >> /tmp/release-notes.md
while IFS= read -r SHA; do
SUBJECT=$(git log -1 --pretty=format:"%s" "$SHA")
PR_NUM=$(gh api "/repos/${{ github.repository }}/commits/${SHA}/pulls" --jq '.[0].number // empty' 2>/dev/null)
# Skip PR number if already in the commit subject (squash merges include it)
if [ -n "$PR_NUM" ] && ! echo "$SUBJECT" | grep -qF "(#${PR_NUM})"; then
echo "- ${SUBJECT} (#${PR_NUM})" >> /tmp/release-notes.md
else
echo "- ${SUBJECT}" >> /tmp/release-notes.md
fi
done <<< "$COMMITS"
fi
fi
working-directory: ${{ github.workspace }}
- name: Create GitHub release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
TAG="agent-extension-v${{ steps.version.outputs.version }}"
RELEASE_SHA="${{ steps.version.outputs.release_sha }}"
TITLE="BrowserOS Extension - v${{ steps.version.outputs.version }}"
if git rev-parse "$TAG" >/dev/null 2>&1; then
echo "Tag $TAG already exists, skipping tag creation"
else
git tag "$TAG" "$RELEASE_SHA"
fi
if git ls-remote --tags origin "$TAG" | grep -q "$TAG"; then
echo "Tag $TAG already on remote, skipping push"
else
git push origin "$TAG"
fi
if gh release view "$TAG" >/dev/null 2>&1; then
echo "Release $TAG already exists, updating"
gh release edit "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
gh release upload "$TAG" "${{ steps.version.outputs.zip_path }}" --clobber
else
gh release create "$TAG" \
--title "$TITLE" \
--notes-file /tmp/release-notes.md \
"${{ steps.version.outputs.zip_path }}"
fi
working-directory: ${{ github.workspace }}
- name: Update CHANGELOG.md via PR
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
VERSION="${{ steps.version.outputs.version }}"
DATE=$(date -u +"%Y-%m-%d")
BRANCH="docs/agent-extension-changelog-v${VERSION}"
CHANGELOG="packages/browseros-agent/apps/agent/CHANGELOG.md"
git checkout main
{
head -n 1 "$CHANGELOG"
echo ""
echo "## v${VERSION} (${DATE})"
echo ""
cat /tmp/release-notes.md
echo ""
tail -n +2 "$CHANGELOG"
} > /tmp/new-changelog.md
mv /tmp/new-changelog.md "$CHANGELOG"
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git checkout -b "$BRANCH"
git add "$CHANGELOG"
git commit -m "docs: update agent extension changelog for v${VERSION}"
git push origin "$BRANCH"
gh pr create \
--title "docs: update agent extension changelog for v${VERSION}" \
--body "Auto-generated changelog update for BrowserOS Extension v${VERSION}." \
--base main \
--head "$BRANCH"
gh pr merge "$BRANCH" --squash --auto || true
working-directory: ${{ github.workspace }}

View File

@@ -1,18 +1,27 @@
name: Release Agent SDK
name: Release BrowserOS Agent SDK
on:
workflow_dispatch:
concurrency:
group: release-agent-sdk
cancel-in-progress: false
jobs:
publish:
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
defaults:
run:
working-directory: packages/browseros-agent/packages/agent-sdk
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: oven-sh/setup-bun@v2
@@ -31,7 +40,129 @@ jobs:
- name: Test
run: bun test
- name: Get version
id: version
run: |
echo "version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"
echo "release_sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
- name: Generate release notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
SDK_PATH="packages/browseros-agent/packages/agent-sdk"
CURRENT_TAG="agent-sdk-v${{ steps.version.outputs.version }}"
# Find the previous tag, excluding the current version's tag
# (which may already exist from a prior failed run)
PREV_TAG=$(git tag -l "agent-sdk-v*" --sort=-v:refname | grep -v "^${CURRENT_TAG}$" | head -n 1)
if [ -z "$PREV_TAG" ]; then
echo "Initial release" > /tmp/release-notes.md
else
# Get commits scoped to the SDK directory
COMMITS=$(git log "$PREV_TAG"..HEAD --pretty=format:"%H" -- "$SDK_PATH")
if [ -z "$COMMITS" ]; then
echo "No notable changes." > /tmp/release-notes.md
else
echo "## What's Changed" > /tmp/release-notes.md
echo "" >> /tmp/release-notes.md
# For each commit, find the associated PR and format with author
CONTRIBUTORS=""
while IFS= read -r SHA; do
# Get commit subject and author
SUBJECT=$(git log -1 --pretty=format:"%s" "$SHA")
AUTHOR=$(git log -1 --pretty=format:"%an" "$SHA")
GITHUB_USER=$(gh api "/repos/${{ github.repository }}/commits/${SHA}" --jq '.author.login // empty' 2>/dev/null)
# Find associated PR number
PR_NUM=$(gh api "/repos/${{ github.repository }}/commits/${SHA}/pulls" --jq '.[0].number // empty' 2>/dev/null)
# Format line: skip PR number if already in the commit subject
# (squash merges include "(#123)" in the subject automatically)
if [ -n "$PR_NUM" ] && ! echo "$SUBJECT" | grep -qF "(#${PR_NUM})"; then
echo "- ${SUBJECT} (#${PR_NUM})" >> /tmp/release-notes.md
else
echo "- ${SUBJECT}" >> /tmp/release-notes.md
fi
done <<< "$COMMITS"
fi
fi
working-directory: ${{ github.workspace }}
- name: Publish
run: npm publish --access public
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
- name: Create GitHub release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
TAG="agent-sdk-v${{ steps.version.outputs.version }}"
RELEASE_SHA="${{ steps.version.outputs.release_sha }}"
TITLE="BrowserOS Agent SDK - v${{ steps.version.outputs.version }}"
# Create or reuse tag (idempotent for re-runs)
if git rev-parse "$TAG" >/dev/null 2>&1; then
echo "Tag $TAG already exists, skipping tag creation"
else
git tag "$TAG" "$RELEASE_SHA"
fi
# Push tag (skip if already on remote)
if git ls-remote --tags origin "$TAG" | grep -q "$TAG"; then
echo "Tag $TAG already on remote, skipping push"
else
git push origin "$TAG"
fi
# Create or update release
if gh release view "$TAG" >/dev/null 2>&1; then
echo "Release $TAG already exists, updating"
gh release edit "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
else
gh release create "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
fi
working-directory: ${{ github.workspace }}
- name: Update CHANGELOG.md via PR
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
VERSION="${{ steps.version.outputs.version }}"
DATE=$(date -u +"%Y-%m-%d")
BRANCH="docs/agent-sdk-changelog-v${VERSION}"
CHANGELOG="packages/browseros-agent/packages/agent-sdk/CHANGELOG.md"
# Return to main before branching
git checkout main
# Use head/tail to safely insert without sed quoting issues
{
head -n 1 "$CHANGELOG"
echo ""
echo "## v${VERSION} (${DATE})"
echo ""
cat /tmp/release-notes.md
echo ""
tail -n +2 "$CHANGELOG"
} > /tmp/new-changelog.md
mv /tmp/new-changelog.md "$CHANGELOG"
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git checkout -b "$BRANCH"
git add "$CHANGELOG"
git commit -m "docs: update agent-sdk changelog for v${VERSION}"
git push origin "$BRANCH"
gh pr create \
--title "docs: update agent-sdk changelog for v${VERSION}" \
--body "Auto-generated changelog update for BrowserOS Agent SDK v${VERSION}." \
--base main \
--head "$BRANCH"
gh pr merge "$BRANCH" --squash --auto || true
working-directory: ${{ github.workspace }}

161
.github/workflows/release-cli.yml vendored Normal file
View File

@@ -0,0 +1,161 @@
name: Release BrowserOS CLI
on:
workflow_dispatch:
inputs:
version:
description: "Release version (e.g. 0.1.0)"
required: true
type: string
concurrency:
group: release-cli
cancel-in-progress: false
jobs:
release:
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
environment: release-core
permissions:
contents: write
pull-requests: write
defaults:
run:
working-directory: packages/browseros-agent/apps/cli
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: actions/setup-go@v5
with:
go-version-file: packages/browseros-agent/apps/cli/go.mod
- uses: oven-sh/setup-bun@v2
with:
bun-version: "1.3.6"
- name: Run tests
run: make test
- name: Run vet
run: make vet
- name: Build all platforms
run: make release VERSION=${{ inputs.version }} POSTHOG_API_KEY=${{ secrets.POSTHOG_API_KEY }}
- name: Install dependencies
run: bun install
working-directory: packages/browseros-agent
- name: Upload to CDN
env:
R2_ACCOUNT_ID: ${{ secrets.R2_ACCOUNT_ID }}
R2_ACCESS_KEY_ID: ${{ secrets.R2_ACCESS_KEY_ID }}
R2_SECRET_ACCESS_KEY: ${{ secrets.R2_SECRET_ACCESS_KEY }}
R2_BUCKET: ${{ secrets.R2_BUCKET }}
R2_UPLOAD_PREFIX: cli
CLI_VERSION: ${{ inputs.version }}
run: |
bun scripts/build/cli.ts \
--release \
--version "$CLI_VERSION" \
--binaries-dir apps/cli/dist
working-directory: packages/browseros-agent
- name: Generate release notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
CLI_PATH="packages/browseros-agent/apps/cli"
TAG="browseros-cli-v${{ inputs.version }}"
CHANGELOG_FILE="/tmp/release-changelog.md"
PREV_TAG=$(git tag -l "browseros-cli-v*" --sort=-v:refname | grep -v "^${TAG}$" | head -n 1)
if [ -z "$PREV_TAG" ]; then
echo "Initial release of browseros-cli." > "$CHANGELOG_FILE"
else
COMMITS=$(git log "$PREV_TAG"..HEAD --pretty=format:"%H" -- "$CLI_PATH")
if [ -z "$COMMITS" ]; then
echo "No notable changes." > "$CHANGELOG_FILE"
else
echo "## What's Changed" > "$CHANGELOG_FILE"
echo "" >> "$CHANGELOG_FILE"
while IFS= read -r SHA; do
SUBJECT=$(git log -1 --pretty=format:"%s" "$SHA")
PR_NUM=$(gh api "/repos/${{ github.repository }}/commits/${SHA}/pulls" --jq '.[0].number // empty' 2>/dev/null)
if [ -n "$PR_NUM" ] && ! echo "$SUBJECT" | grep -qF "(#${PR_NUM})"; then
echo "- ${SUBJECT} (#${PR_NUM})" >> "$CHANGELOG_FILE"
else
echo "- ${SUBJECT}" >> "$CHANGELOG_FILE"
fi
done <<< "$COMMITS"
fi
fi
cat "$CHANGELOG_FILE" > /tmp/release-notes.md
cat >> /tmp/release-notes.md <<'EOF'
## Install `browseros-cli`
### npm / npx
```bash
npx browseros-cli --help
npm install -g browseros-cli
```
### macOS / Linux
```bash
curl -fsSL https://cdn.browseros.com/cli/install.sh | bash
```
### Windows
```powershell
irm https://cdn.browseros.com/cli/install.ps1 | iex
```
After install, run `browseros-cli init` to point the CLI at your BrowserOS MCP server.
EOF
working-directory: ${{ github.workspace }}
- name: Create tag and release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
TAG="browseros-cli-v${{ inputs.version }}"
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
if ! git rev-parse "$TAG" >/dev/null 2>&1; then
git tag -a "$TAG" -m "browseros-cli v${{ inputs.version }}"
git push origin "$TAG"
fi
CLI_DIST="packages/browseros-agent/apps/cli/dist"
gh release create "$TAG" \
--title "BrowserOS CLI - v${{ inputs.version }}" \
--notes-file /tmp/release-notes.md \
${CLI_DIST}/*
working-directory: ${{ github.workspace }}
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: "20"
registry-url: "https://registry.npmjs.org"
- name: Publish to npm
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
run: |
make npm-version VERSION=${{ inputs.version }}
cd npm
npm publish --access public

147
.github/workflows/release-server.yml vendored Normal file
View File

@@ -0,0 +1,147 @@
name: Release BrowserOS Server
on:
workflow_dispatch:
inputs:
version:
description: "Release version (e.g. 0.0.80)"
required: true
type: string
concurrency:
group: release-server
cancel-in-progress: false
jobs:
release:
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
environment: release-core
permissions:
contents: write
defaults:
run:
working-directory: packages/browseros-agent
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: oven-sh/setup-bun@v2
with:
bun-version: "1.3.6"
- name: Install dependencies
run: bun ci
- name: Prepare production env file
run: cp apps/server/.env.production.example apps/server/.env.production
- name: Validate version
id: version
env:
REQUESTED_VERSION: ${{ inputs.version }}
run: |
PACKAGE_VERSION=$(node -p "require('./apps/server/package.json').version")
echo "package_version=$PACKAGE_VERSION" >> "$GITHUB_OUTPUT"
echo "release_sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
if [ "$PACKAGE_VERSION" != "$REQUESTED_VERSION" ]; then
echo "Requested version $REQUESTED_VERSION does not match apps/server/package.json ($PACKAGE_VERSION)"
exit 1
fi
- name: Build release artifacts
run: bun run build:server:ci
- name: Verify release artifacts
run: |
mapfile -t ZIP_FILES < <(find dist/prod/server -maxdepth 1 -type f -name 'browseros-server-resources-*.zip' | sort)
if [ "${#ZIP_FILES[@]}" -eq 0 ]; then
echo "No server release zip files were produced"
exit 1
fi
printf 'Found release artifacts:\n%s\n' "${ZIP_FILES[@]}"
- name: Generate release notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PACKAGE_VERSION: ${{ steps.version.outputs.package_version }}
run: |
SERVER_APP_PATH="packages/browseros-agent/apps/server"
SERVER_BUILD_DIR="packages/browseros-agent/scripts/build/server"
SERVER_BUILD_ENTRY="packages/browseros-agent/scripts/build/server.ts"
SERVER_RESOURCE_MANIFEST="packages/browseros-agent/scripts/build/config/server-prod-resources.json"
SERVER_WORKSPACE_PKG="packages/browseros-agent/package.json"
CURRENT_TAG="browseros-server-v$PACKAGE_VERSION"
PREV_TAG=$(git tag -l "browseros-server-v*" --sort=-v:refname | grep -v "^${CURRENT_TAG}$" | head -n 1)
if [ -z "$PREV_TAG" ]; then
echo "Initial release of browseros-server." > /tmp/release-notes.md
else
COMMITS=$(git log "$PREV_TAG"..HEAD --pretty=format:"%H" -- \
"$SERVER_APP_PATH" \
"$SERVER_BUILD_DIR" \
"$SERVER_BUILD_ENTRY" \
"$SERVER_RESOURCE_MANIFEST" \
"$SERVER_WORKSPACE_PKG")
if [ -z "$COMMITS" ]; then
echo "No notable changes." > /tmp/release-notes.md
else
echo "## What's Changed" > /tmp/release-notes.md
echo "" >> /tmp/release-notes.md
while IFS= read -r SHA; do
SUBJECT=$(git log -1 --pretty=format:"%s" "$SHA")
PR_NUM=$(gh api "/repos/${{ github.repository }}/commits/${SHA}/pulls" --jq '.[0].number // empty' 2>/dev/null)
if [ -n "$PR_NUM" ] && ! echo "$SUBJECT" | grep -qF "(#${PR_NUM})"; then
echo "- ${SUBJECT} (#${PR_NUM})" >> /tmp/release-notes.md
else
echo "- ${SUBJECT}" >> /tmp/release-notes.md
fi
done <<< "$COMMITS"
fi
fi
working-directory: ${{ github.workspace }}
- name: Create GitHub release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PACKAGE_VERSION: ${{ steps.version.outputs.package_version }}
RELEASE_SHA: ${{ steps.version.outputs.release_sha }}
run: |
TAG="browseros-server-v$PACKAGE_VERSION"
TITLE="BrowserOS Server - v$PACKAGE_VERSION"
mapfile -t ZIP_FILES < <(find packages/browseros-agent/dist/prod/server -maxdepth 1 -type f -name 'browseros-server-resources-*.zip' | sort)
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
if git rev-parse "$TAG" >/dev/null 2>&1; then
echo "Tag $TAG already exists, skipping tag creation"
else
git tag -a "$TAG" -m "browseros-server v$PACKAGE_VERSION" "$RELEASE_SHA"
fi
if git ls-remote --tags origin "$TAG" | grep -q "$TAG"; then
echo "Tag $TAG already on remote, skipping push"
else
git push origin "$TAG"
fi
if gh release view "$TAG" >/dev/null 2>&1; then
echo "Release $TAG already exists, updating"
gh release edit "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
gh release upload "$TAG" "${ZIP_FILES[@]}" --clobber
else
gh release create "$TAG" \
--title "$TITLE" \
--notes-file /tmp/release-notes.md \
"${ZIP_FILES[@]}"
fi
working-directory: ${{ github.workspace }}

216
README.md
View File

@@ -6,6 +6,7 @@
[![Slack](https://img.shields.io/badge/Slack-Join%20us-4A154B?logo=slack&logoColor=white)](https://dub.sh/browserOS-slack)
[![Twitter](https://img.shields.io/twitter/follow/browserOS_ai?style=social)](https://twitter.com/browseros_ai)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE)
[![Docs](https://img.shields.io/badge/Docs-docs.browseros.com-blue)](https://docs.browseros.com)
<br></br>
<a href="https://files.browseros.com/download/BrowserOS.dmg">
<img src="https://img.shields.io/badge/Download-macOS-black?style=flat&logo=apple&logoColor=white" alt="Download for macOS (beta)" />
@@ -22,146 +23,183 @@
<br />
</div>
##
🌐 BrowserOS is an open-source Chromium fork that runs AI agents natively. **The privacy-first alternative to ChatGPT Atlas, Perplexity Comet, and Dia.**
BrowserOS is an open-source Chromium fork that runs AI agents natively. **The privacy-first alternative to ChatGPT Atlas, Perplexity Comet, and Dia.**
🔒 Use your own API keys or run local models with Ollama. Your data never leaves your machine.
Use your own API keys or run local models with Ollama. Your data never leaves your machine.
💡 Join our [Discord](https://discord.gg/YKwjt5vuKr) or [Slack](https://dub.sh/browserOS-slack) and help us build! Have feature requests? [Suggest here](https://github.com/browseros-ai/BrowserOS/issues/99).
> **[Documentation](https://docs.browseros.com)** · **[Discord](https://discord.gg/YKwjt5vuKr)** · **[Slack](https://dub.sh/browserOS-slack)** · **[Twitter](https://x.com/browserOS_ai)** · **[Feature Requests](https://github.com/browseros-ai/BrowserOS/issues/99)**
## Quick start
## Quick Start
1. Download and install BrowserOS:
- [macOS](https://files.browseros.com/download/BrowserOS.dmg)
- [Windows](https://files.browseros.com/download/BrowserOS_installer.exe)
- [Linux (AppImage)](https://files.browseros.com/download/BrowserOS.AppImage)
- [Linux (Debian)](https://cdn.browseros.com/download/BrowserOS.deb)
1. **Download and install** BrowserOS — [macOS](https://files.browseros.com/download/BrowserOS.dmg) · [Windows](https://files.browseros.com/download/BrowserOS_installer.exe) · [Linux (AppImage)](https://files.browseros.com/download/BrowserOS.AppImage) · [Linux (Debian)](https://cdn.browseros.com/download/BrowserOS.deb)
2. **Import your Chrome data** (optional) — bookmarks, passwords, extensions all carry over
3. **Connect your AI provider** — Claude, OpenAI, Gemini, ChatGPT Pro via OAuth, or local models via Ollama/LM Studio
2. Import your Chrome data (optional)
## Features
3. Connect your AI provider — use Claude, OpenAI, Gemini, or local models via Ollama and LMStudio.
4. Start automating!
## What makes BrowserOS special
- 🏠 Feels like home — same Chrome interface, all your extensions just work
- 🤖 AI agents that run on YOUR browser, not in the cloud
- 🔒 Privacy first — bring your own keys or run local models with Ollama. Your browsing history stays on your machine
- 🤝 [BrowserOS as MCP server](https://docs.browseros.com/features/use-with-claude-code) — control the browser from `claude-code`, `gemini-cli`, or any MCP client (31 tools)
- 🔄 [Workflows](https://docs.browseros.com/features/workflows) — build repeatable browser automations with a visual graph builder
- 📂 [Cowork](https://docs.browseros.com/features/cowork) — combine browser automation with local file operations. Research the web, save reports to your folder
- ⏰ [Scheduled Tasks](https://docs.browseros.com/features/scheduled-tasks) — run the agent on autopilot, daily or every few minutes
- 💬 [LLM Hub](https://docs.browseros.com/features/llm-chat-hub) — compare Claude, ChatGPT, and Gemini side-by-side on any page
- 🛡️ Built-in ad blocker — [10x more protection than Chrome](https://docs.browseros.com/features/ad-blocking) with uBlock Origin + Manifest V2 support
- 🚀 100% open source under AGPL-3.0
| Feature | Description | Docs |
|---------|-------------|------|
| **AI Agent** | 53+ browser automation tools — navigate, click, type, extract data, all with natural language | [Guide](https://docs.browseros.com/getting-started) |
| **MCP Server** | Control the browser from Claude Code, Gemini CLI, or any MCP client | [Setup](https://docs.browseros.com/features/use-with-claude-code) |
| **Workflows** | Build repeatable browser automations with a visual graph builder | [Docs](https://docs.browseros.com/features/workflows) |
| **Cowork** | Combine browser automation with local file operations — research the web, save reports to your folder | [Docs](https://docs.browseros.com/features/cowork) |
| **Scheduled Tasks** | Run agents on autopilot — daily, hourly, or every few minutes | [Docs](https://docs.browseros.com/features/scheduled-tasks) |
| **Memory** | Persistent memory across conversations — your assistant remembers context over time | [Docs](https://docs.browseros.com/features/memory) |
| **SOUL.md** | Define your AI's personality and instructions in a single markdown file | [Docs](https://docs.browseros.com/features/soul-md) |
| **LLM Hub** | Compare Claude, ChatGPT, and Gemini responses side-by-side on any page | [Docs](https://docs.browseros.com/features/llm-chat-hub) |
| **40+ App Integrations** | Gmail, Slack, GitHub, Linear, Notion, Figma, Salesforce, and more via MCP | [Docs](https://docs.browseros.com/features/connect-apps) |
| **Vertical Tabs** | Side-panel tab management — stay organized even with 100+ tabs open | [Docs](https://docs.browseros.com/features/vertical-tabs) |
| **Ad Blocking** | uBlock Origin + Manifest V2 support — [10x more protection](https://docs.browseros.com/features/ad-blocking) than Chrome | [Docs](https://docs.browseros.com/features/ad-blocking) |
| **Cloud Sync** | Sync browser config and agent history across devices | [Docs](https://docs.browseros.com/features/sync) |
| **Skills** | Custom instruction sets that shape how your AI assistant behaves | [Docs](https://docs.browseros.com/features/skills) |
| **Smart Nudges** | Contextual suggestions to connect apps and use features at the right moment | [Docs](https://docs.browseros.com/features/smart-nudges) |
## Demos
### 🤖 BrowserOS agent in action
### BrowserOS agent in action
[![BrowserOS agent in action](docs/videos/browserOS-agent-in-action.gif)](https://www.youtube.com/watch?v=SoSFev5R5dI)
<br/><br/>
### 🎇 Install [BrowserOS as MCP](https://docs.browseros.com/features/use-with-claude-code) and control it from `claude-code`
### Install [BrowserOS as MCP](https://docs.browseros.com/features/use-with-claude-code) and control it from `claude-code`
https://github.com/user-attachments/assets/c725d6df-1a0d-40eb-a125-ea009bf664dc
<br/><br/>
### 💬 Use BrowserOS to chat
### Use BrowserOS to chat
https://github.com/user-attachments/assets/726803c5-8e36-420e-8694-c63a2607beca
<br/><br/>
### Use BrowserOS to scrape data
### Use BrowserOS to scrape data
https://github.com/user-attachments/assets/9f038216-bc24-4555-abf1-af2adcb7ebc0
<br/><br/>
## Why We're Building BrowserOS
## Install `browseros-cli`
For the first time since Netscape pioneered the web in 1994, AI gives us the chance to completely reimagine the browser. We've seen tools like Cursor deliver 10x productivity gains for developers—yet everyday browsing remains frustratingly archaic.
Use `browseros-cli` to launch and control BrowserOS from the terminal or from AI coding agents like Claude Code.
You're likely juggling 70+ tabs, battling your browser instead of having it assist you. Routine tasks, like ordering something from amazon or filling a form should be handled seamlessly by AI agents.
**macOS / Linux:**
At BrowserOS, we're convinced that AI should empower you by automating tasks locally and securely—keeping your data private. We are building the best browser for this future!
```bash
curl -fsSL https://cdn.browseros.com/cli/install.sh | bash
```
## How we compare
**Windows:**
<details>
<summary><b>vs Chrome</b></summary>
<br>
While we're grateful for Google open-sourcing Chromium, but Chrome hasn't evolved much in 10 years. No AI features, no automation, no MCP support.
</details>
```powershell
irm https://cdn.browseros.com/cli/install.ps1 | iex
```
<details>
<summary><b>vs Brave</b></summary>
<br>
We love what Brave started, but they've spread themselves too thin with crypto, search, VPNs. We're laser-focused on AI-powered browsing.
</details>
After install, run `browseros-cli init` to connect the CLI to your running BrowserOS instance.
<details>
<summary><b>vs Arc/Dia</b></summary>
<br>
Many loved Arc, but it was closed source. When they abandoned users, there was no recourse. We're 100% open source - fork it anytime!
</details>
## LLM Providers
<details>
<summary><b>vs Perplexity Comet</b></summary>
<br>
They're a search/ad company. Your browser history becomes their product. We keep everything local.
</details>
BrowserOS works with any LLM. Bring your own keys, use OAuth, or run models locally.
<details>
<summary><b>vs ChatGPT Atlas</b></summary>
<br>
Your browsing data could be used for ads or to train their models. We keep your history and agent interactions strictly local.
</details>
| Provider | Type | Auth |
|----------|------|------|
| Kimi K2.5 | Cloud (default) | Built-in |
| ChatGPT Pro/Plus | Cloud | [OAuth](https://docs.browseros.com/features/chatgpt) |
| GitHub Copilot | Cloud | [OAuth](https://docs.browseros.com/features/github-copilot) |
| Qwen Code | Cloud | [OAuth](https://docs.browseros.com/features/qwen-code) |
| Claude (Anthropic) | Cloud | API key |
| GPT-4o / o3 (OpenAI) | Cloud | API key |
| Gemini (Google) | Cloud | API key |
| Azure OpenAI | Cloud | API key |
| AWS Bedrock | Cloud | IAM credentials |
| OpenRouter | Cloud | API key |
| Ollama | Local | [Setup](https://docs.browseros.com/features/ollama) |
| LM Studio | Local | [Setup](https://docs.browseros.com/features/lm-studio) |
## How We Compare
| | BrowserOS | Chrome | Brave | Dia | Comet | Atlas |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| Open Source | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| AI Agent | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ |
| MCP Server | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Visual Workflows | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Cowork (files + browser) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Scheduled Tasks | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Bring Your Own Keys | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Local Models (Ollama) | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Local-first Privacy | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| Ad Blocking (MV2) | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ |
**Detailed comparisons:**
- [BrowserOS vs Chrome DevTools MCP](https://docs.browseros.com/comparisons/chrome-devtools-mcp) — developer-focused comparison for browser automation
- [BrowserOS vs Claude Cowork](https://docs.browseros.com/comparisons/claude-cowork) — getting real work done with AI
- [BrowserOS vs OpenClaw](https://docs.browseros.com/comparisons/openclaw) — everyday AI assistance
## Architecture
BrowserOS is a monorepo with two main subsystems: the **browser** (Chromium fork) and the **agent platform** (TypeScript/Go).
```
BrowserOS/
├── packages/browseros/ # Chromium fork + build system (Python)
│ ├── chromium_patches/ # Patches applied to Chromium source
│ ├── build/ # Build CLI and modules
│ └── resources/ # Icons, entitlements, signing
├── packages/browseros-agent/ # Agent platform (TypeScript/Go)
│ ├── apps/
│ │ ├── server/ # MCP server + AI agent loop (Bun)
│ │ ├── agent/ # Browser extension UI (WXT + React)
│ │ ├── cli/ # CLI tool (Go)
│ │ ├── eval/ # Benchmark framework
│ │ └── controller-ext/ # Chrome API bridge extension
│ │
│ └── packages/
│ ├── agent-sdk/ # Node.js SDK (npm: @browseros-ai/agent-sdk)
│ ├── cdp-protocol/ # CDP type bindings
│ └── shared/ # Shared constants
```
| Package | What it does |
|---------|-------------|
| [`packages/browseros`](packages/browseros/) | Chromium fork — patches, build system, signing |
| [`apps/server`](packages/browseros-agent/apps/server/) | Bun server exposing 53+ MCP tools and running the AI agent loop |
| [`apps/agent`](packages/browseros-agent/apps/agent/) | Browser extension — new tab, side panel chat, onboarding, settings |
| [`apps/cli`](packages/browseros-agent/apps/cli/) | Go CLI — control BrowserOS from the terminal or AI coding agents |
| [`apps/eval`](packages/browseros-agent/apps/eval/) | Benchmark framework — WebVoyager, Mind2Web evaluation |
| [`agent-sdk`](packages/browseros-agent/packages/agent-sdk/) | Node.js SDK for browser automation with natural language |
| [`cdp-protocol`](packages/browseros-agent/packages/cdp-protocol/) | Type-safe Chrome DevTools Protocol bindings |
## Contributing
We'd love your help making BrowserOS better!
We'd love your help making BrowserOS better! See our [Contributing Guide](CONTRIBUTING.md) for details.
- 🐛 [Report bugs](https://github.com/browseros-ai/BrowserOS/issues)
- 💡 [Suggest features](https://github.com/browseros-ai/BrowserOS/issues/99)
- 💬 [Join Discord](https://discord.gg/YKwjt5vuKr)
- 🐦 [Follow on Twitter](https://x.com/browserOS_ai)
- [Report bugs](https://github.com/browseros-ai/BrowserOS/issues)
- [Suggest features](https://github.com/browseros-ai/BrowserOS/issues/99)
- [Join Discord](https://discord.gg/YKwjt5vuKr) · [Join Slack](https://dub.sh/browserOS-slack)
- [Follow on Twitter](https://x.com/browserOS_ai)
**Agent development** (TypeScript/Go) — see the [agent monorepo README](packages/browseros-agent/README.md) for setup instructions.
**Browser development** (C++/Python) — requires ~100GB disk space. See [`packages/browseros`](packages/browseros/) for build instructions.
## Credits
- [ungoogled-chromium](https://github.com/ungoogled-software/ungoogled-chromium) — BrowserOS uses some patches for enhanced privacy. Thanks to everyone behind this project!
- [The Chromium Project](https://www.chromium.org/) — at the core of BrowserOS, making it possible to exist in the first place.
## License
BrowserOS is open source under the [AGPL-3.0 license](LICENSE).
## Credits
- [ungoogled-chromium](https://github.com/ungoogled-software/ungoogled-chromium) - BrowserOS uses some patches for enhanced privacy. Thanks to everyone behind this project!
- [The Chromium Project](https://www.chromium.org/) - At the core of BrowserOS, making it possible to exist in the first place.
## Citation
If you use BrowserOS in your research or project, please cite:
```bibtex
@software{browseros2025,
author = {Sonti, Nithin and Sonti, Nikhil and {BrowserOS-team}},
title = {BrowserOS: The open-source Agentic browser},
url = {https://github.com/browseros-ai/BrowserOS},
year = {2025},
publisher = {GitHub},
license = {AGPL-3.0},
}
```
Copyright &copy; 2025 Felafax, Inc.
Copyright &copy; 2026 Felafax, Inc.
## Stargazers
Thank you to all our supporters!
[![Star History Chart](https://api.star-history.com/svg?repos=browseros-ai/BrowserOS&type=Date)](https://www.star-history.com/#browseros-ai/BrowserOS&Date)
##
<p align="center">
Built with ❤️ from San Francisco
</p>

View File

@@ -195,3 +195,4 @@ test-results/
.agent/
.llm/
.grove/
docs/plans/2026-03-24-models-dev-integration.md

View File

@@ -32,7 +32,7 @@ Use **kebab-case** for all file and folder names:
| Multi-word files | kebab-case | `gemini-agent.ts`, `mcp-context.ts` |
| Single-word files | lowercase | `types.ts`, `browser.ts`, `index.ts` |
| Test files | `.test.ts` suffix | `mcp-context.test.ts` |
| Folders | kebab-case | `controller-server/`, `rate-limiter/` |
| Folders | kebab-case | `rate-limiter/`, `browser-tools/` |
Classes remain PascalCase in code, but live in kebab-case files:
```typescript
@@ -81,6 +81,9 @@ bun run dev:server # Build server for development
bun run dev:ext # Build extension for development
bun run dist:server # Build server for production (all targets)
bun run dist:ext # Build extension for production
# Refresh models.dev data
bun run generate:models # Fetches latest from models.dev/api.json
```
## Architecture
@@ -94,21 +97,16 @@ The main MCP server that exposes browser automation tools via HTTP/SSE.
**Key components:**
- `src/tools/` - MCP tool definitions, split into:
- `cdp-based/` - Tools using Chrome DevTools Protocol (network, console, emulation, input, etc.)
- `controller-based/` - Tools using the browser extension (navigation, clicks, screenshots, tabs, history, bookmarks)
- `src/controller-server/` - WebSocket server that bridges to the browser extension
- `ControllerBridge` handles WebSocket connections with extension clients
- `ControllerContext` wraps the bridge for tool handlers
- `cdp-based/` - Tools using Chrome DevTools Protocol (navigation, DOM interaction, network, console, emulation, input, etc.)
- `src/common/` - Shared utilities (McpContext, PageCollector, browser connection, identity, db)
- `src/agent/` - AI agent functionality (Gemini adapter, rate limiting, session management)
- `src/http/` - Hono HTTP server with MCP, health, and provider routes
**Tool types:**
- CDP tools require a direct CDP connection (`--cdp-port`)
- Controller tools work via the browser extension over WebSocket
### Shared (`packages/shared`)
Shared constants, types, and configuration used by both server and extension. Avoids magic numbers.
Shared constants, types, and configuration used across packages. Avoids magic numbers.
**Structure:**
- `src/constants/` - Configuration values (ports, timeouts, limits, urls, paths)
@@ -116,22 +114,12 @@ Shared constants, types, and configuration used by both server and extension. Av
**Exports:** `@browseros/shared/constants/*`, `@browseros/shared/types/*`
### Controller Extension (`apps/controller-ext`)
Chrome extension that receives commands from the server via WebSocket.
**Entry point:** `src/background/index.ts` → `BrowserOSController`
**Structure:**
- `src/actions/` - Action handlers organized by domain (browser/, tab/, bookmark/, history/)
- `src/adapters/` - Chrome API adapters (TabAdapter, BookmarkAdapter, HistoryAdapter)
- `src/websocket/` - WebSocket client that connects to the server
### Communication Flow
```
AI Agent/MCP Client → HTTP Server (Hono) → Tool Handler
CDP (direct) ←── or ──→ WebSocket → Extension → Chrome APIs
CDP → BrowserOS / Chrome APIs
```
## Creating Packages

View File

@@ -1,8 +1,6 @@
# BrowserOS Agent
Monorepo for the BrowserOS-agent -- contains 3 packages: agent-UI, server (which contains the agent loop) and controller-extension (which is used by the tools within the agent loop).
> **⚠️ NOTE:** This is only a submodule, the main project is at -- https://github.com/browseros-ai/BrowserOS
The agent platform powering [BrowserOS](https://github.com/browseros-ai/BrowserOS) — contains the MCP server, agent UI, CLI, evaluation framework, and SDK.
## Monorepo Structure
@@ -10,24 +8,29 @@ Monorepo for the BrowserOS-agent -- contains 3 packages: agent-UI, server (which
apps/
server/ # Bun server - MCP endpoints + agent loop
agent/ # Agent UI (Chrome extension)
controller-ext/ # BrowserOS Controller (Chrome extension for chrome.* APIs)
cli/ # Go CLI for controlling BrowserOS from the terminal
eval/ # Evaluation framework for benchmarking agents
packages/
agent-sdk/ # Node.js SDK (@browseros-ai/agent-sdk)
cdp-protocol/ # Type-safe Chrome DevTools Protocol bindings
shared/ # Shared constants (ports, timeouts, limits)
```
| Package | Description |
|---------|-------------|
| `apps/server` | Bun server exposing MCP tools and running the agent loop |
| `apps/agent` | Agent UI - Chrome extension for the chat interface |
| `apps/controller-ext` | BrowserOS Controller - Chrome extension that bridges `chrome.*` APIs (tabs, bookmarks, history) to the server via WebSocket |
| `apps/agent` | Agent UI Chrome extension for the chat interface |
| `apps/cli` | Go CLI — control BrowserOS from the terminal or AI coding agents |
| `apps/eval` | Benchmark framework — WebVoyager, Mind2Web evaluation |
| `packages/agent-sdk` | Node.js SDK for browser automation with natural language |
| `packages/cdp-protocol` | Auto-generated CDP type bindings used by the server |
| `packages/shared` | Shared constants used across packages |
## Architecture
- `apps/server`: Bun server which contains the agent loop and tools.
- `apps/agent`: Agent UI (Chrome extension).
- `apps/controller-ext`: BrowserOS Controller - a Chrome extension that bridges `chrome.*` APIs to the server. Controller tools within the server communicate with this extension via WebSocket.
```
┌──────────────────────────────────────────────────────────────────────────┐
@@ -45,19 +48,19 @@ packages/
│ /health ─── Health check │
│ │
│ Tools: │
── CDP Tools (console, network, input, screenshot, ...)
└── Controller Tools (tabs, navigation, clicks, bookmarks, history)
── CDP-backed browser tools (tabs, navigation, input, screenshots, │
bookmarks, history, console, DOM, tab groups, windows, ...)
└──────────────────────────────────────────────────────────────────────────┘
│ CDP (client)WebSocket (server)
┌─────────────────────┐ ┌─────────────────────────────────────┐
Chromium CDP BrowserOS Controller Extension
(cdpPort: 9000) │ (extensionPort: 9300)
│ Server connects Bridges chrome.tabs, chrome.history
│ TO this as client │ chrome.bookmarks to the server
└─────────────────────┘ └─────────────────────────────────────┘
CDP (client)
─────────────────────┐
Chromium CDP
(cdpPort: 9000) │
│ │
Server connects
│ TO this as client
─────────────────────┘
```
### Ports
@@ -66,7 +69,7 @@ packages/
|------|--------------|---------|
| 9100 | `BROWSEROS_SERVER_PORT` | HTTP server - MCP endpoints, agent chat, health |
| 9000 | `BROWSEROS_CDP_PORT` | Chromium CDP server (BrowserOS Server connects as client) |
| 9300 | `BROWSEROS_EXTENSION_PORT` | WebSocket server for controller extension |
| 9300 | `BROWSEROS_EXTENSION_PORT` | Legacy BrowserOS launch arg kept for compatibility; not used by the server |
## Development
@@ -90,9 +93,8 @@ process-compose up
The `process-compose up` command runs the following in order:
1. `bun install` — installs dependencies
2. `bun --cwd apps/controller-ext build` — builds the controller extension
3. `bun --cwd apps/agent codegen` — generates agent code
4. `bun --cwd apps/server start` and `bun --cwd apps/agent dev` — starts server and agent in parallel
2. `bun --cwd apps/agent codegen` — generates agent code
3. `bun --cwd apps/server start` and `bun --cwd apps/agent dev` — starts server and agent in parallel
### Environment Variables
@@ -108,7 +110,7 @@ Runtime uses `.env.development`, while production artifact builds use `.env.prod
|----------|---------|-------------|
| `BROWSEROS_SERVER_PORT` | 9100 | HTTP server port (MCP, chat, health) |
| `BROWSEROS_CDP_PORT` | 9000 | Chromium CDP port (server connects as client) |
| `BROWSEROS_EXTENSION_PORT` | 9300 | WebSocket port for controller extension |
| `BROWSEROS_EXTENSION_PORT` | 9300 | Legacy BrowserOS launch arg kept for compatibility |
| `BROWSEROS_CONFIG_URL` | - | Remote config endpoint for rate limits |
| `BROWSEROS_INSTALL_ID` | - | Unique installation identifier (analytics) |
| `BROWSEROS_CLIENT_ID` | - | Client identifier (analytics) |
@@ -140,7 +142,7 @@ Copy from `apps/server/.env.production.example` before running `build:server`.
|----------|---------|-------------|
| `BROWSEROS_SERVER_PORT` | 9100 | Passed to BrowserOS via CLI args |
| `BROWSEROS_CDP_PORT` | 9000 | Passed to BrowserOS via CLI args |
| `BROWSEROS_EXTENSION_PORT` | 9300 | Passed to BrowserOS via CLI args |
| `BROWSEROS_EXTENSION_PORT` | 9300 | Legacy BrowserOS CLI arg still passed for compatibility |
| `VITE_BROWSEROS_SERVER_PORT` | 9100 | Agent UI connects to server (must match `BROWSEROS_SERVER_PORT`) |
| `BROWSEROS_BINARY` | - | Path to BrowserOS binary |
| `USE_BROWSEROS_BINARY` | true | Use BrowserOS instead of default Chrome |
@@ -157,15 +159,13 @@ bun run start:server # Start the server
bun run start:agent # Start agent extension (dev mode)
# Build
bun run build # Build server, agent, and controller extension
bun run build # Build server and agent
bun run build:server # Build production server resource artifacts and upload zips to R2
bun run build:agent # Build agent extension
bun run build:ext # Build controller extension
# Test
bun run test # Run standard tests
bun run test:cdp # Run CDP-based tests
bun run test:controller # Run controller-based tests
bun run test:integration # Run integration tests
# Quality

View File

@@ -0,0 +1,19 @@
# BrowserOS Agent Extension
## v0.0.98 (2026-03-27)
## What's Changed
- chore: update agent version (#608)
- chore: fix version number for extension (#606)
- fix: improve chat history freshness and reduce query payload (#598)
- feat: isolate new-tab agent navigation from origin tab (#593)
- docs: overhaul READMEs across all major packages (#594)
- fix(ui): resolve MCP promo banner dismiss button overlapping with text (#581)
- docs: update agent extension changelog for v0.0.52 (#573)
## v0.0.52 (2026-03-26)
Initial release

View File

@@ -1,16 +1,24 @@
# BrowserOS Agent Chrome Extension
# BrowserOS Agent Extension
The official Chrome extension for BrowserOS Agent, providing the UI layer for interacting with BrowserOS Core and Controllers. This extension enables intelligent browser automation, AI-powered search, and seamless integration with multiple LLM providers.
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](../../../../LICENSE)
The built-in browser extension that powers BrowserOS's AI interface — new tab with unified search, side panel chat, onboarding, and settings. Built with [WXT](https://wxt.dev) and React.
> For user-facing feature documentation, see [docs.browseros.com](https://docs.browseros.com).
## Features
- **AI-Powered New Tab**: Custom new tab page with unified search across Google and AI assistants
- **Side Panel Chat**: Full-featured chat interface for interacting with BrowserOS Core
- **Side Panel Chat**: Full-featured chat interface for interacting with BrowserOS
- **Multi-Provider Support**: Connect to various LLM providers (OpenAI, Anthropic, Azure, Bedrock, and more)
- **MCP Integration**: Model Context Protocol support for extending AI capabilities
- **Visual Feedback**: Animated glow effect on tabs during AI agent operations
- **Privacy-First**: Local data handling with configurable provider settings
## How It Connects
The extension communicates with the [BrowserOS Server](../../apps/server/) running locally. The server handles the AI agent loop, MCP tools, and CDP connections — the extension provides the UI layer.
## Project Structure
```
@@ -80,47 +88,20 @@ Settings dashboard with multiple sections:
Content script that creates a visual indicator (pulsing orange glow) around the browser viewport when an AI agent is actively working on a tab.
## How Tools Are Used
### Bun
Bun is the exclusive runtime and package manager:
- All scripts use `bun run <script>` instead of npm
- Package installation via `bun install`
- Environment files automatically loaded (no dotenv needed)
- Enforced via `engines` field in `package.json`
```bash
bun install # Install dependencies
bun run dev # Development mode
bun run build # Production build
bun run lint # Run Biome linting
```
### Biome
Unified linter and formatter configured in `biome.json`:
- **Formatting**: 2-space indentation, single quotes, no semicolons
- **Linting**: Recommended rules plus custom rules for unused imports/variables
- **CSS Support**: Tailwind directives parsing enabled
- **Import Organization**: Automatic import sorting via assist actions
```bash
bun run lint # Check for issues
bun run lint:fix # Auto-fix issues
```
## Development
### Prerequisites
- [Bun](https://bun.sh) installed
- Chrome or Chromium-based browser
- BrowserOS Core running locally (for full functionality)
- BrowserOS Server running locally (for full functionality)
### Setup
```bash
# Copy environment file
cp .env.example .env.development
# Install dependencies
bun install
@@ -153,12 +134,30 @@ SENTRY_AUTH_TOKEN=your-token
### GraphQL Schema
Codegen requires a GraphQL schema. By default it uses the bundled `schema/schema.graphql`, so no extra setup is needed. If you have access to the original API source, you can set the following environment variable
Codegen requires a GraphQL schema. By default it uses the bundled `schema/schema.graphql`, so no extra setup is needed. If you have access to the original API source, you can set the following environment variable:
```env
GRAPHQL_SCHEMA_PATH=/path/to/api-repo/.../schema.graphql
```
## Development Tooling
### Bun
Bun is the exclusive runtime and package manager:
- All scripts use `bun run <script>` instead of npm
- Package installation via `bun install`
- Environment files automatically loaded (no dotenv needed)
- Enforced via `engines` field in `package.json`
### Biome
Unified linter and formatter configured in `biome.json`:
- **Formatting**: 2-space indentation, single quotes, no semicolons
- **Linting**: Recommended rules plus custom rules for unused imports/variables
- **CSS Support**: Tailwind directives parsing enabled
- **Import Organization**: Automatic import sorting via assist actions
## Scripts
| Script | Description |
@@ -169,4 +168,5 @@ GRAPHQL_SCHEMA_PATH=/path/to/api-repo/.../schema.graphql
| `bun run lint` | Run Biome linter |
| `bun run lint:fix` | Auto-fix linting issues |
| `bun run typecheck` | Run TypeScript type checking |
| `bun run codegen` | Generate GraphQL types |
| `bun run clean:cache` | Clear build caches |

View File

@@ -66,7 +66,7 @@ export const RunResultDialog: FC<RunResultDialogProps> = ({
return (
<Dialog open={!!run} onOpenChange={onOpenChange}>
<DialogContent className="sm:max-w-2xl">
<DialogContent className="sm:w-[70vw] sm:max-w-4xl">
<DialogHeader>
<DialogTitle className="flex items-center gap-2">
{run.status === 'completed' ? (
@@ -94,7 +94,7 @@ export const RunResultDialog: FC<RunResultDialogProps> = ({
<p className="text-destructive text-sm">{run.result}</p>
</div>
) : run.result ? (
<div className="prose prose-sm dark:prose-invert [&_[data-streamdown='code-block']]:!w-full [&_[data-streamdown='table-wrapper']]:!w-full max-w-none break-words rounded-lg border border-border bg-muted/50 p-4">
<div className="prose prose-sm dark:prose-invert [&_[data-streamdown='code-block']]:!w-full [&_[data-streamdown='table-wrapper']]:!w-full max-w-none break-words rounded-lg border border-border bg-muted/50 p-4 [&_[data-streamdown='table-wrapper']]:overflow-x-auto">
<MessageResponse>{run.result}</MessageResponse>
</div>
) : (

View File

@@ -14,7 +14,7 @@ export const CreditBadge: FC<CreditBadgeProps> = ({ credits, onClick }) => {
type="button"
onClick={onClick}
className={cn(
'inline-flex items-center gap-1 rounded-md px-1.5 py-0.5 font-medium text-xs transition-colors hover:bg-muted/50',
'inline-flex cursor-pointer items-center gap-1 rounded-md px-1.5 py-0.5 font-medium text-xs transition-colors hover:bg-muted/50',
getCreditTextColor(credits),
)}
title={`${credits} credits remaining`}

View File

@@ -17,7 +17,7 @@ export const McpPromoBanner: FC = () => {
}
return (
<div className="relative flex items-center gap-4 rounded-xl border border-border bg-card p-4 shadow-sm transition-all hover:shadow-md">
<div className="flex items-center gap-4 rounded-xl border border-border bg-card p-4 shadow-sm transition-all hover:shadow-md">
<div className="flex h-10 w-10 shrink-0 items-center justify-center rounded-lg bg-[var(--accent-orange)]/10">
<Server className="h-5 w-5 text-[var(--accent-orange)]" />
</div>
@@ -48,7 +48,7 @@ export const McpPromoBanner: FC = () => {
<button
type="button"
onClick={() => setDismissed(true)}
className="absolute top-2 right-2 rounded-sm p-1 text-muted-foreground opacity-50 transition-opacity hover:opacity-100"
className="shrink-0 rounded-sm p-1 text-muted-foreground opacity-50 transition-opacity hover:opacity-100"
>
<X className="h-3.5 w-3.5" />
</button>

View File

@@ -1,10 +1,26 @@
import { zodResolver } from '@hookform/resolvers/zod'
import { CheckCircle2, ExternalLink, Loader2, XCircle } from 'lucide-react'
import { type FC, useEffect, useState } from 'react'
import Fuse from 'fuse.js'
import {
Check,
CheckCircle2,
ChevronDown,
ExternalLink,
Loader2,
XCircle,
} from 'lucide-react'
import { type FC, useEffect, useMemo, useState } from 'react'
import { useForm } from 'react-hook-form'
import { z } from 'zod/v3'
import { Button } from '@/components/ui/button'
import { Checkbox } from '@/components/ui/checkbox'
import {
Command,
CommandEmpty,
CommandGroup,
CommandInput,
CommandItem,
CommandList,
} from '@/components/ui/command'
import {
Dialog,
DialogContent,
@@ -23,6 +39,11 @@ import {
FormMessage,
} from '@/components/ui/form'
import { Input } from '@/components/ui/input'
import {
Popover,
PopoverContent,
PopoverTrigger,
} from '@/components/ui/popover'
import {
Select,
SelectContent,
@@ -35,8 +56,10 @@ import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
import { useCapabilities } from '@/lib/browseros/useCapabilities'
import {
AI_PROVIDER_ADDED_EVENT,
AI_PROVIDER_UPDATED_EVENT,
KIMI_API_KEY_CONFIGURED_EVENT,
KIMI_API_KEY_GUIDE_CLICKED_EVENT,
MODEL_SELECTED_EVENT,
} from '@/lib/constants/analyticsEvents'
import { useKimiLaunch } from '@/lib/feature-flags/useKimiLaunch'
import {
@@ -47,7 +70,8 @@ import {
import { type TestResult, testProvider } from '@/lib/llm-providers/testProvider'
import type { LlmProviderConfig, ProviderType } from '@/lib/llm-providers/types'
import { track } from '@/lib/metrics/track'
import { getModelContextLength, getModelOptions } from './models'
import { cn } from '@/lib/utils'
import { getModelContextLength, getModelsForProvider } from './models'
const providerTypeEnum = z.enum([
'moonshot',
@@ -163,6 +187,13 @@ export const providerFormSchema = z
*/
export type ProviderFormValues = z.infer<typeof providerFormSchema>
function formatContextWindow(tokens: number): string {
if (tokens >= 1000000)
return `${(tokens / 1000000).toFixed(tokens % 1000000 === 0 ? 0 : 1)}M`
if (tokens >= 1000) return `${Math.round(tokens / 1000)}K`
return `${tokens}`
}
/**
* Props for NewProviderDialog
* @public
@@ -188,9 +219,10 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
initialValues,
onSave,
}) => {
const [isCustomModel, setIsCustomModel] = useState(false)
const [isTesting, setIsTesting] = useState(false)
const [testResult, setTestResult] = useState<TestResult | null>(null)
const [modelPickerOpen, setModelPickerOpen] = useState(false)
const [modelSearch, setModelSearch] = useState('')
const { supports } = useCapabilities()
const { baseUrl: agentServerUrl } = useAgentServerUrl()
const kimiLaunch = useKimiLaunch()
@@ -261,8 +293,21 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
watchedSessionToken,
])
// Get model options for current provider type
const modelOptions = getModelOptions(watchedType as ProviderType)
const modelInfoList = getModelsForProvider(watchedType as ProviderType)
const modelFuse = useMemo(
() =>
new Fuse(modelInfoList, {
keys: ['modelId'],
threshold: 0.4,
distance: 100,
}),
[modelInfoList],
)
const filteredModels = modelSearch
? modelFuse.search(modelSearch).map((r) => r.item)
: modelInfoList
// Handle provider type change (user-initiated via Select)
const handleTypeChange = (newType: ProviderType) => {
@@ -272,14 +317,13 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
form.setValue('baseUrl', defaultUrl)
}
form.setValue('modelId', '')
setIsCustomModel(false)
}
// Auto-fill context window when model changes (only for new providers)
useEffect(() => {
if (initialValues?.id) return
if (watchedModelId && watchedModelId !== 'custom') {
if (watchedModelId) {
const contextLength = getModelContextLength(
watchedType as ProviderType,
watchedModelId,
@@ -290,17 +334,6 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
}
}, [watchedModelId, watchedType, form, initialValues?.id])
// Handle model selection (including custom option)
const handleModelChange = (value: string) => {
if (value === 'custom') {
setIsCustomModel(true)
form.setValue('modelId', '')
} else {
setIsCustomModel(false)
form.setValue('modelId', value)
}
}
// Reset form when initialValues change
useEffect(() => {
if (initialValues) {
@@ -325,7 +358,6 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
reasoningEffort: initialValues.reasoningEffort || 'high',
reasoningSummary: initialValues.reasoningSummary || 'auto',
})
setIsCustomModel(false)
}
}, [initialValues, form])
@@ -352,7 +384,6 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
reasoningEffort: 'high',
reasoningSummary: 'auto',
})
setIsCustomModel(false)
}
// Clear test result when dialog opens/closes
setTestResult(null)
@@ -373,6 +404,11 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
provider_type: values.type,
model: values.modelId,
})
} else {
track(AI_PROVIDER_UPDATED_EVENT, {
provider_type: values.type,
model: values.modelId,
})
}
if (values.type === 'moonshot') {
track(KIMI_API_KEY_CONFIGURED_EVENT, {
@@ -811,52 +847,110 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
control={form.control}
name="modelId"
render={({ field }) => (
<FormItem>
<FormItem className="flex flex-col">
<FormLabel>Model *</FormLabel>
{isCustomModel || modelOptions.length === 1 ? (
<>
<FormControl>
<Input
placeholder={
watchedType === 'azure'
? 'Enter your deployment name'
: watchedType === 'bedrock'
? 'e.g., anthropic.claude-3-5-sonnet-20241022-v2:0'
: 'Enter custom model ID'
}
{...field}
/>
</FormControl>
{modelOptions.length > 1 && (
<Button
type="button"
variant="link"
size="sm"
className="h-auto p-0 text-xs"
onClick={() => setIsCustomModel(false)}
>
Back to model list
</Button>
)}
</>
{modelInfoList.length === 0 ? (
<FormControl>
<Input
placeholder={
watchedType === 'azure'
? 'Enter your deployment name'
: watchedType === 'bedrock'
? 'e.g., anthropic.claude-3-5-sonnet-20241022-v2:0'
: 'Enter model ID'
}
{...field}
/>
</FormControl>
) : (
<Select
onValueChange={handleModelChange}
value={field.value}
<Popover
open={modelPickerOpen}
onOpenChange={(isOpen) => {
setModelPickerOpen(isOpen)
if (!isOpen) setModelSearch('')
}}
>
<FormControl>
<SelectTrigger className="w-full">
<SelectValue placeholder="Select a model" />
</SelectTrigger>
</FormControl>
<SelectContent>
{modelOptions.map((modelId) => (
<SelectItem key={modelId} value={modelId}>
{modelId === 'custom' ? '+ Custom model' : modelId}
</SelectItem>
))}
</SelectContent>
</Select>
<PopoverTrigger asChild>
<button
type="button"
className={cn(
'flex h-9 w-full items-center justify-between rounded-md border border-input bg-transparent px-3 py-1 text-sm shadow-xs',
field.value
? 'text-foreground'
: 'text-muted-foreground',
)}
>
<span className="truncate">
{field.value || 'Select a model...'}
</span>
<ChevronDown className="ml-2 h-4 w-4 shrink-0 opacity-50" />
</button>
</PopoverTrigger>
<PopoverContent
className="w-[var(--radix-popover-trigger-width)] p-0"
align="start"
>
<Command shouldFilter={false}>
<CommandInput
placeholder="Search models..."
value={modelSearch}
onValueChange={setModelSearch}
onKeyDown={(e) => {
if (
e.key === 'Enter' &&
modelSearch &&
filteredModels.length === 0
) {
e.preventDefault()
form.setValue('modelId', modelSearch)
track(MODEL_SELECTED_EVENT, {
provider_type: watchedType,
model_id: modelSearch,
is_custom_model: true,
})
setModelPickerOpen(false)
setModelSearch('')
}
}}
/>
<CommandList>
<CommandEmpty>
No models found. Press Enter to use &quot;
{modelSearch}&quot;
</CommandEmpty>
<CommandGroup>
{filteredModels.map((model) => (
<CommandItem
key={model.modelId}
value={model.modelId}
onSelect={() => {
form.setValue('modelId', model.modelId)
track(MODEL_SELECTED_EVENT, {
provider_type: watchedType,
model_id: model.modelId,
context_window: model.contextLength,
is_custom_model: false,
})
setModelPickerOpen(false)
setModelSearch('')
}}
>
<span className="flex-1 truncate">
{model.modelId}
</span>
<span className="ml-2 shrink-0 rounded-md bg-muted px-1.5 py-0.5 font-mono text-[10px] text-muted-foreground">
{formatContextWindow(model.contextLength)}
</span>
{field.value === model.modelId && (
<Check className="ml-2 h-4 w-4 shrink-0" />
)}
</CommandItem>
))}
</CommandGroup>
</CommandList>
</Command>
</PopoverContent>
</Popover>
)}
<FormMessage />
</FormItem>

View File

@@ -1,98 +1,21 @@
import {
getModelsDevModels,
type ModelsDevModel,
} from '@/lib/llm-providers/models-dev'
import type { ProviderType } from '@/lib/llm-providers/types'
/**
* Model information with context length
*/
export interface ModelInfo {
modelId: string
contextLength: number
supportsImages?: boolean
supportsReasoning?: boolean
supportsToolCall?: boolean
}
/**
* Models data organized by provider type (matches backend AIProvider enum)
*/
export interface ModelsData {
anthropic: ModelInfo[]
openai: ModelInfo[]
'openai-compatible': ModelInfo[]
google: ModelInfo[]
openrouter: ModelInfo[]
azure: ModelInfo[]
ollama: ModelInfo[]
lmstudio: ModelInfo[]
bedrock: ModelInfo[]
browseros: ModelInfo[]
moonshot: ModelInfo[]
'chatgpt-pro': ModelInfo[]
'github-copilot': ModelInfo[]
'qwen-code': ModelInfo[]
}
/**
* Available models per provider with context lengths
* Based on: https://github.com/browseros-ai/BrowserOS-agent/blob/main/src/options/data/models.ts
*/
export const MODELS_DATA: ModelsData = {
moonshot: [{ modelId: 'kimi-k2.5', contextLength: 200000 }],
anthropic: [
{ modelId: 'claude-opus-4-5-20251101', contextLength: 200000 },
{ modelId: 'claude-haiku-4-5-20251001', contextLength: 200000 },
{ modelId: 'claude-sonnet-4-5-20250929', contextLength: 200000 },
{ modelId: 'claude-sonnet-4-20250514', contextLength: 200000 },
{ modelId: 'claude-opus-4-20250514', contextLength: 200000 },
{ modelId: 'claude-3-7-sonnet-20250219', contextLength: 200000 },
{ modelId: 'claude-3-5-haiku-20241022', contextLength: 200000 },
],
openai: [
{ modelId: 'gpt-5.2', contextLength: 200000 },
{ modelId: 'gpt-5.2-pro', contextLength: 200000 },
{ modelId: 'gpt-5', contextLength: 200000 },
{ modelId: 'gpt-5-mini', contextLength: 200000 },
{ modelId: 'gpt-5-nano', contextLength: 200000 },
{ modelId: 'gpt-4.1', contextLength: 200000 },
{ modelId: 'gpt-4.1-mini', contextLength: 200000 },
{ modelId: 'o4-mini', contextLength: 200000 },
{ modelId: 'o3-mini', contextLength: 200000 },
{ modelId: 'gpt-4o', contextLength: 128000 },
{ modelId: 'gpt-4o-mini', contextLength: 128000 },
],
'openai-compatible': [],
google: [
{ modelId: 'gemini-3-pro-preview', contextLength: 1048576 },
{ modelId: 'gemini-3-flash-preview', contextLength: 1048576 },
{ modelId: 'gemini-2.5-flash', contextLength: 1048576 },
{ modelId: 'gemini-2.5-pro', contextLength: 1048576 },
],
openrouter: [
{ modelId: 'google/gemini-3-pro-preview', contextLength: 1048576 },
{ modelId: 'google/gemini-3-flash-preview', contextLength: 1048576 },
{ modelId: 'google/gemini-2.5-flash', contextLength: 1048576 },
{ modelId: 'anthropic/claude-opus-4.5', contextLength: 200000 },
{ modelId: 'anthropic/claude-haiku-4.5', contextLength: 200000 },
{ modelId: 'anthropic/claude-sonnet-4.5', contextLength: 200000 },
{ modelId: 'anthropic/claude-sonnet-4', contextLength: 200000 },
{ modelId: 'anthropic/claude-3.7-sonnet', contextLength: 200000 },
{ modelId: 'openai/gpt-4o', contextLength: 128000 },
{ modelId: 'openai/gpt-oss-120b', contextLength: 128000 },
{ modelId: 'openai/gpt-oss-20b', contextLength: 128000 },
{ modelId: 'qwen/qwen3-14b', contextLength: 131072 },
{ modelId: 'qwen/qwen3-8b', contextLength: 131072 },
],
azure: [],
ollama: [
{ modelId: 'qwen3:4b', contextLength: 262144 },
{ modelId: 'qwen3:8b', contextLength: 40960 },
{ modelId: 'qwen3:14b', contextLength: 40960 },
{ modelId: 'gpt-oss:20b', contextLength: 128000 },
{ modelId: 'gpt-oss:120b', contextLength: 128000 },
],
lmstudio: [
{ modelId: 'openai/gpt-oss-20b', contextLength: 128000 },
{ modelId: 'openai/gpt-oss-120b', contextLength: 128000 },
{ modelId: 'qwen/qwen3-vl-8b', contextLength: 131072 },
],
bedrock: [],
const CUSTOM_PROVIDER_MODELS: Partial<Record<ProviderType, ModelInfo[]>> = {
browseros: [{ modelId: 'browseros-auto', contextLength: 200000 }],
'openai-compatible': [],
ollama: [],
'chatgpt-pro': [
{ modelId: 'gpt-5.4', contextLength: 400000 },
{ modelId: 'gpt-5.3-codex', contextLength: 400000 },
@@ -103,32 +26,6 @@ export const MODELS_DATA: ModelsData = {
{ modelId: 'gpt-5.1-codex-mini', contextLength: 400000 },
{ modelId: 'gpt-5.1', contextLength: 200000 },
],
'github-copilot': [
// Free tier (unlimited with Pro)
{ modelId: 'gpt-5-mini', contextLength: 128000 },
{ modelId: 'claude-haiku-4.5', contextLength: 128000 },
{ modelId: 'gpt-4o', contextLength: 64000 },
{ modelId: 'gpt-4.1', contextLength: 64000 },
// Premium models (Pro: 300/mo, Pro+: 1500/mo)
{ modelId: 'claude-sonnet-4.6', contextLength: 128000 },
{ modelId: 'claude-sonnet-4.5', contextLength: 128000 },
{ modelId: 'claude-sonnet-4', contextLength: 128000 },
{ modelId: 'claude-opus-4.6', contextLength: 128000 },
{ modelId: 'claude-opus-4.5', contextLength: 128000 },
{ modelId: 'gemini-2.5-pro', contextLength: 128000 },
{ modelId: 'gemini-3-pro-preview', contextLength: 128000 },
{ modelId: 'gemini-3-flash-preview', contextLength: 128000 },
{ modelId: 'gemini-3.1-pro-preview', contextLength: 128000 },
{ modelId: 'gpt-5.4', contextLength: 272000 },
{ modelId: 'gpt-5.4-mini', contextLength: 128000 },
{ modelId: 'gpt-5.3-codex', contextLength: 272000 },
{ modelId: 'gpt-5.2-codex', contextLength: 272000 },
{ modelId: 'gpt-5.2', contextLength: 128000 },
{ modelId: 'gpt-5.1-codex', contextLength: 128000 },
{ modelId: 'gpt-5.1-codex-max', contextLength: 128000 },
{ modelId: 'gpt-5.1', contextLength: 128000 },
{ modelId: 'grok-code-fast-1', contextLength: 128000 },
],
'qwen-code': [
{ modelId: 'coder-model', contextLength: 1000000 },
{ modelId: 'qwen3-coder-plus', contextLength: 1000000 },
@@ -137,25 +34,23 @@ export const MODELS_DATA: ModelsData = {
],
}
/**
* Get models for a specific provider type
*/
function fromModelsDevModel(m: ModelsDevModel): ModelInfo {
return {
modelId: m.id,
contextLength: m.contextWindow,
supportsImages: m.supportsImages,
supportsReasoning: m.supportsReasoning,
supportsToolCall: m.supportsToolCall,
}
}
export function getModelsForProvider(providerType: ProviderType): ModelInfo[] {
return MODELS_DATA[providerType] || []
const custom = CUSTOM_PROVIDER_MODELS[providerType]
if (custom !== undefined) return custom
return getModelsDevModels(providerType).map(fromModelsDevModel)
}
/**
* Get model options for select dropdown (model IDs + custom option)
*/
export function getModelOptions(providerType: ProviderType): string[] {
const models = getModelsForProvider(providerType)
const modelIds = models.map((m) => m.modelId)
return modelIds.length > 0 ? [...modelIds, 'custom'] : ['custom']
}
/**
* Get context length for a specific model
*/
export function getModelContextLength(
providerType: ProviderType,
modelId: string,
@@ -164,14 +59,3 @@ export function getModelContextLength(
const model = models.find((m) => m.modelId === modelId)
return model?.contextLength
}
/**
* Check if model ID is a custom (user-entered) value
*/
export function isCustomModel(
providerType: ProviderType,
modelId: string,
): boolean {
const models = getModelsForProvider(providerType)
return !models.some((m) => m.modelId === modelId)
}

View File

@@ -1,5 +1,5 @@
import { useQueryClient } from '@tanstack/react-query'
import localforage from 'localforage'
import { clear } from 'idb-keyval'
import { Loader2 } from 'lucide-react'
import type { FC } from 'react'
import { useEffect } from 'react'
@@ -25,7 +25,7 @@ export const LogoutPage: FC = () => {
await providersStorage.removeValue()
await scheduledJobStorage.removeValue()
queryClient.clear()
await localforage.clear()
await clear()
resetIdentity()
await signOut()

View File

@@ -169,8 +169,15 @@ export const NewTabChat: FC = () => {
onDismissJtbdPopup={() => {}}
/>
)}
{agentUrlError && <ChatError error={agentUrlError} />}
{chatError && <ChatError error={chatError} />}
{agentUrlError && (
<ChatError
error={agentUrlError}
providerType={selectedProvider?.type}
/>
)}
{chatError && (
<ChatError error={chatError} providerType={selectedProvider?.type} />
)}
</main>
<div className="mx-auto w-full max-w-3xl flex-shrink-0 px-4 pb-2">

View File

@@ -32,6 +32,7 @@ const RemoteChatHistory: FC<{ userId: string }> = ({ userId }) => {
const {
data: graphqlData,
isLoading: isLoadingConversations,
isFetching,
hasNextPage,
isFetchingNextPage,
fetchNextPage,
@@ -112,6 +113,7 @@ const RemoteChatHistory: FC<{ userId: string }> = ({ userId }) => {
hasNextPage={hasNextPage}
isFetchingNextPage={isFetchingNextPage}
onLoadMore={fetchNextPage}
isRefreshing={isFetching && !isLoadingConversations}
/>
)
}

View File

@@ -12,6 +12,7 @@ interface ConversationListProps {
hasNextPage?: boolean
isFetchingNextPage?: boolean
onLoadMore?: () => void
isRefreshing?: boolean
}
export const ConversationList: FC<ConversationListProps> = ({
@@ -21,6 +22,7 @@ export const ConversationList: FC<ConversationListProps> = ({
hasNextPage,
isFetchingNextPage,
onLoadMore,
isRefreshing,
}) => {
const loadMoreRef = useRef<HTMLDivElement>(null)
@@ -57,6 +59,12 @@ export const ConversationList: FC<ConversationListProps> = ({
return (
<main className="mt-4 flex h-full flex-1 flex-col space-y-4 overflow-y-auto">
<div className="w-full p-3">
{isRefreshing && (
<div className="flex items-center justify-center gap-2 pb-3 text-muted-foreground text-xs">
<Loader2 className="h-3 w-3 animate-spin" />
<span>Fetching latest conversations</span>
</div>
)}
{!hasConversations ? (
<div className="flex flex-col items-center justify-center py-12 text-center">
<MessageSquare className="mb-3 h-10 w-10 text-muted-foreground/50" />

View File

@@ -11,7 +11,7 @@ export const GetConversationsForHistoryDocument = graphql(`
nodes {
rowId
lastMessagedAt
conversationMessages(last: 5, orderBy: ORDER_INDEX_ASC) {
conversationMessages(first: 2, orderBy: ORDER_INDEX_DESC) {
nodes {
message
}

View File

@@ -224,7 +224,12 @@ export const Chat = () => {
onDismissJtbdPopup={onDismissJtbdPopup}
/>
)}
{agentUrlError && <ChatError error={agentUrlError} />}
{agentUrlError && (
<ChatError
error={agentUrlError}
providerType={selectedProvider?.type}
/>
)}
{chatError && (
<ChatError error={chatError} providerType={selectedProvider?.type} />
)}

View File

@@ -34,11 +34,9 @@ function parseErrorMessage(
} {
const isBrowserosProvider = providerType === 'browseros'
// Detect MCP server connection failures (universal — affects all providers)
if (
(message.includes('Failed to fetch') || message.includes('fetch failed')) &&
message.includes('127.0.0.1')
) {
// All chat requests go through the local BrowserOS agent server, so any
// fetch failure is always a local connection issue.
if (message.includes('Failed to fetch') || message.includes('fetch failed')) {
return {
text: 'Unable to connect to BrowserOS agent. Follow below instructions.',
url: 'https://docs.browseros.com/troubleshooting/connection-issues',

View File

@@ -76,8 +76,6 @@ export interface ChatSessionOptions {
isIntegrationsSynced?: boolean
}
const NEWTAB_SYSTEM_PROMPT = `IMPORTANT: The user is chatting from the New Tab page. When performing browser actions, ALWAYS open content in a NEW TAB rather than navigating the current tab. The user's new tab page should remain accessible.`
export const useChatSession = (options?: ChatSessionOptions) => {
const {
selectedLlmProviderRef,
@@ -344,12 +342,8 @@ export const useChatSession = (options?: ChatSessionOptions) => {
reasoningEffort: provider?.reasoningEffort,
reasoningSummary: provider?.reasoningSummary,
browserContext,
userSystemPrompt:
options?.origin === 'newtab'
? [personalizationRef.current, NEWTAB_SYSTEM_PROMPT]
.filter(Boolean)
.join('\n\n')
: personalizationRef.current,
origin: options?.origin ?? 'sidepanel',
userSystemPrompt: personalizationRef.current,
userWorkingDir: workingDirRef.current,
supportsImages: provider?.supportsImages,
previousConversation,
@@ -567,9 +561,11 @@ export const useChatSession = (options?: ChatSessionOptions) => {
}, [])
const handleSelectProvider = (provider: Provider) => {
const fullProvider = llmProviders.find((p) => p.id === provider.id)
track(PROVIDER_SELECTED_EVENT, {
provider_id: provider.id,
provider_type: provider.type,
model_id: fullProvider?.modelId,
})
setDefaultProvider(provider.id)
}

View File

@@ -29,6 +29,12 @@ export const CONVERSATION_RESET_EVENT = 'ui.conversation.reset'
/** @public */
export const AI_PROVIDER_ADDED_EVENT = 'settings.ai_provider.added'
/** @public */
export const AI_PROVIDER_UPDATED_EVENT = 'settings.ai_provider.updated'
/** @public */
export const MODEL_SELECTED_EVENT = 'settings.model.selected'
/** @public */
export const CHATGPT_PRO_OAUTH_STARTED_EVENT =
'settings.chatgpt_pro.oauth_started'

View File

@@ -1,7 +1,10 @@
import { createAsyncStoragePersister } from '@tanstack/query-async-storage-persister'
import { QueryClient } from '@tanstack/react-query'
import { PersistQueryClientProvider } from '@tanstack/react-query-persist-client'
import localforage from 'localforage'
import {
type AsyncStorage,
PersistQueryClientProvider,
} from '@tanstack/react-query-persist-client'
import { del, get, set } from 'idb-keyval'
import type { FC, ReactNode } from 'react'
const queryClient = new QueryClient({
@@ -12,8 +15,14 @@ const queryClient = new QueryClient({
},
})
const idbStorage: AsyncStorage<string> = {
getItem: (key: string) => get<string>(key).then((v) => v ?? null),
setItem: (key: string, value: string) => set(key, value),
removeItem: (key: string) => del(key),
}
const asyncStoragePersister = createAsyncStoragePersister({
storage: localforage,
storage: idbStorage,
})
export const QueryProvider: FC<{ children: ReactNode }> = ({ children }) => {

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,35 @@
import data from './models-dev-data.json'
export interface ModelsDevModel {
id: string
name: string
contextWindow: number
maxOutput: number
supportsImages: boolean
supportsReasoning: boolean
supportsToolCall: boolean
inputCost?: number
outputCost?: number
}
export interface ModelsDevProvider {
name: string
api?: string
doc: string
models: ModelsDevModel[]
}
const modelsDevData: Record<string, ModelsDevProvider> = data as Record<
string,
ModelsDevProvider
>
export function getModelsDevProvider(
providerId: string,
): ModelsDevProvider | undefined {
return modelsDevData[providerId]
}
export function getModelsDevModels(providerId: string): ModelsDevModel[] {
return modelsDevData[providerId]?.models ?? []
}

View File

@@ -1,3 +1,4 @@
import { getModelsDevProvider } from './models-dev'
import type { ProviderType } from './types'
/**
@@ -15,6 +16,30 @@ export interface ProviderTemplate {
apiKeyUrl?: string
}
function enrichTemplate(
providerId: ProviderType,
overrides: {
defaultModelId: string
defaultBaseUrl?: string
apiKeyUrl?: string
setupGuideUrl?: string
},
): ProviderTemplate {
const provider = getModelsDevProvider(providerId)
const model = provider?.models.find((m) => m.id === overrides.defaultModelId)
return {
id: providerId,
name: provider?.name ?? providerId,
defaultBaseUrl: overrides.defaultBaseUrl ?? provider?.api ?? '',
defaultModelId: overrides.defaultModelId,
supportsImages: model?.supportsImages ?? true,
contextWindow: model?.contextWindow ?? 128000,
...(overrides.apiKeyUrl && { apiKeyUrl: overrides.apiKeyUrl }),
...(overrides.setupGuideUrl && { setupGuideUrl: overrides.setupGuideUrl }),
}
}
/**
* Available provider templates for quick setup
* @public
@@ -57,17 +82,12 @@ export const providerTemplates: ProviderTemplate[] = [
apiKeyUrl: 'https://platform.moonshot.ai/console/api-keys',
setupGuideUrl: 'https://platform.moonshot.ai/console/api-keys',
},
{
id: 'openai',
name: 'OpenAI',
defaultBaseUrl: 'https://api.openai.com/v1',
defaultModelId: 'gpt-4',
supportsImages: true,
contextWindow: 128000,
enrichTemplate('openai', {
defaultModelId: 'gpt-5',
apiKeyUrl: 'https://platform.openai.com/api-keys',
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#openai',
},
}),
{
id: 'openai-compatible',
name: 'OpenAI Compatible',
@@ -76,28 +96,18 @@ export const providerTemplates: ProviderTemplate[] = [
supportsImages: true,
contextWindow: 128000,
},
{
id: 'anthropic',
name: 'Anthropic',
defaultBaseUrl: 'https://api.anthropic.com/v1',
defaultModelId: 'claude-3-5-sonnet-20241022',
supportsImages: true,
contextWindow: 200000,
enrichTemplate('anthropic', {
defaultModelId: 'claude-sonnet-4-6',
apiKeyUrl: 'https://console.anthropic.com/settings/keys',
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#claude',
},
{
id: 'google',
name: 'Gemini',
defaultBaseUrl: 'https://generativelanguage.googleapis.com/v1beta',
defaultModelId: 'gemini-1.5-pro',
supportsImages: true,
contextWindow: 1000000,
}),
enrichTemplate('google', {
defaultModelId: 'gemini-2.5-flash',
apiKeyUrl: 'https://aistudio.google.com/app/apikey',
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#gemini',
},
}),
{
id: 'ollama',
name: 'Ollama',
@@ -108,47 +118,28 @@ export const providerTemplates: ProviderTemplate[] = [
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#ollama',
},
{
id: 'openrouter',
name: 'OpenRouter',
defaultBaseUrl: 'https://openrouter.ai/api/v1',
defaultModelId: 'openai/gpt-4-turbo',
supportsImages: true,
contextWindow: 128000,
enrichTemplate('openrouter', {
defaultModelId: 'anthropic/claude-sonnet-4.5',
apiKeyUrl: 'https://openrouter.ai/keys',
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#openrouter',
},
{
id: 'lmstudio',
name: 'LM Studio',
}),
enrichTemplate('lmstudio', {
defaultModelId: 'openai/gpt-oss-20b',
defaultBaseUrl: 'http://localhost:1234/v1',
defaultModelId: 'local-model',
supportsImages: false,
contextWindow: 32000,
setupGuideUrl:
'https://docs.browseros.com/features/bring-your-own-llm#lmstudio',
},
{
id: 'azure',
name: 'Azure',
defaultBaseUrl: '',
}),
enrichTemplate('azure', {
defaultModelId: '',
supportsImages: true,
contextWindow: 128000,
apiKeyUrl:
'https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/OpenAI',
},
{
id: 'bedrock',
name: 'AWS Bedrock',
defaultBaseUrl: '',
defaultModelId: '',
supportsImages: true,
contextWindow: 200000,
}),
enrichTemplate('bedrock', {
defaultModelId: 'anthropic.claude-sonnet-4-6',
setupGuideUrl:
'https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html',
},
}),
]
/**

View File

@@ -2,7 +2,7 @@
"name": "@browseros/agent",
"description": "manifest.json description",
"private": true,
"version": "0.0.52",
"version": "0.0.98",
"type": "module",
"scripts": {
"dev": "test -d generated/graphql || bun run codegen; mkdir -p /tmp/browseros-dev; bun --env-file=.env.development wxt",
@@ -44,9 +44,9 @@
"@radix-ui/react-use-controllable-state": "^1.2.2",
"@sentry/react": "^10.31.0",
"@sentry/vite-plugin": "^4.6.1",
"@tanstack/query-async-storage-persister": "^5.90.21",
"@tanstack/react-query": "^5.90.19",
"@tanstack/react-query-persist-client": "^5.90.21",
"@tanstack/query-async-storage-persister": "^5.95.2",
"@tanstack/react-query": "^5.95.2",
"@tanstack/react-query-persist-client": "^5.95.2",
"@types/cytoscape": "^3.31.0",
"@types/dompurify": "^3.2.0",
"@webext-core/messaging": "^2.3.0",
@@ -67,10 +67,11 @@
"embla-carousel-react": "^8.6.0",
"es-toolkit": "^1.42.0",
"eventsource-parser": "^3.0.6",
"fuse.js": "^7.1.0",
"graphql": "^16.12.0",
"hono": "^4.12.3",
"idb-keyval": "^6.2.2",
"klavis": "^2.15.0",
"localforage": "^1.10.0",
"lucide-react": "^0.562.0",
"motion": "^12.23.24",
"nanoid": "^5.1.6",

View File

@@ -1,19 +1,13 @@
import { dirname, join } from 'node:path'
import { fileURLToPath } from 'node:url'
import { defineWebExtConfig } from 'wxt'
// biome-ignore lint/style/noProcessEnv: config file needs env access
const env = process.env
const MONOREPO_ROOT = join(dirname(fileURLToPath(import.meta.url)), '../..')
const CONTROLLER_EXT_DIR = join(MONOREPO_ROOT, 'apps/controller-ext/dist')
const chromiumArgs = [
'--use-mock-keychain',
'--show-component-extension-options',
'--disable-browseros-server',
'--disable-browseros-extensions',
`--load-extension=${CONTROLLER_EXT_DIR}`,
]
if (env.BROWSEROS_CDP_PORT) {

View File

@@ -0,0 +1,10 @@
# Production build env for CLI
POSTHOG_API_KEY=
# Upload env for CLI installer scripts
R2_ACCOUNT_ID=
R2_ACCESS_KEY_ID=
R2_SECRET_ACCESS_KEY=
R2_BUCKET=browseros
R2_UPLOAD_PREFIX=cli

View File

@@ -1 +1,2 @@
browseros-cli
dist

View File

@@ -0,0 +1 @@
# BrowserOS CLI

View File

@@ -1,20 +1,69 @@
BINARY := browseros-cli
SOURCES := $(shell find . -name '*.go')
VERSION ?= dev
POSTHOG_API_KEY ?=
DIST := dist
LDFLAGS := -X main.version=$(VERSION) -X browseros-cli/analytics.posthogAPIKey=$(POSTHOG_API_KEY)
HOST_OS := $(shell go env GOOS)
HOST_ARCH := $(shell go env GOARCH)
HOST_EXT := $(if $(filter windows,$(HOST_OS)),.exe,)
HOST_BINARY = $(DIST)/$(BINARY)_$(HOST_OS)_$(HOST_ARCH)$(HOST_EXT)
$(BINARY): $(SOURCES)
go build -ldflags "-X main.version=$(VERSION)" -o $(BINARY) .
go build -ldflags "$(LDFLAGS)" -o $(BINARY) .
.PHONY: install clean vet test
PLATFORMS := darwin/amd64 darwin/arm64 linux/amd64 linux/arm64 windows/amd64 windows/arm64
.PHONY: install clean vet test release
install:
go install -ldflags "-X main.version=$(VERSION)" .
go install -ldflags "$(LDFLAGS)" .
clean:
rm -f $(BINARY)
rm -rf $(DIST)
vet:
go vet ./...
test:
go test -tags integration -v -timeout 120s ./...
release:
@if [ "$(VERSION)" = "dev" ]; then echo "Error: VERSION required (e.g. make release VERSION=0.1.0)" >&2; exit 1; fi
@rm -rf $(DIST) && mkdir -p $(DIST)
@for pair in $(PLATFORMS); do \
OS=$${pair%/*}; \
ARCH=$${pair#*/}; \
EXT=""; \
if [ "$$OS" = "windows" ]; then EXT=".exe"; fi; \
echo "Building $$OS/$$ARCH..."; \
GOOS=$$OS GOARCH=$$ARCH CGO_ENABLED=0 go build -trimpath \
-ldflags "-s -w $(LDFLAGS)" \
-o "$(DIST)/$(BINARY)$$EXT" .; \
ARCHIVE="$(BINARY)_$(VERSION)_$${OS}_$${ARCH}"; \
if [ "$$OS" = "windows" ]; then \
(cd $(DIST) && zip "$${ARCHIVE}.zip" "$(BINARY)$$EXT"); \
else \
(cd $(DIST) && tar czf "$${ARCHIVE}.tar.gz" "$(BINARY)"); \
fi; \
mv "$(DIST)/$(BINARY)$$EXT" "$(DIST)/$(BINARY)_$${OS}_$${ARCH}$$EXT"; \
done
@ACTUAL_VERSION=$$($(HOST_BINARY) --version | awk '{print $$3}'); \
if [ "$$ACTUAL_VERSION" != "$(VERSION)" ]; then \
echo "Error: expected $(HOST_BINARY) to report version $(VERSION), got $$ACTUAL_VERSION" >&2; \
exit 1; \
fi
@cd $(DIST) && (command -v sha256sum >/dev/null 2>&1 && sha256sum *.tar.gz *.zip || shasum -a 256 *.tar.gz *.zip) > checksums.txt
@echo "=== Built artifacts ==="
@ls -lh $(DIST)
.PHONY: npm-version npm-publish
npm-version:
@if [ "$(VERSION)" = "dev" ]; then echo "Error: VERSION required" >&2; exit 1; fi
@node -e "const p=require('./npm/package.json');p.version='$(VERSION)';require('fs').writeFileSync('./npm/package.json',JSON.stringify(p,null,2)+'\n')"
@echo "npm/package.json version set to $(VERSION)"
npm-publish: npm-version
cd npm && npm publish

View File

@@ -1,25 +1,68 @@
# browseros-cli
Command-line interface for controlling BrowserOS via MCP. Talks to the BrowserOS MCP server over JSON-RPC 2.0 / StreamableHTTP.
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](../../../../LICENSE)
## Setup
Command-line interface for controlling BrowserOS — launch and automate the browser from the terminal or from AI coding agents like Claude Code and Gemini CLI.
Communicates with the BrowserOS MCP server over JSON-RPC 2.0 / StreamableHTTP. All 53+ MCP tools are mapped to CLI commands.
## Install
### macOS / Linux
```bash
curl -fsSL https://cdn.browseros.com/cli/install.sh | bash
```
### Windows
```powershell
irm https://cdn.browseros.com/cli/install.ps1 | iex
```
### Build from Source
Requires Go 1.25+.
```bash
# Build
make
# First run — configure server connection
./browseros-cli init
make # Build binary
make install # Install to $GOPATH/bin
```
The `init` command prompts for your MCP server URL. Find it in:
**BrowserOS → Settings → BrowserOS MCP → Server URL**
## Quick Start
The port varies per installation (e.g., `http://127.0.0.1:9004/mcp`).
```bash
# If BrowserOS is not installed yet
browseros-cli install # downloads BrowserOS for your platform
Config is saved to `~/.config/browseros-cli/config.yaml`.
# If BrowserOS is installed but not running
browseros-cli launch # opens BrowserOS, waits for server
# Configure the CLI (auto-discovers running BrowserOS)
browseros-cli init --auto # detects server URL and saves config
# Verify connection
browseros-cli health
```
### Other init modes
```bash
browseros-cli init <url> # non-interactive — pass URL directly
browseros-cli init # interactive — prompts for URL
```
Config is saved to `~/.config/browseros-cli/config.yaml`. The CLI also auto-discovers the server from `~/.browseros/server.json` (written by BrowserOS on startup).
### CLI updates
The CLI checks for a newer BrowserOS CLI release in the background about once per day and will suggest an update on a later run when one is available.
```bash
browseros-cli update # check and apply the latest CLI release
browseros-cli update --check # check only
browseros-cli update --yes # apply without prompting
```
## Usage
@@ -67,6 +110,12 @@ browseros-cli history recent
browseros-cli group list
```
## Use as MCP Server
BrowserOS exposes an MCP server that AI coding agents can connect to directly. The CLI is the easiest way to verify the connection and interact with tools from the terminal.
To connect Claude Code, Gemini CLI, or any MCP client, see the [MCP setup guide](https://docs.browseros.com/features/use-with-claude-code).
## Global Flags
| Flag | Env Var | Description |
@@ -77,9 +126,9 @@ browseros-cli group list
| `--debug` | `BOS_DEBUG=1` | Debug output |
| `--timeout, -t` | | Request timeout (default: 2m) |
Priority for server URL: `--server` flag > `BROWSEROS_URL` env > config file
Priority for server URL: `--server` flag > `BROWSEROS_URL` env > `~/.browseros/server.json` > config file
If no server URL is configured, the CLI exits with setup instructions instead of assuming a localhost port.
If no server URL is configured, the CLI exits with setup instructions pointing to `install`, `launch`, and `init`.
## Testing
@@ -130,7 +179,9 @@ apps/cli/
│ └── config.go # Config file (~/.config/browseros-cli/config.yaml)
├── cmd/
│ ├── root.go # Root command, global flags
│ ├── init.go # Server URL configuration
│ ├── init.go # Server URL configuration (URL arg, --auto, interactive)
│ ├── install.go # install (download BrowserOS for current platform)
│ ├── launch.go # launch (find and start BrowserOS, wait for server)
│ ├── open.go # open (new_page / new_hidden_page)
│ ├── nav.go # nav, back, forward, reload
│ ├── pages.go # pages, active, close
@@ -163,4 +214,8 @@ The CLI communicates with BrowserOS via two HTTP POST requests per command:
1. `initialize` — MCP handshake
2. `tools/call` — execute the actual tool
All 54 MCP tools are mapped to CLI commands.
## Links
- [Documentation](https://docs.browseros.com)
- [MCP Setup Guide](https://docs.browseros.com/features/use-with-claude-code)
- [Changelog](./CHANGELOG.md)

View File

@@ -0,0 +1,129 @@
package analytics
import (
"crypto/rand"
"encoding/json"
"fmt"
"os"
"path/filepath"
"runtime"
"strings"
"time"
"browseros-cli/config"
"github.com/posthog/posthog-go"
)
var (
posthogAPIKey string // set via ldflags
posthogHost = "https://us.i.posthog.com"
)
const eventPrefix = "browseros.cli."
var svc *service
type service struct {
client posthog.Client
distinctID string
}
func Init(version string) {
if posthogAPIKey == "" {
return
}
distinctID := resolveDistinctID()
if distinctID == "" {
return
}
client, err := posthog.NewWithConfig(posthogAPIKey, posthog.Config{
Endpoint: posthogHost,
BatchSize: 10,
ShutdownTimeout: 3 * time.Second,
DefaultEventProperties: posthog.NewProperties().
Set("cli_version", version).
Set("os", runtime.GOOS).
Set("arch", runtime.GOARCH),
})
if err != nil {
return
}
svc = &service{client: client, distinctID: distinctID}
}
func Track(command string, success bool, duration time.Duration) {
if svc == nil {
return
}
svc.client.Enqueue(posthog.Capture{
DistinctId: svc.distinctID,
Event: eventPrefix + "command_executed",
Properties: posthog.NewProperties().
Set("command", command).
Set("success", success).
Set("duration_ms", duration.Milliseconds()).
Set("$process_person_profile", false),
})
}
func Close() {
if svc == nil {
return
}
svc.client.Close()
svc = nil
}
func resolveDistinctID() string {
if id := loadBrowserosID(); id != "" {
return id
}
return loadOrCreateInstallID()
}
func loadBrowserosID() string {
home, err := os.UserHomeDir()
if err != nil {
return ""
}
data, err := os.ReadFile(filepath.Join(home, ".browseros", "server.json"))
if err != nil {
return ""
}
var sc struct {
BrowserosID string `json:"browseros_id"`
}
if json.Unmarshal(data, &sc) != nil {
return ""
}
return sc.BrowserosID
}
func loadOrCreateInstallID() string {
dir := config.Dir()
idPath := filepath.Join(dir, "install_id")
data, err := os.ReadFile(idPath)
if err == nil {
if id := strings.TrimSpace(string(data)); id != "" {
return id
}
}
id := generateUUID()
os.MkdirAll(dir, 0755)
os.WriteFile(idPath, []byte(id), 0644)
return id
}
func generateUUID() string {
var b [16]byte
rand.Read(b[:])
b[6] = (b[6] & 0x0f) | 0x40 // version 4
b[8] = (b[8] & 0x3f) | 0x80 // variant 2
return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:])
}

View File

@@ -0,0 +1,132 @@
package analytics
import (
"encoding/json"
"os"
"path/filepath"
"regexp"
"testing"
"time"
)
func TestGenerateUUID(t *testing.T) {
id := generateUUID()
uuidRe := regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)
if !uuidRe.MatchString(id) {
t.Errorf("generateUUID() = %q, does not match UUID v4 pattern", id)
}
id2 := generateUUID()
if id == id2 {
t.Error("generateUUID() returned the same value twice")
}
}
func TestLoadBrowserosID(t *testing.T) {
tmp := t.TempDir()
t.Setenv("HOME", tmp)
// No server.json → empty
if got := loadBrowserosID(); got != "" {
t.Errorf("loadBrowserosID() = %q, want empty", got)
}
// server.json without browseros_id → empty
dir := filepath.Join(tmp, ".browseros")
os.MkdirAll(dir, 0755)
data, _ := json.Marshal(map[string]any{"server_port": 9100, "url": "http://127.0.0.1:9100"})
os.WriteFile(filepath.Join(dir, "server.json"), data, 0644)
if got := loadBrowserosID(); got != "" {
t.Errorf("loadBrowserosID() = %q, want empty (no browseros_id field)", got)
}
// server.json with browseros_id → returns it
data, _ = json.Marshal(map[string]any{
"server_port": 9100,
"url": "http://127.0.0.1:9100",
"browseros_id": "test-uuid-1234",
})
os.WriteFile(filepath.Join(dir, "server.json"), data, 0644)
if got := loadBrowserosID(); got != "test-uuid-1234" {
t.Errorf("loadBrowserosID() = %q, want %q", got, "test-uuid-1234")
}
}
func TestLoadOrCreateInstallID(t *testing.T) {
tmp := t.TempDir()
configDir := filepath.Join(tmp, "browseros-cli")
t.Setenv("XDG_CONFIG_HOME", tmp)
// First call creates the file
id := loadOrCreateInstallID()
if id == "" {
t.Fatal("loadOrCreateInstallID() returned empty string")
}
// File was persisted
data, err := os.ReadFile(filepath.Join(configDir, "install_id"))
if err != nil {
t.Fatalf("install_id file not created: %v", err)
}
if string(data) != id {
t.Errorf("persisted id = %q, want %q", string(data), id)
}
// Second call returns the same ID
id2 := loadOrCreateInstallID()
if id2 != id {
t.Errorf("loadOrCreateInstallID() = %q, want stable %q", id2, id)
}
}
func TestResolveDistinctID_PrefersBrowserosID(t *testing.T) {
tmp := t.TempDir()
t.Setenv("HOME", tmp)
t.Setenv("XDG_CONFIG_HOME", tmp)
// Write server.json with browseros_id
dir := filepath.Join(tmp, ".browseros")
os.MkdirAll(dir, 0755)
data, _ := json.Marshal(map[string]any{"browseros_id": "server-uuid"})
os.WriteFile(filepath.Join(dir, "server.json"), data, 0644)
got := resolveDistinctID()
if got != "server-uuid" {
t.Errorf("resolveDistinctID() = %q, want %q", got, "server-uuid")
}
}
func TestResolveDistinctID_FallsBackToInstallID(t *testing.T) {
tmp := t.TempDir()
t.Setenv("HOME", tmp)
t.Setenv("XDG_CONFIG_HOME", tmp)
// No server.json → should generate install_id
got := resolveDistinctID()
if got == "" {
t.Error("resolveDistinctID() returned empty string")
}
}
func TestInitNoopsWithoutAPIKey(t *testing.T) {
old := posthogAPIKey
posthogAPIKey = ""
defer func() { posthogAPIKey = old }()
Init("1.0.0")
if svc != nil {
t.Error("Init() created service without API key")
}
}
func TestTrackAndCloseNoopWithoutInit(t *testing.T) {
old := svc
svc = nil
defer func() { svc = old }()
// Should not panic
Track("test", true, time.Second)
Close()
}

View File

@@ -49,7 +49,7 @@ func init() {
statusCmd := &cobra.Command{
Use: "status",
Annotations: map[string]string{"group": "Setup:"},
Short: "Check extension connection status",
Short: "Check BrowserOS runtime status",
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
c := newClient()
@@ -64,12 +64,12 @@ func init() {
green := color.New(color.FgGreen).SprintFunc()
red := color.New(color.FgRed).SprintFunc()
ext := data["extensionConnected"]
extStr := red("disconnected")
if b, ok := ext.(bool); ok && b {
extStr = green("connected")
cdp := data["cdpConnected"]
cdpStr := red("disconnected")
if b, ok := cdp.(bool); ok && b {
cdpStr = green("connected")
}
fmt.Printf("Extension: %s\n", extStr)
fmt.Printf("Browser: %s\n", cdpStr)
},
}

View File

@@ -17,42 +17,75 @@ import (
)
func init() {
var autoDiscover bool
cmd := &cobra.Command{
Use: "init",
Use: "init [url]",
Short: "Configure the BrowserOS server connection",
Long: `Set up the CLI by providing the MCP server URL from BrowserOS.
Open BrowserOS → Settings → BrowserOS MCP to find your Server URL.
The URL looks like: http://127.0.0.1:9004/mcp
The URL looks like: http://127.0.0.1:9000/mcp
The port varies per installation, so this step is required on first use.
Run again if your port changes.`,
Run again if your port changes.
You can provide the full URL or just the port number:
browseros-cli init http://127.0.0.1:9000/mcp
browseros-cli init 9000
Three modes:
browseros-cli init <url> Non-interactive (full URL or port number)
browseros-cli init --auto Auto-discover from ~/.browseros/server.json
browseros-cli init Interactive prompt`,
Annotations: map[string]string{"group": "Setup:"},
Args: cobra.NoArgs,
Args: cobra.MaximumNArgs(1),
Run: func(cmd *cobra.Command, args []string) {
bold := color.New(color.Bold)
green := color.New(color.FgGreen)
dim := color.New(color.Faint)
fmt.Println()
bold.Println("BrowserOS CLI Setup")
fmt.Println()
fmt.Println("Open BrowserOS → Settings → BrowserOS MCP")
fmt.Println("Copy the Server URL shown there.")
fmt.Println()
dim.Println("It looks like: http://127.0.0.1:9004/mcp")
fmt.Println()
var input string
reader := bufio.NewReader(os.Stdin)
fmt.Print("Server URL: ")
input, err := reader.ReadString('\n')
if err != nil {
output.Error("failed to read input", 1)
}
input = strings.TrimSpace(input)
switch {
case len(args) == 1:
// Non-interactive: URL provided as argument
input = args[0]
if input == "" {
output.Error("no URL provided", 1)
case autoDiscover:
// Auto-discover: server.json → config → probe common ports
discovered := probeRunningServer()
if discovered == "" {
output.Error("auto-discovery failed: no running BrowserOS found.\n\n"+
" If not running: browseros-cli launch\n"+
" If not installed: browseros-cli install", 1)
}
input = discovered
fmt.Printf("Auto-discovered server at %s\n", input)
default:
// Interactive prompt (original behavior)
fmt.Println()
bold.Println("BrowserOS CLI Setup")
fmt.Println()
fmt.Println("Open BrowserOS → Settings → BrowserOS MCP")
fmt.Println("Copy the Server URL or port number shown there.")
fmt.Println()
dim.Println("Examples: http://127.0.0.1:9000/mcp")
dim.Println(" 9000")
fmt.Println()
reader := bufio.NewReader(os.Stdin)
fmt.Print("Server URL or port: ")
line, err := reader.ReadString('\n')
if err != nil {
output.Error("failed to read input", 1)
}
input = strings.TrimSpace(line)
if input == "" {
output.Error("no URL provided", 1)
}
}
baseURL := normalizeServerURL(input)
@@ -88,5 +121,6 @@ Run again if your port changes.`,
},
}
cmd.Flags().BoolVar(&autoDiscover, "auto", false, "Auto-discover server URL from ~/.browseros/server.json")
rootCmd.AddCommand(cmd)
}

View File

@@ -0,0 +1,247 @@
package cmd
import (
"fmt"
"io"
"net/http"
"os"
"os/exec"
"path/filepath"
"runtime"
"time"
"browseros-cli/output"
"github.com/fatih/color"
"github.com/spf13/cobra"
)
func init() {
cmd := &cobra.Command{
Use: "install",
Short: "Download and install BrowserOS for the current platform",
Long: `Download BrowserOS for your platform and start the installation.
macOS: Downloads .dmg, mounts it, and copies BrowserOS to /Applications
Windows: Downloads installer .exe and launches it
Linux: Downloads AppImage (or .deb with --deb flag)
After installation:
browseros-cli launch # start BrowserOS
browseros-cli init --auto # configure the CLI`,
Annotations: map[string]string{"group": "Setup:"},
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
dir, _ := cmd.Flags().GetString("dir")
deb, _ := cmd.Flags().GetBool("deb")
if deb && runtime.GOOS != "linux" {
output.Error("--deb is only available on Linux", 1)
}
downloadURL, filename := resolveDownload(deb)
destPath := filepath.Join(dir, filename)
bold := color.New(color.Bold)
green := color.New(color.FgGreen)
dim := color.New(color.Faint)
bold.Printf("Downloading BrowserOS for %s...\n", platformDisplayName())
dim.Printf(" %s\n", downloadURL)
fmt.Println()
client := &http.Client{Timeout: 10 * time.Minute}
resp, err := client.Get(downloadURL)
if err != nil {
output.Errorf(1, "download failed: %v", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
output.Errorf(1, "download failed: HTTP %d", resp.StatusCode)
}
file, err := os.Create(destPath)
if err != nil {
output.Errorf(1, "create file: %v", err)
}
written, err := io.Copy(file, resp.Body)
file.Close()
if err != nil {
os.Remove(destPath)
output.Errorf(1, "download interrupted: %v", err)
}
green.Printf("Downloaded %s (%.1f MB)\n", filename, float64(written)/(1024*1024))
fmt.Println()
runPostInstall(destPath, deb, dim)
fmt.Println()
bold.Println("Next steps:")
dim.Println(" browseros-cli launch # start BrowserOS")
dim.Println(" browseros-cli init --auto # configure the CLI")
},
}
cmd.Flags().String("dir", ".", "Directory to download the installer to")
cmd.Flags().Bool("deb", false, "Download .deb package instead of AppImage (Linux only)")
rootCmd.AddCommand(cmd)
}
func resolveDownload(deb bool) (url, filename string) {
switch runtime.GOOS {
case "darwin":
return "https://files.browseros.com/download/BrowserOS.dmg", "BrowserOS.dmg"
case "windows":
return "https://files.browseros.com/download/BrowserOS_installer.exe", "BrowserOS_installer.exe"
case "linux":
if deb {
return "https://cdn.browseros.com/download/BrowserOS.deb", "BrowserOS.deb"
}
return "https://files.browseros.com/download/BrowserOS.AppImage", "BrowserOS.AppImage"
default:
output.Errorf(1, "unsupported platform: %s/%s\n Download manually from https://browseros.com", runtime.GOOS, runtime.GOARCH)
return "", ""
}
}
func platformDisplayName() string {
switch runtime.GOOS {
case "darwin":
return "macOS"
case "windows":
return "Windows"
case "linux":
return "Linux"
default:
return runtime.GOOS
}
}
func runPostInstall(path string, deb bool, dim *color.Color) {
switch runtime.GOOS {
case "darwin":
installMacOS(path, dim)
case "linux":
if deb {
dim.Println("Install the .deb package:")
fmt.Printf(" sudo dpkg -i %s\n", path)
} else {
os.Chmod(path, 0755)
dim.Printf("AppImage is ready to run: ./%s\n", filepath.Base(path))
}
case "windows":
fmt.Println("Launching installer...")
if err := exec.Command("cmd", "/c", "start", "", path).Run(); err != nil {
dim.Printf("Could not launch installer automatically. Run: %s\n", path)
} else {
dim.Println("Follow the installer prompts to complete setup.")
}
}
}
// installMacOS mounts the DMG and copies BrowserOS.app to /Applications.
func installMacOS(dmgPath string, dim *color.Color) {
fmt.Println("Mounting disk image...")
mountOut, err := exec.Command("hdiutil", "attach", dmgPath, "-nobrowse").Output()
if err != nil {
dim.Println("Could not mount DMG automatically.")
dim.Printf(" Open it manually: open %s\n", dmgPath)
return
}
// Find the mount point (last field of last line of hdiutil output)
mountPoint := ""
for _, line := range splitLines(string(mountOut)) {
fields := splitTabs(line)
if len(fields) > 0 {
mountPoint = fields[len(fields)-1]
}
}
if mountPoint == "" {
dim.Println("DMG mounted but could not determine mount point.")
dim.Printf(" Open it manually: open %s\n", dmgPath)
return
}
// Look for BrowserOS.app in the mounted volume
appSrc := filepath.Join(mountPoint, "BrowserOS.app")
if _, err := os.Stat(appSrc); err != nil {
dim.Printf("DMG mounted at %s but BrowserOS.app not found inside.\n", mountPoint)
dim.Printf(" Check the volume manually: open %s\n", mountPoint)
exec.Command("hdiutil", "detach", mountPoint, "-quiet").Run()
return
}
appDest := "/Applications/BrowserOS.app"
fmt.Printf("Installing to %s...\n", appDest)
// Remove existing installation if present
os.RemoveAll(appDest)
// Copy using cp -R (preserves code signatures, symlinks, etc.)
if err := exec.Command("cp", "-R", appSrc, appDest).Run(); err != nil {
dim.Printf("Could not copy to /Applications (may need sudo).\n")
dim.Printf(" Try: sudo cp -R \"%s\" /Applications/\n", appSrc)
exec.Command("hdiutil", "detach", mountPoint, "-quiet").Run()
return
}
// Unmount
exec.Command("hdiutil", "detach", mountPoint, "-quiet").Run()
// Clean up DMG
os.Remove(dmgPath)
fmt.Println("BrowserOS installed to /Applications/BrowserOS.app")
}
func splitLines(s string) []string {
var lines []string
for _, line := range filepath.SplitList(s) {
lines = append(lines, line)
}
// filepath.SplitList uses : on unix, not newlines — use manual split
result := []string{}
start := 0
for i := 0; i < len(s); i++ {
if s[i] == '\n' {
line := s[start:i]
if len(line) > 0 {
result = append(result, line)
}
start = i + 1
}
}
if start < len(s) {
result = append(result, s[start:])
}
return result
}
func splitTabs(s string) []string {
result := []string{}
start := 0
for i := 0; i < len(s); i++ {
if s[i] == '\t' {
field := s[start:i]
if len(field) > 0 {
result = append(result, field)
}
start = i + 1
}
}
if start < len(s) {
field := s[start:]
if len(field) > 0 {
result = append(result, field)
}
}
return result
}

View File

@@ -0,0 +1,287 @@
package cmd
import (
"fmt"
"net/http"
"os"
"os/exec"
"path/filepath"
"runtime"
"strings"
"time"
"browseros-cli/output"
"github.com/fatih/color"
"github.com/spf13/cobra"
)
// macOS bundle identifier — verified from BrowserOS.app/Contents/Info.plist
const browserOSBundleID = "com.browseros.BrowserOS"
func init() {
cmd := &cobra.Command{
Use: "launch",
Short: "Launch the BrowserOS application",
Long: `Find and launch the BrowserOS application.
Uses platform-native detection to find BrowserOS, launches it,
and waits for the server to become ready.
If BrowserOS is already running, reports the server URL.`,
Annotations: map[string]string{"group": "Setup:"},
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
green := color.New(color.FgGreen)
dim := color.New(color.Faint)
waitSecs, _ := cmd.Flags().GetInt("wait")
if url := probeRunningServer(); url != "" {
green.Printf("BrowserOS is already running at %s\n", url)
return
}
if !isBrowserOSInstalled() {
output.Error("BrowserOS is not installed.\n\n"+
" To install: browseros-cli install", 1)
}
fmt.Println("Launching BrowserOS...")
if err := startBrowserOS(); err != nil {
output.Errorf(1, "failed to launch: %v", err)
}
fmt.Print("Waiting for server")
url, ok := waitForServer(time.Duration(waitSecs) * time.Second)
fmt.Println()
if !ok {
output.Error("BrowserOS launched but server didn't respond within "+
fmt.Sprintf("%d seconds.\n", waitSecs)+
" Check if BrowserOS is fully loaded, then retry.", 1)
}
green.Printf("BrowserOS is ready at %s\n", url)
fmt.Println()
dim.Println("Next: browseros-cli init --auto")
},
}
cmd.Flags().Int("wait", 30, "Seconds to wait for server to start")
rootCmd.AddCommand(cmd)
}
// ---------------------------------------------------------------------------
// Server probing
// ---------------------------------------------------------------------------
// probeRunningServer checks server.json, config, and common ports for a running server.
func probeRunningServer() string {
check := func(baseURL string) bool {
client := &http.Client{Timeout: 2 * time.Second}
resp, err := client.Get(baseURL + "/health")
if err != nil {
return false
}
resp.Body.Close()
return resp.StatusCode == 200
}
// 1. server.json — written by BrowserOS on startup with the actual port
if url := loadBrowserosServerURL(); url != "" && check(url) {
return url
}
// 2. Saved config / env var
if url := defaultServerURL(); url != "" && check(url) {
return url
}
// 3. Probe common BrowserOS ports as last resort
for _, port := range []int{9100, 9200, 9300} {
url := fmt.Sprintf("http://127.0.0.1:%d", port)
if check(url) {
return url
}
}
return ""
}
// ---------------------------------------------------------------------------
// Platform-native installation detection
// ---------------------------------------------------------------------------
// isBrowserOSInstalled checks if BrowserOS is installed using platform-native methods.
//
// macOS: `open -Ra "BrowserOS"` — queries Launch Services (finds apps anywhere)
// Linux: checks /usr/bin/browseros (.deb), browseros.desktop, or AppImage files
// Windows: checks executable at %LOCALAPPDATA%\BrowserOS\Application\BrowserOS.exe
// and registry uninstall key (per-user Chromium install pattern)
func isBrowserOSInstalled() bool {
switch runtime.GOOS {
case "darwin":
// open -Ra checks if Launch Services knows about the app without launching it.
// Works regardless of where the app is installed.
return exec.Command("open", "-Ra", "BrowserOS").Run() == nil
case "linux":
// .deb install puts `browseros` in /usr/bin/
if _, err := exec.LookPath("browseros"); err == nil {
return true
}
// .deb also creates browseros.desktop
for _, dir := range []string{
"/usr/share/applications",
filepath.Join(userHomeDir(), ".local/share/applications"),
} {
if _, err := os.Stat(filepath.Join(dir, "browseros.desktop")); err == nil {
return true
}
}
// AppImage — user may have it in ~/Downloads, ~/Applications, etc.
return findLinuxAppImage() != ""
case "windows":
// Chromium per-user install: %LOCALAPPDATA%\BrowserOS\Application\BrowserOS.exe
if exePath := windowsBrowserOSExe(); exePath != "" {
if _, err := os.Stat(exePath); err == nil {
return true
}
}
// Fallback: check uninstall registry (per-user install uses HKCU)
for _, root := range []string{"HKCU", "HKLM"} {
key := root + `\Software\Microsoft\Windows\CurrentVersion\Uninstall\BrowserOS`
if exec.Command("reg", "query", key, "/v", "DisplayName").Run() == nil {
return true
}
}
return false
}
return false
}
// ---------------------------------------------------------------------------
// Platform-native launch
// ---------------------------------------------------------------------------
// startBrowserOS launches BrowserOS using platform-native methods.
//
// macOS: `open -b com.browseros.BrowserOS` — launches by bundle ID
// Linux: runs `browseros` binary or AppImage directly
// Windows: runs BrowserOS.exe from the known install path
func startBrowserOS() error {
switch runtime.GOOS {
case "darwin":
// Launch by bundle ID via Launch Services — no hardcoded paths needed.
return exec.Command("open", "-b", browserOSBundleID).Run()
case "linux":
// .deb install: browseros is in PATH
if p, err := exec.LookPath("browseros"); err == nil {
return startDetached(p)
}
// AppImage: run it directly
if appImage := findLinuxAppImage(); appImage != "" {
return startDetached(appImage)
}
// .desktop file: use gtk-launch (not xdg-open, which opens by MIME type)
if _, err := exec.LookPath("gtk-launch"); err == nil {
return exec.Command("gtk-launch", "browseros").Run()
}
return fmt.Errorf("BrowserOS found but could not determine how to launch it")
case "windows":
if exePath := windowsBrowserOSExe(); exePath != "" {
if _, err := os.Stat(exePath); err == nil {
return startDetached(exePath)
}
}
return fmt.Errorf("BrowserOS.exe not found at expected location")
default:
return fmt.Errorf("unsupported platform: %s", runtime.GOOS)
}
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
// startDetached starts a process in the background without inheriting stdio.
func startDetached(path string, args ...string) error {
cmd := exec.Command(path, args...)
cmd.Stdout = nil
cmd.Stderr = nil
cmd.Stdin = nil
return cmd.Start()
}
// windowsBrowserOSExe returns the expected BrowserOS.exe path on Windows.
// Chromium per-user installs go to %LOCALAPPDATA%\<base_app_name>\Application\<binary>.
// base_app_name = "BrowserOS" (from chromium_install_modes.h)
func windowsBrowserOSExe() string {
localAppData := os.Getenv("LOCALAPPDATA")
if localAppData == "" {
return ""
}
return filepath.Join(localAppData, "BrowserOS", "Application", "BrowserOS.exe")
}
// findLinuxAppImage searches common locations for a BrowserOS AppImage.
func findLinuxAppImage() string {
home := userHomeDir()
if home == "" {
return ""
}
for _, dir := range []string{
home,
filepath.Join(home, "Applications"),
filepath.Join(home, "Downloads"),
"/opt",
} {
entries, err := os.ReadDir(dir)
if err != nil {
continue
}
for _, e := range entries {
name := e.Name()
if strings.HasPrefix(name, "BrowserOS") && strings.HasSuffix(name, ".AppImage") {
return filepath.Join(dir, name)
}
}
}
return ""
}
// userHomeDir returns the home directory or empty string.
func userHomeDir() string {
home, err := os.UserHomeDir()
if err != nil {
return ""
}
return home
}
// waitForServer polls until a BrowserOS server responds or timeout.
func waitForServer(maxWait time.Duration) (string, bool) {
client := &http.Client{Timeout: 2 * time.Second}
deadline := time.Now().Add(maxWait)
for time.Now().Before(deadline) {
// server.json is written by BrowserOS on startup with the actual port
if url := loadBrowserosServerURL(); url != "" {
resp, err := client.Get(url + "/health")
if err == nil {
resp.Body.Close()
if resp.StatusCode == 200 {
return url, true
}
}
}
fmt.Print(".")
time.Sleep(1 * time.Second)
}
return "", false
}

View File

@@ -1,6 +1,7 @@
package cmd
import (
"context"
"encoding/json"
"fmt"
"os"
@@ -9,9 +10,11 @@ import (
"strings"
"time"
"browseros-cli/analytics"
"browseros-cli/config"
"browseros-cli/mcp"
"browseros-cli/output"
"browseros-cli/update"
"github.com/fatih/color"
"github.com/spf13/cobra"
@@ -27,8 +30,11 @@ var (
version = "dev"
)
const automaticUpdateDrainTimeout = 150 * time.Millisecond
func SetVersion(v string) {
version = v
rootCmd.Version = v
}
var (
@@ -113,11 +119,40 @@ var rootCmd = &cobra.Command{
}
func Execute() {
if err := rootCmd.Execute(); err != nil {
automaticUpdater := newAutomaticUpdateManager(os.Args[1:])
automaticNotice := ""
var automaticCheckDone <-chan struct{}
if automaticUpdater != nil {
automaticNotice = automaticUpdater.CachedNotice()
automaticCheckDone = automaticUpdater.StartBackgroundCheck(context.Background())
}
analytics.Init(version)
start := time.Now()
err := rootCmd.Execute()
if automaticNotice != "" && err == nil {
fmt.Fprintln(os.Stderr, automaticNotice)
}
drainAutomaticUpdateCheck(automaticCheckDone)
analytics.Track(commandName(os.Args[1:]), err == nil, time.Since(start))
analytics.Close()
if err != nil {
os.Exit(1)
}
}
func commandName(args []string) string {
cmd, _, err := rootCmd.Find(args)
if err != nil || cmd == rootCmd {
return "unknown"
}
return cmd.CommandPath()
}
func init() {
cobra.AddTemplateFunc("helpHeader", helpHeader)
cobra.AddTemplateFunc("helpCmdCol", helpCmdCol)
@@ -166,11 +201,105 @@ func envBool(key string) bool {
return v == "1" || v == "true"
}
func newAutomaticUpdateManager(args []string) *update.Manager {
if shouldSkipAutomaticUpdates(args) {
return nil
}
return update.NewManager(update.Options{
CurrentVersion: version,
JSONOutput: requestedBoolFlag(args, "--json", jsonOut),
Debug: requestedBoolFlag(args, "--debug", debug),
Automatic: true,
})
}
func shouldSkipAutomaticUpdates(args []string) bool {
if hasHelpFlag(args) || requestedBoolFlag(args, "--version", false) {
return true
}
switch primaryCommand(args) {
case "help", "completion", "update", "self-update", "upgrade":
return true
default:
return false
}
}
func hasHelpFlag(args []string) bool {
if requestedBoolFlag(args, "--help", false) {
return true
}
for _, arg := range args {
if arg == "-h" {
return true
}
}
return false
}
func primaryCommand(args []string) string {
for _, arg := range args {
if strings.HasPrefix(arg, "-") {
continue
}
return arg
}
return ""
}
func requestedBoolFlag(args []string, flagName string, current bool) bool {
if current {
return true
}
prefix := flagName + "="
for _, arg := range args {
if arg == flagName {
return true
}
if strings.HasPrefix(arg, prefix) {
value, err := strconv.ParseBool(strings.TrimPrefix(arg, prefix))
return err == nil && value
}
}
return false
}
func drainAutomaticUpdateCheck(done <-chan struct{}) {
drainAutomaticUpdateCheckWithTimeout(done, automaticUpdateDrainTimeout)
}
func drainAutomaticUpdateCheckWithTimeout(done <-chan struct{}, timeout time.Duration) {
if done == nil {
return
}
timer := time.NewTimer(timeout)
defer timer.Stop()
select {
case <-done:
case <-timer.C:
}
}
func defaultServerURL() string {
// 1. Explicit env var always wins
if env := normalizeServerURL(os.Getenv("BROWSEROS_URL")); env != "" {
return env
}
// 2. Live discovery file from running BrowserOS (most current)
if url := loadBrowserosServerURL(); url != "" {
return url
}
// 3. Saved config (may be stale if port changed)
cfg, err := config.Load()
if err == nil {
if url := normalizeServerURL(cfg.ServerURL); url != "" {
@@ -178,10 +307,6 @@ func defaultServerURL() string {
}
}
if url := loadBrowserosServerURL(); url != "" {
return url
}
return ""
}
@@ -214,10 +339,27 @@ func loadBrowserosServerURL() string {
func normalizeServerURL(raw string) string {
normalized := strings.TrimSpace(raw)
if isPortOnly(normalized) {
normalized = "http://127.0.0.1:" + normalized
}
normalized = strings.TrimSuffix(normalized, "/mcp")
return strings.TrimSuffix(normalized, "/")
}
func isPortOnly(s string) bool {
if s == "" {
return false
}
for _, c := range s {
if c < '0' || c > '9' {
return false
}
}
return true
}
func validateServerURL(raw string) (string, error) {
baseURL := normalizeServerURL(raw)
if baseURL != "" {
@@ -225,6 +367,9 @@ func validateServerURL(raw string) (string, error) {
}
return "", fmt.Errorf(
"BrowserOS server URL is not configured.\n Open BrowserOS -> Settings -> BrowserOS MCP and copy the Server URL.\n Then run: browseros-cli init",
"BrowserOS server URL is not configured.\n\n" +
" If BrowserOS is running: browseros-cli init --auto\n" +
" If BrowserOS is closed: browseros-cli launch\n" +
" If not installed: browseros-cli install",
)
}

View File

@@ -0,0 +1,146 @@
package cmd
import (
"testing"
"time"
)
func TestSetVersionUpdatesRootCommand(t *testing.T) {
originalVersion := version
originalRootVersion := rootCmd.Version
t.Cleanup(func() {
version = originalVersion
rootCmd.Version = originalRootVersion
})
SetVersion("1.2.3")
if version != "1.2.3" {
t.Fatalf("version = %q, want %q", version, "1.2.3")
}
if rootCmd.Version != "1.2.3" {
t.Fatalf("rootCmd.Version = %q, want %q", rootCmd.Version, "1.2.3")
}
}
func TestCommandName(t *testing.T) {
tests := []struct {
name string
args []string
want string
}{
{"empty args", nil, "unknown"},
{"known command", []string{"health"}, "browseros-cli health"},
{"unknown command", []string{"nonexistent"}, "unknown"},
{"subcommand", []string{"bookmark", "search"}, "browseros-cli bookmark search"},
{"known with extra args", []string{"snap", "--enhanced"}, "browseros-cli snap"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := commandName(tt.args)
if got != tt.want {
t.Errorf("commandName(%v) = %q, want %q", tt.args, got, tt.want)
}
})
}
}
func TestPrimaryCommand(t *testing.T) {
tests := []struct {
name string
args []string
want string
}{
{"empty", nil, ""},
{"root flag then command", []string{"--json", "update"}, "update"},
{"subcommand", []string{"bookmark", "update"}, "bookmark"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := primaryCommand(tt.args); got != tt.want {
t.Fatalf("primaryCommand(%v) = %q, want %q", tt.args, got, tt.want)
}
})
}
}
func TestRequestedBoolFlag(t *testing.T) {
if !requestedBoolFlag([]string{"--json"}, "--json", false) {
t.Fatal("requestedBoolFlag() = false, want true")
}
if !requestedBoolFlag([]string{"--debug=true"}, "--debug", false) {
t.Fatal("requestedBoolFlag() with assignment = false, want true")
}
if requestedBoolFlag([]string{"--debug=false"}, "--debug", false) {
t.Fatal("requestedBoolFlag() with false assignment = true, want false")
}
}
func TestShouldSkipAutomaticUpdates(t *testing.T) {
tests := []struct {
name string
args []string
want bool
}{
{"short help flag", []string{"-h"}, true},
{"help flag", []string{"--help"}, true},
{"version flag", []string{"--version"}, true},
{"update command", []string{"update"}, true},
{"bookmark update subcommand", []string{"bookmark", "update"}, false},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := shouldSkipAutomaticUpdates(tt.args); got != tt.want {
t.Fatalf("shouldSkipAutomaticUpdates(%v) = %t, want %t", tt.args, got, tt.want)
}
})
}
}
func TestDrainAutomaticUpdateCheckWithTimeoutWaitsForCompletion(t *testing.T) {
done := make(chan struct{})
returned := make(chan struct{})
go func() {
drainAutomaticUpdateCheckWithTimeout(done, time.Second)
close(returned)
}()
select {
case <-returned:
t.Fatal("drainAutomaticUpdateCheckWithTimeout() returned before check completed")
case <-time.After(10 * time.Millisecond):
}
close(done)
select {
case <-returned:
case <-time.After(100 * time.Millisecond):
t.Fatal("drainAutomaticUpdateCheckWithTimeout() did not return after check completed")
}
}
func TestDrainAutomaticUpdateCheckWithTimeoutStopsWaiting(t *testing.T) {
done := make(chan struct{})
returned := make(chan struct{})
go func() {
drainAutomaticUpdateCheckWithTimeout(done, 20*time.Millisecond)
close(returned)
}()
select {
case <-returned:
t.Fatal("drainAutomaticUpdateCheckWithTimeout() returned before timeout elapsed")
case <-time.After(5 * time.Millisecond):
}
select {
case <-returned:
case <-time.After(100 * time.Millisecond):
t.Fatal("drainAutomaticUpdateCheckWithTimeout() did not return after timeout")
}
}

View File

@@ -0,0 +1,179 @@
package cmd
import (
"bufio"
"context"
"fmt"
"io"
"os"
"strings"
"browseros-cli/output"
"browseros-cli/update"
"github.com/spf13/cobra"
)
type updateManager interface {
CheckNow(context.Context) (*update.CheckResult, error)
Apply(context.Context, *update.CheckResult) error
}
type updateOutcome struct {
result *update.CheckResult
applied bool
canceled bool
}
func init() {
cmd := &cobra.Command{
Use: "update",
Aliases: []string{"self-update", "upgrade"},
Annotations: map[string]string{"group": "Setup:"},
Short: "Check for and apply CLI updates",
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
checkOnly, _ := cmd.Flags().GetBool("check")
yes, _ := cmd.Flags().GetBool("yes")
manager := update.NewManager(update.Options{
CurrentVersion: version,
JSONOutput: jsonOut,
Debug: debug,
Automatic: false,
})
outcome, err := runUpdateCommand(
cmd.Context(),
manager,
checkOnly,
yes,
stdinIsInteractive(os.Stdin),
os.Stdin,
os.Stderr,
)
if err != nil {
output.Error(err.Error(), 1)
}
printUpdateOutcome(outcome)
},
}
cmd.Flags().Bool("check", false, "Check for updates without applying them")
cmd.Flags().Bool("yes", false, "Apply update without prompting")
rootCmd.AddCommand(cmd)
}
func runUpdateCommand(
ctx context.Context,
manager updateManager,
checkOnly bool,
yes bool,
interactive bool,
stdin io.Reader,
stderr io.Writer,
) (*updateOutcome, error) {
result, err := manager.CheckNow(ctx)
if err != nil {
return nil, err
}
outcome := &updateOutcome{result: result}
if checkOnly || !result.UpdateAvailable {
return outcome, nil
}
if !yes {
if !interactive {
return nil, fmt.Errorf("update requires confirmation; rerun with --yes")
}
confirmed, err := confirmUpdate(stdin, stderr, result)
if err != nil {
return nil, err
}
if !confirmed {
outcome.canceled = true
return outcome, nil
}
}
if err := manager.Apply(ctx, result); err != nil {
return nil, err
}
outcome.applied = true
return outcome, nil
}
func printUpdateOutcome(outcome *updateOutcome) {
if jsonOut {
output.JSONRaw(updateOutcomePayload(outcome))
return
}
switch {
case outcome.applied:
fmt.Printf("Updated browseros-cli to v%s\n", outcome.result.LatestVersion)
case outcome.canceled:
fmt.Println("Update canceled.")
case outcome.result.UpdateAvailable:
fmt.Println(update.FormatNotice(outcome.result.CurrentVersion, outcome.result.LatestVersion))
case outcome.result != nil:
fmt.Printf("browseros-cli is up to date (v%s)\n", outcome.result.CurrentVersion)
}
}
func updateOutcomePayload(outcome *updateOutcome) map[string]any {
payload := map[string]any{
"applied": outcome.applied,
}
if outcome.canceled {
payload["canceled"] = true
}
if outcome.result == nil {
return payload
}
payload["currentVersion"] = outcome.result.CurrentVersion
payload["latestVersion"] = outcome.result.LatestVersion
payload["updateAvailable"] = outcome.result.UpdateAvailable
if outcome.result.Asset != nil {
payload["asset"] = map[string]any{
"filename": outcome.result.Asset.Filename,
"url": outcome.result.Asset.URL,
"archiveFormat": outcome.result.Asset.ArchiveFormat,
}
}
return payload
}
func confirmUpdate(
stdin io.Reader,
stderr io.Writer,
result *update.CheckResult,
) (bool, error) {
if _, err := fmt.Fprintf(
stderr,
"Install browseros-cli v%s over v%s? [y/N]: ",
result.LatestVersion,
result.CurrentVersion,
); err != nil {
return false, err
}
line, err := bufio.NewReader(stdin).ReadString('\n')
if err != nil && err != io.EOF {
return false, err
}
answer := strings.ToLower(strings.TrimSpace(line))
return answer == "y" || answer == "yes", nil
}
func stdinIsInteractive(file *os.File) bool {
info, err := file.Stat()
if err != nil {
return false
}
return info.Mode()&os.ModeCharDevice != 0
}

View File

@@ -0,0 +1,176 @@
package cmd
import (
"bytes"
"context"
"errors"
"net/http"
"net/http/httptest"
"runtime"
"testing"
"browseros-cli/update"
)
func TestRunUpdateCommandCheckOnly(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
manager := newTestUpdateManager(t)
outcome, err := runUpdateCommand(
context.Background(),
manager,
true,
false,
false,
bytes.NewBufferString(""),
&bytes.Buffer{},
)
if err != nil {
t.Fatalf("runUpdateCommand() error = %v", err)
}
if outcome.applied {
t.Fatal("runUpdateCommand() applied = true, want false")
}
if !outcome.result.UpdateAvailable {
t.Fatal("runUpdateCommand() UpdateAvailable = false, want true")
}
}
func TestRunUpdateCommandRequiresYesWithoutTTY(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
_, err := runUpdateCommand(
context.Background(),
newTestUpdateManager(t),
false,
false,
false,
bytes.NewBufferString(""),
&bytes.Buffer{},
)
if err == nil {
t.Fatal("runUpdateCommand() error = nil, want confirmation error")
}
}
func TestRunUpdateCommandCancel(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
stderr := &bytes.Buffer{}
outcome, err := runUpdateCommand(
context.Background(),
newTestUpdateManager(t),
false,
false,
true,
bytes.NewBufferString("n\n"),
stderr,
)
if err != nil {
t.Fatalf("runUpdateCommand() error = %v", err)
}
if !outcome.canceled {
t.Fatal("runUpdateCommand() canceled = false, want true")
}
if stderr.Len() == 0 {
t.Fatal("confirm prompt was not written to stderr")
}
}
func TestRunUpdateCommandYesAppliesWithoutPrompt(t *testing.T) {
manager := &fakeUpdateManager{
result: &update.CheckResult{
CurrentVersion: "1.0.0",
LatestVersion: "9.9.9",
UpdateAvailable: true,
Asset: &update.Asset{
Filename: "browseros-cli_9.9.9_test.tar.gz",
URL: "https://cdn.example.com/cli/v9.9.9/browseros-cli_9.9.9_test.tar.gz",
ArchiveFormat: "tar.gz",
SHA256: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
},
},
}
stderr := &bytes.Buffer{}
outcome, err := runUpdateCommand(
context.Background(),
manager,
false,
true,
false,
bytes.NewBufferString(""),
stderr,
)
if err != nil {
t.Fatalf("runUpdateCommand() error = %v", err)
}
if !outcome.applied {
t.Fatal("runUpdateCommand() applied = false, want true")
}
if manager.applyCalls != 1 {
t.Fatalf("Apply() calls = %d, want 1", manager.applyCalls)
}
if stderr.Len() != 0 {
t.Fatal("prompt was written despite --yes")
}
}
type fakeUpdateManager struct {
result *update.CheckResult
checkErr error
applyErr error
applyCalls int
}
func (m *fakeUpdateManager) CheckNow(context.Context) (*update.CheckResult, error) {
if m.checkErr != nil {
return nil, m.checkErr
}
if m.result == nil {
return nil, errors.New("missing check result")
}
return m.result, nil
}
func (m *fakeUpdateManager) Apply(context.Context, *update.CheckResult) error {
m.applyCalls++
return m.applyErr
}
func newTestUpdateManager(t *testing.T) *update.Manager {
t.Helper()
key, err := update.PlatformKey(runtime.GOOS, runtime.GOARCH)
if err != nil {
t.Fatalf("PlatformKey() error = %v", err)
}
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{
"version":"9.9.9",
"published_at":"2026-03-27T19:00:00Z",
"tag":"browseros-cli-v9.9.9",
"assets":{
"` + key + `":{
"filename":"browseros-cli_9.9.9_test.tar.gz",
"url":"https://cdn.example.com/cli/v9.9.9/browseros-cli_9.9.9_test.tar.gz",
"archive_format":"tar.gz",
"sha256":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
}
}
}`))
}))
t.Cleanup(server.Close)
return update.NewManager(update.Options{
CurrentVersion: "1.0.0",
ManifestURL: server.URL,
Automatic: false,
HTTPClient: server.Client(),
})
}

View File

@@ -4,20 +4,28 @@ go 1.25.7
require (
github.com/fatih/color v1.18.0
github.com/minio/selfupdate v0.6.0
github.com/modelcontextprotocol/go-sdk v1.4.0
github.com/posthog/posthog-go v1.11.2
github.com/spf13/cobra v1.10.2
golang.org/x/mod v0.34.0
gopkg.in/yaml.v3 v3.0.1
)
require (
aead.dev/minisign v0.2.0 // indirect
github.com/goccy/go-json v0.10.5 // indirect
github.com/google/jsonschema-go v0.4.2 // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/hashicorp/golang-lru/v2 v2.0.7 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/modelcontextprotocol/go-sdk v1.4.0 // indirect
github.com/segmentio/asm v1.1.3 // indirect
github.com/segmentio/encoding v0.5.3 // indirect
github.com/spf13/pflag v1.0.9 // indirect
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
golang.org/x/crypto v0.0.0-20211209193657-4570a0811e8b // indirect
golang.org/x/oauth2 v0.34.0 // indirect
golang.org/x/sys v0.40.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)

View File

@@ -1,8 +1,22 @@
aead.dev/minisign v0.2.0 h1:kAWrq/hBRu4AARY6AlciO83xhNnW9UaC8YipS2uhLPk=
aead.dev/minisign v0.2.0/go.mod h1:zdq6LdSd9TbuSxchxwhpA9zEb9YXcVGoE8JakuiGaIQ=
github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/fatih/color v1.18.0 h1:S8gINlzdQ840/4pfAwic/ZE0djQEH3wM94VfqLTZcOM=
github.com/fatih/color v1.18.0/go.mod h1:4FelSpRwEGDpQ12mAdzqdOukCy4u8WUtOY6lkT/6HfU=
github.com/goccy/go-json v0.10.5 h1:Fq85nIqj+gXn/S5ahsiTlK3TmC85qgirsdTP/+DeaC4=
github.com/goccy/go-json v0.10.5/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/golang-jwt/jwt/v5 v5.3.0 h1:pv4AsKCKKZuqlgs5sUmn4x8UlGa0kEVt/puTpKx9vvo=
github.com/golang-jwt/jwt/v5 v5.3.0/go.mod h1:fxCRLWMO43lRc8nhHWY6LGqRcf+1gQWArsqaEUEa5bE=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/jsonschema-go v0.4.2 h1:tmrUohrwoLZZS/P3x7ex0WAVknEkBZM46iALbcqoRA8=
github.com/google/jsonschema-go v0.4.2/go.mod h1:r5quNTdLOYEz95Ru18zA0ydNbBuYoo9tgaYcxEYhJVE=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k=
github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM=
github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
github.com/mattn/go-colorable v0.1.13 h1:fFA4WZxdEF4tXPZVKMLwD8oUnCTTo08duU7wxecdEvA=
@@ -10,8 +24,14 @@ github.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovk
github.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/minio/selfupdate v0.6.0 h1:i76PgT0K5xO9+hjzKcacQtO7+MjJ4JKA8Ak8XQ9DDwU=
github.com/minio/selfupdate v0.6.0/go.mod h1:bO02GTIPCMQFTEvE5h4DjYB58bCoZ35XLeBf0buTDdM=
github.com/modelcontextprotocol/go-sdk v1.4.0 h1:u0kr8lbJc1oBcawK7Df+/ajNMpIDFE41OEPxdeTLOn8=
github.com/modelcontextprotocol/go-sdk v1.4.0/go.mod h1:Nxc2n+n/GdCebUaqCOhTetptS17SXXNu9IfNTaLDi1E=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/posthog/posthog-go v1.11.2 h1:ApKTtOhIeWhUBc4ByO+mlbg2o0iZaEGJnJHX2QDnn5Q=
github.com/posthog/posthog-go v1.11.2/go.mod h1:xsVOW9YImilUcazwPNEq4PJDqEZf2KeCS758zXjwkPg=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/segmentio/asm v1.1.3 h1:WM03sfUOENvvKexOLp+pCqgb/WDjsi7EK8gIsICtzhc=
github.com/segmentio/asm v1.1.3/go.mod h1:Ld3L4ZXGNcSLRg4JBsZ3//1+f/TjYl0Mzen/DQy1EJg=
@@ -21,17 +41,39 @@ github.com/spf13/cobra v1.10.2 h1:DMTTonx5m65Ic0GOoRY2c16WCbHxOOw6xxezuLaBpcU=
github.com/spf13/cobra v1.10.2/go.mod h1:7C1pvHqHw5A4vrJfjNwvOdzYu0Gml16OCs2GRiTUUS4=
github.com/spf13/pflag v1.0.9 h1:9exaQaMOCwffKiiiYk6/BndUBv+iRViNW+4lEMi0PvY=
github.com/spf13/pflag v1.0.9/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
github.com/yosida95/uritemplate/v3 v3.0.2 h1:Ed3Oyj9yrmi9087+NczuL5BwkIc4wvTb5zIM+UJPGz4=
github.com/yosida95/uritemplate/v3 v3.0.2/go.mod h1:ILOh0sOhIJR3+L/8afwt/kE++YT040gmv5BQTMR2HP4=
go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210220033148-5ea612d1eb83/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I=
golang.org/x/crypto v0.0.0-20211209193657-4570a0811e8b h1:QAqMVf3pSa6eeTsuklijukjXBlj7Es2QQplab+/RbQ4=
golang.org/x/crypto v0.0.0-20211209193657-4570a0811e8b/go.mod h1:IxCIyHEi3zRg3s0A5j5BB6A9Jmi73HwBIUl50j+osU4=
golang.org/x/mod v0.34.0 h1:xIHgNUUnW6sYkcM5Jleh05DvLOtwc6RitGHbDk4akRI=
golang.org/x/mod v0.34.0/go.mod h1:ykgH52iCZe79kzLLMhyCUzhMci+nQj+0XkbXpNYtVjY=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y=
golang.org/x/oauth2 v0.34.0 h1:hqK/t4AKgbqWkdkcAeI8XLmbK+4m4G5YeQRrmiotGlw=
golang.org/x/oauth2 v0.34.0/go.mod h1:lzm5WQJQwKZ3nwavOZ3IS5Aulzxi68dUSgRHujetwEA=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210228012217-479acdf4ea46/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210423082822-04245dca01da/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.25.0 h1:r+8e+loiHxRqhXVl6ML1nO3l1+oFoWbnlu2Ehimmi34=
golang.org/x/sys v0.25.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ=
golang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/term v0.0.0-20201117132131-f5c789dd3221/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.42.0 h1:uNgphsn75Tdz5Ji2q36v/nsFSfR/9BRFvqhGBaJGd5k=
golang.org/x/tools v0.42.0/go.mod h1:Ma6lCIwGZvHK6XtgbswSoWroEkhugApmsXyrUmBhfr0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

View File

@@ -44,7 +44,10 @@ func (c *Client) connect(ctx context.Context) (*sdkmcp.ClientSession, error) {
session, err := sdkClient.Connect(ctx, transport, nil)
if err != nil {
return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n Is the server running? Try: browseros-cli init", c.BaseURL, err)
return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n\n"+
" If BrowserOS is running on a different port: browseros-cli init --auto\n"+
" If BrowserOS is not running: browseros-cli launch\n"+
" If not installed: browseros-cli install", c.BaseURL, err)
}
return session, nil
}
@@ -184,7 +187,10 @@ func (c *Client) Status() (map[string]any, error) {
func (c *Client) restGET(path string) (map[string]any, error) {
resp, err := c.HTTPClient.Get(c.BaseURL + path)
if err != nil {
return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n Try: browseros-cli init", c.BaseURL, err)
return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n\n"+
" If BrowserOS is running on a different port: browseros-cli init --auto\n"+
" If BrowserOS is not running: browseros-cli launch\n"+
" If not installed: browseros-cli install", c.BaseURL, err)
}
defer resp.Body.Close()

View File

@@ -0,0 +1,2 @@
.binary/
node_modules/

View File

@@ -0,0 +1,81 @@
# browseros-cli
Command-line interface for controlling BrowserOS -- launch and automate the browser from the terminal.
## Installation
**Zero install (recommended):**
```bash
npx browseros-cli --help
```
**Global install:**
```bash
npm install -g browseros-cli
```
**Shell script fallback:**
```bash
curl -fsSL https://cdn.browseros.com/cli/install.sh | bash
```
## Quick Start
```bash
# Download BrowserOS
browseros-cli install
# Start BrowserOS
browseros-cli launch
# Auto-configure MCP settings for your AI tools
browseros-cli init --auto
# Verify everything is working
browseros-cli health
```
## Usage
### Navigation
```bash
browseros-cli navigate "https://example.com"
```
### Observation
```bash
browseros-cli snapshot # Get the accessibility tree of the current page
browseros-cli console-logs # View browser console output
```
### Screenshots
```bash
browseros-cli screenshot # Capture the current page
```
### Input
```bash
browseros-cli click 42 # Click an element by its node ID
browseros-cli fill 85 "query" # Type text into an input field
```
### Agent Mode
```bash
browseros-cli agent "Search for flights to Tokyo"
```
## Documentation
Full documentation is available at [browseros.com](https://browseros.com).
## License
MIT

View File

@@ -0,0 +1,32 @@
#!/usr/bin/env node
const { execFileSync, spawnSync } = require('node:child_process')
const path = require('node:path')
const fs = require('node:fs')
const BINARY_DIR = path.join(__dirname, '..', '.binary')
const EXT = process.platform === 'win32' ? '.exe' : ''
const BIN_PATH = path.join(BINARY_DIR, `browseros-cli${EXT}`)
if (!fs.existsSync(BIN_PATH)) {
console.error('browseros-cli: binary not found, downloading...')
try {
execFileSync(
process.execPath,
[path.join(__dirname, '..', 'scripts', 'postinstall.js')],
{ stdio: 'inherit', env: { ...process.env, BROWSEROS_NPM_FORCE: '1' } },
)
} catch {
console.error(
'browseros-cli: failed to download binary. Try reinstalling:\n npm install -g browseros-cli',
)
process.exit(1)
}
}
const result = spawnSync(BIN_PATH, process.argv.slice(2), {
stdio: 'inherit',
env: { ...process.env, BROWSEROS_INSTALL_METHOD: 'npm' },
})
process.exit(result.status ?? 1)

View File

@@ -0,0 +1,45 @@
{
"name": "browseros-cli",
"version": "0.2.0",
"description": "Command-line interface for controlling BrowserOS — launch and automate the browser from the terminal",
"bin": {
"browseros-cli": "bin/browseros-cli.js"
},
"scripts": {
"postinstall": "node scripts/postinstall.js"
},
"keywords": [
"browseros",
"cli",
"browser",
"automation",
"mcp",
"ai-agent",
"model-context-protocol"
],
"repository": {
"type": "git",
"url": "https://github.com/browseros-ai/BrowserOS",
"directory": "packages/browseros-agent/apps/cli/npm"
},
"homepage": "https://browseros.com",
"bugs": "https://github.com/browseros-ai/BrowserOS/issues",
"license": "MIT",
"os": [
"darwin",
"linux",
"win32"
],
"cpu": [
"x64",
"arm64"
],
"engines": {
"node": ">=18"
},
"files": [
"bin/",
"scripts/",
"README.md"
]
}

View File

@@ -0,0 +1,142 @@
const https = require('node:https')
const http = require('node:http')
const fs = require('node:fs')
const path = require('node:path')
const { execSync } = require('node:child_process')
const { createHash } = require('node:crypto')
const VERSION = require('../package.json').version
const GITHUB_RELEASE_BASE = `https://github.com/browseros-ai/BrowserOS/releases/download/browseros-cli-v${VERSION}`
const BINARY_DIR = path.join(__dirname, '..', '.binary')
const EXT = process.platform === 'win32' ? '.exe' : ''
const BINARY_PATH = path.join(BINARY_DIR, `browseros-cli${EXT}`)
if (process.env.CI && !process.env.BROWSEROS_NPM_FORCE) {
process.exit(0)
}
const PLATFORM_MAP = { darwin: 'darwin', linux: 'linux', win32: 'windows' }
const ARCH_MAP = { x64: 'amd64', arm64: 'arm64' }
const platform = PLATFORM_MAP[process.platform]
const arch = ARCH_MAP[process.arch]
if (!platform || !arch) {
console.error(
`browseros-cli: unsupported platform ${process.platform}/${process.arch}`,
)
process.exit(1)
}
const isWindows = platform === 'windows'
const archiveExt = isWindows ? 'zip' : 'tar.gz'
const archiveName = `browseros-cli_${VERSION}_${platform}_${arch}.${archiveExt}`
const archiveURL = `${GITHUB_RELEASE_BASE}/${archiveName}`
const checksumURL = `${GITHUB_RELEASE_BASE}/checksums.txt`
const MAX_REDIRECTS = 5
function download(url, redirects = 0) {
return new Promise((resolve, reject) => {
if (redirects > MAX_REDIRECTS) {
return reject(new Error(`Too many redirects for ${url}`))
}
const client = url.startsWith('https') ? https : http
client
.get(url, { headers: { 'User-Agent': 'browseros-cli-npm' } }, (res) => {
if (
res.statusCode >= 300 &&
res.statusCode < 400 &&
res.headers.location
) {
return download(res.headers.location, redirects + 1).then(
resolve,
reject,
)
}
if (res.statusCode !== 200) {
return reject(new Error(`HTTP ${res.statusCode} for ${url}`))
}
const chunks = []
res.on('data', (chunk) => chunks.push(chunk))
res.on('end', () => resolve(Buffer.concat(chunks)))
res.on('error', reject)
})
.on('error', reject)
})
}
async function main() {
console.log(
`browseros-cli: downloading v${VERSION} for ${platform}/${arch}...`,
)
const [archiveBuffer, checksumBuffer] = await Promise.all([
download(archiveURL),
download(checksumURL).catch(() => null),
])
if (checksumBuffer) {
const checksumText = checksumBuffer.toString('utf-8')
const expectedLine = checksumText
.split('\n')
.find((l) => l.includes(archiveName))
if (expectedLine) {
const expected = expectedLine.split(/\s+/)[0]
const actual = createHash('sha256').update(archiveBuffer).digest('hex')
if (actual !== expected) {
console.error(
`browseros-cli: checksum mismatch!\n expected: ${expected}\n got: ${actual}`,
)
process.exit(1)
}
console.log('browseros-cli: checksum verified.')
} else {
console.warn(
'browseros-cli: warning: checksum entry not found in checksums.txt, skipping verification.',
)
}
} else {
console.warn(
'browseros-cli: warning: could not fetch checksums.txt, skipping verification.',
)
}
fs.mkdirSync(BINARY_DIR, { recursive: true })
const tmpArchive = path.join(BINARY_DIR, archiveName)
fs.writeFileSync(tmpArchive, archiveBuffer)
if (isWindows) {
execSync(
`powershell -Command "Expand-Archive -Force -Path '${tmpArchive}' -DestinationPath '${BINARY_DIR}'"`,
{ stdio: 'inherit' },
)
} else {
execSync(`tar -xzf "${tmpArchive}" -C "${BINARY_DIR}"`, {
stdio: 'inherit',
})
}
fs.unlinkSync(tmpArchive)
if (!fs.existsSync(BINARY_PATH)) {
console.error(
`browseros-cli: binary not found after extraction at ${BINARY_PATH}`,
)
process.exit(1)
}
if (!isWindows) {
fs.chmodSync(BINARY_PATH, 0o755)
}
console.log(`browseros-cli: installed v${VERSION} successfully.`)
}
main().catch((err) => {
console.error(`browseros-cli: installation failed: ${err.message}`)
console.error(
'You can install manually: curl -fsSL https://cdn.browseros.com/cli/install.sh | bash',
)
process.exit(1)
})

View File

@@ -0,0 +1,115 @@
#
# Install browseros-cli for Windows — downloads the latest release binary.
#
# Usage (PowerShell — save and run):
# Invoke-WebRequest -Uri "https://cdn.browseros.com/cli/install.ps1" -OutFile install.ps1
# .\install.ps1
# .\install.ps1 -Version "0.1.0" -Dir "C:\tools\browseros"
#
# Usage (one-liner, uses env vars for options):
# & { $env:BROWSEROS_VERSION="0.1.0"; irm https://cdn.browseros.com/cli/install.ps1 | iex }
#
param(
[string]$Version = "",
[string]$Dir = ""
)
$ErrorActionPreference = "Stop"
# TLS 1.2 — older PS 5.1 defaults to TLS 1.0
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$CdnBase = "https://cdn.browseros.com/cli"
$Binary = "browseros-cli"
# When piped via irm | iex, param() is ignored — fall back to env vars
if (-not $Version) { $Version = $env:BROWSEROS_VERSION }
if (-not $Dir) { $Dir = if ($env:BROWSEROS_DIR) { $env:BROWSEROS_DIR } else { "$env:LOCALAPPDATA\browseros-cli\bin" } }
# ── Resolve latest version ───────────────────────────────────────────────────
if (-not $Version) {
Write-Host "Fetching latest version..."
$Version = (Invoke-WebRequest -Uri "$CdnBase/latest/version.txt" -UseBasicParsing).Content.Trim()
if (-not $Version) {
Write-Error "Could not determine latest version. Try: -Version 0.1.0"
exit 1
}
}
if ($Version -notmatch '^\d+\.\d+\.\d+(-[a-zA-Z0-9.]+)?$') {
Write-Error "Unexpected version format: '$Version'"
exit 1
}
Write-Host "Installing browseros-cli v$Version..."
# ── Detect architecture ──────────────────────────────────────────────────────
# $env:PROCESSOR_ARCHITECTURE lies under x64 emulation on ARM64 Windows.
# Use .NET RuntimeInformation when available, fall back to PROCESSOR_ARCHITEW6432.
$Arch = "amd64"
try {
$osArch = [System.Runtime.InteropServices.RuntimeInformation]::OSArchitecture
if ($osArch -eq [System.Runtime.InteropServices.Architecture]::Arm64) { $Arch = "arm64" }
} catch {
if ($env:PROCESSOR_ARCHITEW6432 -eq "ARM64" -or $env:PROCESSOR_ARCHITECTURE -eq "ARM64") {
$Arch = "arm64"
}
}
if (-not [Environment]::Is64BitOperatingSystem) {
Write-Error "32-bit Windows is not supported."
exit 1
}
# ── Download and extract ─────────────────────────────────────────────────────
$Filename = "${Binary}_${Version}_windows_${Arch}.zip"
$Url = "$CdnBase/v$Version/$Filename"
$TmpDir = Join-Path ([System.IO.Path]::GetTempPath()) ("browseros-cli-install-" + [System.IO.Path]::GetRandomFileName())
try {
New-Item -ItemType Directory -Path $TmpDir | Out-Null
$ZipPath = Join-Path $TmpDir $Filename
Write-Host "Downloading $Url..."
Invoke-WebRequest -Uri $Url -OutFile $ZipPath -UseBasicParsing
Expand-Archive -Path $ZipPath -DestinationPath $TmpDir -Force
$Exe = Get-ChildItem -Path $TmpDir -Filter "$Binary.exe" -File -Recurse | Select-Object -First 1
if (-not $Exe) {
Write-Error "Binary not found in archive."
exit 1
}
# ── Install ──────────────────────────────────────────────────────────────
if (-not (Test-Path $Dir)) {
New-Item -ItemType Directory -Path $Dir -Force | Out-Null
}
Move-Item -Force $Exe.FullName (Join-Path $Dir "$Binary.exe")
Write-Host "Installed $Binary.exe to $Dir"
} finally {
if (Test-Path $TmpDir) { Remove-Item -Recurse -Force $TmpDir -ErrorAction SilentlyContinue }
}
# ── PATH ─────────────────────────────────────────────────────────────────────
$UserPath = [Environment]::GetEnvironmentVariable("Path", "User")
$PathEntries = $UserPath -split ";" | Where-Object { $_ -ne "" }
if ($Dir -notin $PathEntries) {
Write-Host ""
Write-Host "Adding $Dir to your user PATH..."
[Environment]::SetEnvironmentVariable("Path", "$Dir;$UserPath", "User")
$env:Path = "$Dir;$env:Path"
Write-Host "Done. Restart your terminal for PATH changes to take effect."
}
Write-Host ""
Write-Host "Run 'browseros-cli --help' to get started."

View File

@@ -0,0 +1,151 @@
#!/usr/bin/env bash
#
# Install browseros-cli — downloads the latest release binary for your platform.
#
# Usage:
# curl -fsSL https://cdn.browseros.com/cli/install.sh | bash
#
# # Or with options:
# curl -fsSL https://cdn.browseros.com/cli/install.sh | bash -s -- --version 0.1.0 --dir /usr/local/bin
set -euo pipefail
CDN_BASE="https://cdn.browseros.com/cli"
BINARY="browseros-cli"
INSTALL_DIR="${HOME}/.browseros/bin"
# ── Parse arguments ──────────────────────────────────────────────────────────
VERSION=""
CUSTOM_DIR=""
while [[ $# -gt 0 ]]; do
case "$1" in
--version)
[[ $# -lt 2 ]] && { echo "Error: --version requires a value" >&2; exit 1; }
VERSION="$2"; shift 2 ;;
--dir)
[[ $# -lt 2 ]] && { echo "Error: --dir requires a value" >&2; exit 1; }
CUSTOM_DIR="$2"; shift 2 ;;
--help)
echo "Usage: install.sh [--version VERSION] [--dir INSTALL_DIR]"
echo ""
echo " --version Install a specific version (default: latest)"
echo " --dir Install directory (default: ~/.browseros/bin)"
exit 0
;;
*) echo "Unknown option: $1" >&2; exit 1 ;;
esac
done
[[ -n "$CUSTOM_DIR" ]] && INSTALL_DIR="$CUSTOM_DIR"
# ── Resolve latest version ───────────────────────────────────────────────────
if [[ -z "$VERSION" ]]; then
VERSION=$(curl -fsSL "${CDN_BASE}/latest/version.txt" | tr -d '[:space:]')
if [[ -z "$VERSION" ]]; then
echo "Error: could not determine latest version." >&2
echo " Try: install.sh --version 0.1.0" >&2
exit 1
fi
fi
if [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$ ]]; then
echo "Error: unexpected version format: '$VERSION'" >&2
exit 1
fi
echo "Installing browseros-cli v${VERSION}..."
# ── Detect platform ──────────────────────────────────────────────────────────
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
ARCH=$(uname -m)
case "$OS" in
darwin) OS="darwin" ;;
linux) OS="linux" ;;
*) echo "Error: unsupported OS: $OS" >&2; exit 1 ;;
esac
case "$ARCH" in
x86_64|amd64) ARCH="amd64" ;;
arm64|aarch64) ARCH="arm64" ;;
*) echo "Error: unsupported architecture: $ARCH" >&2; exit 1 ;;
esac
# ── Download and extract ─────────────────────────────────────────────────────
FILENAME="${BINARY}_${VERSION}_${OS}_${ARCH}.tar.gz"
URL="${CDN_BASE}/v${VERSION}/${FILENAME}"
CHECKSUM_URL="${CDN_BASE}/v${VERSION}/checksums.txt"
TMPDIR_DL=$(mktemp -d)
trap 'rm -rf "$TMPDIR_DL"' EXIT
echo "Downloading ${URL}..."
curl -fSL --progress-bar -o "${TMPDIR_DL}/${FILENAME}" "$URL"
# Verify checksum if sha256sum/shasum is available
if curl -fsSL -o "${TMPDIR_DL}/checksums.txt" "$CHECKSUM_URL" 2>/dev/null; then
expected=$(awk -v filename="$FILENAME" '$2 == filename { print $1; exit }' "${TMPDIR_DL}/checksums.txt")
if [[ -n "$expected" ]]; then
if command -v sha256sum >/dev/null 2>&1; then
actual=$(sha256sum "${TMPDIR_DL}/${FILENAME}" | awk '{print $1}')
elif command -v shasum >/dev/null 2>&1; then
actual=$(shasum -a 256 "${TMPDIR_DL}/${FILENAME}" | awk '{print $1}')
else
actual=""
echo "Warning: no sha256sum/shasum found; skipping checksum verification." >&2
fi
if [[ -n "$actual" && "$actual" != "$expected" ]]; then
echo "Error: checksum mismatch (expected ${expected}, got ${actual})" >&2
exit 1
fi
[[ -n "$actual" ]] && echo "Checksum verified."
else
echo "Warning: checksum not found in checksums.txt; skipping verification." >&2
fi
else
echo "Warning: could not fetch checksums.txt; skipping checksum verification." >&2
fi
tar -xzf "${TMPDIR_DL}/${FILENAME}" -C "$TMPDIR_DL"
BINARY_PATH="${TMPDIR_DL}/${BINARY}"
if [[ ! -f "$BINARY_PATH" ]]; then
BINARY_PATH=$(find "$TMPDIR_DL" -type f -name "$BINARY" -print -quit)
fi
if [[ -z "$BINARY_PATH" || ! -f "$BINARY_PATH" ]]; then
echo "Error: binary not found in archive." >&2
exit 1
fi
# ── Install ──────────────────────────────────────────────────────────────────
mkdir -p "$INSTALL_DIR"
mv "$BINARY_PATH" "${INSTALL_DIR}/${BINARY}"
chmod +x "${INSTALL_DIR}/${BINARY}"
echo "Installed ${BINARY} to ${INSTALL_DIR}/${BINARY}"
# ── PATH hint ────────────────────────────────────────────────────────────────
if ! echo "$PATH" | tr ':' '\n' | grep -qx "$INSTALL_DIR"; then
echo ""
echo "Add browseros-cli to your PATH:"
echo ""
SHELL_NAME=$(basename "${SHELL:-/bin/bash}")
case "$SHELL_NAME" in
zsh) echo " echo 'export PATH=\"${INSTALL_DIR}:\$PATH\"' >> ~/.zshrc && source ~/.zshrc" ;;
fish) echo " fish_add_path ${INSTALL_DIR}" ;;
*) echo " echo 'export PATH=\"${INSTALL_DIR}:\$PATH\"' >> ~/.bashrc && source ~/.bashrc" ;;
esac
fi
echo ""
echo "Run 'browseros-cli --help' to get started."

View File

@@ -0,0 +1,49 @@
package update
import (
"bytes"
"crypto/sha256"
"encoding/hex"
"fmt"
"strings"
"github.com/minio/selfupdate"
)
func CheckPermissions(targetPath string) error {
options := selfupdate.Options{TargetPath: targetPath}
return options.CheckPermissions()
}
func VerifyChecksum(data []byte, expectedHex string) error {
expected, err := decodeChecksum(expectedHex)
if err != nil {
return err
}
actual := sha256.Sum256(data)
if !bytes.Equal(actual[:], expected) {
return fmt.Errorf(
"checksum mismatch: expected %s, got %s",
hex.EncodeToString(expected),
hex.EncodeToString(actual[:]),
)
}
return nil
}
func ApplyBinary(binary []byte, targetPath string) error {
options := selfupdate.Options{TargetPath: targetPath}
err := selfupdate.Apply(bytes.NewReader(binary), options)
if rollbackErr := selfupdate.RollbackError(err); rollbackErr != nil {
return fmt.Errorf("update failed and rollback failed: %w", rollbackErr)
}
return err
}
func decodeChecksum(checksumHex string) ([]byte, error) {
value := strings.TrimSpace(checksumHex)
if value == "" {
return nil, fmt.Errorf("missing checksum")
}
return hex.DecodeString(value)
}

View File

@@ -0,0 +1,138 @@
package update
import (
"archive/tar"
"archive/zip"
"bytes"
"compress/gzip"
"context"
"fmt"
"io"
"net/http"
)
const maxAssetSize = 64 << 20
const maxBinarySize = 256 << 20
func DownloadAsset(ctx context.Context, client *http.Client, asset Asset) ([]byte, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, asset.URL, nil)
if err != nil {
return nil, err
}
resp, err := client.Do(req)
if err != nil {
return nil, err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("update download returned HTTP %d", resp.StatusCode)
}
return readAssetBytes(resp.Body)
}
func readAssetBytes(reader io.Reader) ([]byte, error) {
limited := io.LimitReader(reader, maxAssetSize+1)
data, err := io.ReadAll(limited)
if err != nil {
return nil, err
}
if len(data) > maxAssetSize {
return nil, fmt.Errorf("update asset exceeds %d bytes", maxAssetSize)
}
return data, nil
}
func ExtractBinary(archive []byte, format string) ([]byte, error) {
switch format {
case "tar.gz":
return extractTarGzBinary(archive)
case "zip":
return extractZipBinary(archive)
default:
return nil, fmt.Errorf("unsupported archive format %q", format)
}
}
func extractTarGzBinary(archive []byte) ([]byte, error) {
gzipReader, err := gzip.NewReader(bytes.NewReader(archive))
if err != nil {
return nil, err
}
defer gzipReader.Close()
tarReader := tar.NewReader(gzipReader)
return readTarBinary(tarReader)
}
func readTarBinary(reader *tar.Reader) ([]byte, error) {
var binary []byte
for {
header, err := reader.Next()
if err == io.EOF {
break
}
if err != nil {
return nil, err
}
if header.Typeflag != tar.TypeReg {
continue
}
if binary != nil {
return nil, fmt.Errorf("archive contains multiple files; expected exactly one binary")
}
binary, err = io.ReadAll(io.LimitReader(reader, maxBinarySize+1))
if err != nil {
return nil, err
}
if len(binary) > maxBinarySize {
return nil, fmt.Errorf("extracted binary exceeds %d bytes", maxBinarySize)
}
}
if binary == nil {
return nil, fmt.Errorf("archive does not contain a file")
}
return binary, nil
}
func extractZipBinary(archive []byte) ([]byte, error) {
reader, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
if err != nil {
return nil, err
}
var binary []byte
for _, file := range reader.File {
if file.FileInfo().IsDir() {
continue
}
if binary != nil {
return nil, fmt.Errorf("archive contains multiple files; expected exactly one binary")
}
rc, err := file.Open()
if err != nil {
return nil, err
}
binary, err = io.ReadAll(io.LimitReader(rc, maxBinarySize+1))
rc.Close()
if err != nil {
return nil, err
}
if len(binary) > maxBinarySize {
return nil, fmt.Errorf("extracted binary exceeds %d bytes", maxBinarySize)
}
}
if binary == nil {
return nil, fmt.Errorf("archive does not contain a file")
}
return binary, nil
}

View File

@@ -0,0 +1,168 @@
package update
import (
"archive/tar"
"archive/zip"
"bytes"
"compress/gzip"
"crypto/sha256"
"encoding/hex"
"os"
"path/filepath"
"testing"
)
func TestExtractBinaryTarGz(t *testing.T) {
archive := createTarGz(t, map[string]string{"browseros-cli": "new-binary"})
binary, err := ExtractBinary(archive, "tar.gz")
if err != nil {
t.Fatalf("ExtractBinary() error = %v", err)
}
if string(binary) != "new-binary" {
t.Fatalf("ExtractBinary() = %q, want %q", string(binary), "new-binary")
}
}
func TestExtractBinaryZip(t *testing.T) {
archive := createZip(t, map[string]string{"browseros-cli.exe": "new-binary"})
binary, err := ExtractBinary(archive, "zip")
if err != nil {
t.Fatalf("ExtractBinary() error = %v", err)
}
if string(binary) != "new-binary" {
t.Fatalf("ExtractBinary() = %q, want %q", string(binary), "new-binary")
}
}
func TestExtractBinaryTarGzRejectsMultipleFiles(t *testing.T) {
archive := createTarGz(t, map[string]string{
"browseros-cli": "new-binary",
"browseros-cli.sig": "signature",
})
_, err := ExtractBinary(archive, "tar.gz")
if err == nil {
t.Fatal("ExtractBinary() error = nil, want multiple files error")
}
if err.Error() != "archive contains multiple files; expected exactly one binary" {
t.Fatalf("ExtractBinary() error = %q", err)
}
}
func TestVerifyChecksumValid(t *testing.T) {
data := []byte("some-data")
sum := sha256.Sum256(data)
if err := VerifyChecksum(data, hex.EncodeToString(sum[:])); err != nil {
t.Fatalf("VerifyChecksum() error = %v", err)
}
}
func TestVerifyChecksumMismatch(t *testing.T) {
data := []byte("some-data")
badChecksum := "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
if err := VerifyChecksum(data, badChecksum); err == nil {
t.Fatal("VerifyChecksum() error = nil, want mismatch error")
}
}
func TestApplyBinary(t *testing.T) {
targetPath := filepath.Join(t.TempDir(), "browseros-cli")
if err := os.WriteFile(targetPath, []byte("old-binary"), 0755); err != nil {
t.Fatalf("WriteFile() error = %v", err)
}
newBinary := []byte("new-binary")
if err := ApplyBinary(newBinary, targetPath); err != nil {
t.Fatalf("ApplyBinary() error = %v", err)
}
data, err := os.ReadFile(targetPath)
if err != nil {
t.Fatalf("ReadFile() error = %v", err)
}
if string(data) != "new-binary" {
t.Fatalf("updated binary = %q, want %q", string(data), "new-binary")
}
}
func TestVerifyThenApplyIntegration(t *testing.T) {
archive := createTarGz(t, map[string]string{"browseros-cli": "updated-binary"})
archiveSum := sha256.Sum256(archive)
if err := VerifyChecksum(archive, hex.EncodeToString(archiveSum[:])); err != nil {
t.Fatalf("VerifyChecksum(archive) error = %v", err)
}
binary, err := ExtractBinary(archive, "tar.gz")
if err != nil {
t.Fatalf("ExtractBinary() error = %v", err)
}
targetPath := filepath.Join(t.TempDir(), "browseros-cli")
if err := os.WriteFile(targetPath, []byte("old"), 0755); err != nil {
t.Fatalf("WriteFile() error = %v", err)
}
if err := ApplyBinary(binary, targetPath); err != nil {
t.Fatalf("ApplyBinary() error = %v", err)
}
data, err := os.ReadFile(targetPath)
if err != nil {
t.Fatalf("ReadFile() error = %v", err)
}
if string(data) != "updated-binary" {
t.Fatalf("binary = %q, want %q", string(data), "updated-binary")
}
}
func createTarGz(t *testing.T, files map[string]string) []byte {
t.Helper()
var buffer bytes.Buffer
gzipWriter := gzip.NewWriter(&buffer)
tarWriter := tar.NewWriter(gzipWriter)
for name, body := range files {
data := []byte(body)
if err := tarWriter.WriteHeader(&tar.Header{
Name: name,
Mode: 0755,
Size: int64(len(data)),
}); err != nil {
t.Fatalf("WriteHeader() error = %v", err)
}
if _, err := tarWriter.Write(data); err != nil {
t.Fatalf("Write() error = %v", err)
}
}
if err := tarWriter.Close(); err != nil {
t.Fatalf("Close() error = %v", err)
}
if err := gzipWriter.Close(); err != nil {
t.Fatalf("Close() error = %v", err)
}
return buffer.Bytes()
}
func createZip(t *testing.T, files map[string]string) []byte {
t.Helper()
var buffer bytes.Buffer
zipWriter := zip.NewWriter(&buffer)
for name, body := range files {
fileWriter, err := zipWriter.Create(name)
if err != nil {
t.Fatalf("Create() error = %v", err)
}
if _, err := fileWriter.Write([]byte(body)); err != nil {
t.Fatalf("Write() error = %v", err)
}
}
if err := zipWriter.Close(); err != nil {
t.Fatalf("Close() error = %v", err)
}
return buffer.Bytes()
}

View File

@@ -0,0 +1,273 @@
package update
import (
"context"
"fmt"
"net/http"
"os"
"runtime"
"time"
)
const (
DefaultManifestURL = "https://cdn.browseros.com/cli/latest/manifest.json"
DefaultCheckTTL = 24 * time.Hour
DefaultHTTPTimeout = 2 * time.Second
DefaultDownloadTimeout = 5 * time.Minute
SkipCheckEnv = "BROWSEROS_SKIP_UPDATE_CHECK"
InstallMethodEnv = "BROWSEROS_INSTALL_METHOD"
)
type Options struct {
CurrentVersion string
ManifestURL string
CheckTTL time.Duration
HTTPTimeout time.Duration
DownloadTimeout time.Duration
JSONOutput bool
Debug bool
Automatic bool
HTTPClient *http.Client
Now func() time.Time
}
type Manager struct {
options Options
state *State
}
type CheckResult struct {
CurrentVersion string `json:"current_version"`
LatestVersion string `json:"latest_version"`
LatestPublishedAt string `json:"latest_published_at,omitempty"`
UpdateAvailable bool `json:"update_available"`
CheckedAt time.Time `json:"checked_at"`
Asset *Asset `json:"asset,omitempty"`
}
func NewManager(options Options) *Manager {
if options.ManifestURL == "" {
options.ManifestURL = DefaultManifestURL
}
if options.CheckTTL == 0 {
options.CheckTTL = DefaultCheckTTL
}
if options.HTTPTimeout == 0 {
options.HTTPTimeout = DefaultHTTPTimeout
}
if options.DownloadTimeout == 0 {
options.DownloadTimeout = DefaultDownloadTimeout
}
if options.Now == nil {
options.Now = time.Now
}
if options.HTTPClient == nil {
options.HTTPClient = &http.Client{}
}
state, err := LoadState()
if err != nil {
state = &State{}
}
return &Manager{
options: options,
state: state,
}
}
func (m *Manager) CachedNotice() string {
if !m.AutomaticEnabled() || m.state == nil || m.state.LatestVersion == "" {
return ""
}
comparison, err := CompareVersions(m.options.CurrentVersion, m.state.LatestVersion)
if err != nil || comparison >= 0 {
return ""
}
return FormatNotice(m.options.CurrentVersion, m.state.LatestVersion)
}
func (m *Manager) AutomaticEnabled() bool {
if !m.options.Automatic || m.options.JSONOutput {
return false
}
if os.Getenv(SkipCheckEnv) != "" {
return false
}
if installedViaPackageManager() {
return false
}
return IsReleaseVersion(m.options.CurrentVersion)
}
func installedViaPackageManager() bool {
method := os.Getenv(InstallMethodEnv)
switch method {
case "npm", "brew", "homebrew":
return true
}
return false
}
func (m *Manager) ShouldCheck() bool {
if !m.AutomaticEnabled() {
return false
}
return m.state.IsStale(m.options.Now(), m.options.CheckTTL)
}
func (m *Manager) StartBackgroundCheck(ctx context.Context) <-chan struct{} {
done := make(chan struct{})
if !m.ShouldCheck() {
close(done)
return done
}
go func() {
defer close(done)
_, _ = m.CheckNow(ctx)
}()
return done
}
func (m *Manager) CheckNow(ctx context.Context) (*CheckResult, error) {
if !IsReleaseVersion(m.options.CurrentVersion) {
return nil, fmt.Errorf("self-update is unavailable for non-release build %q", m.options.CurrentVersion)
}
checkCtx, cancel := context.WithTimeout(ctx, m.options.HTTPTimeout)
defer cancel()
manifest, err := FetchManifest(checkCtx, cloneHTTPClient(m.options.HTTPClient), m.options.ManifestURL)
if err != nil {
m.recordError(err)
return nil, err
}
asset, err := SelectAsset(manifest, runtime.GOOS, runtime.GOARCH)
if err != nil {
m.recordError(err)
return nil, err
}
comparison, err := CompareVersions(m.options.CurrentVersion, manifest.Version)
if err != nil {
m.recordError(err)
return nil, err
}
result := &CheckResult{
CurrentVersion: m.options.CurrentVersion,
LatestVersion: manifest.Version,
LatestPublishedAt: manifest.PublishedAt,
UpdateAvailable: comparison < 0,
CheckedAt: m.options.Now(),
}
if result.UpdateAvailable {
assetCopy := asset
result.Asset = &assetCopy
}
m.state = &State{
LastCheckedAt: result.CheckedAt,
LatestVersion: manifest.Version,
LatestPublishedAt: manifest.PublishedAt,
AssetURL: asset.URL,
}
_ = SaveState(m.state)
return result, nil
}
func (m *Manager) Apply(ctx context.Context, result *CheckResult) error {
if result == nil || !result.UpdateAvailable || result.Asset == nil {
return fmt.Errorf("browseros-cli is already up to date")
}
downloadCtx, cancel := context.WithTimeout(ctx, m.options.DownloadTimeout)
defer cancel()
archive, err := DownloadAsset(downloadCtx, cloneHTTPClient(m.options.HTTPClient), *result.Asset)
if err != nil {
return err
}
if err := VerifyChecksum(archive, result.Asset.SHA256); err != nil {
return err
}
binary, err := ExtractBinary(archive, result.Asset.ArchiveFormat)
if err != nil {
return err
}
targetPath, err := os.Executable()
if err != nil {
return err
}
if err := CheckPermissions(targetPath); err != nil {
return fmt.Errorf(
"cannot replace %s: %w\n\nReinstall with the installer script or move the binary to a writable location.",
targetPath,
err,
)
}
if err := ApplyBinary(binary, targetPath); err != nil {
return err
}
m.saveAppliedState(result)
return nil
}
func FormatNotice(currentVersion, latestVersion string) string {
notice := fmt.Sprintf(
"Update available: browseros-cli v%s (current v%s)",
latestVersion,
currentVersion,
)
switch os.Getenv(InstallMethodEnv) {
case "npm":
notice += "\nRun `npm update -g browseros-cli` to upgrade."
case "brew", "homebrew":
notice += "\nRun `brew upgrade browseros-cli` to upgrade."
default:
notice += "\nRun `browseros-cli update` to upgrade."
}
return notice
}
func (m *Manager) recordError(err error) {
state := &State{}
if m.state != nil {
*state = *m.state
}
state.CheckError = err.Error()
m.state = state
_ = SaveState(state)
}
func (m *Manager) saveAppliedState(result *CheckResult) {
state := &State{
LastCheckedAt: m.options.Now(),
LatestVersion: result.LatestVersion,
LatestPublishedAt: result.LatestPublishedAt,
AssetURL: result.Asset.URL,
}
m.state = state
_ = SaveState(state)
}
func cloneHTTPClient(client *http.Client) *http.Client {
if client == nil {
return &http.Client{}
}
cloned := *client
cloned.Timeout = 0
return &cloned
}

View File

@@ -0,0 +1,188 @@
package update
import (
"context"
"net/http"
"net/http/httptest"
"runtime"
"testing"
"time"
)
func TestManagerCachedNotice(t *testing.T) {
manager := NewManager(Options{
CurrentVersion: "1.0.0",
Automatic: true,
})
manager.state = &State{LatestVersion: "1.2.0"}
notice := manager.CachedNotice()
if notice == "" {
t.Fatal("CachedNotice() returned empty notice")
}
}
func TestManagerShouldCheck(t *testing.T) {
manager := NewManager(Options{
CurrentVersion: "1.0.0",
Automatic: true,
CheckTTL: time.Minute,
Now: func() time.Time {
return time.Unix(1000, 0).UTC()
},
})
manager.state = &State{LastCheckedAt: time.Unix(0, 0).UTC()}
if !manager.ShouldCheck() {
t.Fatal("ShouldCheck() = false, want true")
}
}
func TestManagerCheckNow(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{
"version":"9.9.9",
"published_at":"2026-03-27T19:00:00Z",
"tag":"browseros-cli-v9.9.9",
"assets":{
"` + runtimePlatformKey(t) + `":{
"filename":"browseros-cli_9.9.9_test.tar.gz",
"url":"https://cdn.example.com/cli/v9.9.9/browseros-cli_9.9.9_test.tar.gz",
"archive_format":"tar.gz",
"sha256":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
}
}
}`))
}))
defer server.Close()
manager := NewManager(Options{
CurrentVersion: "1.0.0",
ManifestURL: server.URL,
Automatic: false,
HTTPClient: server.Client(),
Now: func() time.Time {
return time.Unix(100, 0).UTC()
},
})
result, err := manager.CheckNow(context.Background())
if err != nil {
t.Fatalf("CheckNow() error = %v", err)
}
if !result.UpdateAvailable {
t.Fatal("CheckNow() UpdateAvailable = false, want true")
}
if result.LatestPublishedAt != "2026-03-27T19:00:00Z" {
t.Fatalf(
"CheckNow() LatestPublishedAt = %q, want %q",
result.LatestPublishedAt,
"2026-03-27T19:00:00Z",
)
}
if manager.state.LatestPublishedAt != "2026-03-27T19:00:00Z" {
t.Fatalf(
"state LatestPublishedAt = %q, want %q",
manager.state.LatestPublishedAt,
"2026-03-27T19:00:00Z",
)
}
}
func TestCloneHTTPClientClearsTimeout(t *testing.T) {
base := &http.Client{Timeout: time.Second}
cloned := cloneHTTPClient(base)
if cloned == base {
t.Fatal("cloneHTTPClient() returned the original client")
}
if cloned.Timeout != 0 {
t.Fatalf("cloneHTTPClient() Timeout = %s, want 0", cloned.Timeout)
}
if base.Timeout != time.Second {
t.Fatalf("base Timeout = %s, want %s", base.Timeout, time.Second)
}
}
func TestManagerSaveAppliedState(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
now := time.Unix(200, 0).UTC()
manager := NewManager(Options{
CurrentVersion: "1.0.0",
Now: func() time.Time {
return now
},
})
manager.state = &State{
LastCheckedAt: time.Unix(100, 0).UTC(),
CheckError: "manifest fetch failed",
}
manager.saveAppliedState(&CheckResult{
LatestVersion: "9.9.9",
LatestPublishedAt: "2026-03-27T19:00:00Z",
Asset: &Asset{
URL: "https://cdn.example.com/cli/v9.9.9/browseros-cli_9.9.9_test.tar.gz",
},
})
if manager.state.LastCheckedAt != now {
t.Fatalf("LastCheckedAt = %v, want %v", manager.state.LastCheckedAt, now)
}
if manager.state.CheckError != "" {
t.Fatalf("CheckError = %q, want empty", manager.state.CheckError)
}
if manager.state.LatestPublishedAt != "2026-03-27T19:00:00Z" {
t.Fatalf("LatestPublishedAt = %q", manager.state.LatestPublishedAt)
}
}
func TestAutomaticEnabledSkipsForPackageManagerInstall(t *testing.T) {
t.Setenv("BROWSEROS_INSTALL_METHOD", "npm")
manager := NewManager(Options{
CurrentVersion: "1.0.0",
Automatic: true,
})
if manager.AutomaticEnabled() {
t.Fatal("AutomaticEnabled() = true, want false when BROWSEROS_INSTALL_METHOD=npm")
}
}
func TestAutomaticEnabledAllowsNormalInstall(t *testing.T) {
t.Setenv("BROWSEROS_INSTALL_METHOD", "")
manager := NewManager(Options{
CurrentVersion: "1.0.0",
Automatic: true,
})
if !manager.AutomaticEnabled() {
t.Fatal("AutomaticEnabled() = false, want true when BROWSEROS_INSTALL_METHOD is empty")
}
}
func runtimePlatformKey(t *testing.T) string {
t.Helper()
key, err := PlatformKey(runtimeGOOS(), runtimeGOARCH())
if err != nil {
t.Fatalf("PlatformKey() error = %v", err)
}
return key
}
func runtimeGOOS() string {
return runtime.GOOS
}
func runtimeGOARCH() string {
return runtime.GOARCH
}

View File

@@ -0,0 +1,144 @@
package update
import (
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"golang.org/x/mod/semver"
)
const maxManifestSize = 1 << 20
type Manifest struct {
Version string `json:"version"`
PublishedAt string `json:"published_at"`
Tag string `json:"tag"`
Assets map[string]Asset `json:"assets"`
}
type Asset struct {
Filename string `json:"filename"`
URL string `json:"url"`
ArchiveFormat string `json:"archive_format"`
SHA256 string `json:"sha256"`
}
func FetchManifest(
ctx context.Context,
client *http.Client,
url string,
) (*Manifest, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, err
}
resp, err := client.Do(req)
if err != nil {
return nil, err
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("update manifest returned HTTP %d", resp.StatusCode)
}
var manifest Manifest
if err := json.NewDecoder(io.LimitReader(resp.Body, maxManifestSize)).Decode(&manifest); err != nil {
return nil, err
}
if err := manifest.Validate(); err != nil {
return nil, err
}
return &manifest, nil
}
func (m *Manifest) Validate() error {
if m == nil {
return fmt.Errorf("update manifest is nil")
}
if !IsReleaseVersion(m.Version) {
return fmt.Errorf("invalid manifest version %q", m.Version)
}
if len(m.Assets) == 0 {
return fmt.Errorf("update manifest has no assets")
}
for key, asset := range m.Assets {
if asset.URL == "" {
return fmt.Errorf("asset %q is missing url", key)
}
if asset.SHA256 == "" {
return fmt.Errorf("asset %q is missing sha256", key)
}
if asset.ArchiveFormat != "tar.gz" && asset.ArchiveFormat != "zip" {
return fmt.Errorf("asset %q has unsupported archive format %q", key, asset.ArchiveFormat)
}
}
return nil
}
func NormalizeVersion(version string) string {
value := strings.TrimSpace(version)
if value == "" {
return ""
}
if !strings.HasPrefix(value, "v") {
value = "v" + value
}
return semver.Canonical(value)
}
func IsReleaseVersion(version string) bool {
return NormalizeVersion(version) != ""
}
func CompareVersions(current, latest string) (int, error) {
normalizedCurrent := NormalizeVersion(current)
if normalizedCurrent == "" {
return 0, fmt.Errorf("invalid current version %q", current)
}
normalizedLatest := NormalizeVersion(latest)
if normalizedLatest == "" {
return 0, fmt.Errorf("invalid latest version %q", latest)
}
return semver.Compare(normalizedCurrent, normalizedLatest), nil
}
func PlatformKey(goos, goarch string) (string, error) {
switch goos {
case "darwin", "linux", "windows":
default:
return "", fmt.Errorf("unsupported os %q", goos)
}
switch goarch {
case "amd64", "arm64":
default:
return "", fmt.Errorf("unsupported arch %q", goarch)
}
return goos + "/" + goarch, nil
}
func SelectAsset(manifest *Manifest, goos, goarch string) (Asset, error) {
key, err := PlatformKey(goos, goarch)
if err != nil {
return Asset{}, err
}
asset, ok := manifest.Assets[key]
if !ok {
return Asset{}, fmt.Errorf("no update asset for %s", key)
}
return asset, nil
}

View File

@@ -0,0 +1,102 @@
package update
import (
"context"
"net/http"
"net/http/httptest"
"strings"
"testing"
)
func TestNormalizeVersion(t *testing.T) {
if got := NormalizeVersion("1.2.3"); got != "v1.2.3" {
t.Fatalf("NormalizeVersion() = %q, want %q", got, "v1.2.3")
}
if got := NormalizeVersion("dev"); got != "" {
t.Fatalf("NormalizeVersion(dev) = %q, want empty", got)
}
}
func TestCompareVersions(t *testing.T) {
got, err := CompareVersions("1.2.3", "1.3.0")
if err != nil {
t.Fatalf("CompareVersions() error = %v", err)
}
if got >= 0 {
t.Fatalf("CompareVersions() = %d, want < 0", got)
}
}
func TestSelectAsset(t *testing.T) {
manifest := &Manifest{
Version: "1.2.3",
Assets: map[string]Asset{
"darwin/arm64": {
URL: "https://cdn.example.com/cli/v1.2.3/browseros-cli.tar.gz",
ArchiveFormat: "tar.gz",
SHA256: "abc",
},
},
}
asset, err := SelectAsset(manifest, "darwin", "arm64")
if err != nil {
t.Fatalf("SelectAsset() error = %v", err)
}
if asset.URL == "" {
t.Fatal("SelectAsset() returned empty URL")
}
}
func TestFetchManifest(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{
"version":"1.2.3",
"published_at":"2026-03-27T19:00:00Z",
"tag":"browseros-cli-v1.2.3",
"assets":{
"darwin/arm64":{
"filename":"browseros-cli_1.2.3_darwin_arm64.tar.gz",
"url":"https://cdn.example.com/cli/v1.2.3/browseros-cli_1.2.3_darwin_arm64.tar.gz",
"archive_format":"tar.gz",
"sha256":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
}
}
}`))
}))
defer server.Close()
manifest, err := FetchManifest(context.Background(), server.Client(), server.URL)
if err != nil {
t.Fatalf("FetchManifest() error = %v", err)
}
if manifest.Version != "1.2.3" {
t.Fatalf("FetchManifest() version = %q, want %q", manifest.Version, "1.2.3")
}
}
func TestFetchManifestRejectsOversizedResponse(t *testing.T) {
hugeName := strings.Repeat("a", maxManifestSize)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{
"version":"1.2.3",
"published_at":"2026-03-27T19:00:00Z",
"tag":"browseros-cli-v1.2.3",
"assets":{
"darwin/arm64":{
"filename":"` + hugeName + `",
"url":"https://cdn.example.com/cli/v1.2.3/browseros-cli_1.2.3_darwin_arm64.tar.gz",
"archive_format":"tar.gz",
"sha256":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
}
}
}`))
}))
defer server.Close()
if _, err := FetchManifest(context.Background(), server.Client(), server.URL); err == nil {
t.Fatal("FetchManifest() error = nil, want oversized response error")
}
}

View File

@@ -0,0 +1,80 @@
package update
import (
"encoding/json"
"os"
"path/filepath"
"time"
"browseros-cli/config"
)
type State struct {
LastCheckedAt time.Time `json:"last_checked_at"`
LatestVersion string `json:"latest_version,omitempty"`
LatestPublishedAt string `json:"latest_published_at,omitempty"`
AssetURL string `json:"asset_url,omitempty"`
CheckError string `json:"check_error,omitempty"`
}
func StatePath() string {
return filepath.Join(config.Dir(), "update-state.json")
}
func LoadState() (*State, error) {
data, err := os.ReadFile(StatePath())
if err != nil {
if os.IsNotExist(err) {
return &State{}, nil
}
return nil, err
}
var state State
if err := json.Unmarshal(data, &state); err != nil {
return nil, err
}
return &state, nil
}
func SaveState(state *State) error {
if state == nil {
state = &State{}
}
dir := config.Dir()
if err := os.MkdirAll(dir, 0755); err != nil {
return err
}
tmpFile, err := os.CreateTemp(dir, "update-state-*.json")
if err != nil {
return err
}
encoder := json.NewEncoder(tmpFile)
encoder.SetIndent("", " ")
if err := encoder.Encode(state); err != nil {
tmpFile.Close()
os.Remove(tmpFile.Name())
return err
}
if err := tmpFile.Close(); err != nil {
os.Remove(tmpFile.Name())
return err
}
if err := os.Rename(tmpFile.Name(), StatePath()); err != nil {
os.Remove(tmpFile.Name())
return err
}
return nil
}
func (s *State) IsStale(now time.Time, ttl time.Duration) bool {
if s == nil || s.LastCheckedAt.IsZero() {
return true
}
return now.Sub(s.LastCheckedAt) >= ttl
}

View File

@@ -0,0 +1,54 @@
package update
import (
"path/filepath"
"testing"
"time"
)
func TestLoadStateMissing(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
state, err := LoadState()
if err != nil {
t.Fatalf("LoadState() error = %v", err)
}
if state == nil {
t.Fatal("LoadState() returned nil state")
}
}
func TestSaveStateRoundTrip(t *testing.T) {
configRoot := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", configRoot)
want := &State{
LastCheckedAt: time.Unix(100, 0).UTC(),
LatestVersion: "1.2.3",
LatestPublishedAt: "2026-03-27T19:00:00Z",
AssetURL: "https://cdn.example.com/cli/v1.2.3/browseros-cli.tar.gz",
}
if err := SaveState(want); err != nil {
t.Fatalf("SaveState() error = %v", err)
}
got, err := LoadState()
if err != nil {
t.Fatalf("LoadState() error = %v", err)
}
if got.LatestVersion != want.LatestVersion {
t.Fatalf("LatestVersion = %q, want %q", got.LatestVersion, want.LatestVersion)
}
if StatePath() != filepath.Join(configRoot, "browseros-cli", "update-state.json") {
t.Fatalf("StatePath() = %q", StatePath())
}
}
func TestStateIsStale(t *testing.T) {
now := time.Unix(200, 0).UTC()
state := &State{LastCheckedAt: time.Unix(0, 0).UTC()}
if !state.IsStale(now, time.Minute) {
t.Fatal("IsStale() = false, want true")
}
}

View File

@@ -1,32 +0,0 @@
# Dependencies
node_modules/
# Build output
dist/
# Build unpublished docs
docs/
# TypeScript
*.tsbuildinfo
# IDE
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
# Logs
*.log
npm-debug.log*
# Environment
.env
.env.local
# Claude
.claude

View File

@@ -1,430 +0,0 @@
# BrowserOS Controller
WebSocket-based Chrome Extension that exposes browser automation APIs for remote control.
**⚠️ IMPORTANT:** This extension ONLY works in **BrowserOS Chrome**, not regular Chrome!
---
## 🚀 Quick Start
### 1. Build the Extension
```bash
npm install
npm run build
```
### 2. Load Extension in BrowserOS Chrome
1. Open BrowserOS Chrome
2. Go to `chrome://extensions/`
3. Enable **"Developer mode"** (top-right toggle)
4. Click **"Load unpacked"**
5. Select the `dist/` folder
6. Verify extension is loaded (you should see "BrowserOS Controller")
### 3. Test the Extension
```bash
npm test
```
This starts an interactive test client. You should see:
```
🚀 Starting BrowserOS Controller Test Client
──────────────────────────────────────────────────────────
WebSocket Server Started
Listening on: ws://localhost:9224/controller
Waiting for extension to connect...
✅ Extension connected!
Running Diagnostic Test
============================================================
📤 Sending: checkBrowserOS
Request ID: test-1729012345678
📨 Response: test-1729012345678
Status: ✅ SUCCESS
Data: {
"available": true,
"apis": [
"captureScreenshot",
"clear",
"click",
...
]
}
```
**If you see "available": true**, you're all set! 🎉
**If you see "available": false**, you're not using BrowserOS Chrome.
---
## ⚙️ Configuration
The extension can be configured using environment variables. This is optional - sensible defaults are provided.
### Environment Variables
Create a `.env` file in the project root to customize configuration:
```bash
# Copy the example file
cp .env.example .env
# Edit .env with your values
```
### Available Configuration Options
#### WebSocket Configuration
```bash
WEBSOCKET_PROTOCOL=ws # ws or wss (default: ws)
WEBSOCKET_HOST=localhost # Server host (default: localhost)
WEBSOCKET_PORT=9224 # Server port (default: 9224)
WEBSOCKET_PATH=/controller # Server path (default: /controller)
```
#### Connection Settings
```bash
WEBSOCKET_RECONNECT_DELAY=1000 # Initial reconnect delay in ms (default: 1000)
WEBSOCKET_MAX_RECONNECT_DELAY=30000 # Max reconnect delay in ms (default: 30000)
WEBSOCKET_RECONNECT_MULTIPLIER=1.5 # Exponential backoff multiplier (default: 1.5)
WEBSOCKET_MAX_RECONNECT_ATTEMPTS=0 # Max reconnect attempts, 0 = infinite (default: 0)
WEBSOCKET_HEARTBEAT_INTERVAL=30000 # Heartbeat interval in ms (default: 30000)
WEBSOCKET_HEARTBEAT_TIMEOUT=5000 # Heartbeat timeout in ms (default: 5000)
WEBSOCKET_CONNECTION_TIMEOUT=10000 # Connection timeout in ms (default: 10000)
WEBSOCKET_REQUEST_TIMEOUT=30000 # Request timeout in ms (default: 30000)
```
#### Concurrency Settings
```bash
CONCURRENCY_MAX_CONCURRENT=100 # Max concurrent requests (default: 100)
CONCURRENCY_MAX_QUEUE_SIZE=1000 # Max queued requests (default: 1000)
```
#### Logging Settings
```bash
LOGGING_ENABLED=true # Enable/disable logging (default: true)
LOGGING_LEVEL=info # Log level: debug, info, warn, error (default: info)
LOGGING_PREFIX=[BrowserOS Controller] # Log message prefix (default: [BrowserOS Controller])
```
### Example: Custom Port Configuration
If you want to use a different port (e.g., 8080):
```bash
# .env
WEBSOCKET_PORT=8080
```
Then rebuild the extension:
```bash
npm run build
```
The extension will now connect to `ws://localhost:8080/controller` instead of the default port 9224.
---
## 📖 Architecture
See [ARCHITECTURE.md](./ARCHITECTURE.md) for complete system documentation including:
- High-level architecture diagram
- Request flow (step-by-step)
- Component details
- All 14 registered actions
- WebSocket protocol specification
- Debugging guide
---
## 🧪 Testing
The test client (`npm test`) provides an interactive menu:
```
Available Commands:
Tab Actions:
1. getActiveTab - Get currently active tab
2. getTabs - Get all tabs
Browser Actions:
3. getInteractiveSnapshot - Get page elements (requires tabId)
4. click - Click element (requires tabId, nodeId)
5. inputText - Type text (requires tabId, nodeId, text)
6. captureScreenshot - Take screenshot (requires tabId)
Diagnostic:
d. checkBrowserOS - Check if chrome.browserOS is available
Other:
h. Show this menu
q. Quit
```
### Example Usage:
1. Type `1` → Get active tab
2. Type `d` → Run diagnostic
3. Type `q` → Quit
---
## 🔧 Development
### Build Commands
```bash
npm run build # Production build
npm run build:dev # Development build (with source maps)
npm run watch # Watch mode for development
```
### Debug Extension
1. Go to `chrome://extensions/`
2. Click **"Inspect views service worker"** under "BrowserOS Controller"
3. Service worker console shows all logs
**Check extension status:**
```javascript
__browserosController.getStats();
```
**Expected output:**
```javascript
{
connection: "connected",
requests: { inFlight: 0, avgDuration: 0, errorRate: 0, totalRequests: 0 },
concurrency: { inFlight: 0, queued: 0, utilization: 0 },
validator: { activeIds: 0 },
responseQueue: { size: 0 }
}
```
**Check registered actions:**
Look for this log on extension load:
```
Registered 14 action(s): checkBrowserOS, getActiveTab, getTabs, ...
```
---
## 📋 Available Actions
| Action | Input | Output | Description |
| ------------------------ | --------------------------------- | ------------------------------- | -------------------------------------- |
| `checkBrowserOS` | `{}` | `{available, apis}` | Check if chrome.browserOS is available |
| `getActiveTab` | `{}` | `{tabId, url, title, windowId}` | Get currently active tab |
| `getTabs` | `{}` | `{tabs[]}` | Get all open tabs |
| `getInteractiveSnapshot` | `{tabId, options?}` | `InteractiveSnapshot` | Get all interactive elements on page |
| `click` | `{tabId, nodeId}` | `{success}` | Click element by nodeId |
| `inputText` | `{tabId, nodeId, text}` | `{success}` | Type text into element |
| `clear` | `{tabId, nodeId}` | `{success}` | Clear text from element |
| `scrollToNode` | `{tabId, nodeId}` | `{scrolled}` | Scroll element into view |
| `captureScreenshot` | `{tabId, size?, showHighlights?}` | `{dataUrl}` | Take screenshot |
| `sendKeys` | `{tabId, keys}` | `{success}` | Send keyboard keys |
| `getPageLoadStatus` | `{tabId}` | `PageLoadStatus` | Get page load status |
| `getSnapshot` | `{tabId, type, options?}` | `Snapshot` | Get text/links snapshot |
| `clickCoordinates` | `{tabId, x, y}` | `{success}` | Click at coordinates |
| `typeAtCoordinates` | `{tabId, x, y, text}` | `{success}` | Type at coordinates |
---
## 🔌 WebSocket Protocol
**Endpoint:** `ws://localhost:9224/controller`
**Request Format:**
```json
{
"id": "unique-request-id",
"action": "click",
"payload": {
"tabId": 12345,
"nodeId": 42
}
}
```
**Response Format:**
```json
{
"id": "unique-request-id",
"ok": true,
"data": {
"success": true
}
}
```
**Error Response:**
```json
{
"id": "unique-request-id",
"ok": false,
"error": "Element not found: nodeId 42"
}
```
---
## ⚠️ Common Issues
### Issue 1: "chrome.browserOS is undefined"
**Symptoms:**
- Diagnostic shows `"available": false`
- All browser actions fail
**Cause:** Not using BrowserOS Chrome
**Solution:**
- Download and use BrowserOS Chrome (not regular Chrome)
- Verify at `chrome://version` - should show "BrowserOS" in the name
---
### Issue 2: "Port 9224 is already in use"
**Symptoms:**
```
❌ Fatal Error: Port 9224 is already in use!
```
**Solution:**
```bash
lsof -ti:9224 | xargs kill -9
npm test
```
---
### Issue 3: Extension Not Connecting
**Symptoms:**
- Test client shows "Waiting for extension to connect..." forever
- Service worker console shows "Connection timeout"
**Checklist:**
1. ✅ Test server running (`npm test`)
2. ✅ Extension loaded in BrowserOS Chrome
3. ✅ Extension enabled (chrome://extensions/)
4. ✅ Service worker active (not suspended)
**Solution:**
1. Reload extension: chrome://extensions/ → "Reload" button
2. Restart test server: Ctrl+C, then `npm test`
---
### Issue 4: "Unknown action"
**Symptoms:**
```
Error: Unknown action: "click". Available actions: getActiveTab, getTabs, ...
```
**Cause:** Action not registered (extension didn't reload properly)
**Solution:**
1. Toggle extension OFF and ON at chrome://extensions/
2. Check service worker console for: `Registered 14 action(s): ...`
---
## 📁 Project Structure
```
browseros-controller/
├── README.md # This file
├── ARCHITECTURE.md # Complete architecture documentation
├── .env.example # Environment variable template
├── manifest.json # Extension manifest
├── package.json # Node dependencies
├── webpack.config.js # Build configuration
├── src/ # Source code
│ ├── background/ # Service worker entry point
│ ├── actions/ # Action handlers
│ │ ├── bookmark/ # Bookmark management actions
│ │ ├── browser/ # Browser interaction actions
│ │ ├── diagnostics/ # Diagnostic actions
│ │ ├── history/ # History management actions
│ │ └── tab/ # Tab management actions
│ ├── adapters/ # Chrome API wrappers
│ ├── config/ # Configuration management
│ │ ├── constants.ts # Application constants
│ │ └── environment.ts # Environment variable handling
│ ├── websocket/ # WebSocket client
│ ├── utils/ # Utilities
│ ├── protocol/ # Protocol types
│ └── types/ # TypeScript definitions
├── tests/ # Test files
│ ├── test-simple.js # Interactive test client
│ └── test-auto.js # Automated test client
└── dist/ # Built extension (generated)
├── background.js
└── manifest.json
```
---
## 🔗 Related Projects
- **BrowserOS-agent**: AI agent that uses this controller for browser automation
- **BrowserOS Chrome**: Custom Chrome build with `chrome.browserOS` APIs
---
## 📄 License
MIT
---
## 🆘 Support
For issues or questions:
1. Check [ARCHITECTURE.md](./ARCHITECTURE.md) for detailed documentation
2. Review the "Common Issues" section above
3. Check service worker console for detailed error logs
4. Verify you're using BrowserOS Chrome (run diagnostic test)
---
**Happy automating! 🚀**

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 574 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.2 KiB

View File

@@ -1,38 +0,0 @@
{
"manifest_version": 3,
"name": "BrowserOS Controller",
"version": "1.0.0.8",
"description": "BrowserOS API bridge for BrowserOS Server",
"key": "MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAhlh9i/c2A3f0PL86hXhGPzguLIOQ+sPf3/Y8RD11gmdvoU6XqnUqv7GgBvm7SW7316uPnS58AYZY13jGtF4rFrscdda5H2CjZrtOyOycmKp2KzibJLwibXNm/JwKhZ3QEfgsW/orh1SMY2kNj62JemkWLcLyn3E1T+KTcTVyFOxiJS3hyQ+Y0/Jp1HOqGh5lYS58YYzwhId5rrJjfL7wFYtALgt2dEA2r7p4qpe+SW0QLA+ayjRAjS+yt+qitR0eWg+XgqcIk1f1KblN8/yDISssSD4LWiPofe5CmJPnqlHIuI0CpgvAFv9dvgR/w8OFkXxK5h06i6saum1xExj+IwIDAQAB",
"permissions": [
"tabs",
"activeTab",
"bookmarks",
"history",
"scripting",
"storage",
"tabGroups",
"webNavigation",
"downloads",
"browserOS",
"alarms"
],
"update_url": "https://cdn.browseros.com/extensions/update-manifest.xml",
"host_permissions": ["<all_urls>"],
"background": {
"service_worker": "background.js",
"type": "module"
},
"action": {
"default_icon": {
"16": "assets/icon16.png",
"48": "assets/icon48.png",
"128": "assets/icon128.png"
}
},
"icons": {
"16": "assets/icon16.png",
"48": "assets/icon48.png",
"128": "assets/icon128.png"
}
}

View File

@@ -1,39 +0,0 @@
{
"name": "browseros-controller",
"version": "1.0.0",
"description": "Chrome Extension API bridge for BrowserOS Server",
"directories": {
"doc": "docs"
},
"scripts": {
"build": "webpack --mode production",
"build:dev": "webpack --mode development",
"watch": "webpack --mode development --watch",
"test": "node tests/test-simple.js",
"test:auto": "node tests/test-auto.js",
"typecheck": "tsc --noEmit"
},
"keywords": [
"browser-automation",
"chrome-extension",
"browseros"
],
"author": "BrowserOS Team",
"license": "MIT",
"type": "commonjs",
"dependencies": {
"@browseros/shared": "workspace:*",
"zod": "^4.1.12"
},
"devDependencies": {
"@types/chrome": "^0.1.24",
"@types/node": "^24.7.1",
"copy-webpack-plugin": "^12.0.2",
"terser-webpack-plugin": "^5.3.11",
"ts-loader": "^9.5.4",
"typescript": "^5.9.3",
"webpack": "^5.102.1",
"webpack-cli": "^6.0.1",
"ws": "^8.18.3"
}
}

View File

@@ -1,106 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import type { ActionResponse } from '@/protocol/types'
import { ActionResponseSchema } from '@/protocol/types'
import { logger } from '@/utils/logger'
// Re-export for convenience
export type { ActionResponse }
export { ActionResponseSchema }
/**
* ActionHandler - Abstract base class for all actions
*
* Responsibilities:
* - Define contract for all actions (must implement inputSchema + execute)
* - Validate input using Zod schemas
* - Handle validation and execution errors
* - Return standardized ActionResponse
*
* Usage:
* class MyAction extends ActionHandler<InputType, OutputType> {
* inputSchema = z.object({ ... });
* async execute(input: InputType): Promise<OutputType> { ... }
* }
*/
export abstract class ActionHandler<TInput = unknown, TOutput = unknown> {
/**
* Zod schema for input validation
* Must be implemented by concrete actions
*/
abstract readonly inputSchema: z.ZodSchema<TInput>
/**
* Execute the action logic
* Must be implemented by concrete actions
*
* @param input - Validated input (guaranteed to match inputSchema)
* @returns Action result
*/
abstract execute(input: TInput): Promise<TOutput>
/**
* Handle request with validation and error handling
* Called by ActionRegistry
*
* Flow:
* 1. Validate input with Zod schema
* 2. Execute action logic
* 3. Return standardized response (ok/error)
*
* @param payload - Raw payload from request (unvalidated)
* @returns Standardized action response
*/
async handle(payload: unknown): Promise<ActionResponse> {
const actionName = this.constructor.name
try {
// Step 1: Validate input
logger.debug(`[${actionName}] Validating input`)
const validatedInput = this.inputSchema.parse(payload)
// Step 2: Execute action
logger.debug(`[${actionName}] Executing action`)
const result = await this.execute(validatedInput)
// Step 3: Return success response
logger.debug(`[${actionName}] Action completed successfully`)
return { ok: true, data: result }
} catch (error) {
// Handle validation or execution errors
const errorMessage = this._formatError(error)
logger.error(`[${actionName}] Action failed: ${errorMessage}`)
return { ok: false, error: errorMessage }
}
}
/**
* Format error for user-friendly response
*
* @param error - Error from validation or execution
* @returns Formatted error message
*/
protected _formatError(error: unknown): string {
// Zod validation error
if (error instanceof z.ZodError) {
const errors = error.issues.map((e: z.ZodIssue) => {
const path = e.path.length > 0 ? `${e.path.join('.')}: ` : ''
return `${path}${e.message}`
})
return `Validation error: ${errors.join(', ')}`
}
// Standard Error
if (error instanceof Error) {
return error.message
}
// Unknown error
return String(error)
}
}

View File

@@ -1,148 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { logger } from '@/utils/logger'
import type { ActionHandler, ActionResponse } from './ActionHandler'
/**
* ActionRegistry - Central dispatcher for all actions
*
* Responsibilities:
* - Register action handlers by name
* - Dispatch requests to correct handler
* - Return error for unknown actions
* - Provide introspection (list available actions)
*
* Usage:
* const registry = new ActionRegistry();
* registry.register('getActiveTab', new GetActiveTabAction());
* const response = await registry.dispatch('getActiveTab', {});
*/
export class ActionRegistry {
private handlers = new Map<string, ActionHandler>()
/**
* Register an action handler
*
* @param actionName - Unique action name (e.g., "getActiveTab")
* @param handler - Action handler instance
*/
register(actionName: string, handler: ActionHandler): void {
if (this.handlers.has(actionName)) {
logger.warn(
`[ActionRegistry] Action "${actionName}" already registered, overwriting`,
)
}
this.handlers.set(actionName, handler)
logger.info(`[ActionRegistry] Registered action: ${actionName}`)
}
/**
* Dispatch request to appropriate action handler
*
* Flow:
* 1. Find handler for action name
* 2. If not found, return error
* 3. If found, delegate to handler.handle()
* 4. Handler validates input and executes
* 5. Return result
*
* @param actionName - Action to execute
* @param payload - Action payload (unvalidated)
* @returns Action response
*/
async dispatch(
actionName: string,
payload: unknown,
): Promise<ActionResponse> {
logger.debug(`[ActionRegistry] Dispatching action: ${actionName}`)
// Check if action exists
const handler = this.handlers.get(actionName)
if (!handler) {
const availableActions = Array.from(this.handlers.keys()).join(', ')
const errorMessage = `Unknown action: "${actionName}". Available actions: ${availableActions || 'none'}`
logger.error(`[ActionRegistry] ${errorMessage}`)
return {
ok: false,
error: errorMessage,
}
}
// Delegate to handler
try {
const response = await handler.handle(payload)
logger.debug(
`[ActionRegistry] Action "${actionName}" ${response.ok ? 'succeeded' : 'failed'}`,
)
return response
} catch (error) {
// Catch any unexpected errors from handler
const errorMessage =
error instanceof Error ? error.message : String(error)
logger.error(
`[ActionRegistry] Unexpected error in "${actionName}": ${errorMessage}`,
)
return {
ok: false,
error: `Action execution failed: ${errorMessage}`,
}
}
}
/**
* Get list of registered action names
*
* @returns Array of action names
*/
getAvailableActions(): string[] {
return Array.from(this.handlers.keys())
}
/**
* Check if action is registered
*
* @param actionName - Action name to check
* @returns True if action exists
*/
hasAction(actionName: string): boolean {
return this.handlers.has(actionName)
}
/**
* Get number of registered actions
*
* @returns Count of registered actions
*/
getActionCount(): number {
return this.handlers.size
}
/**
* Unregister an action (useful for testing)
*
* @param actionName - Action to remove
* @returns True if action was removed
*/
unregister(actionName: string): boolean {
const removed = this.handlers.delete(actionName)
if (removed) {
logger.info(`[ActionRegistry] Unregistered action: ${actionName}`)
}
return removed
}
/**
* Clear all registered actions (useful for testing)
*/
clear(): void {
const count = this.handlers.size
this.handlers.clear()
logger.info(`[ActionRegistry] Cleared ${count} registered actions`)
}
}

View File

@@ -1,81 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const CreateBookmarkInputSchema = z.object({
title: z.string().describe('Bookmark title'),
url: z.string().url().describe('Bookmark URL'),
parentId: z
.string()
.optional()
.describe('Parent folder ID (optional, defaults to "Other Bookmarks")'),
})
// Output schema
const CreateBookmarkOutputSchema = z.object({
id: z.string().describe('Created bookmark ID'),
title: z.string().describe('Bookmark title'),
url: z.string().describe('Bookmark URL'),
dateAdded: z
.number()
.optional()
.describe('Timestamp when bookmark was created'),
})
type CreateBookmarkInput = z.infer<typeof CreateBookmarkInputSchema>
type CreateBookmarkOutput = z.infer<typeof CreateBookmarkOutputSchema>
/**
* CreateBookmarkAction - Create a new bookmark
*
* Creates a bookmark with the specified title and URL.
*
* Input:
* - title: Display title for the bookmark
* - url: Full URL to bookmark
* - parentId (optional): Parent folder ID
*
* Output:
* - id: Created bookmark ID
* - title: Bookmark title
* - url: Bookmark URL
* - dateAdded: Creation timestamp
*
* Usage:
* Create a bookmark in the default location (Other Bookmarks).
*
* Example:
* {
* "title": "Google",
* "url": "https://www.google.com"
* }
* // Returns: { id: "123", title: "Google", url: "https://www.google.com", dateAdded: 1729012345678 }
*/
export class CreateBookmarkAction extends ActionHandler<
CreateBookmarkInput,
CreateBookmarkOutput
> {
readonly inputSchema = CreateBookmarkInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(input: CreateBookmarkInput): Promise<CreateBookmarkOutput> {
const created = await this.bookmarkAdapter.createBookmark({
title: input.title,
url: input.url,
parentId: input.parentId,
})
return {
id: created.id,
title: created.title,
url: created.url || '',
dateAdded: created.dateAdded,
}
}
}

View File

@@ -1,52 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
const CreateBookmarkFolderInputSchema = z.object({
title: z.string().describe('Folder name'),
parentId: z
.string()
.optional()
.describe('Parent folder ID (defaults to "1" = Bookmarks Bar)'),
})
const CreateBookmarkFolderOutputSchema = z.object({
id: z.string().describe('Created folder ID'),
title: z.string().describe('Folder name'),
parentId: z.string().optional().describe('Parent folder ID'),
dateAdded: z.number().optional().describe('Creation timestamp'),
})
type CreateBookmarkFolderInput = z.infer<typeof CreateBookmarkFolderInputSchema>
type CreateBookmarkFolderOutput = z.infer<
typeof CreateBookmarkFolderOutputSchema
>
export class CreateBookmarkFolderAction extends ActionHandler<
CreateBookmarkFolderInput,
CreateBookmarkFolderOutput
> {
readonly inputSchema = CreateBookmarkFolderInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(
input: CreateBookmarkFolderInput,
): Promise<CreateBookmarkFolderOutput> {
const created = await this.bookmarkAdapter.createBookmarkFolder({
title: input.title,
parentId: input.parentId,
})
return {
id: created.id,
title: created.title,
parentId: created.parentId,
dateAdded: created.dateAdded,
}
}
}

View File

@@ -1,59 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
const GetBookmarkChildrenInputSchema = z.object({
folderId: z.string().describe('Folder ID to get children from'),
})
const GetBookmarkChildrenOutputSchema = z.object({
children: z.array(
z.object({
id: z.string(),
title: z.string(),
url: z.string().optional(),
parentId: z.string().optional(),
dateAdded: z.number().optional(),
isFolder: z.boolean(),
}),
),
count: z.number(),
})
type GetBookmarkChildrenInput = z.infer<typeof GetBookmarkChildrenInputSchema>
type GetBookmarkChildrenOutput = z.infer<typeof GetBookmarkChildrenOutputSchema>
export class GetBookmarkChildrenAction extends ActionHandler<
GetBookmarkChildrenInput,
GetBookmarkChildrenOutput
> {
readonly inputSchema = GetBookmarkChildrenInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(
input: GetBookmarkChildrenInput,
): Promise<GetBookmarkChildrenOutput> {
const results = await this.bookmarkAdapter.getBookmarkChildren(
input.folderId,
)
const children = results.map((node) => ({
id: node.id,
title: node.title,
url: node.url,
parentId: node.parentId,
dateAdded: node.dateAdded,
isFolder: !node.url,
}))
return {
children,
count: children.length,
}
}
}

View File

@@ -1,111 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const GetBookmarksInputSchema = z.object({
query: z
.string()
.optional()
.describe(
'Search query to filter bookmarks (optional, returns all if not provided)',
),
limit: z
.number()
.int()
.positive()
.optional()
.default(20)
.describe('Maximum number of results (default: 20)'),
recent: z
.boolean()
.optional()
.default(false)
.describe('Get recent bookmarks instead of searching'),
})
// Output schema
const GetBookmarksOutputSchema = z.object({
bookmarks: z.array(
z.object({
id: z.string(),
title: z.string(),
url: z.string().optional(),
dateAdded: z.number().optional(),
parentId: z.string().optional(),
}),
),
count: z.number(),
})
type GetBookmarksInput = z.infer<typeof GetBookmarksInputSchema>
type GetBookmarksOutput = z.infer<typeof GetBookmarksOutputSchema>
/**
* GetBookmarksAction - Get or search bookmarks
*
* Retrieves bookmarks with optional filtering.
*
* Input:
* - query (optional): Search query to match title or URL
* - limit (optional): Maximum results (default: 20)
* - recent (optional): Get recent bookmarks instead (default: false)
*
* Output:
* - bookmarks: Array of bookmark objects
* - count: Number of bookmarks returned
*
* Usage:
* - Get recent: { "recent": true }
* - Search: { "query": "github" }
* - Get all (limited): { "limit": 50 }
*
* Example:
* {
* "query": "google",
* "limit": 10
* }
* // Returns: { bookmarks: [{id: "1", title: "Google", url: "https://google.com"}], count: 1 }
*/
export class GetBookmarksAction extends ActionHandler<
GetBookmarksInput,
GetBookmarksOutput
> {
readonly inputSchema = GetBookmarksInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(input: GetBookmarksInput): Promise<GetBookmarksOutput> {
let results: chrome.bookmarks.BookmarkTreeNode[]
if (input.recent) {
// Get recent bookmarks
results = await this.bookmarkAdapter.getRecentBookmarks(input.limit)
} else if (input.query) {
// Search bookmarks
results = await this.bookmarkAdapter.searchBookmarks(input.query)
results = results.slice(0, input.limit)
} else {
// Get recent by default
results = await this.bookmarkAdapter.getRecentBookmarks(input.limit)
}
// Map to output format
const bookmarks = results.map((b) => ({
id: b.id,
title: b.title,
url: b.url,
dateAdded: b.dateAdded,
parentId: b.parentId,
}))
return {
bookmarks,
count: bookmarks.length,
}
}
}

View File

@@ -1,49 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
const MoveBookmarkInputSchema = z.object({
id: z.string().describe('Bookmark or folder ID to move'),
parentId: z.string().optional().describe('New parent folder ID'),
index: z.number().int().min(0).optional().describe('Position within parent'),
})
const MoveBookmarkOutputSchema = z.object({
id: z.string().describe('Moved bookmark ID'),
title: z.string().describe('Bookmark title'),
url: z.string().optional().describe('Bookmark URL (undefined if folder)'),
parentId: z.string().optional().describe('New parent folder ID'),
index: z.number().optional().describe('New position within parent'),
})
type MoveBookmarkInput = z.infer<typeof MoveBookmarkInputSchema>
type MoveBookmarkOutput = z.infer<typeof MoveBookmarkOutputSchema>
export class MoveBookmarkAction extends ActionHandler<
MoveBookmarkInput,
MoveBookmarkOutput
> {
readonly inputSchema = MoveBookmarkInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(input: MoveBookmarkInput): Promise<MoveBookmarkOutput> {
const destination: { parentId?: string; index?: number } = {}
if (input.parentId !== undefined) destination.parentId = input.parentId
if (input.index !== undefined) destination.index = input.index
const moved = await this.bookmarkAdapter.moveBookmark(input.id, destination)
return {
id: moved.id,
title: moved.title,
url: moved.url,
parentId: moved.parentId,
index: moved.index,
}
}
}

View File

@@ -1,62 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const RemoveBookmarkInputSchema = z.object({
id: z.string().describe('Bookmark ID to remove'),
})
// Output schema
const RemoveBookmarkOutputSchema = z.object({
success: z
.boolean()
.describe('Whether the bookmark was successfully removed'),
message: z.string().describe('Confirmation message'),
})
type RemoveBookmarkInput = z.infer<typeof RemoveBookmarkInputSchema>
type RemoveBookmarkOutput = z.infer<typeof RemoveBookmarkOutputSchema>
/**
* RemoveBookmarkAction - Remove a bookmark
*
* Deletes a bookmark by its ID.
*
* Input:
* - id: Bookmark ID to remove
*
* Output:
* - success: true if removed
* - message: Confirmation message
*
* Usage:
* Get the bookmark ID from getBookmarks first, then remove it.
*
* Example:
* {
* "id": "123"
* }
* // Returns: { success: true, message: "Removed bookmark 123" }
*/
export class RemoveBookmarkAction extends ActionHandler<
RemoveBookmarkInput,
RemoveBookmarkOutput
> {
readonly inputSchema = RemoveBookmarkInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(input: RemoveBookmarkInput): Promise<RemoveBookmarkOutput> {
await this.bookmarkAdapter.removeBookmark(input.id)
return {
success: true,
message: `Removed bookmark ${input.id}`,
}
}
}

View File

@@ -1,48 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
const RemoveBookmarkTreeInputSchema = z.object({
id: z.string().describe('Folder ID to remove'),
confirm: z.boolean().describe('Must be true to confirm recursive deletion'),
})
const RemoveBookmarkTreeOutputSchema = z.object({
success: z.boolean().describe('Whether the folder was removed'),
message: z.string().describe('Result message'),
})
type RemoveBookmarkTreeInput = z.infer<typeof RemoveBookmarkTreeInputSchema>
type RemoveBookmarkTreeOutput = z.infer<typeof RemoveBookmarkTreeOutputSchema>
export class RemoveBookmarkTreeAction extends ActionHandler<
RemoveBookmarkTreeInput,
RemoveBookmarkTreeOutput
> {
readonly inputSchema = RemoveBookmarkTreeInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(
input: RemoveBookmarkTreeInput,
): Promise<RemoveBookmarkTreeOutput> {
if (input.confirm !== true) {
return {
success: false,
message:
'Recursive deletion requires confirm: true. This will permanently delete the folder and all its contents.',
}
}
await this.bookmarkAdapter.removeBookmarkTree(input.id)
return {
success: true,
message: `Removed folder ${input.id} and all its contents`,
}
}
}

View File

@@ -1,82 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BookmarkAdapter } from '@/adapters/BookmarkAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const UpdateBookmarkInputSchema = z.object({
id: z.string().describe('Bookmark ID to update'),
title: z.string().optional().describe('New bookmark title'),
url: z.string().url().optional().describe('New bookmark URL'),
})
// Output schema
const UpdateBookmarkOutputSchema = z.object({
id: z.string().describe('Bookmark ID'),
title: z.string().describe('Updated bookmark title'),
url: z.string().optional().describe('Updated bookmark URL'),
})
type UpdateBookmarkInput = z.infer<typeof UpdateBookmarkInputSchema>
type UpdateBookmarkOutput = z.infer<typeof UpdateBookmarkOutputSchema>
/**
* UpdateBookmarkAction - Update a bookmark's title or URL
*
* Updates an existing bookmark with new title and/or URL.
*
* Input:
* - id: Bookmark ID to update
* - title (optional): New title for the bookmark
* - url (optional): New URL for the bookmark
*
* Output:
* - id: Bookmark ID
* - title: Updated title
* - url: Updated URL
*
* Usage:
* Update a bookmark's title or URL (at least one must be provided).
*
* Example:
* {
* "id": "123",
* "title": "New Title",
* "url": "https://www.example.com"
* }
* // Returns: { id: "123", title: "New Title", url: "https://www.example.com" }
*/
export class UpdateBookmarkAction extends ActionHandler<
UpdateBookmarkInput,
UpdateBookmarkOutput
> {
readonly inputSchema = UpdateBookmarkInputSchema
private bookmarkAdapter = new BookmarkAdapter()
async execute(input: UpdateBookmarkInput): Promise<UpdateBookmarkOutput> {
const changes: { title?: string; url?: string } = {}
if (input.title !== undefined) {
changes.title = input.title
}
if (input.url !== undefined) {
changes.url = input.url
}
if (Object.keys(changes).length === 0) {
throw new Error('At least one of title or url must be provided')
}
const updated = await this.bookmarkAdapter.updateBookmark(input.id, changes)
return {
id: updated.id,
title: updated.title,
url: updated.url,
}
}
}

View File

@@ -1,79 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import {
BrowserOSAdapter,
type ScreenshotSizeKey,
} from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const CaptureScreenshotInputSchema = z.object({
tabId: z.number().describe('The tab ID to capture'),
size: z
.enum(['small', 'medium', 'large'])
.optional()
.default('medium')
.describe('Screenshot size preset (default: medium)'),
showHighlights: z
.boolean()
.optional()
.default(true)
.describe('Show element highlights (default: true)'),
width: z.number().optional().describe('Exact width in pixels'),
height: z.number().optional().describe('Exact height in pixels'),
})
// Output schema
const CaptureScreenshotOutputSchema = z.object({
dataUrl: z.string().describe('Base64-encoded PNG data URL'),
})
type CaptureScreenshotInput = z.infer<typeof CaptureScreenshotInputSchema>
type CaptureScreenshotOutput = z.infer<typeof CaptureScreenshotOutputSchema>
/**
* CaptureScreenshotAction - Capture a screenshot of the page
*
* Captures a screenshot with configurable size and options.
*
* Size Options:
* - small (512px): Low detail, minimal tokens
* - medium (768px): Balanced quality/tokens (default)
* - large (1028px): High detail, maximum tokens
*
* Or specify exact dimensions with width/height.
*
* Returns:
* - dataUrl: PNG image as base64 data URL (data:image/png;base64,...)
*
* Usage:
* 1. For AI vision models: use 'medium' or 'large'
* 2. For debugging: use 'small'
* 3. For exact size: specify width and height
*
* Used by: ScreenshotTool, VisualClick, VisualType
*/
export class CaptureScreenshotAction extends ActionHandler<
CaptureScreenshotInput,
CaptureScreenshotOutput
> {
readonly inputSchema = CaptureScreenshotInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: CaptureScreenshotInput,
): Promise<CaptureScreenshotOutput> {
const dataUrl = await this.browserOSAdapter.captureScreenshot(
input.tabId,
input.size as ScreenshotSizeKey | undefined,
input.showHighlights,
input.width,
input.height,
)
return { dataUrl }
}
}

View File

@@ -1,124 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import {
BrowserOSAdapter,
type ScreenshotSizeKey,
} from '@/adapters/BrowserOSAdapter'
import { logger } from '@/utils/logger'
import { PointerOverlay } from '@/utils/PointerOverlay'
import { SnapshotCache } from '@/utils/SnapshotCache'
import { ActionHandler } from '../ActionHandler'
// Input schema
const CaptureScreenshotPointerInputSchema = z.object({
tabId: z.number().describe('The tab ID to capture'),
nodeId: z
.number()
.int()
.positive()
.describe('The nodeId to show pointer over'),
size: z
.enum(['small', 'medium', 'large'])
.optional()
.default('medium')
.describe('Screenshot size preset (default: medium)'),
pointerLabel: z
.string()
.optional()
.describe('Optional label to show with pointer (e.g., "Click", "Type")'),
})
// Output schema
const CaptureScreenshotPointerOutputSchema = z.object({
dataUrl: z.string().describe('Base64-encoded PNG data URL'),
pointerPosition: z
.object({
x: z.number(),
y: z.number(),
})
.optional()
.describe('Coordinates where pointer was shown'),
})
type CaptureScreenshotPointerInput = z.infer<
typeof CaptureScreenshotPointerInputSchema
>
type CaptureScreenshotPointerOutput = z.infer<
typeof CaptureScreenshotPointerOutputSchema
>
/**
* CaptureScreenshotPointerAction - Show pointer over element and capture screenshot
*
* Shows a visual pointer overlay at the center of the specified element,
* then captures a screenshot with the pointer visible.
*
* Prerequisites:
* - Must call getInteractiveSnapshot first to populate the cache
* - NodeId must exist in the cached snapshot
*
* Usage:
* 1. Get snapshot to find elements and populate cache
* 2. Call captureScreenshotPointer with tabId and nodeId
* 3. Returns screenshot with pointer overlay visible
*
* Used by: Visual debugging, automation demos, step-by-step captures
*/
export class CaptureScreenshotPointerAction extends ActionHandler<
CaptureScreenshotPointerInput,
CaptureScreenshotPointerOutput
> {
readonly inputSchema = CaptureScreenshotPointerInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: CaptureScreenshotPointerInput,
): Promise<CaptureScreenshotPointerOutput> {
const { tabId, nodeId, size, pointerLabel } = input
// Get element rect from cache
const rect = SnapshotCache.getNodeRect(tabId, nodeId)
let pointerPosition: { x: number; y: number } | undefined
if (rect) {
// Calculate center coordinates
const { x, y } = PointerOverlay.getCenterCoordinates(rect)
pointerPosition = { x, y }
// Show pointer
await PointerOverlay.showPointer(tabId, x, y, pointerLabel)
logger.debug(
`[CaptureScreenshotPointerAction] Showed pointer at (${x}, ${y}) for node ${nodeId}`,
)
} else {
logger.warn(
`[CaptureScreenshotPointerAction] No cached rect for node ${nodeId} in tab ${tabId}. Capturing without pointer.`,
)
}
// Small delay to ensure pointer is rendered
await this.delay(100)
// Capture screenshot with pointer visible
const dataUrl = await this.browserOSAdapter.captureScreenshot(
tabId,
size as ScreenshotSizeKey | undefined,
false, // Don't show highlights, we have the pointer
)
return {
dataUrl,
pointerPosition,
}
}
private delay(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms))
}
}

View File

@@ -1,38 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
const ClearInputSchema = z.object({
tabId: z.number().describe('The tab ID containing the element'),
nodeId: z
.number()
.int()
.positive()
.describe('The nodeId from interactive snapshot'),
})
type ClearInput = z.infer<typeof ClearInputSchema>
interface ClearOutput {
success: boolean
}
/**
* ClearAction - Clear text from an input element
*
* Clears all text from an input field or textarea.
* Used before inputText or to reset form fields.
*/
export class ClearAction extends ActionHandler<ClearInput, ClearOutput> {
readonly inputSchema = ClearInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: ClearInput): Promise<ClearOutput> {
await this.browserOSAdapter.clear(input.tabId, input.nodeId)
return { success: true }
}
}

View File

@@ -1,62 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { PointerOverlay } from '@/utils/PointerOverlay'
import { SnapshotCache } from '@/utils/SnapshotCache'
import { ActionHandler } from '../ActionHandler'
// Input schema
const ClickInputSchema = z.object({
tabId: z.number().describe('The tab ID containing the element'),
nodeId: z
.number()
.int()
.positive()
.describe('The nodeId from interactive snapshot'),
})
// Output schema
const ClickOutputSchema = z.object({
success: z.boolean().describe('Whether the click succeeded'),
})
type ClickInput = z.infer<typeof ClickInputSchema>
type ClickOutput = z.infer<typeof ClickOutputSchema>
/**
* ClickAction - Click an element by its nodeId
*
* This action clicks an interactive element identified by its nodeId from getInteractiveSnapshot.
*
* Prerequisites:
* - Must call getInteractiveSnapshot first to get valid nodeIds
* - NodeIds are valid only for the current page state
* - NodeIds are invalidated on page navigation
*
* Usage:
* 1. Get snapshot to find clickable elements
* 2. Choose element by nodeId
* 3. Call click with tabId and nodeId
*
* Used by: ClickTool, all automation workflows
*/
export class ClickAction extends ActionHandler<ClickInput, ClickOutput> {
readonly inputSchema = ClickInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: ClickInput): Promise<ClickOutput> {
// Show pointer overlay before click
const rect = SnapshotCache.getNodeRect(input.tabId, input.nodeId)
if (rect) {
const { x, y } = PointerOverlay.getCenterCoordinates(rect)
await PointerOverlay.showPointerAndWait(input.tabId, x, y, 'Click')
}
await this.browserOSAdapter.click(input.tabId, input.nodeId)
return { success: true }
}
}

View File

@@ -1,69 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { getBrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { PointerOverlay } from '@/utils/PointerOverlay'
import { ActionHandler } from '../ActionHandler'
// Input schema for clickCoordinates action
const ClickCoordinatesInputSchema = z.object({
tabId: z.number().int().positive().describe('Tab ID to click in'),
x: z.number().int().nonnegative().describe('X coordinate in viewport pixels'),
y: z.number().int().nonnegative().describe('Y coordinate in viewport pixels'),
})
type ClickCoordinatesInput = z.infer<typeof ClickCoordinatesInputSchema>
// Output confirms the click
export interface ClickCoordinatesOutput {
success: boolean
message: string
coordinates: {
x: number
y: number
}
}
/**
* ClickCoordinatesAction - Click at specific viewport coordinates
*
* Performs a click at the specified (x, y) coordinates in the viewport.
* Coordinates are in pixels relative to the top-left of the visible viewport (0, 0).
*
* Useful when:
* - Elements don't have accessible node IDs
* - Working with canvas or interactive graphics
* - Vision-based automation (e.g., AI identifies coordinates from screenshots)
*
* Example payload:
* {
* "tabId": 123,
* "x": 500,
* "y": 300
* }
*/
export class ClickCoordinatesAction extends ActionHandler<
ClickCoordinatesInput,
ClickCoordinatesOutput
> {
readonly inputSchema = ClickCoordinatesInputSchema
private browserOS = getBrowserOSAdapter()
async execute(input: ClickCoordinatesInput): Promise<ClickCoordinatesOutput> {
const { tabId, x, y } = input
// Show pointer overlay before click
await PointerOverlay.showPointerAndWait(tabId, x, y, 'Click')
await this.browserOS.clickCoordinates(tabId, x, y)
return {
success: true,
message: `Successfully clicked at coordinates (${x}, ${y}) in tab ${tabId}`,
coordinates: { x, y },
}
}
}

View File

@@ -1,38 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { CHROME_API_TIMEOUTS, withTimeout } from '@/utils/timeout'
import { ActionHandler } from '../ActionHandler'
const CloseWindowInputSchema = z.object({
windowId: z.number().int().positive().describe('ID of the window to close'),
})
const CloseWindowOutputSchema = z.object({
success: z.boolean().describe('Whether the window was successfully closed'),
})
type CloseWindowInput = z.infer<typeof CloseWindowInputSchema>
type CloseWindowOutput = z.infer<typeof CloseWindowOutputSchema>
export class CloseWindowAction extends ActionHandler<
CloseWindowInput,
CloseWindowOutput
> {
readonly inputSchema = CloseWindowInputSchema
async execute(input: CloseWindowInput): Promise<CloseWindowOutput> {
await withTimeout(
chrome.windows.remove(input.windowId),
CHROME_API_TIMEOUTS.CHROME_API,
'chrome.windows.remove',
)
return {
success: true,
}
}
}

View File

@@ -1,73 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { CHROME_API_TIMEOUTS, withTimeout } from '@/utils/timeout'
import { ActionHandler } from '../ActionHandler'
const CreateWindowInputSchema = z.object({
url: z
.string()
.optional()
.default('about:blank')
.describe('URL to open in the new window'),
incognito: z
.boolean()
.optional()
.default(false)
.describe('Create an incognito window'),
focused: z
.boolean()
.optional()
.default(true)
.describe('Whether to focus the new window'),
})
const CreateWindowOutputSchema = z.object({
windowId: z.number().describe('ID of the newly created window'),
tabId: z.number().describe('ID of the first tab in the new window'),
})
type CreateWindowInput = z.infer<typeof CreateWindowInputSchema>
type CreateWindowOutput = z.infer<typeof CreateWindowOutputSchema>
export class CreateWindowAction extends ActionHandler<
CreateWindowInput,
CreateWindowOutput
> {
readonly inputSchema = CreateWindowInputSchema
async execute(input: CreateWindowInput): Promise<CreateWindowOutput> {
const createData: chrome.windows.CreateData = {
url: input.url,
focused: input.focused,
incognito: input.incognito,
}
const createdWindow = await withTimeout(
chrome.windows.create(createData),
CHROME_API_TIMEOUTS.CHROME_API,
'chrome.windows.create',
)
if (!createdWindow) {
throw new Error('Failed to create window')
}
if (createdWindow.id === undefined) {
throw new Error('Created window has no ID')
}
const tabId = createdWindow.tabs?.[0]?.id
if (tabId === undefined) {
throw new Error('Created window has no tab')
}
return {
windowId: createdWindow.id,
tabId,
}
}
}

View File

@@ -1,64 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const ExecuteJavaScriptInputSchema = z.object({
tabId: z.number().describe('The tab ID to execute code in'),
code: z.string().describe('JavaScript code to execute'),
})
// Output schema
const ExecuteJavaScriptOutputSchema = z.object({
result: z.any().describe('The result of the code execution'),
})
type ExecuteJavaScriptInput = z.infer<typeof ExecuteJavaScriptInputSchema>
type ExecuteJavaScriptOutput = z.infer<typeof ExecuteJavaScriptOutputSchema>
/**
* ExecuteJavaScriptAction - Execute JavaScript code in page context
*
* Executes arbitrary JavaScript code in the page and returns the result.
*
* Input:
* - tabId: Tab ID to execute code in
* - code: JavaScript code as string
*
* Output:
* - result: The return value of the executed code
*
* Usage:
* - Extract data from page: "document.title"
* - Manipulate DOM: "document.body.style.background = 'red'"
* - Get element values: "document.querySelector('#email').value"
*
* Example:
* {
* "tabId": 123,
* "code": "document.title"
* }
* // Returns: { result: "Google" }
*/
export class ExecuteJavaScriptAction extends ActionHandler<
ExecuteJavaScriptInput,
ExecuteJavaScriptOutput
> {
readonly inputSchema = ExecuteJavaScriptInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: ExecuteJavaScriptInput,
): Promise<ExecuteJavaScriptOutput> {
const result = await this.browserOSAdapter.executeJavaScript(
input.tabId,
input.code,
)
return { result }
}
}

View File

@@ -1,53 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
const GetAccessibilityTreeInputSchema = z.object({
tabId: z
.number()
.int()
.positive()
.describe('Tab ID to get accessibility tree from'),
})
type GetAccessibilityTreeInput = z.infer<typeof GetAccessibilityTreeInputSchema>
export type GetAccessibilityTreeOutput = chrome.browserOS.AccessibilityTree
/**
* GetAccessibilityTreeAction - Get accessibility tree for a tab
*
* Returns the full accessibility tree structure containing:
* - rootId: The root node ID
* - nodes: Map of node IDs to accessibility nodes
*
* Each node contains:
* - nodeId: Unique node identifier
* - role: Accessibility role (e.g., 'staticText', 'heading', 'button')
* - name: Text content or label
* - childIds: Array of child node IDs
*
* Example payload:
* {
* "tabId": 123
* }
*/
export class GetAccessibilityTreeAction extends ActionHandler<
GetAccessibilityTreeInput,
GetAccessibilityTreeOutput
> {
readonly inputSchema = GetAccessibilityTreeInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: GetAccessibilityTreeInput,
): Promise<GetAccessibilityTreeOutput> {
const { tabId } = input
const tree = await this.browserOSAdapter.getAccessibilityTree(tabId)
return tree
}
}

View File

@@ -1,71 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import type {
InteractiveSnapshot,
InteractiveSnapshotOptions,
} from '@/adapters/BrowserOSAdapter'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { SnapshotCache } from '@/utils/SnapshotCache'
import { ActionHandler } from '../ActionHandler'
// Input schema
const GetInteractiveSnapshotInputSchema = z.object({
tabId: z.number().describe('The tab ID to get snapshot from'),
options: z
.object({
includeHidden: z
.boolean()
.optional()
.default(false)
.describe('Include hidden elements (default: false)'),
})
.optional()
.describe('Optional snapshot options'),
})
type GetInteractiveSnapshotInput = z.infer<
typeof GetInteractiveSnapshotInputSchema
>
/**
* GetInteractiveSnapshotAction - Get interactive elements from the page
*
* This is THE MOST CRITICAL action - it returns all interactive elements
* with their nodeIds, which are needed by click, inputText, clear, and scrollToNode actions.
*
* Returns:
* - elements: Array of interactive nodes with nodeIds
* - hierarchicalStructure: String representation of page structure
*
* Each element contains:
* - nodeId: Sequential integer ID (1, 2, 3...)
* - type: 'clickable' | 'typeable' | 'selectable'
* - name: Element text/label
* - attributes: Element properties (html-tag, role, etc.)
* - rect: Bounding box coordinates
*/
export class GetInteractiveSnapshotAction extends ActionHandler<
GetInteractiveSnapshotInput,
InteractiveSnapshot
> {
readonly inputSchema = GetInteractiveSnapshotInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: GetInteractiveSnapshotInput,
): Promise<InteractiveSnapshot> {
const snapshot = await this.browserOSAdapter.getInteractiveSnapshot(
input.tabId,
input.options as InteractiveSnapshotOptions | undefined,
)
// Cache snapshot for pointer overlay lookup
SnapshotCache.set(input.tabId, snapshot)
return snapshot
}
}

View File

@@ -1,69 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import {
BrowserOSAdapter,
type PageLoadStatus,
} from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema for getPageLoadStatus action
const GetPageLoadStatusInputSchema = z.object({
tabId: z
.number()
.int()
.positive()
.describe('Tab ID to check page load status'),
})
type GetPageLoadStatusInput = z.infer<typeof GetPageLoadStatusInputSchema>
// Output includes page load status details
export interface GetPageLoadStatusOutput {
tabId: number
isResourcesLoading: boolean
isDOMContentLoaded: boolean
isPageComplete: boolean
}
/**
* GetPageLoadStatusAction - Get page loading status for a tab
*
* Returns the current page load status including:
* - isResourcesLoading: Whether resources (images, scripts, etc.) are still loading
* - isDOMContentLoaded: Whether the DOM is fully parsed and ready
* - isPageComplete: Whether the page has completely finished loading
*
* Useful for waiting for pages to load before taking actions.
*
* Example payload:
* {
* "tabId": 123
* }
*/
export class GetPageLoadStatusAction extends ActionHandler<
GetPageLoadStatusInput,
GetPageLoadStatusOutput
> {
readonly inputSchema = GetPageLoadStatusInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(
input: GetPageLoadStatusInput,
): Promise<GetPageLoadStatusOutput> {
const { tabId } = input
const status: PageLoadStatus =
await this.browserOSAdapter.getPageLoadStatus(tabId)
return {
tabId,
isResourcesLoading: status.isResourcesLoading,
isDOMContentLoaded: status.isDOMContentLoaded,
isPageComplete: status.isPageComplete,
}
}
}

View File

@@ -1,74 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter, type Snapshot } from '@/adapters/BrowserOSAdapter'
import { logger } from '@/utils/logger'
import { ActionHandler } from '../ActionHandler'
// Input schema for getSnapshot action
const GetSnapshotInputSchema = z.object({
tabId: z.number().int().positive().describe('Tab ID to get snapshot from'),
type: z
.enum(['text', 'links'])
.default('text')
.describe('Type of snapshot: text or links'),
options: z
.object({
context: z.enum(['visible', 'full']).optional(),
includeSections: z
.array(
z.enum([
'main',
'navigation',
'footer',
'header',
'article',
'aside',
]),
)
.optional(),
})
.optional()
.describe('Optional snapshot configuration'),
})
type GetSnapshotInput = z.infer<typeof GetSnapshotInputSchema>
// Output is the full snapshot structure
export type GetSnapshotOutput = Snapshot
/**
* GetSnapshotAction - Extract page content snapshot
*
* Extracts structured content from the page including:
* - Headings (with levels)
* - Text content
* - Links (with URLs)
*
* Returns items in document order with type information.
*
* Example payload:
* {
* "tabId": 123,
* "type": "text"
* }
*/
export class GetSnapshotAction extends ActionHandler<
GetSnapshotInput,
GetSnapshotOutput
> {
readonly inputSchema = GetSnapshotInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: GetSnapshotInput): Promise<GetSnapshotOutput> {
const { tabId, type } = input
logger.info(
`[GetSnapshotAction] Getting snapshot for tab ${tabId} with type ${type}`,
)
const snapshot = await this.browserOSAdapter.getSnapshot(tabId, type)
return snapshot
}
}

View File

@@ -1,75 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { PointerOverlay } from '@/utils/PointerOverlay'
import { SnapshotCache } from '@/utils/SnapshotCache'
import { ActionHandler } from '../ActionHandler'
// Input schema
const InputTextInputSchema = z.object({
tabId: z.number().describe('The tab ID containing the element'),
nodeId: z
.number()
.int()
.positive()
.describe('The nodeId from interactive snapshot'),
text: z.string().describe('Text to type into the element'),
})
// Output schema
const InputTextOutputSchema = z.object({
success: z.boolean().describe('Whether the input succeeded'),
})
type InputTextInput = z.infer<typeof InputTextInputSchema>
type InputTextOutput = z.infer<typeof InputTextOutputSchema>
/**
* InputTextAction - Type text into an element by its nodeId
*
* This action types text into an input field or textarea identified by its nodeId.
*
* Prerequisites:
* - Must call getInteractiveSnapshot first to get valid nodeIds
* - Element must be typeable (type: 'typeable' in snapshot)
* - NodeIds are valid only for the current page state
*
* Behavior:
* - Automatically clears existing text before typing (handled by adapter)
* - Types the full text string
* - Triggers input/change events
*
* Usage:
* 1. Get snapshot to find typeable elements
* 2. Choose input field by nodeId
* 3. Call inputText with tabId, nodeId, and text
*
* Used by: TypeTool, form automation workflows
*/
export class InputTextAction extends ActionHandler<
InputTextInput,
InputTextOutput
> {
readonly inputSchema = InputTextInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: InputTextInput): Promise<InputTextOutput> {
// Show pointer overlay before typing
const rect = SnapshotCache.getNodeRect(input.tabId, input.nodeId)
if (rect) {
const { x, y } = PointerOverlay.getLeftCenterCoordinates(rect)
const textPreview =
input.text.length > 20
? `Type: ${input.text.substring(0, 20)}...`
: `Type: ${input.text}`
await PointerOverlay.showPointerAndWait(input.tabId, x, y, textPreview)
}
await this.browserOSAdapter.inputText(input.tabId, input.nodeId, input.text)
return { success: true }
}
}

View File

@@ -1,54 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const ScrollDownInputSchema = z.object({
tabId: z.number().describe('The tab ID to scroll'),
})
// Output schema
const ScrollDownOutputSchema = z.object({
success: z.boolean().describe('Whether the scroll succeeded'),
})
type ScrollDownInput = z.infer<typeof ScrollDownInputSchema>
type ScrollDownOutput = z.infer<typeof ScrollDownOutputSchema>
/**
* ScrollDownAction - Scroll page down
*
* Scrolls the page down by one viewport height using PageDown key.
* This approach is more reliable than the direct scrollDown API.
*
* Input:
* - tabId: Tab ID to scroll
*
* Output:
* - success: true if scroll succeeded
*
* Usage:
* Used for scrolling through long pages to view content below the fold.
*/
export class ScrollDownAction extends ActionHandler<
ScrollDownInput,
ScrollDownOutput
> {
readonly inputSchema = ScrollDownInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: ScrollDownInput): Promise<ScrollDownOutput> {
// Use sendKeys with PageDown instead of scrollDown API (more reliable)
await this.browserOSAdapter.sendKeys(input.tabId, 'PageDown')
// Add small delay for scroll to complete
await new Promise((resolve) => setTimeout(resolve, 100))
return { success: true }
}
}

View File

@@ -1,42 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
const ScrollToNodeInputSchema = z.object({
tabId: z.number().describe('The tab ID containing the element'),
nodeId: z.number().int().positive().describe('The nodeId to scroll to'),
})
type ScrollToNodeInput = z.infer<typeof ScrollToNodeInputSchema>
interface ScrollToNodeOutput {
scrolled: boolean
}
/**
* ScrollToNodeAction - Scroll an element into view
*
* Scrolls the page so that the specified element is visible in the viewport.
* Returns whether scrolling actually occurred.
*
* Used by: Click/Type tools to ensure element is visible before interaction
*/
export class ScrollToNodeAction extends ActionHandler<
ScrollToNodeInput,
ScrollToNodeOutput
> {
readonly inputSchema = ScrollToNodeInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: ScrollToNodeInput): Promise<ScrollToNodeOutput> {
const scrolled = await this.browserOSAdapter.scrollToNode(
input.tabId,
input.nodeId,
)
return { scrolled }
}
}

View File

@@ -1,54 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { BrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema
const ScrollUpInputSchema = z.object({
tabId: z.number().describe('The tab ID to scroll'),
})
// Output schema
const ScrollUpOutputSchema = z.object({
success: z.boolean().describe('Whether the scroll succeeded'),
})
type ScrollUpInput = z.infer<typeof ScrollUpInputSchema>
type ScrollUpOutput = z.infer<typeof ScrollUpOutputSchema>
/**
* ScrollUpAction - Scroll page up
*
* Scrolls the page up by one viewport height using PageUp key.
* This approach is more reliable than the direct scrollUp API.
*
* Input:
* - tabId: Tab ID to scroll
*
* Output:
* - success: true if scroll succeeded
*
* Usage:
* Used for scrolling back up through long pages.
*/
export class ScrollUpAction extends ActionHandler<
ScrollUpInput,
ScrollUpOutput
> {
readonly inputSchema = ScrollUpInputSchema
private browserOSAdapter = BrowserOSAdapter.getInstance()
async execute(input: ScrollUpInput): Promise<ScrollUpOutput> {
// Use sendKeys with PageUp instead of scrollUp API (more reliable)
await this.browserOSAdapter.sendKeys(input.tabId, 'PageUp')
// Add small delay for scroll to complete
await new Promise((resolve) => setTimeout(resolve, 100))
return { success: true }
}
}

View File

@@ -1,69 +0,0 @@
/**
* @license
* Copyright 2025 BrowserOS
* SPDX-License-Identifier: AGPL-3.0-or-later
*/
import { z } from 'zod'
import { getBrowserOSAdapter } from '@/adapters/BrowserOSAdapter'
import { ActionHandler } from '../ActionHandler'
// Input schema for sendKeys action
const SendKeysInputSchema = z.object({
tabId: z.number().int().positive().describe('Tab ID to send keys to'),
key: z
.enum([
'Enter',
'Delete',
'Backspace',
'Tab',
'Escape',
'ArrowUp',
'ArrowDown',
'ArrowLeft',
'ArrowRight',
'Home',
'End',
'PageUp',
'PageDown',
])
.describe('Keyboard key to send'),
})
type SendKeysInput = z.infer<typeof SendKeysInputSchema>
// Output is just success (void result)
export interface SendKeysOutput {
success: boolean
message: string
}
/**
* SendKeysAction - Send keyboard keys to a tab
*
* Sends special keyboard keys (Enter, Escape, arrows, etc.) to the specified tab.
* Useful for navigation, form submission, closing dialogs, etc.
*
* Example payload:
* {
* "tabId": 123,
* "key": "Enter"
* }
*/
export class SendKeysAction extends ActionHandler<
SendKeysInput,
SendKeysOutput
> {
readonly inputSchema = SendKeysInputSchema
private browserOS = getBrowserOSAdapter()
async execute(input: SendKeysInput): Promise<SendKeysOutput> {
const { tabId, key } = input
await this.browserOS.sendKeys(tabId, key as chrome.browserOS.Key)
return {
success: true,
message: `Successfully sent "${key}" to tab ${tabId}`,
}
}
}

Some files were not shown because too many files have changed in this diff Show More