* feat: add runtime vm cache sync
* feat: configure runtime vm cache sync
* feat: prefetch vm cache on startup
* feat: await vm cache before vm startup
* fix: recheck vm cache after prefetch wait
* fix: address vm cache review feedback
* build(server): require VM cache manifest env
* feat(build-tools): add Lima template for BrowserOS VM
* feat(build-tools): remove build-disk pipeline and recipe directory
Task 2 verification removed the scripts, recipe directory, workflow, and package scripts. Typecheck remains green here because manifest disk fields are removed in the next task, so the plan's expected missing-import failure does not apply yet.
* feat(build-tools): rename VmManifest to AgentManifest, drop disk fields
* feat(build): stage Lima template into server resources
Verified local-resource staging with: bun scripts/build/server.ts --target=darwin-arm64 --ci. The template was copied to dist/prod/server/darwin-arm64/resources/vm/browseros-vm.yaml and included in the zip. bun run build:server:test still fails on the pre-existing R2 limactl resource with: The specified key does not exist.
* docs(build-tools): Lima template dev loop + record D9
Updated the build-tools README in this worktree. Also recorded D9 in the canonical external spec file at /Users/shadowfax/llm/code/browseros-project/grove-ref/browseros-main/specs/decisions.md, which is outside this git checkout.
* chore(build-tools): sweep orphaned references to retired disk pipeline
* chore: self-review fixes
* feat(build): swap podman server resources for Lima (WS3)
- Upload limactl (arm64 + x64) to R2 via new 'browseros upload lima' CLI.
- Rewrite scripts/build/config/server-prod-resources.json: 2 Lima entries,
12 podman-family entries removed.
- Update codesign metadata (server_binaries.py) to add limactl, drop podman
family. Sign modules need no edits (data-driven).
- Delete orphaned podman-{vfkit,krunkit} entitlement plists.
- Release-gating note in browseros-agent/CLAUDE.md: don't cut releases off
dev between this commit and WS6 landing (OpenClaw still invokes podman).
* fix: address review comments for 0422-ws3_lima_resources
- Tighten _find_limactl_member to match exactly .../bin/limactl via
Path.parts, avoiding incidental matches like 'xbin/limactl'.
- Fall back USER -> USERNAME -> 'unknown' for uploaded_by so Windows
shells don't all record 'unknown'.
- Comment the broad except in upload_lima to explain why rollback
must fire for any mid-loop failure.
* chore: drop bun + rg from Windows sign list
These executables are already absent from server-prod-resources.json (no
Windows entries shipped); keeping them in the sign list produces
"Binary not found" warnings on every Windows build.
* fix: run full browseros-agent test suite
* fix: stabilize server test reporting in CI
* fix: address PR review feedback
* refactor: extract server core test runner
* refactor: group server tests by filesystem
* fix: align CI suites with server test groups
* fix: provision server env for all CI suites
* fix: stabilize ci checks
* fix: report real test counts in ci
The --compile-only and --ci flags served overlapping purposes for CI
builds. Remove --compile-only entirely since --ci already handles the
CI use case (skip R2, skip prod env validation, local zip packaging)
and --no-upload covers the upload-skipping use case for full builds.
The server release CI workflow fails on ubuntu-latest because
patch-windows-exe.ts requires Wine to run rcedit. Thread the existing
--ci flag through compileServerBinaries so Windows PE metadata patching
is skipped in CI mode with a warning log.
* feat: add server release workflow
* fix: address PR review comments for 0331-add_server_release_workflow
* refactor: rework 0331-add_server_release_workflow based on feedback
* refactor: rework 0331-add_server_release_workflow based on feedback
* feat: add browseros-cli self-updater
* fix: address review comments for 0327-cli_self_updater
* fix: address PR review comments for 0327-cli_self_updater
* fix: replace goreleaser with Makefile-based release build
Remove .goreleaser.yml (required Pro license for monorepo field) and
consolidate cross-compilation into `make release`. CI now uses the same
Makefile target, fixing a bug where POSTHOG_API_KEY was missing from
release ldflags.
* fix: address critical self-updater bugs from code review
- Fix SHA256 checksum mismatch: verify archive checksum before extraction
instead of verifying extracted binary against archive hash (was always
failing). Add VerifyChecksum() and integration test.
- Fix JSON field name mismatch: TypeScript was emitting camelCase
(publishedAt, archiveFormat) but Go expected snake_case
(published_at, archive_format). Manifest parsing was silently broken.
- Add decompression size limit (256 MB) to prevent zip/gzip bombs.
- Don't update LastCheckedAt on transient errors so retry happens on
next CLI invocation instead of waiting 24h.
* feat: upload CLI binaries to CDN during release and gate workflow to core team
- Extend scripts/build/cli/upload.ts with uploadCliRelease() that pushes
archives + checksums to R2 under versioned (cli/v{VERSION}/) and latest
(cli/latest/) paths, plus a version.txt for lightweight latest resolution
- Update scripts/build/cli.ts entry point with --release/--version/--binaries-dir
flags (existing no-args behavior preserved for upload:cli-installers)
- Rewrite install.sh and install.ps1 to fetch from cdn.browseros.com instead of
GitHub releases API — eliminates rate limits and API dependency
- Add environment: release-core to release-cli.yml for core-team gating via
GitHub environment protection rules
- Add Bun setup + CDN upload step to the workflow between build and GitHub release
* fix: address review feedback for PR #602
- Make loadProdEnv return empty map when .env.production is absent so
pickEnv falls through to process.env in CI (Greptile P1)
- Add semver format validation for version string in install.sh and
install.ps1 to guard against malformed CDN responses
- Pass inputs.version via env var instead of inline ${{ }} interpolation
to prevent command injection in workflow shell
Add a compile-only mode to the server build pipeline for CI/CD
environments that don't have R2 credentials. The --compile-only flag
skips resource staging and upload, producing only compiled binaries.
* feat: integrate models.dev for dynamic LLM provider/model data (#TKT-657)
Replace hardcoded model lists with data sourced from models.dev so new
providers and models appear automatically when the community adds them.
- Add build script (scripts/generate-models.ts) that fetches models.dev/api.json
and outputs a compact JSON with 10 providers and 520 models
- Replace hardcoded MODELS_DATA (50 models) with dynamic models.dev lookups
- Add searchable model combobox (Popover + Command) replacing plain Select dropdown
- Enrich provider templates with models.dev metadata (context window, image support)
- Keep chatgpt-pro, qwen-code, browseros, openai-compatible as hardcoded providers
* fix: address review — remove ollama-cloud mapping, fix default models, remove dead code
- Remove ollama from PROVIDER_MAP (ollama-cloud has cloud models, not local)
- Add ollama to CUSTOM_PROVIDER_MODELS with empty list (users type custom IDs)
- Update defaultModelIds to ones that exist in models.dev data:
openrouter → anthropic/claude-sonnet-4.5
lmstudio → openai/gpt-oss-20b
bedrock → anthropic.claude-sonnet-4-6
- Remove dead isCustomModel export
- Regenerate models-dev-data.json (9 providers, 486 models)
* fix: model suggestion list focus/dismiss behavior
- List only opens when input is focused or user types
- Clicking a model selects it and closes the list
- Clicking outside (blur) dismisses the list
- onMouseDown preventDefault on list items prevents blur race condition
* refactor: extract ModelPickerList component with proper open/close UX
- Collapsed state: Select-like trigger showing selected model + chevron
- Expanded state: search input + scrollable filtered list, inline
- Click outside or Escape to close, Enter to submit custom model
- Extracted as separate component (reduces dialog nesting, testable)
- No more setTimeout hacks for blur handling
* chore: remove plan doc from repo
* feat: add remote skill download and auto-sync
Download default skills from remote catalog on first setup with
bundled fallback when offline. Background sync every 45 minutes
checks for new/updated skills without overwriting user-customized
ones. Tracks installed defaults via content hashes in a local
manifest file.
* feat: make skills catalog URL configurable and add generation script
Add SKILLS_CATALOG_URL env var (following CODEGEN_SERVICE_URL pattern)
with fallback to the default constant. Add script to generate
catalog.json from bundled defaults for static hosting.
* feat: add R2 upload script and use cdn.browseros.com for catalog URL
Add upload-skills-catalog.ts that generates and uploads catalog.json
to Cloudflare R2 (same infra as existing build artifacts). Update
default catalog URL to cdn.browseros.com/skills/v1/catalog.json.
* test: add E2E tests for remote skill sync against live CDN
* fix: address code review findings — security, validation, DRY
- Add path traversal protection via safeSkillDir in writeSkillFile
and readSkillContent (reuses existing validation from service.ts)
- Add runtime type guards for catalog JSON and manifest JSON parsing
- Fix seedFromRemote to return false on partial failure so bundled
fallback kicks in
- Add per-skill error handling in syncRemoteSkills so one bad skill
doesn't crash the entire sync
- Wire stopSkillSync into Application.stop() shutdown path
- Extract version from frontmatter in seedFromBundled instead of
hardcoding '1.0'
- Consolidate duplicated logic: reuse installSkill/writeSkillFile/
contentHash/saveManifest from remote-sync.ts in seed.ts
- Extract shared catalog generation into scripts/catalog-utils.ts
* test: add flow tests for all four sync scenarios against live CDN
* refactor: remove redundant scripts and inline catalog generation
Drop generate-skills-catalog.ts, catalog-utils.ts, and
e2e-remote-sync.test.ts (covered by flows.test.ts). Inline
catalog generation into upload-skills-catalog.ts.
* test: add full E2E server flow test against live CDN
Tests all 7 steps of the real server lifecycle: fresh seed from CDN,
no-op sync, user edit preservation, skill reinstall, custom skill
protection, background timer firing, and second startup skip.
* chore: remove e2e-server-flow test
* fix: address Greptile review — entry validation, size limit, DRY, no-op saves
- Validate individual skill entries in catalog (id, version, content
must all be strings) not just the top-level shape
- Add 1MB response size limit on catalog fetch to prevent resource
exhaustion from compromised/misconfigured CDN
- Skip manifest save when sync cycle had no changes (avoids
unnecessary disk I/O every 45 minutes)
- Share extractVersion via remote-sync.ts export, remove duplicate
from seed.ts
* fix: prevent bundled fallback from overwriting partial remote seeds
When seedFromRemote partially fails, the bundled fallback now skips
skills already in the manifest (installed by the partial remote
seed). Also adds Content-Length early check before downloading the
full catalog response body.
* fix: run sync immediately on startup, not just on interval
Previously the first sync fired 45 minutes after boot. Now
startSkillSync runs one sync immediately so returning users
get skill updates right away.
* refactor: simplify sync — remote always wins, remove manifest
Remote catalog is the source of truth. If a skill exists in the
catalog, its version is compared against local frontmatter and
overwritten when newer. No manifest file, no content hashes.
User-created skills (IDs not in catalog) are never touched.
* fix: skip bundled skills already installed by partial remote seed
* chore: remove unreliable Content-Length check
* chore: remove size limit checks, fetch timeout is sufficient
* feat: add CDP UI inspector script for dev self-testing
* fix: address code review feedback for inspect-ui script
- Use Delete key (not Backspace) to match server's keyboard.ts clearField
- Add windowId resolution to open-sidepanel (chrome.sidePanel.open requires it)
- Make target matching case-insensitive
- Replace process.exit(1) in eval with thrown error for proper cleanup
- Add comment referencing DEV_PORTS source of truth
* docs: add self-testing workflow for UI changes via CDP inspector
* fix: runtime fixes for inspect-ui discovered during live testing
- Remove Input.enable (domain has no enable method)
- Add DOM.getDocument before DOM operations (required by protocol)
- Use BrowserOS-specific sidePanel.browserosToggle API instead of
standard chrome.sidePanel.open (side panel starts disabled)
- Enable side panel with setOptions before toggling
* feat: add test-ui skill for visual testing of agent extension UI
Adds a Claude Code skill that lets the agent visually test both
surfaces of the BrowserOS extension:
- New tab page (app.html) — left sidebar with Home, Scheduled Tasks,
Settings, Skills, Memory, Soul, Connect Apps
- Right side panel (sidepanel.html) — chat interface
Includes all gotchas discovered through real testing: randomized ports,
fresh profile onboarding redirect, stale element IDs after navigation,
BrowserOS-specific sidePanel APIs, DOM.getDocument requirement.
* feat: add press_key, scroll, hover, select_option, wait_for to inspect-ui
Brings inspect-ui.ts to parity with server's MCP input tools:
- press_key: key combos like Enter, Control+A, Meta+Shift+P
(ported from keyboard.ts pressCombo)
- scroll: up/down/left/right with configurable amount
- hover: hover over element by ID for tooltip/hover state testing
- select_option: select dropdown option by value or visible text
(ported from browser.ts selectOption)
- wait_for: poll for text or CSS selector with 10s timeout
Updated skill documentation with new commands and examples.
* docs: prefer snapshot over screenshot, add holistic debugging guidance
- Add snapshot vs screenshot guidance table — prefer snapshot for
structural checks, screenshot only for visual/layout verification
- Add server log checking instructions ([agent], [server], [build] tags)
- Add JS error checking via eval
- Add API connectivity verification
- Add common issues troubleshooting table
- Update all examples to use snapshot as default verification
* fix: address Greptile review feedback
- Replace process.exit(1) with process.exitCode + return in cmdWaitFor
to allow async CDP cleanup in finally blocks
- Fix cmdScroll enabling Runtime instead of Page domain
- Add BROWSEROS_EXTENSION_ID env var override for extension ID
- Align CLAUDE.md dev server command with SKILL.md canonical command