* fix: run full browseros-agent test suite
* fix: stabilize server test reporting in CI
* fix: address PR review feedback
* refactor: extract server core test runner
* refactor: group server tests by filesystem
* fix: align CI suites with server test groups
* fix: provision server env for all CI suites
* fix: stabilize ci checks
* fix: report real test counts in ci
* feat(openclaw): add CLI client
* fix(openclaw): swap service to cli client
* fix(openclaw): restore mixed json parsing
* fix(openclaw): validate agent list payloads
* fix(openclaw): simplify cli client boundary
* fix(openclaw): simplify cli client boundary
* fix(openclaw): prefer outer config json payloads
* fix(openclaw): ignore trailing config log payloads
* refactor(openclaw): bootstrap config through cli
* fix(openclaw): narrow bootstrap ownership
* fix(openclaw): avoid noop key restarts
* fix(openclaw): enforce supported provider sync
* refactor(openclaw): remove agent role contract
* fix(openclaw): migrate legacy state and apply model updates
* fix(openclaw): migrate legacy agent state
* fix(openclaw): harden state updates
* refactor: stabilize local OpenClaw bootstrap and chat auth
* fix(openclaw): propagate container env and drop legacy paths
Compose now loads provider creds from .openclaw/.env and passes the
gateway token through, so in-container CLI commands (tui, doctor,
config) authenticate correctly and the gateway process sees
OPENROUTER_API_KEY. Service ensures the state env file exists and
rewrites the compose env with the token before composeUp in setup,
start, and tryAutoStart. Podman machine gets larger defaults and the
container enables NODE_COMPILE_CACHE + OPENCLAW_NO_RESPAWN. Legacy
state migration, the unused WebSocket gateway-client, memorySearch,
and thinking defaults are removed.
Introduces release.macos.arm64.yaml for single-architecture arm64
macOS release builds. Mirrors the windows/linux single-arch pattern
(configure -> compile -> sign_macos -> package_macos -> upload),
skipping the universal_build module to avoid the x64 cross-compile
and lipo merge. Reuses the sparkle_setup step and the same
notarization env vars as the universal macOS config.
* feat(ota): bundle full server resources tree (server + third_party bins)
The OTA Sparkle payload now ships the complete resources/ tree the agent
build produced, not just browseros_server. Every third-party binary (bun,
ripgrep, podman, gvproxy, vfkit, krunkit, podman-mac-helper, win-sshproxy)
flows to OTA-updated installs so podman integration works for users on the
OTA channel, matching fresh Chromium-build installs.
Extract the per-binary sign table into build/common/server_binaries.py so
the Chromium-build sign path (modules/sign/) and OTA sign path (modules/ota/)
share a single source of truth. Adding a new third-party dep is now a
one-file edit that both paths pick up automatically; unknown executables
under resources/bin/ are a hard error at release time.
* fix(ota): address review comments on bundle signing flow
- Avoid double-zipping during notarization: add notarize_macos_zip for
pre-built Sparkle bundles so notarytool submits the zip directly
instead of re-wrapping it through ditto --keepParent (Apple's service
does not descend into nested archives). Keep notarize_macos_binary for
single-binary callers. Share credential setup + submit logic via
internal helpers.
- Fail fast on unknown executables in sign_server_bundle_macos: collect
the unknown-files list before any codesign call so a missing shared-
table entry aborts in seconds, not after a full signing round.
- Drop dead get_entitlements_path helper (no callers remain after the
bundle refactor).
* fix(ota): address PR review comments (greptile + claude)
- sign_server_bundle_macos filters to executables only (p.is_file() +
not p.is_symlink() + os.access X_OK) before applying the unknown-file
guard. Non-Mach-O files (configs, dylibs, etc.) under resources/bin/
no longer cause misleading 'unknown executable' hard failures.
- sign_server_bundle_windows now hard-errors on a missing expected
binary instead of silently skipping it. Symmetric with the macOS
guard — an incomplete bundle must not publish.
- ServerOTAModule.execute() uses tempfile.TemporaryDirectory context
managers for both the download and staging roots so they are cleaned
up on every path, including failures.
- Per-platform sign/notarize/Sparkle-sign failures now raise RuntimeError
instead of silently skipping the platform — a release pipeline can no
longer omit a target while reporting success.
- Move import os and import shutil to the top of ota/sign_binary.py.
- Drop unused log_error import from ota/server.py.
* chore: bump server
* fix(ci): add PR comment with test summary and block on failure
Add a `comment` job to the test workflow that parses JUnit XML artifacts
and posts a sticky PR comment showing pass/fail counts per suite, with
failed test names listed in a collapsible section and a link to the run.
Guards against fork PRs (read-only token) and stale overlapping runs
(skips comment if PR head has moved past our SHA).
* fix(ci): use payload SHA for staleness check, handle missing artifacts
- Replace context.sha (merge commit SHA) with
context.payload.pull_request.head.sha so the staleness guard
compares the correct values and the comment actually gets posted
- Add continue-on-error to download-artifact so cancelled runs
gracefully fall through to the "no test results" message
* fix(ci): show warning icon for zero-test suites instead of failure
* fix: isolate ACL semantic tests from Bun teardown crash
* fix: time out ACL semantic fixture subprocess
* fix: run full root test suite and repair sdk browser context
* fix: address PR review comments for 0415-fix_all_tests_and_issues
* test: temporarily skip sdk suite
* test: clarify sdk suite disable message
Pre-kill BrowserOS processes whose --user-data-dir path contains the
browseros-test- prefix before each spawnBrowser, and in the test:cleanup
hook. This prevents a crashed prior test run from leaving a headless
BrowserOS attached to a stale port, without touching the developer's
regular BrowserOS.app instance (its user-data-dir is
~/Library/Application Support/BrowserOS, which does not match).
OpenRouter's public model slugs use dots in version numbers
(e.g. `anthropic/claude-haiku-4.5`), but openclaw's model registry only
recognises the dashed form (`claude-haiku-4-5`). Passing the dotted form
makes openclaw's registry lookup miss silently — the agent turn completes
with `stopReason=stop payloads=0` and the UI shows no reply. Rewrite dots
to dashes in the model portion for openrouter providers only so
copy-pasted OpenRouter slugs resolve correctly.
Also, in development mode:
- Inject `logging.level: debug` into generated openclaw.json so the
gateway emits debug-level entries to its file log.
- Patch an existing openclaw.json on start/restart so already-provisioned
users pick up the debug setting without a reset.
- Tail the gateway container's logs into the browseros server logger so
they appear in the same stream as the rest of dev output.
* refactor: remove redundant context-overflow middleware
The middleware caught provider overflow errors and re-tried with a
naive prompt truncation, but its `nonSystem.slice()` had no awareness
of tool_use/tool_result pairing — a cut between an assistant tool_use
and the matching tool_result produces an orphaned tool_use that
providers reject with a different error.
Compaction (`createCompactionPrepareStep`) already handles this safely:
`findSafeSplitPoint` walks past tool messages to preserve pair
integrity, and the pipeline (strip binary → prune → reduce outputs →
LLM summarize → sliding window) handles every overflow path before
the request leaves the agent.
Drops 426 lines: the middleware itself, its wiring in ai-sdk-agent,
and the matching test block + helpers in compaction.test.ts.
* docs: document BROWSEROS_AI_SDK_DEVTOOLS in .env.example
Surfaces the opt-in dev flag so contributors know it exists. Captures
every LLM call to .devtools/generations.json for post-hoc inspection.
* chore: add auctor configuration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add project-level Claude Code skills for team
Adds 14 development workflow skills (brainstorming, planning, debugging,
TDD, code review, subagent-driven development, etc.) to .claude/skills/
so all team members get them automatically on pull.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The typecheck and compile scripts failed on fresh checkouts with
TS5083 because tsconfig.json extends .wxt/tsconfig.json, which is
gitignored and only generated by 'wxt prepare'. Run wxt prepare
before tsgo so the extended config and wxt.d.ts are always in place.
Expose the 7 Klavis Strata MCP tools as CLI subcommands under
`browseros-cli strata`, so CLI users (claude-code, gemini-cli) can
discover and execute actions on 40+ external services.
Commands: check, discover, actions, details, exec, search, auth.
Includes discovery flow guidance in help text, integration tests,
and an "Integrations:" group in the root help output.
Agents connecting over MCP URL/CLI (like claude-code) had no way to know
which Klavis connectors were available or authenticated, causing them to
fall back to browser automation. This adds a connector_mcp_servers tool
that checks connection status and returns an auth URL when needed.
* fix(openclaw): compose file path after service dir move, loopback auth fallback
- Fix COMPOSE_RESOURCE path: services moved to api/services/openclaw/
so the relative path needs one more parent directory traversal
- Fix requireTrustedAppOrigin middleware: Chrome extensions cannot set
the Origin header (forbidden header name). When Origin is absent,
fall back to checking the Host header is a loopback address. The
server only binds to loopback so only local processes can reach it.
Requests with an explicit non-trusted Origin are still rejected.
* fix: request header check
* chore: remove setup openclaw button
---------
Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>