42 Commits

Author SHA1 Message Date
Nikhil
a59f96f657 test(agent): tighten claude acp command assertion (#1022) 2026-05-17 11:28:59 -07:00
Nikhil
4b2e887fbb fix(agent): resolve claude codex acp commands (#1020) 2026-05-17 10:17:42 -07:00
Nikhil
0be59dccdd fix: revert recent agent/server changes (#995)
* Revert "fix(server): tolerate existing workspace dirs"

This reverts commit d7e1125db3.

* Revert "fix(server): support Gemini computer use requests"

This reverts commit 8b6483a633.

* Revert "feat(agent): add reset controls for sessions and memory"

This reverts commit f54eff4543.

* Revert "fix: add cloud sync sign-in disclosure"

This reverts commit f1ebfa5232.

* Revert "fix: allow pasted images in agent text box"

This reverts commit b89ea201fa.

* fix(server): stabilize Hermes harness state paths

* fix: address review feedback for PR #995
2026-05-11 14:26:56 -07:00
shivammittal274
dad2331448 refactor(agent): clean up hermes adapter structure (#994) 2026-05-11 22:57:59 +05:30
Nikhil
d7e1125db3 fix(server): tolerate existing workspace dirs
Fixes #974
2026-05-08 19:17:29 -07:00
Dani Akash
4e405681a7 feat(container): richen ManagedContainer — isImageCurrent + logs + sibling-exec (#968)
* feat(container): add isImageCurrent + getLogs + tailLogs + runOneShot to ManagedContainer

Four base-class additions ahead of the OpenClaw runtime migration so
the upcoming subclass doesn't have to re-implement them:

- isImageCurrent() — pure predicate comparing the existing container's
  image ref to descriptor.defaultImage. Treats SHA-pinned variants as
  matches. start() is unchanged; subclasses + service layers compose
  the predicate where they want short-circuit behaviour.
- getLogs(tail) and tailLogs(onLine) — generic log primitives, thin
  pass-throughs to ContainerCli.
- runOneShot(argv, opts) — sibling-container helper that spawns a
  <name>-setup container with the same image+mounts+env (no ports/
  health/restart), runs argv, force-removes after. Includes the
  retry-on-name-collision behaviour previously bespoke to OpenClaw.

Hermes inherits unused surface only — no behavioural change. The
in-flight base-class tests cover all four primitives.

* fix(container): tighten getLogs error path + close runOneShot timeout-onLog leak; trim docstrings

- getLogs now distinguishes a missing container (returns []) from
  other CLI failures (throws). Previously nerdctl's stderr ("Error:
  no such container: …") leaked into the lines array as if it were
  log output. isNoSuchContainer is exported from container-cli to
  share the predicate.
- runWithOptionalTimeout wraps the caller's onLog so post-timeout
  lines from the abandoned runCommand promise become no-ops; before
  this, callers could see onLog fire after runOneShot had already
  rejected, hitting state the caller may have torn down on the
  timeout error.
- Tightens the new docstrings to one short line per the project
  convention; drops a restating comment in the test file.
2026-05-08 15:58:05 +05:30
Dani Akash
b445615d61 refactor(claude+codex): migrate onto HostProcessAgentRuntime; collapse adapter-health (#967)
* feat(runtime): add ClaudeRuntime + CodexRuntime + factories

* refactor(host-adapters): switch wire-up + dispatch + health to runtime registry

main.ts registers ClaudeRuntime + CodexRuntime alongside Hermes. ACP
runtime resolves all three via the registry; legacy host-process
spawn is preserved as a fallback so unit tests that don't bootstrap
runtimes keep working. AdapterHealthChecker now reads runtime
snapshots through the registry — the embedded execAsync probe,
ADAPTER_HEALTH_COMMANDS table, and friendlyProbeFailure mapper
delete. As a side-effect this also fixes the Hermes "Unavailable"
chip (Hermes was missing from ADAPTER_HEALTH_COMMANDS).

Drops the standalone claude-code/prepare.ts and codex/prepare.ts
modules (their bodies are exported from the runtime files now).

* test(runtime): cover ClaudeRuntime + CodexRuntime descriptor + prep + factory

* fix(runtime): coalesce concurrent host-process probes; expose probedAt on snapshot

* fix(runtime): preserve acpx-core npx-wrapped spawn for claude + codex

The host-process runtimes were resolving the ACP spawn command
through their own getAcpExecSpec, which returned argv [claude] /
[codex] — bare binaries. acpx-core's built-in registry actually
resolves these adapters to npx wrappers around the official
ACP-aware packages (claude-agent-acp, codex-acp), and the package
version range is owned by acpx-core. The bare-binary spawn would
fail because either the binary is missing or doesn't speak ACP.

Spawn dispatch now goes through registry.resolve() + wrapCommandWithEnv
for claude/codex (matching pre-#967 behaviour). The runtime
registrations still drive health probing and per-turn prep — only
the spawn-command source-of-truth stays in acpx-core. Drops the
misleading getAcpExecSpec from the host-process runtime classes.

Regression test asserts the spawn command contains the npx package
name (claude-agent-acp / codex-acp) for each adapter.
2026-05-08 13:02:19 +05:30
Dani Akash
d68e8905fe refactor(hermes): migrate Hermes onto ContainerAgentRuntime (#965)
* feat(runtime): add HermesContainerRuntime + factory

* refactor(hermes): switch wire-up + dispatch to runtime registry

main.ts and the agent route stack now resolve Hermes through
`AgentRuntimeRegistry`. Drops the `hermesGateway` plumbing chain
(server.ts → routes → harness → AcpxRuntime), the
`HermesGatewayAccessor` interface, and `resolveHermesAcpCommand`.
Removes `HermesContainerService`, `HermesContainer`, and
`prepareHermesContext`'s standalone module — their behaviour is now
owned by `HermesContainerRuntime`.

* test(runtime): cover HermesContainerRuntime descriptor + lifecycle + factory

* test(runtime): move registry reset to afterEach to survive assertion failures
2026-05-08 11:32:19 +05:30
Dani Akash
e89fccd997 feat(runtime): introduce AgentRuntime abstraction (types, interface, registry, abstract bases) (#964)
* feat(runtime): introduce AgentRuntime types + interface + registry

Foundation for the unified agent-runtime abstraction. No adapter
migrates yet; the existing acpx-runtime, per-adapter prepare
modules, OpenClawService, HermesContainerService, and
adapter-health.ts all keep working unchanged.

This commit adds the data layer of the abstraction:

- `RuntimeDescriptor` discriminates the two kinds we ship today
  (`'container'` | `'host-process'`). UI components route on this.
- `RuntimeState` is the union of both kinds' states — container
  flow `not_installed → installing → installed → starting →
  running → stopped`, host flow `cli_missing | cli_present |
  cli_unhealthy`, plus the shared `errored` and
  `unsupported_platform` terminals.
- `RuntimeStatusSnapshot` carries a single `isReady: boolean` so
  the harness has one bit to read before spawning turns.
- `RuntimeAction` is a typed discriminated union — required args
  (e.g. `agentId` for `'reset-wipe-agent'`) are compile-time
  enforced, removing the previous footgun of optional args on a
  string-keyed dispatch.
- `RuntimeCapability` lists every action a runtime can advertise;
  `getCapabilities()` is the single switchboard the UI uses to
  decide which buttons to render.

`AgentRuntime` interface declares the contract every runtime
implements: status snapshot + subscriber, capability list,
`executeAction(action)`, `buildExecArgv(spec)`, and per-agent home
dir. `prepareTurnContext` is intentionally absent until the first
adapter migrates so callers can't depend on a method that has no
implementation.

`AgentRuntimeRegistry` is a small class + module-level singleton —
adapters register themselves at boot, the harness/UI look up by
`adapterId`. `resetAgentRuntimeRegistry()` is for tests only.

Two error classes round it out: `ActionNotSupportedError`
(capability gate, mapped to HTTP 405 in a later phase) and
`RuntimeNotReadyError` (state gate at the runtime layer, distinct
from the container-layer's `ContainerNotReadyError`).

* feat(runtime): add ContainerAgentRuntime + HostProcessAgentRuntime abstract bases

* test(runtime): cover state translation, action dispatch, registry

* fix(runtime): gate host-process executeAction on capabilities; only stamp probe cache after probe resolves
2026-05-08 09:47:38 +05:30
Dani Akash
805ae8e607 feat(server): ManagedContainer abstraction — Hermes readiness gate + ACP layering fix (#962)
* feat(container): add waitForContainerRunning primitive + typed error

Adds `ContainerCli.waitForContainerRunning(name, opts)` polling
`inspectContainer().running === true` until either the container
reports running or the timeout expires. Distinct from the existing
`waitForContainerNameRelease` (which waits for *deletion*).

Used by the upcoming managed-container layer between
`nerdctl create + start` and "container is ready for exec" so the
harness never spawns a turn against a half-started container —
which is the root cause of the silent first-turn failure on Hermes
today (`hermes-container.ts:130-160` returns immediately after
start).

Defaults sized for cold-start: 30s budget at 500ms cadence.
Throws `ContainerNotRunningError` (new, in `lib/vm/errors.ts`) on
timeout — distinct from `ContainerNameReleaseTimeoutError` so
callers can branch on "didn't come up" vs "didn't get cleaned up".

* feat(container): add ManagedContainer abstract base + state machine

Introduces the abstract base every container-backed agent adapter
will subclass. Owns the canonical state machine (not_installed |
installing | installed | starting | running | stopped | errored),
the lifecycle lock (per-process promise chain + cross-process file
lock), the gated `execute*` family, and the host↔container path
translator.

Subclasses provide only what's actually adapter-specific:
- `descriptor` (image, container name, supported platforms)
- `buildContainerSpec()` for the `nerdctl create` args
- `readinessProbe()` after the container reaches running
- `mountRoots()` for the path translator

Three execute methods, all sharing one invariant — every entry
point gates on state == running:

- `execProcess(spec)` spawns a long-lived child process via Bun,
  waits through `starting` up to 60s, throws typed
  `ContainerNotReadyError` if the container is not_installed /
  stopped / errored / timed out.
- `execOneShot(spec)` is a buffered convenience wrapper.
- `buildExecArgv(spec)` is the pure builder for callers (acpx-core)
  that need a shell-command string. Single source of truth for the
  `env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` chain
  that today's ACP runtime hand-rolls in two places (`acpx-runtime
  .ts:780-820` and `:823-870`).

`reset(level)` is on the API surface but throws
`ResetNotSupportedError` so the next PR can wire soft / wipe-agent
/ hard without revving the abstract class.

Path translator uses lexical containment against declared mount
roots; the realpath-based symlink-escape check lives one layer up
(in the file-attribution code that already shipped) since the
translator itself never reads from disk.

* feat(container): HermesContainer subclass + wrapper-service bridge

`HermesContainer` (lib/container/managed/) is the first concrete
adapter on the new `ManagedContainer` base. Provides the four bits
that are actually adapter-specific:

- `descriptor`: image, container name, supported platforms,
  readiness-probe tuning.
- `mountRoots()`: host↔container path mapping for the harness dir.
- `buildContainerSpec()`: nerdctl create args (env, mounts,
  add-hosts, entrypoint override).
- `readinessProbe()`: execs `hermes --version` inside the
  freshly-started container; bypasses the state gate via
  `cli.exec` since we're in `starting`, not `running`, when the
  probe runs.

`HermesContainerService` (api/services/hermes/) is rewritten as a
thin wrapper that delegates `prewarm` / `start` / `stop` /
`restart` / `shutdown` to the underlying `HermesContainer`. Public
surface is preserved so `main.ts`, `server.ts`, and
`agent-harness-service` compile unchanged in this PR; `getAccessor()`
still returns the structural `HermesAccessor` the ACP runtime
expects today (the runtime swap is the next commit). The wrapper
also exposes `getContainer(): HermesContainer | null` for callers
that want the richer surface.

The user-visible bug — Hermes silent first-turn failure — is fixed
as a side effect: `start()` now waits through
`cli.waitForContainerRunning` and runs the `hermes --version`
readiness probe before transitioning to `running`. Subsequent
chat turns are gated on the container actually being ready, not
just on `nerdctl create + start` having returned.

* feat(agent): ACP runtime spawns Hermes via ManagedContainer.buildExecArgv

`resolveHermesAcpCommand` no longer hand-rolls the
`env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` chain.
It now delegates to `gateway.buildExecArgv`, which the wrapper
service routes to the underlying `ManagedContainer.buildExecArgv`.

The structural `HermesGatewayAccessor` type gains one method
(`buildExecArgv`) — keeps the existing four getters so any
test/legacy caller still works. The wrapper's `getAccessor()`
delegates `buildExecArgv` to its `HermesContainer`. Net effect:
the `limactl shell ... -- nerdctl exec ...` argv chain has
exactly one owner (`ManagedContainer.buildExecArgv` in the
container layer) instead of being duplicated across `acpx-runtime`
and the now-deleted hand-built chain.

The OpenClaw branch (`resolveOpenclawAcpCommand`) is untouched —
its migration to ManagedContainer is a separate, larger PR that
also has to model the gateway / control-plane surfaces.

Tests: the existing acpx-runtime test suite expected the four
old getters; updated the Hermes-container fixture to also
provide `buildExecArgv` (mirrors the production builder inline so
the test stays independent of the production class wiring). All
320 server tests pass.

* test(container): managed-container + hermes-container coverage

20 cases across two files in `tests/lib/container/managed/`.

ManagedContainer base (14 cases):
- State machine: start() walks installing → starting → running;
  probe-false lands errored with lastError populated; stop()
  force-transitions to stopped even from errored.
- execProcess gating: rejects ContainerNotReadyError with
  reason='not_installed' when never started; reason='errored'
  when in errored state (preserving lastError); resolves once
  state flips to running while waiting; reason='timeout' when
  starting never resolves.
- buildExecArgv: snapshot test pinning the exact canonical
  `env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` string
  for the Hermes-shaped invocation; -e flags omitted when env is
  empty.
- reset(level): throws ResetNotSupportedError for all three
  levels (Phase 1 stub).
- Path translation: round-trip host ↔ container under a declared
  mount; mount-root itself translates without suffix; rejects
  PathOutsideMountsError for /etc/passwd / /proc/cpuinfo.
- subscribeState fires every transition, stops after unsubscribe.

HermesContainer subclass (6 cases):
- Descriptor declares adapterId='hermes', the canonical container
  name, image, and darwin platform support.
- start() happy path reaches running + invokes the
  `hermes --version` probe via cli.exec.
- Probe-non-zero start() lands errored with the right error.
- ContainerSpec built with idle entrypoint, harness bind-mount
  (source = /mnt/browseros/vm/hermes/harness, target =
  HERMES_CONTAINER_HARNESS_DIR), and host.containers.internal
  add-host pointing at the VM gateway.
- toContainerPath maps host harness paths to /data/agents/harness.
- buildExecArgv produces the canonical Hermes ACP spawn string
  with LIMA_HOME, container name, hermes binary path, and -e env.

Pre-existing test in tests/lib/container/container-cli.test.ts
(`waits until a container name is no longer resolvable`) flakes
under parallel test load on dev; passes solo. Last touched in
fd5aba24, well before this branch.

* chore: tidy comments

* fix(hermes): use provider:custom for openai + openai-compatible

Hermes (v2026.4.x) does not have a provider key called "openai" —
its `PROVIDER_REGISTRY` enumerates 33 named providers (anthropic,
deepseek, gemini, kimi-coding, etc.) and "openai" is not one of
them. Per the upstream docs, the canonical shape for any
OpenAI-compatible endpoint with an API key is:

    model:
      provider: custom
      base_url: "<endpoint>"

When `base_url` is set, Hermes ignores provider lookup and calls
the URL directly using OPENAI_API_KEY (or the configured api_key).
Today's mapping wrote `provider: "openai"` for both BrowserOS
provider types — Hermes' main-model loader rejected that with
`unknown provider 'openai'`, and the harness surfaced an opaque
"Internal error" on every first chat for any Hermes agent backed
by a Fireworks / Together / Groq / OpenAI provider.

Fix:
- `openai` and `openai-compatible` BrowserOS types now both map
  to `hermesProvider: 'custom'`.
- HermesProviderMapping gains an optional `defaultBaseUrl` field
  used when `provider: 'custom'` is set with no caller-supplied
  baseUrl (BrowserOS' `openai` type doesn't require base_url at
  the API edge, but Hermes' `custom` always does — so we fall
  back to https://api.openai.com/v1).
- writeHermesPerAgentProvider rejects `provider: 'custom'` with
  no base_url so a future regression fails loudly instead of
  silently writing an unusable config.yaml.

Tests updated: the existing openai-compatible case now asserts
`provider: "custom"` instead of `"openai"`, plus a new case
covering the openai-default-base-url fallback path.

Note: the `openrouter` mapping is left untouched because its
fix is unverified (Hermes' PROVIDER_REGISTRY doesn't appear to
contain "openrouter" either, but the auxiliary fallback chain
recognises it). Worth a separate follow-up — out of scope for
this fix which targets the user-reported reproduction.

* fix(container): install() must ensure VM is ready before image pull

Image operations run inside the Lima VM, so `nerdctl pull` fails
on a cold-boot run if the VM hasn't been started yet.
`HermesContainerService.prewarm()` (the original wrapper) always
called `vm.ensureReady()` before `ensureImageLoaded()` — the
wrapper-bridge introduced earlier in this PR delegated `prewarm()`
to `container.install()` and dropped the VM-ensure step.

`start()` does ensure VM, but on cold boot `prewarm()` and
`start()` race for the lifecycle lock and there is no guarantee
which one wins. When `prewarm()` lands first, the image pull
crashes against an unstarted VM and Hermes never comes up.

Fix: `install()` now awaits `deps.vm.ensureReady()` before
transitioning to `installing`. Errors land in `errored` exactly
as before. New regression test pins the call order
(`vm.ensureReady` → `loader.ensureImageLoaded`) so a future edit
can't silently re-introduce the gap.
2026-05-08 08:14:45 +05:30
shivammittal274
7a2a8e09bc feat(agent): add Hermes as 4th ACPX adapter (in-VM container, BrowserOS-managed providers) (#956)
* feat(agent): add Hermes as a 4th ACPX adapter (Phase A)

Adds Hermes Agent (NousResearch/hermes-agent) as a host-process ACPX
adapter, mirroring the Claude Code pattern.

- agent-types.ts: extend AgentAdapter union with 'hermes'
- agent-catalog.ts: add Hermes catalog entry
- lib/agents/hermes/prepare.ts (new): minimal prepare using prepareBrowserosManagedContext
- acpx-agent-adapter.ts: register the adapter
- acpx-runtime.ts: add 'hermes' branch returning 'hermes acp' (host)
- AdapterIcon.tsx: add Hermes icon
- db schema + supporting frontend types/literals updated for the new adapter

Phase A scope: host-process only. Phase A.5 swaps to nerdctl exec
into a Hermes container.

OpenClaw is untouched. Verified by all 6 POC spikes
(plans/features/claude-browseros-hermes-poc/findings.md).

* fix(agent): address Hermes adapter review issues

- NewAgentDialog: add 'hermes' to onValueChange guard so the dropdown
  option actually wires through onRuntimeChange/onHarnessAdapterChange
  (was a no-op before — selecting Hermes silently kept previous value)
- tests/acpx-runtime: add coverage for the new 'hermes' registry branch
- tests/acpx-agent-adapter: fold hermes prepare test into existing file,
  matching the pattern used for claude/codex/openclaw
- Delete tests/lib/agents/hermes-prepare.test.ts (now redundant)
- Reconcile install-mechanism comment between acpx-runtime.ts and
  agent-catalog.ts

* fix(agent): make Hermes adapter actually work end-to-end

Two surgical fixes uncovered while running the Phase A smoke test
through the BrowserOS chat HTTP API:

1. lib/agents/hermes/prepare.ts — seed per-agent HERMES_HOME from
   the user's global ~/.hermes/ on first use. ensureAgentHome only
   writes SOUL.md and MEMORY.md; without seeding config.yaml, .env,
   and auth.json, hermes acp comes up unconfigured and either hangs
   or errors with "No LLM provider configured." Copy is idempotent
   (skip if dest exists) so subsequent prepare calls don't clobber
   per-agent edits.

2. lib/agents/acpx-runtime.ts — wrap the hermes spawn in
   `bash -c "exec hermes acp | tee /dev/null"` to bridge Bun's
   socketpair-based child stdio with Python's asyncio.connect_write_pipe
   (which only drains correctly to a real pipe(2)). Without it, hermes'
   stdout never reaches the harness — verified by inspecting hermes
   process FDs: Bun gives the child unix sockets, asyncio queues writes
   that never become readable on Bun's end. With tee in the middle,
   hermes writes to a real pipe and tee bridges the bytes through the
   socket. Verified 2026-05-06 against hermes-agent 0.12.0 on macOS
   arm64 + Bun 1.3.6.

Smoke-test result with both fixes:
- ACP session created end-to-end
- BrowserOS MCP wired (96 browser tools registered with hermes)
- Reasoning + text streamed back through /agents/:id/sidepanel/chat
- Final stream: text-delta "PONG", finishReason "stop"

Updates the existing acpx-runtime test to assert the new spawn shape
(bash -c, tee /dev/null bridge) so the workaround can't silently regress.

* feat(agent): run Hermes adapter in Lima container (Phase A.5)

Move Hermes ACPX adapter from host-process spawn to running inside
docker.io/nousresearch/hermes-agent:v2026.4.30 in the existing
BrowserOS Lima VM, mirroring the OpenClaw container pattern.

Container lifecycle (api/services/hermes/hermes-container.ts):
- prewarm: ensure VM ready, pull image (or skip if already in
  containerd), start an idle container with /bin/sh -c "exec sleep
  infinity" so the harness can nerdctl exec into it per turn
- Tini bypassed — tini 0.19.0 in upstream image getopt-parses any
  -x token even after PROGRAM, breaking /bin/sh -c
- --add-host host.containers.internal:<vm-gateway> so hermes inside
  the container can reach the BrowserOS HTTP MCP endpoint
- Bind-mount <browserosDir>/vm/hermes/harness onto /data/agents/harness
  so per-agent HERMES_HOME dirs are visible to the container

Spawn (acpx-runtime.ts):
- HermesGatewayAccessor interface (mirrors OpenclawGatewayAccessor)
- resolveHermesAcpCommand builds:
  env LIMA_HOME=... limactl shell --workdir / browseros-vm --
    nerdctl exec -i -e PYTHONUNBUFFERED=1 -e HERMES_HOME=... <container>
    /opt/hermes/.venv/bin/hermes acp
- Absolute path /opt/hermes/.venv/bin/hermes (not bare "hermes") since
  upstream image's PATH is set by its entrypoint script which we
  override to keep the container idle
- Falls back to host-process spawn when no HermesGatewayAccessor wired
  (test path / dev fallback)
- Drops the host-mode bash+tee workaround — limactl/SSH/nerdctl pipe
  chain is sufficient for asyncio's pipe writer

MCP wiring:
- New PreparedAcpxAgentContext.browserosMcpHost field threads through
  prepare → getRuntime → createBrowserosMcpServers
- Hermes prepare sets browserosMcpHost='host.containers.internal' so
  the URL injected into newSession.mcpServers resolves from inside
  the container; other adapters keep '127.0.0.1' default

Per-agent home (lib/agents/hermes/prepare.ts):
- HERMES_HOME points at /data/agents/harness/<agentId>/home (in-container)
- Host-side seedHermesHomeFromGlobal still copies ~/.hermes/{config.yaml,
  .env, auth.json} into the per-agent home; the volume mount makes them
  visible inside the container
- New api/services/hermes/hermes-paths.ts holds host/container path helpers

End-to-end smoke tests against the dev server (clean Lima state):
- Plain text: PONG round-trip via /sidepanel/chat ✓
- Multi-turn context: RUBY-7421 stored + recalled ✓
- Multi-agent isolation: agent 2 doesn't see agent 1's secret ✓
- MCP tool execution: mcp_browseros_browseros_info fires ✓
- Image attachment via /chat: model identifies "Red" from a 128x128 PNG ✓
- Concurrent turns + 409 attachUrl: full attach streams the in-flight
  Pacific Ocean essay turn cleanly ✓
- Cancel midstream + recovery turn: ALIVE response ✓
- Persistence across server restart: agents survive ✓

Companion knowledge doc:
plans/features/claude-browseros-hermes-acp-knowledge.md

* feat(agent): per-agent provider/key for Hermes adapter

Lets users create multiple Hermes agents each with its own provider,
model, and API key. NewAgentDialog now shows provider/model/key fields
inline when 'Hermes' is selected. On submit, the harness writes the
per-agent <browserosDir>/vm/hermes/harness/<agentId>/home/{config.yaml,
.env} directly so the agent has the right config from turn 1 — no
dependency on the user having run `hermes setup` outside BrowserOS.

The existing seedHermesHomeFromGlobal flow remains as a fallback for
agents created without provider fields (e.g. via direct API or with
an existing ~/.hermes/ install).

Backend:
- shared/constants/hermes.ts: HERMES_SUPPORTED_PROVIDERS registry
  (openrouter, anthropic, openai, custom — bedrock follow-up)
- api/services/hermes/hermes-paths.ts: writeHermesPerAgentProvider
- agent-harness-service: writes per-agent config.yaml + .env in
  createAgent when adapter=hermes and apiKey present
- routes/agents.ts: relax modelId catalog validation for adapter=hermes
  (catalog has empty models[] by design; per-agent modelId is free-form)
- tests/agent-harness-service: cover write + skip paths

Frontend:
- HermesProviderFields.tsx (new): provider dropdown, model field, API
  key + optional baseUrl when provider=custom
- NewAgentDialog: render the new fields when adapter=hermes
- agents-page-actions: thread fields through createHarnessAgent
- AgentsPage / agent-harness-types: minor pass-through edits

Smoke-tested end-to-end against the dev server (clean Hermes per-agent
home, no ~/.hermes/ seed): create agent with apiKey + modelId, files
written at the per-agent path with mode 0600, first chat returns the
expected response, all without touching ~/.hermes/.

* feat(agent): source Hermes provider config from BrowserOS LLM providers

Replace the Hermes-specific provider/model/API-key form in New Agent
with a chooser that pulls from the same global LLM providers OpenClaw
uses (Settings → BrowserOS AI). Backend rejects creation with a 400
when the selected provider is missing required fields (apiKey, modelId,
plus baseUrl for openai-compatible) or is not in the Hermes-supported
set; the ~/.hermes/ fallback is removed so Hermes agents always carry
their own per-agent config.
2026-05-07 21:54:36 +05:30
shivammittal274
6f8da5b7fb refactor(openclaw): TKT-788 cleanup (relanded, openclaw-only) — bump image, lock no-auth, delete observer + image bypass (#954)
* refactor(openclaw): TKT-788 cleanup — bump image, lock no-auth, delete observer + image bypass

Re-lands the openclaw-only changes from #934 (reverted in #953 because the
original PR's working tree had stale rollback content for
`packages/browseros/tools/patch/`). This commit is the same openclaw
diff with zero changes outside `packages/browseros-agent/`.

What changes (TKT-788 work-streams A + B + C):

WS-A — bundled gateway no-auth:
- Bump image from `ghcr.io/openclaw/openclaw:2026.4.12` to
  `ghcr.io/browseros-ai/openclaw:2026.5.2-browseros.1` (BrowserOS-
  pinned variant with the no-auth contract baked in).
- Configure gateway with `auth.mode: 'none'`; remove the device-auth
  bootstrap dance that the older binary required.
- Delete the per-call token plumbing the http-client / observer / chat-
  client carried (340 LOC). The harness still passes a stable token in
  headers for backwards-compat with code that hasn't been re-pointed yet,
  but it is no longer required by the gateway.

WS-C — delete the image-attachment bypass:
- The HTTP `/v1/chat/completions` carve-out for OpenClaw image turns
  is gone. Image attachments now ride through ACP as image content
  blocks (which acpx 0.6.x supports natively for openclaw, claude, codex).
- Delete `openclaw-gateway-chat-client.ts` (211 LOC) and `image-turn.ts`
  (219 LOC).
- Drop `maybeHandleTurn` from the `AcpxAgentAdapter` interface and
  the openclaw entry. `AcpxAdapterTurnInput` removed.
- Drop the corresponding 'diverts OpenClaw image turns to the gateway
  chat client' test from `acpx-runtime.test.ts`.

WS-B — replace the WS observer with harness events:
- Delete `openclaw-observer.ts` (276 LOC) — no more parallel WS
  subscription, no more `new OpenClawObserver`, no more
  `ensureObserverConnected` / `observer.disconnect()` plumbing.
- Wire `AgentHarnessService` to receive turn-lifecycle events from
  the runtime stream itself (`turnLifecycleListeners`) and feed
  ClawSession from those, preserving the dashboard SSE shape.

Net: 314 insertions / 1144 deletions, all under
`packages/browseros-agent/`. Typecheck clean across all 6 packages.
946 server tests pass (1 unrelated CDP-dependent test skipped — same
state as origin/dev).

Reference: TKT-788. The patch-CLI rollback that was in the squash of
#934 is intentionally NOT in this commit.

* fix(openclaw): handle 2026.5.4 acp-cli envelope shapes (media + injected timestamp) + bump image

OpenClaw 2026.5.4 (the BrowserOS-pinned image variant with the no-auth
handshake bypass needed for cron tool calls from inside ACP) introduced
two new envelope prefix shapes that the post-bypass-deletion path now
surfaces in user-message text:

  [media attached: <internal-path> (<mime>)]
  [<weekday> <YYYY-MM-DD HH:MM> <TZ>] [Working directory: <path>]
  <BrowserOS role envelope>

The previous cleaner only matched a leading [Working directory: ...]
\n\n line. With media + timestamp prefixes ahead of it the anchor
no longer matched, so image-attachment user turns rendered with
8+ lines of envelope leak in the chat panel.

Replaces the single OPENCLAW_WORKDIR_PREFIX with three content-shape-
anchored patterns chained through stripOpenClawAcpCliEnvelope():

  1. [media attached: <path> (<mime>)]      ← repeats per attachment
  2. [<weekday> <YYYY-MM-DD HH:MM> <TZ>]    ← injectTimestamp
  3. [Working directory: <path>]            ← acp-cli prefixCwd

Each is anchored on its content shape (media attached:, weekday
abbrev + ISO date, Working directory:) rather than just '[…]', so
user-typed lines that happen to start with brackets are not eaten.

Also bumps OPENCLAW_IMAGE from 2026.5.2-browseros.1 to
2026.5.4-browseros.1. The 5.2 image refused tool-side WS connections
with 'device identity required' even though gateway auth.mode=none —
PR #6 in browseros-ai/openclaw added the OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH
bypass that ships in 5.4. Without 5.4, the cron tool (and any other
tool that opens a fresh gateway WS from inside the embedded runner)
fails with 1008.

Verified end-to-end with the BrowserOS chat endpoint:
- Plain text turn: clean
- Image attachment turn: clean (was leaking 8 envelope lines pre-fix)
- One-shot kind:at cron fires, PING fire renders clean
- Second openclaw agent creates, runs, history isolated

15/15 history-mapper unit tests pass; typecheck clean across all
packages.
2026-05-07 02:26:25 +05:30
shivammittal274
50cbe48558 Revert "refactor(openclaw): lock no-auth gateway, bump image, delete token pl…" (#953)
This reverts commit d81b99c8e3.
2026-05-07 01:49:50 +05:30
shivammittal274
d81b99c8e3 refactor(openclaw): lock no-auth gateway, bump image, delete token plumbing (TKT-788 WS-A) (#934)
* fix: disable bundled OpenClaw gateway auth

* refactor(openclaw): delete token plumbing now that auth is locked off

Builds on the cherry-picked spike (#933). With gateway.auth.mode=none
locked in as the only path the bundled gateway runs, the BrowserOS-side
token machinery becomes dead weight. This commit deletes:

- OpenClawService: token field, tokenLoaded, gatewayAuthMode state
  machine, getGatewayToken(), getGatewayHttpToken(),
  ensureTokenLoaded(), refreshGatewayAuthToken(),
  loadTokenFromConfig() and all six lifecycle call sites.
- OpenclawGatewayAccessor.getGatewayToken interface field.
- OpenClawHttpClient / OpenClawGatewayChatClient: optional getToken
  constructor arg and authHeaders() helpers.
- OpenClawObserver: gatewayToken field/parameter and the auth.token
  branch in the connect frame.
- GatewayContainerSpec.gatewayToken and the
  OPENCLAW_GATEWAY_TOKEN env wiring; the
  OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH=1 env is now always set
  rather than conditional.

Test suites: dropped bearer-token assertions and the two persisted-token
tests in openclaw-service that asserted deleted behavior.

Net: -310 LOC across src + tests, with 118 openclaw + acpx tests still
green. Typecheck and biome clean.

Reference: TKT-788 (move OpenClaw integration to ACPX runtime), WS-A.

* refactor(openclaw): delete gateway image bypass, route image turns via ACP (TKT-788 WS-C) (#935)

* refactor(openclaw): delete gateway image bypass, route image turns through ACP

The browseros-ai/openclaw ACP bridge accepts image content blocks
natively (extractAttachmentsFromPrompt at openclaw/src/acp/event-mapper.ts:92,
forwarded via chat.send attachments at translator.ts:295), so the
BrowserOS-side carve-out that diverted image-bearing turns to the
gateway HTTP /v1/chat/completions endpoint is no longer needed.

Deletes:

- apps/server/src/api/services/openclaw/openclaw-gateway-chat-client.ts
- The corresponding test file
- AcpxRuntime.sendOpenclawViaGateway, persistGatewayTurn,
  recordToOpenAIMessages helpers
- The image-attachment carve-out branch in AcpxRuntime.send
- openclawGatewayChat option from AcpxRuntime + AgentHarnessService
  + agent routes ctor wiring
- The randomUUID import (only the deleted helper used it)
- The acpx-runtime test for the deleted carve-out

Net: 614 LOC removed, 0 added, all 142 openclaw + acpx + agent tests
still green.

Reference: TKT-788, WS-C. Stacked on WS-A (#934).

* refactor(openclaw): delete WS observer, feed ClawSession from harness events (#936)

The openclaw-observer.ts WebSocket observer was a second tap on the
same gateway events the AcpxRuntime already sees as ACP session/update
notifications. Replace it with a pull from the AgentHarnessService's
turn lifecycle stream — keeping ClawSession and the /openclaw/dashboard
SSE endpoint shape unchanged for the BrowserOS UI.

Changes:

- AgentHarnessService: emit `turn_started` / `turn_event` / `turn_ended`
  to subscribers via a new `onTurnLifecycle(listener)` API. Wired around
  the existing `notifyTurnStarted/Ended` calls and inside the
  per-event read loop.
- agents route: forward an optional `onTurnLifecycle` dep into the
  service it constructs.
- server.ts: subscribe and route OpenClaw-adapter events to
  `OpenClawService.recordAgentTurnEvent(agentId, sessionKey, event)`.
- OpenClawService: new `recordAgentTurnEvent` method that maps stream
  events to ClawSession transitions (working/idle/error + currentTool
  from `tool_call` events). Keeps the existing
  `onAgentStatusChange` / `getAgentState` / `getDashboard` API.
- Delete `openclaw-observer.ts` (276 LOC) and all observer wiring
  (`new OpenClawObserver`, `ensureObserverConnected`, three
  `observer.disconnect()` call sites, the import).

Net: 276 LOC removed from the observer; ~130 LOC added across harness
event plumbing + recorder method. -146 LOC overall, all 141 tests still
green, typecheck clean, biome clean.

Reference: TKT-788, WS-B (Path 1: keep ClawSession + dashboard SSE shape).
Independent of WS-A (#934) and WS-C (#935); will rebase on top of
whichever lands first.

---------

Co-authored-by: Nikhil Sonti <nikhilsv92@gmail.com>
2026-05-07 01:40:37 +05:30
Dani Akash
db5e55a174 feat(agent-files): expose openclaw produced files inline + outputs rail (#946)
* feat(server): foundation for OpenClaw agent file-output attribution

Phase 1 of TKT-762 — surface files OpenClaw agents produce as
artifacts inline in chat + a per-agent Outputs rail. This commit
lays the storage + I/O foundation only; turn-lifecycle wiring,
HTTP routes, and UI follow in subsequent phases.

- New `produced_files` Drizzle table (FK→agent_definitions with
  cascade, unique on (agent, path) so re-modifications upsert).
  Migration 0002_chemical_whirlwind.sql. Adapter-agnostic schema
  — V1 only enables the watcher for openclaw, V2 can plug Claude
  / Codex into the same table without migrating.
- `ProducedFilesStore` — snapshot/finalize-turn diff API plus
  by-turn / by-agent queries and a path-resolver that enforces
  workspace-root containment for the download / preview routes.
- `walkWorkspace` — bounded recursive workspace walker; skips
  symlinks (no host-fs smuggling), excludes node_modules / .git /
  .cache, hard-capped at 50k entries / depth 16.
- `file-preview` helper — extension + magic-byte MIME detection,
  bounded text-snippet reader (1 MB cap), inline image base64
  reader (4 MB cap). Streaming download path lives in the route
  layer (next phase) — this module only handles the small
  in-memory reads the preview UX needs.

* feat(server): attribute openclaw turn outputs to the harness layer

Phase 2 of TKT-762 — wire the per-turn workspace diff into the
single dispatch path that owns every turn's lifecycle. Two prior
wiring points the original plan named (the OpenClaw HTTP chat
route + OutboundQueueService.tryDispatch) were collapsed in dev
into agent-harness-service.runDetachedTurn — both direct sends
and queued sends route through it now, so a single hook covers
both. The old `OutboundQueueService` is gone; its successor
`message-queue.ts` re-enters runDetachedTurn for the queued
case, so we still only need to bracket once.

Changes:

- New `produced_files` variant on `AgentStreamEvent` so the
  inline artifact card has a wire-format hook independent of the
  REST API.
- `ProducedFilesStore` gains `resolveAgentDefinitionId` to bridge
  gateway-side openclaw agent names to the harness's
  `agent_definitions.id`, handling both the reconciled-row shape
  (id == openclaw name) and the BrowserOS-created shape
  (id = oc-<uuid>, name = openclaw display name).
- `AgentHarnessService.runDetachedTurn`: snapshot the openclaw
  workspace before `runtime.send(...)`, finalize the diff in the
  outer finally, push the resulting rows as a `produced_files`
  event. Adapter-gated to openclaw only — Claude / Codex agents
  write to the user's own filesystem and don't need
  attribution.
- Skip attribution on user-cancel (`abort.signal.aborted`) so
  the side effects of an aborted turn don't get surfaced as
  "outputs you asked for." On runtime errors we still attribute,
  because partial outputs are what the user is most likely to
  want to recover.
- Lazy-init the store via `tryGetProducedFilesStore()` so tests
  that swap in a fake `agentStore` don't trip the
  process-wide `getDb()` initialisation guard.
- File attribution extracted into `attributeTurnFiles` helper to
  keep `runDetachedTurn`'s cognitive complexity under the lint
  ceiling.

Verifications:
- Server tsgo --noEmit clean for changed files.
- 162/162 server-api tests pass.
- Biome lint clean on all three changed files.

* feat(server): expose produced-files HTTP API for /agents

Phase 3 of TKT-762 — surface the rows Phase 2 attributes via four
read-only endpoints under the existing `/agents` router. Mounted
where the agents page already polls so the rail UI doesn't add
a second router/origin to its trust boundary.

Routes:

- GET /agents/:agentId/files
    Outputs-rail data, grouped by the assistant turn that
    produced each batch, newest first. `?limit=` clamps to N
    rows server-side (default 200).

- GET /agents/:agentId/files/turn/:turnId
    Per-turn refresh — used by the inline-card consumer to
    rebuild metadata after the SSE `produced_files` event lands,
    and by direct fetches that missed the live event.

- GET /agents/files/:fileId/preview
    Discriminated `FilePreview` JSON: text snippet (≤1MB),
    base64 image (≤4MB), pdf metadata, or `binary` placeholder
    when neither preview path applies. 404 when the file id is
    unknown OR the on-disk file disappeared after attribution.

- GET /agents/files/:fileId/download
    Streams raw bytes via `Bun.file().stream()` with
    `Content-Disposition: attachment` and the detected MIME
    type. The fileId is opaque — the server resolves the agent
    and on-disk path; the client never sees a path, so traversal
    is impossible by construction.

Service layer:

- `AgentHarnessService` gains `listAgentFiles`,
  `listAgentFilesForTurn`, `previewProducedFile`, and
  `resolveProducedFileForDownload`. All four are no-ops for
  claude / codex adapters (they return null/[]) so the route
  contract stays uniform across adapters even though only
  openclaw produces rows in v1.
- New `ProducedFileEntry` and `ProducedFilesRailGroup` DTOs —
  trimmed wire shapes that strip `agentDefinitionId` and
  `sessionKey` from the on-disk row.

Verifications:
- Server tsgo --noEmit clean for changed files (only pre-
  existing `Bun` global warning).
- 162/162 server-api tests pass.
- Biome clean on both changed files.

Smoke-test instructions for the route shape live in the plan
under §6 and §8; full end-to-end smoke happens in Phase 6.

* feat(agent): client-side hooks + types for agent file outputs

Phase 4 of TKT-762 — frontend foundation for the inline artifact
card and the per-agent Outputs rail. UI components themselves
land in Phase 5; this commit only adds types, hooks, and shared
helpers so the wiring is in place when the components arrive.

New module: `apps/agent/lib/agent-files/`

- `types.ts` — `ProducedFile`, `ProducedFilesRailGroup`, and the
  discriminated `FilePreview` union, mirrored from the server-side
  DTOs in `apps/server/src/api/services/agents/agent-harness-service.ts`.
  The `agentDefinitionId` / `sessionKey` columns on the on-disk
  rows deliberately do NOT exist at the type boundary — clients
  refer to files by opaque `id`.
- `file-helpers.ts` — pure helpers: `inferFileKind` (icon
  routing), `formatFileSize`, `extensionOf`, `basenameOf`,
  `buildFileDownloadUrl`. No React, no fetch, no DOM — anything
  stateful belongs in the hooks.
- `useAgentOutputs.ts` — `useAgentOutputs(agentId)` for the rail,
  `useAgentTurnFiles(agentId, turnId)` for the inline card,
  `useInvalidateAgentOutputs()` for the chat-stream-completion
  hook (Phase 5 will plumb this), and `useRefreshAgentOutputs()`
  for the rail's manual refresh button.
- `useFilePreview.ts` — `useFilePreview(fileId)` with
  `staleTime: Infinity` (previews are immutable for a given id;
  no point refetching on focus). Always opt-in (`enabled`) — the
  preview only loads when the user clicks a row.
- `index.ts` — barrel re-export so consumers import from one path.

Touched in `apps/agent/entrypoints/app/agents/`:

- `agent-harness-types.ts` — added `produced_files` variant + the
  `HarnessProducedFile` type to `AgentHarnessStreamEvent`. Mirrors
  the server-side change from Phase 2 so the client SSE consumer
  type-narrows correctly.
- `useAgents.ts` — exported the previously-private `agentsFetch`
  helper and the `AGENT_QUERY_KEYS` registry so the agent-files
  hooks reuse them without duplicating fetch / key conventions.
  Three new keys added: `agentOutputs`, `agentTurnFiles`,
  `filePreview`.

Verifications:
- Agent tsgo --noEmit clean.
- Biome clean on all touched files.

* feat(agent): inline artifact card + per-agent outputs rail

Wires the chat surface to the produced-files API shipped earlier:

- Inline artifact card under each assistant turn that produced files,
  populated by the live `produced_files` SSE event (resumes also stamp
  `turnId` so a missed live event can fall back to the per-turn fetch).
- Collapsible right-side Outputs rail on the agent conversation page,
  grouped by turn, with Refresh + per-agent open/close persistence in
  localStorage. Gated to openclaw adapters in v1.
- Shared file preview Sheet branches on the FilePreview union: text
  snippet (markdown for `.md`/`.mdx`, otherwise pre+code), image data
  URL, and download-only fallback for pdf/binary/missing.
- Conversation hook invalidates the rail's React Query cache from its
  finally block so newly attributed files appear without a manual
  refresh.

* feat(agent-files): polish — symlink-safe paths + toast on failures

- `resolveFilePath` now rejects symlink-escapes from the workspace
  by realpath-resolving both endpoints and re-checking containment.
  Lexical traversal (`..` segments) still fails fast without
  touching the filesystem.
- Added `produced-files-store.test.ts` with 6 path-resolution cases
  including a symlink whose target lives outside the workspace
  root — the prior string-only check would have allowed this.
- File preview Sheet: surfaces preview-load failures in a toast
  (in addition to the inline error block, which is easy to miss
  when the body has scrolled). Download button now intercepts the
  click so a missing baseUrl shows a toast instead of silently
  hiding the button.
- Outputs rail: refresh failures fire `toast.error` with the
  underlying message.

* fix(agent-files): drop duplicate `/agents` prefix from client paths

`agentsFetch` / `buildAgentApiUrl` already prepend `/agents`, but
the file-output hooks were passing fully-qualified paths
(`/agents/<id>/files`, `/agents/files/<id>/preview`, etc.) which
resolved to `/agents/agents/...` and 404'd. Fixed the four call
sites to pass paths relative to the `/agents` root.

* fix(agents): strip openclaw role envelope from chat history

PR #924 introduced a second `<role>…</role>` prefix for openclaw
turns — a single-line block distinct from the multi-line BrowserOS
role TKT-774 wired the unwrap against. Because TKT-774's
`stripOuterRoleEnvelope` matched the BrowserOS prefix exactly, the
openclaw envelope sailed through unstripped and user messages on
openclaw agents rendered the full preamble in /sessions/main/history
responses.

Make the strip adapter-agnostic: any
`<role …>…</role>\n\n<user_request>\n…\n</user_request>` shape gets
unwrapped. Drops the now-unused BROWSEROS_ACP_AGENT_INSTRUCTIONS
constant and adds a regression test that uses the openclaw form
verbatim.

* feat(agent-files): inline file-card strip with rail deep-link

Replaces Phase 5's row-list ArtifactCard with a horizontal strip
of small file cards under any assistant turn that produced files.
Click a card → opens the FilePreviewSheet directly (preview +
download). Click View / +N → opens the per-agent Outputs rail and
scrolls / expands the matching turn group.

The card strip:
- Caps at 4 visible cards; remainder collapses into a +N pill that
  shares the View handler.
- Owns its own FilePreviewSheet instance (parallel to the
  deprecated ArtifactCard) so the per-card preview path doesn't
  fight with the rail's Sheet.
- Hidden during streaming and absent when producedFiles is empty.
- Adapter-gated upstream: AgentCommandConversation only passes the
  open-rail callback when adapter==='openclaw', so claude / codex
  agents render no rail-opening affordance.

Rail changes:
- Accepts focusTurnId + onFocusTurnConsumed; the matching
  RailTurnGroup expands and scrollIntoView's on focus, then fires
  the consumed callback so the parent can drop the URL state.
- ?outputsTurn=<turnId> deep-links work: external nav opens the
  rail, sets focusTurnId, and clears the param after consumption.

ArtifactCard is marked @deprecated; remove in a follow-up once
nothing imports it.

* fix(agent-files): keep file-card strip visible after history reload

After Phase 7 the inline FileCardStrip vanished as soon as a turn
finished: `filterTurnsPersistedInHistory` dropped the optimistic
turn once history reloaded, and history items don't carry
`producedFiles`. So the user could see a file produced inside an
assistant message but no card to open it.

Two fixes in tandem so the strip survives both the just-finished
case AND a fresh page load:

- New `selectStripOnlyTurns` keeps persisted turns that still
  carry `producedFiles`. `ConversationMessage` learns a
  `stripOnly` mode that renders only the trailing strip (no
  duplicate user/assistant bubbles, since those are rendered by
  `ClawChatMessage`).
- `AgentCommandConversation` now also calls `useAgentOutputs` and
  passes `tailStripGroups` to `ClawChat`. Each rail group not
  already covered by a live or strip-only turn renders as its own
  tail `FileCardStrip` after history. Dedup keys on `turnId` so
  the same turn never doubles up.

Adapter-gated upstream — claude / codex agents skip the
useAgentOutputs fetch entirely. The card click still opens the
preview Sheet directly; View / +N still deep-link to the rail at
the matching turn group.

* fix(agent-files): per-turn association + cache invalidation

Two fixes for the inline file-card strip:

1. Strips were stacking at the conversation tail because every
   produced-files group rendered as a tail strip after history.
   New `mapHistoryToProducedFilesGroups` matches each group to
   the assistant history message that came from its turn — by
   `group.turnPrompt` vs the first non-blank line of the
   preceding user message — and ClawChat renders the strip
   directly under that bubble. Groups that don't match any
   history pair (orphans) still fall through to the tail.

2. `useInvalidateAgentOutputs` was passing `undefined` as the
   baseUrl placeholder to `invalidateQueries({ queryKey })` —
   react-query's positional partial-match doesn't treat
   undefined as a wildcard, so the cache stayed stale until the
   query refetched on its own (e.g. window focus). Switched to
   predicate-based invalidation that matches by [agentOutputs
   marker, agentId] regardless of baseUrl. Same for the per-turn
   files key.

Net effect: send a turn that produces files → strip appears
under the just-finished assistant message; reload the page →
strips still appear under the right bubbles, not bunched at
the bottom.

* fix(agent-files): review feedback — name guard, RFC 5987, limit cap

Three review-flagged issues:

1. Path traversal via agent display name — `getHostWorkspaceDir`
   accepted any string and `path.join`'d it, so a name like
   `../../tmp` escaped `.openclaw`. The pre-turn snapshot would
   then walk that escaped directory and attribute every file to
   the new turn; resolveSafeWorkspacePath's containment check is
   relative to the same escaped root so it would later serve
   arbitrary host paths. Added `isAgentWorkspaceNameSafe` (rejects
   `..`, separators, control chars, leading dots, empty); the
   builder now throws on unsafe names plus a defensive
   realpath-style containment check after the join. Harness
   wraps the call so the path-traversal trip just disables file
   attribution for the turn instead of failing the whole send.
   Six-case regression test pinned.

2. `encodeRfc6266Filename` JSDoc claimed an RFC 5987
   `filename*=UTF-8''<percent-encoded>` fallback but the impl
   only stripped CRLFs/quotes. Now actually emits the fallback
   when non-ASCII is present; helper returns the full
   `filename="…"; filename*=UTF-8''…` attribute pair so the call
   site doesn't have to wrap in quotes.

3. `/agents/:agentId/files` `?limit=` was forwarded to the DB
   uncapped — extracted `parseAgentFilesLimit` that clamps to
   [1, 500] before forwarding.

Also extracted `resolveSafeWorkspaceDir` + `snapshotWorkspaceForTurn`
helpers off `runDetachedTurn` so the new safety branch doesn't
push it past biome's cognitive-complexity cap.
2026-05-05 19:48:28 +05:30
Nikhil
d61d6fc8a9 feat: add ACPX agent runtime adapters (#924)
* feat: add acpx claude runtime paths

* feat: add acpx adapter preparation

* refactor: use acpx adapter preparation

* refactor: move openclaw image turns to adapter

* fix: keep openclaw independent of host cwd

* fix: address acpx review feedback

* fix: preserve claude host auth in acpx
2026-05-04 11:04:24 -07:00
Nikhil
c07d3d95d4 feat: add sqlite drizzle persistence (#919)
* feat: add drizzle agent schema

* feat: run sqlite drizzle migrations

* refactor: remove old sql identity dependency

* feat: store harness agents in sqlite

* build: package db migrations

* refactor: remove sqlite oauth token store

* feat: restore oauth token storage

* fix: handle empty install id

* chore: ignore server runtime state

* fix: address review feedback for PR 919
2026-05-02 15:19:57 -07:00
Nikhil
921a797c5b feat: add ACPX agent soul and memory support (#917)
* feat: add acpx agent runtime context helpers

* feat: add acpx runtime state store

* feat: prepare acpx agent runtime context

* feat: inject acpx agent command environment

* feat: forward acpx agent chat cwd

* fix: normalize acpx session record fallback

* feat: improve acpx agent soul and memory prompts

* fix: address PR review comments for memory-soul-acp

* fix: satisfy acpx runtime deepscan checks
2026-05-02 13:45:40 -07:00
Nikhil
d94597bbf9 fix(agent): add CLI model catalog entries (#915)
* fix(agent): add CLI model catalog entries

* fix: address PR review comments for acpx-models
2026-05-02 13:06:41 -07:00
Dani Akash
974e7e9b86 fix(agents): hide BrowserOS ACP envelope from chat history payloads (TKT-774) (#907)
* fix(agents): hide BrowserOS ACP envelope from chat history payloads (TKT-774)

The user-message text persisted on the wire carried two nested
envelopes — the outer `<role>You are BrowserOS…</role>` +
`<user_request>…</user_request>` block from buildBrowserosAcpPrompt
and the inner `## Browser Context` + `<selected_text>` +
`<USER_QUERY>` block from formatUserMessage. PR #856 had unwrapped
only the outer envelope on history reads, so the user bubble in
the agent rail still rendered the inner envelope, and the LLM
chat-service path leaked the wrapper all the way back to the
sidepanel client through AI SDK's stream sync.

Two surgical fixes, both server-only:

1) ACP path (acpx-runtime.ts) — replace unwrapBrowserosAcpPrompt
   with a comprehensive unwrapBrowserosAcpUserMessage that strips
   both layers and decodes the &lt;/&gt;/&amp; escapes the server
   applied via escapePromptTagText. Each step is independently
   defensive (anchors that don't match are skipped) so the helper
   is idempotent and tolerates partial / older / future-shape
   envelopes. Applied in userContentToText (history mapper) and
   inherited by extractLastUserMessage (listing's lastUserMessage).

2) LLM chat path (chat-service.ts) — split the persisted user
   message from the prompt-time copy. session.agent.appendUserMessage
   now stores the raw user text; a transient promptUiMessages array
   is built with the wrapped (formatUserMessage + context-change
   prefix) form and passed to createAgentUIStreamResponse for the
   model. onFinish restores the raw form before persisting, so the
   user-visible message and any future history reads see only the
   user's typed text.

Tests:

- acpx-runtime.test.ts: new dedicated unwrapBrowserosAcpUserMessage
  suite covering fully-wrapped messages, only-outer / only-inner
  inputs, selected_text blocks with attribute strings, idempotency,
  literal user-typed angle-bracket round-trip, and an integration
  test that round-trips the real formatUserMessage output through
  the unwrap to pin the writer/reader contract.
- chat-service.test.ts: existing 'rebuilds a managed-app session'
  test updated for the new behaviour — asserts the persisted user
  message is the raw text and the prompt copy passed to the agent
  carries the Klavis context-change notice.

* fix(agents): decode entity escapes before stripping inner envelope (TKT-774)

The unwrap was running its inner-envelope strips against the
literal-tag form (<USER_QUERY>, <selected_text>) but the persisted
payload has those tags entity-escaped (&lt;USER_QUERY&gt;,
&lt;selected_text&gt;) — buildBrowserosAcpPrompt runs
escapePromptTagText over the entire formatUserMessage payload
before adding the outer <role>+<user_request> envelope, so the
inner anchors never matched against the on-disk text and the user
was still seeing <USER_QUERY> in /agents/:id/sessions/main/history
responses.

Reorder unwrapBrowserosAcpUserMessage to: outer-strip → decode
entities → inner-strips. Test fixtures updated to reflect the
actual on-wire form (escaped inner tags); the round-trip test
duplicates the escape rule inline so the contract between
buildBrowserosAcpPrompt and the unwrap is pinned end-to-end.
2026-05-01 19:42:48 +05:30
Nikhil
fd5aba249b fix: stabilize OpenClaw gateway startup (#888)
* feat(server): add shared process lock helper

* feat(container): add container name reconciliation helpers

* feat(openclaw): serialize lifecycle across processes

* fix(openclaw): reconcile fixed gateway container startup

* test(openclaw): cover lifecycle race recovery

* fix(server): satisfy process lock error override

* fix(openclaw): address review feedback

* test(openclaw): align serialization mock with image check
2026-04-30 11:31:40 -07:00
Nikhil
492f3fcdf2 feat(openclaw): prewarm ghcr image in vm (#887)
* feat(openclaw): add gateway image inspection

* feat(openclaw): pull gateway image from registry

* refactor(vm): decouple readiness from image cache

* refactor(openclaw): remove vm cache from runtime factory

* feat(openclaw): detect current gateway image

* feat(openclaw): prewarm vm runtime and reuse current gateway

* feat(openclaw): prewarm runtime on server startup

* refactor(vm): remove browseros image cache runtime

* refactor(build-tools): remove openclaw tarball pipeline

* chore: self-review fixes

* fix(openclaw): suppress prewarm pull progress logs

* fix(openclaw): address review feedback

* fix(openclaw): resolve review findings

* fix(dev): stop stale watch supervisors
2026-04-30 11:18:11 -07:00
Dani Akash
8712f89f18 feat(agents): durable per-agent chat message queue + composer Stop (#880)
* feat(agents): durable per-agent chat message queue + composer Stop button

* fix(agents): tighten queue UI — smaller Stop, drop empty indicator, live drain attach

User feedback round 1 on the message-queue UX:

1) The Stop button matched the send/voice mics at h-10 w-10 with a
   solid destructive fill, which read as alarming. Shrunk to h-8 w-8,
   ghost variant with a soft destructive/10 background, smaller
   filled square glyph. Reads as a calm 'stop' affordance instead of
   a panic button.

2) The QueueItem's leading <QueueItemIndicator> dot was decorative
   only — no state, no interaction. Dropped it from QueuePanel along
   with the import; queue items now render as a clean preview line
   with the trailing X remove action.

3) When the server drained the queue and started the next turn, the
   chat panel didn't pick up the live stream until the user
   navigated away and back. The hook's resume effect previously
   only fired on agent change, not on listing-observed activeTurnId
   change. Surface activeTurnId from useHarnessAgents into
   useAgentConversation; effect now re-runs when the id changes,
   calls /chat/active, and attaches to the new turn — so a queued
   message starts streaming the moment the server drain pops it.

* fix(agents): don't reset streaming state from the resume effect's no-op paths

The Stop button was disappearing while the agent was actively
streaming, even though events were still flowing into the chat. Root
cause: the resume effect's `finally` block reset `streaming`,
`turnIdRef`, and `lastSeqRef` unconditionally — including on the
early-return paths (no active turn, or another mechanism already
owns the stream).

Sequence that triggered it:
  1) User sends a message → send() sets streamAbortRef + streaming=true
     and starts consuming the SSE.
  2) User enqueues another message → enqueue mutation invalidates the
     listing query.
  3) Listing refetches with the live activeTurnId → the resume
     effect re-fires (deps include activeTurnIdDep).
  4) attemptResume hits `if (streamAbortRef.current) return` because
     send() owns it.
  5) The finally clause fires anyway and calls setStreaming(false),
     clobbering the live state set by send(). The SSE consumer keeps
     running (refs are intact) so text keeps streaming, but the React
     flag is wrong, so the Stop button gates off.

Fix: track whether *this* run actually started a stream
(`weStartedStream`). The finally only resets state when it does.
Early-return / no-active-turn paths now leave streaming/turnIdRef/
lastSeqRef alone for whoever does own them.

Also widens the Stop button's visibility (`canStop` prop on
ConversationInput) so it stays steady across the brief gap between
turns when a queue drain is mid-flight; the parent computes
`streaming || activeTurnId !== null || queue.length > 0`. The
visibility widening is independent of the streaming-state fix above
— both are now in place.

* revert: drop canStop widening — Stop only shows while streaming

Reverts the canStop prop on ConversationInput and the OR-with-queue
visibility from AgentCommandConversation. Stop is gated solely on
`streaming` again. Between turns (queue draining) the button stays
hidden — only the actively-streaming turn is interruptible from the
composer, which matches what the user actually expects.

* fix(agents): persist the kicking-off prompt on active turns so the resume placeholder isn't empty

When a queued message drained and started a new turn, the chat
panel's resume effect staged a placeholder turn with userText: ''
because the hook had no way to know what message kicked off the
turn — only the agent-side stream was visible, and the user bubble
above it was blank until the user navigated away and back (at which
point the session record's history loaded normally).

Fix: ActiveTurnRegistry.register now accepts an optional `prompt`
that's stashed on the turn and surfaced via describe() / the
ActiveTurnInfo response. AgentHarnessService.startTurn passes the
incoming message into register. /chat/active returns it. The chat
hook's resume effect uses active.prompt as the placeholder
turn's userText, so the user bubble shows the queued message text
the moment streaming begins. Falls back to '' for older clients
that haven't been refetched yet.

* fix(agents): always release streamAbortRef on resume cleanup, even when cancelled

Greptile P1 follow-up. The previous `weStartedStream` guard correctly
stopped the resume effect's no-op early-returns from clobbering an
in-flight `send()` stream — but it also stopped a *cancelled*
mid-stream resume from clearing its own `streamAbortRef`. When the
cleanup fires (e.g. the 5s listing poll captures a new queue-drain
turn id while the SSE for the prior turn is still finishing), the
next effect run hits the `if (streamAbortRef.current) return` guard
against the now-aborted controller and never reattaches, leaving
`streaming === true` with no live stream until the user navigates
away.

Split the finally block: always release `streamAbortRef` when we
owned the controller (so the next run can take over), but only
reset the streaming flag / turn id / lastSeq on a clean exit (the
new run will set those itself, so resetting on cancel would just
flicker).
2026-04-30 18:26:56 +05:30
Nikhil
edfc5c751c fix: align OpenClaw gateway image with VM cache (#868)
* fix: load OpenClaw gateway image from VM cache

* fix: use container port for OpenClaw ACP bridge

* fix: address review feedback for PR #868
2026-04-29 12:11:00 -07:00
Nikhil
471256f31c fix: stop passing native permission flags to ACP adapters (#867) 2026-04-29 11:07:51 -07:00
Nikhil
4c90ca696b fix(agents): connect OpenClaw ACP inside gateway container (#866) 2026-04-29 11:07:29 -07:00
Dani Akash
a228c278c6 feat(agents): background-resilient chat — turns survive tab disconnect (#863)
* feat(agents): decouple chat turn lifecycle from SSE response

Introduce a per-process ActiveTurnRegistry that owns each agent turn's
lifecycle and a ring-buffered event stream, so chat tabs that close,
refresh, or navigate away no longer cancel the in-flight turn. New
endpoints:

  POST   /agents/:id/chat          starts a turn (now returns 409 when
                                   one is already running, with the
                                   active turnId for attaching)
  GET    /agents/:id/chat/active   reports the running turn for a UI
                                   that just mounted
  GET    /agents/:id/chat/stream   subscribes to a turn; supports
                                   Last-Event-ID resume via per-event
                                   seq ids
  POST   /agents/:id/chat/cancel   explicit cancel — fetch abort no
                                   longer affects the underlying turn

The chat hook now captures X-Turn-Id, tracks lastSeq from SSE id lines,
re-attaches on mount when the server still has an active turn, and
routes Stop through the cancel endpoint. The runtime call uses the
registry's per-turn AbortController instead of the HTTP request signal,
which is the core decoupling that lets turns outlive their initiator.

* feat(agents): add ActiveTurnRegistry primitive backing the new chat lifecycle

The previous commit referenced these files in tests and the harness
service but global gitignore swallowed them on the first add.

The registry owns the per-turn ring buffer (drop-oldest, terminal frame
preserved), the per-turn AbortController, and subscriber fan-out used
by /chat/stream resume.
2026-04-29 21:01:06 +05:30
Dani Akash
0c84547e8f feat(agents): migrate OpenClaw chat onto the unified harness/ACP path (#859)
* chore(acp): smoke-test ACP capabilities against running gateway

Adds apps/server/scripts/acp-smoke.ts which spawns `openclaw acp`
inside the gateway container and exercises every method we plan to
depend on: initialize, newSession, prompt (text + image), cancel,
listSessions, loadSession.

SDK pinned to 0.19.1 (Bun's minimum-release-age policy blocks 0.20+
which were released < 7 days ago).

Findings (full notes in plan outcomes):
- promptCapabilities advertises image:true but the model does NOT see
  image bytes — silently dropped at the bridge.
- sessionCapabilities advertises {list:{}} but session/list throws
  "Method not found": stale capability advertising.
- loadSession works; replays user/assistant/thought text and
  session_info/usage/commands updates. No tool_call replay, as
  documented.
- cancel works end-to-end: stopReason=cancelled.
- closeSession/resumeSession are not on ClientSideConnection in
  0.19.1; kill child to close, use loadSession for rebind.

Plan revisions triggered by spike are recorded in
plans/browseros-ai/BrowserOS/features/2026-04-28-2310-claude-code-acp-implementation-roadmap.md.

* chore(acp): re-run smoke on SDK 0.21.0 and add mode/config/auth scenarios

After bypassing Bun's minimum-release-age and upgrading the SDK to
0.21.0, restore the previously-skipped resume/close paths and add
three new scenarios: mode (setSessionMode), config (setSessionConfigOption,
correct configId field), and auth (authenticate noop).

Findings, all bridge-side (independent of SDK):
- session/list, session/resume, session/close all throw -32601 on
  OpenClaw 2026.4.12 — capability advertising is stale.
- Image content blocks silently dropped; model never sees the bytes.
- setSessionMode and setSessionConfigOption work; latter requires
  `configId` (not `optionId`) per the schema.
- loadSession replays user/assistant/thought text + session_info +
  usage + available_commands; no tool_call replay (documented).
- authenticate is a noop on OpenClaw (no authMethods advertised).

Plan outcomes updated with full method-support matrix.

* chore(deps): promote @agentclientprotocol/sdk to a runtime dependency

The smoke script in apps/server/scripts/acp-smoke.ts used the SDK as
devDependency. The upcoming ACP bridge (apps/server/src/api/services/acp/)
needs it at runtime, not just for tooling. Move the entry from
devDependencies to dependencies, alphabetically first under @a*.

Pinned to 0.21.0 — same version the smoke script validated against.
README gains a small Dependencies note pointing at the future bridge
location.

No code changes yet. The bridge wiring lands in subsequent commits.

* fix(openclaw): wire LlmProvider.supportsImages through to OpenClaw model config

When BrowserOS sets up a custom OpenAI-compat provider on the gateway,
the agent UI's "Supports Image" flag (LlmProviderConfig.supportsImages)
was being dropped on the floor. As a result the persisted model entry
had no `input` field, OpenClaw defaulted it to ['text'], and image_url
content parts were silently stripped before the model saw them.

Fix:
- Extend OpenClawSetupInput / OpenClawAgentMutationInput on the agent
  side (useOpenClaw.ts) and the route body schema + SetupInput +
  createAgent input on the server side with `supportsImages?: boolean`.
- AgentsPage forwards `llmOption?.supportsImages` from the selected
  LlmProviderConfig in both handleSetup and handleCreate.
- provider-map.resolveSupportedOpenClawProvider emits
  `input: ['text', 'image']` on the model entry when the flag is
  truthy; otherwise emits the explicit `['text']` so the value is
  always pinned (avoids relying on OpenClaw's implicit default).
- applyBrowserosConfig adds `tools.media.image.enabled = true` to the
  bootstrap batch so the gateway's image-understanding pipeline is
  always wired up — per-model `input` still gates which models see
  images, this just enables the global path.

ACP image content blocks are still dropped by the OpenClaw bridge —
that's a separate bridge bug, not addressed here. This commit
restores image support for the OpenAI-compat /v1/chat/completions
path that the upcoming ACP chat panel will use as a carve-out for
image-bearing prompts.

Existing custom-provider configs are NOT auto-migrated; users will
re-acquire image support either by re-running setup or by editing
their model entries' `input` field manually. A migration pass for
legacy installs is not in scope for this commit because the
"supportsImages" intent isn't recoverable from the persisted config
alone — the source of truth is the LlmProvider record on the agent
side.

* feat(agents): add OpenClaw to AgentAdapter union and catalog

Extends AgentAdapter to 'claude' | 'codex' | 'openclaw' and adds the
OpenClaw entry to AGENT_ADAPTER_CATALOG. The new entry has:

- defaultModelId: 'default' — OpenClaw's ACP bridge does not surface
  per-session model selection (verified during the ACP spike), so
  models live in the OpenClawService config, not in the adapter
  catalog. AgentDefinition.modelId carries the gateway-side model
  name for display only.
- models: [] — empty list signals "no per-session model picker" in
  the UI; isSupportedAgentModel('openclaw', undefined|'default')
  returns true via the existing fallback path.
- reasoningEfforts mirror OpenClaw's session-level `thought_level`
  config option (off / minimal / low / medium / high / adaptive).

Also extends:
- isAgentAdapter type guard recognizes 'openclaw'
- HarnessAgentAdapter union on the extension side
- agents.test.ts createAgent fake type
- agent-catalog.test.ts asserts on the new entry, empty model list
  passthrough behavior, and OpenClaw's reasoning effort set

Lockfile delta is the workspace SDK pin reconciling 0.20.0 (taken
from dev's lock) up to our package.json's 0.21.0 (added in
c1d987ea). acpx still uses 0.20.0 transitively — both are present.

No runtime wiring yet — the registry override and AcpxRuntime
plumbing land in subsequent commits.

* feat(agents): plumb OpenClaw gateway accessors into AcpxRuntime

Adds an optional `openclawGateway` accessor to AcpxRuntime so the
upcoming registry override (Step 4) can spawn `openclaw acp` inside
the gateway container with the right port, token, and container/VM
identity. All accessors are getter-shaped so values stay live across
gateway restarts (port can change, token can rotate).

The accessor is threaded:
  server.ts → createAgentRoutes → AgentHarnessService → AcpxRuntime
                            ↘ sidepanel lazy AcpxRuntime

Also adds OpenClawService.getGatewayToken() returning the in-memory
token string. We pass it via OPENCLAW_GATEWAY_TOKEN env var on the
spawn (per OpenClaw's documented env-var precedence) instead of via
`--token` flag (which leaks to ps aux) or `--token-file` path (no
discrete token file lives inside the container — the token is nested
inside openclaw.json).

Wiring is dormant — the registry override that consumes these
accessors lands in Step 4. Typecheck + existing acpx/harness/routes
tests pass unchanged.

* refactor(agents): scrub local plan-step references from code comments

Replaces forward-looking comments that referenced internal plan
steps (e.g. "Step 4 wires this into…") with comments that justify
the code on its own merits. Plan files live locally on the
contributor's machine, so cross-references are noise to the rest of
the project.

No behavior change.

* feat(agents): spawn openclaw ACP adapter inside the gateway container

When the harness resolves the `openclaw` adapter, it now returns a
command that runs `openclaw acp` inside the bundled gateway container
via `limactl shell <vm> -- nerdctl exec -i ... openclaw acp --url
ws://127.0.0.1:<port>`. This reuses the openclaw binary already
installed alongside the gateway — no host-side openclaw install is
required.

Auth: the gateway token is injected via OPENCLAW_GATEWAY_TOKEN on
the container exec rather than `--token` on the openclaw CLI, so
the secret never appears in `ps aux`.

Banner output: OPENCLAW_HIDE_BANNER=1 and OPENCLAW_SUPPRESS_NOTES=1
keep stdout JSON-RPC-clean.

LIMA_HOME: prefixed via `env LIMA_HOME=<path>` on the resolved
command so the spawned limactl finds the BrowserOS-owned VM (the
server doesn't set LIMA_HOME on its own process env).

When the gateway accessor is absent, falls through to acpx's
built-in openclaw adapter which assumes a host-side install — that
branch will fail at spawn time with a descriptive error.

Verified end-to-end via the existing acp-smoke script during the
Step 0 spike.

* feat(agents): dual-create OpenClaw harness agents on the gateway

When the harness creates an `openclaw` adapter agent, it now also
provisions a matching agent on the OpenClaw gateway via the existing
CLI path (OpenClawService.createAgent). Symmetric on delete: gateway
removeAgent runs alongside the harness-store delete.

- Adds an OpenClawProvisioner interface (decoupled from OpenClawService
  for testability) and injects it through AgentHarnessService.
- createAgent rolls back the harness record if gateway provisioning
  fails; deleteAgent tolerates gateway-side failures so harness
  identity stays consistent with the user-facing UI.
- New OpenClawProvisionerUnavailableError surfaces as a 503 when an
  openclaw create request lands on a harness with no provisioner
  wired in (instead of a generic 500).
- FileAgentStore mints openclaw agent ids with an 'oc-' prefix so
  the id satisfies the gateway's `^[a-z][a-z0-9-]*$` agent name
  pattern. Other adapters keep raw UUIDs to preserve compatibility.
- POST /agents body schema accepts providerType / providerName /
  baseUrl / apiKey / supportsImages, forwarded to the provisioner
  when adapter='openclaw'.

The agents-page UI still routes openclaw create through the legacy
/claw/agents flow; switching that path to the harness is a separate
UI cutover.

Tests cover: gateway dual-create on success, rollback on gateway
failure, 503 when provisioner is missing, and tolerant delete on
gateway-side failure.

* fix(agents): skip catalog model validation for OpenClaw adapter

OpenClaw agents resolve their model from the gateway-side provider
config (set at agent-create time via OpenClawService) rather than
from the harness catalog, which has an empty `models: []` entry by
design. Without this carve-out, every OpenClaw create body fails
parsing with "Invalid modelId" because no concrete model id can
satisfy isSupportedAgentModel('openclaw', ...).

The reasoning-effort check still runs against the catalog (those
values map directly to OpenClaw's session `thought_level` config
option).

* fix(agents): pass --session to openclaw bridge so newSession routes correctly

acpx's AcpClient.createSession calls connection.newSession with cwd
and mcpServers but never forwards the sessionKey. Without it, the
openclaw bridge falls back to a synthetic acp:<uuid> session that
doesn't resolve to any provisioned gateway agent — every harness
chat returns a generic "Internal error" from -32603.

Fix: bake `--session <key>` into the resolved spawn command. The
bridge then uses that as the default session key for any newSession
the bridge receives, routing the turn to the matching gateway agent.

Per-session keying means each openclaw agent gets its own
AcpxCoreRuntime instance (cached by sessionKey on top of the
existing cwd/permissionMode key). This adds one extra runtime per
active openclaw session — claude/codex are unaffected.

Test asserts the resolved command includes the right --session arg.

* fix(agents): suppress BrowserOS MCP for openclaw bridge

The openclaw ACP bridge rejects newSession when mcpServers is non-empty
because its provider tooling comes from the gateway, not from ACP-side
MCP servers. Forwarding the BrowserOS HTTP MCP made every harness chat
fail with a JSON-RPC -32603 "Internal error" before the session was even
opened. Claude/codex still need the BrowserOS MCP for browser tooling,
so the carve-out is keyed off whether the runtime is for an openclaw
session.

* feat(agents): route OpenClaw chat through the harness behind a flag

Adds the `feature.useAcpxForOpenClaw` extension storage flag. When on,
OpenClaw agents in the agent-command chat panel use the harness
/agents/<id>/chat SSE and harness history hook instead of the legacy
/claw/agents/<id>/chat. When off, behavior is unchanged.

Also dedupes the agent rail when the same id appears in both stores
(dual-created agents from /claw/agents and /agents) by preferring the
harness entry — without this, every dual-created OpenClaw agent shows
up twice after Step 5.

Image attachments are temporarily disabled when the harness path is
active; the carve-out lands in the next commit.

* fix(agents): keep legacy OpenClaw agents on ClawChat

The previous commit's flag-gated branch routed every `source='openclaw'`
agent through `/agents/<id>/chat` when the flag was on, but the layout
dedup means the only agents that ever reach that branch are legacy
gateway-only entries (`main`, orphan agents from rolled-back creates) —
which by definition have no harness record, so the harness path 404s
and chat is unusable. Source is the only routing signal again: harness
agents go through the harness, legacy agents stay on ClawChat. The
storage flag stays for Step 9/10's migration story.

* feat(agents): expose OpenClaw in sidepanel and route through gateway main

`buildSidepanelChatTargets` now emits a single default ACP target for
adapters with no per-session model picker (OpenClaw, whose model is
configured on the gateway-side agent). Without this, OpenClaw never
appeared in the sidepanel target picker because the catalog entry has
`models: []`.

Sidepanel sessions don't have a dedicated provisioned gateway agent.
The openclaw bridge `--session` flag previously got the raw sidepanel
key (`sidepanel:<convId>:openclaw:...`), which doesn't match any
gateway agent — newSession was accepted but every prompt hung
forever. The bridge command now rewrites non-harness session keys
onto the always-present `main` gateway agent, encoding the original
key as a channel suffix to keep state segregated per conversation.
Verified end-to-end via curl: sidepanel openclaw chat streams
`text-delta` + `finish: stop`.

* feat(agents): backfill harness records for legacy gateway agents

Reframes Step 9 of the OpenClaw-on-acpx migration. The plan's literal
Step 9 (route OpenClaw history through the harness when the flag is on)
was already a no-op after the Step 6 walkback — history is routed by
source today. The actual blocker for Steps 10–13 was that legacy
gateway-only agents (e.g. `main`, orphans from rolled-back creates) had
no harness record, so they could never migrate to the harness path
without breaking chat.

`AgentHarnessService.reconcileWithGateway()` now lists every gateway
agent and upserts a matching harness record for any that are missing.
The pass runs lazily on first `listAgents()` call (memoized on success,
retried on failure so a gateway-down boot doesn't permanently disable
backfill). Verified end-to-end: the legacy `agent` agent now streams
`text_delta` + `done(end_turn)` through `/agents/agent/chat`, with the
bridge resolving to the gateway's `agent` record via the existing
`agent:<name>:main` session-key format.

After this, every OpenClaw agent surfaces as `source='agent-harness'`
post-dedup, the legacy `useClawChatHistory` hook becomes unreachable
for OpenClaw, and Steps 11–13 (delete legacy chat/history paths) are
unblocked.

* fix(agents): drop duplicate OpenClaw entry from NewAgentDialog adapter list

The adapter Select hardcoded an `<SelectItem value="openclaw">OpenClaw</SelectItem>`
on top of iterating `adapters`, which now includes OpenClaw post the
catalog change. The dropdown rendered "OpenClaw" twice — once at the
top, once at the bottom of the list. The literal was a pre-catalog
artifact; removing it leaves a single OpenClaw entry sourced from the
catalog. Routing into `handleOpenClawCreate` is unchanged because
the value (`'openclaw'`) is identical either way.

* fix(agents): always reconcile harness with gateway on list, just dedupe concurrent calls

Memoizing the first successful reconcile meant new gateway agents (created
via the legacy /claw/agents path or out-of-band CLI) never appeared in the
harness until server restart. The Promise now serves as a concurrent-call
dedupe only — cleared on settle — so every listAgents call picks up the
current gateway state. Reconcile is one cheap idempotent CLI call.

* chore(agents): remove dormant useAcpxForOpenClaw flag

The flag was scaffolded in Step 6 but its routing effect was walked
back the same day after it broke chat for legacy gateway-only agents.
After the Step 9 backfill, every OpenClaw agent has a harness record
and routes through the harness path purely from `source='agent-harness'`
— no flag is consulted anywhere. Remove the dead storage item, hook,
and stale comment.

* refactor(agents): drop legacy /claw/agents/:id/history endpoint

The harness /agents/:id/sessions/main/history endpoint replaced this
once every OpenClaw agent got a harness record (Step 9 backfill).
Routing is fully source-driven now, so the UI's useClawChatHistory
hook is never enabled today — verified live: legacy URL returns 404,
harness history hydrates correctly for the same agent.

Removes the GET /claw/agents/:id/history route, OpenClawService's
getAgentHistoryPage method plus its cursor/limit helpers and the
history-only types it owned (BrowserOSOpenClawHistoryPageResponse,
HistoryPageInput, normalizeHistoryLimit, encodeHistoryCursor,
decodeHistoryCursor, jsonlEventsToHistoryItems), and the route +
service tests that covered the dropped endpoint.

OpenClawJsonlReader stays alive — still feeds /claw/dashboard,
/claw/agents/:id/sessions, and the boot-time clawSession seed.
Removing those is its own follow-up since the dashboard would need
a harness-side replacement first.

* feat(agents): wire image attachments through the harness ACP path

Composer attachments now flow into the ACP `prompt` request as
spec-compliant `image` content blocks alongside the user's text. End
to end:

  composer → chatWithHarnessAgent({attachments}) →
  POST /agents/:id/chat {message, attachments} →
  parseChatBody decodes data: URLs to {mediaType, base64} →
  AgentHarnessService.send forwards →
  AcpxRuntime.send forwards →
  acpx startTurn({attachments}) → ACP image blocks

UI no longer disables the attach button on harness agents — the
gating was just a placeholder before this commit landed. Verified
end to end with a 1×1 red PNG against a Claude harness agent: model
replies "Red." correctly.

OpenClaw's `acp` bridge still drops image content blocks upstream
(verified by the same probe — Kimi-k2p5 reports "I don't see an
image"). That's an upstream openclaw limitation, not a harness-side
gap; Claude/Codex agents work as advertised today.

* chore(openclaw): delete OpenClawJsonlReader and JSONL-backed routes

* chore(openclaw): remove legacy /claw/agents/:id/chat and /queue routes

* chore(agents): collapse chat panel to harness-only path

* feat(agents): route OpenClaw image turns through the gateway HTTP client

The OpenClaw `acp` bridge silently drops ACP `image` content blocks
(verified during dogfood — model says "I don't see an image"). When
the user attaches images to an OpenClaw agent, the harness now diverts
that turn to the gateway's HTTP `/v1/chat/completions` endpoint, which
accepts OpenAI-style `image_url` parts and forwards them natively to
the provider.

  - New `OpenClawGatewayChatClient` translates an OpenAI streaming
    response into the same `AgentStreamEvent` shape the rest of the
    harness already consumes, so the chat panel renders identically
    whether a turn went through ACP or the gateway carve-out.
  - `AcpxRuntime.send` forks at the top: openclaw + any image
    attachment + a wired gateway client → `sendOpenclawViaGateway`.
    Other turns (text-only openclaw, claude, codex) take the existing
    ACP path unchanged.
  - The diverted path reads the prior turn history from the acpx
    session record so context is preserved, builds the OpenAI
    multimodal user message with text + image_url parts, and pumps
    the gateway SSE back to the caller through a tee that accumulates
    the assistant text. On natural completion, persists a synthetic
    user+assistant message pair to the acpx session record so reload
    shows the image turn in history.
  - Wired `OpenClawGatewayChatClient` into `AgentHarnessService` via
    `server.ts` (gateway port + token accessor, just like the existing
    `openclawGateway`).

Persistence note: the acpx record requires User messages to carry an
`id` and Agent messages to carry `tool_results` — without them the
record fails to round-trip through `parseSessionRecord`. The persist
helper now sets both.

Limitation by design: image recognition only works if the OpenClaw
agent's provider supports vision (e.g. Claude-via-OpenClaw, GPT-4o).
The pipeline routes images correctly to the provider regardless;
text-only providers like Kimi-k2p5 will reply "I don't see an image"
because the model itself has no vision capability — that's a provider
config issue, not a routing bug. The unit test asserts the image_url
part is present in the OpenAI request the gateway client sends.

The wider plan (background-resilient chat, queue, replay) remains in
`plans/.../2026-04-29-1527-...-background-resilient-chat-and-image-uploads.md`
as Phases 3–12; this commit ships only Phases 1–2.

* feat(agents): validate inbound image attachments on /agents/:id/chat

The harness chat body parser was accepting any mediaType and any
dataUrl length. The composer enforces these caps client-side but the
endpoint also serves direct curl/script callers, so the server has to
defend itself.

Restores the same caps the legacy /claw/agents/:id/chat parser had
before it was deleted in the migration:

  - 10 attachments per message
  - 5 MB raw image bytes (≈ 6.7 MB once base64-encoded plus prefix)
  - PNG / JPEG / WebP / GIF only
  - Must start with `data:`

Each violation returns 400 with a specific error message instead of
silently dropping or forwarding the payload.
2026-04-29 16:37:03 +05:30
Nikhil
2ff5c12840 feat: add sidepanel ACP chat targets (#857)
* feat(agent): add sidepanel chat target catalog

* feat(agent): show acp models in sidepanel selector

* feat(server): adapt acp events to ui message streams

* feat(server): add sidepanel acp chat route

* feat(agent): route sidepanel chat through acp targets

* chore: self-review fixes

* fix: address review feedback for PR #857
2026-04-28 18:23:38 -07:00
Nikhil
d87422eea1 fix: hide BrowserOS ACP wrapper in history (#856) 2026-04-28 17:31:11 -07:00
Nikhil
cb32b8191d fix: show rich ACP harness history from ACPX (#852)
* fix: load ACP harness history from ACPX

* fix: address ACP history review comments
2026-04-28 16:40:22 -07:00
Nikhil
7a92654abc feat: add BrowserOS MCP to ACP agents (#851)
* feat: add BrowserOS MCP to ACP agents

* fix: bypass ACP agent permissions

* fix: address review feedback for PR #851
2026-04-28 16:30:20 -07:00
Nikhil
91d3285aa0 feat: add ACP agent harness (#849)
* feat: add acp agent runtime spike

* feat: add agent harness catalog

* feat: persist harness agents in json

* feat: persist agent transcripts

* feat: route harness service through agent records

* feat: expose generic agent harness routes

* feat: add harness agent frontend api

* feat: create harness agents from agents page

* feat: chat with persisted harness agents

* chore: remove obsolete agent profile spike

* chore: self-review fixes

* fix: combine openclaw and harness agents UI

* refactor: split agents page components

* fix: hide persisted harness turns
2026-04-28 15:29:38 -07:00
Neel Gupta
4284e88625 feat: Implement lazy LLM judge for passive monitoring (#777)
* fix: double close on stream controller

* feat: initial lazy llm judge impl

* feat: added regex-based matching to insert button context

* fix: tests & bugfix

fix: redundant truthiness check

* fix(tests): stabilize server suites on dev
2026-04-25 12:52:41 +01:00
Nikhil
d189b50b03 fix: package bundled Lima guest agent (#813)
* fix(build): upload Lima runtime files

* fix(build): stage Lima prefix resources

* fix(vm): resolve bundled Lima prefix

* docs(build): document Lima runtime packaging

* chore: self-review fixes

* fix: address review feedback for PR #813
2026-04-24 12:03:26 -07:00
Nikhil
a407e48209 Prefetch runtime VM cache (#811)
* feat: add runtime vm cache sync

* feat: configure runtime vm cache sync

* feat: prefetch vm cache on startup

* feat: await vm cache before vm startup

* fix: recheck vm cache after prefetch wait

* fix: address vm cache review feedback

* build(server): require VM cache manifest env
2026-04-24 10:41:20 -07:00
Nikhil
c6c902a4ab feat: improve dev watch Lima preflights (#802)
* feat: improve dev watch lima preflights

* fix: note vm cache sync duration

* fix: address review feedback for PR #802
2026-04-23 17:16:50 -07:00
Nikhil
0288cc040d feat: use rootless nerdctl in BrowserOS VM (#800)
* feat: use rootless nerdctl in BrowserOS VM

* fix: validate openclaw gateway auth before reuse

* fix: forward rootless containerd socket

* fix: address VM review comments
2026-04-23 16:36:51 -07:00
Nikhil
d1a3d67e29 chore(dev): add VM cache setup flow (#798) 2026-04-23 15:47:00 -07:00
Nikhil
35134518f0 fix(vm): use system nerdctl in Lima runtime (#797) 2026-04-23 15:47:00 -07:00
Nikhil Sonti
4083155e81 feat(container): migrate container runtime to nerdctl over Lima VM
Replace the podman-based runtime with nerdctl running inside the Lima
VM introduced in the previous commit. OpenClaw is cut over to the new
VM-backed container runtime; legacy podman code paths are removed.

- New container CLI (lib/container): nerdctl ContainerCli, ImageLoader
  with cache-tarball fallback, shared types
- OpenClaw: container-runtime-factory orchestrates VM lifecycle + gateway
  startup; container-runtime.ts rewritten to speak nerdctl; Linux test
  startup kept disabled behind the factory
- Terminal: session + routes moved onto Lima shell transport; server
  wires the VM-backed runtime via main.ts
- Agent UI: simplify AgentsPage/useOpenClaw after route consolidation
- Remove podman-runtime, podman-overrides, and their tests
- Tests: container-cli, image-loader, container-runtime-factory, and
  updated openclaw/terminal/main suites
2026-04-23 15:46:50 -07:00
Nikhil Sonti
72ef4f068e feat(vm): add Lima-based BrowserOS VM runtime
Introduce a new VM runtime layer using Lima for running containerised
workloads on macOS. Lifecycle covers decompress/create/start/stop with
stubs for upgrade/reset plus version-mismatch warnings.

- Foundation modules: paths, errors, manifest, telemetry
- lima.yaml generator + typed limactl wrapper with structured debug logging
- ssh ControlMaster transport for fast in-VM commands
- Ubuntu 24.04 minimal template, containerd default, 30GiB overlay disk
- browseros-dir helpers (getLimaHomeDir, getVmStateDir, getVmDisksDir);
  OpenClaw dir moves into VM state dir
- Test helpers (fake-limactl, fake-ssh, test-env), vm-smoke integration
  coverage, NODE_ENV propagation for spawned server test groups
2026-04-23 15:46:25 -07:00