BrowserOS

LLM/BrowserOS

mirror of https://github.com/browseros-ai/BrowserOS.git synced 2026-05-21 21:05:09 +00:00

Commit Graph

Author SHA1 Message Date

Dani Akash

4e405681a7

feat(container): richen ManagedContainer — isImageCurrent + logs + sibling-exec (#968)

* feat(container): add isImageCurrent + getLogs + tailLogs + runOneShot to ManagedContainer

Four base-class additions ahead of the OpenClaw runtime migration so
the upcoming subclass doesn't have to re-implement them:

- isImageCurrent() — pure predicate comparing the existing container's
  image ref to descriptor.defaultImage. Treats SHA-pinned variants as
  matches. start() is unchanged; subclasses + service layers compose
  the predicate where they want short-circuit behaviour.
- getLogs(tail) and tailLogs(onLine) — generic log primitives, thin
  pass-throughs to ContainerCli.
- runOneShot(argv, opts) — sibling-container helper that spawns a
  <name>-setup container with the same image+mounts+env (no ports/
  health/restart), runs argv, force-removes after. Includes the
  retry-on-name-collision behaviour previously bespoke to OpenClaw.

Hermes inherits unused surface only — no behavioural change. The
in-flight base-class tests cover all four primitives.
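
The commit does not show how `isImageCurrent()` compares refs. As a rough sketch of one way a predicate could treat SHA-pinned variants as matches, the helpers below (`stripDigest` and `imageRefsMatch`, both hypothetical names) drop an optional `@sha256:…` suffix before comparing. The image names are made up for illustration.

```typescript
// Hypothetical helper, not from the commit: removes an optional
// "@sha256:<digest>" suffix so a pinned ref compares equal to its tag.
function stripDigest(ref: string): string {
  const at = ref.indexOf("@sha256:");
  return at === -1 ? ref : ref.slice(0, at);
}

// Pure predicate: true when the existing container's image ref and the
// descriptor's default image agree once digest pins are ignored.
function imageRefsMatch(existingRef: string, defaultImage: string): boolean {
  return stripDigest(existingRef) === stripDigest(defaultImage);
}
```

Because the predicate is pure, a service layer could call it before `start()` to decide whether to recreate the container, without the predicate itself touching the CLI.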

* fix(container): tighten getLogs error path + close runOneShot timeout-onLog leak; trim docstrings

- getLogs now distinguishes a missing container (returns []) from
  other CLI failures (throws). Previously nerdctl's stderr ("Error:
  no such container: …") leaked into the lines array as if it were
  log output. isNoSuchContainer is exported from container-cli to
  share the predicate.
- runWithOptionalTimeout wraps the caller's onLog so post-timeout
  lines from the abandoned runCommand promise become no-ops; before
  this, callers could see onLog fire after runOneShot had already
  rejected, hitting state the caller may have torn down on the
  timeout error.
- Tightens the new docstrings to one short line per the project
  convention; drops a restating comment in the test file.
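
The timeout-onLog fix described above can be sketched roughly as follows. This is an illustration of the idea, not the project's `runWithOptionalTimeout`: the names `makeGuardedOnLog` and `runWithTimeout` are hypothetical, and the real helper's signature is not shown in the commit.

```typescript
type OnLog = (line: string) => void;

// Wraps a log callback so it can be switched off once the caller gives up.
function makeGuardedOnLog(onLog: OnLog): { guarded: OnLog; disarm: () => void } {
  let active = true;
  return {
    guarded: (line) => {
      if (active) onLog(line);
    },
    disarm: () => {
      active = false;
    },
  };
}

// Runs `run` with a deadline. On timeout the guard is disarmed first, so
// lines emitted later by the abandoned promise never reach the caller.
function runWithTimeout<T>(
  run: (onLog: OnLog) => Promise<T>,
  onLog: OnLog,
  timeoutMs: number,
): Promise<T> {
  const { guarded, disarm } = makeGuardedOnLog(onLog);
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      disarm();
      reject(new Error(`timed out after ${timeoutMs}ms`));
    }, timeoutMs);
    run(guarded).then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

The underlying work is not cancelled in this sketch; it only guarantees that the caller's callback stays silent after the rejection.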

2026-05-08 15:58:05 +05:30

Dani Akash

d68e8905fe

refactor(hermes): migrate Hermes onto ContainerAgentRuntime (#965)

* feat(runtime): add HermesContainerRuntime + factory

* refactor(hermes): switch wire-up + dispatch to runtime registry

main.ts and the agent route stack now resolve Hermes through
`AgentRuntimeRegistry`. Drops the `hermesGateway` plumbing chain
(server.ts → routes → harness → AcpxRuntime), the
`HermesGatewayAccessor` interface, and `resolveHermesAcpCommand`.
Removes `HermesContainerService`, `HermesContainer`, and
`prepareHermesContext`'s standalone module — their behaviour is now
owned by `HermesContainerRuntime`.

* test(runtime): cover HermesContainerRuntime descriptor + lifecycle + factory

* test(runtime): move registry reset to afterEach to survive assertion failures

2026-05-08 11:32:19 +05:30

Dani Akash

805ae8e607

feat(server): ManagedContainer abstraction — Hermes readiness gate + ACP layering fix (#962)

* feat(container): add waitForContainerRunning primitive + typed error

Adds `ContainerCli.waitForContainerRunning(name, opts)` polling
`inspectContainer().running === true` until either the container
reports running or the timeout expires. Distinct from the existing
`waitForContainerNameRelease` (which waits for *deletion*).

Used by the upcoming managed-container layer between
`nerdctl create + start` and "container is ready for exec" so the
harness never spawns a turn against a half-started container —
which is the root cause of the silent first-turn failure on Hermes
today (`hermes-container.ts:130-160` returns immediately after
start).

Defaults sized for cold-start: 30s budget at 500ms cadence.
Throws `ContainerNotRunningError` (new, in `lib/vm/errors.ts`) on
timeout — distinct from `ContainerNameReleaseTimeoutError` so
callers can branch on "didn't come up" vs "didn't get cleaned up".
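
A minimal sketch of the polling primitive described above, assuming an injected `inspect` function in place of the real `ContainerCli.inspectContainer()`. The option names `timeoutMs` and `intervalMs` are assumptions; only the 30s/500ms defaults and the error name come from the commit message.

```typescript
// Stand-in for the typed error named in the commit message.
class ContainerNotRunningError extends Error {
  constructor(name: string, timeoutMs: number) {
    super(`container ${name} not running after ${timeoutMs}ms`);
    this.name = "ContainerNotRunningError";
  }
}

// Option names are hypothetical; defaults follow the commit message.
interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

// Polls `inspect` until it reports running, or throws once the budget is spent.
async function waitForContainerRunning(
  inspect: (name: string) => Promise<{ running: boolean }>,
  name: string,
  opts: WaitOptions = {},
): Promise<void> {
  const timeoutMs = opts.timeoutMs ?? 30000;
  const intervalMs = opts.intervalMs ?? 500;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if ((await inspect(name)).running) return;
    if (Date.now() >= deadline) throw new ContainerNotRunningError(name, timeoutMs);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Checking the deadline after each inspect call means the function always probes at least once, even with a zero budget.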

* feat(container): add ManagedContainer abstract base + state machine

Introduces the abstract base every container-backed agent adapter
will subclass. Owns the canonical state machine (not_installed |
installing | installed | starting | running | stopped | errored),
the lifecycle lock (per-process promise chain + cross-process file
lock), the gated `execute*` family, and the host↔container path
translator.
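
The per-process half of the lifecycle lock is described as a promise chain. One common way to build such a chain is sketched below; the class name `LifecycleLock` and its `run` method are hypothetical, and the cross-process file lock is not modelled here.

```typescript
// Serialises async tasks within one process by chaining each new task
// onto the tail of the previous one.
class LifecycleLock {
  // The tail never rejects, so a failed task cannot block later ones.
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => task());
    // Swallow the failure on the chain only; the caller still sees it via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

With this shape, concurrent `start()` and `stop()` calls would run strictly one after another in call order, which is the property a lifecycle lock needs.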

Subclasses provide only what's actually adapter-specific:
- `descriptor` (image, container name, supported platforms)
- `buildContainerSpec()` for the `nerdctl create` args
- `readinessProbe()` after the container reaches running
- `mountRoots()` for the path translator

Three execute methods, all sharing one invariant — every entry
point gates on state == running:

- `execProcess(spec)` spawns a long-lived child process via Bun,
  waits through `starting` up to 60s, throws typed
  `ContainerNotReadyError` if the container is not_installed /
  stopped / errored / timed out.
- `execOneShot(spec)` is a buffered convenience wrapper.
- `buildExecArgv(spec)` is the pure builder for callers (acpx-core)
  that need a shell-command string. Single source of truth for the
  `env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` chain
  that today's ACP runtime hand-rolls in two places
  (`acpx-runtime.ts:780-820` and `:823-870`).
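
The gating invariant shared by the execute methods might look roughly like the sketch below. It polls a state getter for simplicity, whereas the real class may subscribe to state changes instead. The function name `awaitRunning` and the `pollMs` parameter are assumptions, and rejecting immediately from `installing` / `installed` is a guess, since the commit only names not_installed, stopped, errored, and timeout.

```typescript
type ContainerState =
  | "not_installed"
  | "installing"
  | "installed"
  | "starting"
  | "running"
  | "stopped"
  | "errored";

// Stand-in for the typed error named in the commit message.
class ContainerNotReadyError extends Error {
  constructor(readonly reason: string) {
    super(`container not ready: ${reason}`);
    this.name = "ContainerNotReadyError";
  }
}

// Resolves once the state is `running`; waits through `starting` up to the
// budget; rejects right away from any other state (assumption, see lead-in).
async function awaitRunning(
  getState: () => ContainerState,
  timeoutMs = 60000,
  pollMs = 50,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = getState();
    if (state === "running") return;
    if (state !== "starting") throw new ContainerNotReadyError(state);
    if (Date.now() >= deadline) throw new ContainerNotReadyError("timeout");
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
}
```

An `execProcess`-style method would call a gate like this before spawning anything, so no entry point can reach a half-started container.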

`reset(level)` is on the API surface but throws
`ResetNotSupportedError` so the next PR can wire soft / wipe-agent
/ hard without revving the abstract class.

Path translator uses lexical containment against declared mount
roots; the realpath-based symlink-escape check lives one layer up
(in the file-attribution code that already shipped) since the
translator itself never reads from disk.
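
Lexical containment against mount roots can be sketched as below, assuming absolute POSIX paths. The `MountRoot` shape, the `normalize` helper, and the `/host/harness` root are hypothetical; only the error name and the `/data/agents/harness` target come from the commit message. As the text says, nothing here reads from disk, so symlink escapes are out of scope for this layer.

```typescript
// Hypothetical shape for a declared mount root.
interface MountRoot {
  hostPath: string;
  containerPath: string;
}

// Stand-in for the typed error named in the commit message.
class PathOutsideMountsError extends Error {
  constructor(path: string) {
    super(`path is outside declared mounts: ${path}`);
    this.name = "PathOutsideMountsError";
  }
}

// Resolves "." and ".." segments purely lexically (absolute POSIX paths only).
function normalize(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Maps a host path to its container path, or throws if no root contains it.
function toContainerPath(hostPath: string, roots: MountRoot[]): string {
  const normalized = normalize(hostPath);
  for (const root of roots) {
    if (normalized === root.hostPath) return root.containerPath;
    if (normalized.startsWith(root.hostPath + "/")) {
      return root.containerPath + normalized.slice(root.hostPath.length);
    }
  }
  throw new PathOutsideMountsError(hostPath);
}
```

Comparing against `root.hostPath + "/"` rather than the bare prefix keeps a sibling such as `/host/harness-old` from being treated as inside `/host/harness`.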

* feat(container): HermesContainer subclass + wrapper-service bridge

`HermesContainer` (lib/container/managed/) is the first concrete
adapter on the new `ManagedContainer` base. Provides the four bits
that are actually adapter-specific:

- `descriptor`: image, container name, supported platforms,
  readiness-probe tuning.
- `mountRoots()`: host↔container path mapping for the harness dir.
- `buildContainerSpec()`: nerdctl create args (env, mounts,
  add-hosts, entrypoint override).
- `readinessProbe()`: execs `hermes --version` inside the
  freshly-started container; bypasses the state gate via
  `cli.exec` since we're in `starting`, not `running`, when the
  probe runs.

`HermesContainerService` (api/services/hermes/) is rewritten as a
thin wrapper that delegates `prewarm` / `start` / `stop` /
`restart` / `shutdown` to the underlying `HermesContainer`. Public
surface is preserved so `main.ts`, `server.ts`, and
`agent-harness-service` compile unchanged in this PR; `getAccessor()`
still returns the structural `HermesAccessor` the ACP runtime
expects today (the runtime swap is the next commit). The wrapper
also exposes `getContainer(): HermesContainer | null` for callers
that want the richer surface.

The user-visible bug — Hermes silent first-turn failure — is fixed
as a side effect: `start()` now waits through
`cli.waitForContainerRunning` and runs the `hermes --version`
readiness probe before transitioning to `running`. Subsequent
chat turns are gated on the container actually being ready, not
just on `nerdctl create + start` having returned.

* feat(agent): ACP runtime spawns Hermes via ManagedContainer.buildExecArgv

`resolveHermesAcpCommand` no longer hand-rolls the
`env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` chain.
It now delegates to `gateway.buildExecArgv`, which the wrapper
service routes to the underlying `ManagedContainer.buildExecArgv`.

The structural `HermesGatewayAccessor` type gains one method
(`buildExecArgv`) — keeps the existing four getters so any
test/legacy caller still works. The wrapper's `getAccessor()`
delegates `buildExecArgv` to its `HermesContainer`. Net effect:
the `limactl shell ... -- nerdctl exec ...` argv chain has
exactly one owner (`ManagedContainer.buildExecArgv` in the
container layer) instead of being duplicated across `acpx-runtime`
and the now-deleted hand-built chain.

The OpenClaw branch (`resolveOpenclawAcpCommand`) is untouched —
its migration to ManagedContainer is a separate, larger PR that
also has to model the gateway / control-plane surfaces.

Tests: the existing acpx-runtime test suite expected the four
old getters; updated the Hermes-container fixture to also
provide `buildExecArgv` (mirrors the production builder inline so
the test stays independent of the production class wiring). All
320 server tests pass.

* test(container): managed-container + hermes-container coverage

20 cases across two files in `tests/lib/container/managed/`.

ManagedContainer base (14 cases):
- State machine: start() walks installing → starting → running;
  probe-false lands errored with lastError populated; stop()
  force-transitions to stopped even from errored.
- execProcess gating: rejects ContainerNotReadyError with
  reason='not_installed' when never started; reason='errored'
  when in errored state (preserving lastError); resolves once
  state flips to running while waiting; reason='timeout' when
  starting never resolves.
- buildExecArgv: snapshot test pinning the exact canonical
  `env LIMA_HOME=… limactl shell <vm> -- nerdctl exec -i …` string
  for the Hermes-shaped invocation; -e flags omitted when env is
  empty.
- reset(level): throws ResetNotSupportedError for all three
  levels (Phase 1 stub).
- Path translation: round-trip host ↔ container under a declared
  mount; mount-root itself translates without suffix; rejects
  PathOutsideMountsError for /etc/passwd / /proc/cpuinfo.
- subscribeState fires every transition, stops after unsubscribe.

HermesContainer subclass (6 cases):
- Descriptor declares adapterId='hermes', the canonical container
  name, image, and darwin platform support.
- start() happy path reaches running + invokes the
  `hermes --version` probe via cli.exec.
- Probe-non-zero start() lands errored with the right error.
- ContainerSpec built with idle entrypoint, harness bind-mount
  (source = /mnt/browseros/vm/hermes/harness, target =
  HERMES_CONTAINER_HARNESS_DIR), and host.containers.internal
  add-host pointing at the VM gateway.
- toContainerPath maps host harness paths to /data/agents/harness.
- buildExecArgv produces the canonical Hermes ACP spawn string
  with LIMA_HOME, container name, hermes binary path, and -e env.

Pre-existing test in tests/lib/container/container-cli.test.ts
(`waits until a container name is no longer resolvable`) flakes
under parallel test load on dev; passes solo. Last touched in
fd5aba24, well before this branch.

* chore: tidy comments

* fix(hermes): use provider:custom for openai + openai-compatible

Hermes (v2026.4.x) does not have a provider key called "openai" —
its `PROVIDER_REGISTRY` enumerates 33 named providers (anthropic,
deepseek, gemini, kimi-coding, etc.) and "openai" is not one of
them. Per the upstream docs, the canonical shape for any
OpenAI-compatible endpoint with an API key is:

    model:
      provider: custom
      base_url: "<endpoint>"

When `base_url` is set, Hermes ignores provider lookup and calls
the URL directly using OPENAI_API_KEY (or the configured api_key).
Today's mapping wrote `provider: "openai"` for both BrowserOS
provider types — Hermes' main-model loader rejected that with
`unknown provider 'openai'`, and the harness surfaced an opaque
"Internal error" on every first chat for any Hermes agent backed
by a Fireworks / Together / Groq / OpenAI provider.

Fix:
- `openai` and `openai-compatible` BrowserOS types now both map
  to `hermesProvider: 'custom'`.
- HermesProviderMapping gains an optional `defaultBaseUrl` field
  used when `provider: 'custom'` is set with no caller-supplied
  baseUrl (BrowserOS' `openai` type doesn't require base_url at
  the API edge, but Hermes' `custom` always does — so we fall
  back to https://api.openai.com/v1).
- writeHermesPerAgentProvider rejects `provider: 'custom'` with
  no base_url so a future regression fails loudly instead of
  silently writing an unusable config.yaml.
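
The mapping rules listed in the fix can be illustrated with a small sketch. The `HermesProviderMapping` field names follow the commit message, but the `MAPPINGS` table, the `resolveHermesModelConfig` function, and the returned shape are hypothetical stand-ins for the real mapping and `writeHermesPerAgentProvider`.

```typescript
interface HermesProviderMapping {
  hermesProvider: string;
  defaultBaseUrl?: string;
}

// Both BrowserOS types map to Hermes' `custom` provider; only `openai`
// carries a fallback base URL.
const MAPPINGS: Record<string, HermesProviderMapping> = {
  openai: { hermesProvider: "custom", defaultBaseUrl: "https://api.openai.com/v1" },
  "openai-compatible": { hermesProvider: "custom" },
};

// Resolves the `model:` section values, failing loudly when `custom`
// would otherwise be written without a base_url.
function resolveHermesModelConfig(
  browserosType: string,
  baseUrl?: string,
): { provider: string; base_url?: string } {
  const mapping = MAPPINGS[browserosType];
  if (!mapping) throw new Error(`unmapped provider type: ${browserosType}`);
  const resolved = baseUrl ?? mapping.defaultBaseUrl;
  if (mapping.hermesProvider === "custom" && !resolved) {
    throw new Error("provider 'custom' requires base_url");
  }
  return resolved
    ? { provider: mapping.hermesProvider, base_url: resolved }
    : { provider: mapping.hermesProvider };
}
```

For example, `resolveHermesModelConfig("openai")` falls back to the default OpenAI URL, while `resolveHermesModelConfig("openai-compatible")` with no base URL throws instead of producing an unusable config.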

Tests updated: the existing openai-compatible case now asserts
`provider: "custom"` instead of `"openai"`, plus a new case
covering the openai-default-base-url fallback path.

Note: the `openrouter` mapping is left untouched because its
fix is unverified (Hermes' PROVIDER_REGISTRY doesn't appear to
contain "openrouter" either, but the auxiliary fallback chain
recognises it). Worth a separate follow-up — out of scope for
this fix which targets the user-reported reproduction.

* fix(container): install() must ensure VM is ready before image pull

Image operations run inside the Lima VM, so `nerdctl pull` fails
on a cold-boot run if the VM hasn't been started yet.
`HermesContainerService.prewarm()` (the original wrapper) always
called `vm.ensureReady()` before `ensureImageLoaded()` — the
wrapper-bridge introduced earlier in this PR delegated `prewarm()`
to `container.install()` and dropped the VM-ensure step.

`start()` does ensure VM, but on cold boot `prewarm()` and
`start()` race for the lifecycle lock and there is no guarantee
which one wins. When `prewarm()` lands first, the image pull
crashes against an unstarted VM and Hermes never comes up.

Fix: `install()` now awaits `deps.vm.ensureReady()` before
transitioning to `installing`. Errors land in `errored` exactly
as before. New regression test pins the call order
(`vm.ensureReady` → `loader.ensureImageLoaded`) so a future edit
can't silently re-introduce the gap.
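
The call order that the regression test pins can be sketched as follows. The `deps.vm.ensureReady` and `loader.ensureImageLoaded` names come from the commit message; the `InstallDeps` interface, the `setState` callback, and the standalone `install` function are simplifications assumed for illustration.

```typescript
// Hypothetical dependency shape, reduced to the two calls that matter here.
interface InstallDeps {
  vm: { ensureReady(): Promise<void> };
  loader: { ensureImageLoaded(): Promise<void> };
}

type InstallState = "installing" | "installed" | "errored";

// Ensures the VM is up before entering `installing` and pulling the image.
// Any failure, including a VM failure, lands in `errored` and is rethrown.
async function install(deps: InstallDeps, setState: (s: InstallState) => void): Promise<void> {
  try {
    await deps.vm.ensureReady();
    setState("installing");
    await deps.loader.ensureImageLoaded();
    setState("installed");
  } catch (err) {
    setState("errored");
    throw err;
  }
}
```

A test can then record each call into an array and assert that the VM check comes first, which is the shape of check the commit message describes.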

2026-05-08 08:14:45 +05:30

3 Commits