fix(dev): address watch lock review comments

fix(dev): use run lock for watch cleanup
2026-05-14 08:03:58 +00:00 · 2026-04-30 11:44:07 -07:00 · 2026-04-30 11:29:00 -07:00
275 changed files with 2941 additions and 22599 deletions
--- a/.claude/skills/ask-internal/SKILL.md
+++ b/.claude/skills/ask-internal/SKILL.md
@@ -1,152 +0,0 @@
----
-name: ask-internal
-description: Answer questions about BrowserOS internal stuff (setup, features, architecture, design decisions) by reading the private internal-docs submodule and the codebase. Use for "how do I X", "where is Y", "what is the deal with Z", or any question that mixes ops/setup knowledge with code knowledge. Can execute steps with per-command confirmation.
-allowed-tools: Bash, Read, Grep, Glob, Edit, Write
----
-
-# Ask Internal
-
-Answer team-internal questions by reading `.internal-docs/` and the codebase, synthesizing a direct answer with file:line citations, and optionally running surfaced commands with confirmation.
-
-**Announce at start:** "I'm using the ask-internal skill to answer this from internal-docs and the codebase."
-
-## When to use
-
-- "How do I reset my dogfood profile?"
-- "What's the deal with the OpenClaw VM startup?"
-- "Where do we configure release signing?"
-- Any question whose answer lives in setup runbooks, feature notes, architecture docs, or the code that produced them.
-
-## Hard rules — never do these
-
-- NEVER execute a state-mutating command without per-command `y` confirmation from the user.
-- NEVER edit BrowserOS code in response to an ask-internal question. The skill answers; it does not modify code. Use `/document-internal` for writes.
-- NEVER guess. If grep finds nothing useful in docs or code, say so plainly.
-- NEVER run this skill if `.internal-docs/` is missing. Stop with the init command.
-- NEVER cite a file or line number you have not actually read.
-
-## Voice rules
-
-Apply the same voice rules as `document-internal` to the synthesized answer:
-
-- Lead with the point.
-- Concrete nouns. Name files, functions, commands.
-- Short sentences. Active voice. No em dashes.
-- Banned words: delve, crucial, robust, comprehensive, nuanced, multifaceted, furthermore, moreover, additionally, pivotal, landscape, tapestry, underscore, foster, showcase, intricate, vibrant, fundamental, significant, leverage, utilize.
-- No filler intros.
-
-## Workflow
-
-### Step 0: Pre-flight
-
-```bash
-if git submodule status .internal-docs 2>/dev/null | grep -q '^-'; then
-  echo "internal-docs submodule not initialized. Run: git submodule update --init .internal-docs"
-  exit 0
-fi
-[ -d .internal-docs ] && [ -n "$(ls -A .internal-docs 2>/dev/null)" ] || {
-  echo ".internal-docs/ missing or empty. Submodule not configured?"
-  exit 0
-}
-```
-
-### Step 1: Parse the question
-
-Pull the keywords from the user's question. Drop stop words. Identify intent:
-
-- **Setup-question** ("how do I", "how to", "where do I configure"): bias the search toward `setup/`.
-- **Feature-question** ("what is X", "why does X work this way"): bias toward `features/` and `architecture/`.
-- **Free-form** ("anything about Y"): search all categories.
-
-### Step 2: Multi-source search
-
-Run grep in parallel across two sources.
-
-**Internal docs:**
-
-```bash
-grep -rni --include='*.md' '<keyword>' .internal-docs/
-```
-
-Search each keyword separately. Collect top hits by relevance (more keyword matches = higher).
-
-**Codebase (skip vendored Chromium and `node_modules`):**
-
-```bash
-grep -rni --include='*.ts' --include='*.tsx' --include='*.js' --include='*.json' --include='*.sh' \
-     --exclude-dir=node_modules --exclude-dir=chromium --exclude-dir=.grove \
-     '<keyword>' packages/ scripts/ .config/ .github/
-```
-
-Read the top 3-5 doc hits and top 3-5 code hits. Do not skim — read the relevant section fully so citations are accurate.
-
-### Step 3: Synthesize answer
-
-Structure the response:
-
-1. **Direct answer.** First sentence answers the question. No preamble.
-2. **Steps if applicable.** Numbered list with exact commands.
-3. **Citations.** Every factual claim references `path/to/file.md:42` or `path/to/code.ts:117`. Run the voice self-check before printing.
-
-If multiple docs cover the topic at different layers (e.g., a setup runbook and a feature note both mention dogfood profiles), reconcile them in the answer rather than dumping both.
-
-### Step 4: Offer execution (only if commands surfaced)
-
-If Step 3 produced executable commands the user could run, ask:
-
-> Run these for you? (y / n / dry-run)
-
-- **y:** Execute one at a time. For any command that mutates state (writes a file, modifies config, kills a process, deletes anything), ask "run this? <command>" before each. Read-only commands (`ls`, `cat`, `git status`) run without per-command confirmation but still print before running.
-- **n:** Skip. Done.
-- **dry-run:** Print the full sequence as a `bash` block. Do not execute.
-
-### Step 5: Doc-not-found path
-
-If Step 2 returned nothing useful (no doc hits AND no clear code answer):
-
-1. Tell the user: "No doc covers this. Tangentially relevant files: <list>."
-2. Ask: "Draft a new doc and open a PR to internal-docs?"
-3. On yes: invoke the full `/document-internal` flow (four sharp questions, draft, voice check, PR), forced to `setup/` doc type, with the code-grep findings handed in as initial context.
-
-### Step 6: Completion status
-
-Report one of:
-
-- **DONE** — answer delivered, citations verified.
-- **DONE_WITH_CONCERNS** — answered, but flag uncertainty (e.g., docs and code disagreed; user should reconcile).
-- **BLOCKED** — submodule missing or other pre-flight failure.
-- **NEEDS_CONTEXT** — question too vague to search effectively. Ask one clarifying question.
-
-## Citation discipline
-
-Every "X is at Y" claim in the answer must point to a file:line that the skill actually read. Do not approximate. If you didn't read it, don't cite it.
-
-If a doc says one thing and the code says another, surface the conflict explicitly:
-
-> The setup runbook (`setup/dogfood-profile.md:23`) says to delete `~/.cache/browseros/dogfood`, but the actual code path in `packages/cli/src/cleanup.ts:47` removes `~/.local/share/browseros/dogfood`. The doc looks stale. Recommend updating it.
-
-## Common Mistakes
-
-**Skimming and then citing**
-- **Problem:** Citation points to a line that doesn't actually contain the claim.
-- **Fix:** Read the section fully before citing. If you didn't read line 117, don't cite line 117.
-
-**Executing without per-command confirmation for mutations**
-- **Problem:** User says "y" to "run all", skill blasts through `rm -rf`-style commands.
-- **Fix:** "y" means "run this sequence with per-mutation confirmations". Per-command y is required for writes.
-
-**Searching only docs, not code**
-- **Problem:** Doc says X but code does Y; answer is wrong.
-- **Fix:** Always grep both sources in Step 2.
-
-## Red Flags
-
-**Never:**
-- Cite a file:line you haven't read.
-- Run mutations without per-command confirmation.
-- Modify BrowserOS code from this skill (use `/document-internal` for writes).
-
-**Always:**
-- Pre-flight check before any search.
-- Reconcile doc vs code conflicts in the answer, don't hide them.
-- Plain "no doc covers this" when grep is empty — never invent.
--- a/.claude/skills/document-internal/SKILL.md
+++ b/.claude/skills/document-internal/SKILL.md
@@ -1,208 +0,0 @@
----
-name: document-internal
-description: Draft a 1-page internal doc (feature, architecture, or design) for the private browseros-ai/internal-docs repo. Use when wrapping up a feature on a branch, after the PR is open or about to be opened. Skill drafts from the diff, asks four sharp questions, enforces voice rules, and opens a PR to internal-docs.
-allowed-tools: Bash, Read, Write, Edit, Grep, Glob
----
-
-# Document Internal
-
-Draft a 1-page internal doc (feature note, architecture note, or design spec) from the current branch's diff and open a PR to `browseros-ai/internal-docs`.
-
-**Announce at start:** "I'm using the document-internal skill to draft a doc for internal-docs."
-
-## When to use
-
-After finishing implementation on a feature branch, when the work is doc-worthy (a major feature, a new subsystem, a setup runbook for something internal, or a design decision that future engineers need to know).
-
-## Hard rules — never do these
-
-- NEVER `git add -A` or `git add .` inside the tmp clone of internal-docs. Always specific paths.
-- NEVER write outside the tmp clone (no spillover into the OSS repo's working tree).
-- NEVER fabricate filler content for empty template sections. Empty stays empty.
-- NEVER touch the OSS repo's `.gitmodules` or submodule pointer — the sync workflow handles that.
-- NEVER run this skill if `.internal-docs/` is missing. Stop with the init command.
-- NEVER push to `internal-docs/main` directly. Always a feature branch + PR.
-
-## Voice rules — enforced by Step 4
-
-The skill MUST follow these and refuse to draft otherwise. After generation, scan for violations and regenerate offending sentences (max 3 attempts).
-
-- Lead with the point. First sentence answers "what is this?"
-- Concrete nouns. Name files, functions, commands. Not "the system" or "the component".
-- Short sentences. Average <20 words. No deeply nested clauses.
-- Active voice. "X does Y" not "Y is done by X".
-- No em dashes. Use commas, periods, or rephrase.
-- Banned words: delve, crucial, robust, comprehensive, nuanced, multifaceted, furthermore, moreover, additionally, pivotal, landscape, tapestry, underscore, foster, showcase, intricate, vibrant, fundamental, significant, leverage, utilize.
-- "110 IQ" target. Write for a smart engineer who has not seen this code yet.
-- No filler intros ("This document describes..."). Start with the substance.
-- Empty sections stay empty. Do not write "N/A" or fabricate content.
-
-## Workflow
-
-### Step 0: Pre-flight
-
-Bail with a clear message on any failure.
-
-```bash
-# Submodule must be initialized
-if git submodule status .internal-docs 2>/dev/null | grep -q '^-'; then
-  echo "internal-docs submodule not initialized. Run: git submodule update --init .internal-docs"
-  exit 0
-fi
-[ -d .internal-docs ] || { echo ".internal-docs/ missing. Submodule not configured?"; exit 0; }
-
-# Must be on a feature branch
-BRANCH=$(git branch --show-current)
-if [ "$BRANCH" = "main" ] || [ "$BRANCH" = "dev" ]; then
-  echo "On $BRANCH. Run from a feature branch."
-  exit 0
-fi
-
-# Determine base branch (default: dev for this repo, fall back to main).
-# Suppress rev-parse's SHA output on stdout so it doesn't get captured into BASE.
-BASE=$(git rev-parse --verify origin/dev >/dev/null 2>&1 && echo dev || echo main)
-
-# Gather context
-git log "$BASE..HEAD" --oneline
-git diff "$BASE...HEAD" --stat
-gh pr view --json body -q .body 2>/dev/null  # may be empty if no PR yet
-```
-
-### Step 1: Identify the doc
-
-Ask the user for three things in one prompt:
-
-1. **Doc type:** `feature` (default for `feat/*` branches), `architecture`, or `design`
-2. **Slug:** kebab-case, short (e.g., `cowork-mcp`, `auto-skill-suggest`)
-3. **Owner:** GitHub handle (default = `git config user.name` or current `gh api user --jq .login`)
-
-### Step 2: Decision brief — four sharp questions
-
-Ask one question at a time. Each answer constrains the next. These force compression before drafting.
-
-1. "In one sentence: what can someone now DO that they could not before?"
-2. "What is the one design decision a future engineer needs to know?"
-3. "Which 3-5 files are the heart of this change?" (suggest candidates from the diff)
-4. "Any sharp edges or gotchas? (or 'none')"
-
-Skip any question that is N/A for the doc type. Architecture notes don't need question 1; design specs don't need question 4.
-
-### Step 3: Draft from the template
-
-Read the matching template from `.internal-docs/_templates/`:
-
-- `feature` → `feature-note.md`
-- `architecture` → `architecture-note.md`
-- `design` → `design-spec.md`
-
-If `.internal-docs/_templates/` does not exist (first run, before seeding), fall back to the seeds bundled with this skill at `.claude/skills/document-internal/seeds/_templates/`.
-
-Generate the 1-pager from the template, the four answers, and the diff context.
-
-### Step 4: Voice self-check
-
-Scan the draft for violations:
-
-- Em dash present (`—`).
-- Any banned word from the list.
-- Average sentence length > 20 words.
-- Body line count > 60 (feature notes only — architecture/design have no cap).
-
-If any violation found, regenerate the offending sentences in place. Max 3 attempts. If still failing after 3 attempts, stop and report which rules are violated.
-
-If the body is over 60 lines for a feature note, ask: "This is N lines, target is 60. Trim, or promote to `architecture/` (no length cap)?"
-
-### Step 5: Show + iterate
-
-Print the full draft. Ask:
-
-> Edit needed? Paste any changes, or say "looks good".
-
-Apply user edits with the Edit tool. Re-run Step 4. Loop until the user approves.
-
-### Step 6: Open PR to internal-docs
-
-Use a tmp clone. Never the user's `.internal-docs` checkout — keeps the user's submodule clean.
-
-```bash
-TMP=$(mktemp -d)
-trap 'rm -rf "$TMP"' EXIT  # cleans up even if any step below fails
-git clone -b main git@github.com:browseros-ai/internal-docs.git "$TMP"
-cd "$TMP"
-git checkout -b "docs/<slug>"
-
-# Write the doc
-mkdir -p "<type>"  # features, architecture, designs, or setup
-cat > "<type>/$(date -u +%Y-%m)-<slug>.md" <<'DOC'
-<draft content>
-DOC
-
-# Update the root README index — insert one line under the matching section
-# Use Edit tool to add: "- [<title>](<type>/YYYY-MM-<slug>.md) — <one-line description>"
-
-git add "<type>/$(date -u +%Y-%m)-<slug>.md" README.md
-git commit -m "docs(<type>): <slug>"
-git push -u origin "docs/<slug>"
-
-PR_URL=$(gh pr create -R browseros-ai/internal-docs --base main \
-  --head "docs/<slug>" \
-  --title "docs(<type>): <slug>" \
-  --body "$(cat <<'BODY'
-## Summary
-<one-line of what this doc covers>
-
-## Source
-- BrowserOS branch: <branch>
-- Related PR: <#NNN if any>
-BODY
-)")
-
-cd -
-echo "PR opened: $PR_URL"
-# trap above cleans up $TMP on EXIT
-```
-
-If the slug contains characters that won't shell-escape cleanly, sanitize before substitution.
-
-### Step 7: Completion status
-
-Report one of:
-
-- **DONE** — file written, branch pushed, PR opened. Print PR URL.
-- **DONE_WITH_CONCERNS** — same as DONE but list concerns (e.g., voice check needed multiple regens, user skipped a question).
-- **BLOCKED** — submodule missing, auth fail, or template missing. State exactly what's needed.
-
-## Doc type defaults
-
-| Branch pattern | Default doc type | Default location |
-|----------------|------------------|------------------|
-| `feat/*`       | feature          | `features/`      |
-| `arch/*` or refactor branches with >10 files in `packages/` | architecture | `architecture/` |
-| `rfc/*` or `design/*` | design          | `designs/`       |
-| Otherwise      | ask              | ask              |
-
-## Common Mistakes
-
-**Drafting before asking the four questions**
-- **Problem:** Output is generic filler that says nothing concrete.
-- **Fix:** Always ask Step 2 first, even if the diff "looks obvious".
-
-**Touching `.internal-docs/` directly**
-- **Problem:** User's submodule HEAD moves, parent repo shows dirty state.
-- **Fix:** Always use the tmp clone in Step 6.
-
-**Skipping voice check on user edits**
-- **Problem:** User pastes prose with em dashes or filler; ships as-is.
-- **Fix:** Re-run Step 4 after every user edit.
-
-## Red Flags
-
-**Never:**
-- Push to `internal-docs/main`. Always branch + PR.
-- Modify the OSS repo's `.gitmodules` or submodule pointer.
-- Fabricate content for empty template sections.
-
-**Always:**
-- Pre-flight check before doing any work.
-- One-pager rule for feature notes (60-line body cap).
-- File:line citations when referencing code.
--- a/.claude/skills/document-internal/seeds/README.md
+++ b/.claude/skills/document-internal/seeds/README.md
@@ -1,51 +0,0 @@
-# BrowserOS Internal Docs
-
-Private team docs for `browseros-ai`. Mounted as a submodule into the public OSS repo at `.internal-docs/`.
-
-If you are reading this from a public clone of BrowserOS without team access — this submodule is for the BrowserOS internal team. Nothing here is required to build or use BrowserOS.
-
-## How to find what you need
-
-- Setup task ("how do I X locally") → look in [`setup/`](setup/)
-- Recently shipped feature → look in [`features/`](features/)
-- Cross-cutting subsystem → look in [`architecture/`](architecture/)
-- A design decision or RFC → look in [`designs/`](designs/)
-
-Or run `/ask-internal "<your question>"` from any BrowserOS checkout. The skill greps these docs and the codebase, then synthesizes an answer with citations.
-
-## How to add a doc
-
-Run `/document-internal` from a feature branch. The skill drafts a 1-pager from your branch's diff, asks four sharp questions, enforces voice rules, and opens a PR back to this repo.
-
-## Index
-
-### Setup
-<!-- one line per setup runbook: -->
-<!-- - [Dev environment](setup/dev-environment.md): first-time machine setup -->
-
-### Features
-<!-- one line per shipped feature, newest first: -->
-<!-- - [Cowork MCP](features/2026-04-cowork-mcp.md): bring outside MCPs into the BrowserOS agent -->
-
-### Architecture
-<!-- one line per cross-cutting subsystem: -->
-<!-- - [Chrome fork overview](architecture/chrome-fork-overview.md): what we patched and why -->
-
-### Designs
-<!-- one line per design spec, newest first: -->
-<!-- - [Internal docs submodule](designs/2026-04-30-internal-docs-submodule.md): this system -->
-
-## Templates
-
-When `/document-internal` runs, it reads from [`_templates/`](_templates/). Edit the templates here when the team's preferred shape changes.
-
-## Voice
-
-Docs in this repo follow these rules. The `/document-internal` skill enforces them; humans editing by hand should match.
-
-- Lead with the point.
-- Concrete nouns. Name files, functions, commands.
-- Short sentences, active voice, no em dashes.
-- No filler words: delve, crucial, robust, comprehensive, nuanced, multifaceted, leverage, utilize, etc.
-- Empty sections stay empty. Do not write "N/A" or fake content.
-- Feature notes target one screen, body 60 lines max.
--- a/.claude/skills/document-internal/seeds/_templates/architecture-note.md
+++ b/.claude/skills/document-internal/seeds/_templates/architecture-note.md
@@ -1,31 +0,0 @@
----
-title: <subsystem name>
-owner: <github handle>
-status: current | deprecated
-date: YYYY-MM-DD
-related-features: [feature-slug-1, feature-slug-2]
----
-
-# <subsystem name>
-
-## What this subsystem does
-<1-2 paragraphs. The top-level responsibility. Boundaries.>
-
-## Architecture
-<Diagram (ASCII or mermaid) plus prose. Components and how they talk.>
-
-## Constraints
-<Hard rules the design enforces. "X must never call Y" type statements.>
-
-## Decisions made
-<Numbered list of non-obvious decisions and the reason for each.>
-
-## Key files
-- `path/to/file.ts` — role
-- `path/to/dir/` — what lives here
-
-## How to evolve this
-<Where to add things. Which tests to expect to update. What NOT to touch.>
-
-## Open questions
-<What is still being figured out. Empty if none.>
--- a/.claude/skills/document-internal/seeds/_templates/design-spec.md
+++ b/.claude/skills/document-internal/seeds/_templates/design-spec.md
@@ -1,34 +0,0 @@
----
-title: <design name>
-owner: <github handle>
-status: proposed | accepted | rejected | superseded
-date: YYYY-MM-DD
-supersedes: <design-slug or none>
----
-
-# <design name>
-
-## Goal
-<2-4 sentences. What this design is trying to accomplish.>
-
-## Context
-<1-2 paragraphs. The current state, what is failing, why this needs to change.>
-
-## Selected Approach
-<The chosen design at a high level. Architecture, components, data flow.>
-
-## Alternatives Considered
-### 1. <name>
-<2-3 sentences on what this would look like, then pro/con and why rejected (or deferred).>
-
-### 2. <name>
-<Same shape.>
-
-## Out of Scope
-<What this design does NOT cover. Defer references.>
-
-## Rollout
-<Numbered steps from "nothing exists" to "fully shipped".>
-
-## Open Questions
-<Resolved during design? Empty. Unresolved? List with owner.>
--- a/.claude/skills/document-internal/seeds/_templates/feature-note.md
+++ b/.claude/skills/document-internal/seeds/_templates/feature-note.md
@@ -1,29 +0,0 @@
----
-title: <feature name>
-owner: <github handle>
-status: shipped | wip | deprecated
-date: YYYY-MM-DD
-prs: ["#NNN"]
-tags: [agent, browser, mcp]
----
-
-# <feature name>
-
-## What it does
-<2-3 sentences. What can someone now do that they could not before. Lead with user-facing impact, not implementation.>
-
-## Why we built it
-<1-2 sentences. Motivation. What pain it removed or what unlocked.>
-
-## How it works
-<3-6 sentences. The flow at a high level. Name the key files.>
-
-## Key files
-- `path/to/file.ts` — what it does
-- `path/to/other.ts` — what it does
-
-## How to run / test it locally
-<bullet list of commands. Empty section if N/A — do not fake.>
-
-## Gotchas
-<known sharp edges. "If you see X, that's why." Empty if N/A.>
--- a/.github/workflows/eval-weekly.yml
+++ b/.github/workflows/eval-weekly.yml
@@ -44,19 +44,6 @@ jobs:
        working-directory: packages/browseros-agent
        run: bun install --ignore-scripts

-      - name: Install Claude Code CLI
-        working-directory: packages/browseros-agent/apps/eval
-        env:
-          EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/legacy/browseros-agent-weekly.json' }}
-        run: |
-          if bun -e "const config = await Bun.file(process.env.EVAL_CONFIG).json(); process.exit(config.agent?.type === 'claude-code' ? 0 : 1)"; then
-            npm install -g @anthropic-ai/claude-code@2.1.119
-            echo "Claude Code CLI installed at $(command -v claude)"
-            claude --version
-          else
-            echo "Eval config does not use Claude Code; skipping Claude Code CLI install"
-          fi
-
      - name: Install Python eval dependencies
        # agisdk pinned so silent upstream releases can't shift task definitions
        # or grader behavior. Bump intentionally with a documented re-baseline.
@@ -80,11 +67,13 @@ jobs:
        env:
          FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
-          AWS_REGION: ${{ secrets.AWS_REGION || 'us-west-2' }}
-          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
-          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          NOPECHA_API_KEY: ${{ secrets.NOPECHA_API_KEY }}
+          EVAL_R2_ACCOUNT_ID: ${{ secrets.EVAL_R2_ACCOUNT_ID }}
+          EVAL_R2_ACCESS_KEY_ID: ${{ secrets.EVAL_R2_ACCESS_KEY_ID }}
+          EVAL_R2_SECRET_ACCESS_KEY: ${{ secrets.EVAL_R2_SECRET_ACCESS_KEY }}
+          EVAL_R2_BUCKET: ${{ secrets.EVAL_R2_BUCKET }}
+          EVAL_R2_CDN_BASE_URL: ${{ secrets.EVAL_R2_CDN_BASE_URL }}
          BROWSEROS_BINARY: /usr/bin/browseros
          WEBARENA_INFINITY_DIR: /tmp/webarena-infinity
          # OpenClaw container runtime is macOS-only; opt the Linux runner
@@ -93,35 +82,7 @@ jobs:
          EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/legacy/browseros-agent-weekly.json' }}
        run: |
          echo "Running eval with config: $EVAL_CONFIG"
-          xvfb-run --auto-servernum --server-args="-screen 0 1440x900x24" bun run src/index.ts suite --config "$EVAL_CONFIG"
-          # Capture the run directory so report.html can be generated before the R2 publish step.
-          SUMMARY_PATH="$(find results -name summary.json -type f -print | sort | tail -n 1)"
-          if [ -z "$SUMMARY_PATH" ]; then
-            echo "No eval run summary found"
-            exit 1
-          fi
-          RUN_DIR="$(dirname "$SUMMARY_PATH")"
-          echo "EVAL_RUN_DIR=$RUN_DIR" >> "$GITHUB_ENV"
-
-      - name: Generate run analysis report
-        if: success()
-        working-directory: packages/browseros-agent/apps/eval
-        env:
-          CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
-        run: |
-          echo "Generating run report for $EVAL_RUN_DIR"
-          bun scripts/generate-report.ts --input "$EVAL_RUN_DIR" --output "$EVAL_RUN_DIR/report.html"
-
-      - name: Publish eval run to R2
-        if: success()
-        working-directory: packages/browseros-agent/apps/eval
-        env:
-          EVAL_R2_ACCOUNT_ID: ${{ secrets.EVAL_R2_ACCOUNT_ID }}
-          EVAL_R2_ACCESS_KEY_ID: ${{ secrets.EVAL_R2_ACCESS_KEY_ID }}
-          EVAL_R2_SECRET_ACCESS_KEY: ${{ secrets.EVAL_R2_SECRET_ACCESS_KEY }}
-          EVAL_R2_BUCKET: ${{ secrets.EVAL_R2_BUCKET }}
-          EVAL_R2_CDN_BASE_URL: ${{ secrets.EVAL_R2_CDN_BASE_URL }}
-        run: bun run src/index.ts publish --run "$EVAL_RUN_DIR" --target r2
+          xvfb-run --auto-servernum --server-args="-screen 0 1440x900x24" bun run src/index.ts suite --config "$EVAL_CONFIG" --publish r2

      - name: Generate trend report
        if: success()
@@ -136,7 +97,7 @@ jobs:
          EVAL_R2_CDN_BASE_URL: ${{ secrets.EVAL_R2_CDN_BASE_URL }}
        run: bun apps/eval/scripts/weekly-report.ts /tmp/eval-report.html

-      - name: Upload trend report as artifact
+      - name: Upload report as artifact
        if: success()
        uses: actions/upload-artifact@v4
        with:
--- a/.github/workflows/sync-internal-docs.yml
+++ b/.github/workflows/sync-internal-docs.yml
@@ -1,62 +0,0 @@
-name: Sync internal-docs submodule
-
-on:
-  schedule:
-    - cron: '0 */4 * * *'
-  workflow_dispatch:
-
-jobs:
-  sync:
-    name: Bump internal-docs submodule pointer on dev
-    runs-on: ubuntu-latest
-    permissions:
-      contents: write
-      pull-requests: write
-    steps:
-      - name: Rewrite SSH submodule URL to HTTPS-with-token
-        env:
-          TOKEN: ${{ secrets.INTERNAL_DOCS_SYNC_TOKEN }}
-        run: |
-          git config --global "url.https://x-access-token:${TOKEN}@github.com/.insteadOf" "git@github.com:"
-
-      - uses: actions/checkout@v4
-        with:
-          token: ${{ secrets.INTERNAL_DOCS_SYNC_TOKEN }}
-          submodules: true
-          ref: dev
-          fetch-depth: 50
-
-      - name: Open auto-merge PR if internal-docs has new commits
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-        run: |
-          set -e
-
-          # Skip if submodule not yet configured (handoff window before someone adds it)
-          if ! git config --file .gitmodules --get-regexp '^submodule\..internal-docs\.path$' >/dev/null 2>&1; then
-            echo "internal-docs submodule not yet configured in .gitmodules. Skipping."
-            exit 0
-          fi
-
-          git submodule update --remote --merge .internal-docs
-
-          if git diff --quiet .internal-docs; then
-            echo "No internal-docs changes to sync."
-            exit 0
-          fi
-
-          BRANCH="bot/sync-internal-docs-$(date -u +%Y%m%d-%H%M%S)"
-          git config user.name  "browseros-bot"
-          git config user.email "bot@browseros.ai"
-          git checkout -b "$BRANCH"
-          git add .internal-docs
-          git commit -m "chore: sync internal-docs submodule"
-          git push -u origin "$BRANCH"
-
-          PR_URL=$(gh pr create \
-            --base dev \
-            --head "$BRANCH" \
-            --title "chore: sync internal-docs submodule" \
-            --body "Automated bump of the \`.internal-docs\` submodule pointer. Auto-merging.")
-
-          gh pr merge "$PR_URL" --auto --squash --delete-branch
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,4 +0,0 @@
-[submodule ".internal-docs"]
-	path = .internal-docs
-	url = git@github.com:browseros-ai/internal-docs.git
-	branch = main
--- a/.internal-docs
+++ b/.internal-docs
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandConversation.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandConversation.tsx
@@ -1,44 +1,186 @@
-import { ArrowLeft, PanelRight } from 'lucide-react'
-import { type FC, useEffect, useMemo, useRef, useState } from 'react'
+import { ArrowLeft, Bot, Home } from 'lucide-react'
+import { type FC, useEffect, useMemo, useRef } from 'react'
 import { Navigate, useNavigate, useParams, useSearchParams } from 'react-router'
 import { Button } from '@/components/ui/button'
-import type {
-  HarnessAgent,
-  HarnessAgentAdapter,
-} from '@/entrypoints/app/agents/agent-harness-types'
-import type { AgentAdapterHealth } from '@/entrypoints/app/agents/agent-row/agent-row.types'
 import {
  cancelHarnessTurn,
-  useAgentAdapters,
  useEnqueueHarnessMessage,
  useHarnessAgents,
  useRemoveHarnessQueuedMessage,
-  useUpdateHarnessAgent,
 } from '@/entrypoints/app/agents/useAgents'
-import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
-import { type ProducedFilesRailGroup, useAgentOutputs } from '@/lib/agent-files'
-import { cn } from '@/lib/utils'
-import { AgentRail } from './AgentRail'
-import { useAgentCommandData } from './agent-command-layout'
 import {
-  OutputsRail,
-  useOutputsRailOpen,
-} from './agent-conversation.outputs-rail'
+  type AgentEntry,
+  getModelDisplayName,
+} from '@/entrypoints/app/agents/useOpenClaw'
+import { cn } from '@/lib/utils'
+import { useAgentCommandData } from './agent-command-layout'
 import { ClawChat } from './ClawChat'
-import { ConversationHeader } from './ConversationHeader'
 import { ConversationInput } from './ConversationInput'
 import {
  buildChatHistoryFromClawMessages,
  filterTurnsPersistedInHistory,
  flattenHistoryPages,
-  mapHistoryToProducedFilesGroups,
-  selectStripOnlyTurns,
 } from './claw-chat-types'
-import { consumePendingInitialMessage } from './pending-initial-message'
 import { QueuePanel } from './QueuePanel'
 import { useAgentConversation } from './useAgentConversation'
 import { useHarnessChatHistory } from './useHarnessChatHistory'

+function StatusBadge({ status }: { status: string }) {
+  return (
+    <div className="inline-flex items-center gap-2 rounded-full border border-border/60 bg-card px-3 py-1 text-[11px] text-muted-foreground uppercase tracking-[0.18em]">
+      <span
+        className={cn(
+          'size-1.5 rounded-full',
+          status === 'Working on your request'
+            ? 'bg-amber-500'
+            : status === 'Ready'
+              ? 'bg-emerald-500'
+              : status === 'Offline'
+                ? 'bg-muted-foreground/50'
+                : 'bg-[var(--accent-orange)]',
+        )}
+      />
+      <span>{status}</span>
+    </div>
+  )
+}
+
+function AgentIdentity({
+  name,
+  meta,
+  className,
+}: {
+  name: string
+  meta: string
+  className?: string
+}) {
+  return (
+    <div className={cn('min-w-0', className)}>
+      <div className="truncate font-semibold text-[15px] leading-5">{name}</div>
+      <div className="truncate text-muted-foreground text-xs leading-5">
+        {meta}
+      </div>
+    </div>
+  )
+}
+
+function ConversationHeader({
+  agentName,
+  agentMeta,
+  status,
+  backLabel,
+  backTarget,
+  onGoHome,
+}: {
+  agentName: string
+  agentMeta: string
+  status: string
+  backLabel: string
+  backTarget: 'home' | 'page'
+  onGoHome: () => void
+}) {
+  const BackIcon = backTarget === 'home' ? Home : ArrowLeft
+
+  return (
+    <div className="flex h-14 items-center justify-between gap-4 border-border/50 border-b px-5">
+      <div className="flex min-w-0 items-center gap-3">
+        <Button
+          variant="ghost"
+          size="icon"
+          onClick={onGoHome}
+          className="size-8 rounded-xl lg:hidden"
+          title={backLabel}
+        >
+          <BackIcon className="size-4" />
+        </Button>
+        <div className="flex size-8 shrink-0 items-center justify-center rounded-xl bg-muted text-muted-foreground">
+          <Bot className="size-4" />
+        </div>
+        <AgentIdentity name={agentName} meta={agentMeta} />
+      </div>
+
+      <StatusBadge status={status} />
+    </div>
+  )
+}
+
+function AgentRailHeader({ onGoHome }: { onGoHome: () => void }) {
+  return (
+    <div className="hidden h-14 items-center border-border/50 border-r border-b bg-background/70 px-4 lg:flex">
+      <div className="flex min-w-0 items-center gap-3">
+        <Button
+          variant="ghost"
+          size="icon"
+          onClick={onGoHome}
+          className="size-8 rounded-xl"
+          title="Back to home"
+        >
+          <ArrowLeft className="size-4" />
+        </Button>
+        <div className="truncate font-semibold text-[15px] leading-5">
+          Agents
+        </div>
+      </div>
+    </div>
+  )
+}
+
+function AgentRailList({
+  activeAgentId,
+  agents,
+  onSelectAgent,
+}: {
+  activeAgentId: string
+  agents: AgentEntry[]
+  onSelectAgent: (entry: AgentEntry) => void
+}) {
+  return (
+    <aside className="hidden min-h-0 flex-col border-border/50 border-r bg-background/70 lg:flex">
+      <div className="styled-scrollbar min-h-0 flex-1 space-y-2 overflow-y-auto px-3 py-3">
+        {agents.map((entry) => {
+          const active = entry.agentId === activeAgentId
+          const modelName = getAgentEntryMeta(entry)
+
+          return (
+            <button
+              key={entry.agentId}
+              type="button"
+              onClick={() => onSelectAgent(entry)}
+              className={cn(
+                'w-full rounded-2xl border px-3 py-3 text-left transition-all',
+                active
+                  ? 'border-[var(--accent-orange)]/30 bg-[var(--accent-orange)]/8 shadow-sm'
+                  : 'border-transparent bg-transparent hover:border-border/60 hover:bg-card',
+              )}
+            >
+              <div className="flex items-center gap-3">
+                <div
+                  className={cn(
+                    'flex size-9 items-center justify-center rounded-xl',
+                    active
+                      ? 'bg-[var(--accent-orange)]/12 text-[var(--accent-orange)]'
+                      : 'bg-muted text-muted-foreground',
+                  )}
+                >
+                  <Bot className="size-4" />
+                </div>
+                <AgentIdentity name={entry.name} meta={modelName} />
+              </div>
+            </button>
+          )
+        })}
+      </div>
+    </aside>
+  )
+}
+
+function getAgentEntryMeta(agent: AgentEntry | undefined): string {
+  if (agent?.source === 'agent-harness') {
+    return getModelDisplayName(agent.model) ?? 'ACP agent'
+  }
+  return getModelDisplayName(agent?.model) ?? 'OpenClaw agent'
+}
+
 function AgentConversationController({
  agentId,
  initialMessage,
@@ -46,7 +188,6 @@ function AgentConversationController({
  agents,
  agentPathPrefix,
  createAgentPath,
-  onOpenOutputsRail,
 }: {
  agentId: string
  initialMessage: string | null
@@ -54,7 +195,6 @@ function AgentConversationController({
  agents: AgentEntry[]
  agentPathPrefix: string
  createAgentPath: string
-  onOpenOutputsRail?: ((turnId?: string | null) => void) | null
 }) {
  const navigate = useNavigate()
  const initialMessageSentRef = useRef<string | null>(null)
@@ -86,15 +226,6 @@ function AgentConversationController({
  const harnessAgent = harnessAgents.find((entry) => entry.id === agentId)
  const queue = harnessAgent?.queue ?? []
  const activeTurnId = harnessAgent?.activeTurnId ?? null
-  const isOpenClawAgent = harnessAgent?.adapter === 'openclaw'
-
-  // Used to surface produced-files strips on a fresh page load
-  // when there's no optimistic turn to carry the data. Disabled
-  // for non-openclaw adapters since they don't attribute files.
-  const { groups: agentOutputGroups } = useAgentOutputs(
-    agentId,
-    isOpenClawAgent,
-  )

  const { turns, streaming, send } = useAgentConversation(agentId, {
    runtime: 'agent-harness',
@@ -119,44 +250,6 @@ function AgentConversationController({
    () => filterTurnsPersistedInHistory(turns, historyMessages),
    [historyMessages, turns],
  )
-  // Persisted turns that still need to surface their FileCardStrip
-  // — history items don't carry produced-files data, so without
-  // these the strip would vanish on history reload.
-  const stripOnlyTurns = useMemo(
-    () => selectStripOnlyTurns(turns, historyMessages),
-    [historyMessages, turns],
-  )
-  // Two outputs from the per-turn matcher:
-  //  - filesByAssistantId  → strip rendered directly under the
-  //    matching assistant history bubble.
-  //  - tailUnmatched      → groups with no history pair (orphans);
-  //    rendered at the conversation tail.
-  // Both are filtered to exclude turnIds already covered by a
-  // live or strip-only optimistic turn (those carry their own
-  // strip and history hasn't reloaded yet).
-  const { filesByAssistantId, tailStripGroups } = useMemo(() => {
-    if (!isOpenClawAgent) {
-      return {
-        filesByAssistantId: new Map<string, ProducedFilesRailGroup>(),
-        tailStripGroups: [] as ProducedFilesRailGroup[],
-      }
-    }
-    const coveredTurnIds = new Set<string>()
-    for (const turn of turns) {
-      if (turn.turnId) coveredTurnIds.add(turn.turnId)
-    }
-    const eligibleGroups = agentOutputGroups.filter(
-      (group) => !coveredTurnIds.has(group.turnId),
-    )
-    const { byAssistantMessageId, unmatched } = mapHistoryToProducedFilesGroups(
-      historyMessages,
-      eligibleGroups,
-    )
-    return {
-      filesByAssistantId: byAssistantMessageId,
-      tailStripGroups: unmatched,
-    }
-  }, [agentOutputGroups, isOpenClawAgent, historyMessages, turns])
  onInitialMessageConsumedRef.current = onInitialMessageConsumed

  const disabled = !agent
@@ -171,73 +264,42 @@ function AgentConversationController({
  sendRef.current = send

  useEffect(() => {
-    if (disabled || !historyReady) return
-
-    // Registry-first: when the user submitted at /home with
-    // attachments, the rich payload is here. URL `?q=` may also be
-    // present and is the text-only fallback path; the registry wins
-    // when both exist because it carries the binary attachments
-    // alongside the text.
-    const pending = consumePendingInitialMessage(agentId)
-    if (pending) {
-      // Mark the dedup ref so the text-only branch below doesn't
-      // re-fire on the same render.
-      if (initialMessageKey) {
-        initialMessageSentRef.current = initialMessageKey
-      }
-      onInitialMessageConsumedRef.current()
-      void sendRef.current({
-        text: pending.text,
-        attachments: pending.attachments.map((a) => a.payload),
-        attachmentPreviews: pending.attachments.map((a) => ({
-          id: a.id,
-          kind: a.kind,
-          mediaType: a.mediaType,
-          name: a.name,
-          dataUrl: a.dataUrl,
-        })),
-      })
-      return
-    }
-
    const query = initialMessage?.trim()
    if (!initialMessageKey) {
-      // Reset is safe even on the post-registry-fire re-run: consume
-      // is destructive, so the registry is already drained — there's
-      // nothing left for a third run to re-send.
      initialMessageSentRef.current = null
      return
    }

-    if (!query || initialMessageSentRef.current === initialMessageKey) {
+    if (
+      !query ||
+      initialMessageSentRef.current === initialMessageKey ||
+      disabled ||
+      !historyReady
+    ) {
      return
    }

    initialMessageSentRef.current = initialMessageKey
    onInitialMessageConsumedRef.current()
    void sendRef.current({ text: query })
-  }, [agentId, disabled, historyReady, initialMessage, initialMessageKey])
+  }, [disabled, historyReady, initialMessage, initialMessageKey])

  const handleSelectAgent = (entry: AgentEntry) => {
    navigate(`${agentPathPrefix}/${entry.agentId}`)
  }

  return (
-    <div className="flex min-h-0 flex-1 flex-col overflow-hidden">
+    <div className="flex min-h-0 flex-col overflow-hidden">
      <ClawChat
        agentName={agentName}
        historyMessages={historyMessages}
        turns={visibleTurns}
-        stripOnlyTurns={stripOnlyTurns}
-        filesByAssistantId={filesByAssistantId}
-        tailStripGroups={tailStripGroups}
        streaming={streaming}
        isInitialLoading={harnessHistoryQuery.isLoading}
        error={error}
        hasNextPage={false}
        isFetchingNextPage={false}
        onFetchNextPage={() => {}}
-        onOpenOutputsRail={onOpenOutputsRail}
        onRetry={() => {
          void harnessHistoryQuery.refetch()
        }}
@@ -306,22 +368,6 @@ interface AgentCommandConversationProps {
  createAgentPath?: string
 }

-function inferAdapterFromEntry(
-  entry: AgentEntry | undefined,
-): HarnessAgentAdapter | 'unknown' {
-  if (!entry) return 'unknown'
-  if (entry.source === 'agent-harness') {
-    // Harness entries don't carry the adapter on AgentEntry; the rail
-    // / header read the harness record directly. This branch only runs
-    // before the harness query resolves, so 'unknown' is correct — the
-    // tile's bot fallback renders until data arrives.
-    return 'unknown'
-  }
-  // OpenClaw-only entries (no harness shadow) are deprecated in
-  // practice but the rail still tolerates them.
-  return 'openclaw'
-}
-
 export const AgentCommandConversation: FC<AgentCommandConversationProps> = ({
  variant = 'command',
  backPath = '/home',
@@ -332,191 +378,60 @@ export const AgentCommandConversation: FC<AgentCommandConversationProps> = ({
  const [searchParams, setSearchParams] = useSearchParams()
  const navigate = useNavigate()
  const { agents } = useAgentCommandData()
-  const { harnessAgents } = useHarnessAgents()
-  const { adapters } = useAgentAdapters()
-  const updateAgent = useUpdateHarnessAgent()
-
  const shouldRedirectHome = !agentId
  const resolvedAgentId = agentId ?? ''
-  const harnessAgent = harnessAgents.find(
-    (entry) => entry.id === resolvedAgentId,
-  )
-  const entry = agents.find((item) => item.agentId === resolvedAgentId)
-  const fallbackName = entry?.name || resolvedAgentId || 'Agent'
-  const fallbackAdapter = inferAdapterFromEntry(entry)
+  const agent = agents.find((entry) => entry.agentId === resolvedAgentId)
+  const agentName = agent?.name || resolvedAgentId || 'Agent'
+  const agentMeta = getAgentEntryMeta(agent)
  const initialMessage = searchParams.get('q')
  const isPageVariant = variant === 'page'
  const backLabel = isPageVariant ? 'Back to agents' : 'Back to home'

-  const isOpenClawAgent = harnessAgent?.adapter === 'openclaw'
-  const [outputsRailOpen, setOutputsRailOpen] =
-    useOutputsRailOpen(resolvedAgentId)
-  const railVisible = isOpenClawAgent && outputsRailOpen
-
-  // Deep-link target for the rail. Set when (a) the user clicks
-  // View / +N on an inline file-card strip, or (b) an external nav
-  // arrived with `?outputsTurn=<turnId>`. Cleared by the rail
-  // itself once it has scrolled to + expanded the matching group.
-  const urlOutputsTurn = searchParams.get('outputsTurn')
-  const [focusTurnId, setFocusTurnId] = useState<string | null>(urlOutputsTurn)
-  // If the URL param flips while we're already on this agent, sync.
-  useEffect(() => {
-    if (!urlOutputsTurn) return
-    setFocusTurnId(urlOutputsTurn)
-    if (isOpenClawAgent) setOutputsRailOpen(true)
-  }, [urlOutputsTurn, isOpenClawAgent, setOutputsRailOpen])
-
-  const handleOpenOutputsRail = (turnId?: string | null) => {
-    if (!isOpenClawAgent) return
-    setOutputsRailOpen(true)
-    setFocusTurnId(turnId ?? null)
-  }
-  const handleFocusTurnConsumed = () => {
-    setFocusTurnId(null)
-    if (urlOutputsTurn) {
-      // Drop the URL param so a back-nav doesn't re-trigger the
-      // scroll. `replace: true` keeps history clean.
-      setSearchParams(
-        (prev) => {
-          const next = new URLSearchParams(prev)
-          next.delete('outputsTurn')
-          return next
-        },
-        { replace: true },
-      )
-    }
-  }
-
-  const adapterHealth = useMemo<AgentAdapterHealth | null>(() => {
-    const adapterId = harnessAgent?.adapter
-    if (!adapterId) return null
-    const descriptor = adapters.find((item) => item.id === adapterId)
-    if (!descriptor?.health) return null
-    return {
-      healthy: descriptor.health.healthy,
-      reason: descriptor.health.reason,
-    }
-  }, [adapters, harnessAgent?.adapter])
-
  if (shouldRedirectHome) {
    return <Navigate to="/home" replace />
  }

-  const handleSelectHarnessAgent = (target: HarnessAgent) => {
-    navigate(`${agentPathPrefix}/${target.id}`)
+  const handleSelectAgent = (entry: AgentEntry) => {
+    navigate(`${agentPathPrefix}/${entry.agentId}`)
  }

-  const handlePinToggle = (target: HarnessAgent | null, next: boolean) => {
-    if (!target) return
-    updateAgent.mutate({
-      agentId: target.id,
-      patch: { pinned: next },
-    })
-  }
+  // Every visible agent runs through the harness now, so per-agent
+  // runtime status doesn't gate chat the way OpenClaw's legacy
+  // gateway lifecycle did. Show "Ready" once the agent record is
+  // resolved from the rail, "Setup" otherwise.
+  const statusCopy = agent ? 'Ready' : 'Setup'

  return (
    <div className="absolute inset-0 overflow-hidden bg-background md:pl-[theme(spacing.14)]">
-      <div className="mx-auto flex h-full w-full max-w-[1480px] flex-col">
-        {/* Shared top band — the rail's "Agents" header and the chat
-            header live on one row so they're aligned by construction. */}
-        <div className="flex shrink-0 items-stretch border-border/50 border-b">
-          <div className="hidden min-h-[60px] w-[288px] shrink-0 items-center gap-3 border-border/50 border-r px-4 lg:flex">
-            <Button
-              variant="ghost"
-              size="icon"
-              onClick={() => navigate(backPath)}
-              className="size-8 rounded-xl"
-              title="Back to home"
-            >
-              <ArrowLeft className="size-4" />
-            </Button>
-            <div className="truncate font-semibold text-[15px] leading-5">
-              Agents
-            </div>
-          </div>
-          <div className="min-w-0 flex-1">
-            <ConversationHeader
-              agent={harnessAgent ?? null}
-              fallbackName={fallbackName}
-              fallbackAdapter={fallbackAdapter}
-              adapterHealth={adapterHealth}
-              backLabel={backLabel}
-              backTarget={isPageVariant ? 'page' : 'home'}
-              onGoHome={() => navigate(backPath)}
-              onPinToggle={(next) =>
-                handlePinToggle(harnessAgent ?? null, next)
-              }
-              headerExtra={
-                isOpenClawAgent ? (
-                  <Button
-                    variant={railVisible ? 'secondary' : 'ghost'}
-                    size="icon"
-                    className="size-8 rounded-xl"
-                    onClick={() => setOutputsRailOpen(!railVisible)}
-                    title={railVisible ? 'Hide outputs' : 'Show outputs'}
-                  >
-                    <PanelRight className="size-4" />
-                  </Button>
-                ) : undefined
-              }
-            />
-          </div>
-        </div>
+      <div className="mx-auto grid h-full w-full max-w-[1480px] lg:grid-cols-[288px_minmax(0,1fr)] lg:grid-rows-[3.5rem_minmax(0,1fr)]">
+        <AgentRailHeader onGoHome={() => navigate(backPath)} />

-        {/* Body grid: rail list + chat (+ outputs rail when an
-            openclaw agent has it open). Columns share the same top
-            edge as the band above so headers can never drift. */}
-        <div
-          className={cn(
-            'grid min-h-0 flex-1 grid-rows-[minmax(0,1fr)]',
-            railVisible
-              ? 'lg:grid-cols-[288px_minmax(0,1fr)_320px]'
-              : 'lg:grid-cols-[288px_minmax(0,1fr)]',
-          )}
-        >
-          <AgentRail
-            agents={harnessAgents}
-            adapters={adapters}
-            activeAgentId={resolvedAgentId}
-            onSelectAgent={handleSelectHarnessAgent}
-            onPinToggle={(target, next) => handlePinToggle(target, next)}
-          />
+        <ConversationHeader
+          agentName={agentName}
+          agentMeta={agentMeta}
+          status={statusCopy}
+          backLabel={backLabel}
+          backTarget={isPageVariant ? 'page' : 'home'}
+          onGoHome={() => navigate(backPath)}
+        />

-          <div className="flex h-full min-h-0 flex-col overflow-hidden">
-            <AgentConversationController
-              key={resolvedAgentId}
-              agentId={resolvedAgentId}
-              agents={agents}
-              initialMessage={initialMessage}
-              onInitialMessageConsumed={() => {
-                // Preserve the outputsTurn deep-link if present —
-                // dropping all params would erase the rail focus
-                // before it had a chance to consume.
-                setSearchParams(
-                  (prev) => {
-                    const next = new URLSearchParams()
-                    const turn = prev.get('outputsTurn')
-                    if (turn) next.set('outputsTurn', turn)
-                    return next
-                  },
-                  { replace: true },
-                )
-              }}
-              agentPathPrefix={agentPathPrefix}
-              createAgentPath={createAgentPath}
-              onOpenOutputsRail={isOpenClawAgent ? handleOpenOutputsRail : null}
-            />
-          </div>
+        <AgentRailList
+          activeAgentId={resolvedAgentId}
+          agents={agents}
+          onSelectAgent={handleSelectAgent}
+        />

-          {railVisible ? (
-            <OutputsRail
-              agentId={resolvedAgentId}
-              onClose={() => setOutputsRailOpen(false)}
-              focusTurnId={focusTurnId}
-              onFocusTurnConsumed={handleFocusTurnConsumed}
-            />
-          ) : null}
-        </div>
+        <AgentConversationController
+          key={resolvedAgentId}
+          agentId={resolvedAgentId}
+          agents={agents}
+          initialMessage={initialMessage}
+          onInitialMessageConsumed={() =>
+            setSearchParams({}, { replace: true })
+          }
+          agentPathPrefix={agentPathPrefix}
+          createAgentPath={createAgentPath}
+        />
      </div>
    </div>
  )
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandHome.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandHome.tsx
@@ -18,12 +18,8 @@ import { SignInHint } from '@/entrypoints/newtab/index/SignInHint'
 import { useActiveHint } from '@/entrypoints/newtab/index/useActiveHint'
 import { AgentCardDock } from './AgentCardDock'
 import { useAgentCommandData } from './agent-command-layout'
-import {
-  ConversationInput,
-  type ConversationInputSendInput,
-} from './ConversationInput'
+import { ConversationInput } from './ConversationInput'
 import { orderHomeAgents } from './home-agent-card.helpers'
-import { setPendingInitialMessage } from './pending-initial-message'

 function EmptyAgentsState({ onOpenAgents }: { onOpenAgents: () => void }) {
  return (
@@ -120,19 +116,8 @@ export const AgentCommandHome: FC = () => {
    }
  }, [legacyAgents, selectedAgentId])

-  const handleSend = (input: ConversationInputSendInput) => {
+  const handleSend = (input: { text: string }) => {
    if (!selectedAgentId) return
-    // Stash text + attachments in the in-memory registry. Text also
-    // travels in `?q=` so a hard refresh / shareable URL still works
-    // for text-only prompts; attachments are registry-only because a
-    // multi-megabyte dataUrl can't ride a URL search param. The chat
-    // screen prefers the registry when both are present.
-    setPendingInitialMessage({
-      agentId: selectedAgentId,
-      text: input.text,
-      attachments: input.attachments,
-      createdAt: Date.now(),
-    })
    navigate(
      `/home/agents/${selectedAgentId}?q=${encodeURIComponent(input.text)}`,
    )
@@ -162,16 +147,12 @@ export const AgentCommandHome: FC = () => {
          <>
            <div className="flex flex-col items-center gap-5 pt-[max(10vh,24px)] text-center">
              <div className="space-y-3">
-                <h1 className="font-semibold text-[clamp(2.25rem,4.5vw,3.5rem)] leading-[1.08] tracking-[-0.025em] [text-wrap:balance]">
-                  What should your agent{' '}
-                  <span className="font-medium text-[var(--accent-orange)] italic">
-                    work on
-                  </span>{' '}
-                  next?
+                <h1 className="font-semibold text-[clamp(2rem,4vw,3.25rem)] leading-tight tracking-tight">
+                  What should your agent work on next?
                </h1>
-                <p className="mx-auto max-w-2xl text-muted-foreground text-sm leading-6 [text-wrap:pretty]">
-                  Start a task, continue a thread, or hand off to a different
-                  agent — all without leaving this tab.
+                <p className="mx-auto max-w-2xl text-muted-foreground text-sm leading-6">
+                  Start with a task, continue a thread, or switch to another
+                  agent without leaving the new tab.
                </p>
              </div>

@@ -186,7 +167,7 @@ export const AgentCommandHome: FC = () => {
                  streaming={false}
                  disabled={!selectedAgentReady}
                  status={selectedAgentStatus}
-                  attachmentsEnabled={true}
+                  attachmentsEnabled={false}
                  placeholder={
                    selectedAgentReady
                      ? `Ask ${selectedAgentName} to handle a task...`
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentRail.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentRail.tsx
@@ -1,65 +0,0 @@
-import { type FC, useMemo } from 'react'
-import type {
-  HarnessAdapterDescriptor,
-  HarnessAgent,
-  HarnessAgentAdapter,
-} from '@/entrypoints/app/agents/agent-harness-types'
-import type { AgentAdapterHealth } from '@/entrypoints/app/agents/agent-row/agent-row.types'
-import { orderAgentsByPinThenRecency } from '@/entrypoints/app/agents/agents-list-order'
-import { AgentRailRow } from './AgentRailRow'
-
-interface AgentRailProps {
-  agents: HarnessAgent[]
-  adapters: HarnessAdapterDescriptor[]
-  activeAgentId: string
-  onSelectAgent: (agent: HarnessAgent) => void
-  onPinToggle: (agent: HarnessAgent, next: boolean) => void
-}
-
-/**
- * Left-column scrollable list of agents. The "Agents" label + back
- * button live in the shared top band above (so the rail header and
- * the chat header sit on a single aligned strip rather than as two
- * separately-sized headers per column). Sort matches `/agents`:
- * pinned-first → recency, so the rail doesn't reshuffle as turns
- * transition every 5 s.
- */
-export const AgentRail: FC<AgentRailProps> = ({
-  agents,
-  adapters,
-  activeAgentId,
-  onSelectAgent,
-  onPinToggle,
-}) => {
-  const adapterHealth = useMemo(() => {
-    const map = new Map<HarnessAgentAdapter, AgentAdapterHealth>()
-    for (const adapter of adapters) {
-      if (adapter.health) {
-        map.set(adapter.id, {
-          healthy: adapter.health.healthy,
-          reason: adapter.health.reason,
-        })
-      }
-    }
-    return map
-  }, [adapters])
-
-  const ordered = useMemo(() => orderAgentsByPinThenRecency(agents), [agents])
-
-  return (
-    <aside className="hidden min-h-0 flex-col border-border/50 border-r bg-background/70 lg:flex">
-      <div className="styled-scrollbar min-h-0 flex-1 space-y-1.5 overflow-y-auto px-3 py-3">
-        {ordered.map((agent) => (
-          <AgentRailRow
-            key={agent.id}
-            agent={agent}
-            active={agent.id === activeAgentId}
-            adapterHealth={adapterHealth.get(agent.adapter) ?? null}
-            onSelect={() => onSelectAgent(agent)}
-            onPinToggle={(next) => onPinToggle(agent, next)}
-          />
-        ))}
-      </div>
-    </aside>
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentRailRow.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentRailRow.tsx
@@ -1,102 +0,0 @@
-import type { FC } from 'react'
-import { Badge } from '@/components/ui/badge'
-import { adapterLabel } from '@/entrypoints/app/agents/AdapterIcon'
-import type { HarnessAgent } from '@/entrypoints/app/agents/agent-harness-types'
-import { AgentSummaryChips } from '@/entrypoints/app/agents/agent-row/AgentSummaryChips'
-import { AgentTile } from '@/entrypoints/app/agents/agent-row/AgentTile'
-import type { AgentAdapterHealth } from '@/entrypoints/app/agents/agent-row/agent-row.types'
-import { PinToggle } from '@/entrypoints/app/agents/agent-row/PinToggle'
-import { cn } from '@/lib/utils'
-
-interface AgentRailRowProps {
-  agent: HarnessAgent
-  active: boolean
-  adapterHealth: AgentAdapterHealth | null
-  onSelect: () => void
-  onPinToggle: (next: boolean) => void
-}
-
-/**
- * Compact rail row for the chat-screen sidebar. Slims `<AgentRowCard>`
- * down to the essentials that fit a ~280 px rail: tile + name + status
- * badge + pin star, with the adapter / model / reasoning chips on a
- * second line. Token totals, sparkline, last-message preview all stay
- * on the `/agents` page where rows are full-width.
- */
-export const AgentRailRow: FC<AgentRailRowProps> = ({
-  agent,
-  active,
-  adapterHealth,
-  onSelect,
-  onPinToggle,
-}) => {
-  const status = agent.status ?? 'unknown'
-  const lastUsedAt = agent.lastUsedAt ?? null
-  const pinned = agent.pinned ?? false
-  return (
-    <button
-      type="button"
-      onClick={onSelect}
-      className={cn(
-        'group w-full rounded-2xl border px-3 py-3 text-left transition-colors',
-        active
-          ? 'border-[var(--accent-orange)]/30 bg-[var(--accent-orange)]/8'
-          : 'border-transparent bg-transparent hover:border-border/60 hover:bg-card',
-      )}
-    >
-      <div className="flex min-w-0 items-start gap-3">
-        <AgentTile
-          adapter={agent.adapter}
-          status={status}
-          lastUsedAt={lastUsedAt}
-        />
-        <div className="min-w-0 flex-1">
-          <div className="flex items-center gap-1.5">
-            <span className="truncate font-semibold text-[14px] leading-5">
-              {agent.name}
-            </span>
-            {status === 'working' && (
-              <Badge
-                variant="secondary"
-                className="h-5 bg-amber-50 px-1.5 text-[10px] text-amber-900 hover:bg-amber-50"
-              >
-                Working
-              </Badge>
-            )}
-            {status === 'asleep' && (
-              <Badge
-                variant="outline"
-                className="h-5 px-1.5 text-[10px] text-muted-foreground"
-              >
-                Asleep
-              </Badge>
-            )}
-            {status === 'error' && (
-              <Badge variant="destructive" className="h-5 px-1.5 text-[10px]">
-                Attention
-              </Badge>
-            )}
-            <div className="ml-auto">
-              <PinToggle pinned={pinned} onToggle={onPinToggle} />
-            </div>
-          </div>
-          <AgentSummaryChips
-            adapter={agent.adapter}
-            modelLabel={agent.modelId ?? null}
-            reasoningEffort={agent.reasoningEffort ?? null}
-            adapterHealth={adapterHealth}
-          />
-        </div>
-      </div>
-    </button>
-  )
-}
-
-/**
- * Tooltip-only label helper kept exported in case the tile row needs to
- * show "Codex agent" or similar in a future state. Inlined fallback for
- * the rare `unknown` adapter rendering path.
- */
-export function railRowAdapterLabel(agent: HarnessAgent): string {
-  return adapterLabel(agent.adapter)
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentSelector.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentSelector.tsx
@@ -27,14 +27,6 @@ interface AgentSelectorProps {
  onSelectAgent: (agent: AgentEntry) => void
  onCreateAgent?: () => void
  status?: string
-  /**
-   * `'pill'` renders the filled-pill variant used by the calm
-   * composer on `/home` — bordered, slightly elevated background,
-   * mono agent name, used as the visual anchor on the left of the
-   * footer chip row. Default `'ghost'` keeps the existing flat
-   * shadcn ghost-button trigger used by the chat surface.
-   */
-  triggerVariant?: 'ghost' | 'pill'
 }

 function getStatusDot(status?: string) {
@@ -50,49 +42,31 @@ export const AgentSelector: FC<AgentSelectorProps> = ({
  onSelectAgent,
  onCreateAgent,
  status,
-  triggerVariant = 'ghost',
 }) => {
  const [open, setOpen] = useState(false)
  const selectedAgent = agents.find(
    (agent) => agent.agentId === selectedAgentId,
  )

-  const triggerNode =
-    triggerVariant === 'pill' ? (
-      <button
-        type="button"
-        className={cn(
-          'inline-flex h-6 max-w-[180px] items-center gap-1.5 rounded-full border border-border bg-accent/40 pr-2 pl-2.5 text-[11.5px] text-foreground transition-colors',
-          'hover:border-border hover:bg-accent/70 data-[state=open]:border-border data-[state=open]:bg-accent/70',
-        )}
-      >
-        <span className={cn('size-1.5 rounded-full', getStatusDot(status))} />
-        <span className="truncate font-medium font-mono text-[11.5px] tracking-[-0.01em]">
-          {selectedAgent?.name ?? 'Select agent'}
-        </span>
-        <ChevronDown className="size-3 shrink-0 text-muted-foreground" />
-      </button>
-    ) : (
-      <Button
-        variant="ghost"
-        className={cn(
-          'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
-          'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
-          'data-[state=open]:bg-accent',
-        )}
-      >
-        <Bot className="h-4 w-4" />
-        <span className={cn('size-2 rounded-full', getStatusDot(status))} />
-        <span className="max-w-32 truncate">
-          {selectedAgent?.name ?? 'Select agent'}
-        </span>
-        <ChevronDown className="h-3 w-3" />
-      </Button>
-    )
-
  return (
    <Popover open={open} onOpenChange={setOpen}>
-      <PopoverTrigger asChild>{triggerNode}</PopoverTrigger>
+      <PopoverTrigger asChild>
+        <Button
+          variant="ghost"
+          className={cn(
+            'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+            'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+            'data-[state=open]:bg-accent',
+          )}
+        >
+          <Bot className="h-4 w-4" />
+          <span className={cn('size-2 rounded-full', getStatusDot(status))} />
+          <span className="max-w-32 truncate">
+            {selectedAgent?.name ?? 'Select agent'}
+          </span>
+          <ChevronDown className="h-3 w-3" />
+        </Button>
+      </PopoverTrigger>
      <PopoverContent side="bottom" align="start" className="w-72 p-0">
        <Command>
          <CommandInput placeholder="Search agents..." className="h-9" />
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ClawChat.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ClawChat.tsx
@@ -1,14 +1,12 @@
 import { Bot, Loader2, RefreshCw } from 'lucide-react'
-import { type FC, Fragment, useEffect, useRef } from 'react'
+import { type FC, useEffect, useRef } from 'react'
 import {
  Conversation,
  ConversationContent,
  ConversationScrollButton,
 } from '@/components/ai-elements/conversation'
 import type { AgentConversationTurn } from '@/lib/agent-conversations/types'
-import type { ProducedFilesRailGroup } from '@/lib/agent-files'
 import { cn } from '@/lib/utils'
-import { FileCardStrip } from './agent-conversation.file-card-strip'
 import { ClawChatMessage } from './ClawChatMessage'
 import { ConversationMessage } from './ConversationMessage'
 import type { ClawChatMessage as ClawChatMessageModel } from './claw-chat-types'
@@ -17,29 +15,6 @@ interface ClawChatProps {
  agentName: string
  historyMessages: ClawChatMessageModel[]
  turns: AgentConversationTurn[]
-  /**
-   * Persisted turns that still need to render their FileCardStrip
-   * because the history items they were filtered against don't
-   * carry produced-files data. Rendered between history and the
-   * live `turns` so the strip lands at the bottom of the
-   * corresponding assistant turn.
-   */
-  stripOnlyTurns?: AgentConversationTurn[]
-  /**
-   * Maps each assistant history message id → the produced-files
-   * group that came from its turn. Built by
-   * `mapHistoryToProducedFilesGroups` upstream so the strip
-   * renders directly under the matching message instead of
-   * stacking at the conversation tail.
-   */
-  filesByAssistantId?: Map<string, ProducedFilesRailGroup>
-  /**
-   * Produced-files groups that didn't match any persisted history
-   * pair (e.g. orphaned turns where history loaded after the
-   * group was attributed). Rendered at the conversation tail as
-   * a fallback so the user can still see them.
-   */
-  tailStripGroups?: ReadonlyArray<ProducedFilesRailGroup>
  streaming: boolean
  isInitialLoading: boolean
  error: Error | null
@@ -47,8 +22,6 @@ interface ClawChatProps {
  isFetchingNextPage: boolean
  onFetchNextPage: () => void
  onRetry: () => void
-  /** Wired through to the inline file-card strip on each assistant turn. */
-  onOpenOutputsRail?: ((turnId?: string | null) => void) | null
  className?: string
 }

@@ -105,9 +78,6 @@ export const ClawChat: FC<ClawChatProps> = ({
  agentName,
  historyMessages,
  turns,
-  stripOnlyTurns,
-  filesByAssistantId,
-  tailStripGroups,
  streaming,
  isInitialLoading,
  error,
@@ -115,7 +85,6 @@ export const ClawChat: FC<ClawChatProps> = ({
  isFetchingNextPage,
  onFetchNextPage,
  onRetry,
-  onOpenOutputsRail,
  className,
 }) => {
  const topSentinelRef = useRef<HTMLDivElement>(null)
@@ -178,44 +147,14 @@ export const ClawChat: FC<ClawChatProps> = ({
                  Start of conversation
                </div>
              ) : null}
-              {historyMessages.map((message) => {
-                const matched = filesByAssistantId?.get(message.id)
-                return (
-                  <Fragment key={message.id}>
-                    <ClawChatMessage message={message} />
-                    {matched ? (
-                      <FileCardStrip
-                        turnId={matched.turnId}
-                        files={matched.files}
-                        onOpenRail={onOpenOutputsRail ?? (() => {})}
-                      />
-                    ) : null}
-                  </Fragment>
-                )
-              })}
-              {(tailStripGroups ?? []).map((group) => (
-                <FileCardStrip
-                  key={`tail-strip-${group.turnId}`}
-                  turnId={group.turnId}
-                  files={group.files}
-                  onOpenRail={onOpenOutputsRail ?? (() => {})}
-                />
-              ))}
-              {(stripOnlyTurns ?? []).map((turn) => (
-                <ConversationMessage
-                  key={`strip-${turn.id}`}
-                  turn={turn}
-                  streaming={false}
-                  stripOnly
-                  onOpenOutputsRail={onOpenOutputsRail}
-                />
+              {historyMessages.map((message) => (
+                <ClawChatMessage key={message.id} message={message} />
              ))}
              {turns.map((turn, index) => (
                <ConversationMessage
                  key={turn.id}
                  turn={turn}
                  streaming={streaming && index === turns.length - 1}
-                  onOpenOutputsRail={onOpenOutputsRail}
                />
              ))}
              {error ? (
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationHeader.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationHeader.tsx
@@ -1,187 +0,0 @@
-import { ArrowLeft, Home } from 'lucide-react'
-import type { FC, ReactNode } from 'react'
-import { Badge } from '@/components/ui/badge'
-import { Button } from '@/components/ui/button'
-import { formatRelativeTime } from '@/entrypoints/app/agents/agent-display.helpers'
-import type { HarnessAgent } from '@/entrypoints/app/agents/agent-harness-types'
-import { AgentSummaryChips } from '@/entrypoints/app/agents/agent-row/AgentSummaryChips'
-import { formatTokens } from '@/entrypoints/app/agents/agent-row/agent-row.helpers'
-import type { AgentAdapterHealth } from '@/entrypoints/app/agents/agent-row/agent-row.types'
-import { PinToggle } from '@/entrypoints/app/agents/agent-row/PinToggle'
-import type { AgentLiveness } from '@/entrypoints/app/agents/LivenessDot'
-import { cn } from '@/lib/utils'
-
-interface ConversationHeaderProps {
-  agent: HarnessAgent | null
-  fallbackName: string
-  fallbackAdapter: 'claude' | 'codex' | 'openclaw' | 'hermes' | 'unknown'
-  adapterHealth: AgentAdapterHealth | null
-  backLabel: string
-  backTarget: 'home' | 'page'
-  onGoHome: () => void
-  onPinToggle: (next: boolean) => void
-  /** Optional trailing slot — currently used for the Outputs rail toggle. */
-  headerExtra?: ReactNode
-}
-
-/**
- * Strip above the chat. Mirrors the `/agents` row card's title row +
- * summary chips so the user gets adapter health, pin state, and status
- * at a glance — but adds the meta line (last used · lifetime tokens ·
- * queued) that's specific to this surface.
- *
- * The mobile `lg:hidden` Back button is preserved so the small-screen
- * collapse keeps a navigable header without a sidebar.
- */
-export const ConversationHeader: FC<ConversationHeaderProps> = ({
-  agent,
-  fallbackName,
-  fallbackAdapter,
-  adapterHealth,
-  backLabel,
-  backTarget,
-  onGoHome,
-  onPinToggle,
-  headerExtra,
-}) => {
-  const BackIcon = backTarget === 'home' ? Home : ArrowLeft
-  const adapter = agent?.adapter ?? fallbackAdapter
-  const status: AgentLiveness = agent?.status ?? 'unknown'
-  const lastUsedAt = agent?.lastUsedAt ?? null
-  const pinned = agent?.pinned ?? false
-  const queueCount = agent?.queue?.length ?? 0
-  const tokens = agent?.tokens ?? null
-  const lifetimeTotal = tokens
-    ? tokens.cumulative.input + tokens.cumulative.output
-    : 0
-
-  const metaParts: string[] = []
-  if (lastUsedAt !== null) metaParts.push(formatRelativeTime(lastUsedAt))
-  if (lifetimeTotal > 0) metaParts.push(`${formatTokens(lifetimeTotal)} tokens`)
-  if (queueCount > 0) {
-    metaParts.push(queueCount === 1 ? '1 queued' : `${queueCount} queued`)
-  }
-
-  return (
-    <div className="flex min-h-[60px] shrink-0 items-center justify-between gap-4 px-5 py-2.5">
-      <div className="flex min-w-0 items-center gap-3">
-        <Button
-          variant="ghost"
-          size="icon"
-          onClick={onGoHome}
-          className="size-8 shrink-0 rounded-xl lg:hidden"
-          title={backLabel}
-        >
-          <BackIcon className="size-4" />
-        </Button>
-        <div className="group min-w-0 flex-1">
-          <div className="flex items-center gap-2">
-            <span className="truncate font-semibold text-[15px] leading-6">
-              {agent?.name || fallbackName}
-            </span>
-            {agent ? (
-              <PinToggle pinned={pinned} onToggle={onPinToggle} />
-            ) : null}
-          </div>
-          <div className="mt-0.5 flex items-center gap-2">
-            <AgentSummaryChips
-              adapter={adapter}
-              modelLabel={agent?.modelId ?? null}
-              reasoningEffort={agent?.reasoningEffort ?? null}
-              adapterHealth={adapterHealth}
-            />
-          </div>
-        </div>
-      </div>
-      <div className="flex shrink-0 items-center gap-3">
-        <div className="flex shrink-0 flex-col items-end gap-1">
-          <StatusPill
-            status={status}
-            hasActiveTurn={Boolean(agent?.activeTurnId)}
-          />
-          <div className="flex h-4 items-center text-[11px] text-muted-foreground">
-            <span className="truncate">
-              {metaParts.length > 0 ? metaParts.join(' · ') : '\u00A0'}
-            </span>
-          </div>
-        </div>
-        {headerExtra ? (
-          <div className="flex shrink-0 items-center">{headerExtra}</div>
-        ) : null}
-      </div>
-    </div>
-  )
-}
-
-interface StatusPillProps {
-  status: AgentLiveness
-  hasActiveTurn: boolean
-}
-
-/**
- * Working / Asleep / Attention all get distinctive styling; idle keeps
- * the legacy emerald `Ready` pill so the default state is visually
- * calm. Defensive working: `idle + activeTurnId` falls through to the
- * working pill since the server says a turn is in flight.
- */
-const StatusPill: FC<StatusPillProps> = ({ status, hasActiveTurn }) => {
-  const effective: AgentLiveness =
-    status === 'idle' && hasActiveTurn ? 'working' : status
-
-  const base =
-    'inline-flex items-center gap-2 rounded-full border px-3 py-0.5 text-[11px] uppercase tracking-[0.18em]'
-
-  if (effective === 'working') {
-    return (
-      <Badge
-        variant="secondary"
-        className={cn(
-          base,
-          'border-amber-200 bg-amber-50 text-amber-900 hover:bg-amber-50',
-        )}
-      >
-        <span className="size-1.5 animate-pulse rounded-full bg-amber-500" />
-        Working
-      </Badge>
-    )
-  }
-  if (effective === 'asleep') {
-    return (
-      <Badge variant="outline" className={cn(base, 'text-muted-foreground')}>
-        <span className="size-1.5 rounded-full bg-muted-foreground/50" />
-        Asleep
-      </Badge>
-    )
-  }
-  if (effective === 'error') {
-    return (
-      <Badge
-        variant="destructive"
-        className={cn(base, 'border-destructive/30')}
-      >
-        <span className="size-1.5 rounded-full bg-destructive-foreground" />
-        Attention
-      </Badge>
-    )
-  }
-  if (effective === 'idle') {
-    return (
-      <Badge
-        variant="outline"
-        className={cn(
-          base,
-          'border-emerald-200 bg-emerald-50 text-emerald-900 hover:bg-emerald-50',
-        )}
-      >
-        <span className="size-1.5 rounded-full bg-emerald-500" />
-        Ready
-      </Badge>
-    )
-  }
-  return (
-    <Badge variant="outline" className={cn(base, 'text-muted-foreground')}>
-      <span className="size-1.5 rounded-full bg-muted-foreground/30" />
-      Setup
-    </Badge>
-  )
-}
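Review note: the removed `StatusPill` treated `idle` plus an active turn as `working` before choosing a label. The following is a minimal sketch of that rule as a pure function, for anyone re-implementing it elsewhere. The `Liveness` union, `effectiveLiveness`, and `pillLabel` are hypothetical names; the union is assumed from the values the removed code checks, and is not the real `AgentLiveness` type.

```typescript
// Hypothetical stand-in for AgentLiveness; the members are assumed from the
// values the removed component compares against.
type Liveness = 'idle' | 'working' | 'asleep' | 'error' | 'unknown'

// An idle agent with a turn in flight is reported as working,
// matching the "defensive working" rule in the removed comment.
function effectiveLiveness(status: Liveness, hasActiveTurn: boolean): Liveness {
  return status === 'idle' && hasActiveTurn ? 'working' : status
}

// Label text as it appeared in the removed pills; any other status
// falls back to "Setup".
function pillLabel(status: Liveness, hasActiveTurn: boolean): string {
  switch (effectiveLiveness(status, hasActiveTurn)) {
    case 'working':
      return 'Working'
    case 'asleep':
      return 'Asleep'
    case 'error':
      return 'Attention'
    case 'idle':
      return 'Ready'
    default:
      return 'Setup'
  }
}
```

Keeping the derivation separate from the JSX would let it be unit-tested without rendering; whether that is worth doing depends on where the status display ends up after this change.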
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationInput.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationInput.tsx
@@ -164,16 +164,7 @@ function VoiceButton({
  )
 }

-/**
- * Calm-composer footer shared by both `/home` (`variant="home"`) and
- * the chat surface at `/agents/:agentId` (`variant="conversation"`).
- * Pill-shaped chips on an internal dashed divider, with a right-
- * aligned keyboard hint. The agent selector is conditional via
- * `showAgentSelector`: home shows it as a filled pill on the left,
- * the chat surface hides it (the agent is locked once you're in the
- * conversation).
- */
-function CalmContextControls({
+function ContextControls({
  agents,
  onCreateAgent,
  onSelectAgent,
@@ -210,128 +201,110 @@ function CalmContextControls({
    )?.is_authenticated
  })

-  const showApps = supports(Feature.MANAGED_MCP_SUPPORT)
-  const showWorkspace = supports(Feature.WORKSPACE_FOLDER_SUPPORT)
-
  return (
-    <div className="mx-3 flex items-center gap-1 border-border/60 border-t border-dashed py-2">
-      {showAgentSelector ? (
-        <>
+    <div className="flex items-center justify-between border-border/40 border-t px-4 py-2.5">
+      <div className="flex items-center gap-1">
+        {showAgentSelector ? (
          <AgentSelector
            agents={agents}
            selectedAgentId={selectedAgentId}
            onSelectAgent={onSelectAgent}
            onCreateAgent={onCreateAgent}
            status={status}
-            triggerVariant="pill"
          />
-          <span
-            aria-hidden="true"
-            className="mx-1 inline-block h-3.5 w-px shrink-0 bg-border"
-          />
-        </>
-      ) : null}
-      {showWorkspace ? (
-        <WorkspaceSelector>
-          <button
-            type="button"
-            className="inline-flex h-6 items-center gap-1.5 rounded-full px-2.5 text-[11.5px] text-muted-foreground transition-colors hover:bg-accent hover:text-foreground data-[state=open]:bg-accent data-[state=open]:text-foreground"
-          >
-            <Folder className="size-3" />
-            <span>Workspace</span>
-            <span className="font-mono text-[10.5px] text-muted-foreground/70">
-              {selectedFolder?.name ?? 'none'}
-            </span>
-          </button>
-        </WorkspaceSelector>
-      ) : null}
-      <TabPickerPopover
-        variant="selector"
-        selectedTabs={selectedTabs}
-        onToggleTab={onToggleTab}
-      >
-        <button
-          type="button"
-          className={cn(
-            'inline-flex h-6 items-center gap-1.5 rounded-full px-2.5 text-[11.5px] transition-colors data-[state=open]:bg-accent data-[state=open]:text-foreground',
-            selectedTabs.length > 0
-              ? 'bg-[var(--accent-orange)] text-white hover:bg-[var(--accent-orange)]/90'
-              : 'text-muted-foreground hover:bg-accent hover:text-foreground',
-          )}
+        ) : null}
+        {supports(Feature.WORKSPACE_FOLDER_SUPPORT) ? (
+          <WorkspaceSelector>
+            <Button
+              variant="ghost"
+              className={cn(
+                'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+                'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+                'data-[state=open]:bg-accent',
+              )}
+            >
+              <Folder className="h-4 w-4" />
+              <span>{selectedFolder?.name || 'Add workspace'}</span>
+              <ChevronDown className="h-3 w-3" />
+            </Button>
+          </WorkspaceSelector>
+        ) : null}
+        <TabPickerPopover
+          variant="selector"
+          selectedTabs={selectedTabs}
+          onToggleTab={onToggleTab}
        >
-          <Layers className="size-3" />
-          <span>Tabs</span>
-          <span
+          <Button
            className={cn(
-              'font-mono text-[10.5px]',
+              'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
              selectedTabs.length > 0
-                ? 'text-white/80'
-                : 'text-muted-foreground/70',
+                ? 'bg-[var(--accent-orange)]! text-white shadow-sm'
+                : 'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+              'data-[state=open]:bg-accent',
            )}
          >
-            {selectedTabs.length}
-          </span>
-        </button>
-      </TabPickerPopover>
-      <button
-        type="button"
-        onClick={onAttachClick}
-        disabled={attachDisabled || !attachmentsEnabled}
-        title="Attach files"
-        className="inline-flex h-6 items-center gap-1.5 rounded-full px-2.5 text-[11.5px] text-muted-foreground transition-colors hover:bg-accent hover:text-foreground disabled:cursor-not-allowed disabled:opacity-50"
-      >
-        <Paperclip className="size-3" />
-        <span>Attach</span>
-      </button>
-      {showApps ? (
-        <AppSelector side="bottom">
-          <button
-            type="button"
-            className="inline-flex h-6 items-center gap-1.5 rounded-full px-2.5 text-[11.5px] text-muted-foreground transition-colors hover:bg-accent hover:text-foreground data-[state=open]:bg-accent data-[state=open]:text-foreground"
-          >
-            {connectedManagedServers.length > 0 ? (
-              <span className="flex items-center -space-x-1.5">
+            <Layers className="h-4 w-4" />
+            <span>Tabs</span>
+          </Button>
+        </TabPickerPopover>
+        <Button
+          type="button"
+          variant="ghost"
+          onClick={onAttachClick}
+          disabled={attachDisabled || !attachmentsEnabled}
+          title="Attach files"
+          className={cn(
+            'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+            'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+          )}
+        >
+          <Paperclip className="h-4 w-4" />
+          <span>Attach</span>
+        </Button>
+      </div>
+
+      {supports(Feature.MANAGED_MCP_SUPPORT) ? (
+        <div className="ml-auto flex items-center gap-1.5">
+          <AppSelector side="bottom">
+            <Button
+              variant="ghost"
+              className={cn(
+                'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+                'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+                'data-[state=open]:bg-accent',
+              )}
+            >
+              <div className="flex items-center -space-x-1.5">
                {connectedManagedServers.slice(0, 4).map((server) => (
-                  <span
+                  <div
                    key={server.id}
                    className="rounded-full ring-2 ring-card"
                  >
                    <McpServerIcon
                      serverName={server.managedServerName ?? ''}
-                      size={12}
+                      size={16}
                    />
-                  </span>
+                  </div>
                ))}
-              </span>
-            ) : (
-              <FileText className="size-3" />
-            )}
-            <span>Apps</span>
-            <ChevronDown className="size-3" />
-          </button>
-        </AppSelector>
+              </div>
+              {connectedManagedServers.length > 4 ? (
+                <span className="text-xs">
+                  +{connectedManagedServers.length - 4}
+                </span>
+              ) : null}
+              <span>Apps</span>
+              <ChevronDown className="h-3 w-3" />
+            </Button>
+          </AppSelector>
+        </div>
      ) : null}
-      <div className="ml-auto inline-flex shrink-0 items-center gap-1.5 text-[11px] text-muted-foreground/70">
-        <kbd className="inline-flex h-4 min-w-4 items-center justify-center rounded border border-border bg-accent/30 px-1 font-mono text-[10px] text-muted-foreground">
-          ↵
-        </kbd>
-        <span>to run</span>
-        <span className="text-muted-foreground/40">·</span>
-        <kbd className="inline-flex h-4 min-w-4 items-center justify-center rounded border border-border bg-accent/30 px-1 font-mono text-[10px] text-muted-foreground">
-          ⇧
-        </kbd>
-        <kbd className="inline-flex h-4 min-w-4 items-center justify-center rounded border border-border bg-accent/30 px-1 font-mono text-[10px] text-muted-foreground">
-          ↵
-        </kbd>
-        <span>new line</span>
-      </div>
    </div>
  )
 }

 function HomeShell({ children }: { children: ReactNode }) {
  return (
-    <div className="overflow-hidden rounded-[1.55rem] border border-border/60 bg-card/95 shadow-sm transition-[border-color,box-shadow] duration-150 focus-within:border-[var(--accent-orange)]/40 focus-within:shadow-[0_0_0_4px_color-mix(in_oklch,var(--accent-orange)_15%,transparent),0_1px_2px_rgba(15,23,42,0.04)]">
+    <div className="overflow-hidden rounded-[1.55rem] border border-border/60 bg-card/95 shadow-sm">
      {children}
    </div>
  )
@@ -339,7 +312,7 @@ function HomeShell({ children }: { children: ReactNode }) {

 function ConversationShell({ children }: { children: ReactNode }) {
  return (
-    <div className="overflow-hidden rounded-[1.35rem] border border-border/50 bg-background/95 shadow-[0_10px_30px_rgba(15,23,42,0.06)] backdrop-blur-md transition-[border-color,box-shadow] duration-150 focus-within:border-[var(--accent-orange)]/40 focus-within:shadow-[0_0_0_4px_color-mix(in_oklch,var(--accent-orange)_15%,transparent),0_10px_30px_rgba(15,23,42,0.06)]">
+    <div className="overflow-hidden rounded-[1.35rem] border border-border/50 bg-background/95 shadow-[0_10px_30px_rgba(15,23,42,0.06)] backdrop-blur-md">
      {children}
    </div>
  )
@@ -569,7 +542,7 @@ export const ConversationInput: FC<ConversationInputProps> = ({
              }
              disabled={disabled || voice.isTranscribing}
              className={cn(
-                'resize-none border-none bg-transparent px-0 text-[15px] shadow-none focus-visible:ring-0 dark:bg-transparent',
+                'resize-none border-none bg-transparent px-0 text-[15px] shadow-none focus-visible:ring-0',
                '[field-sizing:fixed]',
                variant === 'home'
                  ? 'min-h-[40px] py-2 leading-6'
@@ -610,7 +583,7 @@ export const ConversationInput: FC<ConversationInputProps> = ({
            {voice.error}
          </div>
        ) : null}
-        <CalmContextControls
+        <ContextControls
          agents={agents}
          onCreateAgent={onCreateAgent}
          onSelectAgent={onSelectAgent}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationMessage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationMessage.tsx
@@ -22,26 +22,10 @@ import type {
  AgentConversationTurn,
  ToolEntry,
 } from '@/lib/agent-conversations/types'
-import { FileCardStrip } from './agent-conversation.file-card-strip'

 interface ConversationMessageProps {
  turn: AgentConversationTurn
  streaming: boolean
-  /**
-   * Forwarded to the inline file-card strip's "View" / "+N"
-   * button. Wired up by AgentCommandConversation so the strip can
-   * deep-link straight into the Outputs rail at the matching turn
-   * group. `null` here disables the strip's deep-link affordance
-   * — the cards still open the preview Sheet directly.
-   */
-  onOpenOutputsRail?: ((turnId?: string | null) => void) | null
-  /**
-   * Render only the trailing FileCardStrip for this turn — used
-   * when the turn's user / assistant text is already rendered
-   * elsewhere (e.g. by `ClawChatMessage` from persisted history)
-   * but the produced-files affordance would otherwise be lost.
-   */
-  stripOnly?: boolean
 }

 interface RenderEntry {
@@ -104,22 +88,9 @@ function ToolStatusIcon({ status }: { status: ToolEntry['status'] }) {
 export const ConversationMessage: FC<ConversationMessageProps> = ({
  turn,
  streaming,
-  onOpenOutputsRail,
-  stripOnly,
 }) => {
  const entries = useMemo(() => buildRenderEntries(turn), [turn])

-  if (stripOnly) {
-    if (!turn.producedFiles || turn.producedFiles.length === 0) return null
-    return (
-      <FileCardStrip
-        turnId={turn.turnId ?? null}
-        files={turn.producedFiles}
-        onOpenRail={onOpenOutputsRail ?? (() => {})}
-      />
-    )
-  }
-
  return (
    <div className="space-y-3">
      <Message from="user">
@@ -214,14 +185,6 @@ export const ConversationMessage: FC<ConversationMessageProps> = ({
        </Message>
      )}

-      {turn.producedFiles && turn.producedFiles.length > 0 ? (
-        <FileCardStrip
-          turnId={turn.turnId ?? null}
-          files={turn.producedFiles}
-          onOpenRail={onOpenOutputsRail ?? (() => {})}
-        />
-      ) : null}
-
      {!turn.done && turn.parts.length === 0 && streaming && (
        <div className="flex gap-2">
          <div className="flex size-7 shrink-0 items-center justify-center rounded-full bg-[var(--accent-orange)] text-white">
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.artifact-card.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.artifact-card.tsx
@@ -1,124 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * @deprecated Replaced by `FileCardStrip` in
- * `agent-conversation.file-card-strip.tsx`. Kept temporarily so
- * any in-flight callers don't fail to import; remove in a
- * follow-up once nothing external references it.
- *
- * Compact "Files produced" card rendered under an assistant turn.
- */
-
-import { FileText, Image as ImageIcon, Paperclip } from 'lucide-react'
-import { type FC, useMemo, useState } from 'react'
-import { Button } from '@/components/ui/button'
-import { basenameOf, formatFileSize, inferFileKind } from '@/lib/agent-files'
-import { cn } from '@/lib/utils'
-import { FilePreviewSheet } from './agent-conversation.file-preview-sheet'
-
-export interface ProducedFileLike {
-  id: string
-  path: string
-  size: number
-}
-
-interface ArtifactCardProps {
-  files: ReadonlyArray<ProducedFileLike>
-  className?: string
-}
-
-const MAX_INLINE_ROWS = 4
-
-export const ArtifactCard: FC<ArtifactCardProps> = ({ files, className }) => {
-  const [openFileId, setOpenFileId] = useState<string | null>(null)
-  const [expanded, setExpanded] = useState(false)
-
-  const sortedFiles = useMemo(
-    () => [...files].sort((a, b) => a.path.localeCompare(b.path)),
-    [files],
-  )
-
-  if (sortedFiles.length === 0) return null
-
-  const visible = expanded ? sortedFiles : sortedFiles.slice(0, MAX_INLINE_ROWS)
-  const hiddenCount = sortedFiles.length - visible.length
-  const openFile = sortedFiles.find((file) => file.id === openFileId) ?? null
-
-  return (
-    <div
-      className={cn(
-        'rounded-xl border border-border/60 bg-card/50 px-3 py-2.5',
-        className,
-      )}
-    >
-      <div className="mb-2 flex items-center gap-2 text-muted-foreground text-xs">
-        <Paperclip className="size-3.5" />
-        <span className="font-medium text-foreground">
-          {sortedFiles.length === 1
-            ? '1 file produced'
-            : `${sortedFiles.length} files produced`}
-        </span>
-      </div>
-
-      <ul className="flex flex-col gap-1">
-        {visible.map((file) => (
-          <li key={file.id}>
-            <ArtifactRow file={file} onOpen={() => setOpenFileId(file.id)} />
-          </li>
-        ))}
-      </ul>
-
-      {hiddenCount > 0 ? (
-        <Button
-          type="button"
-          variant="ghost"
-          size="sm"
-          className="mt-1.5 h-7 px-2 text-xs"
-          onClick={() => setExpanded(true)}
-        >
-          Show {hiddenCount} more
-        </Button>
-      ) : null}
-
-      <FilePreviewSheet
-        fileId={openFile?.id ?? null}
-        filePath={openFile?.path ?? null}
-        open={Boolean(openFileId)}
-        onOpenChange={(next) => {
-          if (!next) setOpenFileId(null)
-        }}
-      />
-    </div>
-  )
-}
-
-function ArtifactRow({
-  file,
-  onOpen,
-}: {
-  file: ProducedFileLike
-  onOpen: () => void
-}) {
-  const name = basenameOf(file.path)
-  const kind = inferFileKind(file.path)
-  const Icon = kind === 'image' ? ImageIcon : FileText
-
-  return (
-    <button
-      type="button"
-      onClick={onOpen}
-      className={cn(
-        'flex w-full items-center gap-2 rounded-md px-2 py-1.5 text-left text-sm transition-colors',
-        'hover:bg-accent/60 focus:bg-accent/60 focus:outline-hidden',
-      )}
-    >
-      <Icon className="size-3.5 shrink-0 text-muted-foreground" />
-      <span className="min-w-0 flex-1 truncate font-medium">{name}</span>
-      <span className="shrink-0 text-muted-foreground text-xs tabular-nums">
-        {formatFileSize(file.size)}
-      </span>
-    </button>
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.file-card-strip.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.file-card-strip.tsx
@@ -1,163 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * "Files produced" strip rendered at the bottom of any assistant
- * turn that produced files (openclaw only). Replaces Phase 5.3's
- * row-list ArtifactCard with small horizontal cards for a lighter
- * visual treatment.
- *
- * Click semantics:
- *  - Card  → opens FilePreviewSheet directly (preview + download).
- *  - View  → emits onOpenRail(turnId); the parent opens the rail
- *            and scrolls to the matching turn group.
- *  - +N    → same as View (the user is asking to see what was
- *            overflowed).
- */
-
-import { ChevronRight, FileText, Image as ImageIcon } from 'lucide-react'
-import { type FC, useMemo, useState } from 'react'
-import { Button } from '@/components/ui/button'
-import { basenameOf, formatFileSize, inferFileKind } from '@/lib/agent-files'
-import { cn } from '@/lib/utils'
-import { FilePreviewSheet } from './agent-conversation.file-preview-sheet'
-
-export interface CardStripFile {
-  id: string
-  path: string
-  size: number
-}
-
-interface FileCardStripProps {
-  /**
-   * The turn id that produced these files. Forwarded to
-   * `onOpenRail` so the rail can scroll/expand the matching group.
-   * Optional because the live `produced_files` event lands before
-   * the harness has stamped a server-issued turn id on the
-   * optimistic turn — in that brief window, View falls back to
-   * just opening the rail at the top.
-   */
-  turnId?: string | null
-  files: ReadonlyArray<CardStripFile>
-  /** Caller wires this to `setOutputsRailOpen(true)` + deep-link. */
-  onOpenRail: (turnId?: string | null) => void
-  className?: string
-}
-
-const MAX_VISIBLE = 4
-
-export const FileCardStrip: FC<FileCardStripProps> = ({
-  turnId,
-  files,
-  onOpenRail,
-  className,
-}) => {
-  const [openFileId, setOpenFileId] = useState<string | null>(null)
-
-  const sortedFiles = useMemo(
-    () => [...files].sort((a, b) => a.path.localeCompare(b.path)),
-    [files],
-  )
-
-  if (sortedFiles.length === 0) return null
-
-  const visible = sortedFiles.slice(0, MAX_VISIBLE)
-  const hiddenCount = sortedFiles.length - visible.length
-  const openFile = sortedFiles.find((file) => file.id === openFileId) ?? null
-
-  return (
-    <div
-      className={cn(
-        'rounded-xl border border-border/60 bg-card/50 px-3 py-2.5',
-        className,
-      )}
-    >
-      <div className="mb-2 flex items-center gap-2">
-        <span className="text-muted-foreground text-xs">
-          {sortedFiles.length === 1
-            ? 'File produced'
-            : `Files produced (${sortedFiles.length})`}
-        </span>
-        <Button
-          type="button"
-          variant="ghost"
-          size="sm"
-          className="ml-auto h-7 gap-1 px-2 text-xs"
-          onClick={() => onOpenRail(turnId ?? null)}
-        >
-          View
-          <ChevronRight className="size-3" />
-        </Button>
-      </div>
-
-      <div className="flex flex-wrap gap-2">
-        {visible.map((file) => (
-          <FileCard
-            key={file.id}
-            file={file}
-            onOpen={() => setOpenFileId(file.id)}
-          />
-        ))}
-        {hiddenCount > 0 ? (
-          <button
-            type="button"
-            onClick={() => onOpenRail(turnId ?? null)}
-            className={cn(
-              'flex h-[56px] min-w-[56px] shrink-0 items-center justify-center rounded-lg border border-border/60 px-3 text-muted-foreground text-xs',
-              'transition-colors hover:border-border hover:bg-accent/40 hover:text-foreground',
-              'focus:outline-hidden focus-visible:ring-2 focus-visible:ring-[var(--accent-orange)]',
-            )}
-            title={`See ${hiddenCount} more in the Outputs rail`}
-          >
-            +{hiddenCount}
-          </button>
-        ) : null}
-      </div>
-
-      <FilePreviewSheet
-        fileId={openFile?.id ?? null}
-        filePath={openFile?.path ?? null}
-        open={Boolean(openFileId)}
-        onOpenChange={(next) => {
-          if (!next) setOpenFileId(null)
-        }}
-      />
-    </div>
-  )
-}
-
-function FileCard({
-  file,
-  onOpen,
-}: {
-  file: CardStripFile
-  onOpen: () => void
-}) {
-  const name = basenameOf(file.path)
-  const kind = inferFileKind(file.path)
-  const Icon = kind === 'image' ? ImageIcon : FileText
-
-  return (
-    <button
-      type="button"
-      onClick={onOpen}
-      title={file.path}
-      className={cn(
-        'flex h-[56px] w-[140px] shrink-0 flex-col justify-between rounded-lg border border-border/60 bg-background px-2.5 py-1.5 text-left',
-        'transition-colors hover:border-border hover:bg-accent/40',
-        'focus:outline-hidden focus-visible:ring-2 focus-visible:ring-[var(--accent-orange)]',
-      )}
-    >
-      <div className="flex min-w-0 items-center gap-1.5">
-        <Icon className="size-3.5 shrink-0 text-muted-foreground" />
-        <span className="min-w-0 flex-1 truncate font-medium text-xs">
-          {name}
-        </span>
-      </div>
-      <span className="text-[11px] text-muted-foreground tabular-nums">
-        {formatFileSize(file.size)}
-      </span>
-    </button>
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.file-preview-sheet.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.file-preview-sheet.tsx
@@ -1,283 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * Shared preview drawer used by the inline artifact card AND the
- * Outputs rail. Branches on the FilePreview discriminated union and
- * renders the appropriate body. Always opens via a controlled
- * `open`/`onOpenChange` pair so the parent owns the selected file.
- */
-
-import { Download, FileWarning, Loader2 } from 'lucide-react'
-import { type FC, useEffect, useMemo, useRef } from 'react'
-import { toast } from 'sonner'
-import { MessageResponse } from '@/components/ai-elements/message'
-import { Button } from '@/components/ui/button'
-import { ScrollArea } from '@/components/ui/scroll-area'
-import {
-  Sheet,
-  SheetContent,
-  SheetDescription,
-  SheetHeader,
-  SheetTitle,
-} from '@/components/ui/sheet'
-import { Skeleton } from '@/components/ui/skeleton'
-import {
-  basenameOf,
-  buildFileDownloadUrl,
-  extensionOf,
-  type FilePreview,
-  formatFileSize,
-  useFilePreview,
-} from '@/lib/agent-files'
-import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
-import { cn } from '@/lib/utils'
-
-interface FilePreviewSheetProps {
-  fileId: string | null
-  filePath: string | null
-  open: boolean
-  onOpenChange: (open: boolean) => void
-}
-
-const MARKDOWN_EXTENSIONS = new Set(['md', 'markdown', 'mdx'])
-
-export const FilePreviewSheet: FC<FilePreviewSheetProps> = ({
-  fileId,
-  filePath,
-  open,
-  onOpenChange,
-}) => {
-  const { baseUrl } = useAgentServerUrl()
-  const { preview, loading, error } = useFilePreview(fileId, open)
-
-  const fileName = filePath ? basenameOf(filePath) : 'File preview'
-  const downloadUrl = useMemo(() => {
-    if (!baseUrl || !fileId) return null
-    return buildFileDownloadUrl(baseUrl, fileId)
-  }, [baseUrl, fileId])
-
-  // Surface preview-load failures in a toast in addition to the
-  // inline error block — the inline UI lives at the bottom of the
-  // sheet and is easy to miss when scrolled into the body.
-  const lastToastedFileIdRef = useRef<string | null>(null)
-  useEffect(() => {
-    if (!open) {
-      lastToastedFileIdRef.current = null
-      return
-    }
-    if (!error || !fileId) return
-    if (lastToastedFileIdRef.current === fileId) return
-    lastToastedFileIdRef.current = fileId
-    toast.error('Could not load preview', { description: error.message })
-  }, [open, error, fileId])
-
-  const handleDownload = () => {
-    if (!downloadUrl) {
-      toast.error("Couldn't reach the agent server", {
-        description: 'Reconnect to BrowserOS and try again.',
-      })
-      return
-    }
-    // Manually trigger the download so any future failure (e.g. the
-    // server returns 404 because the file was removed) can be
-    // surfaced via toast — the bare <a download> path swallows
-    // these errors silently.
-    const link = document.createElement('a')
-    link.href = downloadUrl
-    link.download = fileName
-    link.rel = 'noopener'
-    document.body.appendChild(link)
-    link.click()
-    link.remove()
-  }
-
-  return (
-    <Sheet open={open} onOpenChange={onOpenChange}>
-      <SheetContent
-        side="right"
-        className="flex w-full flex-col gap-0 p-0 sm:max-w-xl"
-      >
-        <SheetHeader className="border-border/60 border-b px-5 py-4">
-          <SheetTitle className="truncate pr-8">{fileName}</SheetTitle>
-          <SheetDescription className="truncate">
-            {filePath ?? ''}
-          </SheetDescription>
-        </SheetHeader>
-
-        <ScrollArea className="min-h-0 flex-1">
-          <div className="px-5 py-4">
-            {loading ? (
-              <PreviewSkeleton />
-            ) : error ? (
-              <PreviewError message={error.message} />
-            ) : preview ? (
-              <PreviewBody
-                preview={preview}
-                filePath={filePath}
-                downloadUrl={downloadUrl}
-              />
-            ) : null}
-          </div>
-        </ScrollArea>
-
-        {fileId ? (
-          <div className="border-border/60 border-t bg-background/90 px-5 py-3 backdrop-blur">
-            <Button
-              type="button"
-              size="sm"
-              className="w-full gap-2"
-              onClick={handleDownload}
-            >
-              <Download className="size-3.5" />
-              Download
-            </Button>
-          </div>
-        ) : null}
-      </SheetContent>
-    </Sheet>
-  )
-}
-
-function PreviewSkeleton() {
-  return (
-    <div className="flex flex-col gap-2">
-      <div className="flex items-center gap-2 text-muted-foreground text-xs">
-        <Loader2 className="size-3.5 animate-spin" />
-        Loading preview...
-      </div>
-      <Skeleton className="h-4 w-3/4" />
-      <Skeleton className="h-4 w-full" />
-      <Skeleton className="h-4 w-5/6" />
-      <Skeleton className="h-4 w-2/3" />
-    </div>
-  )
-}
-
-function PreviewError({ message }: { message: string }) {
-  return (
-    <div className="flex flex-col items-start gap-2 rounded-lg border border-destructive/30 bg-destructive/5 px-3 py-2 text-destructive text-sm">
-      <div className="flex items-center gap-2 font-medium">
-        <FileWarning className="size-4" />
-        Could not load preview
-      </div>
-      <p className="text-destructive/80 text-xs">{message}</p>
-    </div>
-  )
-}
-
-function PreviewBody({
-  preview,
-  filePath,
-  downloadUrl,
-}: {
-  preview: FilePreview
-  filePath: string | null
-  downloadUrl: string | null
-}) {
-  if (preview.kind === 'missing') {
-    return (
-      <div className="rounded-lg border border-border/60 bg-muted/40 px-4 py-6 text-center text-muted-foreground text-sm">
-        This file is no longer in the workspace. The agent may have moved or
-        deleted it after the turn finished.
-      </div>
-    )
-  }
-
-  if (preview.kind === 'image') {
-    return (
-      <div className="flex flex-col gap-3">
-        <PreviewMeta preview={preview} />
-        <div className="overflow-hidden rounded-lg border border-border/60 bg-muted/30">
-          <img
-            src={preview.dataUrl}
-            alt={filePath ?? 'preview'}
-            className="block max-h-[60vh] w-full object-contain"
-          />
-        </div>
-      </div>
-    )
-  }
-
-  if (preview.kind === 'pdf') {
-    return (
-      <div className="flex flex-col gap-3">
-        <PreviewMeta preview={preview} />
-        <div className="rounded-lg border border-border/60 bg-muted/40 px-4 py-6 text-center text-muted-foreground text-sm">
-          PDF previews aren't supported inline yet. Use Download to open this
-          file in your default PDF viewer.
-        </div>
-      </div>
-    )
-  }
-
-  if (preview.kind === 'binary') {
-    return (
-      <div className="flex flex-col gap-3">
-        <PreviewMeta preview={preview} />
-        <div className="rounded-lg border border-border/60 bg-muted/40 px-4 py-6 text-center text-muted-foreground text-sm">
-          No inline preview for this file type.
-          {downloadUrl ? ' Use Download to save it locally.' : null}
-        </div>
-      </div>
-    )
-  }
-
-  return <TextPreviewBody preview={preview} filePath={filePath} />
-}
-
-function TextPreviewBody({
-  preview,
-  filePath,
-}: {
-  preview: Extract<FilePreview, { kind: 'text' }>
-  filePath: string | null
-}) {
-  const ext = filePath ? extensionOf(filePath).toLowerCase() : ''
-  const renderAsMarkdown = MARKDOWN_EXTENSIONS.has(ext)
-
-  return (
-    <div className="flex flex-col gap-3">
-      <PreviewMeta preview={preview} />
-      {renderAsMarkdown ? (
-        <div
-          className={cn(
-            'prose prose-sm dark:prose-invert max-w-none break-words rounded-lg border border-border/60 bg-muted/30 px-4 py-3',
-            "[&_[data-streamdown='code-block']]:!w-full [&_[data-streamdown='code-block']]:overflow-x-auto",
-          )}
-        >
-          <MessageResponse mode="static" parseIncompleteMarkdown={false}>
-            {preview.snippet}
-          </MessageResponse>
-        </div>
-      ) : (
-        <pre className="overflow-x-auto rounded-lg border border-border/60 bg-muted/30 px-3 py-2 text-xs leading-relaxed">
-          <code className="font-mono text-foreground">{preview.snippet}</code>
-        </pre>
-      )}
-      {preview.truncated ? (
-        <div className="text-muted-foreground text-xs">
-          Showing the first part of this file. Download to see the full
-          contents.
-        </div>
-      ) : null}
-    </div>
-  )
-}
-
-function PreviewMeta({
-  preview,
-}: {
-  preview: Exclude<FilePreview, { kind: 'missing' }>
-}) {
-  return (
-    <div className="flex flex-wrap items-center gap-x-3 gap-y-1 text-muted-foreground text-xs">
-      <span className="font-medium text-foreground">
-        {formatFileSize(preview.size)}
-      </span>
-      <span>·</span>
-      <span className="font-mono">{preview.mimeType || 'unknown'}</span>
-    </div>
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.outputs-rail.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-conversation.outputs-rail.tsx
@@ -1,338 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * Per-agent right-side "Outputs" panel. Lists every file the harness
- * has attributed to this agent, grouped by the turn that produced
- * them. Click a row to open the shared preview Sheet.
- *
- * Lifecycle:
- *  - Open/closed state is controlled by the parent and persisted via
- *    `useOutputsRailOpen(agentId)` so each agent remembers its
- *    preference independently.
- *  - Data refreshes whenever a turn finishes (the conversation hook
- *    fires `useInvalidateAgentOutputs` from its finally block).
- *  - Manual "Refresh" button is wired to `useRefreshAgentOutputs`
- *    for users who navigate in mid-turn.
- */
-
-import {
-  ChevronDown,
-  ChevronRight,
-  FileText,
-  Image as ImageIcon,
-  Inbox,
-  Loader2,
-  PanelRightClose,
-  RefreshCw,
-} from 'lucide-react'
-import { type FC, useEffect, useMemo, useRef, useState } from 'react'
-import { toast } from 'sonner'
-import { Button } from '@/components/ui/button'
-import {
-  Collapsible,
-  CollapsibleContent,
-  CollapsibleTrigger,
-} from '@/components/ui/collapsible'
-import { ScrollArea } from '@/components/ui/scroll-area'
-import { Skeleton } from '@/components/ui/skeleton'
-import {
-  basenameOf,
-  formatFileSize,
-  inferFileKind,
-  type ProducedFilesRailGroup,
-  useAgentOutputs,
-  useRefreshAgentOutputs,
-} from '@/lib/agent-files'
-import { cn } from '@/lib/utils'
-import { FilePreviewSheet } from './agent-conversation.file-preview-sheet'
-
-interface OutputsRailProps {
-  agentId: string
-  onClose: () => void
-  /**
-   * When set, the rail scrolls the matching `RailTurnGroup` into
-   * view and force-opens its `Collapsible`. Used by the inline
-   * file-card strip's "View" / "+N" deep-link path. Cleared by
-   * the parent (via `onFocusTurnConsumed`) once the rail has
-   * acknowledged the deep-link so subsequent renders don't keep
-   * re-scrolling the same group.
-   */
-  focusTurnId?: string | null
-  onFocusTurnConsumed?: () => void
-}
-
-const RAIL_LOCAL_STORAGE_PREFIX = 'browseros:outputs-rail:'
-
-/**
- * Controlled open/close state with per-agent localStorage memory.
- * Returns a tuple compatible with React's useState shape so the
- * parent can pass it straight into the rail without an extra effect.
- */
-export function useOutputsRailOpen(
-  agentId: string,
-): [boolean, (next: boolean) => void] {
-  const [open, setOpen] = useState(false)
-
-  useEffect(() => {
-    if (typeof window === 'undefined' || !agentId) return
-    try {
-      const stored = window.localStorage.getItem(
-        `${RAIL_LOCAL_STORAGE_PREFIX}${agentId}`,
-      )
-      setOpen(stored === '1')
-    } catch {
-      // localStorage may be unavailable (private mode, locked-down
-      // contexts) — fall back to closed.
-    }
-  }, [agentId])
-
-  const update = (next: boolean) => {
-    setOpen(next)
-    if (typeof window === 'undefined' || !agentId) return
-    try {
-      window.localStorage.setItem(
-        `${RAIL_LOCAL_STORAGE_PREFIX}${agentId}`,
-        next ? '1' : '0',
-      )
-    } catch {
-      // Best-effort persistence.
-    }
-  }
-
-  return [open, update]
-}
-
-export const OutputsRail: FC<OutputsRailProps> = ({
-  agentId,
-  onClose,
-  focusTurnId,
-  onFocusTurnConsumed,
-}) => {
-  const { groups, loading, error } = useAgentOutputs(agentId)
-  const refresh = useRefreshAgentOutputs(agentId)
-
-  const [openFile, setOpenFile] = useState<{
-    id: string
-    path: string
-  } | null>(null)
-
-  const totalFiles = useMemo(
-    () => groups.reduce((sum, group) => sum + group.files.length, 0),
-    [groups],
-  )
-
-  return (
-    <aside className="flex h-full min-h-0 w-full flex-col border-border/50 border-l bg-background">
-      <header className="flex shrink-0 items-center gap-2 border-border/50 border-b px-3 py-3">
-        <span className="font-semibold text-[13px] uppercase tracking-wide">
-          Outputs
-        </span>
-        {totalFiles > 0 ? (
-          <span className="text-muted-foreground text-xs tabular-nums">
-            {totalFiles}
-          </span>
-        ) : null}
-        <div className="ml-auto flex items-center gap-1">
-          <Button
-            type="button"
-            variant="ghost"
-            size="icon"
-            className="size-7"
-            onClick={() =>
-              refresh.mutate(undefined, {
-                onError: (err) =>
-                  toast.error('Refresh failed', {
-                    description:
-                      err instanceof Error ? err.message : String(err),
-                  }),
-              })
-            }
-            disabled={refresh.isPending}
-            title="Refresh"
-          >
-            {refresh.isPending ? (
-              <Loader2 className="size-3.5 animate-spin" />
-            ) : (
-              <RefreshCw className="size-3.5" />
-            )}
-          </Button>
-          <Button
-            type="button"
-            variant="ghost"
-            size="icon"
-            className="size-7"
-            onClick={onClose}
-            title="Hide outputs"
-          >
-            <PanelRightClose className="size-3.5" />
-          </Button>
-        </div>
-      </header>
-
-      <ScrollArea className="min-h-0 flex-1">
-        <div className="px-2 py-2">
-          {loading && groups.length === 0 ? (
-            <RailSkeleton />
-          ) : error ? (
-            <RailError message={error.message} />
-          ) : groups.length === 0 ? (
-            <RailEmpty />
-          ) : (
-            <ul className="flex flex-col gap-2">
-              {groups.map((group) => (
-                <li key={group.turnId}>
-                  <RailTurnGroup
-                    group={group}
-                    focused={
-                      Boolean(focusTurnId) && focusTurnId === group.turnId
-                    }
-                    onFocusConsumed={onFocusTurnConsumed}
-                    onOpenFile={(file) =>
-                      setOpenFile({ id: file.id, path: file.path })
-                    }
-                  />
-                </li>
-              ))}
-            </ul>
-          )}
-        </div>
-      </ScrollArea>
-
-      <FilePreviewSheet
-        fileId={openFile?.id ?? null}
-        filePath={openFile?.path ?? null}
-        open={Boolean(openFile)}
-        onOpenChange={(next) => {
-          if (!next) setOpenFile(null)
-        }}
-      />
-    </aside>
-  )
-}
-
-function RailTurnGroup({
-  group,
-  focused,
-  onFocusConsumed,
-  onOpenFile,
-}: {
-  group: ProducedFilesRailGroup
-  focused: boolean
-  onFocusConsumed?: () => void
-  onOpenFile: (file: { id: string; path: string }) => void
-}) {
-  const [open, setOpen] = useState(true)
-  const headerLabel = group.turnPrompt.trim() || 'Turn'
-  const containerRef = useRef<HTMLDivElement>(null)
-
-  // Deep-link consumption: when the parent passes `focused=true`,
-  // expand the collapsible (in case the user had collapsed it
-  // earlier) and scroll into view. Fire `onFocusConsumed` so the
-  // parent can drop the URL param and we don't re-scroll on every
-  // render after that.
-  useEffect(() => {
-    if (!focused) return
-    setOpen(true)
-    containerRef.current?.scrollIntoView({
-      behavior: 'smooth',
-      block: 'nearest',
-    })
-    onFocusConsumed?.()
-  }, [focused, onFocusConsumed])
-
-  return (
-    <div ref={containerRef}>
-      <Collapsible open={open} onOpenChange={setOpen}>
-        <CollapsibleTrigger
-          className={cn(
-            'flex w-full items-center gap-1.5 rounded-md px-1.5 py-1 text-left text-muted-foreground text-xs',
-            'transition-colors hover:bg-accent/40 hover:text-foreground',
-          )}
-        >
-          {open ? (
-            <ChevronDown className="size-3 shrink-0" />
-          ) : (
-            <ChevronRight className="size-3 shrink-0" />
-          )}
-          <span className="min-w-0 flex-1 truncate font-medium">
-            {headerLabel}
-          </span>
-          <span className="shrink-0 tabular-nums">{group.files.length}</span>
-        </CollapsibleTrigger>
-        <CollapsibleContent>
-          <ul className="mt-1 ml-1 flex flex-col gap-0.5 border-border/40 border-l pl-2">
-            {group.files.map((file) => (
-              <li key={file.id}>
-                <RailFileRow file={file} onOpen={() => onOpenFile(file)} />
-              </li>
-            ))}
-          </ul>
-        </CollapsibleContent>
-      </Collapsible>
-    </div>
-  )
-}
-
-function RailFileRow({
-  file,
-  onOpen,
-}: {
-  file: ProducedFilesRailGroup['files'][number]
-  onOpen: () => void
-}) {
-  const name = basenameOf(file.path)
-  const kind = inferFileKind(file.path)
-  const Icon = kind === 'image' ? ImageIcon : FileText
-
-  return (
-    <button
-      type="button"
-      onClick={onOpen}
-      className={cn(
-        'flex w-full items-center gap-2 rounded-md px-1.5 py-1 text-left text-xs transition-colors',
-        'hover:bg-accent/60 focus:bg-accent/60 focus:outline-hidden',
-      )}
-      title={file.path}
-    >
-      <Icon className="size-3 shrink-0 text-muted-foreground" />
-      <span className="min-w-0 flex-1 truncate">{name}</span>
-      <span className="shrink-0 text-muted-foreground tabular-nums">
-        {formatFileSize(file.size)}
-      </span>
-    </button>
-  )
-}
-
-function RailSkeleton() {
-  return (
-    <div className="flex flex-col gap-2 px-1.5 py-1">
-      <Skeleton className="h-4 w-1/2" />
-      <Skeleton className="h-4 w-3/4" />
-      <Skeleton className="h-4 w-2/3" />
-      <Skeleton className="h-4 w-5/6" />
-    </div>
-  )
-}
-
-function RailEmpty() {
-  return (
-    <div className="mx-2 my-3 flex flex-col items-center gap-1.5 rounded-lg border border-border/60 border-dashed bg-muted/20 px-3 py-6 text-center text-muted-foreground text-xs">
-      <Inbox className="size-4" />
-      <p className="font-medium">No outputs yet</p>
-      <p className="text-[11px] text-muted-foreground/70 leading-snug">
-        Files this agent creates will appear here, grouped by the turn that made
-        them.
-      </p>
-    </div>
-  )
-}
-
-function RailError({ message }: { message: string }) {
-  return (
-    <div className="mx-2 my-3 rounded-lg border border-destructive/30 bg-destructive/5 px-3 py-2 text-destructive text-xs">
-      {message}
-    </div>
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/claw-chat-types.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/claw-chat-types.ts
@@ -1,6 +1,5 @@
 import type { OpenClawChatHistoryMessage } from '@/entrypoints/app/agents/useOpenClaw'
 import type { AgentConversationTurn } from '@/lib/agent-conversations/types'
-import type { ProducedFilesRailGroup } from '@/lib/agent-files'

 export type ClawChatRole = 'user' | 'assistant'

@@ -235,30 +234,6 @@ export function filterTurnsPersistedInHistory(
  )
 }

-/**
- * Persisted turns that still carry `producedFiles` — once history
- * reloads, the assistant text is rendered by `ClawChatMessage` and
- * the optimistic turn is filtered out by
- * `filterTurnsPersistedInHistory`. The historical message has no
- * `producedFiles` field (history items don't carry that), so the
- * inline file-card strip would vanish on history reload.
- *
- * Returning these here lets the caller render a strip-only entry
- * after the corresponding history bubble — full message stays as
- * the persisted history pair, but the produced-files affordance
- * survives.
- */
-export function selectStripOnlyTurns(
-  turns: AgentConversationTurn[],
-  historyMessages: ClawChatMessage[],
-): AgentConversationTurn[] {
-  return turns.filter(
-    (turn) =>
-      Boolean(turn.producedFiles && turn.producedFiles.length > 0) &&
-      isTurnPersistedInHistory(turn, historyMessages),
-  )
-}
-
 function isTurnPersistedInHistory(
  turn: AgentConversationTurn,
  historyMessages: ClawChatMessage[],
@@ -310,59 +285,3 @@ function getClawMessageText(message: ClawChatMessage): string {
    .join('')
    .trim()
 }
-
-function firstNonBlankLine(value: string): string {
-  for (const raw of value.split('\n')) {
-    const trimmed = raw.trim()
-    if (trimmed) return trimmed
-  }
-  return ''
-}
-
-/**
- * Map each assistant history message to the produced-files group
- * that came from its turn. Match key is `group.turnPrompt` (first
- * non-blank line of the user prompt that initiated the turn) vs.
- * the first non-blank line of the user message that immediately
- * preceded this assistant message — the same shape the server
- * emits when storing turnPrompt.
- *
- * Walks history forward (oldest-first per `flattenHistoryPages`)
- * and consumes groups in chronological order. A group can only
- * match once — if two turns share the same prompt the earlier
- * one wins, and the later assistant message stays unassociated
- * (those land back in `tailStripGroups` at the conversation tail).
- */
-export function mapHistoryToProducedFilesGroups(
-  historyMessages: ClawChatMessage[],
-  groups: ReadonlyArray<ProducedFilesRailGroup>,
-): {
-  byAssistantMessageId: Map<string, ProducedFilesRailGroup>
-  unmatched: ProducedFilesRailGroup[]
-} {
-  const byAssistantMessageId = new Map<string, ProducedFilesRailGroup>()
-  if (groups.length === 0) {
-    return { byAssistantMessageId, unmatched: [] }
-  }
-  // Oldest-first so the iteration order matches history.
-  const remaining = [...groups].sort((a, b) => a.createdAt - b.createdAt)
-
-  let pendingPrompt: string | null = null
-  for (const message of historyMessages) {
-    if (message.role === 'user') {
-      pendingPrompt = firstNonBlankLine(getClawMessageText(message))
-      continue
-    }
-    if (message.role !== 'assistant' || !pendingPrompt) continue
-    const matchIndex = remaining.findIndex(
-      (group) => group.turnPrompt === pendingPrompt,
-    )
-    if (matchIndex >= 0) {
-      const [match] = remaining.splice(matchIndex, 1)
-      byAssistantMessageId.set(message.id, match)
-    }
-    pendingPrompt = null
-  }
-
-  return { byAssistantMessageId, unmatched: remaining }
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/pending-initial-message.test.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/pending-initial-message.test.ts
@@ -1,109 +0,0 @@
-import { afterEach, describe, expect, it } from 'bun:test'
-import type { StagedAttachment } from '@/lib/attachments'
-import {
-  consumePendingInitialMessage,
-  peekPendingInitialMessage,
-  setPendingInitialMessage,
-} from './pending-initial-message'
-
-function makeAttachment(id: string): StagedAttachment {
-  return {
-    id,
-    kind: 'image',
-    mediaType: 'image/png',
-    name: `${id}.png`,
-    dataUrl: `data:image/png;base64,${id}`,
-    payload: {
-      kind: 'image',
-      mediaType: 'image/png',
-      name: `${id}.png`,
-      dataUrl: `data:image/png;base64,${id}`,
-    },
-  }
-}
-
-afterEach(() => {
-  // Drain any leftover pending entry so tests don't leak into each
-  // other (the module-scope state survives across `it` blocks).
-  consumePendingInitialMessage('drain')
-  // If still set, clear by consuming with the matching id.
-  const leftover = peekPendingInitialMessage()
-  if (leftover) consumePendingInitialMessage(leftover.agentId)
-})
-
-describe('pending-initial-message', () => {
-  it('consume returns the payload set for the same agentId', () => {
-    setPendingInitialMessage({
-      agentId: 'agent-a',
-      text: 'hello',
-      attachments: [makeAttachment('one')],
-      createdAt: Date.now(),
-    })
-    const result = consumePendingInitialMessage('agent-a')
-    expect(result?.text).toBe('hello')
-    expect(result?.attachments).toHaveLength(1)
-    expect(result?.attachments[0]?.id).toBe('one')
-  })
-
-  it('consume is destructive — second call returns null', () => {
-    setPendingInitialMessage({
-      agentId: 'agent-a',
-      text: 'hello',
-      attachments: [],
-      createdAt: Date.now(),
-    })
-    expect(consumePendingInitialMessage('agent-a')).not.toBeNull()
-    expect(consumePendingInitialMessage('agent-a')).toBeNull()
-  })
-
-  it('consume returns null and preserves entry when agentId differs', () => {
-    setPendingInitialMessage({
-      agentId: 'agent-a',
-      text: 'hello',
-      attachments: [],
-      createdAt: Date.now(),
-    })
-    expect(consumePendingInitialMessage('agent-b')).toBeNull()
-    expect(peekPendingInitialMessage()?.agentId).toBe('agent-a')
-    expect(consumePendingInitialMessage('agent-a')).not.toBeNull()
-  })
-
-  it('returns null for entries older than the TTL', () => {
-    setPendingInitialMessage({
-      agentId: 'agent-a',
-      text: 'old',
-      attachments: [],
-      createdAt: Date.now() - 11_000, // older than 10 s TTL
-    })
-    expect(consumePendingInitialMessage('agent-a')).toBeNull()
-  })
-
-  it('replaces a previous pending entry when set is called again', () => {
-    setPendingInitialMessage({
-      agentId: 'agent-a',
-      text: 'first',
-      attachments: [],
-      createdAt: Date.now(),
-    })
-    setPendingInitialMessage({
-      agentId: 'agent-b',
-      text: 'second',
-      attachments: [makeAttachment('two')],
-      createdAt: Date.now(),
-    })
-    expect(consumePendingInitialMessage('agent-a')).toBeNull()
-    const result = consumePendingInitialMessage('agent-b')
-    expect(result?.text).toBe('second')
-    expect(result?.attachments[0]?.id).toBe('two')
-  })
-
-  it('no-ops when set is called with empty agentId', () => {
-    setPendingInitialMessage({
-      agentId: '',
-      text: 'oops',
-      attachments: [],
-      createdAt: Date.now(),
-    })
-    expect(peekPendingInitialMessage()).toBeNull()
-  })
-})
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/pending-initial-message.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/pending-initial-message.ts
@@ -1,81 +0,0 @@
-import type { StagedAttachment } from '@/lib/attachments'
-
-/**
- * Same-tab in-memory handoff between the `/home` composer and the
- * chat screen at `/home/agents/:agentId`. URL search params (`?q=`)
- * carry the text fine, but cannot carry binary attachments — a multi-
- * megabyte image dataUrl would explode URL length limits and round-
- * trip badly. This module is the rich-data side channel for the same
- * navigation: the composer writes here, the chat screen reads here on
- * mount.
- *
- * Intentionally module-scope. Same render tree, same tab — no need
- * for sessionStorage (which would force JSON-serialising the dataUrls
- * and re-parsing on the read side). Cross-tab handoff is out of
- * scope: the user typing at home in tab A and switching to tab B's
- * chat would surface an empty registry there, which is the correct
- * behaviour.
- */
-
-export interface PendingInitialMessage {
-  agentId: string
-  text: string
-  attachments: StagedAttachment[]
-  createdAt: number
-}
-
-/**
- * 10s TTL on the entry. A stale entry from a back-button journey
- * shouldn't fire on a future visit; if real-world latency makes 10s
- * too tight under slow harness boot, bump but never make it
- * indefinite.
- */
-const PENDING_TTL_MS = 10_000
-
-let pending: PendingInitialMessage | null = null
-let pendingTimer: ReturnType<typeof setTimeout> | null = null
-
-function clearPending(): void {
-  pending = null
-  if (pendingTimer !== null) {
-    clearTimeout(pendingTimer)
-    pendingTimer = null
-  }
-}
-
-export function setPendingInitialMessage(payload: PendingInitialMessage): void {
-  // Defensive: the home composer should never call this without an
-  // agent selected. If it somehow does, no-op rather than holding a
-  // payload we can't route.
-  if (!payload.agentId) return
-  clearPending()
-  pending = payload
-  pendingTimer = setTimeout(clearPending, PENDING_TTL_MS)
-}
-
-/**
- * Destructive read. Returns the entry only if `agentId` matches and
- * the entry is fresh; clears the entry on success so Strict-Mode
- * double-invokes can't double-send.
- */
-export function consumePendingInitialMessage(
-  agentId: string,
-): PendingInitialMessage | null {
-  if (!pending) return null
-  if (pending.agentId !== agentId) return null
-  if (Date.now() - pending.createdAt >= PENDING_TTL_MS) {
-    clearPending()
-    return null
-  }
-  const entry = pending
-  clearPending()
-  return entry
-}
-
-/**
- * Non-mutating read for tests. Production code should never need this
- * — use `consume` and own the lifecycle.
- */
-export function peekPendingInitialMessage(): PendingInitialMessage | null {
-  return pending
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentConversation.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentConversation.ts
@@ -10,11 +10,9 @@ import type { OpenClawChatHistoryMessage } from '@/entrypoints/app/agents/useOpe
 import type {
  AgentConversationTurn,
  AssistantPart,
-  ConversationTurnFile,
  ToolEntry,
  UserAttachmentPreview,
 } from '@/lib/agent-conversations/types'
-import { useInvalidateAgentOutputs } from '@/lib/agent-files'
 import type { ServerAttachmentPayload } from '@/lib/attachments'
 import { consumeSSEStream } from '@/lib/sse'
 import { buildToolLabel } from '@/lib/tool-labels'
@@ -55,12 +53,6 @@ export function useAgentConversation(
 ) {
  const [turns, setTurns] = useState<AgentConversationTurn[]>([])
  const [streaming, setStreaming] = useState(false)
-  const invalidateAgentOutputs = useInvalidateAgentOutputs()
-  // Stable ref so the resume effect doesn't re-subscribe on every
-  // render (the hook's returned callable is freshly closured each
-  // time, but the underlying queryClient is stable).
-  const invalidateAgentOutputsRef = useRef(invalidateAgentOutputs)
-  invalidateAgentOutputsRef.current = invalidateAgentOutputs
  const sessionKeyRef = useRef(options.sessionKey ?? '')
  const historyRef = useRef<OpenClawChatHistoryMessage[]>(options.history ?? [])
  const textAccRef = useRef('')
@@ -160,17 +152,6 @@ export function useAgentConversation(
    })
  }

-  const setProducedFilesOnCurrentTurn = (files: ConversationTurnFile[]) => {
-    setTurns((prev) => {
-      const last = prev[prev.length - 1]
-      if (!last) return prev
-      // Replace, don't merge: the server's diff is authoritative for
-      // the just-completed turn — duplicate events shouldn't grow the
-      // list, and a re-attribution should overwrite an earlier one.
-      return [...prev.slice(0, -1), { ...last, producedFiles: files }]
-    })
-  }
-
  const upsertAgentHarnessTool = (event: AgentHarnessStreamEvent) => {
    if (event.type !== 'tool_call') return
    const rawName = event.title || event.rawType || 'tool call'
@@ -227,9 +208,6 @@ export function useAgentConversation(
      case 'tool_call':
        upsertAgentHarnessTool(event)
        break
-      case 'produced_files':
-        setProducedFilesOnCurrentTurn(event.files)
-        break
      case 'done':
        markCurrentTurnDone()
        break
@@ -281,7 +259,6 @@ export function useAgentConversation(
          ...prev,
          {
            id: crypto.randomUUID(),
-            turnId: active.turnId,
            userText: active.prompt ?? '',
            parts: [],
            done: false,
@@ -327,14 +304,9 @@ export function useAgentConversation(
        // When `cancelled` is true the next run will set these
        // itself, so resetting here would only cause a brief flicker.
        if (!cancelled && weStartedStream) {
-          const finishedTurnId = turnIdRef.current
          turnIdRef.current = null
          lastSeqRef.current = null
          setStreaming(false)
-          void invalidateAgentOutputsRef.current(
-            agentId,
-            finishedTurnId ?? undefined,
-          )
        }
      }
    }
@@ -346,60 +318,6 @@ export function useAgentConversation(
    }
  }, [agentId, activeTurnIdDep])

-  /**
-   * Send the chat request and follow the 409-active-turn redirect
-   * once. Pulled out of `send` to keep its cognitive complexity in
-   * check — the retry adds a branch that biome counts heavily.
-   */
-  const openSendStream = async (
-    targetAgentId: string,
-    text: string,
-    attachments: ServerAttachmentPayload[],
-    signal: AbortSignal,
-  ): Promise<Response> => {
-    const initial = await chatWithHarnessAgent(
-      targetAgentId,
-      text,
-      signal,
-      attachments,
-    )
-    if (initial.status !== 409) return initial
-    // 409 means the server already has an active turn for this agent
-    // (a previous tab kicked one off and we're a fresh mount that
-    // missed the resume window). Attach to it instead of double-sending.
-    const body = (await initial.json()) as { turnId?: string }
-    if (!body.turnId) return initial
-    return attachToHarnessTurn(targetAgentId, {
-      turnId: body.turnId,
-      signal,
-    })
-  }
-
-  /**
-   * Pull session-key / turn-id off response headers and propagate to
-   * refs + the optimistic turn. Stamping `turnId` here lets the
-   * inline artifact card fall back to /files/turn/<id> on a resumed
-   * mount that missed the live `produced_files` event.
-   */
-  const applyResponseHeadersToTurn = (response: Response) => {
-    const responseSessionKey =
-      response.headers.get('X-Session-Key') ??
-      response.headers.get('X-Session-Id')
-    if (responseSessionKey) {
-      sessionKeyRef.current = responseSessionKey
-      onSessionKeyChangeRef.current?.(responseSessionKey)
-    }
-    const responseTurnId = response.headers.get('X-Turn-Id')
-    if (!responseTurnId) return
-    turnIdRef.current = responseTurnId
-    lastSeqRef.current = null
-    setTurns((prev) => {
-      const last = prev[prev.length - 1]
-      if (!last) return prev
-      return [...prev.slice(0, -1), { ...last, turnId: responseTurnId }]
-    })
-  }
-
  const send = async (input: string | SendInput) => {
    const normalized: SendInput =
      typeof input === 'string' ? { text: input } : input
@@ -428,13 +346,37 @@ export function useAgentConversation(
    streamAbortRef.current = abortController

    try {
-      const response = await openSendStream(
+      let response = await chatWithHarnessAgent(
        agentId,
        trimmed,
-        attachments,
        abortController.signal,
+        attachments,
      )
-      applyResponseHeadersToTurn(response)
+      // 409 means the server already has an active turn for this
+      // agent (e.g. a previous tab kicked one off and we're a fresh
+      // mount that missed the resume window). Attach to it instead of
+      // double-sending.
+      if (response.status === 409) {
+        const body = (await response.clone().json()) as { turnId?: string }
+        if (body.turnId) {
+          response = await attachToHarnessTurn(agentId, {
+            turnId: body.turnId,
+            signal: abortController.signal,
+          })
+        }
+      }
+      const responseSessionKey =
+        response.headers.get('X-Session-Key') ??
+        response.headers.get('X-Session-Id')
+      if (responseSessionKey) {
+        sessionKeyRef.current = responseSessionKey
+        onSessionKeyChangeRef.current?.(responseSessionKey)
+      }
+      const responseTurnId = response.headers.get('X-Turn-Id')
+      if (responseTurnId) {
+        turnIdRef.current = responseTurnId
+        lastSeqRef.current = null
+      }
      if (!response.ok) {
        const err = await response.text()
        updateCurrentTurnParts((parts) => [
@@ -462,15 +404,10 @@ export function useAgentConversation(
      if (streamAbortRef.current === abortController) {
        streamAbortRef.current = null
      }
-      // Capture before nulling — the invalidation needs the turn id so
-      // useAgentTurnFiles consumers also flush, not just the agent-wide
-      // rail query.
-      const finishedTurnId = turnIdRef.current
      turnIdRef.current = null
      lastSeqRef.current = null
      onCompleteRef.current?.()
      setStreaming(false)
-      void invalidateAgentOutputs(agentId, finishedTurnId ?? undefined)
    }
  }

--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AdapterIcon.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AdapterIcon.tsx
@@ -1,4 +1,4 @@
-import { Bot, Cpu, Sparkles, Wand2 } from 'lucide-react'
+import { Bot, Cpu, Sparkles } from 'lucide-react'
 import type { FC } from 'react'
 import type { HarnessAgentAdapter } from './agent-harness-types'

@@ -23,9 +23,6 @@ export const AdapterIcon: FC<AdapterIconProps> = ({ adapter, className }) => {
    case 'openclaw':
      // OpenClaw — bot/automation framing.
      return <Bot className={className} aria-label="OpenClaw" />
-    case 'hermes':
-      // Hermes — messenger god framing, wand evokes the agentic conjuring.
-      return <Wand2 className={className} aria-label="Hermes" />
    default:
      return <Bot className={className} aria-label="Agent" />
  }
@@ -39,8 +36,6 @@ export function adapterLabel(adapter: HarnessAgentAdapter | 'unknown'): string {
      return 'Codex'
    case 'openclaw':
      return 'OpenClaw'
-    case 'hermes':
-      return 'Hermes'
    default:
      return 'Agent'
  }
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentList.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentList.tsx
@@ -11,7 +11,6 @@ import type {
  AgentAdapterHealth,
  AgentRowData,
 } from './agent-row/agent-row.types'
-import { compareAgentsByPinThenRecency } from './agents-list-order'
 import type { AgentListItem } from './agents-page-types'
 import type { AgentLiveness } from './LivenessDot'

@@ -57,18 +56,31 @@ export const AgentList: FC<AgentListProps> = ({
    return map
  }, [adapters])

+  // Sort: pinned rows first, then most recently used, then never-used
+  // agents in id-stable order. A never-used gateway `main` agent sorts
+  // ahead of every other row in its pin group, so a fresh install has
+  // an obvious starting point.
  const ordered = useMemo(() => {
    const withMeta = agents.map((agent) => {
      const harness = harnessAgentLookup?.get(agent.agentId)
      return {
        agent,
-        id: agent.agentId,
        pinned: harness?.pinned ?? false,
        lastUsedAt: activity?.[agent.agentId]?.lastUsedAt ?? null,
      }
    })
    return withMeta
-      .sort(compareAgentsByPinThenRecency)
+      .sort((a, b) => {
+        if (a.pinned !== b.pinned) return a.pinned ? -1 : 1
+        const aSeed = a.agent.agentId === 'main' && a.lastUsedAt === null
+        const bSeed = b.agent.agentId === 'main' && b.lastUsedAt === null
+        if (aSeed && !bSeed) return -1
+        if (!aSeed && bSeed) return 1
+        const aValue = a.lastUsedAt ?? -Infinity
+        const bValue = b.lastUsedAt ?? -Infinity
+        if (aValue !== bValue) return bValue - aValue
+        return a.agent.agentId.localeCompare(b.agent.agentId)
+      })
      .map((entry) => entry.agent)
  }, [activity, agents, harnessAgentLookup])

@@ -117,7 +129,6 @@ function inferAdapterFromLabel(label: string): HarnessAgentAdapter | 'unknown' {
  if (lower === 'claude code') return 'claude'
  if (lower === 'codex') return 'codex'
  if (lower === 'openclaw') return 'openclaw'
-  if (lower === 'hermes') return 'hermes'
  return 'unknown'
 }

--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx
@@ -10,7 +10,6 @@ import { createAgentPageActions } from './agents-page-actions'
 import {
  useDefaultAgentName,
  useHarnessAgentDefaults,
-  useHermesProviderSelection,
  useOpenClawProviderSelection,
 } from './agents-page-hooks'
 import {
@@ -107,7 +106,6 @@ export const AgentsPage: FC = () => {
  )
  const [harnessModelId, setHarnessModelId] = useState('')
  const [harnessReasoningEffort, setHarnessReasoningEffort] = useState('')
-  const [createHermesProviderId, setCreateHermesProviderId] = useState('')
  const [showTerminal, setShowTerminal] = useState(false)
  const [cliAuthModalOpen, setCliAuthModalOpen] = useState(false)
  const [pageError, setPageError] = useState<string | null>(null)
@@ -135,14 +133,6 @@ export const AgentsPage: FC = () => {
    cliAuthModalOpen,
    setCliAuthModalOpen,
  })
-  const { selectableHermesProviders } = useHermesProviderSelection({
-    providers,
-    defaultProviderId,
-    createOpen,
-    createRuntime,
-    createHermesProviderId,
-    setCreateHermesProviderId,
-  })
  useDefaultAgentName(createOpen, setNewName)
  useHarnessAgentDefaults({
    adapters,
@@ -236,13 +226,11 @@ export const AgentsPage: FC = () => {
    createAgentPageActions({
      createProviderId,
      createRuntime,
-      createHermesProviderId,
      harnessModelId,
      harnessReasoningEffort,
      navigate,
      newName,
      selectableOpenClawProviders,
-      selectableHermesProviders,
      setupProviderId,
      createHarnessAgent: createHarnessAgent.mutateAsync,
      createOpenClawAgent,
@@ -398,8 +386,6 @@ export const AgentsPage: FC = () => {
          harnessAdapterId={harnessAdapterId}
          harnessModelId={harnessModelId}
          harnessReasoningEffort={harnessReasoningEffort}
-          hermesProviders={selectableHermesProviders}
-          hermesSelectedProviderId={createHermesProviderId}
          name={newName}
          open={createOpen}
          providers={selectableOpenClawProviders}
@@ -415,14 +401,12 @@ export const AgentsPage: FC = () => {
            if (!open) {
              setCreateError(null)
              createHarnessAgent.reset()
-              setCreateHermesProviderId('')
            }
          }}
          onRuntimeChange={setCreateRuntime}
          onHarnessAdapterChange={handleHarnessAdapterChange}
          onHarnessModelChange={setHarnessModelId}
          onHarnessReasoningChange={setHarnessReasoningEffort}
-          onHermesProviderChange={setCreateHermesProviderId}
          onNameChange={setNewName}
          onProviderChange={setCreateProviderId}
        />
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/NewAgentDialog.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/NewAgentDialog.tsx
@@ -40,8 +40,6 @@ interface NewAgentDialogProps {
  harnessAdapterId: HarnessAgentAdapter
  harnessModelId: string
  harnessReasoningEffort: string
-  hermesProviders: ProviderOption[]
-  hermesSelectedProviderId: string
  name: string
  open: boolean
  providers: ProviderOption[]
@@ -57,7 +55,6 @@ interface NewAgentDialogProps {
  onHarnessAdapterChange: (adapter: HarnessAgentAdapter) => void
  onHarnessModelChange: (modelId: string) => void
  onHarnessReasoningChange: (reasoningEffort: string) => void
-  onHermesProviderChange: (providerId: string) => void
  onNameChange: (name: string) => void
  onProviderChange: (providerId: string) => void
 }
@@ -72,8 +69,6 @@ export const NewAgentDialog: FC<NewAgentDialogProps> = ({
  harnessAdapterId,
  harnessModelId,
  harnessReasoningEffort,
-  hermesProviders,
-  hermesSelectedProviderId,
  name,
  open,
  providers,
@@ -89,29 +84,22 @@ export const NewAgentDialog: FC<NewAgentDialogProps> = ({
  onHarnessAdapterChange,
  onHarnessModelChange,
  onHarnessReasoningChange,
-  onHermesProviderChange,
  onNameChange,
  onProviderChange,
 }) => {
  const selectedHarnessAdapter =
    adapters.find((adapter) => adapter.id === harnessAdapterId) ?? adapters[0]
  const isHarnessRuntime = createRuntime !== 'openclaw'
-  const isHermesRuntime = createRuntime === 'hermes'
-  const isClassicHarnessRuntime = isHarnessRuntime && !isHermesRuntime
  const openClawBlocked = createRuntime === 'openclaw' && !canManageOpenClaw
  const cliBlocked =
    createRuntime === 'openclaw' &&
    !!selectedCliProvider &&
    !cliAuthStatus?.loggedIn
-  const hermesBlocked =
-    isHermesRuntime &&
-    (hermesProviders.length === 0 || !hermesSelectedProviderId)
  const canCreate =
    Boolean(name.trim()) &&
    !creating &&
    !openClawBlocked &&
    !cliBlocked &&
-    !hermesBlocked &&
    (createRuntime === 'openclaw'
      ? providers.length > 0
      : Boolean(selectedHarnessAdapter))
@@ -155,8 +143,7 @@ export const NewAgentDialog: FC<NewAgentDialogProps> = ({
                if (
                  value === 'openclaw' ||
                  value === 'claude' ||
-                  value === 'codex' ||
-                  value === 'hermes'
+                  value === 'codex'
                ) {
                  onRuntimeChange(value)
                  if (value !== 'openclaw') onHarnessAdapterChange(value)
@@ -209,16 +196,7 @@ export const NewAgentDialog: FC<NewAgentDialogProps> = ({
            </>
          ) : null}

-          {isHermesRuntime ? (
-            <ProviderSelector
-              providers={hermesProviders}
-              defaultProviderId={defaultProviderId}
-              selectedId={hermesSelectedProviderId}
-              onSelect={onHermesProviderChange}
-            />
-          ) : null}
-
-          {isClassicHarnessRuntime ? (
+          {isHarnessRuntime ? (
            <>
              <div className="grid gap-2">
                <Label htmlFor="harness-model">Model</Label>
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/agent-harness-types.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/agent-harness-types.ts
@@ -1,21 +1,6 @@
 import type { AgentEntry } from './useOpenClaw'

-export type HarnessAgentAdapter = 'claude' | 'codex' | 'openclaw' | 'hermes'
-
-/**
- * One file the harness attributed to the assistant turn that just
- * finished. Mirrors the server-side `ProducedFileEventEntry` shape so
- * the inline artifact card can render alongside the streamed text the
- * user just watched complete. Only present for openclaw adapter
- * turns; claude / codex don't produce these events in v1.
- */
-export interface HarnessProducedFile {
-  id: string
-  /** Workspace-relative POSIX path. */
-  path: string
-  size: number
-  mtimeMs: number
-}
+export type HarnessAgentAdapter = 'claude' | 'codex' | 'openclaw'

 export type AgentHarnessStreamEvent =
  | {
@@ -37,10 +22,6 @@ export type AgentHarnessStreamEvent =
      text: string
      rawType?: string
    }
-  | {
-      type: 'produced_files'
-      files: HarnessProducedFile[]
-    }
  | {
      type: 'done'
      text?: string
@@ -130,16 +111,6 @@ export interface CreateHarnessAgentInput {
  adapter: HarnessAgentAdapter
  modelId?: string
  reasoningEffort?: string
-  /**
-   * Adapter provider id from the user's BrowserOS AI Settings entry.
-   * Provider-backed adapters use this with `apiKey`/`baseUrl` to write
-   * or provision their runtime-specific provider config.
-   */
-  providerType?: string
-  /** API key paired with `providerType` when the selected adapter needs one. */
-  apiKey?: string
-  /** Base URL for OpenAI-compatible/custom provider entries. */
-  baseUrl?: string
 }

 export interface HarnessHistoryReasoning {
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-list-order.test.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-list-order.test.ts
@@ -1,104 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import type { HarnessAgent } from './agent-harness-types'
-import {
-  compareAgentsByPinThenRecency,
-  orderAgentsByPinThenRecency,
-} from './agents-list-order'
-
-function makeAgent(input: {
-  id: string
-  pinned?: boolean
-  lastUsedAt?: number | null
-}): HarnessAgent {
-  return {
-    id: input.id,
-    name: input.id,
-    adapter: 'codex',
-    permissionMode: 'approve-all',
-    sessionKey: 'session',
-    createdAt: 0,
-    updatedAt: 0,
-    pinned: input.pinned,
-    lastUsedAt: input.lastUsedAt,
-  }
-}
-
-describe('orderAgentsByPinThenRecency', () => {
-  it('floats pinned agents to the top regardless of recency', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'a', pinned: false, lastUsedAt: 1_000 }),
-      makeAgent({ id: 'b', pinned: true, lastUsedAt: 100 }),
-      makeAgent({ id: 'c', pinned: false, lastUsedAt: 500 }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual(['b', 'a', 'c'])
-  })
-
-  it('sorts by lastUsedAt desc within each pin group', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'older-pin', pinned: true, lastUsedAt: 100 }),
-      makeAgent({ id: 'newer-pin', pinned: true, lastUsedAt: 200 }),
-      makeAgent({ id: 'older', pinned: false, lastUsedAt: 50 }),
-      makeAgent({ id: 'newer', pinned: false, lastUsedAt: 80 }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual([
-      'newer-pin',
-      'older-pin',
-      'newer',
-      'older',
-    ])
-  })
-
-  it('seed-pins the gateway main agent above other never-used agents', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'aaa', pinned: false, lastUsedAt: null }),
-      makeAgent({ id: 'main', pinned: false, lastUsedAt: null }),
-      makeAgent({ id: 'zzz', pinned: false, lastUsedAt: null }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual(['main', 'aaa', 'zzz'])
-  })
-
-  it('drops the main seed-pin once the agent has been used', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'aaa', pinned: false, lastUsedAt: 999 }),
-      makeAgent({ id: 'main', pinned: false, lastUsedAt: 1 }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual(['aaa', 'main'])
-  })
-
-  it('puts never-used agents below recently-used ones', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'fresh', pinned: false, lastUsedAt: null }),
-      makeAgent({ id: 'used', pinned: false, lastUsedAt: 100 }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual(['used', 'fresh'])
-  })
-
-  it('id-stable tiebreaks two agents with identical lastUsedAt', () => {
-    const result = orderAgentsByPinThenRecency([
-      makeAgent({ id: 'b', pinned: false, lastUsedAt: 100 }),
-      makeAgent({ id: 'a', pinned: false, lastUsedAt: 100 }),
-    ])
-    expect(result.map((entry) => entry.id)).toEqual(['a', 'b'])
-  })
-})
-
-describe('compareAgentsByPinThenRecency', () => {
-  it('produces the same order as the harness-shape helper', () => {
-    const items = [
-      { id: 'older', pinned: false, lastUsedAt: 50 },
-      { id: 'newer', pinned: false, lastUsedAt: 80 },
-      { id: 'pinned', pinned: true, lastUsedAt: 1 },
-    ]
-    const sorted = [...items].sort(compareAgentsByPinThenRecency)
-    expect(sorted.map((item) => item.id)).toEqual(['pinned', 'newer', 'older'])
-  })
-
-  it('seeds the main agent above other never-used rows', () => {
-    const items = [
-      { id: 'zzz', pinned: false, lastUsedAt: null },
-      { id: 'main', pinned: false, lastUsedAt: null },
-    ]
-    const sorted = [...items].sort(compareAgentsByPinThenRecency)
-    expect(sorted.map((item) => item.id)).toEqual(['main', 'zzz'])
-  })
-})
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-list-order.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-list-order.ts
@@ -1,59 +0,0 @@
-import type { HarnessAgent } from './agent-harness-types'
-
-/**
- * Stable ordering for index-shaped agent surfaces (the `/agents` rail
- * and the chat-screen rail at `/agents/:agentId`). Pinned rows float
- * to the top, then recency desc, with never-used agents falling to
- * the bottom in id-stable order. The gateway's `main` agent gets
- * seed-pinned to the top of the never-used group so a fresh install
- * has an obvious starting point even before the user has used it.
- *
- * NOT the same rule as the home grid (`orderHomeAgents`): home is
- * action-shaped — active-turn floats to the top — so users can
- * resume what's running. The chat rail keeps recency stable so it
- * doesn't reshuffle as turns transition every 5s.
- */
-export function orderAgentsByPinThenRecency(
-  agents: HarnessAgent[],
-): HarnessAgent[] {
-  return [...agents].sort((a, b) => {
-    const aPinned = a.pinned ?? false
-    const bPinned = b.pinned ?? false
-    if (aPinned !== bPinned) return aPinned ? -1 : 1
-
-    const aSeed = a.id === 'main' && (a.lastUsedAt ?? null) === null
-    const bSeed = b.id === 'main' && (b.lastUsedAt ?? null) === null
-    if (aSeed && !bSeed) return -1
-    if (!aSeed && bSeed) return 1
-
-    const aValue = a.lastUsedAt ?? Number.NEGATIVE_INFINITY
-    const bValue = b.lastUsedAt ?? Number.NEGATIVE_INFINITY
-    if (aValue !== bValue) return bValue - aValue
-
-    return a.id.localeCompare(b.id)
-  })
-}
-
-/**
- * Same comparator, but operates over arbitrary records that carry
- * `pinned`, `lastUsedAt`, and an `id`-equivalent key. Used by the
- * `/agents` `AgentList` which pivots `AgentListItem` + harness
- * lookup into a sortable shape; both surfaces stay on identical
- * sort semantics through this adapter.
- */
-export function compareAgentsByPinThenRecency<
-  T extends { pinned: boolean; lastUsedAt: number | null; id: string },
->(a: T, b: T): number {
-  if (a.pinned !== b.pinned) return a.pinned ? -1 : 1
-
-  const aSeed = a.id === 'main' && a.lastUsedAt === null
-  const bSeed = b.id === 'main' && b.lastUsedAt === null
-  if (aSeed && !bSeed) return -1
-  if (!aSeed && bSeed) return 1
-
-  const aValue = a.lastUsedAt ?? Number.NEGATIVE_INFINITY
-  const bValue = b.lastUsedAt ?? Number.NEGATIVE_INFINITY
-  if (aValue !== bValue) return bValue - aValue
-
-  return a.id.localeCompare(b.id)
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-page-actions.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-page-actions.ts
@@ -20,22 +20,17 @@ import type {
 export interface AgentPageActionInput {
  createProviderId: string
  createRuntime: CreateAgentRuntime
-  createHermesProviderId: string
  harnessModelId: string
  harnessReasoningEffort: string
  navigate: NavigateFunction
  newName: string
  selectableOpenClawProviders: ProviderOption[]
-  selectableHermesProviders: ProviderOption[]
  setupProviderId: string
  createHarnessAgent: (input: {
    name: string
    adapter: HarnessAgentAdapter
    modelId?: string
    reasoningEffort?: string
-    providerType?: string
-    apiKey?: string
-    baseUrl?: string
  }) => Promise<HarnessAgent>
  createOpenClawAgent: (
    input: OpenClawAgentMutationInput,
@@ -119,37 +114,20 @@ export function createAgentPageActions(input: AgentPageActionInput) {
  const handleHarnessCreate = async () => {
    if (!input.newName.trim()) return

-    const isHermes = input.createRuntime === 'hermes'
-    // Hermes pulls every provider field from the user's selected entry
-    // in the global LLM-providers list (managed under AI Settings). The
-    // backend rejects creation if any required field is missing.
-    const hermesProvider = isHermes
-      ? input.selectableHermesProviders.find(
-          (option) => option.id === input.createHermesProviderId,
-        )
-      : undefined
-    const effectiveModelId = isHermes
-      ? hermesProvider?.modelId
-      : input.harnessModelId || undefined
-
    input.setCreateError(null)
    try {
      const agent = await input.createHarnessAgent({
        name: input.newName.trim(),
        adapter: input.createRuntime as HarnessAgentAdapter,
-        modelId: effectiveModelId,
+        modelId: input.harnessModelId || undefined,
        reasoningEffort: input.harnessReasoningEffort || undefined,
-        providerType: hermesProvider?.type,
-        apiKey: hermesProvider?.apiKey,
-        baseUrl: hermesProvider?.baseUrl,
      })
      input.setCreateOpen(false)
      input.setNewName('')
      track(AGENT_CREATED_EVENT, {
        runtime: input.createRuntime,
-        model_id: effectiveModelId,
+        model_id: input.harnessModelId || undefined,
        reasoning_effort: input.harnessReasoningEffort || undefined,
-        provider_type: hermesProvider?.type,
      })
      input.navigate(`/agents/${agent.id}`)
    } catch (err) {
@@ -162,7 +140,6 @@ export function createAgentPageActions(input: AgentPageActionInput) {
      openclaw: handleOpenClawCreate,
      claude: handleHarnessCreate,
      codex: handleHarnessCreate,
-      hermes: handleHarnessCreate,
    }
    void createByRuntime[input.createRuntime]()
  }
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-page-hooks.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/agents-page-hooks.ts
@@ -4,9 +4,8 @@ import type {
  HarnessAdapterDescriptor,
  HarnessAgentAdapter,
 } from './agent-harness-types'
-import type { CreateAgentRuntime, ProviderOption } from './agents-page-types'
+import type { CreateAgentRuntime } from './agents-page-types'
 import { toProviderOptions } from './agents-page-utils'
-import { getHermesSupportedProviders } from './hermes-supported-providers'
 import {
  buildOpenClawCliProviderOptions,
  findOpenClawCliProviderById,
@@ -172,60 +171,3 @@ export function useOpenClawProviderSelection(input: {
    cliAuthError,
  }
 }
-
-/**
- * Mirror of useOpenClawProviderSelection but for Hermes. Hermes only
- * needs the create-dialog flow (no setup dialog, no CLI providers), so
- * this hook is much smaller — it just filters the global provider list
- * to ones Hermes can drive and seeds the selected id when the dialog
- * opens.
- */
-export function useHermesProviderSelection(input: {
-  providers: LlmProviderConfig[]
-  defaultProviderId: string
-  createOpen: boolean
-  createRuntime: CreateAgentRuntime
-  createHermesProviderId: string
-  setCreateHermesProviderId: Dispatch<SetStateAction<string>>
-}) {
-  const {
-    providers,
-    defaultProviderId,
-    createOpen,
-    createRuntime,
-    createHermesProviderId,
-    setCreateHermesProviderId,
-  } = input
-
-  const selectableHermesProviders = useMemo<ProviderOption[]>(
-    () =>
-      getHermesSupportedProviders(providers).map((provider) => ({
-        id: provider.id,
-        type: provider.type,
-        name: provider.name,
-        modelId: provider.modelId,
-        baseUrl: provider.baseUrl,
-        apiKey: provider.apiKey,
-      })),
-    [providers],
-  )
-
-  useEffect(() => {
-    if (selectableHermesProviders.length === 0) return
-    if (!createOpen || createRuntime !== 'hermes') return
-    if (createHermesProviderId) return
-    const fallbackId =
-      selectableHermesProviders.find((p) => p.id === defaultProviderId)?.id ??
-      selectableHermesProviders[0].id
-    setCreateHermesProviderId(fallbackId)
-  }, [
-    createHermesProviderId,
-    createOpen,
-    createRuntime,
-    defaultProviderId,
-    selectableHermesProviders,
-    setCreateHermesProviderId,
-  ])
-
-  return { selectableHermesProviders }
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/hermes-supported-providers.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/hermes-supported-providers.ts
@@ -1,30 +0,0 @@
-import {
-  HERMES_SUPPORTED_BROWSEROS_PROVIDER_TYPES,
-  type HermesSupportedBrowserosProviderType,
-} from '@browseros/shared/constants/hermes'
-import type { LlmProviderConfig, ProviderType } from '@/lib/llm-providers/types'
-
-export function isHermesSupportedProviderType(
-  providerType: ProviderType,
-): providerType is HermesSupportedBrowserosProviderType {
-  return (
-    HERMES_SUPPORTED_BROWSEROS_PROVIDER_TYPES as readonly ProviderType[]
-  ).includes(providerType)
-}
-
-/**
- * Filters the user's global LLM providers down to ones Hermes can use.
- * A provider qualifies when its type is in the Hermes-supported set
- * AND it has an API key wired up. CLI-style providers (chatgpt-pro,
- * github-copilot, qwen-code) and other unsupported types (browseros,
- * ollama, lmstudio, bedrock, azure, google, moonshot) are filtered
- * out — Hermes can't drive them today.
- */
-export function getHermesSupportedProviders(
-  providers: LlmProviderConfig[],
-): LlmProviderConfig[] {
-  return providers.filter(
-    (provider) =>
-      !!provider.apiKey && isHermesSupportedProviderType(provider.type),
-  )
-}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/useAgents.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/useAgents.ts
@@ -25,18 +25,12 @@ interface HarnessAgentsResponse {

 export type { AgentHarnessStreamEvent }

-export const AGENT_QUERY_KEYS = {
+const AGENT_QUERY_KEYS = {
  adapters: 'agent-harness-adapters',
  agents: 'agent-harness-agents',
-  /** Outputs-rail data for one agent — `[agentOutputs, baseUrl, agentId]`. */
-  agentOutputs: 'agent-harness-agent-outputs',
-  /** Per-turn artifact-card files — `[agentTurnFiles, baseUrl, agentId, turnId]`. */
-  agentTurnFiles: 'agent-harness-agent-turn-files',
-  /** Single-file preview payload — `[filePreview, baseUrl, fileId]`. */
-  filePreview: 'agent-harness-file-preview',
 } as const

-export async function agentsFetch<T>(
+async function agentsFetch<T>(
  baseUrl: string,
  path: string,
  init?: RequestInit,
--- a/packages/browseros-agent/apps/agent/entrypoints/app/layout/SidebarLayout.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/layout/SidebarLayout.tsx
@@ -85,8 +85,7 @@ export const SidebarLayout: FC = () => {

  return (
    <RpcClientProvider>
-      {/* pl-14 offsets all content by the collapsed sidebar width (w-14 = 56px) so it never sits under the rail */}
-      <div className="relative min-h-screen bg-background pl-14">
+      <div className="relative min-h-screen bg-background">
        {/* Sidebar - fixed overlay */}
        {/* biome-ignore lint/a11y/noStaticElementInteractions: hover interactions needed */}
        <div
@@ -97,6 +96,7 @@ export const SidebarLayout: FC = () => {
          <AppSidebar expanded={sidebarOpen} onOpenShortcuts={openShortcuts} />
        </div>

+        {/* Main content - full width, centered */}
        {location.pathname === '/home/chat' ? (
          <main className="relative h-dvh overflow-hidden">
            <Outlet />
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/sidepanel-chat-targets.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/sidepanel-chat-targets.ts
@@ -108,7 +108,6 @@ function formatAdapterName(adapter: HarnessAgentAdapter): string {
  if (adapter === 'claude') return 'Claude Code'
  if (adapter === 'codex') return 'Codex'
  if (adapter === 'openclaw') return 'OpenClaw'
-  if (adapter === 'hermes') return 'Hermes'
  return adapter
 }

--- a/packages/browseros-agent/apps/agent/lib/agent-conversations/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-conversations/types.ts
@@ -42,34 +42,11 @@ export interface UserAttachmentPreview {
  dataUrl?: string
 }

-/**
- * Files attributed to this turn by the harness's per-turn workspace
- * diff. Populated either via the live `produced_files` SSE event or
- * (on resume) the `useAgentTurnFiles` fallback. Mirrors the wire
- * shape from `agent-harness-types.HarnessProducedFile` minus the
- * stream-only fields the inline card doesn't need.
- */
-export interface ConversationTurnFile {
-  id: string
-  path: string
-  size: number
-  mtimeMs: number
-}
-
 export interface AgentConversationTurn {
  id: string
-  /**
-   * Server-issued turn id, set as soon as the response headers arrive
-   * (`X-Turn-Id`) for fresh sends, or from the active-turn payload on
-   * resume. Required for the historic-files fallback fetch; absent on
-   * the brief optimistic window before the first header.
-   */
-  turnId?: string | null
  userText: string
  userAttachments?: UserAttachmentPreview[]
  parts: AssistantPart[]
-  /** Files produced during this turn (openclaw only in v1). */
-  producedFiles?: ConversationTurnFile[]
  done: boolean
  timestamp: number
 }
--- a/packages/browseros-agent/apps/agent/lib/agent-files/file-helpers.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-files/file-helpers.ts
@@ -1,126 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * Pure helpers used by the artifact card and the Outputs rail.
- * Display formatting only — no React, no fetch, no DOM. Anything
- * stateful belongs in `./useAgentOutputs` or `./useFilePreview`.
- */
-
-import { buildAgentApiUrl } from '@/entrypoints/app/agents/agent-api-url'
-
-/**
- * Coarse classification of a file's intended preview / icon path.
- * Mirrors the server-side `FilePreviewKind` minus `missing` — the
- * client only ever computes a kind for a row it already has.
- */
-export type FileKind = 'text' | 'image' | 'pdf' | 'binary'
-
-const TEXT_EXTENSIONS = new Set([
-  'txt',
-  'md',
-  'markdown',
-  'json',
-  'jsonl',
-  'csv',
-  'tsv',
-  'xml',
-  'yaml',
-  'yml',
-  'toml',
-  'ini',
-  'log',
-  'html',
-  'htm',
-  'css',
-  'js',
-  'mjs',
-  'cjs',
-  'ts',
-  'tsx',
-  'jsx',
-  'py',
-  'rb',
-  'go',
-  'rs',
-  'java',
-  'kt',
-  'swift',
-  'c',
-  'h',
-  'cpp',
-  'hpp',
-  'sh',
-  'zsh',
-  'bash',
-  'sql',
-  'svg',
-])
-
-const IMAGE_EXTENSIONS = new Set([
-  'png',
-  'jpg',
-  'jpeg',
-  'gif',
-  'webp',
-  'bmp',
-  'ico',
-  'heic',
-  'heif',
-])
-
-/** Best-effort kind based on extension only. Server's preview API
- * is the source of truth for actual rendering — this is just for
- * picking an icon / sort hint without a network round-trip. */
-export function inferFileKind(path: string): FileKind {
-  const ext = extensionOf(path).toLowerCase()
-  if (ext === 'pdf') return 'pdf'
-  if (IMAGE_EXTENSIONS.has(ext)) return 'image'
-  if (TEXT_EXTENSIONS.has(ext)) return 'text'
-  return 'binary'
-}
-
-/** Plain extension without the leading dot. Empty string when none. */
-export function extensionOf(path: string): string {
-  const dot = path.lastIndexOf('.')
-  if (dot === -1) return ''
-  const slash = path.lastIndexOf('/')
-  if (dot < slash) return ''
-  return path.slice(dot + 1)
-}
-
-/** File name (final path segment), no directory prefix. */
-export function basenameOf(path: string): string {
-  const slash = path.lastIndexOf('/')
-  return slash === -1 ? path : path.slice(slash + 1)
-}
-
-const SIZE_UNITS = ['B', 'KB', 'MB', 'GB', 'TB'] as const
-
-/** "2.4 MB" / "340 KB" / "78 B" — for the artifact card's right-side
- *  metadata. Not localised; the rail uses one space + the unit. */
-export function formatFileSize(bytes: number): string {
-  if (!Number.isFinite(bytes) || bytes < 0) return '—'
-  if (bytes < 1024) return `${bytes} ${SIZE_UNITS[0]}`
-  let value = bytes
-  let unit = 0
-  while (value >= 1024 && unit < SIZE_UNITS.length - 1) {
-    value /= 1024
-    unit += 1
-  }
-  // 1-digit precision below 10, integer above — feels less noisy.
-  const formatted = value < 10 ? value.toFixed(1) : Math.round(value).toString()
-  return `${formatted} ${SIZE_UNITS[unit]}`
-}
-
-/**
- * Build the per-file download URL using the same agent-api root the
- * rest of the harness hits. Returned URL is already absolute.
- */
-export function buildFileDownloadUrl(baseUrl: string, fileId: string): string {
-  return buildAgentApiUrl(
-    baseUrl,
-    `/files/${encodeURIComponent(fileId)}/download`,
-  )
-}
--- a/packages/browseros-agent/apps/agent/lib/agent-files/index.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-files/index.ts
@@ -1,32 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- */
-
-export {
-  basenameOf,
-  buildFileDownloadUrl,
-  extensionOf,
-  type FileKind,
-  formatFileSize,
-  inferFileKind,
-} from './file-helpers'
-export type {
-  BinaryFilePreview,
-  FilePreview,
-  FilePreviewKind,
-  ImageFilePreview,
-  MissingFilePreview,
-  PdfFilePreview,
-  ProducedFile,
-  ProducedFilesRailGroup,
-  TextFilePreview,
-} from './types'
-export {
-  useAgentOutputs,
-  useAgentTurnFiles,
-  useInvalidateAgentOutputs,
-  useRefreshAgentOutputs,
-} from './useAgentOutputs'
-export { useFilePreview } from './useFilePreview'
--- a/packages/browseros-agent/apps/agent/lib/agent-files/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-files/types.ts
@@ -1,75 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * Wire types shared by the inline artifact card and the per-agent
- * Outputs rail. These mirror `ProducedFileEntry` /
- * `ProducedFilesRailGroup` on the server and the `FilePreview`
- * discriminated union from `apps/server/src/api/services/openclaw/file-preview.ts`.
- *
- * The schema mirror is deliberate (vs sharing a workspace package)
- * because the server keeps the on-disk row shape — `agentDefinitionId`,
- * `sessionKey` — out of the wire payload. Dropping those columns at the
- * type boundary keeps the client honest about what it can refer to.
- */
-
-export interface ProducedFile {
-  id: string
-  /** Workspace-relative POSIX path. */
-  path: string
-  size: number
-  mtimeMs: number
-  /** Server clock when the file was first attributed to its turn. */
-  createdAt: number
-  detectedBy: 'diff' | 'tool'
-}
-
-export interface ProducedFilesRailGroup {
-  turnId: string
-  /** First non-blank line of the user prompt that initiated this turn. */
-  turnPrompt: string
-  createdAt: number
-  files: ProducedFile[]
-}
-
-export type FilePreviewKind = 'text' | 'image' | 'pdf' | 'binary' | 'missing'
-
-interface BasePreview {
-  kind: FilePreviewKind
-  mimeType: string
-  size: number
-  mtimeMs: number
-}
-
-export interface TextFilePreview extends BasePreview {
-  kind: 'text'
-  snippet: string
-  /** True when the on-disk file is larger than the server's snippet cap. */
-  truncated: boolean
-}
-
-export interface ImageFilePreview extends BasePreview {
-  kind: 'image'
-  /** Base64 data URL (incl. `data:` prefix). Suitable for `<img src>`. */
-  dataUrl: string
-}
-
-export interface PdfFilePreview extends BasePreview {
-  kind: 'pdf'
-}
-
-export interface BinaryFilePreview extends BasePreview {
-  kind: 'binary'
-}
-
-export interface MissingFilePreview {
-  kind: 'missing'
-}
-
-export type FilePreview =
-  | TextFilePreview
-  | ImageFilePreview
-  | PdfFilePreview
-  | BinaryFilePreview
-  | MissingFilePreview
--- a/packages/browseros-agent/apps/agent/lib/agent-files/useAgentOutputs.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-files/useAgentOutputs.ts
@@ -1,166 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * React Query hooks backing the per-agent Outputs rail and the
- * inline artifact card.
- *
- * Live updates: the consumer of `useAgentConversation` (see Phase 5)
- * is expected to call `useInvalidateAgentOutputs(agentId)` whenever
- * an assistant turn completes, so the rail picks up the new
- * `produced_files` rows the server attributed during that turn.
- * No SSE channel here — invalidation off the existing chat-stream
- * completion is enough for v1.
- */
-
-import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
-import {
-  AGENT_QUERY_KEYS,
-  agentsFetch,
-} from '@/entrypoints/app/agents/useAgents'
-import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
-import type { ProducedFile, ProducedFilesRailGroup } from './types'
-
-interface OutputsResponse {
-  groups: ProducedFilesRailGroup[]
-}
-
-interface TurnFilesResponse {
-  files: ProducedFile[]
-}
-
-export function useAgentOutputs(agentId: string, enabled = true) {
-  const {
-    baseUrl,
-    isLoading: urlLoading,
-    error: urlError,
-  } = useAgentServerUrl()
-
-  const query = useQuery<ProducedFilesRailGroup[], Error>({
-    queryKey: [AGENT_QUERY_KEYS.agentOutputs, baseUrl, agentId],
-    queryFn: async () => {
-      const data = await agentsFetch<OutputsResponse>(
-        baseUrl as string,
-        `/${encodeURIComponent(agentId)}/files`,
-      )
-      return data.groups ?? []
-    },
-    enabled: Boolean(baseUrl) && !urlLoading && enabled && Boolean(agentId),
-  })
-
-  return {
-    groups: query.data ?? [],
-    loading: query.isLoading || urlLoading,
-    error: query.error ?? urlError,
-    refetch: query.refetch,
-  }
-}
-
-/**
- * Per-turn fetch for the inline artifact card. Used both as the
- * fallback when an SSE `produced_files` event was missed, and to
- * rehydrate a turn the user scrolled back to.
- */
-export function useAgentTurnFiles(
-  agentId: string,
-  turnId: string | null,
-  enabled = true,
-) {
-  const {
-    baseUrl,
-    isLoading: urlLoading,
-    error: urlError,
-  } = useAgentServerUrl()
-
-  const query = useQuery<ProducedFile[], Error>({
-    queryKey: [AGENT_QUERY_KEYS.agentTurnFiles, baseUrl, agentId, turnId],
-    queryFn: async () => {
-      const data = await agentsFetch<TurnFilesResponse>(
-        baseUrl as string,
-        `/${encodeURIComponent(agentId)}/files/turn/${encodeURIComponent(
-          turnId as string,
-        )}`,
-      )
-      return data.files ?? []
-    },
-    enabled:
-      Boolean(baseUrl) &&
-      !urlLoading &&
-      enabled &&
-      Boolean(agentId) &&
-      Boolean(turnId),
-  })
-
-  return {
-    files: query.data ?? [],
-    loading: query.isLoading || urlLoading,
-    error: query.error ?? urlError,
-    refetch: query.refetch,
-  }
-}
-
-/**
- * Returns a callable that invalidates outputs / turn-files queries
- * for one agent across any baseUrl. Call after an assistant turn
- * completes so the rail (and the inline file-card strip) pick up
- * the new attributed rows. Cheap when the queries aren't mounted
- * — react-query just marks the cached value stale.
- *
- * Implementation note: react-query's `invalidateQueries({ queryKey })`
- * does positional partial-match, so passing `undefined` as the
- * baseUrl placeholder does NOT match a cached `[…, baseUrl, …]`
- * key — the cache stayed stale. Use a predicate so we ignore the
- * baseUrl position entirely.
- */
-export function useInvalidateAgentOutputs() {
-  const queryClient = useQueryClient()
-  return async (agentId: string, turnId?: string) => {
-    await Promise.all([
-      queryClient.invalidateQueries({
-        predicate: (query) => {
-          const key = query.queryKey
-          return (
-            Array.isArray(key) &&
-            key[0] === AGENT_QUERY_KEYS.agentOutputs &&
-            key[2] === agentId
-          )
-        },
-      }),
-      queryClient.invalidateQueries({
-        predicate: (query) => {
-          const key = query.queryKey
-          if (
-            !Array.isArray(key) ||
-            key[0] !== AGENT_QUERY_KEYS.agentTurnFiles ||
-            key[2] !== agentId
-          ) {
-            return false
-          }
-          // When a turnId was supplied, scope to just that turn's
-          // entry. Otherwise flush every cached turn for this agent.
-          return turnId ? key[3] === turnId : true
-        },
-      }),
-    ])
-  }
-}
-
-/**
- * Tiny mutation wrapper so the Outputs rail's "Refresh" button can
- * surface an `isPending` indicator while the new query is in flight.
- * No body — just triggers `refetch` on the rail's query for this
- * agent and resolves when it settles.
- */
-export function useRefreshAgentOutputs(agentId: string) {
-  const queryClient = useQueryClient()
-  const { baseUrl } = useAgentServerUrl()
-  return useMutation({
-    mutationFn: async () => {
-      await queryClient.refetchQueries({
-        queryKey: [AGENT_QUERY_KEYS.agentOutputs, baseUrl, agentId],
-        exact: true,
-      })
-    },
-  })
-}
--- a/packages/browseros-agent/apps/agent/lib/agent-files/useFilePreview.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-files/useFilePreview.ts
@@ -1,49 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- *
- * Single-file preview hook used by the inline artifact card and the
- * Outputs rail's preview Sheet. Always opt-in (`enabled`) — the
- * preview is fetched only when the user clicks a row, never
- * eagerly.
- */
-
-import { useQuery } from '@tanstack/react-query'
-import {
-  AGENT_QUERY_KEYS,
-  agentsFetch,
-} from '@/entrypoints/app/agents/useAgents'
-import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
-import type { FilePreview } from './types'
-
-export function useFilePreview(fileId: string | null, enabled = true) {
-  const {
-    baseUrl,
-    isLoading: urlLoading,
-    error: urlError,
-  } = useAgentServerUrl()
-
-  const query = useQuery<FilePreview, Error>({
-    queryKey: [AGENT_QUERY_KEYS.filePreview, baseUrl, fileId],
-    queryFn: async () => {
-      return agentsFetch<FilePreview>(
-        baseUrl as string,
-        `/files/${encodeURIComponent(fileId as string)}/preview`,
-      )
-    },
-    enabled: Boolean(baseUrl) && !urlLoading && enabled && Boolean(fileId),
-    // Previews are immutable for a given fileId — once loaded, never
-    // refetch on focus / reconnect. They go stale only when the
-    // underlying file is removed (rare in v1; no rename / delete).
-    staleTime: Infinity,
-    gcTime: 5 * 60 * 1000,
-  })
-
-  return {
-    preview: query.data ?? null,
-    loading: query.isLoading || urlLoading,
-    error: query.error ?? urlError,
-    refetch: query.refetch,
-  }
-}
--- a/packages/browseros-agent/apps/agent/lib/attachments.test.ts
+++ b/packages/browseros-agent/apps/agent/lib/attachments.test.ts
@@ -1,76 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { stageAttachment } from './attachments'
-
-function restoreGlobal(name: string, value: unknown) {
-  if (value === undefined) {
-    Reflect.deleteProperty(globalThis, name)
-    return
-  }
-  Reflect.set(globalThis, name, value)
-}
-
-describe('stageAttachment', () => {
-  it('uses the recompressed blob media type for large images', async () => {
-    const originalCreateImageBitmap = Reflect.get(
-      globalThis,
-      'createImageBitmap',
-    )
-    const originalOffscreenCanvas = Reflect.get(globalThis, 'OffscreenCanvas')
-    const originalHTMLCanvasElement = Reflect.get(
-      globalThis,
-      'HTMLCanvasElement',
-    )
-
-    class FakeOffscreenCanvas {
-      width: number
-      height: number
-
-      constructor(width: number, height: number) {
-        this.width = width
-        this.height = height
-      }
-
-      getContext() {
-        return {
-          drawImage() {},
-        }
-      }
-
-      async convertToBlob(options: { type?: string }) {
-        return new Blob([new Uint8Array([9, 8, 7])], {
-          type: options.type ?? 'image/jpeg',
-        })
-      }
-    }
-
-    try {
-      Reflect.set(globalThis, 'createImageBitmap', async () => ({
-        width: 4096,
-        height: 2048,
-        close() {},
-      }))
-      Reflect.set(globalThis, 'OffscreenCanvas', FakeOffscreenCanvas)
-      Reflect.set(globalThis, 'HTMLCanvasElement', class HTMLCanvasElement {})
-
-      const file = new File([new Uint8Array(2 * 1024 * 1024)], 'shot.png', {
-        type: 'image/png',
-      })
-
-      const result = await stageAttachment(file)
-
-      expect(result.ok).toBe(true)
-      if (!result.ok) throw new Error(result.error.message)
-      expect(result.attachment.mediaType).toBe('image/jpeg')
-      expect(result.attachment.dataUrl).toStartWith('data:image/jpeg;base64,')
-      expect(result.attachment.payload).toMatchObject({
-        kind: 'image',
-        mediaType: 'image/jpeg',
-        dataUrl: result.attachment.dataUrl,
-      })
-    } finally {
-      restoreGlobal('createImageBitmap', originalCreateImageBitmap)
-      restoreGlobal('OffscreenCanvas', originalOffscreenCanvas)
-      restoreGlobal('HTMLCanvasElement', originalHTMLCanvasElement)
-    }
-  })
-})
--- a/packages/browseros-agent/apps/agent/lib/attachments.ts
+++ b/packages/browseros-agent/apps/agent/lib/attachments.ts
@@ -100,7 +100,6 @@ export async function stageAttachment(
    try {
      const compressed = await compressImageIfNeeded(file)
      const dataUrl = await readAsDataUrl(compressed)
-      const encodedMediaType = compressed.type || mediaType
      // Rough byte ceiling — `data:image/png;base64,...` doubles size with
      // base64. Reject early so we never POST something the route will 400.
      if (dataUrl.length > MAX_IMAGE_BYTES * 2) {
@@ -119,12 +118,12 @@ export async function stageAttachment(
        attachment: {
          id: makeId(),
          kind: 'image',
-          mediaType: encodedMediaType,
+          mediaType,
          name: file.name || 'image',
          dataUrl,
          payload: {
            kind: 'image',
-            mediaType: encodedMediaType,
+            mediaType,
            dataUrl,
            name: file.name || undefined,
          },
--- a/packages/browseros-agent/apps/cli/README.md
+++ b/packages/browseros-agent/apps/cli/README.md
@@ -38,8 +38,8 @@ browseros-cli install                # downloads BrowserOS for your platform
 # If BrowserOS is installed but not running
 browseros-cli launch                 # opens BrowserOS, waits for server

-# Configure the CLI with the Server URL from BrowserOS settings
-browseros-cli init http://127.0.0.1:9000/mcp
+# Configure the CLI (auto-discovers running BrowserOS)
+browseros-cli init --auto            # detects server URL and saves config

 # Verify connection
 browseros-cli health
@@ -52,7 +52,7 @@ browseros-cli init <url>             # non-interactive — pass URL directly
 browseros-cli init                   # interactive — prompts for URL
 ```

-Config is saved to `~/.config/browseros-cli/config.yaml`. If `browseros-cli health` cannot connect, copy the current Server URL from BrowserOS Settings > BrowserOS MCP and run `browseros-cli init <Server URL>` again.
+Config is saved to `~/.config/browseros-cli/config.yaml`. The CLI also auto-discovers the server from `~/.browseros/server.json` (written by BrowserOS on startup).

 ### CLI updates

@@ -126,9 +126,9 @@ To connect Claude Code, Gemini CLI, or any MCP client, see the [MCP setup guide]
 | `--debug` | `BOS_DEBUG=1` | Debug output |
 | `--timeout, -t` | | Request timeout (default: 2m) |

-Priority for server URL: `--server` flag > `BROWSEROS_URL` env > config file
+Priority for server URL: `--server` flag > `BROWSEROS_URL` env > `~/.browseros/server.json` > config file

-If no server URL is configured, the CLI exits with setup instructions pointing to `install`, `launch`, and `init <Server URL>`.
+If no server URL is configured, the CLI exits with setup instructions pointing to `install`, `launch`, and `init`.

 ## Testing

@@ -179,7 +179,7 @@ apps/cli/
 │   └── config.go       # Config file (~/.config/browseros-cli/config.yaml)
 ├── cmd/
 │   ├── root.go         # Root command, global flags
-│   ├── init.go         # Server URL configuration (URL arg or interactive)
+│   ├── init.go         # Server URL configuration (URL arg, --auto, interactive)
 │   ├── install.go      # install (download BrowserOS for current platform)
 │   ├── launch.go       # launch (find and start BrowserOS, wait for server)
 │   ├── open.go         # open (new_page / new_hidden_page)
--- a/packages/browseros-agent/apps/cli/cmd/init.go
+++ b/packages/browseros-agent/apps/cli/cmd/init.go
@@ -17,6 +17,8 @@ import (
 )

 func init() {
+	var autoDiscover bool
+
 	cmd := &cobra.Command{
 		Use:   "init [url]",
 		Short: "Configure the BrowserOS server connection",
@@ -32,8 +34,9 @@ You can provide the full URL or just the port number:
  browseros-cli init http://127.0.0.1:9000/mcp
  browseros-cli init 9000

-Modes:
+Three modes:
  browseros-cli init <url>    Non-interactive (full URL or port number)
+  browseros-cli init --auto   Auto-discover a running server (server.json, saved config, common ports)
  browseros-cli init          Interactive prompt`,
 		Annotations: map[string]string{"group": "Setup:"},
 		Args:        cobra.MaximumNArgs(1),
@@ -46,9 +49,22 @@ Modes:

 			switch {
 			case len(args) == 1:
+				// Non-interactive: URL provided as argument
 				input = args[0]

+			case autoDiscover:
+				// Auto-discover: server.json → config → probe common ports
+				discovered := probeRunningServer()
+				if discovered == "" {
+					output.Error("auto-discovery failed: no running BrowserOS found.\n\n"+
+						"  If not running:    browseros-cli launch\n"+
+						"  If not installed:  browseros-cli install", 1)
+				}
+				input = discovered
+				fmt.Printf("Auto-discovered server at %s\n", input)
+
 			default:
+				// Interactive prompt (original behavior)
 				fmt.Println()
 				bold.Println("BrowserOS CLI Setup")
 				fmt.Println()
@@ -79,14 +95,12 @@ Modes:
 				output.Errorf(1, "invalid URL: %s", input)
 			}

+			// Verify connectivity
 			fmt.Printf("Checking connection to %s ...\n", baseURL)
 			client := &http.Client{Timeout: 5 * time.Second}
 			resp, err := client.Get(baseURL + "/health")
 			if err != nil {
-				output.Errorf(1, "cannot connect to %s: %v\n\n"+
-					"Open BrowserOS Settings > BrowserOS MCP and copy the Server URL.\n"+
-					"Then run: browseros-cli init <Server URL>\n"+
-					"Example:  browseros-cli init http://127.0.0.1:9000/mcp", baseURL, err)
+				output.Errorf(1, "cannot connect to %s: %v\nIs BrowserOS running?", baseURL, err)
 			}
 			resp.Body.Close()

@@ -107,5 +121,6 @@ Modes:
 		},
 	}

+	cmd.Flags().BoolVar(&autoDiscover, "auto", false, "Auto-discover a running BrowserOS server (server.json, saved config, then common ports)")
 	rootCmd.AddCommand(cmd)
 }
--- a/packages/browseros-agent/apps/cli/cmd/install.go
+++ b/packages/browseros-agent/apps/cli/cmd/install.go
@@ -28,7 +28,7 @@ Linux:   Downloads AppImage (or .deb with --deb flag)

 After installation:
  browseros-cli launch        # start BrowserOS
-  browseros-cli init <url>    # configure the CLI with the Server URL`,
+  browseros-cli init --auto   # configure the CLI`,
 		Annotations: map[string]string{"group": "Setup:"},
 		Args:        cobra.NoArgs,
 		Run: func(cmd *cobra.Command, args []string) {
@@ -81,7 +81,7 @@ After installation:
 			fmt.Println()
 			bold.Println("Next steps:")
 			dim.Println("  browseros-cli launch        # start BrowserOS")
-			dim.Println("  browseros-cli init <url>    # use the Server URL from BrowserOS settings")
+			dim.Println("  browseros-cli init --auto   # configure the CLI")
 		},
 	}

--- a/packages/browseros-agent/apps/cli/cmd/launch.go
+++ b/packages/browseros-agent/apps/cli/cmd/launch.go
@@ -1,7 +1,6 @@
 package cmd

 import (
-	"encoding/json"
 	"fmt"
 	"net/http"
 	"os"
@@ -39,7 +38,6 @@ If BrowserOS is already running, reports the server URL.`,

 			if url := probeRunningServer(); url != "" {
 				green.Printf("BrowserOS is already running at %s\n", url)
-				dim.Printf("Next: browseros-cli init %s\n", mcpEndpointURL(url))
 				return
 			}

@@ -65,7 +63,7 @@ If BrowserOS is already running, reports the server URL.`,

 			green.Printf("BrowserOS is ready at %s\n", url)
 			fmt.Println()
-			dim.Printf("Next: browseros-cli init %s\n", mcpEndpointURL(url))
+			dim.Println("Next: browseros-cli init --auto")
 		},
 	}

@@ -77,77 +75,39 @@ If BrowserOS is already running, reports the server URL.`,
 // Server probing
 // ---------------------------------------------------------------------------

-var commonBrowserOSPorts = []int{9100, 9200, 9300}
-
-// probeRunningServer checks launch discovery, explicit config, and common ports for a running server.
+// probeRunningServer checks server.json, config, and common ports for a running server.
 func probeRunningServer() string {
-	client := &http.Client{Timeout: 2 * time.Second}
+	check := func(baseURL string) bool {
+		client := &http.Client{Timeout: 2 * time.Second}
+		resp, err := client.Get(baseURL + "/health")
+		if err != nil {
+			return false
+		}
+		resp.Body.Close()
+		return resp.StatusCode == http.StatusOK
+	}

-	if url := loadBrowserosServerURL(); url != "" && checkServerHealth(client, url) {
+	// 1. server.json — written by BrowserOS on startup with the actual port
+	if url := loadBrowserosServerURL(); url != "" && check(url) {
 		return url
 	}

-	if url := defaultServerURL(); url != "" && checkServerHealth(client, url) {
+	// 2. Saved config / env var
+	if url := defaultServerURL(); url != "" && check(url) {
 		return url
 	}

-	return probeCommonServerPorts(client)
-}
-
-func checkServerHealth(client *http.Client, baseURL string) bool {
-	resp, err := client.Get(baseURL + "/health")
-	if err != nil {
-		return false
-	}
-	resp.Body.Close()
-	return resp.StatusCode == 200
-}
-
-func probeCommonServerPorts(client *http.Client) string {
-	for _, port := range commonBrowserOSPorts {
+	// 3. Probe common BrowserOS ports as last resort
+	for _, port := range []int{9100, 9200, 9300} {
 		url := fmt.Sprintf("http://127.0.0.1:%d", port)
-		if checkServerHealth(client, url) {
+		if check(url) {
 			return url
 		}
 	}
+
 	return ""
 }

-type serverDiscoveryConfig struct {
-	ServerPort       int    `json:"server_port"`
-	URL              string `json:"url"`
-	ServerVersion    string `json:"server_version"`
-	BrowserOSVersion string `json:"browseros_version,omitempty"`
-	ChromiumVersion  string `json:"chromium_version,omitempty"`
-}
-
-// loadBrowserosServerURL reads BrowserOS's runtime discovery file for launch readiness only.
-//
-// Normal command resolution must not call this because it can override a URL the
-// user explicitly saved with `browseros-cli init <Server URL>`.
-func loadBrowserosServerURL() string {
-	home, err := os.UserHomeDir()
-	if err != nil {
-		return ""
-	}
-
-	data, err := os.ReadFile(filepath.Join(home, ".browseros", "server.json"))
-	if err != nil {
-		return ""
-	}
-
-	var sc serverDiscoveryConfig
-	if err := json.Unmarshal(data, &sc); err != nil {
-		return ""
-	}
-
-	return normalizeServerURL(sc.URL)
-}
-
-func mcpEndpointURL(baseURL string) string {
-	return strings.TrimSuffix(baseURL, "/") + "/mcp"
-}
-
 // ---------------------------------------------------------------------------
 // Platform-native installation detection
 // ---------------------------------------------------------------------------
@@ -157,8 +117,7 @@ func mcpEndpointURL(baseURL string) string {
 // macOS:   `open -Ra "BrowserOS"` — queries Launch Services (finds apps anywhere)
 // Linux:   checks /usr/bin/browseros (.deb), browseros.desktop, or AppImage files
 // Windows: checks executable at %LOCALAPPDATA%\BrowserOS\Application\BrowserOS.exe
-//
-//	and registry uninstall key (per-user Chromium install pattern)
+//          and registry uninstall key (per-user Chromium install pattern)
 func isBrowserOSInstalled() bool {
 	switch runtime.GOOS {
 	case "darwin":
@@ -312,11 +271,14 @@ func waitForServer(maxWait time.Duration) (string, bool) {

 	for time.Now().Before(deadline) {
 		// server.json is written by BrowserOS on startup with the actual port
-		if url := loadBrowserosServerURL(); url != "" && checkServerHealth(client, url) {
-			return url, true
-		}
-		if url := probeCommonServerPorts(client); url != "" {
-			return url, true
+		if url := loadBrowserosServerURL(); url != "" {
+			resp, err := client.Get(url + "/health")
+			if err == nil {
+				resp.Body.Close()
+				if resp.StatusCode == http.StatusOK {
+					return url, true
+				}
+			}
 		}
 		fmt.Print(".")
 		time.Sleep(1 * time.Second)
--- a/packages/browseros-agent/apps/cli/cmd/launch_test.go
+++ b/packages/browseros-agent/apps/cli/cmd/launch_test.go
@@ -1,99 +0,0 @@
-package cmd
-
-import (
-	"fmt"
-	"net"
-	"net/http"
-	"net/http/httptest"
-	"net/url"
-	"os"
-	"path/filepath"
-	"strconv"
-	"testing"
-	"time"
-
-	"browseros-cli/config"
-)
-
-func TestProbeRunningServerUsesDiscoveryBeforeConfig(t *testing.T) {
-	home := t.TempDir()
-	t.Setenv("HOME", home)
-	t.Setenv("USERPROFILE", home)
-	t.Setenv("XDG_CONFIG_HOME", t.TempDir())
-	t.Setenv("BROWSEROS_URL", "")
-
-	discoveredServer := newHealthyServer(t)
-	configServer := newHealthyServer(t)
-
-	serverDir := filepath.Join(home, ".browseros")
-	if err := os.MkdirAll(serverDir, 0755); err != nil {
-		t.Fatalf("os.MkdirAll() error = %v", err)
-	}
-	data := []byte(fmt.Sprintf(`{"url":%q}`, discoveredServer.URL))
-	if err := os.WriteFile(filepath.Join(serverDir, "server.json"), data, 0644); err != nil {
-		t.Fatalf("os.WriteFile() error = %v", err)
-	}
-	if err := config.Save(&config.Config{ServerURL: configServer.URL}); err != nil {
-		t.Fatalf("config.Save() error = %v", err)
-	}
-
-	got := probeRunningServer()
-	if got != normalizeServerURL(discoveredServer.URL) {
-		t.Fatalf("probeRunningServer() = %q, want %q", got, normalizeServerURL(discoveredServer.URL))
-	}
-}
-
-func TestWaitForServerUsesCommonPortFallback(t *testing.T) {
-	home := t.TempDir()
-	t.Setenv("HOME", home)
-	t.Setenv("USERPROFILE", home)
-
-	server := newHealthyServer(t)
-	port := serverPort(t, server.URL)
-
-	originalPorts := commonBrowserOSPorts
-	commonBrowserOSPorts = []int{port}
-	t.Cleanup(func() {
-		commonBrowserOSPorts = originalPorts
-	})
-
-	got, ok := waitForServer(100 * time.Millisecond)
-	if !ok {
-		t.Fatal("waitForServer() ok = false, want true")
-	}
-	if got != normalizeServerURL(server.URL) {
-		t.Fatalf("waitForServer() = %q, want %q", got, normalizeServerURL(server.URL))
-	}
-}
-
-func newHealthyServer(t *testing.T) *httptest.Server {
-	t.Helper()
-
-	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		if r.URL.Path != "/health" {
-			http.NotFound(w, r)
-			return
-		}
-		w.WriteHeader(http.StatusOK)
-	}))
-	t.Cleanup(server.Close)
-	return server
-}
-
-func serverPort(t *testing.T, rawURL string) int {
-	t.Helper()
-
-	parsed, err := url.Parse(rawURL)
-	if err != nil {
-		t.Fatalf("url.Parse() error = %v", err)
-	}
-	_, portText, err := net.SplitHostPort(parsed.Host)
-	if err != nil {
-		t.Fatalf("net.SplitHostPort() error = %v", err)
-	}
-	port, err := strconv.Atoi(portText)
-	if err != nil {
-		t.Fatalf("strconv.Atoi() error = %v", err)
-	}
-	return port
-}
--- a/packages/browseros-agent/apps/cli/cmd/root.go
+++ b/packages/browseros-agent/apps/cli/cmd/root.go
@@ -2,8 +2,10 @@ package cmd

 import (
 	"context"
+	"encoding/json"
 	"fmt"
 	"os"
+	"path/filepath"
 	"strconv"
 	"strings"
 	"time"
@@ -287,15 +289,18 @@ func drainAutomaticUpdateCheckWithTimeout(done <-chan struct{}, timeout time.Dur
 	}
 }

-// defaultServerURL returns the implicit target from user-controlled settings only.
-//
-// BrowserOS writes a discovery file at runtime, but normal commands intentionally
-// ignore it so a saved URL is not silently overridden by another running server.
 func defaultServerURL() string {
+	// 1. Explicit env var always wins
 	if env := normalizeServerURL(os.Getenv("BROWSEROS_URL")); env != "" {
 		return env
 	}

+	// 2. Live discovery file from running BrowserOS (most current)
+	if url := loadBrowserosServerURL(); url != "" {
+		return url
+	}
+
+	// 3. Saved config (may be stale if port changed)
 	cfg, err := config.Load()
 	if err == nil {
 		if url := normalizeServerURL(cfg.ServerURL); url != "" {
@@ -306,6 +311,33 @@ func defaultServerURL() string {
 	return ""
 }

+type serverDiscoveryConfig struct {
+	ServerPort       int    `json:"server_port"`
+	URL              string `json:"url"`
+	ServerVersion    string `json:"server_version"`
+	BrowserOSVersion string `json:"browseros_version,omitempty"`
+	ChromiumVersion  string `json:"chromium_version,omitempty"`
+}
+
+func loadBrowserosServerURL() string {
+	home, err := os.UserHomeDir()
+	if err != nil {
+		return ""
+	}
+
+	data, err := os.ReadFile(filepath.Join(home, ".browseros", "server.json"))
+	if err != nil {
+		return ""
+	}
+
+	var sc serverDiscoveryConfig
+	if err := json.Unmarshal(data, &sc); err != nil {
+		return ""
+	}
+
+	return normalizeServerURL(sc.URL)
+}
+
 func normalizeServerURL(raw string) string {
 	normalized := strings.TrimSpace(raw)

@@ -337,10 +369,8 @@ func validateServerURL(raw string) (string, error) {

 	return "", fmt.Errorf(
 		"BrowserOS server URL is not configured.\n\n" +
-			"  Open BrowserOS Settings > BrowserOS MCP and copy the Server URL.\n" +
-			"  Save it with:       browseros-cli init <Server URL>\n" +
-			"  Example:            browseros-cli init http://127.0.0.1:9000/mcp\n" +
-			"  If BrowserOS is closed:  browseros-cli launch\n" +
-			"  If not installed:        browseros-cli install",
+			"  If BrowserOS is running:  browseros-cli init --auto\n" +
+			"  If BrowserOS is closed:   browseros-cli launch\n" +
+			"  If not installed:         browseros-cli install",
 	)
 }
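Review note: with this change `defaultServerURL` resolves in the order env var, discovery file, saved config. The sketch below restates that precedence as a pure function so it can be checked without touching `$HOME`. `resolveServerURL`, `discoveryFile`, and `normalize` are hypothetical names, and `normalize` is only a simplified stand-in for `normalizeServerURL` (assumed to trim whitespace and a trailing `/mcp`).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// discoveryFile mirrors only the "url" field of server.json.
type discoveryFile struct {
	URL string `json:"url"`
}

// normalize is an assumed simplification of normalizeServerURL:
// it trims whitespace, a trailing slash, and a trailing "/mcp".
func normalize(raw string) string {
	s := strings.TrimSpace(raw)
	s = strings.TrimSuffix(s, "/")
	s = strings.TrimSuffix(s, "/mcp")
	return strings.TrimSuffix(s, "/")
}

// resolveServerURL applies the precedence env > discovery file > saved config.
func resolveServerURL(env string, discoveryJSON []byte, saved string) string {
	if u := normalize(env); u != "" {
		return u
	}
	var d discoveryFile
	if err := json.Unmarshal(discoveryJSON, &d); err == nil {
		if u := normalize(d.URL); u != "" {
			return u
		}
	}
	return normalize(saved)
}

func main() {
	disc := []byte(`{"url":"http://127.0.0.1:9999"}`)
	// Discovery file overrides the saved config.
	fmt.Println(resolveServerURL("", disc, "http://127.0.0.1:9000/mcp"))
	// Env var overrides both.
	fmt.Println(resolveServerURL("http://127.0.0.1:9115/mcp", disc, "http://127.0.0.1:9000/mcp"))
	// Missing or unreadable discovery data falls back to the saved config.
	fmt.Println(resolveServerURL("", nil, "http://127.0.0.1:9000/mcp"))
}
```

Note that the removed doc comment argued for the opposite choice: ignoring the discovery file so a saved URL is never silently overridden by another running server.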
--- a/packages/browseros-agent/apps/cli/cmd/root_test.go
+++ b/packages/browseros-agent/apps/cli/cmd/root_test.go
@@ -1,13 +1,8 @@
 package cmd

 import (
-	"os"
-	"path/filepath"
-	"strings"
 	"testing"
 	"time"
-
-	"browseros-cli/config"
 )

 func TestSetVersionUpdatesRootCommand(t *testing.T) {
@@ -105,76 +100,6 @@ func TestShouldSkipAutomaticUpdates(t *testing.T) {
 	}
 }

-func TestDefaultServerURLUsesEnvBeforeConfig(t *testing.T) {
-	t.Setenv("XDG_CONFIG_HOME", t.TempDir())
-	t.Setenv("BROWSEROS_URL", "http://127.0.0.1:9115/mcp")
-
-	if err := config.Save(&config.Config{ServerURL: "http://127.0.0.1:9000/mcp"}); err != nil {
-		t.Fatalf("config.Save() error = %v", err)
-	}
-
-	got := defaultServerURL()
-	if got != "http://127.0.0.1:9115" {
-		t.Fatalf("defaultServerURL() = %q, want %q", got, "http://127.0.0.1:9115")
-	}
-}
-
-func TestDefaultServerURLUsesSavedConfig(t *testing.T) {
-	t.Setenv("XDG_CONFIG_HOME", t.TempDir())
-	t.Setenv("BROWSEROS_URL", "")
-
-	if err := config.Save(&config.Config{ServerURL: "http://127.0.0.1:9115/mcp"}); err != nil {
-		t.Fatalf("config.Save() error = %v", err)
-	}
-
-	got := defaultServerURL()
-	if got != "http://127.0.0.1:9115" {
-		t.Fatalf("defaultServerURL() = %q, want %q", got, "http://127.0.0.1:9115")
-	}
-}
-
-func TestDefaultServerURLIgnoresBrowserOSServerJSON(t *testing.T) {
-	home := t.TempDir()
-	t.Setenv("HOME", home)
-	t.Setenv("USERPROFILE", home)
-	t.Setenv("XDG_CONFIG_HOME", t.TempDir())
-	t.Setenv("BROWSEROS_URL", "")
-
-	serverDir := filepath.Join(home, ".browseros")
-	if err := os.MkdirAll(serverDir, 0755); err != nil {
-		t.Fatalf("os.MkdirAll() error = %v", err)
-	}
-	data := []byte(`{"url":"http://127.0.0.1:9999"}`)
-	if err := os.WriteFile(filepath.Join(serverDir, "server.json"), data, 0644); err != nil {
-		t.Fatalf("os.WriteFile() error = %v", err)
-	}
-
-	if got := defaultServerURL(); got != "" {
-		t.Fatalf("defaultServerURL() = %q, want empty", got)
-	}
-}
-
-func TestNormalizeServerURLAcceptsMCPEndpoint(t *testing.T) {
-	got := normalizeServerURL(" http://127.0.0.1:9115/mcp ")
-	if got != "http://127.0.0.1:9115" {
-		t.Fatalf("normalizeServerURL() = %q, want %q", got, "http://127.0.0.1:9115")
-	}
-}
-
-func TestValidateServerURLExplainsManualInit(t *testing.T) {
-	_, err := validateServerURL("")
-	if err == nil {
-		t.Fatal("validateServerURL() error = nil, want setup instructions")
-	}
-	msg := err.Error()
-	if !strings.Contains(msg, "browseros-cli init <Server URL>") {
-		t.Fatalf("validateServerURL() error = %q, want manual init instructions", msg)
-	}
-	if strings.Contains(msg, "init --auto") {
-		t.Fatalf("validateServerURL() error = %q, should not mention init --auto", msg)
-	}
-}
-
 func TestDrainAutomaticUpdateCheckWithTimeoutWaitsForCompletion(t *testing.T) {
 	done := make(chan struct{})
 	returned := make(chan struct{})
--- a/packages/browseros-agent/apps/cli/mcp/client.go
+++ b/packages/browseros-agent/apps/cli/mcp/client.go
@@ -44,7 +44,10 @@ func (c *Client) connect(ctx context.Context) (*sdkmcp.ClientSession, error) {

 	session, err := sdkClient.Connect(ctx, transport, nil)
 	if err != nil {
-		return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w%s", c.BaseURL, err, connectionSetupInstructions())
+		return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n\n"+
+			"  If BrowserOS is running on a different port:  browseros-cli init --auto\n"+
+			"  If BrowserOS is not running:                  browseros-cli launch\n"+
+			"  If not installed:                             browseros-cli install", c.BaseURL, err)
 	}
 	return session, nil
 }
@@ -184,7 +187,10 @@ func (c *Client) Status() (map[string]any, error) {
 func (c *Client) restGET(path string) (map[string]any, error) {
 	resp, err := c.HTTPClient.Get(c.BaseURL + path)
 	if err != nil {
-		return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w%s", c.BaseURL, err, connectionSetupInstructions())
+		return nil, fmt.Errorf("cannot connect to BrowserOS at %s: %w\n\n"+
+			"  If BrowserOS is running on a different port:  browseros-cli init --auto\n"+
+			"  If BrowserOS is not running:                  browseros-cli launch\n"+
+			"  If not installed:                             browseros-cli install", c.BaseURL, err)
 	}
 	defer resp.Body.Close()

@@ -199,14 +205,3 @@ func (c *Client) restGET(path string) (map[string]any, error) {
 	}
 	return data, nil
 }
-
-// connectionSetupInstructions explains how to recover from a stale or missing server URL.
-func connectionSetupInstructions() string {
-	return "\n\n" +
-		"  Open BrowserOS Settings > BrowserOS MCP and copy the Server URL.\n" +
-		"  Save it with:       browseros-cli init <Server URL>\n" +
-		"  Example:            browseros-cli init http://127.0.0.1:9000/mcp\n" +
-		"  Run once with:      browseros-cli --server <Server URL> health\n" +
-		"  If BrowserOS is closed:  browseros-cli launch\n" +
-		"  If not installed:        browseros-cli install"
-}
--- a/packages/browseros-agent/apps/cli/npm/README.md
+++ b/packages/browseros-agent/apps/cli/npm/README.md
@@ -31,8 +31,8 @@ browseros-cli install
 # Start BrowserOS
 browseros-cli launch

-# Configure MCP settings with the Server URL from BrowserOS settings
-browseros-cli init http://127.0.0.1:9000/mcp
+# Auto-configure MCP settings for your AI tools
+browseros-cli init --auto

 # Verify everything is working
 browseros-cli health
--- a/packages/browseros-agent/apps/eval/README.md
+++ b/packages/browseros-agent/apps/eval/README.md
@@ -9,7 +9,6 @@ Evaluation framework for BrowserOS browser automation agents. Runs tasks from st
 - **BrowserOS binary** at `/Applications/BrowserOS.app` (macOS) or `BROWSEROS_BINARY` pointing at it
 - **Bun** runtime
 - **API keys** for your LLM provider (and `CLAUDE_CODE_OAUTH_TOKEN` if you use `performance_grader`)
-- **Python 3.10+ with `agisdk`** for AGI SDK / REAL Bench grading. Set `BROWSEROS_EVAL_PYTHON` if your default `python3` is older.

 ## Quick Start

@@ -68,7 +67,7 @@ This lets us run the same suite against multiple model setups without copying th

 ```txt
 agisdk-daily-10 + kimi-fireworks
-agisdk-daily-10 + claude-opus
+agisdk-daily-10 + claude-sonnet
 agisdk-daily-10 + clado-action-000159
 ```

@@ -80,7 +79,6 @@ For `orchestrator-executor` suites, there can also be an executor model/backend.
 |------|-------------|
 | `single` | Single LLM agent driven by the BrowserOS tool loop (CDP) |
 | `orchestrator-executor` | High-level orchestrator + per-step executor (LLM or Clado visual model) |
-| `claude-code` | External Claude Code CLI driven through BrowserOS MCP |

 ### Single agent

@@ -121,24 +119,6 @@ The orchestrator works with any LLM provider. The executor can be another LLM, o
 }
 ```

-### Claude Code
-
-Claude Code runs as an external `claude -p` subprocess. The eval runner passes a task-scoped MCP config that points Claude Code at the active worker's BrowserOS MCP endpoint, while the eval capture layer still saves messages, screenshots, trajectory metadata, and grader outputs.
-
-```json
-{
-  "agent": {
-    "type": "claude-code",
-    "model": "opus"
-  }
-}
-```
-
-```bash
-BROWSEROS_EVAL_PYTHON=/path/to/python3 bun run eval run --config configs/legacy/claude-code-agisdk-real.json
-bun run eval suite --config configs/legacy/claude-code-agisdk-real.json --publish r2
-```
-
 ## Graders

 | Name | Description |
@@ -171,7 +151,6 @@ The `apiKey` field supports two formats:
 | `CLADO_ACTION_MODEL`, `CLADO_ACTION_API_KEY`, `CLADO_ACTION_BASE_URL` | Clado executor defaults |
 | `BROWSEROS_BINARY` | BrowserOS binary path in CI/local smoke runs |
 | `BROWSEROS_SERVER_URL` | Optional grader MCP URL override |
-| `BROWSEROS_EVAL_PYTHON` | Optional Python interpreter for JSON graders such as `agisdk_state_diff` |
 | `WEBARENA_INFINITY_DIR` | Local WebArena-Infinity checkout for Infinity tasks |
 | `NOPECHA_API_KEY` | CAPTCHA solver extension |
 | `EVAL_R2_ACCOUNT_ID`, `EVAL_R2_ACCESS_KEY_ID`, `EVAL_R2_SECRET_ACCESS_KEY`, `EVAL_R2_BUCKET`, `EVAL_R2_CDN_BASE_URL` | R2 upload and viewer URL |
@@ -215,7 +194,7 @@ Published runs are available at `EVAL_R2_CDN_BASE_URL/viewer.html?run=<run-id>`.
  "base_server_port": 9110,
  "base_extension_port": 9310,
  "load_extensions": false,
-  "headless": false
+  "headless": true
 }
 ```

--- a/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-kimi-k2-5-agisdk-real.json
+++ b/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-kimi-k2-5-agisdk-real.json
@@ -1,26 +0,0 @@
-{
-  "agent": {
-    "type": "single",
-    "provider": "openai-compatible",
-    "model": "moonshotai/kimi-k2.5",
-    "apiKey": "OPENROUTER_API_KEY",
-    "baseUrl": "https://openrouter.ai/api/v1",
-    "supportsImages": true
-  },
-  "dataset": "../../data/agisdk-real.jsonl",
-  "num_workers": 3,
-  "restart_server_per_task": true,
-  "browseros": {
-    "server_url": "http://127.0.0.1:9110",
-    "base_cdp_port": 9010,
-    "base_server_port": 9110,
-    "base_extension_port": 9310,
-    "load_extensions": false,
-    "headless": false
-  },
-  "captcha": {
-    "api_key_env": "NOPECHA_API_KEY"
-  },
-  "graders": ["agisdk_state_diff"],
-  "timeout_ms": 1800000
-}
--- a/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-opus-4-6-agisdk-real.json
+++ b/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-opus-4-6-agisdk-real.json
@@ -1,27 +0,0 @@
-{
-  "agent": {
-    "type": "single",
-    "provider": "bedrock",
-    "model": "global.anthropic.claude-opus-4-6-v1",
-    "region": "AWS_REGION",
-    "accessKeyId": "AWS_ACCESS_KEY_ID",
-    "secretAccessKey": "AWS_SECRET_ACCESS_KEY",
-    "supportsImages": true
-  },
-  "dataset": "../../data/agisdk-real.jsonl",
-  "num_workers": 2,
-  "restart_server_per_task": true,
-  "browseros": {
-    "server_url": "http://127.0.0.1:9110",
-    "base_cdp_port": 9010,
-    "base_server_port": 9110,
-    "base_extension_port": 9310,
-    "load_extensions": false,
-    "headless": false
-  },
-  "captcha": {
-    "api_key_env": "NOPECHA_API_KEY"
-  },
-  "graders": ["agisdk_state_diff"],
-  "timeout_ms": 1800000
-}
--- a/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-weekly.json
+++ b/packages/browseros-agent/apps/eval/configs/legacy/browseros-agent-weekly.json
@@ -7,8 +7,8 @@
    "baseUrl": "https://openrouter.ai/api/v1",
    "supportsImages": true
  },
-  "dataset": "../../data/agisdk-real.jsonl",
-  "num_workers": 3,
+  "dataset": "../../data/webbench-2of4-50.jsonl",
+  "num_workers": 10,
  "restart_server_per_task": true,
  "browseros": {
    "server_url": "http://127.0.0.1:9110",
@@ -21,6 +21,6 @@
  "captcha": {
    "api_key_env": "NOPECHA_API_KEY"
  },
-  "graders": ["agisdk_state_diff"],
+  "graders": ["performance_grader"],
  "timeout_ms": 1800000
 }
--- a/packages/browseros-agent/apps/eval/configs/legacy/browseros-oe-clado-weekly.json
+++ b/packages/browseros-agent/apps/eval/configs/legacy/browseros-oe-clado-weekly.json
@@ -23,7 +23,7 @@
    "base_server_port": 9110,
    "base_extension_port": 9310,
    "load_extensions": false,
-    "headless": false
+    "headless": true
  },
  "captcha": {
    "api_key_env": "NOPECHA_API_KEY"
--- a/packages/browseros-agent/apps/eval/configs/legacy/claude-code-agisdk-real.json
+++ b/packages/browseros-agent/apps/eval/configs/legacy/claude-code-agisdk-real.json
@@ -1,23 +0,0 @@
-{
-  "agent": {
-    "type": "claude-code",
-    "model": "opus",
-    "extraArgs": ["--permission-mode", "bypassPermissions"]
-  },
-  "dataset": "../../data/agisdk-real.jsonl",
-  "num_workers": 1,
-  "restart_server_per_task": true,
-  "browseros": {
-    "server_url": "http://127.0.0.1:9110",
-    "base_cdp_port": 9010,
-    "base_server_port": 9110,
-    "base_extension_port": 9310,
-    "load_extensions": false,
-    "headless": false
-  },
-  "captcha": {
-    "api_key_env": "NOPECHA_API_KEY"
-  },
-  "graders": ["agisdk_state_diff"],
-  "timeout_ms": 1800000
-}
--- a/packages/browseros-agent/apps/eval/configs/suites/agisdk-daily-10.json
+++ b/packages/browseros-agent/apps/eval/configs/suites/agisdk-daily-10.json
@@ -14,7 +14,7 @@
    "base_server_port": 9110,
    "base_extension_port": 9310,
    "load_extensions": false,
-    "headless": false
+    "headless": true
  },
  "captcha": {
    "api_key_env": "NOPECHA_API_KEY"
--- a/packages/browseros-agent/apps/eval/scripts/generate-report.ts
+++ b/packages/browseros-agent/apps/eval/scripts/generate-report.ts
@@ -1,191 +0,0 @@
-#!/usr/bin/env bun
-
-import { mkdir, stat } from 'node:fs/promises'
-import { dirname, resolve } from 'node:path'
-import { query as claudeQuery } from '@anthropic-ai/claude-agent-sdk'
-import { readRunMetricSummary } from '../src/reporting/task-metrics'
-
-export const DEFAULT_REPORT_MODEL = 'claude-opus-4-6'
-export const DEFAULT_REPORT_MAX_TURNS = 300
-
-type Env = Record<string, string | undefined>
-type ClaudeQuery = (input: unknown) => AsyncIterable<Record<string, unknown>>
-
-export interface ReportAgentInvocation {
-  inputDir: string
-  outputPath: string
-  prompt: string
-}
-
-export interface GenerateEvalReportOptions {
-  inputDir: string
-  outputPath: string
-  runAgent?: (invocation: ReportAgentInvocation) => Promise<void>
-}
-
-interface ClaudeReportAgentDeps {
-  query?: ClaudeQuery
-  env?: Env
-}
-
-function usage(): string {
-  return `Usage: bun scripts/generate-report.ts --input <run-dir> --output <report.html>`
-}
-
-function parseArgs(
-  argv: string[],
-): Pick<GenerateEvalReportOptions, 'inputDir' | 'outputPath'> {
-  let inputDir = ''
-  let outputPath = ''
-  for (let i = 0; i < argv.length; i++) {
-    const arg = argv[i]
-    if (arg === '--input' || arg === '--run') {
-      inputDir = argv[++i] ?? ''
-    } else if (arg === '--output' || arg === '--out') {
-      outputPath = argv[++i] ?? ''
-    } else if (arg === '--help' || arg === '-h') {
-      console.log(usage())
-      process.exit(0)
-    }
-  }
-  if (!inputDir || !outputPath) {
-    throw new Error(usage())
-  }
-  return { inputDir, outputPath }
-}
-
-function claudeCodeEnv(env: Env): Env {
-  return {
-    CLAUDE_CODE_OAUTH_TOKEN: env.CLAUDE_CODE_OAUTH_TOKEN,
-    ANTHROPIC_API_KEY: env.ANTHROPIC_API_KEY,
-    HOME: env.HOME,
-    PATH: env.PATH,
-    SHELL: env.SHELL,
-    TMPDIR: env.TMPDIR,
-    TMP: env.TMP,
-    TEMP: env.TEMP,
-    USER: env.USER,
-    CLAUDECODE: '',
-  }
-}
-
-async function buildReportPrompt(
-  inputDir: string,
-  outputPath: string,
-): Promise<string> {
-  const metrics = await readRunMetricSummary(inputDir)
-
-  return `Analyze this BrowserOS eval run and write a shareable HTML report.
-
-Run directory: ${inputDir}
-Output file to write: ${outputPath}
-
-You are running with the run directory as cwd. Inspect the local artifacts:
-- summary.json for run totals and pass rate
-- each task directory's metadata.json for query, final answer, timing, screenshots, and grader results
-- each task directory's messages.jsonl for tool calls, tool errors, and recent trajectory
-- screenshots/ for visual evidence
-- grader-artifacts/ when present for grader-specific context
-
-Write the final report directly to the output file path above. Do not print the
-report instead of writing it. Do not modify any input artifacts. The only file
-you should create or overwrite is the requested report.html.
-
-The report should follow the style and density of the Shadowfax AGI SDK report:
-- Title like "AGI SDK Random-10 Failure Report" or a run-specific equivalent
-- Run directory and note that screenshots are embedded as data URIs
-- Summary cards for total tasks, passed, failed, pass rate, average duration, average steps, and average tool calls
-- A Metrics section with compact charts for Duration by task, Steps by task, Tool calls by task, and Tool errors by task
-- Task Summary table with task id, status, score, duration, steps, and prompt
-- Include tool calls and tool errors in the Task Summary table
-- Failure sections with stable anchors using each task id, for example <section id="agisdk-networkin-10">
-- For each failed task: Diagnosis, Evidence, Next Check, final screenshot, AGI SDK / grader criteria, final answer, and recent trajectory events
-- Make failure links in the summary table point to the task anchors
-- Keep the HTML self-contained: inline CSS and embedded final screenshots as data:image/png;base64 URIs
-- Escape user/model text correctly so task outputs cannot break the page
-
-Analysis guidance:
-- Focus on why the model failed: task understanding, browser/tool usage, missing verification, tool errors, max-step/timeout, bad final answer, or grader ambiguity
-- Use messages.jsonl strategically. Do not paste huge DOM outputs into the report. Summarize only the relevant recent trajectory and evidence.
-- Limit trajectory analysis to the most relevant 200-300 events/calls across the run. Prefer failed tasks and the final/key actions for each failure.
-- If a grader criterion is boolean-only or ambiguous, say so and identify what additional artifact would make it debuggable.
-
-Deterministic run metrics computed from metadata.json and messages.jsonl:
-\`\`\`json
-${JSON.stringify(metrics, null, 2)}
-\`\`\`
-
-After writing the file, verify that ${outputPath} exists and is non-empty.`
-}
-
-async function assertRunDir(inputDir: string): Promise<void> {
-  const inputStat = await stat(inputDir).catch(() => null)
-  if (!inputStat?.isDirectory()) {
-    throw new Error(`Not a run directory: ${inputDir}`)
-  }
-}
-
-async function assertReportWritten(outputPath: string): Promise<void> {
-  const outputStat = await stat(outputPath).catch(() => null)
-  if (!outputStat?.isFile() || outputStat.size === 0) {
-    throw new Error(`Report was not written: ${outputPath}`)
-  }
-}
-
-export async function runClaudeCodeReportAgent(
-  invocation: ReportAgentInvocation,
-  deps: ClaudeReportAgentDeps = {},
-): Promise<void> {
-  const query = deps.query ?? (claudeQuery as unknown as ClaudeQuery)
-  let resultSubtype: string | undefined
-
-  for await (const message of query({
-    prompt: invocation.prompt,
-    options: {
-      cwd: invocation.inputDir,
-      model: DEFAULT_REPORT_MODEL,
-      systemPrompt:
-        'You are an eval failure analyst. Produce a concise, evidence-backed, self-contained HTML report from local run artifacts.',
-      permissionMode: 'bypassPermissions',
-      allowDangerouslySkipPermissions: true,
-      maxTurns: DEFAULT_REPORT_MAX_TURNS,
-      env: claudeCodeEnv(deps.env ?? process.env),
-    },
-  })) {
-    if (message.type === 'result') {
-      resultSubtype =
-        typeof message.subtype === 'string' ? message.subtype : undefined
-    }
-  }
-
-  if (resultSubtype && resultSubtype !== 'success') {
-    throw new Error(`Claude Code report agent failed: ${resultSubtype}`)
-  }
-}
-
-export async function generateEvalReport(
-  options: GenerateEvalReportOptions,
-): Promise<void> {
-  const inputDir = resolve(options.inputDir)
-  const outputPath = resolve(options.outputPath)
-
-  await assertRunDir(inputDir)
-  await mkdir(dirname(outputPath), { recursive: true })
-
-  const invocation = {
-    inputDir,
-    outputPath,
-    prompt: await buildReportPrompt(inputDir, outputPath),
-  }
-  await (options.runAgent ?? runClaudeCodeReportAgent)(invocation)
-  await assertReportWritten(outputPath)
-}
-
-if (import.meta.main) {
-  try {
-    await generateEvalReport(parseArgs(Bun.argv.slice(2)))
-  } catch (error) {
-    console.error(error instanceof Error ? error.message : String(error))
-    process.exit(1)
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/agents/claude-code/index.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/claude-code/index.ts
@@ -1,238 +0,0 @@
-import { writeFile } from 'node:fs/promises'
-import { join } from 'node:path'
-import { DEFAULT_TIMEOUT_MS } from '../../constants'
-import type { ClaudeCodeAgentConfig, UIMessageStreamEvent } from '../../types'
-import { withEvalTimeout } from '../../utils/with-eval-timeout'
-import type { AgentContext, AgentEvaluator, AgentResult } from '../types'
-import {
-  type ClaudeCodeProcessRunner,
-  createClaudeCodeProcessRunner,
-} from './process-runner'
-import {
-  ClaudeCodeStreamParser,
-  shouldCaptureScreenshotForTool,
-} from './stream-parser'
-
-export interface ClaudeCodeEvaluatorDeps {
-  processRunner?: ClaudeCodeProcessRunner
-}
-
-export class ClaudeCodeEvaluator implements AgentEvaluator {
-  private processRunner: ClaudeCodeProcessRunner
-
-  constructor(
-    private ctx: AgentContext,
-    deps: ClaudeCodeEvaluatorDeps = {},
-  ) {
-    this.processRunner = deps.processRunner ?? createClaudeCodeProcessRunner()
-  }
-
-  async execute(): Promise<AgentResult> {
-    const { config, task, capture, taskOutputDir } = this.ctx
-    const startTime = Date.now()
-    const timeoutMs = config.timeout_ms ?? DEFAULT_TIMEOUT_MS
-
-    await capture.messageLogger.logUser(task.query)
-
-    if (config.agent.type !== 'claude-code') {
-      throw new Error('ClaudeCodeEvaluator only supports claude-code config')
-    }
-    const agentConfig = config.agent
-
-    const mcpConfigPath = join(taskOutputDir, 'claude-code-mcp.json')
-    await writeFile(
-      mcpConfigPath,
-      JSON.stringify(
-        buildClaudeCodeMcpConfig(config.browseros.server_url),
-        null,
-        2,
-      ),
-    )
-
-    const parser = new ClaudeCodeStreamParser()
-    const toolNamesById = new Map<string, string>()
-    const prompt = buildClaudeCodePrompt(task.query)
-    const args = buildClaudeCodeArgs({
-      prompt,
-      mcpConfigPath,
-      config: agentConfig,
-    })
-
-    const { terminationReason } = await withEvalTimeout(
-      timeoutMs,
-      capture,
-      async (signal) => {
-        const runResult = await this.processRunner.run({
-          executable: agentConfig.claudePath,
-          args,
-          cwd: taskOutputDir,
-          signal,
-          onStdoutLine: async (line) => {
-            const events = parser.pushLine(line)
-            for (const event of events) {
-              await this.handleStreamEvent(event, toolNamesById)
-            }
-          },
-        })
-
-        if (runResult.exitCode !== 0) {
-          const message =
-            runResult.stderr.trim() ||
-            `Claude Code exited with status ${runResult.exitCode}`
-          capture.addError('agent_execution', message, {
-            exitCode: runResult.exitCode,
-          })
-          if (!parser.getLastText()) {
-            throw new Error(message)
-          }
-        }
-
-        for (const error of runResult.streamErrors ?? []) {
-          capture.addWarning(
-            'message_logging',
-            `Claude Code stream event processing failed: ${error}`,
-          )
-        }
-
-        return runResult
-      },
-    )
-
-    const endTime = Date.now()
-    const finalAnswer = parser.getLastText() ?? capture.getLastAssistantText()
-    const metadata = {
-      query_id: task.query_id,
-      dataset: task.dataset,
-      query: task.query,
-      started_at: new Date(startTime).toISOString(),
-      completed_at: new Date(endTime).toISOString(),
-      total_duration_ms: endTime - startTime,
-      total_steps: parser.getToolCallCount() || capture.getScreenshotCount(),
-      termination_reason: terminationReason,
-      final_answer: finalAnswer,
-      errors: capture.getErrors(),
-      warnings: capture.getWarnings(),
-      device_pixel_ratio: capture.screenshot.getDevicePixelRatio(),
-      agent_config: {
-        type: 'claude-code' as const,
-        model: agentConfig.model,
-      },
-      grader_results: {},
-    }
-
-    await capture.trajectorySaver.saveMetadata(metadata)
-
-    return {
-      metadata,
-      messages: capture.getMessages(),
-      finalAnswer,
-    }
-  }
-
-  private async handleStreamEvent(
-    event: UIMessageStreamEvent,
-    toolNamesById: Map<string, string>,
-  ): Promise<void> {
-    const { capture, task } = this.ctx
-    let screenshot: number | undefined
-
-    if (event.type === 'tool-input-available') {
-      toolNamesById.set(event.toolCallId, event.toolName)
-      if (isPageInput(event.input)) {
-        capture.setActivePageId(event.input.page)
-      }
-    }
-
-    if (
-      event.type === 'tool-output-available' ||
-      event.type === 'tool-output-error'
-    ) {
-      const toolName = toolNamesById.get(event.toolCallId)
-      if (toolName && shouldCaptureScreenshotForTool(toolName)) {
-        screenshot = await this.captureScreenshot()
-      }
-    }
-
-    await capture.messageLogger.logStreamEvent(event, screenshot)
-    capture.emitEvent(task.query_id, {
-      ...event,
-      ...(screenshot !== undefined && { screenshot }),
-    })
-  }
-
-  private async captureScreenshot(): Promise<number | undefined> {
-    const { capture, task } = this.ctx
-    try {
-      const screenshot = await capture.screenshot.capture(
-        capture.getActivePageId(),
-      )
-      capture.emitEvent(task.query_id, {
-        type: 'screenshot-captured',
-        screenshot,
-      })
-      return screenshot
-    } catch {
-      return undefined
-    }
-  }
-}
-
-function isPageInput(input: unknown): input is { page: number } {
-  return (
-    typeof input === 'object' &&
-    input !== null &&
-    'page' in input &&
-    typeof input.page === 'number'
-  )
-}
-
-function buildClaudeCodePrompt(taskQuery: string): string {
-  return [
-    'You are running inside BrowserOS eval.',
-    'Use the BrowserOS MCP tools to interact with the already-open browser and complete the user task.',
-    'When the task is complete, respond with the final answer only.',
-    'If blocked, explain the blocker clearly.',
-    '',
-    `Task: ${taskQuery}`,
-  ].join('\n')
-}
-
-function buildClaudeCodeArgs({
-  prompt,
-  mcpConfigPath,
-  config,
-}: {
-  prompt: string
-  mcpConfigPath: string
-  config: ClaudeCodeAgentConfig
-}): string[] {
-  const args = [
-    '-p',
-    prompt,
-    '--mcp-config',
-    mcpConfigPath,
-    '--strict-mcp-config',
-    '--output-format',
-    'stream-json',
-    '--verbose',
-  ]
-
-  if (config.model) args.push('--model', config.model)
-  args.push(...config.extraArgs)
-
-  return args
-}
-
-function buildClaudeCodeMcpConfig(serverUrl: string) {
-  const trimmed = serverUrl.replace(/\/$/, '')
-  const url = trimmed.endsWith('/mcp') ? trimmed : `${trimmed}/mcp`
-  return {
-    mcpServers: {
-      browseros: {
-        type: 'http',
-        url,
-        headers: { 'X-BrowserOS-Source': 'sdk-internal' },
-      },
-    },
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/agents/claude-code/process-runner.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/claude-code/process-runner.ts
@@ -1,114 +0,0 @@
-export interface ClaudeCodeRunOptions {
-  executable: string
-  args: string[]
-  cwd: string
-  signal?: AbortSignal
-  onStdoutLine: (line: string) => Promise<void>
-}
-
-export interface ClaudeCodeRunResult {
-  exitCode: number
-  stderr: string
-  streamErrors?: string[]
-}
-
-export interface ClaudeCodeProcessRunner {
-  run(options: ClaudeCodeRunOptions): Promise<ClaudeCodeRunResult>
-}
-
-export interface SpawnOptions {
-  cwd: string
-  signal?: AbortSignal
-  onStdoutLine: (line: string) => Promise<void>
-}
-
-export interface CreateClaudeCodeProcessRunnerDeps {
-  spawn?: (cmd: string[], options: SpawnOptions) => Promise<ClaudeCodeRunResult>
-}
-
-export function createClaudeCodeProcessRunner(
-  deps: CreateClaudeCodeProcessRunnerDeps = {},
-): ClaudeCodeProcessRunner {
-  const spawn = deps.spawn ?? spawnClaudeCode
-  return {
-    run: async ({ executable, args, cwd, signal, onStdoutLine }) =>
-      spawn([executable, ...args], { cwd, signal, onStdoutLine }),
-  }
-}
-
-async function spawnClaudeCode(
-  cmd: string[],
-  options: SpawnOptions,
-): Promise<ClaudeCodeRunResult> {
-  const proc = Bun.spawn({
-    cmd,
-    cwd: options.cwd,
-    stdin: 'ignore',
-    stdout: 'pipe',
-    stderr: 'pipe',
-  })
-
-  const abort = () => {
-    try {
-      proc.kill('SIGTERM')
-    } catch {
-      // Process may already have exited.
-    }
-  }
-  options.signal?.addEventListener('abort', abort, { once: true })
-
-  try {
-    const streamErrors: string[] = []
-    const stdoutPromise = readLines(
-      proc.stdout,
-      options.onStdoutLine,
-      streamErrors,
-    )
-    const stderrPromise = new Response(proc.stderr).text()
-    const exitCode = await proc.exited
-    await stdoutPromise
-    const stderr = await stderrPromise
-    return { exitCode, stderr, streamErrors }
-  } finally {
-    options.signal?.removeEventListener('abort', abort)
-  }
-}
-
-async function readLines(
-  stream: ReadableStream<Uint8Array>,
-  onLine: (line: string) => Promise<void>,
-  streamErrors: string[],
-): Promise<void> {
-  const reader = stream.getReader()
-  const decoder = new TextDecoder()
-  let buffer = ''
-
-  while (true) {
-    const { done, value } = await reader.read()
-    if (done) break
-
-    buffer += decoder.decode(value, { stream: true })
-    const lines = buffer.split('\n')
-    buffer = lines.pop() ?? ''
-    for (const line of lines) {
-      await emitLine(line, onLine, streamErrors)
-    }
-  }
-
-  buffer += decoder.decode()
-  if (buffer.length > 0) {
-    await emitLine(buffer, onLine, streamErrors)
-  }
-}
-
-async function emitLine(
-  line: string,
-  onLine: (line: string) => Promise<void>,
-  streamErrors: string[],
-): Promise<void> {
-  try {
-    await onLine(line)
-  } catch (error) {
-    streamErrors.push(error instanceof Error ? error.message : String(error))
-  }
-}
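
Review note: the removed `readLines` keeps any partial line in `buffer` between reads, so a JSON event split across two chunks is still delivered to `onStdoutLine` as one line. Below is a minimal sketch of that buffering, assuming the chunks are already decoded strings. `LineBuffer` is a hypothetical name, not something in this repository. The removed code also decodes bytes with `TextDecoder` in streaming mode, which this sketch leaves out.

```typescript
// Hypothetical helper, not part of the diff: the same buffering idea as
// readLines, but synchronous and string-based so it runs without a process.
class LineBuffer {
  private buffer = ''

  // Accept one decoded chunk and return every line that it completes.
  push(chunk: string): string[] {
    this.buffer += chunk
    const lines = this.buffer.split('\n')
    // The last element is '' when the chunk ended on a newline,
    // otherwise it is an unfinished line that must wait for more input.
    this.buffer = lines.pop() ?? ''
    return lines
  }

  // Once the stream has ended, return the trailing partial line, if any.
  flush(): string[] {
    const rest = this.buffer
    this.buffer = ''
    return rest.length > 0 ? [rest] : []
  }
}

const lineBuffer = new LineBuffer()
const firstBatch = lineBuffer.push('{"type":"assis')
const secondBatch = lineBuffer.push('tant"}\n{"type":"user"}\ntail')
const finalBatch = lineBuffer.flush()
```

Here `firstBatch` is empty because no newline has arrived yet, `secondBatch` holds the two complete lines, and `finalBatch` holds the unterminated `tail`. This mirrors the final `emitLine(buffer, ...)` call after the read loop.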
--- a/packages/browseros-agent/apps/eval/src/agents/claude-code/stream-parser.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/claude-code/stream-parser.ts
@@ -1,142 +0,0 @@
-import { randomUUID } from 'node:crypto'
-import type { UIMessageStreamEvent } from '../../types'
-
-type JsonObject = Record<string, unknown>
-
-export class ClaudeCodeStreamParser {
-  private lastText: string | null = null
-  private toolCallCount = 0
-
-  pushLine(line: string): UIMessageStreamEvent[] {
-    const trimmed = line.trim()
-    if (!trimmed) return []
-
-    let parsed: unknown
-    try {
-      parsed = JSON.parse(trimmed)
-    } catch {
-      return []
-    }
-
-    if (!isObject(parsed)) return []
-
-    if (parsed.type === 'assistant') {
-      return this.parseAssistantMessage(parsed)
-    }
-    if (parsed.type === 'user') {
-      return this.parseUserMessage(parsed)
-    }
-    if (parsed.type === 'result' && typeof parsed.result === 'string') {
-      this.lastText = parsed.result
-    }
-
-    return []
-  }
-
-  getLastText(): string | null {
-    return this.lastText
-  }
-
-  getToolCallCount(): number {
-    return this.toolCallCount
-  }
-
-  private parseAssistantMessage(message: JsonObject): UIMessageStreamEvent[] {
-    const content = contentBlocks(message)
-    const events: UIMessageStreamEvent[] = []
-
-    for (const block of content) {
-      if (block.type === 'text' && typeof block.text === 'string') {
-        const id = randomUUID()
-        this.lastText = block.text
-        events.push(
-          { type: 'text-start', id },
-          { type: 'text-delta', id, delta: block.text },
-          { type: 'text-end', id },
-        )
-      } else if (
-        block.type === 'tool_use' &&
-        typeof block.id === 'string' &&
-        typeof block.name === 'string'
-      ) {
-        this.toolCallCount++
-        events.push({
-          type: 'tool-input-available',
-          toolCallId: block.id,
-          toolName: block.name,
-          input: block.input,
-        })
-      }
-    }
-
-    return events
-  }
-
-  private parseUserMessage(message: JsonObject): UIMessageStreamEvent[] {
-    const content = contentBlocks(message)
-    const events: UIMessageStreamEvent[] = []
-
-    for (const block of content) {
-      if (
-        block.type !== 'tool_result' ||
-        typeof block.tool_use_id !== 'string'
-      ) {
-        continue
-      }
-
-      if (block.is_error === true) {
-        events.push({
-          type: 'tool-output-error',
-          toolCallId: block.tool_use_id,
-          errorText: stringifyToolContent(block.content),
-        })
-      } else {
-        events.push({
-          type: 'tool-output-available',
-          toolCallId: block.tool_use_id,
-          output: normalizeToolContent(block.content),
-        })
-      }
-    }
-
-    return events
-  }
-}
-
-export function shouldCaptureScreenshotForTool(toolName: string): boolean {
-  if (!toolName.startsWith('mcp__browseros__')) return false
-  return !toolName.endsWith('__take_screenshot')
-}
-
-function contentBlocks(message: JsonObject): JsonObject[] {
-  const inner = isObject(message.message) ? message.message : message
-  return Array.isArray(inner.content) ? inner.content.filter(isObject) : []
-}
-
-function isObject(value: unknown): value is JsonObject {
-  return typeof value === 'object' && value !== null
-}
-
-function normalizeToolContent(content: unknown): unknown {
-  if (!Array.isArray(content)) return content
-  return content.map((item) => {
-    if (
-      isObject(item) &&
-      item.type === 'text' &&
-      typeof item.text === 'string'
-    ) {
-      return item.text
-    }
-    return item
-  })
-}
-
-function stringifyToolContent(content: unknown): string {
-  const normalized = normalizeToolContent(content)
-  if (typeof normalized === 'string') return normalized
-  try {
-    return JSON.stringify(normalized)
-  } catch {
-    return String(normalized)
-  }
-}
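
Review note: in the removed parser, `stringifyToolContent` returns text unchanged only when `block.content` is itself a string. When the content is an array of text blocks, `normalizeToolContent` returns an array, so `errorText` becomes a JSON array literal. The sketch below restates the removed normalization logic so that it runs on its own. The sample payloads are hypothetical and are not taken from real Claude Code output.

```typescript
type JsonObject = Record<string, unknown>

const isObject = (value: unknown): value is JsonObject =>
  typeof value === 'object' && value !== null

// Restates the removed normalizeToolContent: text blocks collapse to their
// string, every other item passes through untouched.
function normalizeToolContent(content: unknown): unknown {
  if (!Array.isArray(content)) return content
  return content.map((item) =>
    isObject(item) && item.type === 'text' && typeof item.text === 'string'
      ? item.text
      : item,
  )
}

// Restates the removed stringifyToolContent.
function stringifyToolContent(content: unknown): string {
  const normalized = normalizeToolContent(content)
  if (typeof normalized === 'string') return normalized
  try {
    return JSON.stringify(normalized)
  } catch {
    return String(normalized)
  }
}

// Hypothetical payloads, chosen only to show the two shapes.
const fromString = stringifyToolContent('boom')
const fromBlocks = stringifyToolContent([{ type: 'text', text: 'boom' }])
```

`fromString` is the bare text `boom`, while `fromBlocks` is the string `["boom"]`. Any consumer of `tool-output-error` events would have had to handle both forms.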
--- a/packages/browseros-agent/apps/eval/src/agents/index.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/index.ts
@@ -1,4 +1,3 @@
-import { ClaudeCodeEvaluator } from './claude-code'
 import { OrchestratorExecutorEvaluator } from './orchestrator-executor'
 import { SingleAgentEvaluator } from './single-agent'
 import type { AgentContext, AgentEvaluator } from './types'
@@ -9,8 +8,6 @@ export function createAgent(context: AgentContext): AgentEvaluator {
      return new SingleAgentEvaluator(context)
    case 'orchestrator-executor':
      return new OrchestratorExecutorEvaluator(context)
-    case 'claude-code':
-      return new ClaudeCodeEvaluator(context)
  }
 }

--- a/packages/browseros-agent/apps/eval/src/agents/orchestrator-executor/index.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/orchestrator-executor/index.ts
@@ -134,10 +134,7 @@ export class OrchestratorExecutorEvaluator implements AgentEvaluator {

    // Connect to Chrome via CDP — same per-worker offset used by app-manager.
    const cdpPort = config.browseros.base_cdp_port + workerIndex
-    const cdp = new CdpBackend({
-      port: cdpPort,
-      exitOnReconnectFailure: false,
-    })
+    const cdp = new CdpBackend({ port: cdpPort })
    await cdp.connect()
    const browser = new Browser(cdp)
    capture.screenshot.setBrowser(browser)
--- a/packages/browseros-agent/apps/eval/src/agents/single-agent.ts
+++ b/packages/browseros-agent/apps/eval/src/agents/single-agent.ts
@@ -43,10 +43,7 @@ export class SingleAgentEvaluator implements AgentEvaluator {

    // Connect to Chrome via CDP — same per-worker offset used by app-manager.
    const cdpPort = config.browseros.base_cdp_port + workerIndex
-    const cdp = new CdpBackend({
-      port: cdpPort,
-      exitOnReconnectFailure: false,
-    })
+    const cdp = new CdpBackend({ port: cdpPort })
    await cdp.connect()

    const browser = new Browser(cdp)
--- a/packages/browseros-agent/apps/eval/src/capture/trajectory-saver.ts
+++ b/packages/browseros-agent/apps/eval/src/capture/trajectory-saver.ts
@@ -105,10 +105,7 @@ export class TrajectorySaver {
      errors: [],
      warnings: [],
      agent_config: {
-        type: agentConfig.type as
-          | 'single'
-          | 'orchestrator-executor'
-          | 'claude-code',
+        type: agentConfig.type as 'single' | 'orchestrator-executor',
        model: agentConfig.model,
      },
      grader_results: {},
--- a/packages/browseros-agent/apps/eval/src/cli/commands/suite.ts
+++ b/packages/browseros-agent/apps/eval/src/cli/commands/suite.ts
@@ -82,16 +82,6 @@ function suiteToEvalConfig(
    })
  }

-  if (suite.agent.type === 'claude-code') {
-    return EvalConfigSchema.parse({
-      ...base,
-      agent: {
-        type: 'claude-code',
-        ...(variant.agent.model && { model: variant.agent.model }),
-      },
-    })
-  }
-
  const executorBackend = suite.agent.executorBackend ?? 'tool-loop'
  const executor =
    executorBackend === 'clado'
@@ -145,10 +135,7 @@ export async function resolveSuiteCommand(
  const loaded = await loadSuite(options.suitePath)
  const variant = resolveVariant({
    variantId: options.variantId,
-    provider:
-      loaded.suite.agent.type === 'claude-code'
-        ? 'claude-code'
-        : options.provider,
+    provider: options.provider,
    model: options.model,
    apiKey: options.apiKey,
    baseUrl: options.baseUrl,
--- a/packages/browseros-agent/apps/eval/src/dashboard/server.ts
+++ b/packages/browseros-agent/apps/eval/src/dashboard/server.ts
@@ -536,12 +536,6 @@ export interface DashboardConfig {
  configMode?: boolean
 }

-export function shouldAutoOpenDashboard(
-  env: Record<string, string | undefined> = process.env,
-): boolean {
-  return env.CI !== 'true'
-}
-
 export function startDashboard(config: DashboardConfig) {
  const port = config.port ?? 9900
  dashboardConfigMode = config.configMode ?? false
@@ -564,12 +558,10 @@ export function startDashboard(config: DashboardConfig) {
  console.log(`  Dashboard: ${url}`)

  // Auto-open browser
-  if (shouldAutoOpenDashboard()) {
-    try {
-      Bun.spawn(['open', url], { stdout: 'ignore', stderr: 'ignore' })
-    } catch {
-      /* ignore if open command fails */
-    }
+  try {
+    Bun.spawn(['open', url], { stdout: 'ignore', stderr: 'ignore' })
+  } catch {
+    /* ignore if open command fails */
  }

  return { url, port }
--- a/packages/browseros-agent/apps/eval/src/dashboard/viewer.html
+++ b/packages/browseros-agent/apps/eval/src/dashboard/viewer.html
@@ -61,17 +61,6 @@
  .header-stats .stat-pass { color: #3fb950; }
  .header-stats .stat-fail { color: #f85149; }
  .header-stats .stat-score { color: #f0883e; }
-  .header-report {
-    color: #58a6ff;
-    text-decoration: none;
-    font-size: 12px;
-    font-weight: 600;
-    border: 1px solid #30363d;
-    border-radius: 6px;
-    padding: 5px 9px;
-    white-space: nowrap;
-  }
-  .header-report:hover { border-color: #58a6ff; background: #1c2333; }

  /* ── 3-column layout ─────────────────────────────────────────── */
  .layout {
@@ -95,7 +84,6 @@
    background: #161b22;
    border-bottom: 1px solid #30363d;
    display: flex;
-    flex-wrap: wrap;
    gap: 12px;
    font-size: 11px;
    font-weight: 600;
@@ -105,80 +93,6 @@
  }
  .sidebar-stats .s-pass { color: #3fb950; }
  .sidebar-stats .s-fail { color: #f85149; }
-  .sidebar-metrics {
-    padding: 12px 16px;
-    background: #0d1117;
-    border-bottom: 1px solid #21262d;
-  }
-  .metric-grid {
-    display: grid;
-    grid-template-columns: repeat(3, minmax(0, 1fr));
-    gap: 8px;
-    margin-bottom: 12px;
-  }
-  .metric-cell {
-    min-width: 0;
-  }
-  .metric-label {
-    display: block;
-    font-size: 9px;
-    font-weight: 600;
-    color: #6e7681;
-    text-transform: uppercase;
-    letter-spacing: 0.04em;
-    white-space: nowrap;
-  }
-  .metric-value {
-    display: block;
-    font-size: 13px;
-    font-weight: 700;
-    color: #e6edf3;
-    margin-top: 2px;
-    overflow: hidden;
-    text-overflow: ellipsis;
-  }
-  .mini-chart {
-    display: flex;
-    flex-direction: column;
-    gap: 6px;
-  }
-  .mini-chart-title {
-    font-size: 10px;
-    font-weight: 700;
-    color: #8b949e;
-    text-transform: uppercase;
-    letter-spacing: 0.04em;
-  }
-  .mini-bar-row {
-    display: grid;
-    grid-template-columns: minmax(60px, 1fr) 70px 28px;
-    gap: 8px;
-    align-items: center;
-    font-size: 10px;
-    color: #8b949e;
-  }
-  .mini-bar-name {
-    overflow: hidden;
-    text-overflow: ellipsis;
-    white-space: nowrap;
-    font-family: 'SF Mono', SFMono-Regular, Consolas, 'Liberation Mono', Menlo, monospace;
-  }
-  .mini-bar-track {
-    height: 6px;
-    background: #21262d;
-    border-radius: 999px;
-    overflow: hidden;
-  }
-  .mini-bar-fill {
-    height: 100%;
-    background: #58a6ff;
-    border-radius: 999px;
-  }
-  .mini-bar-value {
-    color: #e6edf3;
-    font-variant-numeric: tabular-nums;
-    text-align: right;
-  }
  .sidebar-filter {
    padding: 8px 12px;
    border-bottom: 1px solid #21262d;
@@ -612,7 +526,6 @@
  <div class="header-sep"></div>
  <span class="header-run" id="header-run"></span>
  <span class="header-date" id="header-date"></span>
-  <a class="header-report" id="header-report" target="_blank" rel="noopener" style="display: none;">Run Report</a>
  <div class="header-stats" id="header-stats"></div>
 </div>

@@ -620,7 +533,6 @@
  <!-- Left sidebar -->
  <div class="sidebar" id="sidebar">
    <div class="sidebar-stats" id="sidebar-stats"></div>
-    <div class="sidebar-metrics" id="sidebar-metrics"></div>
    <div class="sidebar-filter">
      <input type="text" id="filter-input" placeholder="Search tasks..." autocomplete="off" spellcheck="false" />
    </div>
@@ -715,23 +627,7 @@
    if (stats.avgScore !== null) {
      parts.push(`<span class="stat-score">avg ${stats.avgScore}%</span>`);
    }
-    if (stats.avgDurationMs !== null) {
-      parts.push(`<span>${fmtDuration(stats.avgDurationMs)} avg</span>`);
-    }
-    if (stats.avgToolCalls !== null) {
-      parts.push(`<span>${fmtCompact(stats.avgToolCalls)} tools/task</span>`);
-    }
    el.innerHTML = parts.join('');
-
-    const reportLink = document.getElementById('header-report');
-    const url = reportUrl(manifest);
-    if (url) {
-      reportLink.href = url;
-      reportLink.style.display = '';
-    } else {
-      reportLink.removeAttribute('href');
-      reportLink.style.display = 'none';
-    }
  }

  // ── Sidebar rendering ─────────────────────────────────────────
@@ -743,49 +639,11 @@
    statsEl.innerHTML =
      '<span>' + stats.total + ' total</span>' +
      '<span class="s-pass">' + stats.passed + ' pass</span>' +
-      '<span class="s-fail">' + stats.failed + ' fail</span>' +
-      (stats.avgSteps !== null ? '<span>' + fmtCompact(stats.avgSteps) + ' steps/task</span>' : '') +
-      (stats.avgToolCalls !== null ? '<span>' + fmtCompact(stats.avgToolCalls) + ' tools/task</span>' : '');
-
-    renderSidebarMetrics(tasks, stats);
+      '<span class="s-fail">' + stats.failed + ' fail</span>';

    renderTaskList('');
  }

-  function renderSidebarMetrics(tasks, stats) {
-    const el = document.getElementById('sidebar-metrics');
-    if (!el) return;
-
-    const chartTasks = tasks
-      .slice()
-      .sort((a, b) => taskMetrics(b).toolCalls - taskMetrics(a).toolCalls)
-      .slice(0, 5);
-    const maxCalls = Math.max(1, ...chartTasks.map((task) => taskMetrics(task).toolCalls));
-
-    const bars = chartTasks.map((task) => {
-      const calls = taskMetrics(task).toolCalls;
-      const width = Math.max(4, Math.round((calls / maxCalls) * 100));
-      return (
-        '<div class="mini-bar-row">' +
-          '<span class="mini-bar-name" title="' + escAttr(task.queryId || task.id || 'Untitled') + '">' + esc(task.queryId || task.id || 'Untitled') + '</span>' +
-          '<span class="mini-bar-track"><span class="mini-bar-fill" style="width: ' + width + '%"></span></span>' +
-          '<span class="mini-bar-value">' + fmtCompact(calls) + '</span>' +
-        '</div>'
-      );
-    }).join('');
-
-    el.innerHTML =
-      '<div class="metric-grid">' +
-        '<div class="metric-cell"><span class="metric-label">Avg Time</span><span class="metric-value">' + (stats.avgDurationMs !== null ? fmtDuration(stats.avgDurationMs) : '-') + '</span></div>' +
-        '<div class="metric-cell"><span class="metric-label">Avg Steps</span><span class="metric-value">' + (stats.avgSteps !== null ? fmtCompact(stats.avgSteps) : '-') + '</span></div>' +
-        '<div class="metric-cell"><span class="metric-label">Avg Tools</span><span class="metric-value">' + (stats.avgToolCalls !== null ? fmtCompact(stats.avgToolCalls) : '-') + '</span></div>' +
-      '</div>' +
-      '<div class="mini-chart">' +
-        '<div class="mini-chart-title">Tool Calls by Task</div>' +
-        (bars || '<div class="task-meta-line"><span>No tool calls recorded</span></div>') +
-      '</div>';
-  }
-
  function renderTaskList(filter) {
    const list = document.getElementById('task-list');
    list.innerHTML = '';
@@ -810,11 +668,8 @@
      }

      const metaParts = [];
-      const metrics = taskMetrics(task);
-      if (metrics.durationMs) metaParts.push(fmtDuration(metrics.durationMs));
-      if (metrics.steps) metaParts.push(`${fmtCompact(metrics.steps)} steps`);
-      if (metrics.toolCalls) metaParts.push(`${fmtCompact(metrics.toolCalls)} tools`);
-      if (metrics.toolErrors) metaParts.push(`${fmtCompact(metrics.toolErrors)} errors`);
+      if (task.durationMs) metaParts.push(fmtDuration(task.durationMs));
+      if (task.screenshotCount) metaParts.push(`${task.screenshotCount} steps`);

      item.innerHTML =
        '<div class="task-row">' +
@@ -859,7 +714,7 @@
  }

  function artifactPath(task, artifact) {
-    const manifestPath = task.paths?.[artifact];
+    const manifestPath = task.paths && task.paths[artifact];
    if (typeof manifestPath === 'string' && manifestPath.length > 0) {
      return manifestPath.replace(/^\/+/, '');
    }
@@ -870,17 +725,6 @@
    return `${basePath}/${artifactPath(task, artifact)}`;
  }

-  function runArtifactUrl(path) {
-    if (typeof path !== 'string' || path.length === 0) return null;
-    return `${basePath}/${path.replace(/^\/+/, '')}`;
-  }
-
-  function reportUrl(manifest, task) {
-    const url = runArtifactUrl(manifest?.reportPath);
-    if (!url || !task) return url;
-    return `${url}#${encodeURIComponent(task.queryId || task.id || '')}`;
-  }
-
  function metadataUrl(task) {
    return artifactUrl(task, 'metadata');
  }
@@ -1061,38 +905,10 @@
    }

    // Duration
-    const metrics = taskMetrics(task);
-    if (metrics.durationMs) {
+    if (task.durationMs) {
      html += '<div class="db-section">';
      html += '<span class="db-label">Duration</span>';
-      html += `<span class="db-value">${fmtDuration(metrics.durationMs)}</span>`;
-      html += '</div>';
-    }
-
-    if (metrics.steps) {
-      html += '<div class="db-section">';
-      html += '<span class="db-label">Steps</span>';
-      html += `<span class="db-value">${fmtCompact(metrics.steps)}</span>`;
-      html += '</div>';
-    }
-
-    html += '<div class="db-section">';
-    html += '<span class="db-label">Tool Calls</span>';
-    html += `<span class="db-value">${fmtCompact(metrics.toolCalls)}</span>`;
-    html += '</div>';
-
-    if (metrics.toolErrors) {
-      html += '<div class="db-section">';
-      html += '<span class="db-label">Tool Errors</span>';
-      html += `<span class="db-value">${fmtCompact(metrics.toolErrors)}</span>`;
-      html += '</div>';
-    }
-
-    const reportLink = reportUrl(manifest, task);
-    if (reportLink) {
-      html += '<div class="db-section">';
-      html += '<span class="db-label">Report</span>';
-      html += `<span class="db-value"><a href="${escAttr(reportLink)}" target="_blank" rel="noopener">Open task analysis</a></span>`;
+      html += `<span class="db-value">${fmtDuration(task.durationMs)}</span>`;
      html += '</div>';
    }

@@ -1418,25 +1234,8 @@
  function computeStats(tasks) {
    const total = tasks.length;
    let passed = 0, failed = 0, totalScore = 0, scoredCount = 0;
-    let totalDurationMs = 0, durationCount = 0;
-    let totalSteps = 0, stepsCount = 0;
-    let totalToolCalls = 0, toolCount = 0;
-    let totalToolErrors = 0;

    tasks.forEach((t) => {
-      const metrics = taskMetrics(t);
-      if (metrics.durationMs > 0) {
-        totalDurationMs += metrics.durationMs;
-        durationCount++;
-      }
-      if (metrics.steps > 0) {
-        totalSteps += metrics.steps;
-        stepsCount++;
-      }
-      totalToolCalls += metrics.toolCalls;
-      totalToolErrors += metrics.toolErrors;
-      toolCount++;
-
      const graders = t.graderResults || {};
      const keys = Object.keys(graders);
      if (keys.length > 0) {
@@ -1455,34 +1254,7 @@
      total: total,
      passed: passed,
      failed: failed,
-      avgScore: scoredCount > 0 ? Math.round((totalScore / scoredCount) * 100) : null,
-      avgDurationMs: durationCount > 0 ? totalDurationMs / durationCount : null,
-      avgSteps: stepsCount > 0 ? totalSteps / stepsCount : null,
-      avgToolCalls: toolCount > 0 ? totalToolCalls / toolCount : null,
-      totalToolCalls: totalToolCalls,
-      totalToolErrors: totalToolErrors
-    };
-  }
-
-  function taskMetrics(task) {
-    const metrics = task.metrics || {};
-    const screenshots = Number.isFinite(Number(metrics.screenshots))
-      ? Number(metrics.screenshots)
-      : Number(task.screenshotCount || 0);
-    return {
-      durationMs: Number.isFinite(Number(metrics.durationMs))
-        ? Number(metrics.durationMs)
-        : Number(task.durationMs || 0),
-      steps: Number.isFinite(Number(metrics.steps))
-        ? Number(metrics.steps)
-        : screenshots,
-      screenshots: screenshots,
-      toolCalls: Number.isFinite(Number(metrics.toolCalls))
-        ? Number(metrics.toolCalls)
-        : 0,
-      toolErrors: Number.isFinite(Number(metrics.toolErrors))
-        ? Number(metrics.toolErrors)
-        : 0
+      avgScore: scoredCount > 0 ? Math.round((totalScore / scoredCount) * 100) : null
    };
  }

@@ -1538,13 +1310,6 @@
    return `${h}h ${remM}m`;
  }

-  function fmtCompact(value) {
-    const num = Number(value);
-    if (!Number.isFinite(num)) return '0';
-    if (Number.isInteger(num)) return String(num);
-    return num.toFixed(1);
-  }
-
  function showFatalError(msgHtml) {
    document.getElementById('center-panel').innerHTML =
      '<div class="placeholder error">' +
--- a/packages/browseros-agent/apps/eval/src/grading/python-evaluator.ts
+++ b/packages/browseros-agent/apps/eval/src/grading/python-evaluator.ts
@@ -2,7 +2,6 @@ export interface PythonEvaluatorOptions {
  scriptPath: string
  input: unknown
  timeoutMs: number
-  pythonPath?: string
 }

 export interface PythonEvaluatorResult<T> {
@@ -16,9 +15,7 @@ export interface PythonEvaluatorResult<T> {
 export async function runPythonJsonEvaluator<T>(
  options: PythonEvaluatorOptions,
 ): Promise<PythonEvaluatorResult<T>> {
-  const pythonPath =
-    options.pythonPath || process.env.BROWSEROS_EVAL_PYTHON || 'python3'
-  const proc = Bun.spawn([pythonPath, options.scriptPath], {
+  const proc = Bun.spawn(['python3', options.scriptPath], {
    stdin: 'pipe',
    stdout: 'pipe',
    stderr: 'pipe',
--- a/packages/browseros-agent/apps/eval/src/publishing/r2-publisher.ts
+++ b/packages/browseros-agent/apps/eval/src/publishing/r2-publisher.ts
@@ -5,7 +5,6 @@ import {
  PutObjectCommand,
  S3Client,
 } from '@aws-sdk/client-s3'
-import { readTaskMetrics } from '../reporting/task-metrics'
 import {
  buildViewerManifest,
  type ViewerManifestTaskInput,
@@ -316,7 +315,6 @@ export class R2Publisher {
        graderResults:
          (meta.grader_results as ViewerManifestTaskInput['graderResults']) ||
          {},
-        metrics: await readTaskMetrics(taskPath, meta, screenshotCount),
      })
    }

@@ -381,12 +379,10 @@ export class R2Publisher {
        await readFile(join(runDir, 'summary.json'), 'utf-8'),
      ) as Record<string, unknown>
    } catch {}
-    const reportStat = await stat(join(runDir, 'report.html')).catch(() => null)

    return buildViewerManifest({
      runId,
      uploadedAt: this.now().toISOString(),
-      reportPath: reportStat?.isFile() ? 'report.html' : undefined,
      agentConfig,
      dataset,
      summary: summaryData
--- a/packages/browseros-agent/apps/eval/src/reporting/task-metrics.ts
+++ b/packages/browseros-agent/apps/eval/src/reporting/task-metrics.ts
@@ -1,188 +0,0 @@
-import { readdir, readFile, stat } from 'node:fs/promises'
-import { join } from 'node:path'
-
-export interface EvalTaskMetrics {
-  durationMs: number
-  steps: number
-  screenshots: number
-  toolCalls: number
-  toolErrors: number
-}
-
-export interface EvalRunMetrics {
-  taskCount: number
-  totalDurationMs: number
-  avgDurationMs: number
-  totalSteps: number
-  avgSteps: number
-  totalToolCalls: number
-  avgToolCalls: number
-  totalToolErrors: number
-  avgToolErrors: number
-}
-
-export interface EvalTaskMetricSummary {
-  queryId: string
-  status: string
-  score?: number
-  pass?: boolean
-  metrics: EvalTaskMetrics
-}
-
-export interface EvalRunMetricSummary {
-  run: EvalRunMetrics
-  tasks: EvalTaskMetricSummary[]
-}
-
-interface TaskDirEntry {
-  taskId: string
-  taskPath: string
-}
-
-function numberValue(value: unknown): number {
-  return typeof value === 'number' && Number.isFinite(value) ? value : 0
-}
-
-export function countMessageMetrics(messagesJsonl: string): {
-  toolCalls: number
-  toolErrors: number
-} {
-  let toolCalls = 0
-  let toolErrors = 0
-
-  for (const line of messagesJsonl.split('\n')) {
-    const trimmed = line.trim()
-    if (!trimmed) continue
-    try {
-      const event = JSON.parse(trimmed) as { type?: unknown }
-      if (event.type === 'tool-input-available') toolCalls++
-      if (event.type === 'tool-output-error') toolErrors++
-    } catch {
-      // Ignore malformed telemetry lines; the raw artifact is still uploaded.
-    }
-  }
-
-  return { toolCalls, toolErrors }
-}
-
-export function buildTaskMetrics(
-  metadata: Record<string, unknown>,
-  messageMetrics: { toolCalls: number; toolErrors: number },
-  screenshotCount = 0,
-): EvalTaskMetrics {
-  const screenshots = numberValue(metadata.screenshot_count) || screenshotCount
-  return {
-    durationMs: numberValue(metadata.total_duration_ms),
-    steps: numberValue(metadata.total_steps) || screenshots,
-    screenshots,
-    toolCalls: messageMetrics.toolCalls,
-    toolErrors: messageMetrics.toolErrors,
-  }
-}
-
-export function buildRunMetrics(metrics: EvalTaskMetrics[]): EvalRunMetrics {
-  const taskCount = metrics.length
-  const totalDurationMs = metrics.reduce((sum, metric) => {
-    return sum + metric.durationMs
-  }, 0)
-  const totalSteps = metrics.reduce((sum, metric) => sum + metric.steps, 0)
-  const totalToolCalls = metrics.reduce((sum, metric) => {
-    return sum + metric.toolCalls
-  }, 0)
-  const totalToolErrors = metrics.reduce((sum, metric) => {
-    return sum + metric.toolErrors
-  }, 0)
-
-  return {
-    taskCount,
-    totalDurationMs,
-    avgDurationMs: taskCount > 0 ? totalDurationMs / taskCount : 0,
-    totalSteps,
-    avgSteps: taskCount > 0 ? totalSteps / taskCount : 0,
-    totalToolCalls,
-    avgToolCalls: taskCount > 0 ? totalToolCalls / taskCount : 0,
-    totalToolErrors,
-    avgToolErrors: taskCount > 0 ? totalToolErrors / taskCount : 0,
-  }
-}
-
-export async function readTaskMetrics(
-  taskPath: string,
-  metadata: Record<string, unknown>,
-  screenshotCount = 0,
-): Promise<EvalTaskMetrics> {
-  const messages = await readFile(join(taskPath, 'messages.jsonl'), 'utf-8')
-    .then(countMessageMetrics)
-    .catch(() => ({ toolCalls: 0, toolErrors: 0 }))
-  return buildTaskMetrics(metadata, messages, screenshotCount)
-}
-
-function statusFromMetadata(metadata: Record<string, unknown>): string {
-  const termination = metadata.termination_reason
-  if (termination === 'timeout') return 'timeout'
-  if (Array.isArray(metadata.errors) && metadata.errors.length > 0) {
-    return 'failed'
-  }
-  return 'completed'
-}
-
-function primaryGrade(metadata: Record<string, unknown>): {
-  score?: number
-  pass?: boolean
-} {
-  const graders = metadata.grader_results as
-    | Record<string, { score?: unknown; pass?: unknown }>
-    | undefined
-  const first = graders ? Object.values(graders)[0] : undefined
-  return {
-    ...(typeof first?.score === 'number' ? { score: first.score } : {}),
-    ...(typeof first?.pass === 'boolean' ? { pass: first.pass } : {}),
-  }
-}
-
-async function readTaskDirs(runDir: string): Promise<TaskDirEntry[]> {
-  const canonicalTasksDir = join(runDir, 'tasks')
-  const canonicalStat = await stat(canonicalTasksDir).catch(() => null)
-  const baseDir = canonicalStat?.isDirectory() ? canonicalTasksDir : runDir
-  const entries = await readdir(baseDir, { withFileTypes: true }).catch(
-    () => [],
-  )
-
-  return entries
-    .filter((entry) => entry.isDirectory())
-    .filter((entry) => entry.name !== 'screenshots')
-    .filter((entry) => entry.name !== 'tasks')
-    .map((entry) => ({
-      taskId: entry.name,
-      taskPath: join(baseDir, entry.name),
-    }))
-}
-
-export async function readRunMetricSummary(
-  runDir: string,
-): Promise<EvalRunMetricSummary> {
-  const tasks: EvalTaskMetricSummary[] = []
-
-  for (const entry of await readTaskDirs(runDir)) {
-    const metadata = await readFile(
-      join(entry.taskPath, 'metadata.json'),
-      'utf-8',
-    )
-      .then((text) => JSON.parse(text) as Record<string, unknown>)
-      .catch(() => null)
-    if (!metadata) continue
-
-    const metrics = await readTaskMetrics(entry.taskPath, metadata)
-    tasks.push({
-      queryId: (metadata.query_id as string | undefined) || entry.taskId,
-      status: statusFromMetadata(metadata),
-      ...primaryGrade(metadata),
-      metrics,
-    })
-  }
-
-  return {
-    run: buildRunMetrics(tasks.map((task) => task.metrics)),
-    tasks,
-  }
-}
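
Review note: the two averaging paths removed in this commit do not agree. `buildRunMetrics` divides the total duration by every task, while the removed `computeStats` in `viewer.html` divides only by tasks with `durationMs > 0`. The sketch below shows the difference on hypothetical sample durations. The function names are illustrative and do not come from the repository.

```typescript
// Hypothetical durations in milliseconds; 0 stands for a task whose
// metadata carried no usable total_duration_ms.
const durations = [1200, 0, 1800]

// Same rule as the removed buildRunMetrics: divide by every task,
// returning 0 for an empty run.
function averageOverAllTasks(values: number[]): number {
  if (values.length === 0) return 0
  return values.reduce((sum, value) => sum + value, 0) / values.length
}

// Same rule as the removed viewer computeStats: divide only by tasks
// with a positive value, returning null when none qualify.
function averageOverRecordedTasks(values: number[]): number | null {
  const recorded = values.filter((value) => value > 0)
  if (recorded.length === 0) return null
  return recorded.reduce((sum, value) => sum + value, 0) / recorded.length
}

const serverStyleAverage = averageOverAllTasks(durations)
const viewerStyleAverage = averageOverRecordedTasks(durations)
```

With these inputs the first rule gives 1000 ms and the second gives 1500 ms. If the metrics are reintroduced later, choosing one rule would keep the report and the dashboard consistent.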
--- a/packages/browseros-agent/apps/eval/src/suites/config-adapter.ts
+++ b/packages/browseros-agent/apps/eval/src/suites/config-adapter.ts
@@ -33,13 +33,6 @@ function variantSource(config: EvalConfig): {
  baseUrl?: string
  supportsImages?: boolean
 } {
-  if (config.agent.type === 'claude-code') {
-    return {
-      provider: 'claude-code',
-      model: config.agent.model ?? 'default',
-    }
-  }
-
  const agent =
    config.agent.type === 'single' ? config.agent : config.agent.orchestrator
  if (!agent.model) {
@@ -83,7 +76,10 @@ export async function adaptEvalConfigFile(
    suite: {
      id,
      dataset: evalConfig.dataset,
-      agent: suiteAgent(evalConfig, backend),
+      agent:
+        evalConfig.agent.type === 'single'
+          ? { type: 'tool-loop' }
+          : { type: 'orchestrated', executorBackend: backend ?? 'tool-loop' },
      graders: evalConfig.graders ?? [],
      workers: evalConfig.num_workers,
      restartBrowserPerTask: evalConfig.restart_server_per_task,
@@ -103,17 +99,3 @@ export async function adaptEvalConfigFile(
    }),
  }
 }
-
-function suiteAgent(
-  config: EvalConfig,
-  backend: ReturnType<typeof executorBackend>,
-): EvalSuite['agent'] {
-  switch (config.agent.type) {
-    case 'single':
-      return { type: 'tool-loop' }
-    case 'orchestrator-executor':
-      return { type: 'orchestrated', executorBackend: backend ?? 'tool-loop' }
-    case 'claude-code':
-      return { type: 'claude-code' }
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/suites/resolve-variant.ts
+++ b/packages/browseros-agent/apps/eval/src/suites/resolve-variant.ts
@@ -57,30 +57,10 @@ export function resolveVariant(
  options: ResolveVariantOptions = {},
 ): EvalVariant {
  const env = options.env ?? process.env
+  const id = options.variantId ?? env.EVAL_VARIANT ?? 'default'
  const provider =
    options.provider ?? env.EVAL_AGENT_PROVIDER ?? 'openai-compatible'
  const model = options.model ?? env.EVAL_AGENT_MODEL
-
-  if (provider === 'claude-code') {
-    const id = options.variantId ?? env.EVAL_VARIANT ?? 'claude-code'
-    return {
-      id,
-      agent: {
-        provider,
-        model: model ?? '',
-      },
-      publicMetadata: {
-        id,
-        agent: {
-          provider,
-          model: model || 'default',
-          apiKeyConfigured: false,
-        },
-      },
-    }
-  }
-
-  const id = options.variantId ?? env.EVAL_VARIANT ?? 'default'
  const apiKey = options.apiKey ?? env.EVAL_AGENT_API_KEY
  const apiKeyEnv =
    options.apiKeyEnv ?? (options.apiKey ? undefined : 'EVAL_AGENT_API_KEY')
--- a/packages/browseros-agent/apps/eval/src/suites/schema.ts
+++ b/packages/browseros-agent/apps/eval/src/suites/schema.ts
@@ -8,7 +8,6 @@ export const SuiteAgentSchema = z
      'single',
      'orchestrated',
      'orchestrator-executor',
-      'claude-code',
    ]),
    executorBackend: z.enum(['tool-loop', 'clado']).optional(),
  })
--- a/packages/browseros-agent/apps/eval/src/types/config.ts
+++ b/packages/browseros-agent/apps/eval/src/types/config.ts
@@ -19,19 +19,9 @@ export const OrchestratorExecutorConfigSchema = z.object({
  }),
 })

-export const ClaudeCodeAgentConfigSchema = z
-  .object({
-    type: z.literal('claude-code'),
-    model: z.string().min(1).optional(),
-    claudePath: z.string().min(1).default('claude'),
-    extraArgs: z.array(z.string()).default([]),
-  })
-  .strict()
-
 export const AgentConfigSchema = z.discriminatedUnion('type', [
  SingleAgentConfigSchema,
  OrchestratorExecutorConfigSchema,
-  ClaudeCodeAgentConfigSchema,
 ])

 export const EvalConfigSchema = z.object({
@@ -63,6 +53,5 @@ export type SingleAgentConfig = z.infer<typeof SingleAgentConfigSchema>
 export type OrchestratorExecutorConfig = z.infer<
  typeof OrchestratorExecutorConfigSchema
 >
-export type ClaudeCodeAgentConfig = z.infer<typeof ClaudeCodeAgentConfigSchema>
 export type AgentConfig = z.infer<typeof AgentConfigSchema>
 export type EvalConfig = z.infer<typeof EvalConfigSchema>
--- a/packages/browseros-agent/apps/eval/src/types/index.ts
+++ b/packages/browseros-agent/apps/eval/src/types/index.ts
@@ -2,8 +2,6 @@
 export {
  type AgentConfig,
  AgentConfigSchema,
-  type ClaudeCodeAgentConfig,
-  ClaudeCodeAgentConfigSchema,
  type EvalConfig,
  EvalConfigSchema,
  type OrchestratorExecutorConfig,
--- a/packages/browseros-agent/apps/eval/src/types/result.ts
+++ b/packages/browseros-agent/apps/eval/src/types/result.ts
@@ -13,7 +13,7 @@ export const GraderResultSchema = z.object({
 // Agent config in metadata
 const AgentConfigMetaSchema = z
  .object({
-    type: z.enum(['single', 'orchestrator-executor', 'claude-code']),
+    type: z.enum(['single', 'orchestrator-executor']),
    model: z.string().optional(),
  })
  .passthrough()
--- a/packages/browseros-agent/apps/eval/src/utils/config-validator.ts
+++ b/packages/browseros-agent/apps/eval/src/utils/config-validator.ts
@@ -59,7 +59,7 @@ export async function validateConfig(
    ) {
      envVarsToCheck.push(config.agent.apiKey)
    }
-  } else if (config.agent.type === 'orchestrator-executor') {
+  } else {
    const { orchestrator, executor } = config.agent
    if (orchestrator.apiKey && isEnvVarName(orchestrator.apiKey)) {
      envVarsToCheck.push(orchestrator.apiKey)
--- a/packages/browseros-agent/apps/eval/src/utils/resolve-provider-config.ts
+++ b/packages/browseros-agent/apps/eval/src/utils/resolve-provider-config.ts
@@ -36,6 +36,5 @@ export async function resolveProviderConfig(
    accessKeyId: resolveEnvValue(agent.accessKeyId),
    secretAccessKey: resolveEnvValue(agent.secretAccessKey),
    sessionToken: resolveEnvValue(agent.sessionToken),
-    region: resolveEnvValue(agent.region),
  }
 }
--- a/packages/browseros-agent/apps/eval/src/viewer/viewer-manifest.ts
+++ b/packages/browseros-agent/apps/eval/src/viewer/viewer-manifest.ts
@@ -1,8 +1,3 @@
-import {
-  buildRunMetrics,
-  type EvalRunMetrics,
-  type EvalTaskMetrics,
-} from '../reporting/task-metrics'
 import type { GraderResult } from '../types'

 export const VIEWER_MANIFEST_SCHEMA_VERSION = 2
@@ -25,7 +20,6 @@ export interface ViewerManifestTaskInput {
  status: string
  durationMs: number
  screenshotCount: number
-  metrics?: EvalTaskMetrics
  graderResults: Record<string, GraderResult>
 }

@@ -41,11 +35,9 @@ export interface ViewerManifest {
  suiteId?: string
  variantId?: string
  uploadedAt?: string
-  reportPath?: string
  agentConfig?: Record<string, unknown>
  dataset?: string
  summary?: Record<string, unknown>
-  metrics?: EvalRunMetrics
  tasks: ViewerManifestTask[]
 }

@@ -54,7 +46,6 @@ export interface BuildViewerManifestInput {
  suiteId?: string
  variantId?: string
  uploadedAt?: string
-  reportPath?: string
  agentConfig?: Record<string, unknown>
  dataset?: string
  summary?: Record<string, unknown>
@@ -77,37 +68,22 @@ function taskPaths(queryId: string): ViewerManifestTaskPaths {
 export function buildViewerManifest(
  input: BuildViewerManifestInput,
 ): ViewerManifest {
-  const tasks = input.tasks.map((task) => {
-    const { artifactId, ...publicTask } = task
-    const metrics =
-      publicTask.metrics ??
-      ({
-        durationMs: publicTask.durationMs,
-        steps: publicTask.screenshotCount,
-        screenshots: publicTask.screenshotCount,
-        toolCalls: 0,
-        toolErrors: 0,
-      } satisfies EvalTaskMetrics)
-
-    return {
-      ...publicTask,
-      metrics,
-      startUrl: publicTask.startUrl ?? '',
-      paths: taskPaths(artifactId ?? publicTask.queryId),
-    }
-  })
-
  return {
    schemaVersion: VIEWER_MANIFEST_SCHEMA_VERSION,
    runId: input.runId,
    ...(input.suiteId ? { suiteId: input.suiteId } : {}),
    ...(input.variantId ? { variantId: input.variantId } : {}),
    ...(input.uploadedAt ? { uploadedAt: input.uploadedAt } : {}),
-    ...(input.reportPath ? { reportPath: input.reportPath } : {}),
    ...(input.agentConfig ? { agentConfig: input.agentConfig } : {}),
    ...(input.dataset ? { dataset: input.dataset } : {}),
    ...(input.summary ? { summary: input.summary } : {}),
-    metrics: buildRunMetrics(tasks.map((task) => task.metrics)),
-    tasks,
+    tasks: input.tasks.map((task) => {
+      const { artifactId, ...publicTask } = task
+      return {
+        ...publicTask,
+        startUrl: publicTask.startUrl ?? '',
+        paths: taskPaths(artifactId ?? publicTask.queryId),
+      }
+    }),
  }
 }
--- a/packages/browseros-agent/apps/eval/tests/agents/claude-code-evaluator.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/agents/claude-code-evaluator.test.ts
@@ -1,268 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { mkdtemp, readFile } from 'node:fs/promises'
-import { tmpdir } from 'node:os'
-import { join } from 'node:path'
-import { createAgent } from '../../src/agents'
-import { ClaudeCodeEvaluator } from '../../src/agents/claude-code'
-import { CaptureContext } from '../../src/capture/context'
-import {
-  AgentConfigSchema,
-  type EvalConfig,
-  EvalConfigSchema,
-  type Task,
-  TaskMetadataSchema,
-} from '../../src/types'
-
-function config(): EvalConfig {
-  return {
-    agent: {
-      type: 'claude-code',
-      model: 'opus',
-      claudePath: 'claude',
-      extraArgs: [],
-    },
-    dataset: 'data/test.jsonl',
-    num_workers: 1,
-    restart_server_per_task: false,
-    browseros: {
-      server_url: 'http://127.0.0.1:9110',
-      base_cdp_port: 9010,
-      base_server_port: 9110,
-      base_extension_port: 9310,
-      load_extensions: false,
-      headless: false,
-    },
-    graders: [],
-  }
-}
-
-const task: Task = {
-  query_id: 'task-1',
-  dataset: 'test',
-  query: 'Find the title',
-  graders: [],
-  metadata: {
-    original_task_id: 'task-1',
-  },
-}
-
-describe('ClaudeCodeEvaluator', () => {
-  it('accepts claude-code config defaults without permission mode', () => {
-    const agent = AgentConfigSchema.parse({ type: 'claude-code' })
-
-    expect(agent).toEqual({
-      type: 'claude-code',
-      claudePath: 'claude',
-      extraArgs: [],
-    })
-  })
-
-  it('accepts claude-code as a runnable eval agent', () => {
-    const parsed = EvalConfigSchema.parse({
-      agent: {
-        type: 'claude-code',
-        model: 'opus',
-      },
-      dataset: 'data/test-set.jsonl',
-      browseros: {
-        server_url: 'http://127.0.0.1:9110',
-      },
-    })
-
-    expect(parsed.agent.type).toBe('claude-code')
-    expect(parsed.agent.model).toBe('opus')
-  })
-
-  it('rejects unsupported claude-code settings instead of silently ignoring them', () => {
-    expect(
-      AgentConfigSchema.safeParse({
-        type: 'claude-code',
-        permissionMode: 'bypassPermissions',
-      }).success,
-    ).toBe(false)
-    expect(
-      AgentConfigSchema.safeParse({
-        type: 'claude-code',
-        maxTurns: 3,
-      }).success,
-    ).toBe(false)
-  })
-
-  it('allows claude-code in task metadata', () => {
-    const metadata = TaskMetadataSchema.parse({
-      query_id: 'task-1',
-      dataset: 'test',
-      query: 'Do the thing',
-      started_at: new Date().toISOString(),
-      completed_at: new Date().toISOString(),
-      total_duration_ms: 100,
-      total_steps: 1,
-      termination_reason: 'completed',
-      final_answer: 'done',
-      errors: [],
-      warnings: [],
-      agent_config: {
-        type: 'claude-code',
-        model: 'opus',
-      },
-      grader_results: {},
-    })
-
-    expect(metadata.agent_config.type).toBe('claude-code')
-  })
-
-  it('is created by the agent factory', async () => {
-    const outputDir = await mkdtemp(join(tmpdir(), 'claude-code-eval-'))
-    const { capture, taskOutputDir } = await CaptureContext.create({
-      serverUrl: 'http://127.0.0.1:9110',
-      outputDir,
-      taskId: task.query_id,
-      initialPageId: 1,
-    })
-
-    const agent = createAgent({
-      config: config(),
-      task,
-      workerIndex: 0,
-      initialPageId: 1,
-      outputDir,
-      taskOutputDir,
-      capture,
-    })
-
-    expect(agent).toBeInstanceOf(ClaudeCodeEvaluator)
-  })
-
-  it('runs claude code, logs messages, writes MCP config, and saves metadata', async () => {
-    const outputDir = await mkdtemp(join(tmpdir(), 'claude-code-eval-'))
-    const { capture, taskOutputDir } = await CaptureContext.create({
-      serverUrl: 'http://127.0.0.1:9110',
-      outputDir,
-      taskId: task.query_id,
-      initialPageId: 1,
-    })
-    const calls: Array<{ executable: string; args: string[]; cwd: string }> = []
-    const evaluator = new ClaudeCodeEvaluator(
-      {
-        config: config(),
-        task,
-        workerIndex: 0,
-        initialPageId: 1,
-        outputDir,
-        taskOutputDir,
-        capture,
-      },
-      {
-        processRunner: {
-          async run(options) {
-            calls.push(options)
-            await options.onStdoutLine(
-              JSON.stringify({
-                type: 'assistant',
-                message: {
-                  content: [{ type: 'text', text: 'The title is Example' }],
-                },
-              }),
-            )
-            await options.onStdoutLine(
-              JSON.stringify({
-                type: 'result',
-                subtype: 'success',
-                result: 'The title is Example',
-              }),
-            )
-            return { exitCode: 0, stderr: '' }
-          },
-        },
-      },
-    )
-
-    const result = await evaluator.execute()
-
-    expect(result.finalAnswer).toBe('The title is Example')
-    expect(result.metadata.agent_config).toMatchObject({
-      type: 'claude-code',
-      model: 'opus',
-    })
-    expect(result.messages.some((msg) => msg.type === 'user')).toBe(true)
-    expect(result.messages.some((msg) => msg.type === 'text-delta')).toBe(true)
-    const mcpConfig = JSON.parse(
-      await readFile(join(taskOutputDir, 'claude-code-mcp.json'), 'utf-8'),
-    )
-    expect(mcpConfig.mcpServers.browseros).toMatchObject({
-      type: 'http',
-      url: 'http://127.0.0.1:9110/mcp',
-      headers: {
-        'X-BrowserOS-Source': 'sdk-internal',
-      },
-    })
-    expect(calls).toEqual([
-      expect.objectContaining({
-        executable: 'claude',
-        cwd: taskOutputDir,
-        args: [
-          '-p',
-          expect.stringContaining('Task: Find the title'),
-          '--mcp-config',
-          join(taskOutputDir, 'claude-code-mcp.json'),
-          '--strict-mcp-config',
-          '--output-format',
-          'stream-json',
-          '--verbose',
-          '--model',
-          'opus',
-        ],
-      }),
-    ])
-    expect(calls[0].args).not.toContain('--permission-mode')
-  })
-
-  it('records non-fatal stream processing errors as warnings', async () => {
-    const outputDir = await mkdtemp(join(tmpdir(), 'claude-code-eval-'))
-    const { capture, taskOutputDir } = await CaptureContext.create({
-      serverUrl: 'http://127.0.0.1:9110',
-      outputDir,
-      taskId: task.query_id,
-      initialPageId: 1,
-    })
-    const evaluator = new ClaudeCodeEvaluator(
-      {
-        config: config(),
-        task,
-        workerIndex: 0,
-        initialPageId: 1,
-        outputDir,
-        taskOutputDir,
-        capture,
-      },
-      {
-        processRunner: {
-          async run(options) {
-            await options.onStdoutLine(
-              JSON.stringify({
-                type: 'result',
-                subtype: 'success',
-                result: 'done',
-              }),
-            )
-            return {
-              exitCode: 0,
-              stderr: '',
-              streamErrors: ['bad stream line'],
-            }
-          },
-        },
-      },
-    )
-
-    const result = await evaluator.execute()
-
-    expect(result.finalAnswer).toBe('done')
-    expect(result.metadata.warnings).toEqual([
-      expect.objectContaining({
-        source: 'message_logging',
-        message: 'Claude Code stream event processing failed: bad stream line',
-      }),
-    ])
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/agents/claude-code-process-runner.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/agents/claude-code-process-runner.test.ts
@@ -1,78 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { chmod, mkdtemp, writeFile } from 'node:fs/promises'
-import { tmpdir } from 'node:os'
-import { join } from 'node:path'
-import { createClaudeCodeProcessRunner } from '../../src/agents/claude-code/process-runner'
-
-async function writeStdoutScript(): Promise<string> {
-  const dir = await mkdtemp(join(tmpdir(), 'claude-code-runner-'))
-  const script = join(dir, 'stdout-lines')
-  await writeFile(script, '#!/bin/sh\nprintf "first\\nbad\\nlast\\n"\n')
-  await chmod(script, 0o755)
-  return script
-}
-
-describe('createClaudeCodeProcessRunner', () => {
-  it('passes executable and args to the spawn dependency', async () => {
-    const calls: unknown[] = []
-    const runner = createClaudeCodeProcessRunner({
-      spawn: async (cmd, options) => {
-        calls.push({ cmd, options })
-        await options.onStdoutLine('{"type":"result","result":"done"}')
-        return { exitCode: 0, stderr: '' }
-      },
-    })
-
-    const result = await runner.run({
-      executable: 'claude',
-      args: ['-p', 'hello'],
-      cwd: '/tmp',
-      signal: new AbortController().signal,
-      onStdoutLine: async () => {},
-    })
-
-    expect(result.exitCode).toBe(0)
-    expect(calls).toEqual([
-      {
-        cmd: ['claude', '-p', 'hello'],
-        options: expect.objectContaining({ cwd: '/tmp' }),
-      },
-    ])
-  })
-
-  it('returns stderr and non-zero exit codes', async () => {
-    const runner = createClaudeCodeProcessRunner({
-      spawn: async () => ({ exitCode: 2, stderr: 'bad auth' }),
-    })
-
-    const result = await runner.run({
-      executable: 'claude',
-      args: [],
-      cwd: '/tmp',
-      signal: new AbortController().signal,
-      onStdoutLine: async () => {},
-    })
-
-    expect(result).toEqual({ exitCode: 2, stderr: 'bad auth' })
-  })
-
-  it('continues reading stdout after a line handler error', async () => {
-    const script = await writeStdoutScript()
-    const lines: string[] = []
-    const runner = createClaudeCodeProcessRunner()
-
-    const result = await runner.run({
-      executable: script,
-      args: [],
-      cwd: '/tmp',
-      onStdoutLine: async (line) => {
-        lines.push(line)
-        if (line === 'bad') throw new Error('bad line')
-      },
-    })
-
-    expect(result.exitCode).toBe(0)
-    expect(result.streamErrors).toEqual(['bad line'])
-    expect(lines).toEqual(['first', 'bad', 'last'])
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/agents/claude-code-stream-parser.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/agents/claude-code-stream-parser.test.ts
@@ -1,102 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import {
-  ClaudeCodeStreamParser,
-  shouldCaptureScreenshotForTool,
-} from '../../src/agents/claude-code/stream-parser'
-
-describe('ClaudeCodeStreamParser', () => {
-  it('maps assistant text and MCP tool use into eval stream events', () => {
-    const parser = new ClaudeCodeStreamParser()
-    const events = parser.pushLine(
-      JSON.stringify({
-        type: 'assistant',
-        message: {
-          content: [
-            { type: 'text', text: 'I will navigate.' },
-            {
-              type: 'tool_use',
-              id: 'toolu_1',
-              name: 'mcp__browseros__navigate_page',
-              input: { page: 2, url: 'https://example.com' },
-            },
-          ],
-        },
-      }),
-    )
-
-    expect(events).toEqual([
-      { type: 'text-start', id: expect.any(String) },
-      {
-        type: 'text-delta',
-        id: expect.any(String),
-        delta: 'I will navigate.',
-      },
-      { type: 'text-end', id: expect.any(String) },
-      {
-        type: 'tool-input-available',
-        toolCallId: 'toolu_1',
-        toolName: 'mcp__browseros__navigate_page',
-        input: { page: 2, url: 'https://example.com' },
-      },
-    ])
-    expect(parser.getLastText()).toBe('I will navigate.')
-    expect(parser.getToolCallCount()).toBe(1)
-  })
-
-  it('maps Claude Code tool results into eval output events', () => {
-    const parser = new ClaudeCodeStreamParser()
-    const events = parser.pushLine(
-      JSON.stringify({
-        type: 'user',
-        message: {
-          content: [
-            {
-              type: 'tool_result',
-              tool_use_id: 'toolu_1',
-              content: 'Navigated successfully',
-            },
-          ],
-        },
-      }),
-    )
-
-    expect(events).toEqual([
-      {
-        type: 'tool-output-available',
-        toolCallId: 'toolu_1',
-        output: 'Navigated successfully',
-      },
-    ])
-  })
-
-  it('uses result messages as the authoritative final text', () => {
-    const parser = new ClaudeCodeStreamParser()
-    parser.pushLine(
-      JSON.stringify({
-        type: 'assistant',
-        message: {
-          content: [{ type: 'text', text: 'I will complete the task.' }],
-        },
-      }),
-    )
-    parser.pushLine(
-      JSON.stringify({
-        type: 'result',
-        subtype: 'success',
-        result: 'Final answer',
-      }),
-    )
-
-    expect(parser.getLastText()).toBe('Final answer')
-  })
-
-  it('identifies BrowserOS MCP tools that should trigger screenshots', () => {
-    expect(
-      shouldCaptureScreenshotForTool('mcp__browseros__navigate_page'),
-    ).toBe(true)
-    expect(
-      shouldCaptureScreenshotForTool('mcp__browseros__take_screenshot'),
-    ).toBe(false)
-    expect(shouldCaptureScreenshotForTool('Read')).toBe(false)
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/cli/suite-command.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/cli/suite-command.test.ts
@@ -7,11 +7,8 @@ import {
  runSuiteCommand,
 } from '../../src/cli/commands/suite'
 import type { RunEvalOptions } from '../../src/runner/types'
-import type { EvalSuite } from '../../src/suites/schema'

-async function writeTempSuite(
-  overrides: Partial<EvalSuite> = {},
-): Promise<{ dir: string; suitePath: string }> {
+async function writeTempSuite(): Promise<{ dir: string; suitePath: string }> {
  const dir = await mkdtemp(join(tmpdir(), 'eval-suite-cli-'))
  const suitePath = join(dir, 'agisdk-daily-10.json')
  await writeFile(
@@ -26,9 +23,8 @@ async function writeTempSuite(
        restartBrowserPerTask: true,
        browseros: {
          server_url: 'http://127.0.0.1:9110',
-          headless: false,
+          headless: true,
        },
-        ...overrides,
      },
      null,
      2,
@@ -47,7 +43,9 @@ describe('suite command', () => {

    expect(resolved.kind).toBe('config')
    expect(resolved.suite.id).toBe('browseros-agent-weekly')
-    expect(resolved.evalConfig.dataset).toBe('../../data/agisdk-real.jsonl')
+    expect(resolved.evalConfig.dataset).toBe(
+      '../../data/webbench-2of4-50.jsonl',
+    )
    expect(resolved.variant.publicMetadata.agent.apiKeyConfigured).toBe(true)
  })

@@ -77,25 +75,6 @@ describe('suite command', () => {
    expect(resolved.evalConfig.num_workers).toBe(2)
  })

-  it('resolves claude-code suites without provider API credentials', async () => {
-    const { dir, suitePath } = await writeTempSuite({
-      agent: { type: 'claude-code' },
-    })
-
-    const resolved = await resolveSuiteCommand({
-      suitePath,
-      model: 'opus',
-      env: {},
-    })
-
-    expect(resolved.kind).toBe('suite')
-    expect(resolved.evalConfig.agent).toMatchObject({
-      type: 'claude-code',
-      model: 'opus',
-    })
-    expect(resolved.datasetPath).toBe(join(dir, 'tasks.jsonl'))
-  })
-
  it('runs config and suite commands through the runner dependency', async () => {
    const calls: RunEvalOptions[] = []
    await runSuiteCommand(
--- a/packages/browseros-agent/apps/eval/tests/dashboard/server.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/dashboard/server.test.ts
@@ -1,12 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { shouldAutoOpenDashboard } from '../../src/dashboard/server'
-
-describe('dashboard server', () => {
-  it('does not auto-open the dashboard in CI', () => {
-    expect(shouldAutoOpenDashboard({ CI: 'true' })).toBe(false)
-  })
-
-  it('auto-opens the dashboard outside CI by default', () => {
-    expect(shouldAutoOpenDashboard({})).toBe(true)
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/grading/python-evaluator.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/grading/python-evaluator.test.ts
@@ -1,5 +1,5 @@
 import { describe, expect, it } from 'bun:test'
-import { chmod, mkdtemp, writeFile } from 'node:fs/promises'
+import { mkdtemp, writeFile } from 'node:fs/promises'
 import { tmpdir } from 'node:os'
 import { join } from 'node:path'
 import { runPythonJsonEvaluator } from '../../src/grading/python-evaluator'
@@ -11,17 +11,6 @@ async function writeScript(source: string): Promise<string> {
  return script
 }

-async function writePythonWrapper(): Promise<string> {
-  const dir = await mkdtemp(join(tmpdir(), 'eval-python-wrapper-'))
-  const wrapper = join(dir, 'python-wrapper')
-  await writeFile(
-    wrapper,
-    '#!/bin/sh\necho custom-python >&2\nexec python3 "$@"\n',
-  )
-  await chmod(wrapper, 0o755)
-  return wrapper
-}
-
 describe('runPythonJsonEvaluator', () => {
  it('sends JSON on stdin, captures stderr, and parses stdout JSON', async () => {
    const script = await writeScript(`
@@ -60,34 +49,6 @@ sys.exit(3)
    ).rejects.toThrow('bad verifier')
  })

-  it('uses BROWSEROS_EVAL_PYTHON when provided', async () => {
-    const script = await writeScript(`
-import json, sys
-data = json.loads(sys.stdin.read())
-print(json.dumps({"ok": data["ok"]}))
-`)
-    const wrapper = await writePythonWrapper()
-    const previousPythonPath = process.env.BROWSEROS_EVAL_PYTHON
-    process.env.BROWSEROS_EVAL_PYTHON = wrapper
-
-    try {
-      const result = await runPythonJsonEvaluator<{ ok: boolean }>({
-        scriptPath: script,
-        input: { ok: true },
-        timeoutMs: 5_000,
-      })
-
-      expect(result.output).toEqual({ ok: true })
-      expect(result.stderr).toContain('custom-python')
-    } finally {
-      if (previousPythonPath === undefined) {
-        delete process.env.BROWSEROS_EVAL_PYTHON
-      } else {
-        process.env.BROWSEROS_EVAL_PYTHON = previousPythonPath
-      }
-    }
-  })
-
  it('enforces timeouts', async () => {
    const script = await writeScript(`
 import time
--- a/packages/browseros-agent/apps/eval/tests/publishing/r2-publisher.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/publishing/r2-publisher.test.ts
@@ -40,7 +40,6 @@ async function writeRunFixture(
      start_url: 'https://example.test',
      termination_reason: 'completed',
      total_duration_ms: 1200,
-      total_steps: 4,
      screenshot_count: 1,
      agent_config: { type: 'single', model: 'kimi' },
      grader_results: {
@@ -48,22 +47,13 @@ async function writeRunFixture(
      },
    }),
  )
-  await writeFile(
-    join(taskDir, 'messages.jsonl'),
-    [
-      '{"type":"user"}',
-      '{"type":"tool-input-available","toolName":"click"}',
-      '{"type":"tool-input-available","toolName":"take_snapshot"}',
-      '{"type":"tool-output-error","toolName":"click"}',
-    ].join('\n'),
-  )
+  await writeFile(join(taskDir, 'messages.jsonl'), '{"type":"user"}\n')
  await writeFile(join(taskDir, 'grades.json'), '{"ok":true}')
  await writeFile(join(taskDir, 'screenshots', '1.png'), 'png')
  await writeFile(
    join(runDir, 'summary.json'),
    JSON.stringify({ passRate: 1, avgDurationMs: 1200 }),
  )
-  await writeFile(join(runDir, 'report.html'), '<html>report</html>')
  return { runDir, runId: `${configName}-${timestamp}` }
 }

@@ -120,9 +110,6 @@ describe('R2Publisher', () => {
    expect(byKey.get(`runs/${runId}/summary.json`)?.ContentType).toBe(
      'application/json',
    )
-    expect(byKey.get(`runs/${runId}/report.html`)?.ContentType).toBe(
-      'text/html',
-    )
    expect(byKey.get('viewer.html')?.ContentType).toBe('text/html')
    expect(result.viewerUrl).toBe(
      `https://eval.example.test/viewer.html?run=${runId}`,
@@ -139,28 +126,12 @@ describe('R2Publisher', () => {
      uploadedAt: '2026-04-29T12:00:00.000Z',
      agentConfig: { type: 'single', model: 'kimi' },
      dataset: 'webbench',
-      reportPath: 'report.html',
      summary: { passRate: 1, avgDurationMs: 1200 },
-      metrics: {
-        taskCount: 1,
-        avgDurationMs: 1200,
-        avgSteps: 4,
-        avgToolCalls: 2,
-        totalToolCalls: 2,
-        totalToolErrors: 1,
-      },
      tasks: [
        {
          queryId: 'task-1',
          status: 'completed',
          screenshotCount: 1,
-          metrics: {
-            durationMs: 1200,
-            steps: 4,
-            screenshots: 1,
-            toolCalls: 2,
-            toolErrors: 1,
-          },
          paths: {
            attempt: 'tasks/task-1/attempt.json',
            metadata: 'tasks/task-1/metadata.json',
--- a/packages/browseros-agent/apps/eval/tests/publishing/r2-viewer-compat.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/publishing/r2-viewer-compat.test.ts
@@ -6,7 +6,6 @@ interface ViewerPathResolvers {
  artifactUrl(task: Record<string, unknown>, artifact: string): string
  metadataUrl(task: Record<string, unknown>): string
  messagesUrl(task: Record<string, unknown>): string
-  reportUrl(manifest: Record<string, unknown>): string | null
  screenshotUrl(task: Record<string, unknown>, step: number): string
 }

@@ -25,7 +24,7 @@ async function loadViewerPathResolvers(): Promise<ViewerPathResolvers> {
    `
      const basePath = 'runs/run-1';
      ${block}
-      return { artifactUrl, metadataUrl, messagesUrl, reportUrl, screenshotUrl };
+      return { artifactUrl, metadataUrl, messagesUrl, screenshotUrl };
    `,
  ) as () => ViewerPathResolvers
  return createResolvers()
@@ -61,35 +60,6 @@ async function runAutoSelectFromHash(hash: string): Promise<unknown> {
  return runAutoSelect()
 }

-async function runComputeStats(): Promise<unknown> {
-  const html = await readFile(
-    join(import.meta.dir, '..', '..', 'src', 'dashboard', 'viewer.html'),
-    'utf-8',
-  )
-  const start = html.indexOf('function computeStats(tasks)')
-  const end = html.indexOf('function resolveStatus(task)', start)
-  expect(start).toBeGreaterThan(-1)
-  expect(end).toBeGreaterThan(start)
-
-  const block = html.slice(start, end)
-  const compute = new Function(
-    `
-      ${block}
-      return computeStats([
-        {
-          graderResults: { agisdk_state_diff: { pass: true, score: 1 } },
-          metrics: { durationMs: 1000, steps: 4, toolCalls: 3, toolErrors: 0 }
-        },
-        {
-          graderResults: { agisdk_state_diff: { pass: false, score: 0 } },
-          metrics: { durationMs: 3000, steps: 8, toolCalls: 5, toolErrors: 2 }
-        }
-      ]);
-    `,
-  ) as () => unknown
-  return compute()
-}
-
 describe('R2 viewer artifact path compatibility', () => {
  it('uses explicit manifest paths for new uploaded runs', async () => {
    const resolvers = await loadViewerPathResolvers()
@@ -125,15 +95,6 @@ describe('R2 viewer artifact path compatibility', () => {
    )
  })

-  it('resolves manifest-level run report links', async () => {
-    const resolvers = await loadViewerPathResolvers()
-
-    expect(resolvers.reportUrl({ reportPath: 'report.html' })).toBe(
-      'runs/run-1/report.html',
-    )
-    expect(resolvers.reportUrl({})).toBe(null)
-  })
-
  it('falls back to legacy inferred paths for old uploaded runs', async () => {
    const resolvers = await loadViewerPathResolvers()
    const task = { queryId: 'legacy-task' }
@@ -166,17 +127,4 @@ describe('R2 viewer artifact path compatibility', () => {
      queryId: 'legacy-task',
    })
  })
-
-  it('computes run-level timing and tool metrics for the viewer', async () => {
-    expect(await runComputeStats()).toMatchObject({
-      total: 2,
-      passed: 1,
-      failed: 1,
-      avgDurationMs: 2000,
-      avgSteps: 6,
-      avgToolCalls: 4,
-      totalToolCalls: 8,
-      totalToolErrors: 2,
-    })
-  })
 })
--- a/packages/browseros-agent/apps/eval/tests/reporting/generate-report-script.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/reporting/generate-report-script.test.ts
@@ -1,159 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { mkdir, mkdtemp, readFile, writeFile } from 'node:fs/promises'
-import { tmpdir } from 'node:os'
-import { join } from 'node:path'
-import {
-  DEFAULT_REPORT_MAX_TURNS,
-  DEFAULT_REPORT_MODEL,
-  generateEvalReport,
-  runClaudeCodeReportAgent,
-} from '../../scripts/generate-report'
-
-async function writeRunFixture(): Promise<string> {
-  const runDir = await mkdtemp(join(tmpdir(), 'eval-report-script-'))
-  const taskDir = join(runDir, 'agisdk-networkin-10')
-  await mkdir(join(taskDir, 'screenshots'), { recursive: true })
-  await writeFile(
-    join(runDir, 'summary.json'),
-    JSON.stringify({
-      total: 1,
-      completed: 1,
-      passRate: 0,
-      avgDurationMs: 1234,
-    }),
-  )
-  await writeFile(
-    join(taskDir, 'metadata.json'),
-    JSON.stringify({
-      query_id: 'agisdk-networkin-10',
-      dataset: 'agisdk-real',
-      query: 'Send a follow-up message starting with "Following up on".',
-      termination_reason: 'completed',
-      total_duration_ms: 1234,
-      total_steps: 2,
-      screenshot_count: 1,
-      final_answer: 'No app action was taken.',
-      errors: [],
-      warnings: [],
-      agent_config: { type: 'single', model: 'kimi' },
-      grader_results: {
-        agisdk_state_diff: {
-          score: 0,
-          pass: false,
-          reasoning: 'Some criteria failed',
-          details: {
-            per_criterion: [
-              { passed: true, detail: 'message starts correctly' },
-              { passed: false, detail: 'message was not sent' },
-            ],
-          },
-        },
-      },
-    }),
-  )
-  await writeFile(
-    join(taskDir, 'messages.jsonl'),
-    [
-      JSON.stringify({
-        type: 'tool-input-available',
-        timestamp: '2026-04-30T00:00:00.000Z',
-        toolCallId: 'call-1',
-        toolName: 'memory_search',
-        input: { q: 'chat' },
-      }),
-      JSON.stringify({
-        type: 'tool-output-error',
-        timestamp: '2026-04-30T00:00:01.000Z',
-        toolCallId: 'call-1',
-        errorText: 'memory unavailable',
-      }),
-    ].join('\n'),
-  )
-  await writeFile(join(taskDir, 'screenshots', '1.png'), 'png')
-  return runDir
-}
-
-describe('generate-report script', () => {
-  it('delegates report.html creation to Claude Code', async () => {
-    const runDir = await writeRunFixture()
-    const outputPath = join(runDir, 'report.html')
-    let prompt = ''
-
-    await generateEvalReport({
-      inputDir: runDir,
-      outputPath,
-      runAgent: async (invocation) => {
-        prompt = invocation.prompt
-        await writeFile(
-          invocation.outputPath,
-          '<!doctype html><h1>Claude-written report</h1>',
-        )
-      },
-    })
-
-    expect(await readFile(outputPath, 'utf-8')).toContain(
-      'Claude-written report',
-    )
-    expect(prompt).toContain('AGI SDK Random-10 Failure Report')
-    expect(prompt).toContain('summary.json')
-    expect(prompt).toContain('messages.jsonl')
-    expect(prompt).toContain('screenshots')
-    expect(prompt).toContain('Deterministic run metrics')
-    expect(prompt).toContain('"queryId": "agisdk-networkin-10"')
-    expect(prompt).toContain('"toolCalls": 1')
-    expect(prompt).toContain('"toolErrors": 1')
-    expect(prompt).toContain('Duration by task')
-    expect(prompt).toContain('Tool calls by task')
-    expect(prompt).toContain(outputPath)
-  })
-
-  it('fails when the Claude Code agent does not write the report', async () => {
-    const runDir = await writeRunFixture()
-
-    await expect(
-      generateEvalReport({
-        inputDir: runDir,
-        outputPath: join(runDir, 'missing-report.html'),
-        runAgent: async () => {},
-      }),
-    ).rejects.toThrow('Report was not written')
-  })
-
-  it('runs Claude Code with Opus 4.6, full bypass, and bounded turns', async () => {
-    const runDir = await writeRunFixture()
-    const calls: unknown[] = []
-
-    await runClaudeCodeReportAgent(
-      {
-        inputDir: runDir,
-        outputPath: join(runDir, 'report.html'),
-        prompt: 'write the report',
-      },
-      {
-        query: async function* (call: unknown) {
-          calls.push(call)
-          yield { type: 'result', subtype: 'success', result: 'done' }
-        },
-        env: {
-          CLAUDE_CODE_OAUTH_TOKEN: 'token',
-          EVAL_R2_SECRET_ACCESS_KEY: 'secret',
-          HOME: '/tmp/home',
-          PATH: '/bin',
-        },
-      },
-    )
-
-    expect(calls).toHaveLength(1)
-    expect(calls[0]).toMatchObject({
-      prompt: 'write the report',
-      options: {
-        cwd: runDir,
-        model: DEFAULT_REPORT_MODEL,
-        maxTurns: DEFAULT_REPORT_MAX_TURNS,
-        permissionMode: 'bypassPermissions',
-        allowDangerouslySkipPermissions: true,
-      },
-    })
-    expect(JSON.stringify(calls[0])).not.toContain('secret')
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/suites/config-adapter.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/suites/config-adapter.test.ts
@@ -1,22 +1,19 @@
 import { describe, expect, it } from 'bun:test'
-import { mkdtemp, writeFile } from 'node:fs/promises'
-import { tmpdir } from 'node:os'
-import { join } from 'node:path'
 import { adaptEvalConfigFile } from '../../src/suites/config-adapter'

 describe('adaptEvalConfigFile', () => {
-  it('preserves browseros-agent-weekly AGI SDK config semantics', async () => {
+  it('preserves browseros-agent-weekly config semantics', async () => {
    const adapted = await adaptEvalConfigFile(
      'apps/eval/configs/legacy/browseros-agent-weekly.json',
    )

    expect(adapted.suite.id).toBe('browseros-agent-weekly')
-    expect(adapted.suite.dataset).toBe('../../data/agisdk-real.jsonl')
-    expect(adapted.suite.graders).toEqual(['agisdk_state_diff'])
-    expect(adapted.suite.workers).toBe(3)
+    expect(adapted.suite.dataset).toBe('../../data/webbench-2of4-50.jsonl')
+    expect(adapted.suite.graders).toEqual(['performance_grader'])
+    expect(adapted.suite.workers).toBe(10)
    expect(adapted.suite.restartBrowserPerTask).toBe(true)
    expect(adapted.suite.timeoutMs).toBe(1_800_000)
-    expect(adapted.evalConfig.num_workers).toBe(3)
+    expect(adapted.evalConfig.num_workers).toBe(10)
    expect(adapted.evalConfig.browseros.server_url).toBe(
      'http://127.0.0.1:9110',
    )
@@ -37,61 +34,4 @@ describe('adaptEvalConfigFile', () => {
      'secret-openrouter-value',
    )
  })
-
-  it('adapts BrowserOS AGI SDK comparison configs', async () => {
-    const kimi = await adaptEvalConfigFile(
-      'apps/eval/configs/legacy/browseros-agent-kimi-k2-5-agisdk-real.json',
-    )
-    const opus = await adaptEvalConfigFile(
-      'apps/eval/configs/legacy/browseros-agent-opus-4-6-agisdk-real.json',
-    )
-
-    expect(kimi.suite.id).toBe('browseros-agent-kimi-k2-5-agisdk-real')
-    expect(kimi.evalConfig.agent).toMatchObject({
-      type: 'single',
-      provider: 'openai-compatible',
-      model: 'moonshotai/kimi-k2.5',
-    })
-    expect(kimi.evalConfig.num_workers).toBe(3)
-
-    expect(opus.suite.id).toBe('browseros-agent-opus-4-6-agisdk-real')
-    expect(opus.evalConfig.agent).toMatchObject({
-      type: 'single',
-      provider: 'bedrock',
-      model: 'global.anthropic.claude-opus-4-6-v1',
-      region: 'AWS_REGION',
-      accessKeyId: 'AWS_ACCESS_KEY_ID',
-      secretAccessKey: 'AWS_SECRET_ACCESS_KEY',
-    })
-    expect(opus.evalConfig.num_workers).toBe(2)
-  })
-
-  it('adapts claude-code configs without provider credentials', async () => {
-    const dir = await mkdtemp(join(tmpdir(), 'claude-code-config-'))
-    const configPath = join(dir, 'claude-code-agisdk.json')
-    await writeFile(
-      configPath,
-      JSON.stringify({
-        agent: {
-          type: 'claude-code',
-          model: 'opus',
-        },
-        dataset: 'tasks.jsonl',
-        num_workers: 1,
-        restart_server_per_task: false,
-        browseros: {
-          server_url: 'http://127.0.0.1:9110',
-          headless: false,
-        },
-      }),
-    )
-
-    const adapted = await adaptEvalConfigFile(configPath, { env: {} })
-
-    expect(adapted.suite.agent).toEqual({ type: 'claude-code' })
-    expect(adapted.variant.agent).toMatchObject({
-      provider: 'claude-code',
-      model: 'opus',
-    })
-  })
 })
--- a/packages/browseros-agent/apps/eval/tests/suites/schema.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/suites/schema.test.ts
@@ -35,16 +35,6 @@ describe('EvalSuiteSchema', () => {
    expect(parsed.success).toBe(false)
  })

-  it('validates claude-code suites', () => {
-    const suite = EvalSuiteSchema.parse({
-      id: 'claude-code-agisdk',
-      dataset: 'data/agisdk-real.jsonl',
-      agent: { type: 'claude-code' },
-    })
-
-    expect(suite.agent.type).toBe('claude-code')
-  })
-
  it('validates the daily AGISDK 10-task suite', async () => {
    const loaded = await loadSuite(
      'apps/eval/configs/suites/agisdk-daily-10.json',
@@ -99,40 +89,4 @@ describe('resolveVariant', () => {
      }),
    ).toThrow('EVAL_AGENT_API_KEY')
  })
-
-  it('resolves claude-code variants without model or API key requirements', () => {
-    const variant = resolveVariant({
-      variantId: 'claude-opus',
-      provider: 'claude-code',
-      model: 'opus',
-      env: {},
-    })
-
-    expect(variant.id).toBe('claude-opus')
-    expect(variant.agent).toEqual({
-      provider: 'claude-code',
-      model: 'opus',
-    })
-    expect(variant.publicMetadata.agent).toEqual({
-      provider: 'claude-code',
-      model: 'opus',
-      apiKeyConfigured: false,
-    })
-
-    const defaultVariant = resolveVariant({
-      provider: 'claude-code',
-      env: {},
-    })
-
-    expect(defaultVariant.id).toBe('claude-code')
-    expect(defaultVariant.agent).toEqual({
-      provider: 'claude-code',
-      model: '',
-    })
-    expect(defaultVariant.publicMetadata.agent).toEqual({
-      provider: 'claude-code',
-      model: 'default',
-      apiKeyConfigured: false,
-    })
-  })
 })
--- a/packages/browseros-agent/apps/eval/tests/utils/resolve-provider-config.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/utils/resolve-provider-config.test.ts
@@ -1,38 +0,0 @@
-import { describe, expect, it } from 'bun:test'
-import { resolveProviderConfig } from '../../src/utils/resolve-provider-config'
-
-describe('resolveProviderConfig', () => {
-  it('resolves Bedrock region from environment variables', async () => {
-    const previous = {
-      AWS_REGION: process.env.AWS_REGION,
-      AWS_ACCESS_KEY_ID: process.env.AWS_ACCESS_KEY_ID,
-      AWS_SECRET_ACCESS_KEY: process.env.AWS_SECRET_ACCESS_KEY,
-    }
-    process.env.AWS_REGION = 'us-west-2'
-    process.env.AWS_ACCESS_KEY_ID = 'test-access-key'
-    process.env.AWS_SECRET_ACCESS_KEY = 'test-secret-key'
-
-    try {
-      const resolved = await resolveProviderConfig({
-        provider: 'bedrock',
-        model: 'global.anthropic.claude-opus-4-6-v1',
-        region: 'AWS_REGION',
-        accessKeyId: 'AWS_ACCESS_KEY_ID',
-        secretAccessKey: 'AWS_SECRET_ACCESS_KEY',
-      })
-
-      expect(resolved).toMatchObject({
-        provider: 'bedrock',
-        model: 'global.anthropic.claude-opus-4-6-v1',
-        region: process.env.AWS_REGION,
-        accessKeyId: process.env.AWS_ACCESS_KEY_ID,
-        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
-      })
-    } finally {
-      for (const [key, value] of Object.entries(previous)) {
-        if (value === undefined) delete process.env[key]
-        else process.env[key] = value
-      }
-    }
-  })
-})
--- a/packages/browseros-agent/apps/eval/tests/viewer/viewer-manifest.test.ts
+++ b/packages/browseros-agent/apps/eval/tests/viewer/viewer-manifest.test.ts
@@ -9,7 +9,6 @@ describe('buildViewerManifest', () => {
      suiteId: 'agisdk-daily-10',
      variantId: 'kimi',
      uploadedAt: '2026-04-29T06:00:00.000Z',
-      reportPath: 'report.html',
      summary: { total: 1, passRate: 0 },
      tasks: [
        {
@@ -19,13 +18,6 @@ describe('buildViewerManifest', () => {
          status: 'completed',
          durationMs: 353_000,
          screenshotCount: 42,
-          metrics: {
-            durationMs: 353_000,
-            steps: 47,
-            screenshots: 42,
-            toolCalls: 19,
-            toolErrors: 2,
-          },
          graderResults: {
            agisdk_state_diff: {
              score: 0,
@@ -40,7 +32,6 @@ describe('buildViewerManifest', () => {

    const publishManifest: R2RunManifest = manifest
    expect(publishManifest.schemaVersion).toBe(2)
-    expect(manifest.reportPath).toBe('report.html')
    expect(manifest.tasks[0].paths.messages).toBe(
      'tasks/agisdk-dashdish-4/messages.jsonl',
    )
@@ -50,21 +41,6 @@ describe('buildViewerManifest', () => {
    expect(manifest.tasks[0].paths.graderArtifacts).toBe(
      'tasks/agisdk-dashdish-4/grader-artifacts',
    )
-    expect(manifest.metrics).toMatchObject({
-      taskCount: 1,
-      avgDurationMs: 353_000,
-      avgSteps: 47,
-      avgToolCalls: 19,
-      totalToolCalls: 19,
-      totalToolErrors: 2,
-    })
-    expect(manifest.tasks[0].metrics).toEqual({
-      durationMs: 353_000,
-      steps: 47,
-      screenshots: 42,
-      toolCalls: 19,
-      toolErrors: 2,
-    })
    expect(manifest.tasks[0].graderResults.agisdk_state_diff.details).toEqual({
      missing: ['checkout item'],
    })
Author	SHA1	Message	Date
Nikhil Sonti	ad716fca78	fix(dev): address watch lock review comments	2026-04-30 11:44:07 -07:00
Nikhil Sonti	df8ff02b8f	fix(dev): use run lock for watch cleanup	2026-04-30 11:29:00 -07:00