Compare commits

..

2 Commits

Author SHA1 Message Date
Nikhil Sonti
e78bf58d41 test: cover terminal limactl resolver errors 2026-04-28 17:04:39 -07:00
Nikhil Sonti
a1c83a4b9c fix: avoid eager limactl resolution in server tests 2026-04-28 16:53:12 -07:00
308 changed files with 14709 additions and 20735 deletions

View File

@@ -1,152 +0,0 @@
---
name: ask-internal
description: Answer questions about BrowserOS internal stuff (setup, features, architecture, design decisions) by reading the private internal-docs submodule and the codebase. Use for "how do I X", "where is Y", "what is the deal with Z", or any question that mixes ops/setup knowledge with code knowledge. Can execute steps with per-command confirmation.
allowed-tools: Bash, Read, Grep, Glob, Edit, Write
---
# Ask Internal
Answer team-internal questions by reading `.internal-docs/` and the codebase, synthesizing a direct answer with file:line citations, and optionally running surfaced commands with confirmation.
**Announce at start:** "I'm using the ask-internal skill to answer this from internal-docs and the codebase."
## When to use
- "How do I reset my dogfood profile?"
- "What's the deal with the OpenClaw VM startup?"
- "Where do we configure release signing?"
- Any question whose answer lives in setup runbooks, feature notes, architecture docs, or the code that produced them.
## Hard rules — never do these
- NEVER execute a state-mutating command without per-command `y` confirmation from the user.
- NEVER edit BrowserOS code in response to an ask-internal question. The skill answers; it does not modify code. Use `/document-internal` for writes.
- NEVER guess. If grep finds nothing useful in docs or code, say so plainly.
- NEVER run this skill if `.internal-docs/` is missing. Stop with the init command.
- NEVER cite a file or line number you have not actually read.
## Voice rules
Apply the same voice rules as `document-internal` to the synthesized answer:
- Lead with the point.
- Concrete nouns. Name files, functions, commands.
- Short sentences. Active voice. No em dashes.
- Banned words: delve, crucial, robust, comprehensive, nuanced, multifaceted, furthermore, moreover, additionally, pivotal, landscape, tapestry, underscore, foster, showcase, intricate, vibrant, fundamental, significant, leverage, utilize.
- No filler intros.
## Workflow
### Step 0: Pre-flight
```bash
if git submodule status .internal-docs 2>/dev/null | grep -q '^-'; then
echo "internal-docs submodule not initialized. Run: git submodule update --init .internal-docs"
exit 0
fi
[ -d .internal-docs ] && [ -n "$(ls -A .internal-docs 2>/dev/null)" ] || {
echo ".internal-docs/ missing or empty. Submodule not configured?"
exit 0
}
```
### Step 1: Parse the question
Pull the keywords from the user's question. Drop stop words. Identify intent:
- **Setup-question** ("how do I", "how to", "where do I configure"): bias the search toward `setup/`.
- **Feature-question** ("what is X", "why does X work this way"): bias toward `features/` and `architecture/`.
- **Free-form** ("anything about Y"): search all categories.
### Step 2: Multi-source search
Run grep in parallel across two sources.
**Internal docs:**
```bash
grep -rni --include='*.md' '<keyword>' .internal-docs/
```
Search each keyword separately. Collect top hits by relevance (more keyword matches = higher).
**Codebase (skip vendored Chromium and `node_modules`):**
```bash
grep -rni --include='*.ts' --include='*.tsx' --include='*.js' --include='*.json' --include='*.sh' \
--exclude-dir=node_modules --exclude-dir=chromium --exclude-dir=.grove \
'<keyword>' packages/ scripts/ .config/ .github/
```
Read the top 3-5 doc hits and top 3-5 code hits. Do not skim — read the relevant section fully so citations are accurate.
### Step 3: Synthesize answer
Structure the response:
1. **Direct answer.** First sentence answers the question. No preamble.
2. **Steps if applicable.** Numbered list with exact commands.
3. **Citations.** Every factual claim references `path/to/file.md:42` or `path/to/code.ts:117`. Run the voice self-check before printing.
If multiple docs cover the topic at different layers (e.g., a setup runbook and a feature note both mention dogfood profiles), reconcile them in the answer rather than dumping both.
### Step 4: Offer execution (only if commands surfaced)
If Step 3 produced executable commands the user could run, ask:
> Run these for you? (y / n / dry-run)
- **y:** Execute one at a time. For any command that mutates state (writes a file, modifies config, kills a process, deletes anything), ask "run this? <command>" before each. Read-only commands (`ls`, `cat`, `git status`) run without per-command confirmation but still print before running.
- **n:** Skip. Done.
- **dry-run:** Print the full sequence as a `bash` block. Do not execute.
### Step 5: Doc-not-found path
If Step 2 returned nothing useful (no doc hits AND no clear code answer):
1. Tell the user: "No doc covers this. Tangentially relevant files: <list>."
2. Ask: "Draft a new doc and open a PR to internal-docs?"
3. On yes: invoke the full `/document-internal` flow (four sharp questions, draft, voice check, PR), forced to `setup/` doc type, with the code-grep findings handed in as initial context.
### Step 6: Completion status
Report one of:
- **DONE** — answer delivered, citations verified.
- **DONE_WITH_CONCERNS** — answered, but flag uncertainty (e.g., docs and code disagreed; user should reconcile).
- **BLOCKED** — submodule missing or other pre-flight failure.
- **NEEDS_CONTEXT** — question too vague to search effectively. Ask one clarifying question.
## Citation discipline
Every "X is at Y" claim in the answer must point to a file:line that the skill actually read. Do not approximate. If you didn't read it, don't cite it.
If a doc says one thing and the code says another, surface the conflict explicitly:
> The setup runbook (`setup/dogfood-profile.md:23`) says to delete `~/.cache/browseros/dogfood`, but the actual code path in `packages/cli/src/cleanup.ts:47` removes `~/.local/share/browseros/dogfood`. The doc looks stale. Recommend updating it.
## Common Mistakes
**Skimming and then citing**
- **Problem:** Citation points to a line that doesn't actually contain the claim.
- **Fix:** Read the section fully before citing. If you didn't read line 117, don't cite line 117.
**Executing without per-command confirmation for mutations**
- **Problem:** User says "y" to "run all", skill blasts through `rm -rf`-style commands.
- **Fix:** "y" means "run this sequence with per-mutation confirmations". Per-command y is required for writes.
**Searching only docs, not code**
- **Problem:** Doc says X but code does Y; answer is wrong.
- **Fix:** Always grep both sources in Step 2.
## Red Flags
**Never:**
- Cite a file:line you haven't read.
- Run mutations without per-command confirmation.
- Modify BrowserOS code from this skill (use `/document-internal` for writes).
**Always:**
- Pre-flight check before any search.
- Reconcile doc vs code conflicts in the answer, don't hide them.
- Plain "no doc covers this" when grep is empty — never invent.

View File

@@ -1,208 +0,0 @@
---
name: document-internal
description: Draft a 1-page internal doc (feature, architecture, or design) for the private browseros-ai/internal-docs repo. Use when wrapping up a feature on a branch, after the PR is open or about to be opened. Skill drafts from the diff, asks four sharp questions, enforces voice rules, and opens a PR to internal-docs.
allowed-tools: Bash, Read, Write, Edit, Grep, Glob
---
# Document Internal
Draft a 1-page internal doc (feature note, architecture note, or design spec) from the current branch's diff and open a PR to `browseros-ai/internal-docs`.
**Announce at start:** "I'm using the document-internal skill to draft a doc for internal-docs."
## When to use
After finishing implementation on a feature branch, when the work is doc-worthy (a major feature, a new subsystem, a setup runbook for something internal, or a design decision that future engineers need to know).
## Hard rules — never do these
- NEVER `git add -A` or `git add .` inside the tmp clone of internal-docs. Always specific paths.
- NEVER write outside the tmp clone (no spillover into the OSS repo's working tree).
- NEVER fabricate filler content for empty template sections. Empty stays empty.
- NEVER touch the OSS repo's `.gitmodules` or submodule pointer — the sync workflow handles that.
- NEVER run this skill if `.internal-docs/` is missing. Stop with the init command.
- NEVER push to `internal-docs/main` directly. Always a feature branch + PR.
## Voice rules — enforced by Step 4
The skill MUST follow these and refuse to draft otherwise. After generation, scan for violations and regenerate offending sentences (max 3 attempts).
- Lead with the point. First sentence answers "what is this?"
- Concrete nouns. Name files, functions, commands. Not "the system" or "the component".
- Short sentences. Average <20 words. No deeply nested clauses.
- Active voice. "X does Y" not "Y is done by X".
- No em dashes. Use commas, periods, or rephrase.
- Banned words: delve, crucial, robust, comprehensive, nuanced, multifaceted, furthermore, moreover, additionally, pivotal, landscape, tapestry, underscore, foster, showcase, intricate, vibrant, fundamental, significant, leverage, utilize.
- "110 IQ" target. Write for a smart engineer who has not seen this code yet.
- No filler intros ("This document describes..."). Start with the substance.
- Empty sections stay empty. Do not write "N/A" or fabricate content.
## Workflow
### Step 0: Pre-flight
Bail with a clear message on any failure.
```bash
# Submodule must be initialized
if git submodule status .internal-docs 2>/dev/null | grep -q '^-'; then
echo "internal-docs submodule not initialized. Run: git submodule update --init .internal-docs"
exit 0
fi
[ -d .internal-docs ] || { echo ".internal-docs/ missing. Submodule not configured?"; exit 0; }
# Must be on a feature branch
BRANCH=$(git branch --show-current)
if [ "$BRANCH" = "main" ] || [ "$BRANCH" = "dev" ]; then
echo "On $BRANCH. Run from a feature branch."
exit 0
fi
# Determine base branch (default: dev for this repo, fall back to main).
# Suppress rev-parse's SHA output on stdout so it doesn't get captured into BASE.
BASE=$(git rev-parse --verify origin/dev >/dev/null 2>&1 && echo dev || echo main)
# Gather context
git log "$BASE..HEAD" --oneline
git diff "$BASE...HEAD" --stat
gh pr view --json body -q .body 2>/dev/null # may be empty if no PR yet
```
### Step 1: Identify the doc
Ask the user for three things in one prompt:
1. **Doc type:** `feature` (default for `feat/*` branches), `architecture`, or `design`
2. **Slug:** kebab-case, short (e.g., `cowork-mcp`, `auto-skill-suggest`)
3. **Owner:** GitHub handle (default = `git config user.name` or current `gh api user --jq .login`)
### Step 2: Decision brief — four sharp questions
Ask one question at a time. Each answer constrains the next. These force compression before drafting.
1. "In one sentence: what can someone now DO that they could not before?"
2. "What is the one design decision a future engineer needs to know?"
3. "Which 3-5 files are the heart of this change?" (suggest candidates from the diff)
4. "Any sharp edges or gotchas? (or 'none')"
Skip any question that is N/A for the doc type. Architecture notes don't need question 1; design specs don't need question 4.
### Step 3: Draft from the template
Read the matching template from `.internal-docs/_templates/`:
- `feature` `feature-note.md`
- `architecture` `architecture-note.md`
- `design` `design-spec.md`
If `.internal-docs/_templates/` does not exist (first run, before seeding), fall back to the seeds bundled with this skill at `.claude/skills/document-internal/seeds/_templates/`.
Generate the 1-pager from the template, the four answers, and the diff context.
### Step 4: Voice self-check
Scan the draft for violations:
- Em dash present (`—`).
- Any banned word from the list.
- Average sentence length > 20 words.
- Body line count > 60 (feature notes only — architecture/design have no cap).
If any violation found, regenerate the offending sentences in place. Max 3 attempts. If still failing after 3 attempts, stop and report which rules are violated.
If the body is over 60 lines for a feature note, ask: "This is N lines, target is 60. Trim, or promote to `architecture/` (no length cap)?"
### Step 5: Show + iterate
Print the full draft. Ask:
> Edit needed? Paste any changes, or say "looks good".
Apply user edits with the Edit tool. Re-run Step 4. Loop until the user approves.
### Step 6: Open PR to internal-docs
Use a tmp clone. Never the user's `.internal-docs` checkout — keeps the user's submodule clean.
```bash
TMP=$(mktemp -d)
trap 'rm -rf "$TMP"' EXIT # cleans up even if any step below fails
git clone -b main git@github.com:browseros-ai/internal-docs.git "$TMP"
cd "$TMP"
git checkout -b "docs/<slug>"
# Write the doc
mkdir -p "<type>" # features, architecture, designs, or setup
cat > "<type>/$(date -u +%Y-%m)-<slug>.md" <<'DOC'
<draft content>
DOC
# Update the root README index — insert one line under the matching section
# Use Edit tool to add: "- [<title>](<type>/YYYY-MM-<slug>.md) — <one-line description>"
git add "<type>/$(date -u +%Y-%m)-<slug>.md" README.md
git commit -m "docs(<type>): <slug>"
git push -u origin "docs/<slug>"
PR_URL=$(gh pr create -R browseros-ai/internal-docs --base main \
--head "docs/<slug>" \
--title "docs(<type>): <slug>" \
--body "$(cat <<'BODY'
## Summary
<one-line of what this doc covers>
## Source
- BrowserOS branch: <branch>
- Related PR: <#NNN if any>
BODY
)")
cd -
echo "PR opened: $PR_URL"
# trap above cleans up $TMP on EXIT
```
If the slug contains characters that won't shell-escape cleanly, sanitize before substitution.
### Step 7: Completion status
Report one of:
- **DONE** — file written, branch pushed, PR opened. Print PR URL.
- **DONE_WITH_CONCERNS** — same as DONE but list concerns (e.g., voice check needed multiple regens, user skipped a question).
- **BLOCKED** — submodule missing, auth fail, or template missing. State exactly what's needed.
## Doc type defaults
| Branch pattern | Default doc type | Default location |
|----------------|------------------|------------------|
| `feat/*` | feature | `features/` |
| `arch/*` or refactor branches with >10 files in `packages/` | architecture | `architecture/` |
| `rfc/*` or `design/*` | design | `designs/` |
| Otherwise | ask | ask |
## Common Mistakes
**Drafting before asking the four questions**
- **Problem:** Output is generic filler that says nothing concrete.
- **Fix:** Always ask Step 2 first, even if the diff "looks obvious".
**Touching `.internal-docs/` directly**
- **Problem:** User's submodule HEAD moves, parent repo shows dirty state.
- **Fix:** Always use the tmp clone in Step 6.
**Skipping voice check on user edits**
- **Problem:** User pastes prose with em dashes or filler; ships as-is.
- **Fix:** Re-run Step 4 after every user edit.
## Red Flags
**Never:**
- Push to `internal-docs/main`. Always branch + PR.
- Modify the OSS repo's `.gitmodules` or submodule pointer.
- Fabricate content for empty template sections.
**Always:**
- Pre-flight check before doing any work.
- One-pager rule for feature notes (60-line body cap).
- File:line citations when referencing code.

View File

@@ -1,51 +0,0 @@
# BrowserOS Internal Docs
Private team docs for `browseros-ai`. Mounted as a submodule into the public OSS repo at `.internal-docs/`.
If you are reading this from a public clone of BrowserOS without team access — this submodule is for the BrowserOS internal team. Nothing here is required to build or use BrowserOS.
## How to find what you need
- Setup task ("how do I X locally") → look in [`setup/`](setup/)
- Recently shipped feature → look in [`features/`](features/)
- Cross-cutting subsystem → look in [`architecture/`](architecture/)
- A design decision or RFC → look in [`designs/`](designs/)
Or run `/ask-internal "<your question>"` from any BrowserOS checkout. The skill greps these docs and the codebase, then synthesizes an answer with citations.
## How to add a doc
Run `/document-internal` from a feature branch. The skill drafts a 1-pager from your branch's diff, asks four sharp questions, enforces voice rules, and opens a PR back to this repo.
## Index
### Setup
<!-- one line per setup runbook: -->
<!-- - [Dev environment](setup/dev-environment.md): first-time machine setup -->
### Features
<!-- one line per shipped feature, newest first: -->
<!-- - [Cowork MCP](features/2026-04-cowork-mcp.md): bring outside MCPs into the BrowserOS agent -->
### Architecture
<!-- one line per cross-cutting subsystem: -->
<!-- - [Chrome fork overview](architecture/chrome-fork-overview.md): what we patched and why -->
### Designs
<!-- one line per design spec, newest first: -->
<!-- - [Internal docs submodule](designs/2026-04-30-internal-docs-submodule.md): this system -->
## Templates
When `/document-internal` runs, it reads from [`_templates/`](_templates/). Edit the templates here when the team's preferred shape changes.
## Voice
Docs in this repo follow these rules. The `/document-internal` skill enforces them; humans editing by hand should match.
- Lead with the point.
- Concrete nouns. Name files, functions, commands.
- Short sentences, active voice, no em dashes.
- No filler words: delve, crucial, robust, comprehensive, nuanced, multifaceted, leverage, utilize, etc.
- Empty sections stay empty. Do not write "N/A" or fake content.
- Feature notes target one screen, body 60 lines max.

View File

@@ -1,31 +0,0 @@
---
title: <subsystem name>
owner: <github handle>
status: current | deprecated
date: YYYY-MM-DD
related-features: [feature-slug-1, feature-slug-2]
---
# <subsystem name>
## What this subsystem does
<1-2 paragraphs. The top-level responsibility. Boundaries.>
## Architecture
<Diagram (ASCII or mermaid) plus prose. Components and how they talk.>
## Constraints
<Hard rules the design enforces. "X must never call Y" type statements.>
## Decisions made
<Numbered list of non-obvious decisions and the reason for each.>
## Key files
- `path/to/file.ts` — role
- `path/to/dir/` — what lives here
## How to evolve this
<Where to add things. Which tests to expect to update. What NOT to touch.>
## Open questions
<What is still being figured out. Empty if none.>

View File

@@ -1,34 +0,0 @@
---
title: <design name>
owner: <github handle>
status: proposed | accepted | rejected | superseded
date: YYYY-MM-DD
supersedes: <design-slug or none>
---
# <design name>
## Goal
<2-4 sentences. What this design is trying to accomplish.>
## Context
<1-2 paragraphs. The current state, what is failing, why this needs to change.>
## Selected Approach
<The chosen design at a high level. Architecture, components, data flow.>
## Alternatives Considered
### 1. <name>
<2-3 sentences on what this would look like, then pro/con and why rejected (or deferred).>
### 2. <name>
<Same shape.>
## Out of Scope
<What this design does NOT cover. Defer references.>
## Rollout
<Numbered steps from "nothing exists" to "fully shipped".>
## Open Questions
<Resolved during design? Empty. Unresolved? List with owner.>

View File

@@ -1,29 +0,0 @@
---
title: <feature name>
owner: <github handle>
status: shipped | wip | deprecated
date: YYYY-MM-DD
prs: ["#NNN"]
tags: [agent, browser, mcp]
---
# <feature name>
## What it does
<2-3 sentences. What can someone now do that they could not before. Lead with user-facing impact, not implementation.>
## Why we built it
<1-2 sentences. Motivation. What pain it removed or what unlocked.>
## How it works
<3-6 sentences. The flow at a high level. Name the key files.>
## Key files
- `path/to/file.ts` — what it does
- `path/to/other.ts` — what it does
## How to run / test it locally
<bullet list of commands. Empty section if N/A do not fake.>
## Gotchas
<known sharp edges. "If you see X, that's why." Empty if N/A.>

157
.github/workflows/build-agent.yml vendored Normal file
View File

@@ -0,0 +1,157 @@
name: build-agent
on:
workflow_dispatch:
inputs:
agent:
description: "Agent name from bundle.json"
required: true
type: string
default: openclaw
publish:
description: "Upload to R2 and merge manifest slice"
required: false
default: false
type: boolean
pull_request:
paths:
- "packages/browseros-agent/packages/build-tools/**"
- ".github/workflows/build-agent.yml"
env:
BUN_VERSION: "1.3.6"
PKG_DIR: packages/browseros-agent/packages/build-tools
permissions:
contents: read
jobs:
check:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v2
with:
bun-version: ${{ env.BUN_VERSION }}
- working-directory: packages/browseros-agent
run: bun install --frozen-lockfile
- working-directory: packages/browseros-agent
run: bun run --filter @browseros/build-tools typecheck
- working-directory: packages/browseros-agent
run: bun run --filter @browseros/build-tools test
build:
needs: check
strategy:
fail-fast: false
matrix:
include:
- arch: arm64
runner: ubuntu-24.04-arm
runs-on: ${{ matrix.runner }}
steps:
- uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v2
with:
bun-version: ${{ env.BUN_VERSION }}
- name: Install podman
run: |
sudo apt-get update
sudo apt-get install -y podman
- working-directory: packages/browseros-agent
run: bun install --frozen-lockfile
- name: Build tarball
working-directory: ${{ env.PKG_DIR }}
env:
AGENT: ${{ inputs.agent || 'openclaw' }}
OUT: ${{ github.workspace }}/dist/images
run: bun run build:tarball -- --agent "$AGENT" --arch "${{ matrix.arch }}" --output-dir "$OUT"
- uses: actions/upload-artifact@v4
with:
name: tarball-${{ inputs.agent || 'openclaw' }}-${{ matrix.arch }}
path: dist/images/
retention-days: 7
smoke:
needs: build
runs-on: ubuntu-24.04-arm
steps:
- uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v2
with:
bun-version: ${{ env.BUN_VERSION }}
- uses: actions/download-artifact@v4
with:
name: tarball-${{ inputs.agent || 'openclaw' }}-arm64
path: dist/images
- name: Install podman
run: |
sudo apt-get update
sudo apt-get install -y podman
- working-directory: packages/browseros-agent
run: bun install --frozen-lockfile
- name: Smoke test tarball
working-directory: ${{ env.PKG_DIR }}
env:
AGENT: ${{ inputs.agent || 'openclaw' }}
run: |
set -euo pipefail
tarball="$(find "$GITHUB_WORKSPACE/dist/images" -name "${AGENT}-*-arm64.tar.gz" -print -quit)"
if [ -z "$tarball" ]; then
echo "missing arm64 tarball artifact for ${AGENT}" >&2
exit 1
fi
bun run smoke:tarball -- --agent "$AGENT" --arch arm64 --tarball "$tarball"
publish:
needs: [build, smoke]
if: ${{ github.event_name == 'workflow_dispatch' && inputs.publish == true }}
runs-on: ubuntu-24.04
environment: release
concurrency:
group: r2-manifest-publish
cancel-in-progress: false
steps:
- uses: actions/checkout@v4
- uses: oven-sh/setup-bun@v2
with:
bun-version: ${{ env.BUN_VERSION }}
- uses: actions/download-artifact@v4
with:
pattern: tarball-*
path: dist/images
merge-multiple: true
- working-directory: packages/browseros-agent
run: bun install --frozen-lockfile
- name: Upload tarballs to R2
working-directory: ${{ env.PKG_DIR }}
env:
R2_ACCOUNT_ID: ${{ secrets.R2_ACCOUNT_ID }}
R2_ACCESS_KEY_ID: ${{ secrets.R2_ACCESS_KEY_ID }}
R2_SECRET_ACCESS_KEY: ${{ secrets.R2_SECRET_ACCESS_KEY }}
R2_BUCKET: ${{ secrets.R2_BUCKET }}
run: |
set -euo pipefail
for file in "$GITHUB_WORKSPACE"/dist/images/*.tar.gz; do
base="$(basename "$file")"
bun run upload -- --file "$file" --key "vm/images/$base" --content-type "application/gzip" --sidecar-sha
done
- name: Merge agent slice into manifest
working-directory: ${{ env.PKG_DIR }}
env:
AGENT: ${{ inputs.agent || 'openclaw' }}
R2_ACCOUNT_ID: ${{ secrets.R2_ACCOUNT_ID }}
R2_ACCESS_KEY_ID: ${{ secrets.R2_ACCESS_KEY_ID }}
R2_SECRET_ACCESS_KEY: ${{ secrets.R2_SECRET_ACCESS_KEY }}
R2_BUCKET: ${{ secrets.R2_BUCKET }}
run: |
set -euo pipefail
mkdir -p dist/images
cp -R "$GITHUB_WORKSPACE"/dist/images/* dist/images/
bun run download -- --key vm/manifest.json --out dist/baseline-manifest.json
bun run emit-manifest -- \
--slice "agents:${AGENT}" \
--dist-dir dist \
--merge-from dist/baseline-manifest.json \
--out dist/manifest.json
bun run upload -- --file dist/manifest.json --key vm/manifest.json --content-type "application/json"

View File

@@ -14,7 +14,7 @@ on:
config:
description: 'Eval config file (relative to apps/eval/)'
required: false
default: 'configs/legacy/browseros-agent-weekly.json'
default: 'configs/browseros-agent-weekly.json'
permissions:
contents: read
@@ -42,12 +42,10 @@ jobs:
- name: Install dependencies
working-directory: packages/browseros-agent
run: bun install --ignore-scripts
run: bun install --ignore-scripts && bun run build:agent-sdk
- name: Install Python eval dependencies
# agisdk pinned so silent upstream releases can't shift task definitions
# or grader behavior. Bump intentionally with a documented re-baseline.
run: pip install agisdk==0.3.5 requests
run: pip install agisdk requests
- name: Clone WebArena-Infinity
run: git clone --depth 1 https://github.com/web-arena-x/webarena-infinity.git /tmp/webarena-infinity
@@ -62,27 +60,33 @@ jobs:
curl -sL -o /tmp/nopecha.zip https://github.com/NopeCHALLC/nopecha-extension/releases/latest/download/chromium_automation.zip
unzip -qo /tmp/nopecha.zip -d extensions/nopecha
- name: Run eval and publish to R2
- name: Run eval
working-directory: packages/browseros-agent/apps/eval
env:
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
NOPECHA_API_KEY: ${{ secrets.NOPECHA_API_KEY }}
BROWSEROS_BINARY: /usr/bin/browseros
WEBARENA_INFINITY_DIR: /tmp/webarena-infinity
EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/browseros-agent-weekly.json' }}
run: |
echo "Running eval with config: $EVAL_CONFIG"
xvfb-run --auto-servernum --server-args="-screen 0 1440x900x24" bun run src/index.ts -c "$EVAL_CONFIG"
- name: Upload runs to R2
if: success()
working-directory: packages/browseros-agent/apps/eval
env:
EVAL_R2_ACCOUNT_ID: ${{ secrets.EVAL_R2_ACCOUNT_ID }}
EVAL_R2_ACCESS_KEY_ID: ${{ secrets.EVAL_R2_ACCESS_KEY_ID }}
EVAL_R2_SECRET_ACCESS_KEY: ${{ secrets.EVAL_R2_SECRET_ACCESS_KEY }}
EVAL_R2_BUCKET: ${{ secrets.EVAL_R2_BUCKET }}
EVAL_R2_CDN_BASE_URL: ${{ secrets.EVAL_R2_CDN_BASE_URL }}
BROWSEROS_BINARY: /usr/bin/browseros
WEBARENA_INFINITY_DIR: /tmp/webarena-infinity
# OpenClaw container runtime is macOS-only; opt the Linux runner
# into the no-op stub so the server can boot and the eval can run.
BROWSEROS_SKIP_OPENCLAW: '1'
EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/legacy/browseros-agent-weekly.json' }}
EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/browseros-agent-weekly.json' }}
run: |
echo "Running eval with config: $EVAL_CONFIG"
xvfb-run --auto-servernum --server-args="-screen 0 1440x900x24" bun run src/index.ts suite --config "$EVAL_CONFIG" --publish r2
CONFIG_NAME=$(basename "$EVAL_CONFIG" .json)
bun scripts/upload-run.ts "results/$CONFIG_NAME"
- name: Generate trend report
if: success()
@@ -103,11 +107,3 @@ jobs:
with:
name: eval-report-${{ github.run_id }}
path: /tmp/eval-report.html
- name: Upload server stderr logs (for post-mortem on startup failures)
if: always()
uses: actions/upload-artifact@v4
with:
name: browseros-server-logs-${{ github.run_id }}
path: /tmp/browseros-server-logs/
if-no-files-found: ignore

View File

@@ -1,11 +1,168 @@
name: Release BrowserOS Agent SDK (disabled)
name: Release BrowserOS Agent SDK
on:
workflow_dispatch:
concurrency:
group: release-agent-sdk
cancel-in-progress: false
jobs:
disabled:
if: ${{ false }}
publish:
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
defaults:
run:
working-directory: packages/browseros-agent/packages/agent-sdk
steps:
- run: echo "Agent SDK publishing is disabled."
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: oven-sh/setup-bun@v2
- uses: actions/setup-node@v6
with:
node-version: "20"
registry-url: "https://registry.npmjs.org"
- name: Install dependencies
run: bun ci
working-directory: packages/browseros-agent
- name: Build
run: bun run build
- name: Test
run: bun test
- name: Get version
id: version
run: |
echo "version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"
echo "release_sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"
- name: Generate release notes
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
SDK_PATH="packages/browseros-agent/packages/agent-sdk"
CURRENT_TAG="agent-sdk-v${{ steps.version.outputs.version }}"
# Find the previous tag, excluding the current version's tag
# (which may already exist from a prior failed run)
PREV_TAG=$(git tag -l "agent-sdk-v*" --sort=-v:refname | grep -v "^${CURRENT_TAG}$" | head -n 1)
if [ -z "$PREV_TAG" ]; then
echo "Initial release" > /tmp/release-notes.md
else
# Get commits scoped to the SDK directory
COMMITS=$(git log "$PREV_TAG"..HEAD --pretty=format:"%H" -- "$SDK_PATH")
if [ -z "$COMMITS" ]; then
echo "No notable changes." > /tmp/release-notes.md
else
echo "## What's Changed" > /tmp/release-notes.md
echo "" >> /tmp/release-notes.md
# For each commit, find the associated PR and format with author
CONTRIBUTORS=""
while IFS= read -r SHA; do
# Get commit subject and author
SUBJECT=$(git log -1 --pretty=format:"%s" "$SHA")
AUTHOR=$(git log -1 --pretty=format:"%an" "$SHA")
GITHUB_USER=$(gh api "/repos/${{ github.repository }}/commits/${SHA}" --jq '.author.login // empty' 2>/dev/null)
# Find associated PR number
PR_NUM=$(gh api "/repos/${{ github.repository }}/commits/${SHA}/pulls" --jq '.[0].number // empty' 2>/dev/null)
# Format line: skip PR number if already in the commit subject
# (squash merges include "(#123)" in the subject automatically)
if [ -n "$PR_NUM" ] && ! echo "$SUBJECT" | grep -qF "(#${PR_NUM})"; then
echo "- ${SUBJECT} (#${PR_NUM})" >> /tmp/release-notes.md
else
echo "- ${SUBJECT}" >> /tmp/release-notes.md
fi
done <<< "$COMMITS"
fi
fi
working-directory: ${{ github.workspace }}
- name: Publish
run: npm publish --access public
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
- name: Create GitHub release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
TAG="agent-sdk-v${{ steps.version.outputs.version }}"
RELEASE_SHA="${{ steps.version.outputs.release_sha }}"
TITLE="BrowserOS Agent SDK - v${{ steps.version.outputs.version }}"
# Create or reuse tag (idempotent for re-runs)
if git rev-parse "$TAG" >/dev/null 2>&1; then
echo "Tag $TAG already exists, skipping tag creation"
else
git tag "$TAG" "$RELEASE_SHA"
fi
# Push tag (skip if already on remote)
if git ls-remote --tags origin "$TAG" | grep -q "$TAG"; then
echo "Tag $TAG already on remote, skipping push"
else
git push origin "$TAG"
fi
# Create or update release
if gh release view "$TAG" >/dev/null 2>&1; then
echo "Release $TAG already exists, updating"
gh release edit "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
else
gh release create "$TAG" --title "$TITLE" --notes-file /tmp/release-notes.md
fi
working-directory: ${{ github.workspace }}
- name: Update CHANGELOG.md via PR
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
VERSION="${{ steps.version.outputs.version }}"
DATE=$(date -u +"%Y-%m-%d")
BRANCH="docs/agent-sdk-changelog-v${VERSION}"
CHANGELOG="packages/browseros-agent/packages/agent-sdk/CHANGELOG.md"
# Return to main before branching
git checkout main
# Use head/tail to safely insert without sed quoting issues
{
head -n 1 "$CHANGELOG"
echo ""
echo "## v${VERSION} (${DATE})"
echo ""
cat /tmp/release-notes.md
echo ""
tail -n +2 "$CHANGELOG"
} > /tmp/new-changelog.md
mv /tmp/new-changelog.md "$CHANGELOG"
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git checkout -b "$BRANCH"
git add "$CHANGELOG"
git commit -m "docs: update agent-sdk changelog for v${VERSION}"
git push origin "$BRANCH"
gh pr create \
--title "docs: update agent-sdk changelog for v${VERSION}" \
--body "Auto-generated changelog update for BrowserOS Agent SDK v${VERSION}." \
--base main \
--head "$BRANCH"
gh pr merge "$BRANCH" --squash --auto || true
working-directory: ${{ github.workspace }}

View File

@@ -1,53 +0,0 @@
name: Sync internal-docs submodule
on:
schedule:
- cron: '0 */4 * * *'
workflow_dispatch:
jobs:
sync:
name: Bump internal-docs submodule pointer on dev
runs-on: ubuntu-latest
steps:
- name: Rewrite SSH submodule URL to HTTPS-with-token
env:
TOKEN: ${{ secrets.INTERNAL_DOCS_SYNC_TOKEN }}
run: |
git config --global "url.https://x-access-token:${TOKEN}@github.com/.insteadOf" "git@github.com:"
- uses: actions/checkout@v4
with:
token: ${{ secrets.INTERNAL_DOCS_SYNC_TOKEN }}
submodules: true
ref: dev
fetch-depth: 50
- name: Bump submodule pointer if internal-docs has new commits
env:
GH_TOKEN: ${{ secrets.INTERNAL_DOCS_SYNC_TOKEN }}
run: |
set -e
# Skip if submodule not yet configured (handoff window before someone adds it)
if ! git config --file .gitmodules --get-regexp '^submodule\..internal-docs\.path$' >/dev/null 2>&1; then
echo "internal-docs submodule not yet configured in .gitmodules. Skipping."
exit 0
fi
git submodule update --remote --merge .internal-docs
if git diff --quiet .internal-docs; then
echo "No internal-docs changes to sync."
exit 0
fi
git config user.name "browseros-bot"
git config user.email "bot@browseros.ai"
git add .internal-docs
git commit -m "chore: sync internal-docs submodule"
# Rebase onto latest dev to absorb any commits that landed during the run,
# then push. set -e takes care of failing the run on rebase conflict.
git pull --rebase origin dev
git push origin dev

View File

@@ -58,20 +58,28 @@ jobs:
command: (cd apps/server && bun run test:lib)
junit_path: test-results/server-lib.xml
needs_browser: false
- suite: server-sdk
command: (cd apps/server && bun run test:sdk)
junit_path: test-results/server-sdk.xml
needs_browser: true
- suite: server-root
command: (cd apps/server && bun run test:root)
junit_path: test-results/server-root.xml
needs_browser: false
- suite: agent
command: (cd apps/agent && bun run test)
command: bun run test:agent
junit_path: test-results/agent.xml
needs_browser: false
- suite: eval
command: (cd apps/eval && bun run test)
command: bun run test:eval
junit_path: test-results/eval.xml
needs_browser: false
- suite: agent-sdk
command: bun run test:agent-sdk
junit_path: test-results/agent-sdk.xml
needs_browser: false
- suite: build
command: bun run ./scripts/run-bun-test.ts ./scripts/build
command: bun run test:build
junit_path: test-results/build.xml
needs_browser: false

View File

@@ -188,21 +188,6 @@ We'd love your help making BrowserOS better! See our [Contributing Guide](CONTRI
- [ungoogled-chromium](https://github.com/ungoogled-software/ungoogled-chromium) — BrowserOS uses some patches for enhanced privacy. Thanks to everyone behind this project!
- [The Chromium Project](https://www.chromium.org/) — at the core of BrowserOS, making it possible to exist in the first place.
## Citation
If you use BrowserOS in your research or project, please cite:
```bibtex
@software{browseros2025,
author = {Nithin Sonti and Nikhil Sonti and {BrowserOS-team}},
title = {BrowserOS: The open-source Agentic browser},
url = {https://github.com/browseros-ai/BrowserOS},
year = {2025},
publisher = {GitHub},
license = {AGPL-3.0},
}
```
## License
BrowserOS is open source under the [AGPL-3.0 license](LICENSE).

View File

@@ -1,6 +1,6 @@
# BrowserOS Agent
The agent platform powering [BrowserOS](https://github.com/browseros-ai/BrowserOS) — contains the MCP server, agent UI, CLI, and evaluation framework.
The agent platform powering [BrowserOS](https://github.com/browseros-ai/BrowserOS) — contains the MCP server, agent UI, CLI, evaluation framework, and SDK.
## Monorepo Structure
@@ -12,6 +12,7 @@ apps/
eval/ # Evaluation framework for benchmarking agents
packages/
agent-sdk/ # Node.js SDK (@browseros-ai/agent-sdk)
cdp-protocol/ # Type-safe Chrome DevTools Protocol bindings
shared/ # Shared constants (ports, timeouts, limits)
```
@@ -22,6 +23,7 @@ packages/
| `apps/agent` | Agent UI — Chrome extension for the chat interface |
| `apps/cli` | Go CLI — control BrowserOS from the terminal or AI coding agents |
| `apps/eval` | Benchmark framework — WebVoyager, Mind2Web evaluation |
| `packages/agent-sdk` | Node.js SDK for browser automation with natural language |
| `packages/cdp-protocol` | Auto-generated CDP type bindings used by the server |
| `packages/shared` | Shared constants used across packages |
@@ -79,15 +81,14 @@ cp apps/server/.env.example apps/server/.env.development
cp apps/agent/.env.example apps/agent/.env.development
cp apps/server/.env.production.example apps/server/.env.production
# Install deps and generate agent code
# Install deps, generate agent code, and sync the VM cache
bun run dev:setup
# Start the full dev environment
bun run dev:watch
```
`dev:watch` starts the server immediately. OpenClaw VM/image prewarm runs from
the server startup path and pulls the configured GHCR image on demand.
`dev:watch` exits when the VM cache manifest is missing, but setup stays in `dev:setup`.
### Environment Variables
@@ -157,14 +158,9 @@ bun run build:server # Build production server resource artifacts and u
bun run build:agent # Build agent extension
# Test
bun run test # Run all tests
bun run test:all # Run all tests
bun run test:main # Run key server tools and integration tests
# App-specific test groups (from packages/browseros-agent)
cd apps/server && bun run test:tools
cd apps/server && bun run test:cdp
cd apps/server && bun run test:integration
bun run test # Run standard tests
bun run test:cdp # Run CDP-based tests
bun run test:integration # Run integration tests
# Quality
bun run lint # Check with Biome

View File

@@ -1,50 +0,0 @@
import type { Provider } from './chatComponentTypes'
export interface ProviderOptionGroup {
key: 'llm' | 'acp'
label: string
options: Provider[]
}
export function groupProviderOptions(
providers: Provider[],
): ProviderOptionGroup[] {
const llm = providers.filter((provider) => provider.kind !== 'acp')
const acp = providers.filter((provider) => provider.kind === 'acp')
return [
...(llm.length
? [{ key: 'llm' as const, label: 'AI Providers', options: llm }]
: []),
...(acp.length
? [{ key: 'acp' as const, label: 'Agents', options: acp }]
: []),
]
}
export function getProviderSearchValue(
provider: Provider,
groupLabel: string,
): string {
return [
provider.id,
provider.name,
provider.type,
groupLabel,
provider.adapterName,
provider.modelLabel,
]
.filter(Boolean)
.join(' ')
}
export function getProviderSubtitle(provider: Provider): string | undefined {
if (provider.kind !== 'acp') return undefined
return [
provider.adapterName,
provider.modelLabel,
provider.modelControl === 'best-effort' ? 'best effort' : undefined,
]
.filter(Boolean)
.join(' · ')
}

View File

@@ -1,72 +0,0 @@
import { describe, expect, it } from 'bun:test'
import {
getProviderSearchValue,
getProviderSubtitle,
groupProviderOptions,
} from './ChatProviderSelector.helpers'
import type { Provider } from './chatComponentTypes'
const options: Provider[] = [
{ kind: 'llm', id: 'browseros', name: 'BrowserOS', type: 'browseros' },
{
kind: 'llm',
id: 'anthropic-sonnet',
name: 'Anthropic Sonnet',
type: 'anthropic',
},
{
kind: 'acp',
id: 'agent-claude-review',
name: 'Review Bot',
type: 'acp',
adapterName: 'Claude Code',
modelLabel: 'Haiku',
modelControl: 'best-effort',
},
{
kind: 'acp',
id: 'agent-codex-browser',
name: 'Browser Driver',
type: 'acp',
adapterName: 'Codex',
modelLabel: 'GPT-5.5',
modelControl: 'runtime-supported',
},
]
describe('groupProviderOptions', () => {
it('groups normal providers separately from created agents', () => {
expect(groupProviderOptions(options)).toEqual([
{
key: 'llm',
label: 'AI Providers',
options: [options[0], options[1]],
},
{
key: 'acp',
label: 'Agents',
options: [options[2], options[3]],
},
])
})
})
describe('getProviderSearchValue', () => {
it('matches created-agent group labels and item labels', () => {
expect(getProviderSearchValue(options[2], 'Agents')).toContain('Agents')
expect(getProviderSearchValue(options[2], 'Agents')).toContain('Review Bot')
expect(getProviderSearchValue(options[2], 'Agents')).toContain(
'Claude Code',
)
})
})
describe('getProviderSubtitle', () => {
it('describes created-agent runtime context without model-target copy', () => {
expect(getProviderSubtitle(options[2])).toBe(
'Claude Code · Haiku · best effort',
)
expect(getProviderSubtitle(options[3])).toBe('Codex · GPT-5.5')
expect(getProviderSubtitle(options[0])).toBeUndefined()
})
})

View File

@@ -1,4 +1,4 @@
import { Bot, Check, Plus } from 'lucide-react'
import { Check, Plus } from 'lucide-react'
import type { FC, PropsWithChildren } from 'react'
import { useState } from 'react'
import {
@@ -17,11 +17,6 @@ import {
import { BrowserOSIcon, ProviderIcon } from '@/lib/llm-providers/providerIcons'
import type { ProviderType } from '@/lib/llm-providers/types'
import { cn } from '@/lib/utils'
import {
getProviderSearchValue,
getProviderSubtitle,
groupProviderOptions,
} from './ChatProviderSelector.helpers'
import type { Provider } from './chatComponentTypes'
interface ChatProviderSelectorProps {
@@ -34,58 +29,54 @@ export const ChatProviderSelector: FC<
PropsWithChildren<ChatProviderSelectorProps>
> = ({ children, providers, selectedProvider, onSelectProvider }) => {
const [open, setOpen] = useState(false)
const groups = groupProviderOptions(providers)
return (
<Popover open={open} onOpenChange={setOpen}>
<PopoverTrigger asChild>{children}</PopoverTrigger>
<PopoverContent side="bottom" align="start" className="w-64 p-0">
<PopoverContent side="bottom" align="start" className="w-48 p-0">
<Command>
<CommandInput
placeholder="Search providers or agents..."
className="h-9"
/>
<CommandInput placeholder="Search providers..." className="h-9" />
<CommandList>
<div className="my-2 px-2 font-semibold text-muted-foreground text-xs uppercase tracking-wide">
AI Provider
</div>
<CommandEmpty>No provider found</CommandEmpty>
{groups.map((group) => (
<CommandGroup key={group.key} heading={group.label}>
{group.options.map((provider) => {
const isSelected = selectedProvider.id === provider.id
const subtitle = getProviderSubtitle(provider)
return (
<CommandItem
key={provider.id}
value={getProviderSearchValue(provider, group.label)}
onSelect={() => {
onSelectProvider(provider)
setOpen(false)
}}
className={cn(
'flex w-full items-center gap-3 rounded-md p-2 transition-colors',
isSelected && 'bg-[var(--accent-orange)]/10',
<CommandGroup>
{providers.map((provider) => {
const isSelected = selectedProvider.id === provider.id
return (
<CommandItem
key={provider.id}
value={`${provider.id} ${provider.name}`}
onSelect={() => {
onSelectProvider(provider)
setOpen(false)
}}
className={cn(
'flex w-full items-center gap-3 rounded-md p-2 transition-colors',
isSelected && 'bg-[var(--accent-orange)]/10',
)}
>
<span className="text-muted-foreground">
{provider.type === 'browseros' ? (
<BrowserOSIcon size={18} />
) : (
<ProviderIcon
type={provider.type as ProviderType}
size={18}
/>
)}
>
<span className="text-muted-foreground">
<ProviderOptionIcon provider={provider} />
</span>
<span className="min-w-0 flex-1 text-left">
<span className="block truncate text-sm">
{provider.name}
</span>
{subtitle && (
<span className="block truncate text-muted-foreground text-xs">
{subtitle}
</span>
)}
</span>
{isSelected && (
<Check className="h-3.5 w-3.5 text-[var(--accent-orange)]" />
)}
</CommandItem>
)
})}
</CommandGroup>
))}
</span>
<span className="flex-1 text-left text-sm">
{provider.name}
</span>
{isSelected && (
<Check className="h-3.5 w-3.5 text-[var(--accent-orange)]" />
)}
</CommandItem>
)
})}
</CommandGroup>
<div className="border-border border-t p-1">
<button
type="button"
@@ -105,9 +96,3 @@ export const ChatProviderSelector: FC<
</Popover>
)
}
function ProviderOptionIcon({ provider }: { provider: Provider }) {
if (provider.kind === 'acp') return <Bot size={18} />
if (provider.type === 'browseros') return <BrowserOSIcon size={18} />
return <ProviderIcon type={provider.type as ProviderType} size={18} />
}

View File

@@ -1,14 +1,7 @@
import type { ProviderType } from '@/lib/llm-providers/types'
export type ChatProviderType = ProviderType | 'acp'
export interface Provider {
id: string
name: string
type: ChatProviderType
kind: 'llm' | 'acp'
agentId?: string
adapterName?: string
modelLabel?: string
modelControl?: 'runtime-supported' | 'best-effort'
type: ProviderType
}

View File

@@ -0,0 +1,136 @@
import { Bot, Loader2, Wrench } from 'lucide-react'
import type { FC } from 'react'
import type { AgentCardData } from '@/lib/agent-conversations/types'
import { cn } from '@/lib/utils'
interface AgentCardProps {
agent: AgentCardData
onClick: () => void
active?: boolean
}
function formatTimestamp(timestamp?: number): string {
if (!timestamp) return 'No activity yet'
const diff = Date.now() - timestamp
const minutes = Math.floor(diff / 60000)
if (minutes < 1) return 'just now'
if (minutes < 60) return `${minutes}m ago`
const hours = Math.floor(minutes / 60)
if (hours < 24) return `${hours}h ago`
return `${Math.floor(hours / 24)}d ago`
}
function getStatusLabel(status: AgentCardData['status']): string {
if (status === 'working') return 'Working'
if (status === 'error') return 'Error'
return 'Ready'
}
function getStatusTone(status: AgentCardData['status']): string {
if (status === 'working') return 'bg-amber-500'
if (status === 'error') return 'bg-destructive'
return 'bg-emerald-500'
}
function formatCost(usd: number): string {
if (usd < 0.005) return `$${usd.toFixed(4)}`
return `$${usd.toFixed(2)}`
}
export const AgentCardExpanded: FC<AgentCardProps> = ({
agent,
onClick,
active,
}) => (
<button
type="button"
onClick={onClick}
className={cn(
'group flex min-h-32 w-full min-w-0 flex-col rounded-2xl border p-4 text-left shadow-sm transition-all duration-200',
active
? 'border-border/80 bg-card shadow-md ring-1 ring-[var(--accent-orange)]/20'
: 'border-border/60 bg-card/85 hover:border-border hover:bg-card hover:shadow-md',
)}
>
<div className="flex items-start justify-between gap-3">
<div className="flex min-w-0 items-center gap-3">
<div
className={cn(
'flex size-10 shrink-0 items-center justify-center rounded-xl',
active
? 'bg-[var(--accent-orange)]/10 text-[var(--accent-orange)]'
: 'bg-muted text-muted-foreground',
)}
>
<Bot className="size-5" />
</div>
<div className="min-w-0">
<div className="truncate font-semibold text-sm">{agent.name}</div>
<div className="truncate text-muted-foreground text-xs">
{agent.model ?? 'OpenClaw agent'}
</div>
</div>
</div>
<div className="flex items-center gap-2 rounded-full border border-border/60 bg-background/70 px-2.5 py-1 text-[11px] text-muted-foreground">
<span
className={cn('size-2 rounded-full', getStatusTone(agent.status))}
/>
<span>{getStatusLabel(agent.status)}</span>
</div>
</div>
<div className="mt-4 flex-1">
<p className="line-clamp-2 text-foreground/90 text-sm">
{agent.lastMessage ??
'Start a conversation to see recent work and summaries.'}
</p>
</div>
<div className="mt-4 space-y-1.5 text-muted-foreground text-xs">
<div className="flex items-center justify-between gap-3">
<span>{formatTimestamp(agent.lastMessageTimestamp)}</span>
{agent.costUsd ? (
<span className="tabular-nums opacity-70">
{formatCost(agent.costUsd)}
</span>
) : null}
</div>
{agent.status === 'working' && agent.currentTool ? (
<div className="flex items-center gap-1.5 text-[var(--accent-orange)]/70">
<Loader2 className="size-3 shrink-0 animate-spin" />
<span className="truncate">{agent.currentTool}</span>
</div>
) : agent.activitySummary ? (
<div className="flex items-center gap-1.5 text-muted-foreground/60">
<Wrench className="size-3 shrink-0" />
<span className="truncate">{agent.activitySummary}</span>
</div>
) : null}
</div>
</button>
)
export const AgentCardCompact: FC<AgentCardProps> = ({
agent,
onClick,
active,
}) => (
<button
type="button"
onClick={onClick}
className={cn(
'inline-flex items-center gap-2 rounded-full border px-3 py-2 text-sm transition-colors',
active
? 'border-border bg-card shadow-sm ring-1 ring-[var(--accent-orange)]/20'
: 'border-border/60 bg-card/85 text-foreground hover:border-border hover:bg-card',
)}
>
<span
className={cn(
'size-2 rounded-full',
active ? 'bg-[var(--accent-orange)]' : getStatusTone(agent.status),
)}
/>
<span className="truncate">{agent.name}</span>
</button>
)

View File

@@ -1,71 +1,70 @@
import { Plus } from 'lucide-react'
import type { FC } from 'react'
import type {
HarnessAdapterDescriptor,
HarnessAdapterHealth,
HarnessAgent,
HarnessAgentAdapter,
} from '@/entrypoints/app/agents/agent-harness-types'
import type { AgentCardData } from '@/lib/agent-conversations/types'
import { cn } from '@/lib/utils'
import { HomeAgentCard } from './HomeAgentCard'
import { AgentCardCompact, AgentCardExpanded } from './AgentCard'
interface AgentCardDockProps {
agents: HarnessAgent[]
adapters: HarnessAdapterDescriptor[]
agents: AgentCardData[]
activeAgentId?: string
onSelectAgent: (agentId: string) => void
onCreateAgent?: () => void
compact?: boolean
}
function CreateAgentButton({ onCreateAgent }: { onCreateAgent: () => void }) {
function CreateAgentButton({
compact,
onCreateAgent,
}: {
compact?: boolean
onCreateAgent: () => void
}) {
return (
<button
type="button"
onClick={onCreateAgent}
className={cn(
'flex min-h-32 shrink-0 items-center justify-center gap-2 rounded-2xl border border-dashed px-5 py-4 text-muted-foreground transition-colors',
'hover:border-[var(--accent-orange)] hover:text-[var(--accent-orange)]',
'flex shrink-0 items-center justify-center gap-2 border border-dashed text-muted-foreground transition-colors hover:border-[var(--accent-orange)] hover:text-[var(--accent-orange)]',
compact
? 'rounded-full px-3 py-2 text-sm'
: 'min-h-32 rounded-2xl px-5 py-4',
)}
>
<Plus className="size-5" />
<span>Create agent</span>
<Plus className={compact ? 'size-3.5' : 'size-5'} />
<span>{compact ? 'New' : 'Create agent'}</span>
</button>
)
}
/**
* 3-column grid of HomeAgentCards plus a trailing "Create agent"
* tile. The previous `compact` mode (rendered a horizontal pill rail)
* had no callers and was dropped along with the legacy AgentCard.
*/
export const AgentCardDock: FC<AgentCardDockProps> = ({
agents,
adapters,
activeAgentId,
onSelectAgent,
onCreateAgent,
compact,
}) => {
if (agents.length === 0 && !onCreateAgent) return null
const adapterHealth = new Map<HarnessAgentAdapter, HarnessAdapterHealth>()
for (const descriptor of adapters) {
if (descriptor.health) adapterHealth.set(descriptor.id, descriptor.health)
}
const Card = compact ? AgentCardCompact : AgentCardExpanded
return (
<div className="grid gap-4 md:grid-cols-3">
<div
className={cn(
compact
? 'flex items-center gap-2 overflow-x-auto pb-1'
: 'grid gap-4 md:grid-cols-3',
)}
>
{agents.map((agent) => (
<HomeAgentCard
key={agent.id}
<Card
key={agent.agentId}
agent={agent}
adapter={agent.adapter}
adapterHealth={adapterHealth.get(agent.adapter) ?? null}
active={agent.id === activeAgentId}
onClick={() => onSelectAgent(agent.id)}
active={agent.agentId === activeAgentId}
onClick={() => onSelectAgent(agent.agentId)}
/>
))}
{onCreateAgent ? (
<CreateAgentButton onCreateAgent={onCreateAgent} />
<CreateAgentButton compact={compact} onCreateAgent={onCreateAgent} />
) : null}
</div>
)

View File

@@ -1,13 +1,7 @@
import { ArrowLeft, Bot, Home } from 'lucide-react'
import { type FC, useEffect, useMemo, useRef } from 'react'
import { type FC, useEffect, useMemo, useRef, useState } from 'react'
import { Navigate, useNavigate, useParams, useSearchParams } from 'react-router'
import { Button } from '@/components/ui/button'
import {
cancelHarnessTurn,
useEnqueueHarnessMessage,
useHarnessAgents,
useRemoveHarnessQueuedMessage,
} from '@/entrypoints/app/agents/useAgents'
import {
type AgentEntry,
getModelDisplayName,
@@ -21,9 +15,10 @@ import {
filterTurnsPersistedInHistory,
flattenHistoryPages,
} from './claw-chat-types'
import { QueuePanel } from './QueuePanel'
import { useAgentConversation } from './useAgentConversation'
import { useClawChatHistory } from './useClawChatHistory'
import { useHarnessChatHistory } from './useHarnessChatHistory'
import { useOutboundQueue } from './useOutboundQueue'
function StatusBadge({ status }: { status: string }) {
return (
@@ -181,10 +176,19 @@ function getAgentEntryMeta(agent: AgentEntry | undefined): string {
return getModelDisplayName(agent?.model) ?? 'OpenClaw agent'
}
function getConversationStatusCopy(status: string | undefined): string {
if (status === 'running') return 'Ready'
if (status === 'starting') return 'Connecting'
if (status === 'error') return 'Attention'
if (status === 'stopped') return 'Offline'
return 'Setup'
}
function AgentConversationController({
agentId,
initialMessage,
onInitialMessageConsumed,
status,
agents,
agentPathPrefix,
createAgentPath,
@@ -192,6 +196,7 @@ function AgentConversationController({
agentId: string
initialMessage: string | null
onInitialMessageConsumed: () => void
status: ReturnType<typeof useAgentCommandData>['status']
agents: AgentEntry[]
agentPathPrefix: string
createAgentPath: string
@@ -199,67 +204,121 @@ function AgentConversationController({
const navigate = useNavigate()
const initialMessageSentRef = useRef<string | null>(null)
const onInitialMessageConsumedRef = useRef(onInitialMessageConsumed)
const [streamSessionKey, setStreamSessionKey] = useState<string | null>(null)
const agent = agents.find((entry) => entry.agentId === agentId)
const agentName = agent?.name || agentId || 'Agent'
// Routing is now harness-only. Every OpenClaw agent has a harness
// record post the gateway → harness backfill, so the chat panel
// always talks to /agents/<id>/chat. The legacy ClawChat surface
// was deleted with the /claw/agents/:id/chat server route.
const harnessHistoryQuery = useHarnessChatHistory(agentId, Boolean(agent))
const isAgentHarnessAgent = agent?.source === 'agent-harness'
const clawHistoryQuery = useClawChatHistory({
agentId,
sessionKey: streamSessionKey,
enabled: Boolean(agent) && !isAgentHarnessAgent,
})
const harnessHistoryQuery = useHarnessChatHistory(
agentId,
Boolean(agent) && isAgentHarnessAgent,
)
const historyMessages = useMemo(
() =>
flattenHistoryPages(
harnessHistoryQuery.data ? [harnessHistoryQuery.data] : [],
isAgentHarnessAgent
? harnessHistoryQuery.data
? [harnessHistoryQuery.data]
: []
: (clawHistoryQuery.data?.pages ?? []),
),
[harnessHistoryQuery.data],
[
clawHistoryQuery.data?.pages,
harnessHistoryQuery.data,
isAgentHarnessAgent,
],
)
const chatHistory = useMemo(
() => buildChatHistoryFromClawMessages(historyMessages),
[historyMessages],
)
// Listing query feeds queue + active-turn state for this agent. We
// already poll it every 5s for the rail; reusing the same cache
// keeps cross-tab queue state in sync without a second poll.
const { harnessAgents } = useHarnessAgents()
const harnessAgent = harnessAgents.find((entry) => entry.id === agentId)
const queue = harnessAgent?.queue ?? []
const activeTurnId = harnessAgent?.activeTurnId ?? null
const resolvedSessionKey =
streamSessionKey ??
(isAgentHarnessAgent
? null
: (clawHistoryQuery.data?.pages?.[0]?.sessionKey ?? null))
const { turns, streaming, send } = useAgentConversation(agentId, {
runtime: 'agent-harness',
sessionKey: null,
runtime: isAgentHarnessAgent ? 'agent-harness' : 'openclaw',
sessionKey: resolvedSessionKey,
history: chatHistory,
activeTurnId,
onComplete: () => {
void harnessHistoryQuery.refetch()
if (isAgentHarnessAgent) {
void harnessHistoryQuery.refetch()
}
},
onSessionKeyChange: (sessionKey) => {
setStreamSessionKey(sessionKey)
},
onSessionKeyChange: () => {},
})
const enqueueMessage = useEnqueueHarnessMessage()
const removeQueuedMessage = useRemoveHarnessQueuedMessage()
const handleStop = () => {
void cancelHarnessTurn(agentId, {
turnId: activeTurnId ?? undefined,
reason: 'user pressed stop',
})
}
const visibleTurns = useMemo(
() => filterTurnsPersistedInHistory(turns, historyMessages),
[historyMessages, turns],
() =>
isAgentHarnessAgent
? filterTurnsPersistedInHistory(turns, historyMessages)
: turns,
[historyMessages, isAgentHarnessAgent, turns],
)
const outboundQueue = useOutboundQueue({
agentId,
sessionKey: resolvedSessionKey,
enabled: Boolean(agent) && !isAgentHarnessAgent,
})
onInitialMessageConsumedRef.current = onInitialMessageConsumed
const disabled = !agent
// Refetch history whenever a server-dispatched queue item completes.
// The server worker streams the queued turn into OpenClaw directly, so
// the client never observes the live tokens — we only see the new
// assistant turn once the JSONL is updated. Watching the queue for
// any 'sending' item dropping out is the cleanest "turn finalized"
// signal we have without exposing per-turn SSE.
const previousSendingIdsRef = useRef<Set<string>>(new Set())
useEffect(() => {
if (isAgentHarnessAgent) return
const currentSending = new Set(
outboundQueue.queue
.filter((item) => item.status === 'sending')
.map((item) => item.id),
)
const dropped = [...previousSendingIdsRef.current].filter(
(id) => !currentSending.has(id),
)
previousSendingIdsRef.current = currentSending
if (dropped.length > 0) {
void clawHistoryQuery.refetch()
}
}, [clawHistoryQuery, isAgentHarnessAgent, outboundQueue.queue])
const disabled =
!agent || (!isAgentHarnessAgent && status?.status !== 'running')
// Two-part gate: cover both "still fetching" AND "just got enabled but
// hasn't started fetching yet". When `enabled` flips true (baseUrl
// resolves), there's a render frame where React Query reports
// isLoading=false but hasn't run the queryFn yet — `isFetched` is still
// false. Without this we render EmptyState during that one frame.
const isInitialLoading =
!isAgentHarnessAgent &&
(clawHistoryQuery.isLoading ||
(!clawHistoryQuery.isFetched && !clawHistoryQuery.isError))
const historyReady =
harnessHistoryQuery.isFetched || harnessHistoryQuery.isError
(isAgentHarnessAgent &&
(harnessHistoryQuery.isFetched || harnessHistoryQuery.isError)) ||
(!isAgentHarnessAgent &&
(clawHistoryQuery.isFetched || clawHistoryQuery.isError))
const initialMessageKey = initialMessage
? `${agentId}:${initialMessage}`
: null
const error = harnessHistoryQuery.error ?? null
const error = isAgentHarnessAgent
? (harnessHistoryQuery.error ?? null)
: (clawHistoryQuery.error ?? null)
const enqueueRef = useRef(outboundQueue.enqueue)
enqueueRef.current = outboundQueue.enqueue
const sendRef = useRef(send)
sendRef.current = send
@@ -281,8 +340,18 @@ function AgentConversationController({
initialMessageSentRef.current = initialMessageKey
onInitialMessageConsumedRef.current()
void sendRef.current({ text: query })
}, [disabled, historyReady, initialMessage, initialMessageKey])
if (isAgentHarnessAgent) {
void sendRef.current({ text: query })
} else {
enqueueRef.current({ text: query })
}
}, [
disabled,
historyReady,
initialMessage,
initialMessageKey,
isAgentHarnessAgent,
])
const handleSelectAgent = (entry: AgentEntry) => {
navigate(`${agentPathPrefix}/${entry.agentId}`)
@@ -295,26 +364,32 @@ function AgentConversationController({
historyMessages={historyMessages}
turns={visibleTurns}
streaming={streaming}
isInitialLoading={harnessHistoryQuery.isLoading}
isInitialLoading={
isAgentHarnessAgent ? harnessHistoryQuery.isLoading : isInitialLoading
}
error={error}
hasNextPage={false}
isFetchingNextPage={false}
onFetchNextPage={() => {}}
hasNextPage={
isAgentHarnessAgent ? false : Boolean(clawHistoryQuery.hasNextPage)
}
isFetchingNextPage={
isAgentHarnessAgent ? false : clawHistoryQuery.isFetchingNextPage
}
onFetchNextPage={() => {
if (!isAgentHarnessAgent) {
void clawHistoryQuery.fetchNextPage()
}
}}
onRetry={() => {
void harnessHistoryQuery.refetch()
if (isAgentHarnessAgent) {
void harnessHistoryQuery.refetch()
} else {
void clawHistoryQuery.refetch()
}
}}
/>
<div className="border-border/50 border-t bg-background/88 px-4 py-3 backdrop-blur-md">
<div className="mx-auto max-w-3xl space-y-3">
{queue.length > 0 ? (
<QueuePanel
queue={queue}
onRemove={(messageId) =>
removeQueuedMessage.mutate({ agentId, messageId })
}
/>
) : null}
<div className="mx-auto max-w-3xl">
<ConversationInput
variant="conversation"
agents={agents}
@@ -329,30 +404,31 @@ function AgentConversationController({
name: a.name,
dataUrl: a.dataUrl,
}))
// When the agent already has an in-flight turn, route
// the new message into the durable queue instead of
// starting a parallel turn. Drains automatically as
// soon as the active turn ends.
if (streaming || activeTurnId) {
enqueueMessage.mutate({
agentId,
message: input.text,
if (isAgentHarnessAgent) {
void send({ text: input.text, attachments, attachmentPreviews })
} else {
outboundQueue.enqueue({
text: input.text,
attachments,
attachmentPreviews,
history: chatHistory,
})
return
}
void send({ text: input.text, attachments, attachmentPreviews })
}}
onCreateAgent={() => navigate(createAgentPath)}
onStop={handleStop}
streaming={streaming}
disabled={disabled}
status="running"
attachmentsEnabled={true}
placeholder={
streaming
? `Type to queue another message for ${agentName}...`
: `Message ${agentName}...`
status={isAgentHarnessAgent ? 'running' : status?.status}
attachmentsEnabled={!isAgentHarnessAgent}
placeholder={`Message ${agentName}...`}
outboundQueue={
isAgentHarnessAgent ? undefined : outboundQueue.queue
}
onCancelQueued={
isAgentHarnessAgent ? undefined : outboundQueue.cancel
}
onRetryQueued={
isAgentHarnessAgent ? undefined : outboundQueue.retry
}
/>
</div>
@@ -377,7 +453,7 @@ export const AgentCommandConversation: FC<AgentCommandConversationProps> = ({
const { agentId } = useParams<{ agentId: string }>()
const [searchParams, setSearchParams] = useSearchParams()
const navigate = useNavigate()
const { agents } = useAgentCommandData()
const { status, agents } = useAgentCommandData()
const shouldRedirectHome = !agentId
const resolvedAgentId = agentId ?? ''
const agent = agents.find((entry) => entry.agentId === resolvedAgentId)
@@ -395,11 +471,10 @@ export const AgentCommandConversation: FC<AgentCommandConversationProps> = ({
navigate(`${agentPathPrefix}/${entry.agentId}`)
}
// Every visible agent runs through the harness now, so per-agent
// runtime status doesn't gate chat the way OpenClaw's legacy
// gateway lifecycle did. Show "Ready" once the agent record is
// resolved from the rail, "Setup" otherwise.
const statusCopy = agent ? 'Ready' : 'Setup'
const statusCopy =
agent?.source === 'agent-harness'
? 'Ready'
: getConversationStatusCopy(status?.status)
return (
<div className="absolute inset-0 overflow-hidden bg-background md:pl-[theme(spacing.14)]">
@@ -425,6 +500,7 @@ export const AgentCommandConversation: FC<AgentCommandConversationProps> = ({
key={resolvedAgentId}
agentId={resolvedAgentId}
agents={agents}
status={status}
initialMessage={initialMessage}
onInitialMessageConsumed={() =>
setSearchParams({}, { replace: true })

View File

@@ -1,25 +1,18 @@
import { Plus } from 'lucide-react'
import { type FC, useEffect, useMemo, useState } from 'react'
import { type FC, useEffect, useState } from 'react'
import { useNavigate } from 'react-router'
import { Button } from '@/components/ui/button'
import { Card, CardContent } from '@/components/ui/card'
import { Separator } from '@/components/ui/separator'
import type {
HarnessAdapterDescriptor,
HarnessAgent,
} from '@/entrypoints/app/agents/agent-harness-types'
import {
useAgentAdapters,
useHarnessAgents,
} from '@/entrypoints/app/agents/useAgents'
import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
import { ImportDataHint } from '@/entrypoints/newtab/index/ImportDataHint'
import { SignInHint } from '@/entrypoints/newtab/index/SignInHint'
import { useActiveHint } from '@/entrypoints/newtab/index/useActiveHint'
import type { AgentCardData } from '@/lib/agent-conversations/types'
import { AgentCardDock } from './AgentCardDock'
import { useAgentCommandData } from './agent-command-layout'
import { ConversationInput } from './ConversationInput'
import { orderHomeAgents } from './home-agent-card.helpers'
import { buildAgentCardData } from './useAgentCardData'
function EmptyAgentsState({ onOpenAgents }: { onOpenAgents: () => void }) {
return (
@@ -45,13 +38,11 @@ function EmptyAgentsState({ onOpenAgents }: { onOpenAgents: () => void }) {
function RecentThreads({
activeAgentId,
agents,
adapters,
onOpenAgents,
onSelectAgent,
}: {
activeAgentId?: string | null
agents: HarnessAgent[]
adapters: HarnessAdapterDescriptor[]
agents: AgentCardData[]
onOpenAgents: () => void
onSelectAgent: (agentId: string) => void
}) {
@@ -77,7 +68,6 @@ function RecentThreads({
</div>
<AgentCardDock
agents={agents}
adapters={adapters}
activeAgentId={activeAgentId ?? undefined}
onSelectAgent={onSelectAgent}
onCreateAgent={onOpenAgents}
@@ -89,32 +79,25 @@ function RecentThreads({
export const AgentCommandHome: FC = () => {
const navigate = useNavigate()
const activeHint = useActiveHint()
// The conversation input still consumes the merged AgentEntry list
// from the layout context (handles legacy /claw/agents entries that
// haven't yet been backfilled into the harness store). The Recent
// Agents grid below reads the richer harness payload directly.
const { agents: legacyAgents, status } = useAgentCommandData()
const { harnessAgents } = useHarnessAgents()
const { adapters } = useAgentAdapters()
const { agents, status } = useAgentCommandData()
const [selectedAgentId, setSelectedAgentId] = useState<string | null>(null)
const orderedAgents = useMemo(
() => orderHomeAgents(harnessAgents),
[harnessAgents],
)
const cardData = buildAgentCardData(agents, status?.status, undefined)
useEffect(() => {
if (legacyAgents.length === 0) {
if (selectedAgentId) setSelectedAgentId(null)
if (agents.length === 0) {
if (selectedAgentId) {
setSelectedAgentId(null)
}
return
}
if (
!selectedAgentId ||
!legacyAgents.some((agent) => agent.agentId === selectedAgentId)
!agents.some((agent) => agent.agentId === selectedAgentId)
) {
setSelectedAgentId(legacyAgents[0].agentId)
setSelectedAgentId(agents[0].agentId)
}
}, [legacyAgents, selectedAgentId])
}, [agents, selectedAgentId])
const handleSend = (input: { text: string }) => {
if (!selectedAgentId) return
@@ -127,7 +110,7 @@ export const AgentCommandHome: FC = () => {
setSelectedAgentId(agent.agentId)
}
const selectedAgent = legacyAgents.find(
const selectedAgent = agents.find(
(agent) => agent.agentId === selectedAgentId,
)
const selectedAgentReady = selectedAgent
@@ -135,15 +118,13 @@ export const AgentCommandHome: FC = () => {
: false
const selectedAgentStatus =
selectedAgent?.source === 'agent-harness' ? 'running' : status?.status
const selectedAgentName =
selectedAgent?.name ?? orderedAgents[0]?.name ?? 'your agent'
const hasAgents = legacyAgents.length > 0
const selectedCard =
cardData.find((agent) => agent.agentId === selectedAgentId) ?? cardData[0]
return (
<div className="min-h-full px-4 py-6">
<div className="mx-auto flex w-full max-w-5xl flex-col gap-8">
{hasAgents ? (
{cardData.length > 0 ? (
<>
<div className="flex flex-col items-center gap-5 pt-[max(10vh,24px)] text-center">
<div className="space-y-3">
@@ -159,7 +140,7 @@ export const AgentCommandHome: FC = () => {
<div className="w-full max-w-3xl">
<ConversationInput
variant="home"
agents={legacyAgents}
agents={agents}
selectedAgentId={selectedAgentId}
onSelectAgent={handleSelectAgent}
onSend={handleSend}
@@ -170,7 +151,7 @@ export const AgentCommandHome: FC = () => {
attachmentsEnabled={false}
placeholder={
selectedAgentReady
? `Ask ${selectedAgentName} to handle a task...`
? `Ask ${selectedCard?.name ?? 'your agent'} to handle a task...`
: 'Agent runtime is not running...'
}
/>
@@ -181,8 +162,7 @@ export const AgentCommandHome: FC = () => {
<RecentThreads
activeAgentId={selectedAgentId}
agents={orderedAgents}
adapters={adapters}
agents={cardData}
onOpenAgents={() => navigate('/agents')}
onSelectAgent={(agentId) => navigate(`/home/agents/${agentId}`)}
/>

View File

@@ -1,4 +1,5 @@
import {
AlertTriangle,
ArrowRight,
Bot,
ChevronDown,
@@ -8,6 +9,7 @@ import {
Loader2,
Mic,
Paperclip,
RefreshCw,
Square,
X,
} from 'lucide-react'
@@ -36,6 +38,7 @@ import { cn } from '@/lib/utils'
import { useVoiceInput } from '@/lib/voice/useVoiceInput'
import { useWorkspace } from '@/lib/workspace/use-workspace'
import { AgentSelector } from './AgentSelector'
import type { OutboundMessage } from './useOutboundQueue'
export interface ConversationInputSendInput {
text: string
@@ -54,40 +57,34 @@ interface ConversationInputProps {
placeholder?: string
attachmentsEnabled?: boolean
variant?: 'home' | 'conversation'
/**
* When set, a Stop button surfaces to the left of the voice mic
* while `streaming === true`. Click cancels the active turn
* server-side via the chat-cancel endpoint. Absent → no Stop
* button (legacy behaviour for the home composer).
*/
onStop?: () => void
// Outbound queue: when present, the composer renders the queue strip
// above the textarea and lets the user keep sending while a previous
// turn is in flight. Optional so non-conversation variants (the home
// page) can opt out — the queue only makes sense in the conversation
// page where each enqueued message will eventually be delivered to the
// active agent.
outboundQueue?: OutboundMessage[]
onCancelQueued?: (id: string) => void
onRetryQueued?: (id: string) => void
}
function InputActionButton({
disabled,
onClick,
streaming,
hasContent,
}: {
disabled: boolean
onClick: () => void
streaming: boolean
hasContent: boolean
}) {
// Show the spinner while streaming only when there's nothing to
// send — once the user types something, the icon flips back to the
// paper-plane so it reads as "queue this message" instead of
// "still working".
const showSpinner = streaming && !hasContent
return (
<Button
onClick={onClick}
size="icon"
disabled={disabled}
title={streaming && hasContent ? 'Queue message' : undefined}
className="h-10 w-10 flex-shrink-0 rounded-xl bg-primary text-primary-foreground hover:bg-primary/90"
>
{showSpinner ? (
{streaming ? (
<Loader2 className="h-5 w-5 animate-spin" />
) : (
<ArrowRight className="h-5 w-5" />
@@ -96,22 +93,6 @@ function InputActionButton({
)
}
function StopButton({ onStop }: { onStop: () => void }) {
return (
<Button
type="button"
size="icon"
variant="ghost"
onClick={onStop}
title="Stop current turn — queued messages will start next."
aria-label="Stop current turn"
className="h-8 w-8 flex-shrink-0 rounded-lg bg-destructive/10 text-destructive transition-colors hover:bg-destructive/15 hover:text-destructive"
>
<Square className="h-3.5 w-3.5 fill-current" />
</Button>
)
}
function VoiceButton({
isRecording,
isTranscribing,
@@ -330,7 +311,9 @@ export const ConversationInput: FC<ConversationInputProps> = ({
placeholder,
attachmentsEnabled = true,
variant = 'conversation',
onStop,
outboundQueue,
onCancelQueued,
onRetryQueued,
}) => {
const [input, setInput] = useState('')
const [selectedTabs, setSelectedTabs] = useState<chrome.tabs.Tab[]>([])
@@ -411,17 +394,15 @@ export const ConversationInput: FC<ConversationInputProps> = ({
}
const hasContent = input.trim().length > 0 || attachments.length > 0
// Queue-aware composers (the conversation panel passes `onStop`)
// accept input while streaming — the parent decides whether the
// submission opens a new turn or enqueues onto the active one.
// Surfaces without a Stop hook (home) keep the legacy behaviour
// and block input until the current turn finishes.
const queueAware = Boolean(onStop)
const queueEnabled = outboundQueue !== undefined
const handleSend = () => {
const text = input.trim()
// The outbound queue accepts new messages while streaming; legacy
// direct-send callers (e.g., the home composer) keep the original
// streaming-blocks-send semantic.
if (disabled || isStaging) return
if (streaming && !queueAware) return
if (!queueEnabled && streaming) return
if (!text && attachments.length === 0) return
onSend({ text, attachments })
setInput('')
@@ -513,6 +494,13 @@ export const ConversationInput: FC<ConversationInputProps> = ({
error={attachmentError}
/>
) : null}
{queueEnabled && outboundQueue && outboundQueue.length > 0 ? (
<OutboundQueueStrip
messages={outboundQueue}
onCancel={onCancelQueued}
onRetry={onRetryQueued}
/>
) : null}
<div
className={cn(
'flex gap-3',
@@ -551,7 +539,6 @@ export const ConversationInput: FC<ConversationInputProps> = ({
)}
/>
</div>
{streaming && onStop ? <StopButton onStop={onStop} /> : null}
<VoiceButton
isRecording={voice.isRecording}
isTranscribing={voice.isTranscribing}
@@ -569,13 +556,15 @@ export const ConversationInput: FC<ConversationInputProps> = ({
!!disabled ||
voice.isRecording ||
voice.isTranscribing ||
(streaming && !queueAware)
// Only block on `streaming` for the legacy direct-send path
// (no queue). With the queue active the press always
// succeeds — it just enqueues instead of dispatching.
(!queueEnabled && streaming)
}
onClick={handleSend}
// Spinner stays the user-facing "agent is busy" hint; with the
// queue active we still spin while a turn is in flight.
streaming={streaming}
hasContent={hasContent}
/>
</div>
{voice.error ? (
@@ -606,6 +595,117 @@ export const ConversationInput: FC<ConversationInputProps> = ({
)
}
function OutboundQueueStrip({
messages,
onCancel,
onRetry,
}: {
messages: OutboundMessage[]
onCancel?: (id: string) => void
onRetry?: (id: string) => void
}) {
return (
<div className="border-border/40 border-b px-4 pt-3 pb-2">
<ul className="flex flex-col gap-1">
{messages.map((message) => (
<OutboundQueueItem
key={message.id}
message={message}
onCancel={onCancel}
onRetry={onRetry}
/>
))}
</ul>
</div>
)
}
function OutboundQueueItem({
message,
onCancel,
onRetry,
}: {
message: OutboundMessage
onCancel?: (id: string) => void
onRetry?: (id: string) => void
}) {
const preview = message.text.trim() || '(attachments only)'
return (
<li className="flex items-center gap-2 rounded-md px-2 py-1 text-xs">
<OutboundQueueStatusIcon status={message.status} />
<span className="min-w-0 flex-1 truncate text-muted-foreground">
{preview}
</span>
{message.attachmentPreviews.length > 0 ? (
<span className="inline-flex items-center gap-1 text-muted-foreground/70">
<Paperclip className="size-3" />
<span className="tabular-nums">
{message.attachmentPreviews.length}
</span>
</span>
) : null}
{message.status === 'queued' && onCancel ? (
<button
type="button"
onClick={() => onCancel(message.id)}
className="ml-1 inline-flex size-5 items-center justify-center rounded-full text-muted-foreground hover:bg-accent hover:text-foreground"
aria-label="Cancel queued message"
title="Cancel"
>
<X className="size-3" />
</button>
) : null}
{message.status === 'failed' ? (
<span className="ml-1 inline-flex items-center gap-2 text-destructive">
<span className="max-w-[160px] truncate" title={message.error}>
{message.error ?? 'Failed'}
</span>
{onRetry ? (
<button
type="button"
onClick={() => onRetry(message.id)}
className="inline-flex size-5 items-center justify-center rounded-full hover:bg-accent hover:text-foreground"
aria-label="Retry failed message"
title="Retry"
>
<RefreshCw className="size-3" />
</button>
) : null}
{onCancel ? (
<button
type="button"
onClick={() => onCancel(message.id)}
className="inline-flex size-5 items-center justify-center rounded-full hover:bg-accent hover:text-foreground"
aria-label="Discard failed message"
title="Discard"
>
<X className="size-3" />
</button>
) : null}
</span>
) : null}
</li>
)
}
function OutboundQueueStatusIcon({
status,
}: {
status: OutboundMessage['status']
}) {
if (status === 'sending') {
return (
<Loader2 className="size-3.5 shrink-0 animate-spin text-muted-foreground" />
)
}
if (status === 'failed') {
return <AlertTriangle className="size-3.5 shrink-0 text-destructive" />
}
return (
<span className="inline-block size-2 shrink-0 rounded-full bg-muted-foreground/40" />
)
}
function AttachmentStrip({
attachments,
onRemove,

View File

@@ -1,243 +0,0 @@
import { Quote, TriangleAlert } from 'lucide-react'
import type { FC } from 'react'
import { Badge } from '@/components/ui/badge'
import {
HoverCard,
HoverCardContent,
HoverCardTrigger,
} from '@/components/ui/hover-card'
import { adapterLabel } from '@/entrypoints/app/agents/AdapterIcon'
import { formatRelativeTime } from '@/entrypoints/app/agents/agent-display.helpers'
import type {
HarnessAdapterHealth,
HarnessAgent,
HarnessAgentAdapter,
} from '@/entrypoints/app/agents/agent-harness-types'
import { AgentTile } from '@/entrypoints/app/agents/agent-row/AgentTile'
import {
firstNonBlankLine,
truncate,
} from '@/entrypoints/app/agents/agent-row/agent-row.helpers'
import type { AgentLiveness } from '@/entrypoints/app/agents/LivenessDot'
import { cn } from '@/lib/utils'
interface HomeAgentCardProps {
agent: HarnessAgent
adapter: HarnessAgentAdapter | 'unknown'
/** Per-adapter health snapshot, shared across cards rendering the
* same adapter. `null` when the /adapters response hasn't surfaced
* health yet (we treat that as healthy until proven otherwise). */
adapterHealth: HarnessAdapterHealth | null
/** Highlights the card with an accent ring; tells the user which
* agent the conversation input is bound to. */
active?: boolean
onClick: () => void
}
const PREVIEW_CHARS = 100
/**
* Grid-shaped card for the /home Recent agents section. Composition
* mirrors the rail's `AgentRowCard` but the layout is a vertical
* column sized for a 1/3-width tile rather than a full-width row.
*
* Reuses `<AgentTile>`, `<LivenessDot>`, `livenessDetail`,
* `formatRelativeTime`, `firstNonBlankLine`, `truncate`, and the
* inline `Unavailable` chip pattern so the visual language is
* continuous between rail and grid.
*/
export const HomeAgentCard: FC<HomeAgentCardProps> = ({
agent,
adapter,
adapterHealth,
active,
onClick,
}) => {
const status = agent.status ?? 'unknown'
const lastUsedAt = agent.lastUsedAt ?? null
const isWorking = status === 'working'
const isAsleep = status === 'asleep'
const isError = status === 'error'
const hasActiveTurn = Boolean(agent.activeTurnId)
return (
<button
type="button"
onClick={onClick}
className={cn(
'group flex min-h-32 w-full min-w-0 flex-col rounded-2xl border bg-card p-4 text-left shadow-sm transition-colors',
active && 'ring-1 ring-[var(--accent-orange)]/30',
isWorking
? 'border-[var(--accent-orange)]/40'
: isError
? 'border-destructive/30'
: 'border-border/60 hover:border-[var(--accent-orange)]/30',
)}
>
<div className="flex items-start gap-3">
<AgentTile adapter={adapter} status={status} lastUsedAt={lastUsedAt} />
<div className="min-w-0 flex-1">
<div className="flex items-center gap-1.5">
<span className="truncate font-semibold text-sm">
{displayName(agent)}
</span>
{isWorking && (
<Badge
variant="secondary"
className="ml-auto bg-amber-50 text-amber-900 hover:bg-amber-50"
>
Working
</Badge>
)}
</div>
<SummaryLine
adapter={adapter}
modelId={agent.modelId ?? null}
reasoningEffort={agent.reasoningEffort ?? null}
adapterHealth={adapterHealth}
/>
</div>
</div>
<LastMessage message={agent.lastUserMessage ?? null} />
<div className="mt-3 flex items-center justify-between gap-2 text-muted-foreground text-xs">
<span>{statusFootnote(status, lastUsedAt)}</span>
{hasActiveTurn ? (
<ResumeChip />
) : isAsleep ? (
<Badge variant="outline" className="text-muted-foreground">
Asleep
</Badge>
) : isError ? (
<ErrorChip lastError={agent.lastError ?? null} />
) : null}
</div>
</button>
)
}
const SummaryLine: FC<{
adapter: HarnessAgentAdapter | 'unknown'
modelId: string | null
reasoningEffort: string | null
adapterHealth: HarnessAdapterHealth | null
}> = ({ adapter, modelId, reasoningEffort, adapterHealth }) => {
const parts = [adapterLabel(adapter)]
if (modelId) parts.push(modelId)
if (reasoningEffort) parts.push(reasoningEffort)
const unhealthy = adapterHealth?.healthy === false
return (
<div
className={cn(
'mt-0.5 flex items-center gap-1.5 text-muted-foreground text-xs',
unhealthy && 'text-muted-foreground/70',
)}
>
<span className="truncate">{parts.join(' · ')}</span>
{unhealthy && (
<HoverCard openDelay={200}>
<HoverCardTrigger asChild>
<Badge
variant="outline"
className="h-5 cursor-default gap-1 border-amber-500/40 bg-amber-50 px-1.5 text-amber-900 hover:bg-amber-50"
>
<TriangleAlert className="size-2.5" />
<span className="font-normal">Unavailable</span>
</Badge>
</HoverCardTrigger>
<HoverCardContent side="right" className="w-72 text-sm">
<div className="font-medium">
{adapterLabel(adapter)} CLI not available
</div>
<div className="mt-1 text-muted-foreground text-xs">
{adapterHealth?.reason ??
'Adapter binary missing on $PATH. Install it from the adapter docs to use this agent.'}
</div>
</HoverCardContent>
</HoverCard>
)}
</div>
)
}
const LastMessage: FC<{ message: string | null }> = ({ message }) => {
if (!message) {
return (
<p className="mt-3 flex-1 text-muted-foreground/70 text-xs italic">
No messages yet start a chat
</p>
)
}
return (
<p className="mt-3 line-clamp-2 flex flex-1 items-start gap-1.5 text-foreground/85 text-sm italic leading-snug">
<Quote
className="mt-1 size-3 shrink-0 text-muted-foreground/60"
aria-hidden
/>
<span className="line-clamp-2">
{truncate(firstNonBlankLine(message), PREVIEW_CHARS)}
</span>
</p>
)
}
const ResumeChip: FC = () => (
<span className="inline-flex items-center gap-1.5 rounded-full bg-[var(--accent-orange)] px-2.5 py-0.5 font-medium text-[11px] text-white shadow-sm">
<span className="relative flex size-1.5">
<span className="absolute inline-flex h-full w-full animate-ping rounded-full bg-white/70 opacity-75" />
<span className="relative inline-flex size-1.5 rounded-full bg-white" />
</span>
Resume
</span>
)
const ErrorChip: FC<{ lastError: string | null }> = ({ lastError }) => {
if (!lastError) {
return <Badge variant="destructive">Attention</Badge>
}
return (
<HoverCard openDelay={200}>
<HoverCardTrigger asChild>
<Badge variant="destructive" className="cursor-default">
Attention
</Badge>
</HoverCardTrigger>
<HoverCardContent
side="left"
className="max-w-xs whitespace-pre-wrap font-mono text-xs"
>
{lastError}
</HoverCardContent>
</HoverCard>
)
}
/**
* Footer left side: relative time on every state EXCEPT working,
* which shows `now` (the dot is already pulsing — restating it as
* "Working" would duplicate the pill in the title row).
*/
function statusFootnote(
status: AgentLiveness,
lastUsedAt: number | null,
): string {
if (status === 'working') return 'now'
return formatRelativeTime(lastUsedAt)
}
const UUID_PATTERN =
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
const OC_UUID_PATTERN =
/^oc-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
function displayName(agent: HarnessAgent): string {
const name = agent.name?.trim()
const id = agent.id
if (!name || name === id) {
if (OC_UUID_PATTERN.test(id)) return id.slice(0, 11)
if (UUID_PATTERN.test(id)) return id.slice(0, 8)
return id
}
return name
}

View File

@@ -1,94 +0,0 @@
import { ListPlus, X } from 'lucide-react'
import type { FC } from 'react'
import {
Queue,
QueueItem,
QueueItemAction,
QueueItemActions,
QueueItemAttachment,
QueueItemContent,
QueueItemFile,
QueueItemImage,
QueueList,
QueueSection,
QueueSectionContent,
QueueSectionLabel,
QueueSectionTrigger,
} from '@/components/ai-elements/queue'
import type {
HarnessQueuedMessage,
HarnessQueuedMessageAttachment,
} from '@/entrypoints/app/agents/agent-harness-types'
import { firstNonBlankLine } from '@/entrypoints/app/agents/agent-row/agent-row.helpers'
interface QueuePanelProps {
queue: HarnessQueuedMessage[]
onRemove: (messageId: string) => void
}
/**
* Renders the agent's pending message queue using the shared AI
* Elements `Queue` primitives. Caller is expected to gate render on
* `queue.length > 0` — when empty, this returns null so the panel
* disappears cleanly between turns.
*/
export const QueuePanel: FC<QueuePanelProps> = ({ queue, onRemove }) => {
if (queue.length === 0) return null
return (
<Queue>
<QueueSection>
<QueueSectionTrigger>
<QueueSectionLabel
count={queue.length}
label={queue.length === 1 ? 'queued message' : 'queued messages'}
icon={<ListPlus className="size-3.5" />}
/>
</QueueSectionTrigger>
<QueueSectionContent>
<QueueList>
{queue.map((entry) => (
<QueueItem key={entry.id}>
<div className="flex items-center gap-2">
<QueueItemContent>
{firstNonBlankLine(entry.message)}
</QueueItemContent>
<QueueItemActions>
<QueueItemAction
aria-label="Remove from queue"
onClick={() => onRemove(entry.id)}
>
<X className="size-3" />
</QueueItemAction>
</QueueItemActions>
</div>
{entry.attachments && entry.attachments.length > 0 ? (
<QueueItemAttachment>
{entry.attachments.map((attachment, idx) =>
renderAttachment(entry.id, attachment, idx),
)}
</QueueItemAttachment>
) : null}
</QueueItem>
))}
</QueueList>
</QueueSectionContent>
</QueueSection>
</Queue>
)
}
function renderAttachment(
messageId: string,
attachment: HarnessQueuedMessageAttachment,
idx: number,
) {
if (attachment.mediaType.startsWith('image/')) {
const src = `data:${attachment.mediaType};base64,${attachment.data}`
return <QueueItemImage key={`${messageId}-${idx}`} src={src} />
}
return (
<QueueItemFile key={`${messageId}-${idx}`}>
{attachment.mediaType}
</QueueItemFile>
)
}

View File

@@ -26,15 +26,7 @@ export const AgentCommandLayout: FC = () => {
const { agents: harnessAgents, loading: harnessAgentsLoading } =
useHarnessAgents()
const visibleOpenClawAgents = openClawEnabled ? openClawAgents : []
// Dual-created OpenClaw agents appear in both `/claw/agents` (gateway
// record) and `/agents` (harness record) under the same id. Prefer the
// harness entry so the chat panel can route through the harness path
// and the rail doesn't show duplicates.
const harnessAgentIds = new Set(harnessAgents.map((entry) => entry.agentId))
const dedupedOpenClawAgents = visibleOpenClawAgents.filter(
(entry) => !harnessAgentIds.has(entry.agentId),
)
const agents = [...dedupedOpenClawAgents, ...harnessAgents]
const agents = [...visibleOpenClawAgents, ...harnessAgents]
return (
<Outlet

View File

@@ -23,9 +23,9 @@ export interface BrowserOSChatHistoryToolCall {
toolName: string
label: string
subject?: string
status: 'pending' | 'running' | 'completed' | 'failed'
input?: unknown
output?: unknown
status: 'completed' | 'failed'
input?: Record<string, unknown>
output?: string
error?: string
durationMs?: number
}

View File

@@ -1,71 +0,0 @@
import { buildToolLabel } from '../../../lib/tool-labels'
import type { HarnessAgentHistoryPage } from '../agents/agent-harness-types'
import type {
AgentHistoryPageResponse,
BrowserOSChatHistoryItem,
BrowserOSChatHistoryToolCall,
} from './claw-chat-types'
export function mapHarnessHistoryPage(
page: HarnessAgentHistoryPage,
): AgentHistoryPageResponse {
const items: BrowserOSChatHistoryItem[] = page.items.map((item, index) => {
const toolCalls = item.toolCalls?.map(
(tool): BrowserOSChatHistoryToolCall => {
const input = asRecord(tool.input)
const { label, subject } = buildToolLabel(tool.toolName, input)
return {
toolName: tool.toolName,
label,
status: tool.status,
...(tool.toolCallId ? { toolCallId: tool.toolCallId } : {}),
...(subject ? { subject } : {}),
...(tool.input !== undefined ? { input: tool.input } : {}),
...(tool.output !== undefined ? { output: tool.output } : {}),
...(tool.error ? { error: tool.error } : {}),
...(tool.durationMs != null ? { durationMs: tool.durationMs } : {}),
}
},
)
return {
id: item.id,
role: item.role,
text: item.text,
timestamp: item.createdAt,
messageSeq: index + 1,
sessionKey: 'main',
source: 'user-chat',
...(item.reasoning ? { reasoning: item.reasoning } : {}),
...(toolCalls && toolCalls.length > 0 ? { toolCalls } : {}),
}
})
const updatedAt =
page.items.length > 0
? Math.max(...page.items.map((item) => item.createdAt))
: Date.now()
return {
agentId: page.agentId,
sessionKey: 'main',
session: {
key: 'main',
updatedAt,
sessionId: 'main',
agentId: page.agentId,
kind: 'agent-harness',
source: 'user-chat',
},
items,
page: {
hasMore: false,
limit: items.length,
},
}
}
function asRecord(value: unknown): Record<string, unknown> | undefined {
return value && typeof value === 'object' && !Array.isArray(value)
? (value as Record<string, unknown>)
: undefined
}

View File

@@ -1,69 +0,0 @@
import { describe, expect, it } from 'bun:test'
import type { HarnessAgent } from '@/entrypoints/app/agents/agent-harness-types'
import { orderHomeAgents } from './home-agent-card.helpers'
function agent(overrides: Partial<HarnessAgent>): HarnessAgent {
return {
id: overrides.id ?? 'agent-x',
name: overrides.name ?? overrides.id ?? 'agent-x',
adapter: overrides.adapter ?? 'codex',
permissionMode: 'approve-all',
sessionKey: `agent:${overrides.id ?? 'agent-x'}:main`,
createdAt: 1000,
updatedAt: 1000,
...overrides,
}
}
describe('orderHomeAgents', () => {
it('places active-turn agents before everyone else', () => {
const sorted = orderHomeAgents([
agent({ id: 'a', lastUsedAt: 5000 }),
agent({ id: 'b', lastUsedAt: 9000, activeTurnId: 'turn-1' }),
agent({ id: 'c', lastUsedAt: 7000 }),
])
expect(sorted.map((a) => a.id)).toEqual(['b', 'c', 'a'])
})
it('orders non-active agents by lastUsedAt desc', () => {
const sorted = orderHomeAgents([
agent({ id: 'old', lastUsedAt: 1000 }),
agent({ id: 'new', lastUsedAt: 9000 }),
agent({ id: 'mid', lastUsedAt: 5000 }),
])
expect(sorted.map((a) => a.id)).toEqual(['new', 'mid', 'old'])
})
it('puts the gateway `main` seed agent above other never-used agents', () => {
const sorted = orderHomeAgents([
agent({ id: 'oc-aaaaaa', lastUsedAt: null }),
agent({ id: 'main', lastUsedAt: null }),
agent({ id: 'oc-bbbbbb', lastUsedAt: null }),
])
expect(sorted.map((a) => a.id)).toEqual(['main', 'oc-aaaaaa', 'oc-bbbbbb'])
})
it('sends never-used agents to the bottom even when `main` is among them', () => {
const sorted = orderHomeAgents([
agent({ id: 'main', lastUsedAt: null }),
agent({ id: 'used', lastUsedAt: 5000 }),
])
expect(sorted.map((a) => a.id)).toEqual(['used', 'main'])
})
it('does NOT sort by pinned — pinned agents are treated like any other', () => {
const sorted = orderHomeAgents([
agent({ id: 'unpinned-recent', lastUsedAt: 9000, pinned: false }),
agent({ id: 'pinned-old', lastUsedAt: 1000, pinned: true }),
])
expect(sorted.map((a) => a.id)).toEqual(['unpinned-recent', 'pinned-old'])
})
it('falls back to id-stable ordering when lastUsedAt ties', () => {
const sorted = orderHomeAgents([
agent({ id: 'b', lastUsedAt: 5000 }),
agent({ id: 'a', lastUsedAt: 5000 }),
])
expect(sorted.map((a) => a.id)).toEqual(['a', 'b'])
})
})

View File

@@ -1,42 +0,0 @@
import type { HarnessAgent } from '@/entrypoints/app/agents/agent-harness-types'
/**
* Order for the /home Recent agents grid.
*
* 1. Active turn first — agents mid-turn float to the top so the
* Resume affordance is the first thing the user sees on /home.
* 2. The protected gateway-side `main` agent stays pinned-to-top in
* the never-used group on a fresh install (mirrors the rail).
* 3. Recency (`lastUsedAt` desc).
* 4. `id` tiebreaker for stability so the grid doesn't reshuffle on
* every 5-second poll.
*
* Pin is NOT a sort key. The home grid is action-oriented and trusts
* recency + active-turn to surface the right agent; pinning is an
* organisation tool that lives on the rail at /agents.
*/
export function orderHomeAgents(agents: HarnessAgent[]): HarnessAgent[] {
return [...agents].sort((a, b) => {
const aActive = a.activeTurnId != null
const bActive = b.activeTurnId != null
if (aActive !== bActive) return aActive ? -1 : 1
// Recency wins outright. Never-used agents (`lastUsedAt == null`)
// both fall to the same `-Infinity` bucket and the seed/id rules
// below decide their order — but a used agent always beats any
// never-used agent regardless of id.
const aValue = a.lastUsedAt ?? Number.NEGATIVE_INFINITY
const bValue = b.lastUsedAt ?? Number.NEGATIVE_INFINITY
if (aValue !== bValue) return bValue - aValue
// Inside the never-used (or exact-tie) group: pin the gateway
// `main` seed to the top of the group on a fresh install, then
// fall back to id-stable order so the grid doesn't reshuffle on
// every poll.
const aSeed = a.id === 'main' && a.lastUsedAt == null
const bSeed = b.id === 'main' && b.lastUsedAt == null
if (aSeed !== bSeed) return aSeed ? -1 : 1
return a.id.localeCompare(b.id)
})
}

View File

@@ -0,0 +1,53 @@
import {
type AgentEntry,
getModelDisplayName,
type OpenClawStatus,
} from '@/entrypoints/app/agents/useOpenClaw'
import type { AgentCardData } from '@/lib/agent-conversations/types'
import type { AgentOverview } from './useAgentDashboard'
function resolveAgentStatus(
gatewayStatus: OpenClawStatus['status'] | undefined,
liveStatus: AgentOverview['status'] | undefined,
): AgentCardData['status'] {
// Gateway-level errors take precedence
if (gatewayStatus === 'error') return 'error'
if (gatewayStatus === 'starting') return 'working'
// Per-agent live status from the WS observer
if (liveStatus === 'working') return 'working'
if (liveStatus === 'error') return 'error'
return 'idle'
}
/**
* Build agent card display data by merging the raw agent entries from
* the gateway with enriched overview data from the dashboard API.
*
* Pure function — no hooks, no IndexedDB, no async.
*/
export function buildAgentCardData(
agents: AgentEntry[],
status: OpenClawStatus['status'] | undefined,
dashboard: AgentOverview[] | undefined,
): AgentCardData[] {
return agents.map((agent) => {
const overview = dashboard?.find((d) => d.agentId === agent.agentId)
return {
agentId: agent.agentId,
name: agent.name,
model: getModelDisplayName(agent.model),
status:
agent.source === 'agent-harness'
? 'idle'
: resolveAgentStatus(status, overview?.status),
lastMessage: overview?.latestMessage?.slice(0, 200) ?? undefined,
lastMessageTimestamp: overview?.latestMessageAt ?? undefined,
activitySummary: overview?.activitySummary ?? undefined,
currentTool: overview?.currentTool ?? undefined,
costUsd: overview?.totalCostUsd ?? undefined,
}
})
}

View File

@@ -1,12 +1,13 @@
import { useEffect, useRef, useState } from 'react'
import {
type AgentHarnessStreamEvent,
attachToHarnessTurn,
cancelHarnessTurn,
chatWithHarnessAgent,
fetchActiveHarnessTurn,
} from '@/entrypoints/app/agents/useAgents'
import type { OpenClawChatHistoryMessage } from '@/entrypoints/app/agents/useOpenClaw'
import {
chatWithAgent,
type OpenClawChatHistoryMessage,
type OpenClawStreamEvent,
} from '@/entrypoints/app/agents/useOpenClaw'
import type {
AgentConversationTurn,
AssistantPart,
@@ -28,23 +29,11 @@ export interface SendInput {
}
interface UseAgentConversationOptions {
// The hook always speaks to the harness chat path now; the OpenClaw
// legacy /claw/agents/:id/chat surface was removed in Step 12. The
// option remains for forward-compatibility.
runtime?: 'agent-harness'
runtime?: 'openclaw' | 'agent-harness'
sessionKey?: string | null
history?: OpenClawChatHistoryMessage[]
onComplete?: () => void
onSessionKeyChange?: (sessionKey: string) => void
/**
* Server-side active turn id, surfaced via the listing query. When
* this changes from null/<id> to a different non-null id while we
* aren't already streaming (e.g. the server just popped a queued
* message and started a new turn), the hook reattaches via
* /chat/active so the chat panel picks up the live stream without
* waiting for a remount.
*/
activeTurnId?: string | null
}
export function useAgentConversation(
@@ -60,11 +49,6 @@ export function useAgentConversation(
const streamAbortRef = useRef<AbortController | null>(null)
const onCompleteRef = useRef(options.onComplete)
const onSessionKeyChangeRef = useRef(options.onSessionKeyChange)
// Per-turn resume bookkeeping. `turnId` is captured from the response
// header; `lastSeq` advances with every SSE event so a reconnect can
// resume via Last-Event-ID.
const turnIdRef = useRef<string | null>(null)
const lastSeqRef = useRef<number | null>(null)
useEffect(() => {
sessionKeyRef.current = options.sessionKey ?? ''
@@ -88,12 +72,6 @@ export function useAgentConversation(
}
}, [])
// Indirection for the resume effect below: lets it call the latest
// event handler without re-subscribing on every render.
const processEventRef = useRef<(event: AgentHarnessStreamEvent) => void>(
() => {},
)
const updateCurrentTurnParts = (
updater: (parts: AssistantPart[]) => AssistantPart[],
) => {
@@ -104,6 +82,85 @@ export function useAgentConversation(
})
}
const processStreamEvent = (event: OpenClawStreamEvent) => {
switch (event.type) {
case 'text-delta': {
appendTextDelta((event.data.text as string) ?? '')
break
}
case 'thinking': {
appendThinkingDelta((event.data.text as string) ?? '')
break
}
case 'tool-start': {
const rawName = (event.data.toolName as string) ?? 'unknown'
const args = event.data.args as Record<string, unknown> | undefined
const { label, subject } = buildToolLabel(rawName, args)
const tool = {
id: (event.data.toolCallId as string) ?? crypto.randomUUID(),
name: rawName,
label,
subject,
status: 'running' as const,
}
updateCurrentTurnParts((parts) => {
const last = parts[parts.length - 1]
if (last?.kind === 'tool-batch') {
return [
...parts.slice(0, -1),
{ ...last, tools: [...last.tools, tool] },
]
}
return [...parts, { kind: 'tool-batch', tools: [tool] }]
})
break
}
case 'tool-end': {
const toolId = event.data.toolCallId as string
const toolStatus: 'completed' | 'error' =
(event.data.status as string) === 'error' ? 'error' : 'completed'
const durationMs = event.data.durationMs as number | undefined
updateCurrentTurnParts((parts) => {
for (let i = parts.length - 1; i >= 0; i--) {
const part = parts[i]
if (
part.kind === 'tool-batch' &&
part.tools.some((t) => t.id === toolId)
) {
const updatedTools = part.tools.map((t) =>
t.id === toolId ? { ...t, status: toolStatus, durationMs } : t,
)
return [
...parts.slice(0, i),
{ ...part, tools: updatedTools },
...parts.slice(i + 1),
]
}
}
return parts
})
break
}
case 'done': {
markCurrentTurnDone()
break
}
case 'error': {
const msg =
(event.data.message as string) ??
(event.data.error as string) ??
'Unknown error'
appendErrorText(msg)
break
}
}
}
const appendTextDelta = (delta: string) => {
textAccRef.current += delta
const text = textAccRef.current
@@ -218,105 +275,6 @@ export function useAgentConversation(
break
}
}
processEventRef.current = processAgentHarnessStreamEvent
const activeTurnIdDep = options.activeTurnId ?? null
// On mount, on agent change, and whenever the listing reports a
// *new* active turn id, check whether the server has an in-flight
// turn for this agent and reattach to it. This catches three
// cases at once: the chat resilience flow (tab close/reopen),
// navigation between agents, AND queue drain (the server starts a
// new turn from a queued message → activeTurnId flips → attach).
useEffect(() => {
let cancelled = false
const abortController = new AbortController()
// Reference the dep inside the body so biome's exhaustive-deps
// rule sees it consumed; the value is just an "any non-null
// active turn id" trigger — the actual id we attach to comes
// from the fresh fetchActiveHarnessTurn call below.
void activeTurnIdDep
const attemptResume = async () => {
// Track whether *we* started a stream in this run. When the
// early-return paths fire (no active turn, or a `send()` /
// earlier resume already owns `streamAbortRef`), the finally
// block must NOT touch streaming/turnIdRef/lastSeqRef —
// otherwise we clobber the in-flight stream's state and the
// Stop button drops out mid-turn while events keep arriving.
let weStartedStream = false
try {
const active = await fetchActiveHarnessTurn(agentId)
if (cancelled || !active || active.status !== 'running') return
if (streamAbortRef.current) return // someone else already owns the stream
// Stage a placeholder turn so the streamed events have a row
// to render into. The server now persists the kicking-off
// prompt on the active turn, so we render it as the user
// bubble immediately — no empty-bubble flicker when a queued
// message starts running.
setTurns((prev) => [
...prev,
{
id: crypto.randomUUID(),
userText: active.prompt ?? '',
parts: [],
done: false,
timestamp: active.startedAt,
},
])
textAccRef.current = ''
thinkAccRef.current = ''
turnIdRef.current = active.turnId
lastSeqRef.current = null
streamAbortRef.current = abortController
setStreaming(true)
weStartedStream = true
const response = await attachToHarnessTurn(agentId, {
turnId: active.turnId,
signal: abortController.signal,
})
if (!response.ok) return
await consumeSSEStream<AgentHarnessStreamEvent>(
response,
(event, meta) => {
if (typeof meta.seq === 'number') lastSeqRef.current = meta.seq
processEventRef.current(event)
},
abortController.signal,
)
} catch {
// Resume is best-effort; transient errors fall back to the
// user starting a new turn manually.
} finally {
// Always release `streamAbortRef` if we owned it — even when
// the effect was cancelled mid-stream (a listing poll
// captured the next queue-drain turn id, for example). If we
// don't, the next effect run hits `if (streamAbortRef.current)
// return` against our now-aborted controller and never
// reattaches, leaving `streaming === true` with no live stream.
if (weStartedStream && streamAbortRef.current === abortController) {
streamAbortRef.current = null
}
// The other state (streaming flag, turn id, lastSeq) is the
// *current run's* lifecycle: only reset it on a clean exit.
// When `cancelled` is true the next run will set these
// itself, so resetting here would only cause a brief flicker.
if (!cancelled && weStartedStream) {
turnIdRef.current = null
lastSeqRef.current = null
setStreaming(false)
}
}
}
void attemptResume()
return () => {
cancelled = true
abortController.abort()
}
}, [agentId, activeTurnIdDep])
const send = async (input: string | SendInput) => {
const normalized: SendInput =
@@ -346,25 +304,17 @@ export function useAgentConversation(
streamAbortRef.current = abortController
try {
let response = await chatWithHarnessAgent(
agentId,
trimmed,
abortController.signal,
attachments,
)
// 409 means the server already has an active turn for this
// agent (e.g. a previous tab kicked one off and we're a fresh
// mount that missed the resume window). Attach to it instead of
// double-sending.
if (response.status === 409) {
const body = (await response.json()) as { turnId?: string }
if (body.turnId) {
response = await attachToHarnessTurn(agentId, {
turnId: body.turnId,
signal: abortController.signal,
})
}
}
const response =
options.runtime === 'agent-harness'
? await chatWithHarnessAgent(agentId, trimmed, abortController.signal)
: await chatWithAgent(
agentId,
trimmed,
sessionKeyRef.current || undefined,
historyRef.current,
abortController.signal,
attachments,
)
const responseSessionKey =
response.headers.get('X-Session-Key') ??
response.headers.get('X-Session-Id')
@@ -372,11 +322,6 @@ export function useAgentConversation(
sessionKeyRef.current = responseSessionKey
onSessionKeyChangeRef.current?.(responseSessionKey)
}
const responseTurnId = response.headers.get('X-Turn-Id')
if (responseTurnId) {
turnIdRef.current = responseTurnId
lastSeqRef.current = null
}
if (!response.ok) {
const err = await response.text()
updateCurrentTurnParts((parts) => [
@@ -385,14 +330,19 @@ export function useAgentConversation(
])
return
}
await consumeSSEStream<AgentHarnessStreamEvent>(
response,
(event, meta) => {
if (typeof meta.seq === 'number') lastSeqRef.current = meta.seq
processAgentHarnessStreamEvent(event)
},
abortController.signal,
)
if (options.runtime === 'agent-harness') {
await consumeSSEStream<AgentHarnessStreamEvent>(
response,
processAgentHarnessStreamEvent,
abortController.signal,
)
} else {
await consumeSSEStream<OpenClawStreamEvent>(
response,
processStreamEvent,
abortController.signal,
)
}
} catch (err) {
if (abortController.signal.aborted) return
const msg = err instanceof Error ? err.message : String(err)
@@ -404,35 +354,14 @@ export function useAgentConversation(
if (streamAbortRef.current === abortController) {
streamAbortRef.current = null
}
turnIdRef.current = null
lastSeqRef.current = null
onCompleteRef.current?.()
setStreaming(false)
}
}
/**
* Stop button. The fetch abort only detaches *this* SSE subscriber
* now — the underlying turn would otherwise keep running on the
* server. So we explicitly cancel via the new endpoint, then unwind
* the local stream.
*/
const stop = async () => {
const turnId = turnIdRef.current ?? undefined
const resetConversation = () => {
streamAbortRef.current?.abort()
streamAbortRef.current = null
try {
await cancelHarnessTurn(agentId, {
turnId,
reason: 'user pressed stop',
})
} catch {
// Best-effort — UI already aborted.
}
}
const resetConversation = () => {
void stop()
setTurns([])
setStreaming(false)
}
@@ -442,7 +371,6 @@ export function useAgentConversation(
streaming,
sessionKey: sessionKeyRef.current,
send,
stop,
resetConversation,
}
}

View File

@@ -0,0 +1,95 @@
import { useQuery, useQueryClient } from '@tanstack/react-query'
import { useEffect } from 'react'
import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
export interface AgentOverview {
agentId: string
status: 'working' | 'idle' | 'error' | 'unknown'
latestMessage: string | null
latestMessageAt: number | null
activitySummary: string | null
currentTool: string | null
totalCostUsd: number
sessionCount: number
}
export interface DashboardResponse {
agents: AgentOverview[]
summary: {
totalAgents: number
totalCostUsd: number
}
}
interface StatusEvent {
agentId: string
status: AgentOverview['status']
currentTool: string | null
error: string | null
timestamp: number
}
const DASHBOARD_QUERY_KEY = ['claw', 'dashboard']
export function useAgentDashboard(enabled: boolean) {
const { baseUrl, isLoading: urlLoading } = useAgentServerUrl()
const queryClient = useQueryClient()
const ready = enabled && Boolean(baseUrl) && !urlLoading
// Initial data load + periodic refresh as fallback
const query = useQuery<DashboardResponse>({
queryKey: [...DASHBOARD_QUERY_KEY, baseUrl],
queryFn: async () => {
const url = new URL('/claw/dashboard', baseUrl as string)
const response = await fetch(url.toString())
if (!response.ok) throw new Error('Failed to fetch dashboard')
return response.json()
},
enabled: ready,
})
// SSE subscription for real-time status patches
useEffect(() => {
if (!ready || !baseUrl) return
const streamUrl = new URL('/claw/dashboard/stream', baseUrl)
const eventSource = new EventSource(streamUrl.toString())
eventSource.addEventListener('snapshot', (event) => {
try {
const dashboard = JSON.parse(event.data) as DashboardResponse
queryClient.setQueryData([...DASHBOARD_QUERY_KEY, baseUrl], dashboard)
} catch {}
})
eventSource.addEventListener('status', (event) => {
try {
const status = JSON.parse(event.data) as StatusEvent
queryClient.setQueryData<DashboardResponse>(
[...DASHBOARD_QUERY_KEY, baseUrl],
(prev) => {
if (!prev) return prev
return {
...prev,
agents: prev.agents.map((agent) =>
agent.agentId === status.agentId
? {
...agent,
status: status.status,
currentTool: status.currentTool,
}
: agent,
),
}
},
)
} catch {}
})
return () => {
eventSource.close()
}
}, [ready, baseUrl, queryClient])
return query
}

View File

@@ -0,0 +1,71 @@
import { useInfiniteQuery } from '@tanstack/react-query'
import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
import type { AgentHistoryPageResponse } from './claw-chat-types'
const HISTORY_QUERY_KEY = 'claw-agent-history'
async function fetchClawJson<T>(url: string): Promise<T> {
const response = await fetch(url)
if (!response.ok) {
let message = `Request failed with status ${response.status}`
try {
const body = (await response.json()) as { error?: string }
if (body.error) message = body.error
} catch {}
throw new Error(message)
}
return response.json() as Promise<T>
}
function buildClawUrl(baseUrl: string, path: string): URL {
return new URL(`/claw${path}`, baseUrl)
}
export function useClawChatHistory({
agentId,
sessionKey,
enabled = true,
limit = 50,
}: {
agentId: string
// null lets the server resolve the most recent user-chat session for the
// agent — avoids an extra /session round-trip and the race that came with it.
sessionKey: string | null
enabled?: boolean
limit?: number
}) {
const {
baseUrl,
isLoading: urlLoading,
error: urlError,
} = useAgentServerUrl()
const query = useInfiniteQuery<AgentHistoryPageResponse, Error>({
queryKey: [HISTORY_QUERY_KEY, baseUrl, agentId, sessionKey],
initialPageParam: undefined as string | undefined,
queryFn: async ({ pageParam }) => {
const url = buildClawUrl(baseUrl as string, `/agents/${agentId}/history`)
url.searchParams.set('limit', String(limit))
if (sessionKey) {
url.searchParams.set('sessionKey', sessionKey)
}
if (typeof pageParam === 'string' && pageParam) {
url.searchParams.set('cursor', pageParam)
}
return fetchClawJson<AgentHistoryPageResponse>(url.toString())
},
getNextPageParam: (lastPage) =>
lastPage.page.hasMore ? lastPage.page.cursor : undefined,
enabled: enabled && Boolean(baseUrl) && !urlLoading && Boolean(agentId),
})
return {
...query,
error: query.error ?? urlError,
isLoading: query.isLoading || urlLoading,
}
}

View File

@@ -1,55 +0,0 @@
import { describe, expect, it } from 'bun:test'
import { mapHarnessHistoryPage } from './harness-history-mapper'
describe('mapHarnessHistoryPage', () => {
it('maps rich harness history into chat history items', () => {
const page = mapHarnessHistoryPage({
agentId: 'agent-1',
sessionId: 'main',
items: [
{
id: 'agent:agent-1:main:1',
agentId: 'agent-1',
sessionId: 'main',
role: 'assistant',
text: 'Done.',
createdAt: 1000,
reasoning: { text: 'checking state' },
toolCalls: [
{
toolCallId: 'tool-1',
toolName: 'read_file',
status: 'completed',
input: { path: 'src/index.ts' },
output: 'file contents',
},
],
},
],
})
expect(page.items).toEqual([
{
id: 'agent:agent-1:main:1',
role: 'assistant',
text: 'Done.',
timestamp: 1000,
messageSeq: 1,
sessionKey: 'main',
source: 'user-chat',
reasoning: { text: 'checking state' },
toolCalls: [
{
toolCallId: 'tool-1',
toolName: 'read_file',
label: 'Read file',
subject: 'index.ts',
status: 'completed',
input: { path: 'src/index.ts' },
output: 'file contents',
},
],
},
])
})
})

View File

@@ -1,8 +1,11 @@
import { useQuery } from '@tanstack/react-query'
import type { HarnessAgentHistoryPage } from '@/entrypoints/app/agents/agent-harness-types'
import { fetchHarnessAgentHistory } from '@/entrypoints/app/agents/useAgents'
import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
import type { AgentHistoryPageResponse } from './claw-chat-types'
import { mapHarnessHistoryPage } from './harness-history-mapper'
import type {
AgentHistoryPageResponse,
BrowserOSChatHistoryItem,
} from './claw-chat-types'
const HISTORY_QUERY_KEY = 'harness-agent-history'
@@ -27,3 +30,39 @@ export function useHarnessChatHistory(agentId: string, enabled = true) {
isLoading: query.isLoading || urlLoading,
}
}
function mapHarnessHistoryPage(
page: HarnessAgentHistoryPage,
): AgentHistoryPageResponse {
const items: BrowserOSChatHistoryItem[] = page.items.map((item, index) => ({
id: item.id,
role: item.role,
text: item.text,
timestamp: item.createdAt,
messageSeq: index + 1,
sessionKey: 'main',
source: 'user-chat',
}))
const updatedAt =
page.items.length > 0
? Math.max(...page.items.map((item) => item.createdAt))
: Date.now()
return {
agentId: page.agentId,
sessionKey: 'main',
session: {
key: 'main',
updatedAt,
sessionId: 'main',
agentId: page.agentId,
kind: 'agent-harness',
source: 'user-chat',
},
items,
page: {
hasMore: false,
limit: items.length,
},
}
}

View File

@@ -0,0 +1,271 @@
import { useCallback, useEffect, useRef, useState } from 'react'
import type { OpenClawChatHistoryMessage } from '@/entrypoints/app/agents/useOpenClaw'
import type { UserAttachmentPreview } from '@/lib/agent-conversations/types'
import type { ServerAttachmentPayload } from '@/lib/attachments'
import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
export type OutboundMessageStatus = 'queued' | 'sending' | 'failed'
export interface OutboundMessage {
id: string
text: string
attachments: ServerAttachmentPayload[]
attachmentPreviews: UserAttachmentPreview[]
status: OutboundMessageStatus
error?: string
createdAt: number
}
export interface OutboundQueueEnqueueInput {
text: string
attachments?: ServerAttachmentPayload[]
attachmentPreviews?: UserAttachmentPreview[]
history?: OpenClawChatHistoryMessage[]
}
export interface OutboundQueueApi {
queue: OutboundMessage[]
enqueue(input: OutboundQueueEnqueueInput): void
cancel(id: string): void
retry(id: string): void
}
interface UseOutboundQueueOptions {
agentId: string | null | undefined
sessionKey?: string | null
enabled?: boolean
}
interface ServerQueuedItem {
id: string
status: 'queued' | 'dispatching' | 'failed'
message: string
attachmentsPreview: Array<{
kind: 'image' | 'file'
mediaType: string
name?: string
}>
error?: string
createdAt: number
}
function makeId(): string {
if (typeof crypto !== 'undefined' && crypto.randomUUID) {
return crypto.randomUUID()
}
return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`
}
/**
* Server-backed outbound message queue. The browser is purely a
* projection of server state — closing the tab is safe because the queue
* keeps draining server-side via the OutboundQueueService.
*
* Single id-keyed list: the client generates the queue id and hands it
* to the server in the POST body, so the optimistic row and the SSE
* snapshot reconcile on the same key from frame zero — there is no
* window in which the message renders twice.
*/
export function useOutboundQueue(
options: UseOutboundQueueOptions,
): OutboundQueueApi {
const { agentId, enabled = true, sessionKey } = options
const { baseUrl } = useAgentServerUrl()
const sessionKeyRef = useRef<string | null | undefined>(sessionKey)
sessionKeyRef.current = sessionKey
const [items, setItems] = useState<OutboundMessage[]>([])
// Track which ids the server has confirmed seeing in any SSE snapshot.
// We use this to know whether a missing-from-snapshot id is "drained
// by the server" (drop it) or "still in flight client-side" (keep
// showing the optimistic row).
const everSeenByServerRef = useRef<Set<string>>(new Set())
// Local-only attachment previews, keyed by queue id. Data URLs never
// leave the browser — the SSE feed only carries metadata, so we hold
// them here so the chip strip keeps rendering after server takeover.
const previewMapRef = useRef<Map<string, UserAttachmentPreview[]>>(new Map())
useEffect(() => {
if (!enabled || !baseUrl || !agentId) {
setItems([])
everSeenByServerRef.current = new Set()
previewMapRef.current = new Map()
return
}
let cancelled = false
const url = `${baseUrl}/claw/agents/${encodeURIComponent(agentId)}/queue/stream`
const source = new EventSource(url)
source.onmessage = (event) => {
if (cancelled) return
try {
const parsed = JSON.parse(event.data) as { items: ServerQueuedItem[] }
const snapshotIds = new Set(parsed.items.map((item) => item.id))
for (const id of snapshotIds) everSeenByServerRef.current.add(id)
setItems((prev) => {
const next: OutboundMessage[] = parsed.items.map((item) => ({
id: item.id,
text: item.message,
attachments: [],
attachmentPreviews: previewMapRef.current.get(item.id) ?? [],
status: serverStatusToClient(item.status),
error: item.error,
createdAt: item.createdAt,
}))
// Carry forward any optimistic / failed entries the server
// doesn't know about yet (POST in flight) or has finished
// dispatching but the client wants to keep visible (failed).
const carried = prev.filter((local) => {
if (snapshotIds.has(local.id)) return false
if (everSeenByServerRef.current.has(local.id)) {
// Server saw it before and it's gone now — drained.
previewMapRef.current.delete(local.id)
return false
}
return local.status !== 'failed' || Boolean(local.error)
})
return [...carried, ...next]
})
} catch {
// Malformed event — ignore; next snapshot will recover.
}
}
source.onerror = () => {
// Auto-reconnects; nothing to do here.
}
return () => {
cancelled = true
source.close()
}
}, [baseUrl, agentId, enabled])
const enqueue = useCallback(
(input: OutboundQueueEnqueueInput) => {
if (!enabled || !baseUrl || !agentId) return
const trimmed = input.text.trim()
const attachments = input.attachments ?? []
if (!trimmed && attachments.length === 0) return
const id = makeId()
const previews = input.attachmentPreviews ?? []
previewMapRef.current.set(id, previews)
setItems((prev) => [
...prev,
{
id,
text: trimmed,
attachments,
attachmentPreviews: previews,
status: 'queued',
createdAt: Date.now(),
},
])
void (async () => {
try {
const response = await fetch(
`${baseUrl}/claw/agents/${encodeURIComponent(agentId)}/queue`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
id,
message: trimmed,
attachments: attachments.length > 0 ? attachments : undefined,
sessionKey: sessionKeyRef.current ?? undefined,
history: input.history,
}),
},
)
if (!response.ok) {
const text = await response.text().catch(() => '')
previewMapRef.current.delete(id)
setItems((prev) =>
prev.map((item) =>
item.id === id
? {
...item,
status: 'failed',
error:
text || `Failed to enqueue (status ${response.status})`,
}
: item,
),
)
}
} catch (err) {
// Only mark as failed if the SSE snapshot hasn't already
// taken ownership of the entry (i.e. the request actually
// reached the server).
if (everSeenByServerRef.current.has(id)) return
previewMapRef.current.delete(id)
setItems((prev) =>
prev.map((item) =>
item.id === id
? {
...item,
status: 'failed',
error:
err instanceof Error
? err.message
: 'Failed to enqueue message',
}
: item,
),
)
}
})()
},
[baseUrl, agentId, enabled],
)
const cancel = useCallback(
(id: string) => {
// If the server has never seen this id, just drop it locally.
if (!everSeenByServerRef.current.has(id)) {
previewMapRef.current.delete(id)
setItems((prev) => prev.filter((item) => item.id !== id))
return
}
if (!enabled || !baseUrl || !agentId) return
void fetch(
`${baseUrl}/claw/agents/${encodeURIComponent(agentId)}/queue/${encodeURIComponent(id)}`,
{ method: 'DELETE' },
).catch(() => {})
},
[baseUrl, agentId, enabled],
)
const retry = useCallback(
(id: string) => {
if (!everSeenByServerRef.current.has(id)) {
// Optimistic-only entry, never made it to the server. Reset
// status so the user can press Send again.
setItems((prev) =>
prev.map((item) =>
item.id === id
? { ...item, status: 'queued', error: undefined }
: item,
),
)
return
}
if (!enabled || !baseUrl || !agentId) return
void fetch(
`${baseUrl}/claw/agents/${encodeURIComponent(agentId)}/queue/${encodeURIComponent(id)}/retry`,
{ method: 'POST' },
).catch(() => {})
},
[baseUrl, agentId, enabled],
)
return { queue: items, enqueue, cancel, retry }
}
function serverStatusToClient(
status: ServerQueuedItem['status'],
): OutboundMessageStatus {
if (status === 'dispatching') return 'sending'
if (status === 'failed') return 'failed'
return 'queued'
}

View File

@@ -1,42 +0,0 @@
import { Bot, Cpu, Sparkles } from 'lucide-react'
import type { FC } from 'react'
import type { HarnessAgentAdapter } from './agent-harness-types'
/**
* Single icon component for any adapter the agent rail can render.
* Falls back to a generic bot when the adapter is unknown so future
* adapters land without a code change at the call site.
*/
interface AdapterIconProps {
adapter: HarnessAgentAdapter | 'unknown'
className?: string
}
export const AdapterIcon: FC<AdapterIconProps> = ({ adapter, className }) => {
switch (adapter) {
case 'claude':
// Claude Code — text-based agent, sparkles to evoke the "AI assistant" feel.
return <Sparkles className={className} aria-label="Claude Code" />
case 'codex':
// Codex — code-leaning, CPU mark.
return <Cpu className={className} aria-label="Codex" />
case 'openclaw':
// OpenClaw — bot/automation framing.
return <Bot className={className} aria-label="OpenClaw" />
default:
return <Bot className={className} aria-label="Agent" />
}
}
export function adapterLabel(adapter: HarnessAgentAdapter | 'unknown'): string {
switch (adapter) {
case 'claude':
return 'Claude Code'
case 'codex':
return 'Codex'
case 'openclaw':
return 'OpenClaw'
default:
return 'Agent'
}
}

View File

@@ -1,176 +1,117 @@
import { Loader2 } from 'lucide-react'
import { type FC, useMemo } from 'react'
import { AgentRowCard } from './AgentRowCard'
import { AgentsEmptyState } from './AgentsEmptyState'
import type {
HarnessAdapterDescriptor,
HarnessAgent,
HarnessAgentAdapter,
} from './agent-harness-types'
import type {
AgentAdapterHealth,
AgentRowData,
} from './agent-row/agent-row.types'
import { Bot, Cpu, Loader2, MessageSquare, Plus, Trash2 } from 'lucide-react'
import type { FC } from 'react'
import { Badge } from '@/components/ui/badge'
import { Button } from '@/components/ui/button'
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
import type { AgentListItem } from './agents-page-types'
import type { AgentLiveness } from './LivenessDot'
interface AgentListProps {
agents: AgentListItem[]
/** Optional per-agent activity metadata, keyed by `agentId`. */
activity?: Record<
string,
{ status: AgentLiveness; lastUsedAt: number | null }
>
/** Lookup table from harness id → enriched agent record. */
harnessAgentLookup?: Map<string, HarnessAgent>
/** Adapter catalog (carries per-adapter health). */
adapters: HarnessAdapterDescriptor[]
loading: boolean
deletingAgentKey: string | null
onChatAgent: (agent: AgentListItem) => void
onCreateAgent: () => void
onDeleteAgent: (agent: AgentListItem) => void
onPinToggle: (agent: AgentListItem, next: boolean) => void
}
export const AgentList: FC<AgentListProps> = ({
agents,
activity,
harnessAgentLookup,
adapters,
loading,
deletingAgentKey,
onChatAgent,
onCreateAgent,
onDeleteAgent,
onPinToggle,
}) => {
const adapterHealth = useMemo(() => {
const map = new Map<HarnessAgentAdapter, AgentAdapterHealth>()
for (const adapter of adapters) {
if (adapter.health) {
map.set(adapter.id, {
healthy: adapter.health.healthy,
reason: adapter.health.reason,
})
}
}
return map
}, [adapters])
// Sort: pinned rows first, then most recently used, then never-used
// agents in id-stable order. The gateway's `main` agent stays
// pinned-to-top when never touched so a fresh install has an
// obvious starting point.
const ordered = useMemo(() => {
const withMeta = agents.map((agent) => {
const harness = harnessAgentLookup?.get(agent.agentId)
return {
agent,
pinned: harness?.pinned ?? false,
lastUsedAt: activity?.[agent.agentId]?.lastUsedAt ?? null,
}
})
return withMeta
.sort((a, b) => {
if (a.pinned !== b.pinned) return a.pinned ? -1 : 1
const aSeed = a.agent.agentId === 'main' && a.lastUsedAt === null
const bSeed = b.agent.agentId === 'main' && b.lastUsedAt === null
if (aSeed && !bSeed) return -1
if (!aSeed && bSeed) return 1
const aValue = a.lastUsedAt ?? -Infinity
const bValue = b.lastUsedAt ?? -Infinity
if (aValue !== bValue) return bValue - aValue
return a.agent.agentId.localeCompare(b.agent.agentId)
})
.map((entry) => entry.agent)
}, [activity, agents, harnessAgentLookup])
if (loading && agents.length === 0) {
return (
<div className="flex h-36 items-center justify-center rounded-xl border border-border border-dashed bg-card/50">
<div className="flex h-36 items-center justify-center rounded-lg border border-border/70">
<Loader2 className="size-5 animate-spin text-muted-foreground" />
</div>
)
}
if (agents.length === 0) {
return <AgentsEmptyState onCreateAgent={onCreateAgent} />
return (
<Card>
<CardContent className="flex h-48 flex-col items-center justify-center gap-4 text-center">
<div className="flex size-10 items-center justify-center rounded-lg bg-muted text-muted-foreground">
<Bot className="size-5" />
</div>
<div className="space-y-1">
<h2 className="font-medium text-base">No agents</h2>
<p className="text-muted-foreground text-sm">
Create an OpenClaw, Claude Code, or Codex agent.
</p>
</div>
<Button variant="outline" onClick={onCreateAgent}>
<Plus className="mr-2 size-4" />
New Agent
</Button>
</CardContent>
</Card>
)
}
return (
<div className="grid gap-3">
{ordered.map((agent) => {
const harness = harnessAgentLookup?.get(agent.agentId)
const adapter: HarnessAgentAdapter | 'unknown' =
harness?.adapter ?? inferAdapterFromLabel(agent.runtimeLabel)
const data = buildRowData({
agent,
adapter,
harness,
activity: activity?.[agent.agentId],
adapterHealth:
adapterHealth.get(adapter as HarnessAgentAdapter) ?? null,
})
return (
<AgentRowCard
key={agent.key}
data={data}
deleting={deletingAgentKey === agent.key}
onDelete={onDeleteAgent}
onPinToggle={onPinToggle}
/>
)
})}
{agents.map((agent) => (
<Card key={agent.key} className="rounded-lg border-border/70">
<CardHeader className="flex flex-row items-center justify-between gap-4 py-3">
<div className="flex min-w-0 items-center gap-3">
<div className="flex size-10 shrink-0 items-center justify-center rounded-lg bg-muted text-muted-foreground">
{agent.source === 'openclaw' ? (
<Cpu className="size-5" />
) : (
<Bot className="size-5" />
)}
</div>
<div className="min-w-0">
<CardTitle className="truncate text-base">
{agent.name}
</CardTitle>
<div className="mt-1 flex flex-wrap items-center gap-2 text-muted-foreground text-xs">
<Badge variant="outline" className="rounded-md">
{agent.runtimeLabel}
</Badge>
<span>{agent.modelLabel}</span>
<Badge variant="outline" className="rounded-md">
main
</Badge>
</div>
<p className="mt-1 truncate font-mono text-muted-foreground text-xs">
{agent.detail}
</p>
</div>
</div>
<div className="flex shrink-0 items-center gap-1">
<Button
variant="ghost"
size="sm"
onClick={() => onChatAgent(agent)}
disabled={!agent.canChat}
>
<MessageSquare className="mr-1 size-4" />
Chat
</Button>
{agent.canDelete ? (
<Button
variant="ghost"
size="icon"
title="Delete agent"
onClick={() => onDeleteAgent(agent)}
disabled={deletingAgentKey === agent.key}
>
{deletingAgentKey === agent.key ? (
<Loader2 className="size-4 animate-spin" />
) : (
<Trash2 className="size-4 text-destructive" />
)}
</Button>
) : null}
</div>
</CardHeader>
</Card>
))}
</div>
)
}
function inferAdapterFromLabel(label: string): HarnessAgentAdapter | 'unknown' {
const lower = label?.toLowerCase()
if (lower === 'claude code') return 'claude'
if (lower === 'codex') return 'codex'
if (lower === 'openclaw') return 'openclaw'
return 'unknown'
}
const ZERO_BUCKETS = (): number[] => Array.from({ length: 14 }, () => 0)
function buildRowData(input: {
agent: AgentListItem
adapter: HarnessAgentAdapter | 'unknown'
harness: HarnessAgent | undefined
activity: { status: AgentLiveness; lastUsedAt: number | null } | undefined
adapterHealth: AgentAdapterHealth | null
}): AgentRowData {
const { agent, adapter, harness, activity, adapterHealth } = input
return {
agent,
adapter,
modelLabel: deriveModelLabel(agent, harness),
reasoningEffort: harness?.reasoningEffort ?? null,
status: activity?.status ?? 'unknown',
lastUsedAt: activity?.lastUsedAt ?? harness?.lastUsedAt ?? null,
pinned: harness?.pinned ?? false,
cwd: harness?.cwd ?? null,
lastUserMessage: harness?.lastUserMessage ?? null,
tokens: harness?.tokens ?? null,
turnsByDay: harness?.turnsByDay ?? ZERO_BUCKETS(),
failedByDay: harness?.failedByDay ?? ZERO_BUCKETS(),
lastError: harness?.lastError ?? null,
lastErrorAt: harness?.lastErrorAt ?? null,
activeTurnId: harness?.activeTurnId ?? null,
adapterHealth,
}
}
function deriveModelLabel(
agent: AgentListItem,
harness: HarnessAgent | undefined,
): string | null {
// Prefer the agent rail's modelLabel when meaningful; harness's
// modelId is a stable identifier but the rail's `modelLabel`
// already maps to a friendly display string.
if (agent.modelLabel && agent.modelLabel !== 'default') {
return agent.modelLabel
}
return harness?.modelId ?? null
}

View File

@@ -1,99 +0,0 @@
import type { FC } from 'react'
import { cn } from '@/lib/utils'
import { AgentActions } from './agent-row/AgentActions'
import { AgentErrorPanel } from './agent-row/AgentErrorPanel'
import { AgentLastMessage } from './agent-row/AgentLastMessage'
import { AgentMetaRow } from './agent-row/AgentMetaRow'
import { AgentSummaryChips } from './agent-row/AgentSummaryChips'
import { AgentTile } from './agent-row/AgentTile'
import { AgentTitleRow } from './agent-row/AgentTitleRow'
import type {
AgentRowCallbacks,
AgentRowData,
} from './agent-row/agent-row.types'
interface AgentRowCardProps extends AgentRowCallbacks {
data: AgentRowData
/** Whether THIS agent is mid-delete; renders a spinner in the menu. */
deleting?: boolean
}
/**
* Composition shell for the agent rail. Owns no state; sub-components
* each handle their own micro-state (error-panel collapse, etc.) and
* emit callbacks (delete, pin/unpin) for the page to act on.
*
* The whole card carries state — not just the tile — so the row's
* border subtly tells the user what's going on at a glance:
* working → accent-orange border with a soft glow
* error → destructive border
* idle → muted border, lifts on hover
*/
export const AgentRowCard: FC<AgentRowCardProps> = ({
data,
deleting,
onDelete,
onPinToggle,
}) => {
return (
<div
className={cn(
// Layout-stable hover. No translate, no shadow change — both
// visibly perturb neighbouring rows. Only the border tint
// shifts on hover, and the rail's vertical rhythm stays
// exactly the same in every state.
'group rounded-xl border bg-card p-4 shadow-sm transition-colors',
data.status === 'working'
? 'border-[var(--accent-orange)]/40'
: data.status === 'error'
? 'border-destructive/40'
: 'border-border hover:border-[var(--accent-orange)]/30',
)}
>
<div className="flex items-start gap-4">
<AgentTile
adapter={data.adapter}
status={data.status}
lastUsedAt={data.lastUsedAt}
/>
<div className="min-w-0 flex-1">
<AgentTitleRow
agent={data.agent}
status={data.status}
pinned={data.pinned}
turnsByDay={data.turnsByDay}
failedByDay={data.failedByDay}
onPinToggle={(next) => onPinToggle(data.agent, next)}
/>
<AgentSummaryChips
adapter={data.adapter}
modelLabel={data.modelLabel}
reasoningEffort={data.reasoningEffort}
adapterHealth={data.adapterHealth}
/>
<AgentLastMessage message={data.lastUserMessage} />
<AgentMetaRow lastUsedAt={data.lastUsedAt} tokens={data.tokens} />
{data.status === 'error' && data.lastError && (
<AgentErrorPanel
agentId={data.agent.agentId}
message={data.lastError}
errorAt={data.lastErrorAt}
/>
)}
</div>
<AgentActions
agent={data.agent}
activeTurnId={data.activeTurnId}
deleting={deleting}
onDelete={onDelete}
/>
</div>
</div>
)
}

View File

@@ -1,32 +0,0 @@
import { Bot, Plus } from 'lucide-react'
import type { FC } from 'react'
import { Button } from '@/components/ui/button'
interface AgentsEmptyStateProps {
onCreateAgent: () => void
}
export const AgentsEmptyState: FC<AgentsEmptyStateProps> = ({
onCreateAgent,
}) => {
return (
<div className="rounded-xl border border-border border-dashed bg-card/50 p-12 text-center">
<div className="mx-auto mb-4 flex h-12 w-12 items-center justify-center rounded-xl bg-[var(--accent-orange)]/10">
<Bot className="h-6 w-6 text-[var(--accent-orange)]" />
</div>
<h3 className="mb-1 font-semibold">No agents yet</h3>
<p className="mx-auto mb-4 max-w-sm text-muted-foreground text-sm">
Spin up an OpenClaw, Claude Code, or Codex agent to chat with, schedule,
or run in the background.
</p>
<Button
onClick={onCreateAgent}
variant="outline"
className="border-[var(--accent-orange)] bg-[var(--accent-orange)]/10 text-[var(--accent-orange)] hover:bg-[var(--accent-orange)]/20 hover:text-[var(--accent-orange)]"
>
<Plus className="mr-1.5 h-4 w-4" />
Create your first agent
</Button>
</div>
)
}

View File

@@ -1,41 +0,0 @@
import { Bot, Plus } from 'lucide-react'
import type { FC } from 'react'
import { Button } from '@/components/ui/button'
interface AgentsHeaderProps {
onCreateAgent: () => void
}
/**
* Mirrors the visual shape of `SoulHeader` and `ScheduledTasksHeader`
* so the page reads as part of the same family. Loose lifecycle
* controls that used to sit next to the title moved into
* `GatewayStatusBar` — they're OpenClaw-specific and don't apply to
* Claude/Codex agents.
*/
export const AgentsHeader: FC<AgentsHeaderProps> = ({ onCreateAgent }) => {
return (
<div className="rounded-xl border border-border bg-card p-6 shadow-sm transition-all hover:shadow-md">
<div className="flex items-start gap-4">
<div className="flex h-12 w-12 shrink-0 items-center justify-center rounded-xl bg-[var(--accent-orange)]/10">
<Bot className="h-6 w-6 text-[var(--accent-orange)]" />
</div>
<div className="flex-1">
<h2 className="mb-1 font-semibold text-xl">Agents</h2>
<p className="text-muted-foreground text-sm">
OpenClaw, Claude Code, and Codex agents chat, schedule, and run
them in the background.
</p>
</div>
<Button
onClick={onCreateAgent}
className="border-[var(--accent-orange)] bg-[var(--accent-orange)]/10 text-[var(--accent-orange)] hover:bg-[var(--accent-orange)]/20 hover:text-[var(--accent-orange)]"
variant="outline"
>
<Plus className="mr-1.5 h-4 w-4" />
New Agent
</Button>
</div>
</div>
)
}

View File

@@ -3,9 +3,8 @@ import { type FC, useMemo, useState } from 'react'
import { useNavigate } from 'react-router'
import { useLlmProviders } from '@/lib/llm-providers/useLlmProviders'
import { AgentList } from './AgentList'
import { AgentsHeader } from './AgentsHeader'
import { AgentTerminal } from './AgentTerminal'
import type { HarnessAgent, HarnessAgentAdapter } from './agent-harness-types'
import type { HarnessAgentAdapter } from './agent-harness-types'
import { createAgentPageActions } from './agents-page-actions'
import {
useDefaultAgentName,
@@ -30,9 +29,9 @@ import {
toHarnessListItem,
toOpenClawListItem,
} from './agents-page-utils'
import { GatewayStatusBar } from './GatewayStatusBar'
import { NewAgentDialog } from './NewAgentDialog'
import {
AgentsPageHeader,
ControlPlaneAlert,
GatewayStateCards,
InlineErrorAlert,
@@ -44,45 +43,51 @@ import {
useCreateHarnessAgent,
useDeleteHarnessAgent,
useHarnessAgents,
useUpdateHarnessAgent,
} from './useAgents'
import { useOpenClawAgents, useOpenClawMutations } from './useOpenClaw'
import {
useOpenClawAgents,
useOpenClawMutations,
useOpenClawStatus,
} from './useOpenClaw'
export const AgentsPage: FC = () => {
const navigate = useNavigate()
const {
status,
loading: statusLoading,
error: statusError,
refetch: refetchStatus,
} = useOpenClawStatus()
const { providers, defaultProviderId } = useLlmProviders()
const {
adapters,
loading: adaptersLoading,
error: adaptersError,
refetch: refetchAdapters,
} = useAgentAdapters()
// The harness listing now carries the gateway lifecycle snapshot
// alongside the agents — one polling source for everything the
// agents page renders. The legacy `/claw/status` poll is dead from
// this surface; the chat-panel layout still uses it for now.
const {
harnessAgents,
gateway: status,
loading: harnessAgentsLoading,
error: harnessAgentsError,
} = useHarnessAgents()
const openClawAgentsEnabled =
status?.status === 'running' && status.controlPlaneStatus === 'connected'
const {
agents: openClawAgents,
loading: openClawAgentsLoading,
error: openClawAgentsError,
refetch: refetchOpenClawAgents,
} = useOpenClawAgents(openClawAgentsEnabled)
const {
harnessAgents,
loading: harnessAgentsLoading,
error: harnessAgentsError,
refetch: refetchHarnessAgents,
} = useHarnessAgents()
const createHarnessAgent = useCreateHarnessAgent()
const deleteHarnessAgent = useDeleteHarnessAgent()
const updateHarnessAgent = useUpdateHarnessAgent()
const {
setupOpenClaw,
createAgent: createOpenClawAgent,
deleteAgent: deleteOpenClawAgent,
startOpenClaw,
stopOpenClaw,
restartOpenClaw,
reconnectOpenClaw,
actionInProgress,
@@ -153,68 +158,42 @@ export const AgentsPage: FC = () => {
openClawAgentsEnabled,
openClawAgents,
)
const agentListItems = useMemo(() => {
// Dual-created OpenClaw agents (and the backfilled `main`/orphans
// post Step 9) live in both `/claw/agents` and `/agents` under the
// same id. Prefer the harness entry — it carries adapter/model/
// reasoning/lastUsedAt/status that the chat path actually uses —
// and drop the legacy duplicate so the rail doesn't show every
// OpenClaw agent twice.
const harnessIds = new Set(harnessAgents.map((agent) => agent.id))
const dedupedOpenClawAgents = visibleOpenClawAgents.filter(
(agent) => !harnessIds.has(agent.agentId),
)
return [
...dedupedOpenClawAgents.map((agent) =>
const agentListItems = useMemo(
() => [
...visibleOpenClawAgents.map((agent) =>
toOpenClawListItem(agent, openClawManageable),
),
...harnessAgents.map(toHarnessListItem),
]
}, [harnessAgents, openClawManageable, visibleOpenClawAgents])
// Lookup map so AgentList can render adapter chips, reasoning, etc.
// Computed up here to keep all hooks above the early returns below.
const harnessAgentLookup = useMemo(() => {
const map = new Map<string, HarnessAgent>()
for (const agent of harnessAgents) map.set(agent.id, agent)
return map
}, [harnessAgents])
// Activity map keyed by agentId. Sourced from the harness listing's
// server-side enrichment (`status` + `lastUsedAt`). Legacy gateway
// agents that don't have a harness record yet (rare post-backfill)
// simply miss from the map and render with the default `unknown`
// dot until reconciliation picks them up.
const agentActivity = useMemo(() => {
const map: Record<
string,
{
status: 'working' | 'idle' | 'asleep' | 'error'
lastUsedAt: number | null
}
> = {}
for (const agent of harnessAgents) {
if (!agent.status) continue
map[agent.id] = {
status: agent.status,
lastUsedAt: agent.lastUsedAt ?? null,
}
}
return map
}, [harnessAgents])
],
[harnessAgents, openClawManageable, visibleOpenClawAgents],
)
const inlineError = getInlineError({
lifecyclePending,
pageError,
statusError,
openClawAgentsError,
adaptersError,
harnessAgentsError,
})
const agentsLoading = getAgentsLoading({
statusLoading,
adaptersLoading,
harnessAgentsLoading,
openClawAgentsEnabled,
openClawAgentsLoading,
})
const creatingAgent = creatingOpenClawAgent || createHarnessAgent.isPending
const deletingAgent = deletingOpenClawAgent || deleteHarnessAgent.isPending
const refreshAll = async () => {
await Promise.all([
refetchStatus(),
refetchAdapters(),
refetchHarnessAgents(),
openClawAgentsEnabled ? refetchOpenClawAgents() : Promise.resolve(),
])
}
const handleHarnessAdapterChange = (adapter: HarnessAgentAdapter) => {
const descriptor = adapters.find((entry) => entry.id === adapter)
setHarnessAdapterId(adapter)
@@ -260,9 +239,7 @@ export const AgentsPage: FC = () => {
)
}
// First-paint loader: until the harness listing has resolved at
// least once we don't know which adapters / agents to render.
if (harnessAgentsLoading && !status) {
if (statusLoading && !status) {
return (
<div className="flex items-center justify-center py-20">
<Loader2 className="size-6 animate-spin text-muted-foreground" />
@@ -278,18 +255,27 @@ export const AgentsPage: FC = () => {
const recoveryDetail = status ? getRecoveryDetail(status) : null
const controlPlaneCopy = getControlPlaneCopyForStatus(status)
// Bar only makes sense when the gateway is meaningfully alive AND
// there's at least one OpenClaw agent in the merged list. Hide it
// for Claude/Codex-only setups so the page stays uncluttered.
const showGatewayStatusBar =
status?.status === 'running' &&
(visibleOpenClawAgents.length > 0 ||
harnessAgents.some((agent) => agent.adapter === 'openclaw'))
return (
<div className="min-h-full bg-background px-6 py-8">
<div className="fade-in slide-in-from-bottom-5 mx-auto flex w-full max-w-5xl animate-in flex-col gap-6 duration-500">
<AgentsHeader onCreateAgent={() => setCreateOpen(true)} />
<div className="mx-auto flex w-full max-w-5xl flex-col gap-6">
<AgentsPageHeader
actionInProgress={actionInProgress}
controlPlaneBusy={gatewayUiState.controlPlaneBusy}
reconnecting={reconnecting}
status={status}
onCreateAgent={() => setCreateOpen(true)}
onOpenTerminal={() => setShowTerminal(true)}
onReconnect={() => {
void runWithPageErrorHandling(reconnectOpenClaw)
}}
onRefresh={() => void refreshAll()}
onRestart={() => {
void runWithPageErrorHandling(restartOpenClaw)
}}
onStop={() => {
void runWithPageErrorHandling(stopOpenClaw)
}}
/>
{lifecycleBanner ? <LifecycleAlert message={lifecycleBanner} /> : null}
@@ -329,39 +315,15 @@ export const AgentsPage: FC = () => {
}}
/>
{showGatewayStatusBar ? (
<GatewayStatusBar
status={status}
actionInProgress={actionInProgress}
onOpenTerminal={() => setShowTerminal(true)}
onRestart={() => {
void runWithPageErrorHandling(restartOpenClaw)
}}
/>
) : null}
<AgentList
agents={agentListItems}
activity={agentActivity}
harnessAgentLookup={harnessAgentLookup}
adapters={adapters}
loading={agentsLoading}
deletingAgentKey={deletingAgent ? deletingAgentKey : null}
onChatAgent={(agent) => navigate(`/agents/${agent.agentId}`)}
onCreateAgent={() => setCreateOpen(true)}
onDeleteAgent={(agent) => {
void handleDelete(agent)
}}
onPinToggle={(agent, next) => {
// Optimistic mutation; harness-only — gateway-original
// OpenClaw entries are gated server-side via the harness
// backfill, so we only fire when the row maps to a
// harness agent record.
if (!harnessAgentLookup.has(agent.agentId)) return
updateHarnessAgent.mutate({
agentId: agent.agentId,
patch: { pinned: next },
})
}}
/>
<SetupOpenClawDialog

View File

@@ -1,206 +0,0 @@
import { Loader2, RotateCcw, Terminal } from 'lucide-react'
import type { FC, ReactNode } from 'react'
import { Badge } from '@/components/ui/badge'
import { Button } from '@/components/ui/button'
import { Separator } from '@/components/ui/separator'
import {
Tooltip,
TooltipContent,
TooltipProvider,
TooltipTrigger,
} from '@/components/ui/tooltip'
import { cn } from '@/lib/utils'
import type { OpenClawStatus } from './useOpenClaw'
interface GatewayStatusBarProps {
status: OpenClawStatus | null
/** Disabled while a gateway lifecycle mutation is mid-flight. */
actionInProgress: boolean
onOpenTerminal: () => void
onRestart: () => void
}
/**
* Compact one-line status bar for the OpenClaw gateway. Renders the
* lifecycle pills (Running / Control plane connected) plus a Terminal
* escape hatch and a Restart Gateway action. Lives between the page
* header and the agent list when at least one OpenClaw agent is in
* the merged list; collapses to nothing for Claude/Codex-only setups.
*
* Status is sourced from `GET /agents`'s `gateway` field — the agents
* page no longer polls `/claw/status` directly. One endpoint, one
* 5s interval, no duplicate state.
*/
export const GatewayStatusBar: FC<GatewayStatusBarProps> = ({
status,
actionInProgress,
onOpenTerminal,
onRestart,
}) => {
if (!status) return null
const runningPill = pillForRuntimeStatus(status.status)
const controlPlanePill = pillForControlPlane(status.controlPlaneStatus)
return (
<div className="rounded-xl border border-border bg-card px-4 py-3 shadow-sm">
<div className="flex items-center gap-3 text-sm">
<span className="font-medium text-muted-foreground">
OpenClaw gateway
</span>
<Badge
variant={runningPill.variant}
className={cn('gap-1.5', runningPill.className)}
>
<span
className={cn(
'inline-block h-1.5 w-1.5 rounded-full',
runningPill.dot,
)}
/>
{runningPill.label}
</Badge>
<Badge
variant={controlPlanePill.variant}
className={cn('gap-1.5', controlPlanePill.className)}
>
<span
className={cn(
'inline-block h-1.5 w-1.5 rounded-full',
controlPlanePill.dot,
)}
/>
{controlPlanePill.label}
</Badge>
<Separator orientation="vertical" className="h-4" />
<WithTooltip label="Open a shell into the OpenClaw gateway container for raw CLI access (config edits, session inspection).">
<Button variant="ghost" size="sm" onClick={onOpenTerminal}>
<Terminal className="mr-1.5 h-3.5 w-3.5" />
Terminal
</Button>
</WithTooltip>
<WithTooltip label="Restart the OpenClaw gateway. Useful when the gateway is stuck or after editing provider config.">
<Button
variant="ghost"
size="sm"
onClick={onRestart}
disabled={actionInProgress}
className="ml-auto"
>
{actionInProgress ? (
<Loader2 className="mr-1.5 h-3.5 w-3.5 animate-spin" />
) : (
<RotateCcw className="mr-1.5 h-3.5 w-3.5" />
)}
Restart Gateway
</Button>
</WithTooltip>
</div>
</div>
)
}
const WithTooltip: FC<{ label: string; children: ReactNode }> = ({
label,
children,
}) => (
<TooltipProvider delayDuration={250}>
<Tooltip>
<TooltipTrigger asChild>{children}</TooltipTrigger>
<TooltipContent side="bottom" className="max-w-xs text-xs">
{label}
</TooltipContent>
</Tooltip>
</TooltipProvider>
)
type PillKind = {
variant: 'default' | 'secondary' | 'outline' | 'destructive'
label: string
dot: string
className?: string
}
function pillForRuntimeStatus(status: OpenClawStatus['status']): PillKind {
switch (status) {
case 'running':
return {
variant: 'secondary',
label: 'Running',
dot: 'bg-emerald-500',
className: 'bg-emerald-50 text-emerald-900 hover:bg-emerald-50',
}
case 'starting':
return {
variant: 'secondary',
label: 'Starting',
dot: 'bg-amber-500 animate-pulse',
className: 'bg-amber-50 text-amber-900 hover:bg-amber-50',
}
case 'stopped':
return {
variant: 'outline',
label: 'Stopped',
dot: 'bg-muted-foreground/40',
}
case 'error':
return {
variant: 'destructive',
label: 'Error',
dot: 'bg-destructive-foreground',
}
default:
return {
variant: 'outline',
label: 'Unknown',
dot: 'bg-muted-foreground/40',
}
}
}
function pillForControlPlane(
status: OpenClawStatus['controlPlaneStatus'],
): PillKind {
switch (status) {
case 'connected':
return {
variant: 'secondary',
label: 'Control plane connected',
dot: 'bg-emerald-500',
className: 'bg-emerald-50 text-emerald-900 hover:bg-emerald-50',
}
case 'connecting':
return {
variant: 'secondary',
label: 'Connecting',
dot: 'bg-amber-500 animate-pulse',
className: 'bg-amber-50 text-amber-900 hover:bg-amber-50',
}
case 'reconnecting':
return {
variant: 'secondary',
label: 'Reconnecting',
dot: 'bg-amber-500 animate-pulse',
className: 'bg-amber-50 text-amber-900 hover:bg-amber-50',
}
case 'recovering':
return {
variant: 'secondary',
label: 'Recovering',
dot: 'bg-amber-500 animate-pulse',
className: 'bg-amber-50 text-amber-900 hover:bg-amber-50',
}
case 'failed':
return {
variant: 'destructive',
label: 'Needs attention',
dot: 'bg-destructive-foreground',
}
default:
return {
variant: 'outline',
label: 'Disconnected',
dot: 'bg-muted-foreground/40',
}
}
}

View File

@@ -1,83 +0,0 @@
import type { FC } from 'react'
import {
Tooltip,
TooltipContent,
TooltipProvider,
TooltipTrigger,
} from '@/components/ui/tooltip'
import { cn } from '@/lib/utils'
export type AgentLiveness = 'working' | 'idle' | 'asleep' | 'error' | 'unknown'
interface LivenessDotProps {
status: AgentLiveness
/**
* Optional human-friendly secondary line, e.g. "Idle for 4 min" or
* "Asleep — no activity for 22 min". When absent the tooltip just
* reads the status label.
*/
detail?: string
className?: string
}
const VARIANT: Record<
AgentLiveness,
{ dot: string; ring: string; label: string }
> = {
working: {
// Animated amber pulse + soft halo so the eye catches an active
// agent in a long list without the dot screaming for attention.
dot: 'bg-amber-500 animate-pulse',
ring: 'ring-2 ring-amber-200',
label: 'Working on a turn',
},
idle: {
dot: 'bg-emerald-500',
ring: 'ring-2 ring-emerald-100',
label: 'Idle',
},
asleep: {
dot: 'bg-muted-foreground/40',
ring: 'ring-2 ring-muted',
label: 'Asleep',
},
error: {
dot: 'bg-destructive',
ring: 'ring-2 ring-destructive/30',
label: 'Attention',
},
unknown: {
dot: 'bg-muted-foreground/30',
ring: 'ring-2 ring-muted',
label: 'Status unknown',
},
}
export const LivenessDot: FC<LivenessDotProps> = ({
status,
detail,
className,
}) => {
const variant = VARIANT[status]
return (
<TooltipProvider delayDuration={150}>
<Tooltip>
<TooltipTrigger asChild>
<span
role="img"
aria-label={detail ?? variant.label}
className={cn(
'inline-block h-3 w-3 rounded-full',
variant.dot,
variant.ring,
className,
)}
/>
</TooltipTrigger>
<TooltipContent side="right" className="text-xs">
{detail ?? variant.label}
</TooltipContent>
</Tooltip>
</TooltipProvider>
)
}

View File

@@ -154,6 +154,7 @@ export const NewAgentDialog: FC<NewAgentDialogProps> = ({
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="openclaw">OpenClaw</SelectItem>
{adapters.map((adapter) => (
<SelectItem key={adapter.id} value={adapter.id}>
{adapter.name}

View File

@@ -1,107 +0,0 @@
import type { AgentListItem } from './agents-page-types'
import type { AgentLiveness } from './LivenessDot'
/**
* Display rules for the redesigned agent rows. Pure helpers — no React,
* no API calls — so they're trivial to unit-test and the row card stays
* focused on layout.
*/
const UUID_PATTERN =
/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
const OC_UUID_PATTERN =
/^oc-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i
/**
* The agent rail used to render whatever the gateway returned for `name`.
* Post-migration that's frequently the agent's UUID — readable to nobody.
* Prefer the explicit `name` when it differs meaningfully from the id;
* otherwise fall back to a short prefix users can recognize on second
* glance.
*/
export function displayName(agent: AgentListItem): string {
const name = agent.name?.trim()
const id = agent.agentId
if (!name || name === id) {
if (OC_UUID_PATTERN.test(id)) return id.slice(0, 11) // "oc-XXXXXXXX"
if (UUID_PATTERN.test(id)) return id.slice(0, 8)
return id
}
return name
}
export function canDelete(agent: AgentListItem): boolean {
// The gateway's protected `main` agent must not be deletable. The
// server enforces this too, but disabling the menu item avoids users
// hitting an opaque 400.
if (agent.agentId === 'main') return false
return agent.canDelete
}
/**
* Rename will be wired to a future `PATCH /agents/:id` endpoint. The
* legacy `/claw/agents` create flow named the agent on the gateway via
* the `name` field but the field isn't editable post-create today.
*/
export function canRename(_agent: AgentListItem): boolean {
return false
}
/**
* The detail line carries the agent's workspace path. The `detail`
* field on AgentListItem already holds it for OpenClaw entries
* (`/home/node/.openclaw/workspace-...`); for harness agents it's the
* synthetic `<adapter>:main` marker that's not informative — hide it.
*/
export function workspaceLabel(agent: AgentListItem): string | null {
if (!agent.detail) return null
if (/^(claude|codex|openclaw):main$/.test(agent.detail)) return null
return agent.detail
}
const ONE_MINUTE = 60_000
const ONE_HOUR = 60 * ONE_MINUTE
const ONE_DAY = 24 * ONE_HOUR
/**
* Lightweight relative-time formatter. We don't want to drag in
* `dayjs/relativeTime` just for a few labels.
*/
export function formatRelativeTime(epochMs: number | null): string {
if (epochMs === null || !Number.isFinite(epochMs)) return 'never'
const diff = Math.max(0, Date.now() - epochMs)
if (diff < ONE_MINUTE) return 'just now'
if (diff < ONE_HOUR) {
const m = Math.floor(diff / ONE_MINUTE)
return `${m} min ago`
}
if (diff < ONE_DAY) {
const h = Math.floor(diff / ONE_HOUR)
return h === 1 ? '1 hr ago' : `${h} hr ago`
}
const d = Math.floor(diff / ONE_DAY)
return d === 1 ? '1 day ago' : `${d} days ago`
}
/**
* Tooltip-friendly description of a row's current liveness state.
* Returns `undefined` when the state has nothing extra to add (e.g.
* `unknown` with no timestamp).
*/
export function livenessDetail(
status: AgentLiveness,
lastUsedAt: number | null | undefined,
): string | undefined {
if (lastUsedAt == null) return undefined
const diffMin = Math.floor((Date.now() - lastUsedAt) / 60_000)
if (status === 'idle') return `Idle for ${Math.max(0, diffMin)} min`
if (status === 'asleep') {
if (diffMin < 60) return `Asleep — quiet for ${diffMin} min`
const hr = Math.floor(diffMin / 60)
return `Asleep — quiet for ${hr} hr`
}
if (status === 'working') return 'Working on a turn'
if (status === 'error') return 'Attention — last turn failed'
return undefined
}

View File

@@ -1,6 +1,6 @@
import type { AgentEntry } from './useOpenClaw'
export type HarnessAgentAdapter = 'claude' | 'codex' | 'openclaw'
export type HarnessAgentAdapter = 'claude' | 'codex'
export type AgentHarnessStreamEvent =
| {
@@ -33,8 +33,6 @@ export type AgentHarnessStreamEvent =
code?: string
}
export type HarnessAgentLiveness = 'working' | 'idle' | 'asleep' | 'error'
export interface HarnessAgent {
id: string
name: string
@@ -45,54 +43,6 @@ export interface HarnessAgent {
sessionKey: string
createdAt: number
updatedAt: number
/**
* Server-derived liveness state. When the listing endpoint hasn't
* been enriched yet (older deployments) this is undefined and the UI
* falls back to `unknown`.
*/
status?: HarnessAgentLiveness
/**
* Wall-clock ms of the last persisted turn. `null` for never-used
* agents. Drives the recency sort and the "Last used X min ago" copy.
*/
lastUsedAt?: number | null
/** Pinned agents float to the top of the list. Defaults to `false`. */
pinned?: boolean
/** First non-blank line of the most recent user message; null if none. */
lastUserMessage?: string | null
/** Working directory the agent runs in; null when no session record yet. */
cwd?: string | null
/** Cumulative + 7-day rolling token usage; null when no record. */
tokens?: {
last7d: { input: number; output: number; requestCount: number }
cumulative: { input: number; output: number }
} | null
turnsByDay?: number[]
failedByDay?: number[]
lastError?: string | null
lastErrorAt?: number | null
/** When non-null, an in-flight turn this row can be resumed from. */
activeTurnId?: string | null
/** Persistent FIFO queue of messages waiting for this agent. */
queue?: HarnessQueuedMessage[]
}
export interface HarnessQueuedMessageAttachment {
mediaType: string
data: string
}
export interface HarnessQueuedMessage {
id: string
createdAt: number
message: string
attachments?: ReadonlyArray<HarnessQueuedMessageAttachment>
}
export interface HarnessAdapterHealth {
healthy: boolean
reason?: string
checkedAt: number
}
export interface HarnessAdapterDescriptor {
@@ -103,7 +53,6 @@ export interface HarnessAdapterDescriptor {
modelControl: 'runtime-supported' | 'best-effort'
models: Array<{ id: string; label: string; recommended?: boolean }>
reasoningEfforts: Array<{ id: string; label: string; recommended?: boolean }>
health?: HarnessAdapterHealth
}
export interface CreateHarnessAgentInput {
@@ -113,36 +62,19 @@ export interface CreateHarnessAgentInput {
reasoningEffort?: string
}
export interface HarnessHistoryReasoning {
text: string
durationMs?: number
}
export interface HarnessHistoryToolCall {
toolCallId?: string
toolName: string
status: 'pending' | 'running' | 'completed' | 'failed'
input?: unknown
output?: unknown
error?: string
durationMs?: number
}
export interface HarnessHistoryEntry {
export interface HarnessTranscriptEntry {
id: string
agentId: string
sessionId: 'main'
role: 'user' | 'assistant'
text: string
createdAt: number
reasoning?: HarnessHistoryReasoning
toolCalls?: HarnessHistoryToolCall[]
}
export interface HarnessAgentHistoryPage {
agentId: string
sessionId: 'main'
items: HarnessHistoryEntry[]
items: HarnessTranscriptEntry[]
}
export function mapHarnessAgentToEntry(agent: HarnessAgent): AgentEntry {

View File

@@ -1,160 +0,0 @@
import {
Copy,
Loader2,
MessageSquare,
MoreHorizontal,
Pencil,
RotateCcw,
Trash2,
} from 'lucide-react'
import type { FC } from 'react'
import { useNavigate } from 'react-router'
import { toast } from 'sonner'
import { Button } from '@/components/ui/button'
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuSeparator,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu'
import {
Tooltip,
TooltipContent,
TooltipProvider,
TooltipTrigger,
} from '@/components/ui/tooltip'
import {
canDelete as canDeleteAgent,
canRename as canRenameAgent,
displayName,
} from '../agent-display.helpers'
import type { AgentListItem } from '../agents-page-types'
interface AgentActionsProps {
agent: AgentListItem
activeTurnId: string | null
deleting?: boolean
onDelete: (agent: AgentListItem) => void
}
/**
* Single primary CTA per row: `Resume` (filled, accent-orange, with a
* pulsing dot) when an active turn exists; otherwise `Chat` (outline).
* Both navigate to the same place — the chat hook auto-attaches via
* `/chat/active` when there's a live turn — but the row signals which
* action the user is actually taking.
*/
export const AgentActions: FC<AgentActionsProps> = ({
agent,
activeTurnId,
deleting,
onDelete,
}) => {
const navigate = useNavigate()
const allowDelete = canDeleteAgent(agent)
const allowRename = canRenameAgent(agent)
const handleChat = () => navigate(`/agents/${agent.agentId}`)
const handleCopyId = async () => {
try {
await navigator.clipboard.writeText(agent.agentId)
toast.success('Agent id copied')
} catch {
toast.error('Could not copy agent id')
}
}
return (
<div className="flex shrink-0 items-center gap-1.5">
{activeTurnId ? (
<Button
variant="default"
size="sm"
onClick={handleChat}
className="gap-2 bg-[var(--accent-orange)] text-white shadow-sm hover:bg-[var(--accent-orange)]/90"
>
<span className="relative flex size-2">
<span className="absolute inline-flex h-full w-full animate-ping rounded-full bg-white/70 opacity-75" />
<span className="relative inline-flex size-2 rounded-full bg-white" />
</span>
Resume
</Button>
) : (
<Button variant="outline" size="sm" onClick={handleChat}>
<MessageSquare className="mr-1.5 size-3" />
Chat
</Button>
)}
<DropdownMenu>
<DropdownMenuTrigger asChild>
<Button
variant="ghost"
size="icon"
aria-label={`More actions for ${displayName(agent)}`}
className="size-8 text-muted-foreground hover:text-foreground"
>
<MoreHorizontal className="size-4" />
</Button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-44">
<DropdownMenuItem onSelect={() => void handleCopyId()}>
<Copy className="mr-2 size-3.5" />
Copy id
</DropdownMenuItem>
<ComingSoonItem
icon={Pencil}
label="Rename"
disabled={!allowRename}
/>
<ComingSoonItem icon={RotateCcw} label="Reset history" disabled />
<DropdownMenuSeparator />
<DropdownMenuItem
onSelect={() => onDelete(agent)}
disabled={!allowDelete || deleting}
className="text-destructive focus:text-destructive"
>
{deleting ? (
<Loader2 className="mr-2 size-3.5 animate-spin" />
) : (
<Trash2 className="mr-2 size-3.5" />
)}
Delete
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
</div>
)
}
interface ComingSoonItemProps {
icon: typeof Pencil
label: string
disabled: boolean
}
const ComingSoonItem: FC<ComingSoonItemProps> = ({
icon: Icon,
label,
disabled,
}) => {
const item = (
<DropdownMenuItem disabled className="text-muted-foreground">
<Icon className="mr-2 size-3.5" />
{label}
</DropdownMenuItem>
)
if (!disabled) return item
return (
<TooltipProvider delayDuration={300}>
<Tooltip>
<TooltipTrigger asChild>
<span className="block w-full">{item}</span>
</TooltipTrigger>
<TooltipContent side="left" className="text-xs">
{label} coming soon
</TooltipContent>
</Tooltip>
</TooltipProvider>
)
}

View File

@@ -1,96 +0,0 @@
import { AlertTriangle, ChevronDown } from 'lucide-react'
import { type FC, useEffect, useState } from 'react'
import { Button } from '@/components/ui/button'
import {
Collapsible,
CollapsibleContent,
CollapsibleTrigger,
} from '@/components/ui/collapsible'
import {
HoverCard,
HoverCardContent,
HoverCardTrigger,
} from '@/components/ui/hover-card'
import { cn } from '@/lib/utils'
import { truncate } from './agent-row.helpers'
interface AgentErrorPanelProps {
agentId: string
message: string
errorAt: number | null
}
const STORAGE_PREFIX = 'agent-row:lastErrorSeenAt:'
const PREVIEW_CHARS = 200
export const AgentErrorPanel: FC<AgentErrorPanelProps> = ({
agentId,
message,
errorAt,
}) => {
const storageKey = `${STORAGE_PREFIX}${agentId}`
// Open if we've never seen this `errorAt` for this agent. Once the
// user collapses the panel (or refreshes after seeing it), we mark
// it seen so it doesn't re-pop on every poll.
const [open, setOpen] = useState<boolean>(() => {
if (typeof window === 'undefined' || !errorAt) return true
const seen = Number(window.localStorage.getItem(storageKey) ?? 0)
return !Number.isFinite(seen) || errorAt > seen
})
useEffect(() => {
if (!open && errorAt && typeof window !== 'undefined') {
window.localStorage.setItem(storageKey, String(errorAt))
}
}, [open, errorAt, storageKey])
const preview = truncate(message, PREVIEW_CHARS)
const truncated = preview.length < message.length
return (
<Collapsible open={open} onOpenChange={setOpen} className="mt-3">
<div className="flex items-center justify-between rounded-md border border-destructive/30 bg-destructive/5 px-3 py-2">
<div className="flex items-center gap-2 font-medium text-destructive text-xs">
<AlertTriangle className="size-3.5" />
Last error
</div>
<CollapsibleTrigger asChild>
<Button
variant="ghost"
size="sm"
className="h-6 px-2 text-muted-foreground"
>
<span className="text-xs">{open ? 'hide' : 'show'}</span>
<ChevronDown
className={cn(
'ml-1 size-3 transition-transform',
open && 'rotate-180',
)}
/>
</Button>
</CollapsibleTrigger>
</div>
<CollapsibleContent>
<div className="mt-1 rounded-md border-destructive/30 border-x border-b bg-destructive/5 px-3 pb-2 text-xs">
{truncated ? (
<HoverCard openDelay={300}>
<HoverCardTrigger asChild>
<span className="cursor-default font-mono text-foreground/80">
{preview}
</span>
</HoverCardTrigger>
<HoverCardContent
side="bottom"
className="max-w-md whitespace-pre-wrap font-mono text-xs"
>
{message}
</HoverCardContent>
</HoverCard>
) : (
<span className="font-mono text-foreground/80">{message}</span>
)}
</div>
</CollapsibleContent>
</Collapsible>
)
}

View File

@@ -1,35 +0,0 @@
import { Quote } from 'lucide-react'
import type { FC } from 'react'
import { firstNonBlankLine, truncate } from './agent-row.helpers'
interface AgentLastMessageProps {
message: string | null
}
const PREVIEW_CHARS = 110
/**
* Inline preview of the most recent user message. Renders as a quoted,
* italic line so the row reads like a conversation snippet rather than
* a label-and-value pair. No hover-card — opening the agent's chat is
* the canonical way to read the full message.
*/
export const AgentLastMessage: FC<AgentLastMessageProps> = ({ message }) => {
if (!message) {
return (
<p className="mt-1 text-muted-foreground/70 text-xs italic">
No messages yet start a chat
</p>
)
}
const preview = truncate(firstNonBlankLine(message), PREVIEW_CHARS)
return (
<p className="mt-1.5 flex items-start gap-1.5 text-foreground/85 text-sm italic leading-snug">
<Quote
className="mt-1 size-3 shrink-0 text-muted-foreground/60"
aria-hidden
/>
<span className="truncate">{preview}</span>
</p>
)
}

View File

@@ -1,37 +0,0 @@
import type { FC } from 'react'
import { formatRelativeTime } from '../agent-display.helpers'
import { AgentTokenSummary } from './AgentTokenSummary'
import type { AgentTokenUsage } from './agent-row.types'
interface AgentMetaRowProps {
lastUsedAt: number | null
tokens: AgentTokenUsage | null
}
/**
* Bottom-of-row meta line. Intentionally sparse — last activity time
* and lifetime tokens. CWD is no longer surfaced here because the path
* the server happens to be running from isn't actionable; if a future
* surface needs the cwd (chat panel, debug view) it reads from the
* listing payload directly.
*/
export const AgentMetaRow: FC<AgentMetaRowProps> = ({ lastUsedAt, tokens }) => {
const lastUsedLabel = formatRelativeTime(lastUsedAt)
const tokensTotal =
(tokens?.cumulative.input ?? 0) + (tokens?.cumulative.output ?? 0)
const showTokens = tokensTotal > 0
return (
<div className="mt-2 flex flex-wrap items-center gap-x-2 text-muted-foreground text-xs">
<span>{lastUsedLabel}</span>
{showTokens && (
<>
<span aria-hidden className="text-muted-foreground/50">
·
</span>
<AgentTokenSummary tokens={tokens} />
</>
)}
</div>
)
}

View File

@@ -1,92 +0,0 @@
import type { FC } from 'react'
import {
HoverCard,
HoverCardContent,
HoverCardTrigger,
} from '@/components/ui/hover-card'
import { cn } from '@/lib/utils'
import { formatLocalDate, ROW_BAR_COUNT } from './agent-row.helpers'
interface AgentSparklineProps {
/** 14 entries, oldest → newest. Today's bucket is the last index. */
turnsByDay: number[]
/** Same length, same order. Failed turns counted separately. */
failedByDay: number[]
className?: string
}
const MIN_BAR_HEIGHT_PX = 2
const MAX_BAR_HEIGHT_PX = 18
export const AgentSparkline: FC<AgentSparklineProps> = ({
turnsByDay,
failedByDay,
className,
}) => {
if (turnsByDay.length === 0 || turnsByDay.every((n) => n === 0)) return null
const max = Math.max(1, ...turnsByDay)
return (
<HoverCard openDelay={250}>
<HoverCardTrigger asChild>
<div
role="img"
aria-label={`Last ${ROW_BAR_COUNT} days of activity`}
className={cn('flex h-5 items-end gap-px', className)}
>
{turnsByDay.map((count, idx) => {
const ratio = count / max
const height = Math.max(
MIN_BAR_HEIGHT_PX,
Math.round(ratio * MAX_BAR_HEIGHT_PX),
)
const isToday = idx === ROW_BAR_COUNT - 1
const failed = failedByDay[idx] ?? 0
return (
<div
// biome-ignore lint/suspicious/noArrayIndexKey: fixed-length sparkline buckets keyed by day position
key={`bar-${idx}`}
className={cn(
'w-1.5 rounded-sm',
count === 0
? 'bg-muted-foreground/15'
: failed > 0
? 'bg-destructive/50'
: 'bg-[var(--accent-orange)]/50',
isToday && 'ring-1 ring-foreground/30',
)}
style={{ height }}
/>
)
})}
</div>
</HoverCardTrigger>
<HoverCardContent side="left" className="w-56 text-xs">
<div className="mb-2 font-medium text-sm">Last 14 days</div>
<ul className="space-y-0.5">
{turnsByDay.map((count, idx) => {
const failed = failedByDay[idx] ?? 0
const dayLabel = formatLocalDate(idx)
return (
<li
// biome-ignore lint/suspicious/noArrayIndexKey: fixed-length list keyed by day position
key={`day-${idx}`}
className="flex items-center justify-between text-muted-foreground"
>
<span>{dayLabel}</span>
<span>
{count}
{failed > 0 && (
<span className="ml-1 text-destructive">
({failed} failed)
</span>
)}
</span>
</li>
)
})}
</ul>
</HoverCardContent>
</HoverCard>
)
}

View File

@@ -1,71 +0,0 @@
import { TriangleAlert } from 'lucide-react'
import type { FC } from 'react'
import { Badge } from '@/components/ui/badge'
import {
HoverCard,
HoverCardContent,
HoverCardTrigger,
} from '@/components/ui/hover-card'
import { cn } from '@/lib/utils'
import { adapterLabel } from '../AdapterIcon'
import type { HarnessAgentAdapter } from '../agent-harness-types'
import type { AgentAdapterHealth } from './agent-row.types'
interface AgentSummaryChipsProps {
adapter: HarnessAgentAdapter | 'unknown'
modelLabel: string | null
reasoningEffort: string | null
/** When unhealthy, the adapter label dims and a warning chip appears. */
adapterHealth: AgentAdapterHealth | null
}
/**
* Adapter / model / reasoning summary line. Always rendered (so OpenClaw
* rows that fall back to defaults still expose what they're set up to do)
* and surfaces adapter-health *only when unhealthy* — keeping the calm
* default state silent and reserving visual noise for things the user
* needs to act on.
*/
export const AgentSummaryChips: FC<AgentSummaryChipsProps> = ({
adapter,
modelLabel,
reasoningEffort,
adapterHealth,
}) => {
const parts = [adapterLabel(adapter)]
if (modelLabel) parts.push(modelLabel)
if (reasoningEffort) parts.push(reasoningEffort)
const unhealthy = adapterHealth?.healthy === false
return (
<div
className={cn(
'flex items-center gap-1.5 text-muted-foreground text-xs',
unhealthy && 'text-muted-foreground/70',
)}
>
<span className="truncate">{parts.join(' · ')}</span>
{unhealthy && adapterHealth && (
<HoverCard openDelay={200}>
<HoverCardTrigger asChild>
<Badge
variant="outline"
className="h-5 cursor-default gap-1 border-amber-500/40 bg-amber-50 px-1.5 text-amber-900 hover:bg-amber-50"
>
<TriangleAlert className="size-2.5" />
<span className="font-normal">Unavailable</span>
</Badge>
</HoverCardTrigger>
<HoverCardContent side="right" className="w-72 text-sm">
<div className="font-medium">
{adapterLabel(adapter)} CLI not available
</div>
<div className="mt-1 text-muted-foreground text-xs">
{adapterHealth.reason ??
'Adapter binary missing on $PATH. Install it from the adapter docs to use this agent.'}
</div>
</HoverCardContent>
</HoverCard>
)}
</div>
)
}

View File

@@ -1,37 +0,0 @@
import type { FC } from 'react'
import { cn } from '@/lib/utils'
import { AdapterIcon } from '../AdapterIcon'
import { livenessDetail } from '../agent-display.helpers'
import type { HarnessAgentAdapter } from '../agent-harness-types'
import { type AgentLiveness, LivenessDot } from '../LivenessDot'
export interface AgentTileProps {
adapter: HarnessAgentAdapter | 'unknown'
status: AgentLiveness
lastUsedAt: number | null
}
/**
* Adapter glyph + a single liveness dot. Adapter health is no longer
* surfaced here — it lives as an inline pill inside `AgentSummaryChips`
* so the user isn't asked to disambiguate two dots on the same tile.
*/
export const AgentTile: FC<AgentTileProps> = ({
adapter,
status,
lastUsedAt,
}) => (
<div className="relative shrink-0">
<div className="flex h-12 w-12 items-center justify-center rounded-xl bg-muted text-muted-foreground">
<AdapterIcon adapter={adapter} className="h-6 w-6" />
</div>
<LivenessDot
status={status}
detail={livenessDetail(status, lastUsedAt)}
className={cn(
'absolute -right-0.5 -bottom-0.5',
status === 'working' && 'animate-pulse',
)}
/>
</div>
)

View File

@@ -1,55 +0,0 @@
import type { FC } from 'react'
import { Badge } from '@/components/ui/badge'
import { displayName } from '../agent-display.helpers'
import type { AgentListItem } from '../agents-page-types'
import type { AgentLiveness } from '../LivenessDot'
import { AgentSparkline } from './AgentSparkline'
import { PinToggle } from './PinToggle'
interface AgentTitleRowProps {
agent: AgentListItem
status: AgentLiveness
pinned: boolean
turnsByDay: number[]
failedByDay: number[]
onPinToggle: (next: boolean) => void
}
/**
* Title strip: name + status badge + (right-aligned) sparkline. The
* pin toggle sits trailing the title so the title always flushes left
* regardless of pin state — moving the star left of the title indents
* the row's first line off-axis from the model/preview/meta lines
* below it. When unpinned and not hovered, the toggle is removed from
* layout entirely so it reserves no space at all.
*/
export const AgentTitleRow: FC<AgentTitleRowProps> = ({
agent,
status,
pinned,
turnsByDay,
failedByDay,
onPinToggle,
}) => (
<div className="mb-1 flex items-center gap-2">
<span className="truncate font-semibold">{displayName(agent)}</span>
{status === 'working' && (
<Badge
variant="secondary"
className="bg-amber-50 text-amber-900 hover:bg-amber-50"
>
Working
</Badge>
)}
{status === 'asleep' && (
<Badge variant="outline" className="text-muted-foreground">
Asleep
</Badge>
)}
{status === 'error' && <Badge variant="destructive">Attention</Badge>}
<PinToggle pinned={pinned} onToggle={onPinToggle} />
<div className="ml-auto">
<AgentSparkline turnsByDay={turnsByDay} failedByDay={failedByDay} />
</div>
</div>
)

View File

@@ -1,63 +0,0 @@
import type { FC } from 'react'
import {
HoverCard,
HoverCardContent,
HoverCardTrigger,
} from '@/components/ui/hover-card'
import { Progress } from '@/components/ui/progress'
import { formatTokens } from './agent-row.helpers'
import type { AgentTokenUsage } from './agent-row.types'
interface AgentTokenSummaryProps {
tokens: AgentTokenUsage | null
}
/**
* Inline token total + a HoverCard breakdown. Surfaces lifetime tokens
* (the only window we can compute reliably from the session record).
* Per-window stats land in a follow-up once the activity ledger ships.
*/
export const AgentTokenSummary: FC<AgentTokenSummaryProps> = ({ tokens }) => {
if (!tokens) return null
const { input, output } = tokens.cumulative
const total = input + output
if (total === 0) return null
const inputPct = (input / total) * 100
return (
<HoverCard openDelay={200}>
<HoverCardTrigger asChild>
<span className="cursor-default text-muted-foreground tabular-nums transition-colors hover:text-foreground">
{formatTokens(total)} tokens
</span>
</HoverCardTrigger>
<HoverCardContent side="top" align="end" className="w-72 text-sm">
<div className="mb-3 flex items-center justify-between">
<span className="font-medium">Lifetime tokens</span>
<span className="text-muted-foreground text-xs tabular-nums">
{formatTokens(total)} total
</span>
</div>
<div className="space-y-2">
<div className="flex items-center justify-between text-xs">
<span className="text-muted-foreground">Input</span>
<span className="tabular-nums">{formatTokens(input)}</span>
</div>
<Progress value={inputPct} className="h-1.5" />
<div className="mt-2 flex items-center justify-between text-xs">
<span className="text-muted-foreground">Output</span>
<span className="tabular-nums">{formatTokens(output)}</span>
</div>
<Progress value={100 - inputPct} className="h-1.5" />
</div>
<p className="mt-3 border-t pt-2 text-muted-foreground text-xs leading-snug">
Cumulative across every turn this agent has run. Per-window stats
arrive in a future release.
</p>
</HoverCardContent>
</HoverCard>
)
}

View File

@@ -1,60 +0,0 @@
import { Star } from 'lucide-react'
import type { FC } from 'react'
import { Button } from '@/components/ui/button'
import {
Tooltip,
TooltipContent,
TooltipProvider,
TooltipTrigger,
} from '@/components/ui/tooltip'
import { cn } from '@/lib/utils'
interface PinToggleProps {
pinned: boolean
onToggle: (next: boolean) => void
}
/**
* Trailing star toggle. The button is *always rendered* — only its
* opacity changes between pinned/unpinned/hover states — so the title
* row's height is constant. Hiding the slot via `display: none` would
* collapse the row's vertical metrics on hover and shift every card
* below in the rail.
*
* Placement is trailing the title (after the status badge) so the
* title itself flushes left regardless of pin state — leading the
* row with the star would indent the title relative to the model /
* preview / meta lines beneath it.
*/
export const PinToggle: FC<PinToggleProps> = ({ pinned, onToggle }) => (
<TooltipProvider delayDuration={300}>
<Tooltip>
<TooltipTrigger asChild>
<Button
variant="ghost"
size="icon"
className={cn(
'size-6 text-muted-foreground transition-opacity hover:text-foreground',
pinned ? 'opacity-100' : 'opacity-0 group-hover:opacity-100',
)}
aria-pressed={pinned}
aria-label={pinned ? 'Unpin agent' : 'Pin agent'}
onClick={(event) => {
event.stopPropagation()
onToggle(!pinned)
}}
>
<Star
className={cn(
'size-3.5',
pinned && 'fill-amber-400 text-amber-500',
)}
/>
</Button>
</TooltipTrigger>
<TooltipContent side="top" className="text-xs">
{pinned ? 'Unpin' : 'Pin to top'}
</TooltipContent>
</Tooltip>
</TooltipProvider>
)

View File

@@ -1,73 +0,0 @@
import { describe, expect, it } from 'bun:test'
import {
firstNonBlankLine,
formatLocalDate,
formatTokens,
ROW_BAR_COUNT,
truncate,
} from './agent-row.helpers'
describe('formatTokens', () => {
it('renders zero / NaN as "0"', () => {
expect(formatTokens(0)).toBe('0')
expect(formatTokens(Number.NaN)).toBe('0')
})
it('renders sub-1K as integer', () => {
expect(formatTokens(142)).toBe('142')
})
it('renders K with one decimal under 10', () => {
expect(formatTokens(8_400)).toBe('8.4K')
})
it('drops the decimal at >=10K', () => {
expect(formatTokens(120_000)).toBe('120K')
})
it('renders M with one decimal under 10', () => {
expect(formatTokens(1_200_000)).toBe('1.2M')
})
})
describe('firstNonBlankLine', () => {
it('returns the first non-blank line', () => {
expect(firstNonBlankLine('\n\nhello\nworld')).toBe('hello')
})
it('skips USER_QUERY envelope tags', () => {
expect(firstNonBlankLine('<USER_QUERY>\nfix tests\n</USER_QUERY>')).toBe(
'fix tests',
)
})
it('falls back to the trimmed input when nothing matches', () => {
expect(firstNonBlankLine(' single ')).toBe('single')
})
})
describe('truncate', () => {
it('returns input unchanged when within limit', () => {
expect(truncate('hello', 10)).toBe('hello')
})
it('appends an ellipsis when over limit', () => {
expect(truncate('hello world', 6)).toBe('hello…')
})
})
describe('formatLocalDate', () => {
const today = new Date('2026-04-30T12:00:00Z')
it('labels today and yesterday explicitly', () => {
expect(formatLocalDate(ROW_BAR_COUNT - 1, today)).toBe('today')
expect(formatLocalDate(ROW_BAR_COUNT - 2, today)).toBe('yesterday')
})
it('returns a "Mon D" format for older days', () => {
const label = formatLocalDate(0, today)
// "Apr 17" or "Apr 17," depending on locale; just assert it
// contains a month abbreviation and a day number.
expect(label).toMatch(/[A-Za-z]+ \d+/)
})
})

View File

@@ -1,64 +0,0 @@
/**
* Pure formatters consumed by row sub-components. Kept distinct from
* `agent-display.helpers.ts` (page-level helpers) so the row internals
* have an obvious single home.
*/
const TOKEN_THRESHOLDS: Array<[number, string]> = [
[1_000_000, 'M'],
[1_000, 'K'],
]
/** `1.2M`, `820K`, `8.4K`, `142`, `0`. */
export function formatTokens(n: number): string {
if (!Number.isFinite(n) || n <= 0) return '0'
for (const [threshold, suffix] of TOKEN_THRESHOLDS) {
if (n >= threshold) {
const value = n / threshold
const decimal = value < 10 ? value.toFixed(1) : value.toFixed(0)
return `${decimal}${suffix}`
}
}
return String(Math.round(n))
}
const USER_QUERY_OPEN = /^<USER_QUERY>$/i
const USER_QUERY_CLOSE = /^<\/USER_QUERY>$/i
/**
* First non-blank line, with the BrowserOS user-system-prompt
* `<USER_QUERY>` envelope tags stripped so previews don't show
* structural noise.
*/
export function firstNonBlankLine(text: string): string {
const lines = text.split('\n').map((line) => line.trim())
for (const line of lines) {
if (!line) continue
if (USER_QUERY_OPEN.test(line) || USER_QUERY_CLOSE.test(line)) continue
return line
}
return text.trim()
}
export function truncate(text: string, max: number): string {
if (text.length <= max) return text
return `${text.slice(0, max - 1).trimEnd()}`
}
const SPARKLINE_DAYS = 14
/**
* "today" / "yesterday" / "Apr 17" — given an index 0..13 from
* oldest → newest. `today` defaults to `new Date()` so callers don't
* have to thread a clock through.
*/
export function formatLocalDate(idx: number, today: Date = new Date()): string {
if (idx === SPARKLINE_DAYS - 1) return 'today'
if (idx === SPARKLINE_DAYS - 2) return 'yesterday'
const offset = SPARKLINE_DAYS - 1 - idx
const date = new Date(today)
date.setDate(date.getDate() - offset)
return date.toLocaleDateString(undefined, { month: 'short', day: 'numeric' })
}
export const ROW_BAR_COUNT = SPARKLINE_DAYS

View File

@@ -1,51 +0,0 @@
import type { HarnessAgentAdapter } from '../agent-harness-types'
import type { AgentListItem } from '../agents-page-types'
import type { AgentLiveness } from '../LivenessDot'
/**
* Window-bounded token usage. Server returns `null` when no session
* record exists yet for the agent.
*/
export interface AgentTokenUsage {
last7d: { input: number; output: number; requestCount: number }
cumulative: { input: number; output: number }
}
export interface AgentAdapterHealth {
healthy: boolean
reason?: string
}
/**
* Everything an `AgentRowCard` needs to render. Mirrors the shape
* `useHarnessAgents` exposes; the page assembles one entry per row in
* `AgentList` and passes it down. Sub-components only see slices of
* this object — no prop drilling beyond two levels.
*/
export interface AgentRowData {
agent: AgentListItem
adapter: HarnessAgentAdapter | 'unknown'
modelLabel: string | null
reasoningEffort: string | null
status: AgentLiveness
lastUsedAt: number | null
pinned: boolean
cwd: string | null
lastUserMessage: string | null
tokens: AgentTokenUsage | null
/** 14 entries, oldest → newest. Today is the last index. */
turnsByDay: number[]
/** Same length and ordering as `turnsByDay`. */
failedByDay: number[]
lastError: string | null
lastErrorAt: number | null
/** When non-null, an in-flight turn this row can be resumed from. */
activeTurnId: string | null
/** Adapter-level health, shared across rows for the same adapter. */
adapterHealth: AgentAdapterHealth | null
}
export interface AgentRowCallbacks {
onDelete: (agent: AgentListItem) => void
onPinToggle: (agent: AgentListItem, next: boolean) => void
}

View File

@@ -138,20 +138,24 @@ export function getVisibleOpenClawAgents(
}
export function getAgentsLoading(input: {
statusLoading: boolean
adaptersLoading: boolean
harnessAgentsLoading: boolean
openClawAgentsEnabled: boolean
openClawAgentsLoading: boolean
}): boolean {
return (
input.statusLoading ||
input.adaptersLoading ||
input.harnessAgentsLoading ||
input.openClawAgentsLoading
(input.openClawAgentsEnabled && input.openClawAgentsLoading)
)
}
export function getInlineError(input: {
lifecyclePending: boolean
pageError: string | null
statusError: Error | null
openClawAgentsError: Error | null
adaptersError: Error | null
harnessAgentsError: Error | null
@@ -159,6 +163,7 @@ export function getInlineError(input: {
if (input.lifecyclePending) return null
return (
input.pageError ??
input.statusError?.message ??
input.openClawAgentsError?.message ??
input.adaptersError?.message ??
input.harnessAgentsError?.message ??

View File

@@ -8,20 +8,8 @@ import {
type HarnessAdapterDescriptor,
type HarnessAgent,
type HarnessAgentHistoryPage,
type HarnessQueuedMessage,
mapHarnessAgentToEntry,
} from './agent-harness-types'
import type { OpenClawStatus } from './useOpenClaw'
/**
* Combined response shape of `GET /agents`. The page polls this once
* and consumes both fields, replacing the dedicated `/claw/status`
* poll the previous design carried.
*/
interface HarnessAgentsResponse {
agents: HarnessAgent[]
gateway: OpenClawStatus | null
}
export type { AgentHarnessStreamEvent }
@@ -81,31 +69,21 @@ export function useHarnessAgents(enabled = true) {
error: urlError,
} = useAgentServerUrl()
const query = useQuery<HarnessAgentsResponse, Error>({
const query = useQuery<HarnessAgent[], Error>({
queryKey: [AGENT_QUERY_KEYS.agents, baseUrl],
queryFn: async () => {
const data = await agentsFetch<HarnessAgentsResponse>(
const data = await agentsFetch<{ agents: HarnessAgent[] }>(
baseUrl as string,
'/',
)
return {
agents: data.agents ?? [],
gateway: data.gateway ?? null,
}
return data.agents ?? []
},
enabled: Boolean(baseUrl) && !urlLoading && enabled,
// Poll every 5s so the per-agent liveness state (working / idle /
// asleep / error) and last-used timestamps stay fresh without a
// websocket. `refetchIntervalInBackground: false` lets a hidden
// tab go quiet — react-query's default, made explicit.
refetchInterval: 5_000,
refetchIntervalInBackground: false,
})
return {
agents: (query.data?.agents ?? []).map(mapHarnessAgentToEntry),
harnessAgents: query.data?.agents ?? [],
gateway: query.data?.gateway ?? null,
agents: (query.data ?? []).map(mapHarnessAgentToEntry),
harnessAgents: query.data ?? [],
loading: query.isLoading || urlLoading,
error: query.error ?? urlError,
refetch: query.refetch,
@@ -136,63 +114,6 @@ export function useCreateHarnessAgent() {
})
}
/**
* Apply a partial update to a harness agent. Used by the pin-toggle
* star and (eventually) the inline rename UI. Optimistically writes
* the patch into the listing query cache so the row updates instantly,
* then rolls back if the server rejects the change.
*/
export function useUpdateHarnessAgent() {
const { baseUrl, isLoading: urlLoading } = useAgentServerUrl()
const queryClient = useQueryClient()
return useMutation({
mutationFn: async (input: {
agentId: string
patch: { name?: string; pinned?: boolean }
}) => {
if (!baseUrl || urlLoading) {
throw new Error('BrowserOS agent server URL is not ready')
}
const data = await agentsFetch<{ agent: HarnessAgent }>(
baseUrl,
`/${encodeURIComponent(input.agentId)}`,
{
method: 'PATCH',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(input.patch),
},
)
return data.agent
},
onMutate: async ({ agentId, patch }) => {
const queryKey = [AGENT_QUERY_KEYS.agents, baseUrl]
await queryClient.cancelQueries({ queryKey })
const previous = queryClient.getQueryData<HarnessAgentsResponse>(queryKey)
if (!previous) return { previous: undefined }
queryClient.setQueryData<HarnessAgentsResponse>(queryKey, {
...previous,
agents: previous.agents.map((agent) =>
agent.id === agentId ? { ...agent, ...patch } : agent,
),
})
return { previous }
},
onError: (_err, _vars, context) => {
if (!context?.previous) return
queryClient.setQueryData(
[AGENT_QUERY_KEYS.agents, baseUrl],
context.previous,
)
},
onSettled: async () => {
await queryClient.invalidateQueries({
queryKey: [AGENT_QUERY_KEYS.agents],
})
},
})
}
export function useDeleteHarnessAgent() {
const { baseUrl, isLoading: urlLoading } = useAgentServerUrl()
const queryClient = useQueryClient()
@@ -220,97 +141,16 @@ export async function chatWithHarnessAgent(
agentId: string,
message: string,
signal?: AbortSignal,
attachments?: ReadonlyArray<unknown>,
): Promise<Response> {
const baseUrl = await getAgentServerUrl()
return fetch(`${baseUrl}/agents/${encodeURIComponent(agentId)}/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message,
...(attachments && attachments.length > 0 ? { attachments } : {}),
}),
body: JSON.stringify({ message }),
signal,
})
}
/**
* Subscribe to an existing turn (the server's `ActiveTurnRegistry`
* decoupled the turn lifecycle from POST /chat). `lastSeq` lets the
* client resume after a disconnect — the server replays buffered
* frames with seq > lastSeq, then tails new ones.
*/
export async function attachToHarnessTurn(
agentId: string,
options: { turnId?: string; lastSeq?: number; signal?: AbortSignal } = {},
): Promise<Response> {
const baseUrl = await getAgentServerUrl()
const url = new URL(
`${baseUrl}/agents/${encodeURIComponent(agentId)}/chat/stream`,
)
if (options.turnId) url.searchParams.set('turnId', options.turnId)
const headers: Record<string, string> = {}
if (typeof options.lastSeq === 'number') {
headers['Last-Event-ID'] = String(options.lastSeq)
}
return fetch(url.toString(), { signal: options.signal, headers })
}
export interface HarnessActiveTurnInfo {
turnId: string
agentId: string
sessionId: 'main'
status: 'running' | 'done' | 'error' | 'cancelled'
lastSeq: number
startedAt: number
endedAt?: number
/** User message that kicked off the turn; null when not captured. */
prompt: string | null
}
/**
* Discover an in-flight turn for an agent. Used on chat mount so the
* UI reattaches instead of starting a new turn after a tab/refresh.
*/
export async function fetchActiveHarnessTurn(
agentId: string,
): Promise<HarnessActiveTurnInfo | null> {
const baseUrl = await getAgentServerUrl()
const response = await fetch(
`${baseUrl}/agents/${encodeURIComponent(agentId)}/chat/active`,
)
if (!response.ok) return null
const body = (await response.json()) as {
active: HarnessActiveTurnInfo | null
}
return body.active
}
/**
* Stop button. Hits the explicit cancel endpoint instead of just
* aborting the fetch (which now only detaches *this* subscriber from
* the buffer; the underlying turn would otherwise keep running).
*/
export async function cancelHarnessTurn(
agentId: string,
options: { turnId?: string; reason?: string } = {},
): Promise<{ cancelled: boolean }> {
const baseUrl = await getAgentServerUrl()
const response = await fetch(
`${baseUrl}/agents/${encodeURIComponent(agentId)}/chat/cancel`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
...(options.turnId ? { turnId: options.turnId } : {}),
...(options.reason ? { reason: options.reason } : {}),
}),
},
)
if (!response.ok) return { cancelled: false }
return (await response.json()) as { cancelled: boolean }
}
export async function fetchHarnessAgentHistory(
agentId: string,
): Promise<HarnessAgentHistoryPage> {
@@ -320,145 +160,3 @@ export async function fetchHarnessAgentHistory(
`/${encodeURIComponent(agentId)}/sessions/main/history`,
)
}
export interface EnqueueMessageInput {
message: string
attachments?: ReadonlyArray<unknown>
}
export async function enqueueHarnessMessage(
agentId: string,
input: EnqueueMessageInput,
): Promise<HarnessQueuedMessage> {
const baseUrl = await getAgentServerUrl()
const response = await fetch(
`${baseUrl}/agents/${encodeURIComponent(agentId)}/queue`,
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: input.message,
...(input.attachments && input.attachments.length > 0
? { attachments: input.attachments }
: {}),
}),
},
)
if (!response.ok) {
let message = `Request failed with status ${response.status}`
try {
const body = (await response.json()) as { error?: string }
if (body.error) message = body.error
} catch {}
throw new Error(message)
}
const body = (await response.json()) as { queued: HarnessQueuedMessage }
return body.queued
}
export async function removeHarnessQueuedMessage(
agentId: string,
messageId: string,
): Promise<{ removed: boolean }> {
const baseUrl = await getAgentServerUrl()
const response = await fetch(
`${baseUrl}/agents/${encodeURIComponent(agentId)}/queue/${encodeURIComponent(
messageId,
)}`,
{ method: 'DELETE' },
)
if (!response.ok) return { removed: false }
return (await response.json()) as { removed: boolean }
}
/**
* Optimistic enqueue: writes the new queued message into the listing
* cache immediately so the queue panel reflects the change without
* waiting for the next poll. Rolls back if the server rejects.
*/
export function useEnqueueHarnessMessage() {
const { baseUrl } = useAgentServerUrl()
const queryClient = useQueryClient()
return useMutation({
mutationFn: async (input: { agentId: string } & EnqueueMessageInput) =>
enqueueHarnessMessage(input.agentId, input),
onMutate: async (input) => {
const queryKey = [AGENT_QUERY_KEYS.agents, baseUrl]
await queryClient.cancelQueries({ queryKey })
const previous = queryClient.getQueryData<HarnessAgentsResponse>(queryKey)
if (!previous) return { previous: undefined }
const optimistic: HarnessQueuedMessage = {
id: `optimistic-${Math.random().toString(36).slice(2, 10)}`,
createdAt: Date.now(),
message: input.message,
}
queryClient.setQueryData<HarnessAgentsResponse>(queryKey, {
...previous,
agents: previous.agents.map((agent) =>
agent.id === input.agentId
? { ...agent, queue: [...(agent.queue ?? []), optimistic] }
: agent,
),
})
return { previous }
},
onError: (_err, _vars, context) => {
if (!context?.previous) return
queryClient.setQueryData(
[AGENT_QUERY_KEYS.agents, baseUrl],
context.previous,
)
},
onSettled: async () => {
await queryClient.invalidateQueries({
queryKey: [AGENT_QUERY_KEYS.agents],
})
},
})
}
/**
* Optimistic queue removal mirror of `useEnqueueHarnessMessage`.
*/
export function useRemoveHarnessQueuedMessage() {
const { baseUrl } = useAgentServerUrl()
const queryClient = useQueryClient()
return useMutation({
mutationFn: async (input: { agentId: string; messageId: string }) =>
removeHarnessQueuedMessage(input.agentId, input.messageId),
onMutate: async (input) => {
const queryKey = [AGENT_QUERY_KEYS.agents, baseUrl]
await queryClient.cancelQueries({ queryKey })
const previous = queryClient.getQueryData<HarnessAgentsResponse>(queryKey)
if (!previous) return { previous: undefined }
queryClient.setQueryData<HarnessAgentsResponse>(queryKey, {
...previous,
agents: previous.agents.map((agent) =>
agent.id === input.agentId
? {
...agent,
queue: (agent.queue ?? []).filter(
(entry) => entry.id !== input.messageId,
),
}
: agent,
),
})
return { previous }
},
onError: (_err, _vars, context) => {
if (!context?.previous) return
queryClient.setQueryData(
[AGENT_QUERY_KEYS.agents, baseUrl],
context.previous,
)
},
onSettled: async () => {
await queryClient.invalidateQueries({
queryKey: [AGENT_QUERY_KEYS.agents],
})
},
})
}

View File

@@ -1,4 +1,5 @@
import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
import { getAgentServerUrl } from '@/lib/browseros/helpers'
import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
export interface AgentEntry {
@@ -318,3 +319,25 @@ export function buildChatHistoryFromTurns(
return messages
}
export async function chatWithAgent(
agentId: string,
message: string,
sessionKey?: string,
history: OpenClawChatHistoryMessage[] = [],
signal?: AbortSignal,
attachments?: ReadonlyArray<unknown>,
): Promise<Response> {
const baseUrl = await getAgentServerUrl()
return fetch(`${baseUrl}/claw/agents/${agentId}/chat`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message,
sessionKey,
history,
...(attachments && attachments.length > 0 ? { attachments } : {}),
}),
signal,
})
}

View File

@@ -164,17 +164,9 @@ export const NewScheduledTaskDialog: FC<NewScheduledTaskDialogProps> = ({
const resolvedProvider: Provider | null = (() => {
const id = selectedProviderId ?? defaultProviderId
const found = providers.find((p) => p.id === id)
if (found) {
return {
kind: 'llm' as const,
id: found.id,
name: found.name,
type: found.type,
}
}
if (found) return { id: found.id, name: found.name, type: found.type }
if (providers[0])
return {
kind: 'llm' as const,
id: providers[0].id,
name: providers[0].name,
type: providers[0].type,
@@ -183,7 +175,6 @@ export const NewScheduledTaskDialog: FC<NewScheduledTaskDialogProps> = ({
})()
const providerOptions: Provider[] = providers.map((p) => ({
kind: 'llm',
id: p.id,
name: p.name,
type: p.type,

View File

@@ -1,4 +1,4 @@
import { Bot, Github, History, Plus, SettingsIcon } from 'lucide-react'
import { Github, History, Plus, SettingsIcon } from 'lucide-react'
import type { FC } from 'react'
import { Link, useLocation, useNavigate } from 'react-router'
import { ChatProviderSelector } from '@/components/chat/ChatProviderSelector'
@@ -64,9 +64,7 @@ export const ChatHeader: FC<ChatHeaderProps> = ({
className="group relative inline-flex cursor-pointer items-center gap-2 rounded-lg p-2 text-muted-foreground transition-colors hover:bg-muted/50 hover:text-foreground data-[state=open]:bg-accent"
title="Change AI Provider"
>
{selectedProvider.kind === 'acp' ? (
<Bot className="h-[18px] w-[18px]" />
) : selectedProvider.type === 'browseros' ? (
{selectedProvider.type === 'browseros' ? (
<BrowserOSIcon size={18} />
) : (
<ProviderIcon

View File

@@ -1,258 +0,0 @@
import { describe, expect, it } from 'bun:test'
import type {
HarnessAdapterDescriptor,
HarnessAgent,
} from '@/entrypoints/app/agents/agent-harness-types'
import type { LlmProviderConfig } from '@/lib/llm-providers/types'
import {
buildSidepanelChatTargets,
persistSidepanelChatTargetSelection,
resolveSidepanelChatTarget,
type SidepanelChatTargetSelection,
toLlmProviderConfig,
} from './sidepanel-chat-targets'
const timestamp = 1000
const providers: LlmProviderConfig[] = [
{
id: 'browseros',
type: 'browseros',
name: 'BrowserOS',
baseUrl: 'https://api.browseros.com/v1',
modelId: 'browseros-auto',
supportsImages: true,
contextWindow: 200000,
temperature: 0.2,
createdAt: timestamp,
updatedAt: timestamp,
},
{
id: 'anthropic-sonnet',
type: 'anthropic',
name: 'Anthropic Sonnet',
modelId: 'claude-sonnet-4-6',
apiKey: 'sk-ant',
supportsImages: true,
contextWindow: 200000,
temperature: 0.2,
createdAt: timestamp,
updatedAt: timestamp,
},
]
const adapters: HarnessAdapterDescriptor[] = [
{
id: 'claude',
name: 'Claude Code',
defaultModelId: 'haiku',
defaultReasoningEffort: 'medium',
modelControl: 'best-effort',
models: [
{ id: 'sonnet', label: 'Sonnet' },
{ id: 'haiku', label: 'Haiku', recommended: true },
],
reasoningEfforts: [
{ id: 'medium', label: 'Medium', recommended: true },
{ id: 'high', label: 'High' },
],
},
{
id: 'codex',
name: 'Codex',
defaultModelId: 'gpt-5.5',
defaultReasoningEffort: 'medium',
modelControl: 'runtime-supported',
models: [{ id: 'gpt-5.5', label: 'GPT-5.5', recommended: true }],
reasoningEfforts: [{ id: 'medium', label: 'Medium', recommended: true }],
},
{
id: 'openclaw',
name: 'OpenClaw',
defaultModelId: 'default',
defaultReasoningEffort: 'medium',
modelControl: 'best-effort',
models: [],
reasoningEfforts: [
{ id: 'medium', label: 'Medium', recommended: true },
{ id: 'high', label: 'High' },
],
},
]
const agents: HarnessAgent[] = [
{
id: 'agent-codex',
name: 'Review Bot',
adapter: 'codex',
modelId: 'gpt-5.5',
reasoningEffort: 'medium',
permissionMode: 'approve-all',
sessionKey: 'agent:agent-codex:main',
createdAt: timestamp,
updatedAt: timestamp,
},
{
id: 'agent-openclaw',
name: 'Research Claw',
adapter: 'openclaw',
modelId: 'default',
reasoningEffort: 'high',
permissionMode: 'approve-all',
sessionKey: 'agent:agent-openclaw:main',
createdAt: timestamp,
updatedAt: timestamp,
},
]
describe('buildSidepanelChatTargets', () => {
it('returns LLM targets plus one ACP target per persisted harness agent', () => {
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
expect(targets.map((target) => target.id)).toEqual([
'browseros',
'anthropic-sonnet',
'agent-codex',
'agent-openclaw',
])
})
it('does not emit catalog-only ACP targets without persisted agents', () => {
const targets = buildSidepanelChatTargets({
providers,
adapters,
agents: [],
})
expect(targets.map((target) => target.id)).toEqual([
'browseros',
'anthropic-sonnet',
])
})
it('uses the created OpenClaw agent name instead of a generic adapter target', () => {
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
const openclaw = targets.find((target) => target.id === 'agent-openclaw')
expect(openclaw).toMatchObject({
kind: 'acp',
id: 'agent-openclaw',
agentId: 'agent-openclaw',
adapter: 'openclaw',
adapterName: 'OpenClaw',
modelId: 'default',
modelLabel: 'default',
name: 'Research Claw',
modelControl: 'best-effort',
reasoningEffort: 'high',
})
})
it('preserves adapter metadata for created agent targets', () => {
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
const codex = targets.find((target) => target.id === 'agent-codex')
expect(codex).toMatchObject({
kind: 'acp',
agentId: 'agent-codex',
adapter: 'codex',
adapterName: 'Codex',
modelId: 'gpt-5.5',
modelLabel: 'GPT-5.5',
modelControl: 'runtime-supported',
recommended: true,
reasoningEffort: 'medium',
reasoningEffortLabel: 'Medium',
})
})
it('still returns LLM targets when agents and adapters are unavailable', () => {
expect(
buildSidepanelChatTargets({ providers, adapters: [], agents: [] }),
).toEqual([
{
kind: 'llm',
id: 'browseros',
name: 'BrowserOS',
type: 'browseros',
provider: providers[0],
},
{
kind: 'llm',
id: 'anthropic-sonnet',
name: 'Anthropic Sonnet',
type: 'anthropic',
provider: providers[1],
},
])
})
})
describe('resolveSidepanelChatTarget', () => {
it('resolves selected LLM targets back to their provider config', () => {
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
const resolved = resolveSidepanelChatTarget({
targets,
defaultProviderId: 'browseros',
selection: { kind: 'llm', id: 'anthropic-sonnet' },
})
expect(resolved?.kind).toBe('llm')
expect(toLlmProviderConfig(resolved)?.modelId).toBe('claude-sonnet-4-6')
})
it('falls back to the current default LLM provider when a persisted ACP target is stale', () => {
const targets = buildSidepanelChatTargets({
providers,
adapters,
agents: [],
})
expect(
resolveSidepanelChatTarget({
targets,
defaultProviderId: 'anthropic-sonnet',
selection: { kind: 'acp', id: 'agent-codex' },
}),
).toMatchObject({
kind: 'llm',
id: 'anthropic-sonnet',
})
})
it('falls back when an old catalog-style ACP target id is persisted', () => {
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
expect(
resolveSidepanelChatTarget({
targets,
defaultProviderId: 'anthropic-sonnet',
selection: { kind: 'acp', id: 'acp:codex:gpt-5.5:medium' },
}),
).toMatchObject({
kind: 'llm',
id: 'anthropic-sonnet',
})
})
})
describe('persistSidepanelChatTargetSelection', () => {
it('stores only target identity and does not mutate LLM provider arrays', async () => {
let savedSelection: SidepanelChatTargetSelection | null = null
const originalProviders = providers.map((provider) => ({ ...provider }))
const targets = buildSidepanelChatTargets({ providers, adapters, agents })
const target = targets.find((candidate) => candidate.id === 'agent-codex')
await persistSidepanelChatTargetSelection(target, {
setValue: async (value) => {
savedSelection = value
},
})
expect(savedSelection as SidepanelChatTargetSelection | null).toEqual({
kind: 'acp',
id: 'agent-codex',
})
expect(providers).toEqual(originalProviders)
})
})

View File

@@ -1,178 +0,0 @@
import type {
HarnessAdapterDescriptor,
HarnessAgent,
HarnessAgentAdapter,
} from '@/entrypoints/app/agents/agent-harness-types'
import type { LlmProviderConfig, ProviderType } from '@/lib/llm-providers/types'
export type SidepanelTargetKind = 'llm' | 'acp'
export type SidepanelChatTarget =
| {
kind: 'llm'
id: string
name: string
type: ProviderType
provider: LlmProviderConfig
}
| {
kind: 'acp'
id: string
name: string
type: 'acp'
agentId: string
adapter: HarnessAgentAdapter
adapterName: string
modelId: string
modelLabel: string
modelControl: HarnessAdapterDescriptor['modelControl']
recommended?: boolean
reasoningEffort: string
reasoningEffortLabel?: string
}
export type SidepanelChatTargetSelection = Pick<
SidepanelChatTarget,
'kind' | 'id'
>
interface BuildSidepanelChatTargetsInput {
providers: LlmProviderConfig[]
adapters: HarnessAdapterDescriptor[]
agents?: HarnessAgent[]
}
interface ResolveSidepanelChatTargetInput {
targets: SidepanelChatTarget[]
defaultProviderId: string
selection?: SidepanelChatTargetSelection | null
}
interface SidepanelChatTargetSelectionWriter {
setValue(value: SidepanelChatTargetSelection | null): Promise<void>
}
interface SidepanelChatTargetSelectionReader {
getValue(): Promise<SidepanelChatTargetSelection | null>
}
type SidepanelChatTargetSelectionStore = SidepanelChatTargetSelectionReader &
SidepanelChatTargetSelectionWriter
let sidepanelChatTargetSelectionStorage:
| SidepanelChatTargetSelectionStore
| undefined
export function buildSidepanelChatTargets({
providers,
adapters,
agents = [],
}: BuildSidepanelChatTargetsInput): SidepanelChatTarget[] {
return [
...providers.map(toLlmTarget),
...agents.map((agent) => toAcpTargetForAgent(agent, adapters)),
]
}
function toAcpTargetForAgent(
agent: HarnessAgent,
adapters: HarnessAdapterDescriptor[],
): SidepanelChatTarget {
const adapter = adapters.find((entry) => entry.id === agent.adapter)
const modelId = agent.modelId ?? adapter?.defaultModelId ?? 'default'
const reasoningEffort =
agent.reasoningEffort ?? adapter?.defaultReasoningEffort ?? 'medium'
const model = adapter?.models.find((entry) => entry.id === modelId)
const reasoning = adapter?.reasoningEfforts.find(
(effort) => effort.id === reasoningEffort,
)
return {
kind: 'acp',
id: agent.id,
name: agent.name,
type: 'acp',
agentId: agent.id,
adapter: agent.adapter,
adapterName: adapter?.name ?? formatAdapterName(agent.adapter),
modelId,
modelLabel: model?.label ?? modelId,
modelControl: adapter?.modelControl ?? 'best-effort',
recommended: model?.recommended,
reasoningEffort,
reasoningEffortLabel: reasoning?.label,
}
}
function formatAdapterName(adapter: HarnessAgentAdapter): string {
if (adapter === 'claude') return 'Claude Code'
if (adapter === 'codex') return 'Codex'
if (adapter === 'openclaw') return 'OpenClaw'
return adapter
}
export function resolveSidepanelChatTarget({
targets,
defaultProviderId,
selection,
}: ResolveSidepanelChatTargetInput): SidepanelChatTarget | undefined {
if (selection) {
const selected = targets.find(
(target) => target.kind === selection.kind && target.id === selection.id,
)
if (selected) return selected
}
return (
targets.find(
(target) => target.kind === 'llm' && target.id === defaultProviderId,
) ?? targets.find((target) => target.kind === 'llm')
)
}
export function toLlmProviderConfig(
target: SidepanelChatTarget | undefined,
): LlmProviderConfig | undefined {
return target?.kind === 'llm' ? target.provider : undefined
}
export async function persistSidepanelChatTargetSelection(
target: SidepanelChatTarget | undefined,
store?: SidepanelChatTargetSelectionWriter,
): Promise<void> {
const targetStore = store ?? (await getSidepanelChatTargetSelectionStorage())
await targetStore.setValue(
target ? { kind: target.kind, id: target.id } : null,
)
}
export async function loadSidepanelChatTargetSelection(
store?: SidepanelChatTargetSelectionReader,
): Promise<SidepanelChatTargetSelection | null> {
const targetStore = store ?? (await getSidepanelChatTargetSelectionStorage())
return targetStore.getValue()
}
function toLlmTarget(provider: LlmProviderConfig): SidepanelChatTarget {
return {
kind: 'llm',
id: provider.id,
name: provider.name,
type: provider.type,
provider,
}
}
async function getSidepanelChatTargetSelectionStorage(): Promise<SidepanelChatTargetSelectionStore> {
if (sidepanelChatTargetSelectionStorage) {
return sidepanelChatTargetSelectionStorage
}
const { storage } = await import('@wxt-dev/storage')
sidepanelChatTargetSelectionStorage =
storage.defineItem<SidepanelChatTargetSelection | null>(
'local:sidepanel-chat-target-selection',
{ fallback: null },
)
return sidepanelChatTargetSelectionStorage
}

View File

@@ -1,21 +1,9 @@
import { useCallback, useEffect, useMemo, useRef, useState } from 'react'
import { useEffect, useRef } from 'react'
import useDeepCompareEffect from 'use-deep-compare-effect'
import {
useAgentAdapters,
useHarnessAgents,
} from '@/entrypoints/app/agents/useAgents'
import type { LlmProviderConfig } from '@/lib/llm-providers/types'
import { useLlmProviders } from '@/lib/llm-providers/useLlmProviders'
import { type McpServer, useMcpServers } from '@/lib/mcp/mcpServerStorage'
import { usePersonalization } from '@/lib/personalization/personalizationStorage'
import {
buildSidepanelChatTargets,
loadSidepanelChatTargetSelection,
persistSidepanelChatTargetSelection,
resolveSidepanelChatTarget,
type SidepanelChatTarget,
type SidepanelChatTargetSelection,
} from './sidepanel-chat-targets'
const constructMcpServers = (servers: McpServer[]) => {
return servers
@@ -35,53 +23,14 @@ const constructCustomServers = (servers: McpServer[]) => {
export const useChatRefs = () => {
const { servers: mcpServers } = useMcpServers()
const {
providers: llmProviders,
selectedProvider: selectedLlmProvider,
setDefaultProvider,
isLoading: isLoadingProviders,
} = useLlmProviders()
const { adapters, loading: isLoadingAdapters } = useAgentAdapters()
const { harnessAgents, loading: isLoadingAgents } = useHarnessAgents()
const { personalization } = usePersonalization()
const [targetSelection, setTargetSelection] =
useState<SidepanelChatTargetSelection | null>(null)
useEffect(() => {
let cancelled = false
loadSidepanelChatTargetSelection().then((selection) => {
if (!cancelled) setTargetSelection(selection)
})
return () => {
cancelled = true
}
}, [])
const chatTargets = useMemo(
() =>
buildSidepanelChatTargets({
providers: llmProviders,
adapters,
agents: harnessAgents,
}),
[llmProviders, adapters, harnessAgents],
)
const selectedChatTarget = useMemo(
() =>
resolveSidepanelChatTarget({
targets: chatTargets,
defaultProviderId: selectedLlmProvider?.id ?? llmProviders[0]?.id ?? '',
selection: targetSelection,
}),
[chatTargets, llmProviders, selectedLlmProvider, targetSelection],
)
const selectedLlmProviderRef = useRef<LlmProviderConfig | null>(
selectedLlmProvider,
)
const selectedChatTargetRef = useRef<SidepanelChatTarget | undefined>(
selectedChatTarget,
)
const enabledMcpServersRef = useRef(constructMcpServers(mcpServers))
const enabledCustomServersRef = useRef(constructCustomServers(mcpServers))
const personalizationRef = useRef(personalization)
@@ -92,36 +41,16 @@ export const useChatRefs = () => {
enabledCustomServersRef.current = constructCustomServers(mcpServers)
}, [selectedLlmProvider, mcpServers])
useEffect(() => {
selectedChatTargetRef.current = selectedChatTarget
}, [selectedChatTarget])
useEffect(() => {
personalizationRef.current = personalization
}, [personalization])
const selectChatTarget = useCallback(
async (target: SidepanelChatTarget | undefined) => {
selectedChatTargetRef.current = target
setTargetSelection(target ? { kind: target.kind, id: target.id } : null)
await persistSidepanelChatTargetSelection(target)
},
[],
)
return {
selectedLlmProviderRef,
selectedChatTargetRef,
enabledMcpServersRef,
enabledCustomServersRef,
personalizationRef,
llmProviders,
setDefaultProvider,
chatTargets,
selectedChatTarget,
selectChatTarget,
selectedLlmProvider,
isLoadingProviders:
isLoadingProviders || isLoadingAdapters || isLoadingAgents,
isLoadingProviders,
}
}

View File

@@ -1,153 +0,0 @@
import { describe, expect, it } from 'bun:test'
import type { LlmProviderConfig } from '@/lib/llm-providers/types'
import type { ChatMode } from './chatTypes'
import type { SidepanelChatTarget } from './sidepanel-chat-targets'
import { buildSidepanelPreparedSendMessagesRequest } from './useChatSessionRequest'
const conversationId = '00000000-0000-4000-8000-000000000001'
describe('buildSidepanelPreparedSendMessagesRequest', () => {
it('keeps LLM targets on the existing /chat request body', () => {
const request = buildSidepanelPreparedSendMessagesRequest({
agentServerUrl: 'http://127.0.0.1:5151',
target: llmTarget,
fallbackProvider,
message: 'Summarize this page',
...commonRequestInput(),
})
expect(request.api).toBe('http://127.0.0.1:5151/chat')
expect(request.body).toMatchObject({
message: 'Summarize this page',
conversationId,
provider: 'browseros',
providerType: 'browseros',
providerName: 'BrowserOS',
model: 'gpt-5',
mode: 'agent',
browserContext: {
activeTab: { id: 10, url: 'https://example.com', title: 'Example' },
enabledMcpServers: ['slack'],
},
userSystemPrompt: 'Be concise',
userWorkingDir: '/tmp/work',
previousConversation: [{ role: 'assistant', content: 'Prior answer' }],
selectedText: 'selected text',
selectedTextSource: {
url: 'https://example.com',
title: 'Example',
},
})
})
it('sends created-agent targets to the agent-id sidepanel route', () => {
const request = buildSidepanelPreparedSendMessagesRequest({
agentServerUrl: 'http://127.0.0.1:5151',
target: acpTarget,
fallbackProvider,
message: 'Inspect the current tab',
approvalResponses: [
{ approvalId: 'approval-1', approved: true, reason: 'ok' },
],
...commonRequestInput(),
})
expect(request.api).toBe(
'http://127.0.0.1:5151/agents/agent-codex/sidepanel/chat',
)
expect(request.body).toEqual({
conversationId,
message: 'Inspect the current tab',
browserContext: {
activeTab: { id: 10, url: 'https://example.com', title: 'Example' },
enabledMcpServers: ['slack'],
},
userSystemPrompt: 'Be concise',
userWorkingDir: '/tmp/work',
selectedText: 'selected text',
selectedTextSource: {
url: 'https://example.com',
title: 'Example',
},
})
})
it('keeps tool approval retry payloads scoped to LLM chat', () => {
const request = buildSidepanelPreparedSendMessagesRequest({
agentServerUrl: 'http://127.0.0.1:5151',
target: llmTarget,
fallbackProvider,
approvalResponses: [
{ approvalId: 'approval-1', approved: false, reason: 'no' },
],
...commonRequestInput(),
})
expect(request.api).toBe('http://127.0.0.1:5151/chat')
expect(request.body).toMatchObject({
message: '',
toolApprovalResponses: [
{ approvalId: 'approval-1', approved: false, reason: 'no' },
],
})
})
})
function commonRequestInput() {
return {
conversationId,
mode: 'agent' as ChatMode,
browserContext: {
activeTab: { id: 10, url: 'https://example.com', title: 'Example' },
enabledMcpServers: ['slack'],
},
userSystemPrompt: 'Be concise',
userWorkingDir: '/tmp/work',
previousConversation: [
{ role: 'assistant' as const, content: 'Prior answer' },
],
declinedApps: ['gmail'],
aclRules: [{ id: 'rule-1', sitePattern: '*://*/*', enabled: true }],
selectedText: 'selected text',
selectedTextSource: {
url: 'https://example.com',
title: 'Example',
},
toolApprovalConfig: { categories: { navigation: true } },
}
}
const fallbackProvider: LlmProviderConfig = {
id: 'browseros',
type: 'browseros',
name: 'BrowserOS',
modelId: 'gpt-5',
supportsImages: true,
contextWindow: 128000,
temperature: 0.7,
createdAt: 1000,
updatedAt: 1000,
}
const llmTarget: SidepanelChatTarget = {
kind: 'llm',
id: fallbackProvider.id,
name: fallbackProvider.name,
type: fallbackProvider.type,
provider: fallbackProvider,
}
const acpTarget: SidepanelChatTarget = {
kind: 'acp',
id: 'agent-codex',
name: 'Review bot',
type: 'acp',
agentId: 'agent-codex',
adapter: 'codex',
adapterName: 'Codex',
modelId: 'gpt-5.5',
modelLabel: 'GPT-5.5',
modelControl: 'best-effort',
reasoningEffort: 'medium',
reasoningEffortLabel: 'Medium',
}

View File

@@ -26,14 +26,15 @@ import { useInvalidateCredits } from '@/lib/credits/useCredits'
import { declinedAppsStorage } from '@/lib/declined-apps/storage'
import { useGraphqlQuery } from '@/lib/graphql/useGraphqlQuery'
import { createDefaultBrowserOSProvider } from '@/lib/llm-providers/storage'
import type {
ApprovalResponseData,
ChatRequestBrowserContext,
import { useLlmProviders } from '@/lib/llm-providers/useLlmProviders'
import {
type ApprovalResponseData,
buildChatRequestBody,
type ChatRequestBrowserContext,
} from '@/lib/messaging/server/buildChatRequestBody'
import { track } from '@/lib/metrics/track'
import { searchActionsStorage } from '@/lib/search-actions/searchActionsStorage'
import { selectedTextStorage } from '@/lib/selected-text/selectedTextStorage'
import { sentry } from '@/lib/sentry/sentry'
import { stopAgentStorage } from '@/lib/stop-agent/stop-agent-storage'
import {
type ApprovalResponse,
@@ -51,12 +52,7 @@ import {
import { selectedWorkspaceStorage } from '@/lib/workspace/workspace-storage'
import type { ChatMode } from './chatTypes'
import { GetConversationWithMessagesDocument } from './graphql/chatSessionDocument'
import { toLlmProviderConfig } from './sidepanel-chat-targets'
import { useChatRefs } from './useChatRefs'
import {
buildSidepanelPreparedSendMessagesRequest,
toProviderOption,
} from './useChatSessionRequest'
import { useExecutionHistoryTracker } from './useExecutionHistoryTracker'
import { useNotifyActiveTab } from './useNotifyActiveTab'
import { useRemoteConversationSave } from './useRemoteConversationSave'
@@ -190,19 +186,16 @@ const buildRequestBrowserContext = ({
export const useChatSession = (options?: ChatSessionOptions) => {
const {
selectedLlmProviderRef,
selectedChatTargetRef,
enabledMcpServersRef,
enabledCustomServersRef,
personalizationRef,
setDefaultProvider,
chatTargets,
selectedChatTarget,
selectChatTarget,
selectedLlmProvider,
isLoadingProviders,
} = useChatRefs()
const invalidateCredits = useInvalidateCredits()
const { providers: llmProviders, setDefaultProvider } = useLlmProviders()
const {
baseUrl: agentServerUrl,
isLoading: isLoadingAgentUrl,
@@ -225,7 +218,11 @@ export const useChatSession = (options?: ChatSessionOptions) => {
agentUrlRef.current = agentServerUrl
}, [agentServerUrl])
const providers: Provider[] = chatTargets.map(toProviderOption)
const providers: Provider[] = llmProviders.map((p) => ({
id: p.id,
name: p.name,
type: p.type,
}))
const [mode, setMode] = useState<ChatMode>('agent')
const [textToAction, setTextToAction] = useState<Map<string, ChatAction>>(
@@ -327,8 +324,15 @@ export const useChatSession = (options?: ChatSessionOptions) => {
textToActionRef.current = textToAction
}, [mode, textToAction])
const selectedProvider = selectedChatTarget
? toProviderOption(selectedChatTarget)
const selectedProvider = selectedLlmProvider
? {
id: selectedLlmProvider.id,
name: selectedLlmProvider.name,
type:
selectedLlmProvider.id === 'browseros'
? ('browseros' as const)
: selectedLlmProvider.type,
}
: providers[0]
const {
@@ -342,8 +346,7 @@ export const useChatSession = (options?: ChatSessionOptions) => {
} = useChat({
transport: new DefaultChatTransport({
prepareSendMessagesRequest: async ({ messages }) => {
const target = selectedChatTargetRef.current
const fallbackProvider =
const provider =
selectedLlmProviderRef.current ?? createDefaultBrowserOSProvider()
const activeTabsList = await chrome.tabs.query({
active: true,
@@ -392,46 +395,51 @@ export const useChatSession = (options?: ChatSessionOptions) => {
personalizationRef.current,
)
const commonRequest = {
conversationId: conversationIdRef.current,
mode: currentMode,
browserContext: requestBrowserContext,
userSystemPrompt,
userWorkingDir: workingDirRef.current,
previousConversation,
declinedApps,
aclRules: enabledAclRules,
toolApprovalConfig: approvalConfig,
}
const approvalResponses =
target?.kind === 'acp' ? null : extractApprovalResponses(messages)
const approvalResponses = extractApprovalResponses(messages)
if (approvalResponses) {
return buildSidepanelPreparedSendMessagesRequest({
agentServerUrl: agentUrlRef.current ?? undefined,
target,
fallbackProvider,
...commonRequest,
approvalResponses,
})
return {
api: `${agentUrlRef.current}/chat`,
body: buildChatRequestBody({
conversationId: conversationIdRef.current,
provider,
mode: currentMode,
browserContext: requestBrowserContext,
userSystemPrompt,
userWorkingDir: workingDirRef.current,
previousConversation,
declinedApps,
aclRules: enabledAclRules,
toolApprovalConfig: approvalConfig,
toolApprovalResponses: approvalResponses,
}),
}
}
const message = getLastMessageText(messages)
const result = buildSidepanelPreparedSendMessagesRequest({
agentServerUrl: agentUrlRef.current ?? undefined,
target,
fallbackProvider,
message,
...commonRequest,
selectedText: activeTabSelection?.text,
selectedTextSource: activeTabSelection
? {
url: activeTabSelection.url,
title: activeTabSelection.title,
}
: undefined,
})
const result = {
api: `${agentUrlRef.current}/chat`,
body: buildChatRequestBody({
message,
conversationId: conversationIdRef.current,
provider,
mode: currentMode,
browserContext: requestBrowserContext,
userSystemPrompt,
userWorkingDir: workingDirRef.current,
previousConversation,
declinedApps,
aclRules: enabledAclRules,
selectedText: activeTabSelection?.text,
selectedTextSource: activeTabSelection
? {
url: activeTabSelection.url,
title: activeTabSelection.title,
}
: undefined,
toolApprovalConfig: approvalConfig,
}),
}
// Track which tab's selection was sent so we can clear it on success
pendingSelectionTabKeyRef.current =
@@ -443,7 +451,7 @@ export const useChatSession = (options?: ChatSessionOptions) => {
sendAutomaticallyWhen: () => {
if (approvalJustRespondedRef.current) {
approvalJustRespondedRef.current = false
return selectedChatTargetRef.current?.kind !== 'acp'
return true
}
return false
},
@@ -678,22 +686,10 @@ export const useChatSession = (options?: ChatSessionOptions) => {
}, [dispatchMessage, isIntegrationsSynced])
const sendMessage = (params: { text: string; action?: ChatAction }) => {
const target = selectedChatTargetRef.current
const llmTargetProvider = toLlmProviderConfig(target)
const agentTarget = target?.kind === 'acp' ? target : undefined
track(MESSAGE_SENT_EVENT, {
mode,
provider_id:
agentTarget?.agentId ??
llmTargetProvider?.id ??
selectedLlmProvider?.id,
provider_type: agentTarget ? 'acp' : llmTargetProvider?.type,
agent_id: agentTarget?.agentId,
adapter: agentTarget?.adapter,
model:
agentTarget?.modelId ??
llmTargetProvider?.modelId ??
selectedLlmProvider?.modelId,
provider_type: selectedLlmProvider?.type,
model: selectedLlmProvider?.modelId,
})
if (!isIntegrationsSyncedRef.current) {
@@ -745,54 +741,14 @@ export const useChatSession = (options?: ChatSessionOptions) => {
addToolApprovalResponse(params)
}
const resetConversationState = () => {
stop()
void finishExecutionTask({ isAbort: true })
setConversationId(crypto.randomUUID())
setMessages([])
setTextToAction(new Map())
setLiked({})
setDisliked({})
setRestoredConversationId(null)
resetRemoteConversation()
}
const handleSelectProvider = (provider: Provider) => {
const target = chatTargets.find(
(candidate) =>
candidate.id === provider.id && candidate.kind === provider.kind,
)
if (!target) return
const previousTarget = selectedChatTargetRef.current
const fullProvider = llmProviders.find((p) => p.id === provider.id)
track(PROVIDER_SELECTED_EVENT, {
provider_id: target.id,
provider_type: target.kind === 'acp' ? 'acp' : target.type,
model_id:
target.kind === 'acp' ? target.modelId : target.provider.modelId,
agent_id: target.kind === 'acp' ? target.agentId : undefined,
adapter: target.kind === 'acp' ? target.adapter : undefined,
provider_id: provider.id,
provider_type: provider.type,
model_id: fullProvider?.modelId,
})
void selectChatTarget(target).catch((error) => {
sentry.captureException(error, {
extra: {
message: 'Failed to persist sidepanel chat target selection',
targetId: target.id,
targetKind: target.kind,
},
})
})
if (target.kind === 'llm') setDefaultProvider(target.provider.id)
if (
previousTarget &&
(previousTarget.kind !== target.kind ||
previousTarget.id !== target.id) &&
messagesRef.current.length > 0
) {
resetConversationState()
}
setDefaultProvider(provider.id)
}
const getActionForMessage = (message: UIMessage) => {
@@ -806,7 +762,15 @@ export const useChatSession = (options?: ChatSessionOptions) => {
const resetConversation = () => {
track(CONVERSATION_RESET_EVENT, { message_count: messages.length })
resetConversationState()
stop()
void finishExecutionTask({ isAbort: true })
setConversationId(crypto.randomUUID())
setMessages([])
setTextToAction(new Map())
setLiked({})
setDisliked({})
setRestoredConversationId(null)
resetRemoteConversation()
}
const isRestoringConversation =

View File

@@ -1,74 +0,0 @@
import type { Provider } from '../../../components/chat/chatComponentTypes'
import type { LlmProviderConfig } from '../../../lib/llm-providers/types'
import {
type ApprovalResponseData,
buildChatRequestBody,
} from '../../../lib/messaging/server/buildChatRequestBody'
import {
type SidepanelChatTarget,
toLlmProviderConfig,
} from './sidepanel-chat-targets'
type LlmChatRequestBodyInput = Parameters<typeof buildChatRequestBody>[0]
type CommonSidepanelRequestInput = Omit<
LlmChatRequestBodyInput,
'provider' | 'message' | 'toolApprovalResponses' | 'isScheduledTask'
>
interface BuildSidepanelPreparedSendMessagesRequestInput
extends CommonSidepanelRequestInput {
agentServerUrl: string | undefined
target: SidepanelChatTarget | undefined
fallbackProvider: LlmProviderConfig
message?: string
approvalResponses?: ApprovalResponseData[] | null
}
export function buildSidepanelPreparedSendMessagesRequest({
agentServerUrl,
target,
fallbackProvider,
message,
approvalResponses,
...common
}: BuildSidepanelPreparedSendMessagesRequestInput) {
if (target?.kind === 'acp') {
return {
api: `${agentServerUrl}/agents/${encodeURIComponent(target.agentId)}/sidepanel/chat`,
body: {
conversationId: common.conversationId,
message: message ?? '',
browserContext: common.browserContext,
userSystemPrompt: common.userSystemPrompt,
userWorkingDir: common.userWorkingDir,
selectedText: common.selectedText,
selectedTextSource: common.selectedTextSource,
},
}
}
const provider = toLlmProviderConfig(target) ?? fallbackProvider
return {
api: `${agentServerUrl}/chat`,
body: buildChatRequestBody({
...common,
provider,
message,
toolApprovalResponses: approvalResponses ?? undefined,
}),
}
}
export function toProviderOption(target: SidepanelChatTarget): Provider {
return {
id: target.id,
name: target.name,
type: target.type,
kind: target.kind,
agentId: target.kind === 'acp' ? target.agentId : undefined,
adapterName: target.kind === 'acp' ? target.adapterName : undefined,
modelLabel: target.kind === 'acp' ? target.modelLabel : undefined,
modelControl: target.kind === 'acp' ? target.modelControl : undefined,
}
}

View File

@@ -59,3 +59,15 @@ export interface AgentConversation {
createdAt: number
updatedAt: number
}
export interface AgentCardData {
agentId: string
name: string
model?: string
status: 'idle' | 'working' | 'error'
lastMessage?: string
lastMessageTimestamp?: number
activitySummary?: string
currentTool?: string
costUsd?: number
}

View File

@@ -2,75 +2,29 @@ function isAbortError(error: unknown): boolean {
return error instanceof DOMException && error.name === 'AbortError'
}
export interface ParsedSSEEvent<T> {
data: T
/** Numeric `id:` line on the same SSE event, if any. */
seq?: number
}
export function parseSSELines<T>(buffer: string): {
events: ParsedSSEEvent<T>[]
events: T[]
remainder: string
} {
// SSE events are separated by blank lines. Buffer lines until we hit
// a blank, then assemble each event. Lines we recognise: `id: <n>`
// and `data: <payload>`. Everything else is ignored.
const events: ParsedSSEEvent<T>[] = []
const lines = buffer.split('\n')
// Find the last blank-line boundary; everything after it is the
// remainder (next event partially received).
let lastBoundary = -1
for (let i = lines.length - 1; i >= 0; i--) {
if (lines[i] === '') {
lastBoundary = i
break
}
}
const completeLines = lastBoundary >= 0 ? lines.slice(0, lastBoundary) : []
const remainder =
lastBoundary >= 0 ? lines.slice(lastBoundary + 1).join('\n') : buffer
const remainder = lines.pop() ?? ''
const events: T[] = []
let currentSeq: number | undefined
let currentData: string | null = null
const flush = () => {
if (currentData != null && currentData !== '[DONE]') {
try {
events.push({
data: JSON.parse(currentData) as T,
seq: currentSeq,
})
} catch {
// ignore
}
}
currentSeq = undefined
currentData = null
for (const line of lines) {
if (!line.startsWith('data: ')) continue
const payload = line.slice(6)
if (payload === '[DONE]') continue
try {
events.push(JSON.parse(payload) as T)
} catch {}
}
for (const line of completeLines) {
if (line === '') {
flush()
continue
}
if (line.startsWith('id: ')) {
const n = Number.parseInt(line.slice(4).trim(), 10)
if (Number.isFinite(n)) currentSeq = n
continue
}
if (line.startsWith('data: ')) {
currentData = line.slice(6)
}
}
// Catch a complete trailing event with no terminating blank line —
// shouldn't happen in well-formed SSE, but be tolerant.
flush()
return { events, remainder }
}
export async function consumeSSEStream<T>(
response: Response,
onEvent: (event: T, meta: { seq?: number }) => void,
onEvent: (event: T) => void,
signal?: AbortSignal,
): Promise<void> {
const reader = response.body?.getReader()
@@ -95,7 +49,7 @@ export async function consumeSSEStream<T>(
buffer = remainder
for (const event of events) {
onEvent(event.data, { seq: event.seq })
onEvent(event)
}
}
} catch (error) {
@@ -110,7 +64,7 @@ export async function consumeSSEStream<T>(
if (buffer) {
const { events } = parseSSELines<T>(buffer)
for (const event of events) {
onEvent(event.data, { seq: event.seq })
onEvent(event)
}
}
}

View File

@@ -9,7 +9,6 @@
"build": "bun run codegen && wxt build",
"build:dev": "bun --env-file=.env.development wxt build --mode development",
"zip": "wxt zip",
"test": "bun run ../../scripts/run-bun-test.ts ./apps/agent",
"compile": "bun --env-file=.env.development wxt prepare && tsgo --noEmit",
"lint": "bunx biome check",
"typecheck": "bun --env-file=.env.development wxt prepare && tsgo --noEmit",

View File

@@ -1,51 +0,0 @@
# Copy to .env.development for local eval runs.
# Provider keys used by existing config files.
OPENROUTER_API_KEY=
FIREWORKS_API_KEY=
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GOOGLE_GENERATIVE_AI_API_KEY=
# Claude Agent SDK token used by performance_grader.
CLAUDE_CODE_OAUTH_TOKEN=
# Suite-mode model selection.
EVAL_VARIANT=local
EVAL_AGENT_PROVIDER=openai-compatible
EVAL_AGENT_MODEL=
EVAL_AGENT_API_KEY=
EVAL_AGENT_BASE_URL=
EVAL_AGENT_SUPPORTS_IMAGES=true
# Optional suite-mode executor override for orchestrator suites.
EVAL_EXECUTOR_MODEL=
EVAL_EXECUTOR_API_KEY=
EVAL_EXECUTOR_BASE_URL=
# Clado visual action executor.
CLADO_ACTION_MODEL=
CLADO_ACTION_API_KEY=
CLADO_ACTION_BASE_URL=
# Backward-compatible alias used by older local scripts.
CLADO_ACTION_URL=
# BrowserOS runner.
BROWSEROS_BINARY=/Applications/BrowserOS.app/Contents/MacOS/BrowserOS
BROWSEROS_SERVER_URL=http://127.0.0.1:9110
BROWSEROS_SERVER_LOG_DIR=/tmp/browseros-server-logs
BROWSEROS_CONFIG_URL=
# Captcha solver extension.
NOPECHA_API_KEY=
# WebArena-Infinity.
WEBARENA_INFINITY_DIR=
INFINITY_APP_URL=
# R2 publishing and weekly report.
EVAL_R2_ACCOUNT_ID=
EVAL_R2_ACCESS_KEY_ID=
EVAL_R2_SECRET_ACCESS_KEY=
EVAL_R2_BUCKET=browseros-eval
EVAL_R2_CDN_BASE_URL=https://eval.browseros.com

View File

@@ -14,7 +14,6 @@ Evaluation framework for BrowserOS browser automation agents. Runs tasks from st
```bash
cd apps/eval
cp .env.example .env.development
# Edit .env.development with your keys, then:
bun run eval
```
@@ -24,55 +23,11 @@ Opens the eval dashboard at `http://localhost:9900` in config mode. From there:
### CLI mode
```bash
bun run eval -c configs/legacy/browseros-agent-weekly.json
bun run eval suite --config configs/legacy/browseros-agent-weekly.json --publish r2
bun run eval -c configs/browseros-agent-weekly.json
```
Runs immediately. Dashboard still available at `http://localhost:9900` for live progress.
The `suite` command is the workflow-compatible full loop: execute tasks, run graders, write artifacts, and optionally publish to R2. The old `-c` form remains supported during migration.
```bash
bun run eval run --config configs/legacy/browseros-agent-weekly.json
bun run eval suite --suite configs/suites/agisdk-daily-10.json --variant kimi-fireworks --publish r2
bun run eval grade --run results/browseros-agent-weekly/2026-04-29-1430
bun run eval publish --run results/browseros-agent-weekly/2026-04-29-1430 --target r2
```
Config files live in two groups:
```txt
configs/legacy/ # Complete EvalConfig files used by older workflows and the dashboard
configs/suites/ # Suite definitions; model/provider comes from CLI flags or env
```
Suite mode takes model settings from CLI flags first, then env:
```bash
EVAL_VARIANT=kimi-fireworks \
EVAL_AGENT_PROVIDER=openai-compatible \
EVAL_AGENT_MODEL=accounts/fireworks/models/kimi-k2p5 \
EVAL_AGENT_API_KEY=$FIREWORKS_API_KEY \
EVAL_AGENT_BASE_URL=https://api.fireworks.ai/inference/v1 \
bun run eval suite --suite configs/suites/agisdk-daily-10.json --publish r2
```
### Suites and variants
A **suite** is what we run: the task dataset, graders, worker count, timeout, and browser settings. For example, `agisdk-daily-10` means "run these 10 AGI SDK tasks and grade them with `agisdk_state_diff`."
A **variant** is the model setup we are testing on that suite. `EVAL_VARIANT` is just the human-readable name for that setup. The actual model connection still comes from `EVAL_AGENT_PROVIDER`, `EVAL_AGENT_MODEL`, `EVAL_AGENT_API_KEY`, and `EVAL_AGENT_BASE_URL`.
This lets us run the same suite against multiple model setups without copying the benchmark config:
```txt
agisdk-daily-10 + kimi-fireworks
agisdk-daily-10 + claude-sonnet
agisdk-daily-10 + clado-action-000159
```
For `orchestrator-executor` suites, there can also be an executor model/backend. The `EVAL_AGENT_*` vars describe the main agent or orchestrator. The optional `EVAL_EXECUTOR_*` or `CLADO_ACTION_*` vars describe the delegated executor.
## Agent types
| Type | Description |
@@ -111,9 +66,9 @@ The orchestrator works with any LLM provider. The executor can be another LLM, o
},
"executor": {
"provider": "clado-action",
"model": "Qwen3.5-35B-A3B-action-000159-merged",
"model": "qwen3-vl-30b-a3b-instruct",
"apiKey": "",
"baseUrl": "https://clado-ai--clado-browseros-action-000159-merged-actionmod-f4a6ef.modal.run"
"baseUrl": "https://clado-ai--clado-browseros-action-actionmodel-generate.modal.run"
}
}
}
@@ -141,20 +96,6 @@ The `apiKey` field supports two formats:
- **Env var name**: `"OPENAI_API_KEY"` — resolved from `.env.development` at runtime
- **Direct value**: `"sk-xxxxx"` — used as-is (not recommended)
### Environment variables
| Variable | Used for |
|----------|----------|
| `EVAL_AGENT_PROVIDER`, `EVAL_AGENT_MODEL`, `EVAL_AGENT_API_KEY`, `EVAL_AGENT_BASE_URL`, `EVAL_AGENT_SUPPORTS_IMAGES` | Suite variant model selection |
| `FIREWORKS_API_KEY`, `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, provider-specific keys | Config-file or provider-backed model calls |
| `EVAL_EXECUTOR_MODEL`, `EVAL_EXECUTOR_API_KEY`, `EVAL_EXECUTOR_BASE_URL` | Suite-mode orchestrator executor override |
| `CLADO_ACTION_MODEL`, `CLADO_ACTION_API_KEY`, `CLADO_ACTION_BASE_URL` | Clado executor defaults |
| `BROWSEROS_BINARY` | BrowserOS binary path in CI/local smoke runs |
| `BROWSEROS_SERVER_URL` | Optional grader MCP URL override |
| `WEBARENA_INFINITY_DIR` | Local WebArena-Infinity checkout for Infinity tasks |
| `NOPECHA_API_KEY` | CAPTCHA solver extension |
| `EVAL_R2_ACCOUNT_ID`, `EVAL_R2_ACCESS_KEY_ID`, `EVAL_R2_SECRET_ACCESS_KEY`, `EVAL_R2_BUCKET`, `EVAL_R2_CDN_BASE_URL` | R2 upload and viewer URL |
### Supported providers
| Provider | `provider` value | Requires `baseUrl` |
@@ -169,22 +110,6 @@ The `apiKey` field supports two formats:
| Ollama | `ollama` | No |
| Clado Action (executor only) | `clado-action` | Yes |
### R2 publishing
`suite --config ... --publish r2` and `publish --target r2` upload the run artifacts plus `viewer.html` to the viewer-compatible R2 layout:
```bash
export EVAL_R2_ACCOUNT_ID=...
export EVAL_R2_ACCESS_KEY_ID=...
export EVAL_R2_SECRET_ACCESS_KEY=...
export EVAL_R2_BUCKET=browseros-eval
export EVAL_R2_CDN_BASE_URL=https://eval.browseros.com
```
`EVAL_R2_CDN_BASE_URL` must be a public R2 custom domain, `r2.dev` URL, or Worker URL. Do not set it to the private `*.r2.cloudflarestorage.com` S3 API endpoint.
Published runs are available at `EVAL_R2_CDN_BASE_URL/viewer.html?run=<run-id>`.
### BrowserOS infrastructure
```json
@@ -212,12 +137,10 @@ Each worker gets its own Chrome instance. Worker N uses `base_port + N` for CDP
| File | Tasks | Description |
|------|-------|-------------|
| `agisdk-daily-10.jsonl` | 10 | Daily AGI SDK / REAL Bench subset |
| `webvoyager.jsonl` | 643 | Full WebVoyager benchmark |
| `mind2web.jsonl` | 300 | Online-Mind2Web |
| `webbench-{0,1,2}of4-50.jsonl` | 50 each | WebBench shards (50-task subsets) |
| `agisdk-real-smoke.jsonl` | 1 | AGI SDK / REAL Bench smoke task |
| `agisdk-real.jsonl` | 36 | AGI SDK / REAL Bench (action-only tasks) |
| `agisdk-real.jsonl` | 40 | AGI SDK / REAL Bench (action-only tasks) |
| `webarena-infinity-hard-50.jsonl` | 50 | WebArena-Infinity hard set |
| `browsecomp-medium-hard-50.jsonl` | 50 | BrowseComp medium-hard |
| `browsecomp-very-hard-50.jsonl` | 50 | BrowseComp very-hard |
@@ -244,47 +167,14 @@ results/
browseros-agent-weekly/
2026-04-29-1430/
Amazon--0/
attempt.json # Stable attempt summary for viewer/reporting
metadata.json # Task result, timing, grader scores
grades.json # Compact grader results
messages.jsonl # Full message log
grader-artifacts/ # Grader-specific inputs/outputs/stderr
screenshots/
001.png # Step-by-step screenshots
002.png
summary.json # Aggregate pass rates
```
R2 publishing preserves the task files under `runs/<run-id>/...`, writes `runs/<run-id>/manifest.json`, and uploads `viewer.html` at the bucket root. The viewer URL is `EVAL_R2_CDN_BASE_URL/viewer.html?run=<run-id>`.
### R2 viewer manifest
`runs/<run-id>/manifest.json` is the source of truth for the public viewer. New manifests include `schemaVersion: 2` and each task includes explicit artifact paths:
```json
{
"schemaVersion": 2,
"runId": "agisdk-real-smoke-2026-04-30-0000",
"tasks": [
{
"queryId": "agisdk-dashdish-10",
"paths": {
"metadata": "tasks/agisdk-dashdish-10/metadata.json",
"messages": "tasks/agisdk-dashdish-10/messages.jsonl",
"grades": "tasks/agisdk-dashdish-10/grades.json",
"trace": "tasks/agisdk-dashdish-10/trace.jsonl",
"screenshots": "tasks/agisdk-dashdish-10/screenshots",
"graderArtifacts": "tasks/agisdk-dashdish-10/grader-artifacts"
}
}
]
}
```
The static viewer uses `task.paths` when present. Older uploaded runs without `schemaVersion` or `task.paths` still work through the legacy inferred layout: `runs/<run-id>/<task-id>/metadata.json`, `messages.jsonl`, and `screenshots/<n>.png`.
Manifest paths are stable artifact locations, not a guarantee that every optional artifact exists for every task. For example, `attempt.json`, `trace.jsonl`, or grader artifact directories may be absent when that artifact was not produced by the run.
## Troubleshooting
**BrowserOS not found**: Expects `/Applications/BrowserOS.app/Contents/MacOS/BrowserOS`. Set `BROWSEROS_BINARY` to override.

View File

@@ -7,8 +7,8 @@
"baseUrl": "https://openrouter.ai/api/v1",
"supportsImages": true
},
"dataset": "../../data/agisdk-real-smoke.jsonl",
"num_workers": 1,
"dataset": "../data/agisdk-real.jsonl",
"num_workers": 10,
"restart_server_per_task": true,
"browseros": {
"server_url": "http://127.0.0.1:9110",

View File

@@ -7,7 +7,7 @@
"baseUrl": "https://openrouter.ai/api/v1",
"supportsImages": true
},
"dataset": "../../data/webbench-2of4-50.jsonl",
"dataset": "../data/webbench-2of4-50.jsonl",
"num_workers": 10,
"restart_server_per_task": true,
"browseros": {

View File

@@ -14,7 +14,7 @@
"baseUrl": "https://api.fireworks.ai/inference/v1"
}
},
"dataset": "../../data/webbench-2of4-50.jsonl",
"dataset": "../data/webbench-2of4-50.jsonl",
"num_workers": 10,
"restart_server_per_task": true,
"browseros": {

View File

@@ -9,12 +9,12 @@
},
"executor": {
"provider": "clado-action",
"model": "Qwen3.5-35B-A3B-action-000159-merged",
"model": "qwen3-vl-30b-a3b-instruct",
"apiKey": "",
"baseUrl": "https://clado-ai--clado-browseros-action-000159-merged-actionmod-f4a6ef.modal.run"
"baseUrl": "https://clado-ai--clado-browseros-action-actionmodel-generate.modal.run"
}
},
"dataset": "../../data/agisdk-real.jsonl",
"dataset": "../data/webbench-2of4-50.jsonl",
"num_workers": 10,
"restart_server_per_task": true,
"browseros": {
@@ -23,11 +23,11 @@
"base_server_port": 9110,
"base_extension_port": 9310,
"load_extensions": false,
"headless": true
"headless": false
},
"captcha": {
"api_key_env": "NOPECHA_API_KEY"
},
"graders": ["agisdk_state_diff"],
"graders": ["performance_grader"],
"timeout_ms": 1800000
}

View File

@@ -7,7 +7,7 @@
"baseUrl": "https://openrouter.ai/api/v1",
"supportsImages": true
},
"dataset": "../../data/webarena-infinity-hard-50.jsonl",
"dataset": "../data/webarena-infinity-hard-50.jsonl",
"num_workers": 10,
"restart_server_per_task": true,
"browseros": {

View File

@@ -1,26 +0,0 @@
{
"agent": {
"type": "single",
"provider": "openai-compatible",
"model": "accounts/fireworks/models/kimi-k2p5",
"apiKey": "FIREWORKS_API_KEY",
"baseUrl": "https://api.fireworks.ai/inference/v1",
"supportsImages": true
},
"dataset": "../../data/agisdk-real.jsonl",
"num_workers": 4,
"restart_server_per_task": true,
"browseros": {
"server_url": "http://127.0.0.1:9110",
"base_cdp_port": 9010,
"base_server_port": 9110,
"base_extension_port": 9310,
"load_extensions": false,
"headless": false
},
"captcha": {
"api_key_env": "NOPECHA_API_KEY"
},
"graders": ["agisdk_state_diff"],
"timeout_ms": 1800000
}

View File

@@ -1,22 +0,0 @@
{
"id": "agisdk-daily-10",
"dataset": "../../data/agisdk-daily-10.jsonl",
"agent": {
"type": "single"
},
"graders": ["agisdk_state_diff"],
"workers": 1,
"restartBrowserPerTask": true,
"timeoutMs": 1800000,
"browseros": {
"server_url": "http://127.0.0.1:9110",
"base_cdp_port": 9010,
"base_server_port": 9110,
"base_extension_port": 9310,
"load_extensions": false,
"headless": true
},
"captcha": {
"api_key_env": "NOPECHA_API_KEY"
}
}

View File

@@ -1,22 +0,0 @@
{
"id": "agisdk-real-smoke",
"dataset": "../../data/agisdk-real-smoke.jsonl",
"agent": {
"type": "single"
},
"graders": ["agisdk_state_diff"],
"workers": 1,
"restartBrowserPerTask": true,
"timeoutMs": 1800000,
"browseros": {
"server_url": "http://127.0.0.1:9110",
"base_cdp_port": 9010,
"base_server_port": 9110,
"base_extension_port": 9310,
"load_extensions": false,
"headless": false
},
"captcha": {
"api_key_env": "NOPECHA_API_KEY"
}
}

View File

@@ -1,22 +0,0 @@
{
"id": "agisdk-real",
"dataset": "../../data/agisdk-real.jsonl",
"agent": {
"type": "single"
},
"graders": ["agisdk_state_diff"],
"workers": 1,
"restartBrowserPerTask": true,
"timeoutMs": 1800000,
"browseros": {
"server_url": "http://127.0.0.1:9110",
"base_cdp_port": 9010,
"base_server_port": 9110,
"base_extension_port": 9310,
"load_extensions": false,
"headless": false
},
"captcha": {
"api_key_env": "NOPECHA_API_KEY"
}
}

View File

@@ -5,7 +5,7 @@
"model": "openai/gpt-4.1",
"apiKey": "OPENROUTER_API_KEY"
},
"dataset": "../../data/mind2web.jsonl",
"dataset": "../data/mind2web.jsonl",
"num_workers": 5,
"restart_server_per_task": true,
"browseros": {

View File

@@ -7,7 +7,7 @@
"baseUrl": "https://api.fireworks.ai/inference/v1",
"supportsImages": true
},
"dataset": "../../data/webvoyager.jsonl",
"dataset": "../data/webvoyager.jsonl",
"num_workers": 3,
"restart_server_per_task": true,
"browseros": {

View File

@@ -1,10 +0,0 @@
{"query_id": "agisdk-dashdish-10", "dataset": "agisdk-real", "query": "Place an order from \"Souvla\" for a \"Medium Classic Cheeseburger\" and a \"Small Bacon Double Cheeseburger\" with \"Standard Delivery\" as the method with the default charged options.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-10", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Doordash"}}}
{"query_id": "agisdk-fly-unified-5", "dataset": "agisdk-real", "query": "Find me the cheapest fare for a flight from Orlando to Milwaukee on December 5th, 2024 and book it.\nPassenger: John Doe\nDate of Birth: 01/01/1990\nSex: Male\nSeat Selection: No\nPayment: Credit Card (378342143523967), Exp: 12/30, Security Code: 420 Address: 123 Main St, San Francisco, CA, 94105, USA, Phone: 555-123-4567, Email: johndoe@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-5", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-5", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "United Airlines"}}}
{"query_id": "agisdk-udriver-10", "dataset": "agisdk-real", "query": "Order me a ride for 4pm, I'll be at the de Young muesum headed to the Waterbar, fanciest option possible please.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-10", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
{"query_id": "agisdk-udriver-9", "dataset": "agisdk-real", "query": "Book me a ride from the thai restaurant I last took a ride to for later today at 2pm, I'll be at 333 Apartments on Fremont", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-9", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-9", "challenge_type": "retrieval-action", "difficulty": "hard", "similar_to": "Uber"}}}
{"query_id": "agisdk-topwork-4", "dataset": "agisdk-real", "query": "Create a job post for a UI/UX Designer with expertise in Figma, Sketch, and Adobe Creative Suite, including project details, timeline, and required skills (Wireframing, Prototyping, Responsive Design).", "graders": ["agisdk_state_diff"], "start_url": "https://evals-topwork.vercel.app", "metadata": {"original_task_id": "topwork-4", "website": "TopWork", "category": "agisdk-real", "additional": {"agisdk_task_id": "topwork-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Upwork"}}}
{"query_id": "agisdk-gocalendar-4", "dataset": "agisdk-real", "query": "Change the \"Team Check-In\" event on July 18, 2024, name to \"Project Kickoff\" and update the location to \"Zoom\"", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-4", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Google Calendar"}}}
{"query_id": "agisdk-staynb-6", "dataset": "agisdk-real", "query": "Find and book the stay with the best value for money (cheapest stay with the best reviews) for 1 day. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-6", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-6", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "Airbnb"}}}
{"query_id": "agisdk-udriver-11", "dataset": "agisdk-real", "query": "I need to go from Pacific Catch on Chestnut back home to 333 Fremont now. If the fancy version is within ten dollars of the regular one, book that.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-11", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-11", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
{"query_id": "agisdk-networkin-5", "dataset": "agisdk-real", "query": "Send a connection request to John Smith.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-5", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-5", "challenge_type": "action", "difficulty": "easy", "similar_to": "LinkedIn"}}}
{"query_id": "agisdk-zilloft-6", "dataset": "agisdk-real", "query": "Select a property listed in San Francisco as \"Condos\" within a price range under $300,000 and request a tour for tomorrow at 4:00 PM. Use these contact details: Name: Sarah Brown, Email: sarahbrown@example.com, Phone: 555-987-6543.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-zilloft.vercel.app", "metadata": {"original_task_id": "zilloft-6", "website": "Zilloft", "category": "agisdk-real", "additional": {"agisdk_task_id": "zilloft-6", "challenge_type": "action", "difficulty": "medium", "similar_to": "Zillow"}}}

View File

@@ -1 +0,0 @@
{"query_id": "agisdk-dashdish-10", "dataset": "agisdk-real", "query": "Place an order from \"Souvla\" for a \"Medium Classic Cheeseburger\" and a \"Small Bacon Double Cheeseburger\" with \"Standard Delivery\" as the method with the default charged options.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-10", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Doordash"}}}

View File

@@ -32,5 +32,9 @@
{"query_id": "agisdk-networkin-10", "dataset": "agisdk-real", "query": "Generate a polite follow-up message for a previous unanswered chat, starting with \"Following up on\".", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-10", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-10", "challenge_type": "action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
{"query_id": "agisdk-gomail-3", "dataset": "agisdk-real", "query": "Compose a new email to jonathan.smith@example.com with the subject \"Meeting Notes\" and body \"Please find the meeting notes attached.\"", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-3", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-3", "challenge_type": "action", "difficulty": "easy", "similar_to": "Gmail"}}}
{"query_id": "agisdk-udriver-6", "dataset": "agisdk-real", "query": "Me and 4 friends need a ride from the Palace Hotel to dinner at Osha Thai leaving now", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-6", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-6", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
{"query_id": "agisdk-staynb-9", "dataset": "agisdk-real", "query": "Book a stay with the maximum number of guests supported. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-9", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-9", "challenge_type": "action", "difficulty": "hard", "similar_to": "Airbnb"}}}
{"query_id": "agisdk-zilloft-3", "dataset": "agisdk-real", "query": "Find a home in San Diego priced under $150,000 with at least 2 bedrooms and request a tour. Use these details: Contact Name: John Doe, Email: johndoe@example.com, Phone: 555-123-4567, Tour Time: 2:00 PM, Tour Date: First available.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-zilloft.vercel.app", "metadata": {"original_task_id": "zilloft-3", "website": "Zilloft", "category": "agisdk-real", "additional": {"agisdk_task_id": "zilloft-3", "challenge_type": "retrieval-action", "difficulty": "easy", "similar_to": "Zillow"}}}
{"query_id": "agisdk-fly-unified-6", "dataset": "agisdk-real", "query": "Reserve me a seat for the flight from Austin to Pittsburgh departing on December 11th, 2024 at 8:00 in Basic Economy.\nPassenger: Alice Brown\nDate of Birth: 05/20/1992\nSex: Female\nSeat Selection: Yes (Aisle seat)\nPayment: Credit Card (378342143523967), Exp: 09/27, security code: 332 Address: 789 Pine St, Los Angeles, CA, 90012, USA, Phone: 555-456-7890, Email: alicebrown@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-6", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-6", "challenge_type": "action", "difficulty": "medium", "similar_to": "United Airlines"}}}
{"query_id": "agisdk-opendining-3", "dataset": "agisdk-real", "query": "Book a table at \"The Royal Dine\" for a party of 4 on July 20, 2024, at 7 PM. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-3", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-3", "challenge_type": "action", "difficulty": "easy", "similar_to": "OpenTable"}}}
{"query_id": "agisdk-gocalendar-7", "dataset": "agisdk-real", "query": "Reschedule the \"Morning Coffee with sister\" event from July 18, 2024, at 9 AM to July 19, 2024, at 10AM using drag-and-drop functionality", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-7", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-7", "challenge_type": "action", "difficulty": "medium", "similar_to": "Google Calendar"}}}
{"query_id": "agisdk-staynb-5", "dataset": "agisdk-real", "query": "Use the search bar to look for a stay. For the \"Where\" section, use the \"Search by region\" popover and select \"Europe\". Set the check-in date to October 13th and the check-out date to October 23rd. For the \"Who\" section, select 1 infant, 2 children, and 2 adults. Press the search button, select the first stay, and book it.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-5", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-5", "challenge_type": "action", "difficulty": "medium", "similar_to": "Airbnb"}}}

View File

@@ -5,7 +5,6 @@
"type": "module",
"scripts": {
"eval": "bun --env-file=.env.development run src/index.ts",
"test": "bun run ../../scripts/run-bun-test.ts ./apps/eval/tests",
"typecheck": "tsc --noEmit"
},
"dependencies": {

View File

@@ -64,37 +64,6 @@ EXCLUDED_TASKS = {
# the grader's first criterion (search history contains "stanford") was
# never triggered server-side. Eval-site bug.
"networkin-9",
# Goal text instructs "move event to July 19, 10 AM" but the grader expects
# `eventsDiff.updated.*.start == "2024-07-18T17:00:00Z"` (= July 18, 10 AM
# PDT — same day, 1 hour shift). Goal contradicts grader: following the
# goal yields July 19 timestamps; satisfying the grader requires ignoring
# the explicit "to July 19" instruction. Confirmed via 8-trial deep-dive:
# never passed even after the Phase 2 HTML5 dnd dispatch fix made the drag
# actually populate `eventsDiff.updated` (now produces July 19 values, but
# grader rejects them).
"gocalendar-7",
# Grader hardcodes literal year strings `'Oct 13 2025'` / `'Oct 23 2025'`
# in checkin/checkout criteria. Today is 2026, and the staynb date picker
# interprets bare "Oct 13" as the most recent past instance — currently
# 2024, not 2025. Even a perfectly-acting agent cannot produce a booking
# whose persisted date contains "2025". Confirmed via 8 trials, 0 passes.
"staynb-5",
# Goal says "maximum number of guests supported"; grader expects the very
# specific string "32 Guests, 16 Infants" — which requires the agent to
# know that (a) Adults+Children sum into the displayed "Guests" count,
# (b) Infants render separately, (c) Pets are excluded, (d) per-category
# cap is 16 despite no UI affordance signalling it. None of this is in
# the prompt. 8 trials, 0 passes; even Opus 4.6 stopped at 16 (one
# category maxed). Task is under-specified relative to grader expectation.
"staynb-9",
# Grader requires `contains(booking.date, '2024-07-20')` but the eval-site
# date picker is a React-controlled textbox that the agent's `fill` tool
# frequently no-ops on. 3 of 8 trials passed (when fill happened to stick),
# 5 failed with `actual_value: False` (booking persisted with the eval-site
# default search date, not Jul 20). Effectively a coin-flip task that
# exercises tool-fidelity flakiness rather than agent capability —
# contributes noise, not signal. Excluding for eval reliability.
"opendining-3",
}
# Far-future replacement used by `freshen_goal_dates` when a task's hardcoded

View File

@@ -1,73 +1,34 @@
/**
* Smoke-test for the Clado BrowserOS Action endpoint.
*
* Health-checks the model, then runs a generate call and prints every
* field the new contract documents (action, coordinates, text, key,
* direction, scroll/drag fields, wait, end+final_answer, thinking,
* parse_error, raw_response).
* Test script for Clado API endpoints (grounding + action models)
*
* Usage:
* bun apps/eval/scripts/test-clado-api.ts [screenshot-path]
*
* If no screenshot path is given, captures one over MCP from a
* running BrowserOS server (default http://127.0.0.1:9110, override
* with BROWSEROS_URL).
*
* Cold start can take ~5 minutes; the script waits up to 6.
* If no screenshot provided, captures one from a running BrowserOS server.
*/
import { readFile } from 'node:fs/promises'
import { resolve } from 'node:path'
const ACTION_URL =
'https://clado-ai--clado-browseros-action-000159-merged-actionmod-f4a6ef.modal.run'
'https://clado-ai--clado-browseros-action-actionmodel-generate.modal.run'
const ACTION_HEALTH_URL =
'https://clado-ai--clado-browseros-action-000159-merged-actionmod-5e5033.modal.run'
'https://clado-ai--clado-browseros-action-actionmodel-health.modal.run'
const GROUNDING_URL =
'https://clado-ai--clado-browseros-grounding-groundingmodel-generate.modal.run'
const GROUNDING_HEALTH_URL =
'https://clado-ai--clado-browseros-grounding-groundingmodel-health.modal.run'
const COLD_START_BUDGET_MS = 360_000 // 6 min — Clado cold start is ~5 min
const COLD_START_WARN_MS = 30_000
interface CladoResponse {
action?: string | null
thinking?: string | null
raw_response?: string
parse_error?: string | null
inference_time_seconds?: number
x?: number
y?: number
text?: string
key?: string
direction?: string
amount?: number
startX?: number
startY?: number
endX?: number
endY?: number
time?: number
final_answer?: string | null
}
async function checkHealth(): Promise<boolean> {
console.log(`\n--- Action model health ---`)
console.log(` URL: ${ACTION_HEALTH_URL}`)
console.log(
` Note: cold start can take ~5 min; waiting up to ${COLD_START_BUDGET_MS / 1000}s.`,
)
async function checkHealth(name: string, url: string): Promise<boolean> {
console.log(`\n--- ${name} health check ---`)
console.log(` URL: ${url}`)
const start = performance.now()
const warn = setTimeout(() => {
console.log(
` ...still waiting (${COLD_START_WARN_MS / 1000}s in) — model is likely cold-starting on Modal.`,
)
}, COLD_START_WARN_MS)
try {
const resp = await fetch(ACTION_HEALTH_URL, {
signal: AbortSignal.timeout(COLD_START_BUDGET_MS),
})
const resp = await fetch(url, { signal: AbortSignal.timeout(30_000) })
const elapsed = ((performance.now() - start) / 1000).toFixed(2)
const body = await resp.text()
console.log(` Status: ${resp.status} (${elapsed}s)`)
console.log(` Body: ${body.slice(0, 400)}`)
console.log(` Body: ${body.slice(0, 200)}`)
return resp.ok
} catch (err) {
const elapsed = ((performance.now() - start) / 1000).toFixed(2)
@@ -75,34 +36,63 @@ async function checkHealth(): Promise<boolean> {
` FAILED (${elapsed}s): ${err instanceof Error ? err.message : err}`,
)
return false
} finally {
clearTimeout(warn)
}
}
async function generate(
label: string,
async function testGenerate(
name: string,
url: string,
payload: Record<string, unknown>,
): Promise<CladoResponse | null> {
console.log(`\n--- ${label} ---`)
console.log(` URL: ${ACTION_URL}`)
): Promise<Record<string, unknown> | null> {
console.log(`\n--- ${name} generate ---`)
console.log(` URL: ${url}`)
console.log(` Instruction: ${payload.instruction}`)
console.log(
` Image size: ${((payload.image_base64 as string).length / 1024).toFixed(0)} KB (base64)`,
` Image size: ${((payload.image_base64 as string).length / 1024).toFixed(0)} KB (base64)`,
)
if (payload.history && payload.history !== 'None') {
console.log(` History: ${payload.history}`)
}
if (payload.history) console.log(` History: ${payload.history}`)
const start = performance.now()
let resp: Response
try {
resp = await fetch(ACTION_URL, {
const resp = await fetch(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
signal: AbortSignal.timeout(COLD_START_BUDGET_MS),
signal: AbortSignal.timeout(120_000),
})
const elapsed = ((performance.now() - start) / 1000).toFixed(2)
if (!resp.ok) {
const body = await resp.text()
console.log(` FAILED: HTTP ${resp.status} (${elapsed}s)`)
console.log(` Body: ${body.slice(0, 400)}`)
return null
}
const result = (await resp.json()) as Record<string, unknown>
console.log(` Status: ${resp.status} (${elapsed}s)`)
console.log(` Action: ${result.action}`)
if (result.x !== null && result.x !== undefined)
console.log(` Coordinates: (${result.x}, ${result.y})`)
if (result.text)
console.log(` Text: ${(result.text as string).slice(0, 100)}`)
if (result.key) console.log(` Key: ${result.key}`)
if (result.inference_time_seconds)
console.log(` Inference: ${result.inference_time_seconds}s`)
// Show thinking if present
const raw = result.raw_response as string | undefined
if (raw) {
const thinkMatch = raw.match(/<thinking>([\s\S]*?)<\/thinking>/)
if (thinkMatch) {
const thinking = thinkMatch[1].trim()
console.log(
` Thinking: ${thinking.slice(0, 200)}${thinking.length > 200 ? '...' : ''}`,
)
}
}
return result
} catch (err) {
const elapsed = ((performance.now() - start) / 1000).toFixed(2)
console.log(
@@ -110,50 +100,6 @@ async function generate(
)
return null
}
const elapsed = ((performance.now() - start) / 1000).toFixed(2)
if (!resp.ok) {
const body = await resp.text()
console.log(` HTTP ${resp.status} ${resp.statusText} (${elapsed}s)`)
console.log(` Body: ${body.slice(0, 400)}`)
return null
}
const result = (await resp.json()) as CladoResponse
console.log(` HTTP ${resp.status} (${elapsed}s)`)
console.log(` action: ${result.action ?? 'null'}`)
if (result.parse_error) {
console.log(` parse_error: ${result.parse_error}`)
}
if (result.thinking) {
const trimmed = result.thinking.replace(/\s+/g, ' ').trim()
console.log(
` thinking: ${trimmed.slice(0, 240)}${trimmed.length > 240 ? '…' : ''}`,
)
}
if (typeof result.x === 'number' || typeof result.y === 'number') {
console.log(` x, y: ${result.x}, ${result.y}`)
}
if (typeof result.text === 'string')
console.log(` text: ${result.text.slice(0, 120)}`)
if (typeof result.key === 'string')
console.log(` key: ${result.key}`)
if (typeof result.direction === 'string')
console.log(` direction: ${result.direction}`)
if (typeof result.amount === 'number')
console.log(` amount: ${result.amount}`)
if (typeof result.startX === 'number' || typeof result.endX === 'number') {
console.log(
` drag: (${result.startX}, ${result.startY}) → (${result.endX}, ${result.endY})`,
)
}
if (typeof result.time === 'number')
console.log(` time: ${result.time}s`)
if (result.final_answer)
console.log(` final_answer: ${result.final_answer.slice(0, 240)}`)
if (typeof result.inference_time_seconds === 'number')
console.log(` inference_time_seconds: ${result.inference_time_seconds}`)
return result
}
async function loadScreenshot(path?: string): Promise<string> {
@@ -164,9 +110,10 @@ async function loadScreenshot(path?: string): Promise<string> {
return data.toString('base64')
}
// Try to capture from a running BrowserOS server
const serverUrl = process.env.BROWSEROS_URL || 'http://127.0.0.1:9110'
console.log(
`No screenshot path provided. Capturing from ${serverUrl} via MCP...`,
`No screenshot path provided. Trying to capture from ${serverUrl}...`,
)
const { Client } = await import('@modelcontextprotocol/sdk/client/index.js')
@@ -187,101 +134,82 @@ async function loadScreenshot(path?: string): Promise<string> {
arguments: { format: 'png', page: 1 },
})) as { content: Array<{ type: string; data?: string }> }
const image = result.content?.find((c) => c.type === 'image')
if (!image?.data)
throw new Error('No image data in take_screenshot response')
const imageContent = result.content?.find((c) => c.type === 'image')
if (!imageContent?.data)
throw new Error('No image data in screenshot response')
console.log(
`Captured screenshot (${(image.data.length / 1024).toFixed(0)} KB base64)`,
`Captured screenshot (${(imageContent.data.length / 1024).toFixed(0)} KB base64)`,
)
return image.data
return imageContent.data
} finally {
try {
await transport.close()
} catch {
/* ignore */
}
} catch {}
}
}
function summarize(history: CladoResponse[]): string {
if (history.length === 0) return 'None'
return history
.map((h) => {
switch (h.action) {
case 'click':
case 'double_click':
case 'right_click':
case 'hover':
return `${h.action}(${h.x}, ${h.y})`
case 'type':
return `type(${JSON.stringify(h.text ?? '')})`
case 'press_key':
return `press_key(${JSON.stringify(h.key ?? '')})`
case 'scroll':
return `scroll(${h.direction ?? 'down'})`
case 'drag':
return `drag(${h.startX},${h.startY} -> ${h.endX},${h.endY})`
case 'wait':
return `wait(${h.time ?? 1}s)`
case 'end':
return 'end()'
default:
return h.action ?? 'invalid'
}
})
.join(' -> ')
}
async function main() {
console.log('=== Clado action endpoint smoke test ===')
const screenshotPath = process.argv[2]
const healthy = await checkHealth()
if (!healthy) {
console.log('\nHealth check failed. Exiting.')
console.log('=== Clado API Test ===\n')
// Health checks (parallel)
const [actionHealthy, groundingHealthy] = await Promise.all([
checkHealth('Action Model', ACTION_HEALTH_URL),
checkHealth('Grounding Model', GROUNDING_HEALTH_URL),
])
if (!actionHealthy && !groundingHealthy) {
console.log('\nBoth endpoints are down. Exiting.')
process.exit(1)
}
// Load screenshot
let imageBase64: string
try {
imageBase64 = await loadScreenshot(process.argv[2])
imageBase64 = await loadScreenshot(screenshotPath)
} catch (err) {
console.log(
`\nFailed to load screenshot: ${err instanceof Error ? err.message : err}`,
)
console.log(
'Pass a path: bun apps/eval/scripts/test-clado-api.ts path/to/screenshot.png',
'Provide a screenshot path: bun apps/eval/scripts/test-clado-api.ts path/to/screenshot.png',
)
process.exit(1)
}
const history: CladoResponse[] = []
const instruction = 'Click on the search button or search bar'
// Step 1: open task — let the model decide what to do.
const step1 = await generate('Step 1: cold task', {
instruction: 'Find the search bar and click it',
image_base64: imageBase64,
history: 'None',
})
if (step1?.action) history.push(step1)
// Step 2: continuation with history, asks for typing.
if (step1?.action) {
const step2 = await generate('Step 2: with history', {
instruction: 'Type "hello world" into the search bar',
// Test grounding model
if (groundingHealthy) {
await testGenerate('Grounding Model', GROUNDING_URL, {
instruction,
image_base64: imageBase64,
history: summarize(history),
})
if (step2?.action) history.push(step2)
} else {
console.log('\nSkipping grounding model (unhealthy)')
}
// Step 3: ask for end with a final answer to exercise that field.
await generate('Step 3: ask for end+final_answer', {
instruction:
'You have completed the task. Reply with end() and final_answer="done".',
image_base64: imageBase64,
history: summarize(history),
})
// Test action model (no history)
if (actionHealthy) {
const result = await testGenerate('Action Model (step 1)', ACTION_URL, {
instruction,
image_base64: imageBase64,
history: 'None',
})
// Test action model with history (simulate multi-turn)
if (result && result.action === 'click') {
await testGenerate('Action Model (step 2, with history)', ACTION_URL, {
instruction: 'Type "hello world" in the search bar',
image_base64: imageBase64,
history: `click(${result.x}, ${result.y})`,
})
}
} else {
console.log('\nSkipping action model (unhealthy)')
}
console.log('\n=== Done ===')
}

View File

@@ -1,43 +1,349 @@
#!/usr/bin/env bun
/**
* Upload eval runs to R2.
*
* Two modes:
* bun scripts/upload-run.ts results/browseros-agent-weekly/2026-03-21-1730
* → uploads that specific run
*
* bun scripts/upload-run.ts results/browseros-agent-weekly
* → finds all timestamped subfolders, uploads any not yet in R2
*
* Env vars: EVAL_R2_ACCOUNT_ID, EVAL_R2_ACCESS_KEY_ID, EVAL_R2_SECRET_ACCESS_KEY
* EVAL_R2_BUCKET (default: browseros-eval)
* EVAL_R2_CDN_BASE_URL (default: https://eval.browseros.com)
*/
import { readdir, readFile, stat } from 'node:fs/promises'
import { basename, dirname, extname, join } from 'node:path'
import {
loadR2ConfigFromEnv,
R2Publisher,
} from '../src/publishing/r2-publisher'
GetObjectCommand,
PutObjectCommand,
S3Client,
} from '@aws-sdk/client-s3'
async function main(): Promise<void> {
const inputDir = process.argv[2]
if (!inputDir) {
throw new Error(
'Usage:\n' +
' bun scripts/upload-run.ts results/config-name/2026-03-21-1730\n' +
' bun scripts/upload-run.ts results/config-name',
const CONCURRENCY = 20
const CONTENT_TYPES: Record<string, string> = {
'.json': 'application/json',
'.jsonl': 'application/x-ndjson',
'.png': 'image/png',
}
interface R2Config {
accountId: string
accessKeyId: string
secretAccessKey: string
bucket: string
cdnBaseUrl: string
}
function loadConfig(): R2Config {
const accountId = process.env.EVAL_R2_ACCOUNT_ID
const accessKeyId = process.env.EVAL_R2_ACCESS_KEY_ID
const secretAccessKey = process.env.EVAL_R2_SECRET_ACCESS_KEY
if (!accountId || !accessKeyId || !secretAccessKey) {
console.error(
'Missing required env vars: EVAL_R2_ACCOUNT_ID, EVAL_R2_ACCESS_KEY_ID, EVAL_R2_SECRET_ACCESS_KEY',
)
process.exit(1)
}
const publisher = new R2Publisher({ config: loadR2ConfigFromEnv() })
const result = await publisher.publishPath(inputDir)
for (const run of result.uploadedRuns) {
console.log(`Uploaded ${run.uploadedFiles} files for ${run.runId}`)
console.log(run.viewerUrl)
return {
accountId,
accessKeyId,
secretAccessKey,
bucket: process.env.EVAL_R2_BUCKET || 'browseros-eval',
cdnBaseUrl: (
process.env.EVAL_R2_CDN_BASE_URL || 'https://eval.browseros.com'
).replace(/\/+$/, ''),
}
for (const runId of result.skippedRuns) {
console.log(`${runId}: already uploaded, skipping`)
}
console.log(
`Done. Uploaded ${result.uploadedRuns.length} run(s), skipped ${result.skippedRuns.length}.`,
}
function createClient(config: R2Config): S3Client {
return new S3Client({
region: 'auto',
endpoint: `https://${config.accountId}.r2.cloudflarestorage.com`,
credentials: {
accessKeyId: config.accessKeyId,
secretAccessKey: config.secretAccessKey,
},
})
}
async function upload(
client: S3Client,
bucket: string,
key: string,
body: Buffer,
contentType: string,
) {
await client.send(
new PutObjectCommand({
Bucket: bucket,
Key: key,
Body: body,
ContentType: contentType,
}),
)
}
main().catch((error) => {
console.error(error instanceof Error ? error.message : String(error))
process.exit(1)
})
async function collectFiles(dir: string): Promise<string[]> {
const files: string[] = []
const entries = await readdir(dir, { withFileTypes: true })
for (const entry of entries) {
const full = join(dir, entry.name)
if (entry.isDirectory()) {
files.push(...(await collectFiles(full)))
} else {
files.push(full)
}
}
return files
}
async function runPool<T>(
items: T[],
concurrency: number,
fn: (item: T) => Promise<void>,
) {
let i = 0
const workers = Array.from({ length: concurrency }, async () => {
while (i < items.length) {
const idx = i++
await fn(items[idx])
}
})
await Promise.all(workers)
}
// Check if a run has already been uploaded to R2
async function isUploaded(
client: S3Client,
bucket: string,
runId: string,
): Promise<boolean> {
try {
await client.send(
new GetObjectCommand({
Bucket: bucket,
Key: `runs/${runId}/manifest.json`,
}),
)
return true
} catch {
return false
}
}
// Detect if a directory is a run dir (has task subdirs with metadata.json)
// vs a config dir (has timestamped subdirs like 2026-03-21-1730/)
async function isRunDir(dir: string): Promise<boolean> {
const entries = await readdir(dir, { withFileTypes: true })
const subdirs = entries.filter((e) => e.isDirectory())
for (const subdir of subdirs) {
const metaPath = join(dir, subdir.name, 'metadata.json')
const metaStat = await stat(metaPath).catch(() => null)
if (metaStat?.isFile()) return true
}
return false
}
async function uploadSingleRun(
runDir: string,
runId: string,
r2Config: R2Config,
client: S3Client,
): Promise<void> {
const taskDirs = await readdir(runDir, { withFileTypes: true })
const taskEntries = taskDirs.filter((d) => d.isDirectory())
if (taskEntries.length === 0) {
console.warn(` No task subdirectories in ${runId}, skipping`)
return
}
const manifestTasks: Record<string, unknown>[] = []
const jobs: { key: string; filePath: string; contentType: string }[] = []
// Extract agent config from first task
let agentConfig: Record<string, unknown> | undefined
let dataset: string | undefined
for (const taskDir of taskEntries) {
const taskId = taskDir.name
const taskPath = join(runDir, taskId)
const metaPath = join(taskPath, 'metadata.json')
let meta: Record<string, unknown> = {}
try {
meta = JSON.parse(await readFile(metaPath, 'utf-8'))
} catch {
continue
}
if (!agentConfig && meta.agent_config)
agentConfig = meta.agent_config as Record<string, unknown>
if (!dataset && meta.dataset) dataset = meta.dataset as string
const files = await collectFiles(taskPath)
let screenshotCount = 0
for (const file of files) {
const relative = file.slice(taskPath.length + 1)
const ext = extname(file)
if (relative.startsWith('screenshots/') && ext === '.png')
screenshotCount++
jobs.push({
key: `runs/${runId}/${taskId}/${relative}`,
filePath: file,
contentType: CONTENT_TYPES[ext] || 'application/octet-stream',
})
}
manifestTasks.push({
queryId: meta.query_id || taskId,
query: meta.query || '',
startUrl: meta.start_url || '',
status:
meta.termination_reason === 'completed'
? 'completed'
: meta.termination_reason || 'unknown',
durationMs: meta.total_duration_ms || 0,
screenshotCount: (meta.screenshot_count as number) || screenshotCount,
graderResults: meta.grader_results || {},
})
}
if (manifestTasks.length === 0) {
console.warn(` No completed tasks in ${runId}, skipping`)
return
}
console.log(
` Uploading ${jobs.length} files across ${manifestTasks.length} tasks...`,
)
let uploaded = 0
await runPool(jobs, CONCURRENCY, async (job) => {
const body = await readFile(job.filePath)
await upload(client, r2Config.bucket, job.key, body, job.contentType)
uploaded++
if (uploaded % 50 === 0 || uploaded === jobs.length) {
console.log(` ${uploaded}/${jobs.length}`)
}
})
// Read summary.json if it exists
let summaryData: Record<string, unknown> | undefined
try {
summaryData = JSON.parse(
await readFile(join(runDir, 'summary.json'), 'utf-8'),
)
} catch {}
// Upload manifest
const manifest = {
runId,
uploadedAt: new Date().toISOString(),
agentConfig,
dataset,
summary: summaryData
? {
passRate: summaryData.passRate,
avgDurationMs: summaryData.avgDurationMs,
}
: undefined,
tasks: manifestTasks,
}
const manifestBody = Buffer.from(JSON.stringify(manifest, null, 2))
await upload(
client,
r2Config.bucket,
`runs/${runId}/manifest.json`,
manifestBody,
'application/json',
)
// Upload viewer.html to bucket root
const viewerPath = join(
import.meta.dir,
'..',
'src',
'dashboard',
'viewer.html',
)
const viewerBody = await readFile(viewerPath)
await upload(client, r2Config.bucket, 'viewer.html', viewerBody, 'text/html')
console.log(` Uploaded ${uploaded + 2} files`)
console.log(` ${r2Config.cdnBaseUrl}/viewer.html?run=${runId}`)
}
async function main() {
const inputDir = process.argv[2]
if (!inputDir) {
console.error(
'Usage:\n' +
' bun scripts/upload-run.ts results/config-name/2026-03-21-1730 (specific run)\n' +
' bun scripts/upload-run.ts results/config-name (all un-uploaded runs)',
)
process.exit(1)
}
const dirStat = await stat(inputDir).catch(() => null)
if (!dirStat?.isDirectory()) {
console.error(`Not a directory: ${inputDir}`)
process.exit(1)
}
const r2Config = loadConfig()
const client = createClient(r2Config)
if (await isRunDir(inputDir)) {
// Single run: results/config-name/2026-03-21-1730
const timestamp = basename(inputDir)
const configName = basename(dirname(inputDir))
const runId = `${configName}-${timestamp}`
console.log(`Uploading run: ${runId}`)
await uploadSingleRun(inputDir, runId, r2Config, client)
} else {
// Config dir: results/config-name/ — upload all un-uploaded runs
const configName = basename(inputDir)
const entries = await readdir(inputDir, { withFileTypes: true })
const runDirs = entries
.filter((e) => e.isDirectory())
.map((e) => e.name)
.sort()
if (runDirs.length === 0) {
console.error('No run subdirectories found')
process.exit(1)
}
console.log(
`Found ${runDirs.length} runs for config "${configName}", checking R2...`,
)
let uploadedCount = 0
for (const dir of runDirs) {
const runId = `${configName}-${dir}`
const alreadyUploaded = await isUploaded(client, r2Config.bucket, runId)
if (alreadyUploaded) {
console.log(` ${runId}: already uploaded, skipping`)
continue
}
console.log(` ${runId}: uploading...`)
await uploadSingleRun(join(inputDir, dir), runId, r2Config, client)
uploadedCount++
}
console.log(
`\nDone. Uploaded ${uploadedCount} new run(s), ${runDirs.length - uploadedCount} already in R2.`,
)
}
}
main()

View File

@@ -24,11 +24,45 @@ import {
PutObjectCommand,
S3Client,
} from '@aws-sdk/client-s3'
import {
buildRunSummaries,
type ReportManifest,
type RunSummary,
} from '../src/reporting/run-summary'
interface ManifestTask {
queryId: string
query: string
status: string
durationMs: number
screenshotCount: number
graderResults: Record<string, { pass: boolean; score: number }>
}
interface Manifest {
runId: string
uploadedAt: string
agentConfig?: { type?: string; model?: string }
dataset?: string
summary?: { passRate?: number; avgDurationMs?: number }
tasks: ManifestTask[]
}
interface RunSummary {
runId: string
configName: string
date: string
avgScore: number
total: number
completed: number
failed: number
timeout: number
avgDurationMs: number
model: string
dataset: string
agentType: string
}
const PASS_FAIL_GRADER_ORDER = [
'agisdk_state_diff',
'infinity_state',
'performance_grader',
]
function requireEnv(name: string): string {
const value = process.env[name]
@@ -53,7 +87,7 @@ const client = new S3Client({
// Step 1: List all manifest.json files in runs/
console.log('Scanning R2 for eval runs...')
const manifests: ReportManifest[] = []
const manifests: Manifest[] = []
let continuationToken: string | undefined
do {
@@ -93,9 +127,64 @@ if (manifests.length === 0) {
}
// Step 2: Build run summaries
const runs: RunSummary[] = buildRunSummaries(manifests)
const runs: RunSummary[] = manifests
.map((m) => {
const total = m.tasks.length
const completed = m.tasks.filter((t) => t.status === 'completed').length
const failed = m.tasks.filter((t) => t.status === 'failed').length
const timeout = m.tasks.filter((t) => t.status === 'timeout').length
let scoredCount = 0
let scoreSum = 0
for (const task of m.tasks) {
if (!task.graderResults) continue
for (const name of PASS_FAIL_GRADER_ORDER) {
if (task.graderResults[name]) {
scoredCount++
scoreSum += task.graderResults[name].score ?? 0
break
}
}
}
const avgScore = scoredCount > 0 ? (scoreSum / scoredCount) * 100 : 0
const durations = m.tasks
.filter((t) => t.durationMs > 0)
.map((t) => t.durationMs)
const avgDurationMs =
durations.length > 0
? durations.reduce((a, b) => a + b, 0) / durations.length
: 0
const date = m.uploadedAt
? `${m.uploadedAt.split('T')[0]} ${m.uploadedAt.split('T')[1]?.slice(0, 5) || ''}`
: m.runId.slice(0, 15)
const model = m.agentConfig?.model || 'unknown'
const dataset = m.dataset || m.runId
const agentType = m.agentConfig?.type || 'unknown'
const configName = extractConfigName(m.runId)
return {
runId: m.runId,
configName,
date,
avgScore,
total,
completed,
failed,
timeout,
avgDurationMs,
model,
dataset,
agentType,
}
})
.sort((a, b) => a.date.localeCompare(b.date))
// Step 3: Identify unique config groups
// runId can be "ci-weekly" (old) or "ci-weekly-2026-03-21-1730" (timestamped)
// Extract config name by stripping the date-time suffix pattern
function escHtml(s: string): string {
return s
.replace(/&/g, '&amp;')
@@ -104,6 +193,12 @@ function escHtml(s: string): string {
.replace(/"/g, '&quot;')
}
function extractConfigName(runId: string): string {
// "browseros-agent-weekly-2026-03-21-1730" → "browseros-agent-weekly"
// "ci-weekly" → "ci-weekly" (no timestamp, old format)
return runId.replace(/-\d{4}-\d{2}-\d{2}-\d{4}$/, '')
}
const configGroups = [...new Set(runs.map((r) => r.configName))]
const defaultConfig = configGroups.includes('ci-weekly')
? 'ci-weekly'

View File

@@ -1,191 +0,0 @@
import type {
CladoAction,
CladoActionResponse,
RawCladoActionPayload,
} from './types'
/** Parses Clado's structured response plus any raw `<answer>` blocks into executable actions. */
export function parseCladoActions(
prediction: CladoActionResponse,
): CladoAction[] {
const actionFromField =
typeof prediction.action === 'string' ? prediction.action : null
const rawActions = parseCladoActionsFromRawResponse(prediction.raw_response)
const primaryFromRaw = rawActions[0] ?? null
const mergedPrimary = {
...primaryFromRaw,
...prediction,
action: actionFromField ?? primaryFromRaw?.action,
}
const normalized: CladoAction[] = []
const primary = normalizeCladoActionPayload(mergedPrimary)
if (primary) normalized.push(primary)
for (const candidate of rawActions.slice(1)) {
const parsed = normalizeCladoActionPayload(candidate)
if (!parsed) continue
const prev = normalized[normalized.length - 1]
if (
!prev ||
getCladoActionSignature(prev) !== getCladoActionSignature(parsed)
) {
normalized.push(parsed)
}
}
return normalized
}
export function normalizeCladoActionPayload(
payload: RawCladoActionPayload,
): CladoAction | null {
if (!payload.action || typeof payload.action !== 'string') {
return null
}
return {
action: payload.action,
x: typeof payload.x === 'number' ? payload.x : undefined,
y: typeof payload.y === 'number' ? payload.y : undefined,
text: typeof payload.text === 'string' ? payload.text : undefined,
key: typeof payload.key === 'string' ? payload.key : undefined,
direction:
typeof payload.direction === 'string' ? payload.direction : undefined,
startX: typeof payload.startX === 'number' ? payload.startX : undefined,
startY: typeof payload.startY === 'number' ? payload.startY : undefined,
endX: typeof payload.endX === 'number' ? payload.endX : undefined,
endY: typeof payload.endY === 'number' ? payload.endY : undefined,
amount: typeof payload.amount === 'number' ? payload.amount : undefined,
time: typeof payload.time === 'number' ? payload.time : undefined,
final_answer:
typeof payload.final_answer === 'string'
? payload.final_answer
: undefined,
}
}
export function parseCladoActionsFromRawResponse(
rawResponse: string | undefined,
): RawCladoActionPayload[] {
if (!rawResponse) return []
const matches = [
...rawResponse.matchAll(/<answer>\s*([\s\S]*?)\s*<\/answer>/gi),
]
const parsed: RawCladoActionPayload[] = []
for (const match of matches) {
try {
parsed.push(JSON.parse(match[1]) as RawCladoActionPayload)
} catch {
// Ignore malformed answer blocks so one bad block does not drop the whole prediction.
}
}
return parsed
}
export function extractCladoThinking(
rawResponse: string | undefined,
): string | undefined {
if (!rawResponse) return undefined
const matches = [
...rawResponse.matchAll(/<thinking>\s*([\s\S]*?)\s*<\/thinking>/gi),
]
if (matches.length === 0) return undefined
const merged = matches
.map((match) => match[1]?.replace(/\s+/g, ' ').trim() ?? '')
.filter((value) => value.length > 0)
.join(' ')
if (!merged) return undefined
return merged
}
export function summarizeCladoPrediction(
prediction: CladoActionResponse,
): Record<string, unknown> {
const preview =
typeof prediction.raw_response === 'string' &&
prediction.raw_response.length > 0
? prediction.raw_response.slice(0, 240)
: undefined
return {
action: prediction.action,
x: prediction.x,
y: prediction.y,
text: prediction.text,
key: prediction.key,
direction: prediction.direction,
startX: prediction.startX,
startY: prediction.startY,
endX: prediction.endX,
endY: prediction.endY,
amount: prediction.amount,
time: prediction.time,
inference_time_seconds: prediction.inference_time_seconds,
raw_response_preview: preview,
}
}
export function getCladoActionSignature(action: CladoAction): string {
switch (action.action) {
case 'click':
case 'double_click':
case 'right_click':
case 'hover':
return `${action.action}:${action.x ?? 'x'}:${action.y ?? 'y'}`
case 'type':
return `${action.action}:${(action.text ?? '').slice(0, 16)}`
case 'press_key':
return `${action.action}:${action.key ?? 'key'}`
case 'scroll':
return `${action.action}:${action.direction ?? 'down'}:${action.amount ?? 500}`
case 'drag':
return `${action.action}:${action.startX}:${action.startY}:${action.endX}:${action.endY}`
case 'wait':
return `${action.action}:${action.time ?? 1}`
case 'end':
return action.final_answer
? `end(${action.final_answer.slice(0, 32)})`
: 'end()'
case 'invalid':
return `invalid(${(action.text ?? '').slice(0, 40)})`
default:
return action.action
}
}
export function formatCladoHistory(actions: CladoAction[]): string {
if (actions.length === 0) return 'None'
const parts = actions.map((action) => {
switch (action.action) {
case 'click':
case 'double_click':
case 'right_click':
case 'hover':
return `${action.action}(${Math.round(action.x ?? 500)}, ${Math.round(action.y ?? 500)})`
case 'type': {
const text = (action.text ?? '').replace(/'/g, "\\'")
return `type('${text}')`
}
case 'press_key':
return `press_key('${action.key ?? 'Enter'}')`
case 'scroll':
return `scroll(${action.direction ?? 'down'})`
case 'drag':
return `drag(${Math.round(action.startX ?? 500)},${Math.round(action.startY ?? 500)} -> ${Math.round(action.endX ?? 500)},${Math.round(action.endY ?? 500)})`
case 'wait':
return `wait(${Math.round(action.time ?? 1)}s)`
case 'end':
return 'end()'
case 'invalid':
return 'invalid()'
default:
return action.action
}
})
return parts.join(' -> ')
}

View File

@@ -1,123 +0,0 @@
import {
CLADO_PAGE_SCOPED_TOOLS,
type CladoActionPoint,
type CladoViewport,
} from './types'
export function clampCladoNormalizedCoordinate(value: number): number {
return Math.min(999, Math.max(0, Math.round(value)))
}
/** Converts Clado's 0-1000 normalized coordinate space into BrowserOS viewport pixels. */
export function resolveCladoPoint(
viewport: CladoViewport,
normalizedX: number | undefined,
normalizedY: number | undefined,
): CladoActionPoint {
const nx = clampCladoNormalizedCoordinate(normalizedX ?? 500)
const ny = clampCladoNormalizedCoordinate(normalizedY ?? 500)
return {
x: Math.round((nx / 1000) * viewport.width),
y: Math.round((ny / 1000) * viewport.height),
}
}
/** Adapts Clado action tool arguments to the BrowserOS MCP tool argument contract. */
export function prepareCladoToolArgs(
toolName: string,
args: Record<string, unknown>,
pageId: number,
): Record<string, unknown> {
const prepared: Record<string, unknown> = { ...args }
if (
toolName === 'evaluate_script' &&
typeof prepared.function === 'string' &&
prepared.expression === undefined
) {
prepared.expression = toCladoEvaluateExpression(prepared.function)
delete prepared.function
}
if (
toolName === 'click_at' &&
typeof prepared.dblClick === 'boolean' &&
prepared.clickCount === undefined
) {
prepared.clickCount = prepared.dblClick ? 2 : 1
delete prepared.dblClick
}
if (
CLADO_PAGE_SCOPED_TOOLS.has(toolName) &&
typeof prepared.page !== 'number'
) {
prepared.page = pageId
}
return prepared
}
export function toCladoEvaluateExpression(rawFunction: unknown): string {
const source = String(rawFunction).trim()
if (source.startsWith('() =>') || source.startsWith('async () =>')) {
return `(${source})()`
}
if (source.startsWith('function')) {
return `(${source})()`
}
return source
}
export function normalizeCladoPressKey(key: string | undefined): string {
const raw = (key ?? '').trim()
if (!raw) throw new Error('press_key action missing key field')
const map: Record<string, string> = {
'C-a': 'Control+A',
'C-c': 'Control+C',
'C-v': 'Control+V',
'C-x': 'Control+X',
'C-z': 'Control+Z',
'C-y': 'Control+Y',
'C-s': 'Control+S',
'C-t': 'Control+T',
'C-w': 'Control+W',
'C-h': 'Control+H',
'C-f': 'Control+F',
'C-+': 'Control++',
'C--': 'Control+-',
'C-tab': 'Control+Tab',
'C-S-tab': 'Control+Shift+Tab',
'C-S-n': 'Control+Shift+N',
'C-down': 'Control+ArrowDown',
'M-a': 'Meta+A',
'M-c': 'Meta+C',
'M-v': 'Meta+V',
'M-x': 'Meta+X',
'M-f4': 'Alt+F4',
}
return map[raw] ?? raw
}
export function normalizeCladoDirection(
direction: string | undefined,
): 'up' | 'down' | 'left' | 'right' {
if (
direction === 'up' ||
direction === 'down' ||
direction === 'left' ||
direction === 'right'
) {
return direction
}
return 'down'
}
export function normalizeCladoScrollAmount(amount: number | undefined): number {
if (typeof amount !== 'number') return 500
if (amount <= 0) return 100
const clamped = Math.min(amount, 1000)
return Math.max(100, Math.round((clamped / 1000) * 900))
}

View File

@@ -1,68 +0,0 @@
import { CLADO_REQUEST_TIMEOUT_MS } from '../../../../constants'
import { formatCladoHistory } from './clado-actions'
import type { CladoAction, CladoActionResponse } from './types'
export interface CladoActionClientOptions {
baseUrl?: string
apiKey?: string
}
export interface CladoActionPredictionInput {
instruction: string
imageBase64: string
actionHistory: CladoAction[]
signal?: AbortSignal
}
/** Calls the Clado action model without exposing credentials in process arguments or artifacts. */
export class CladoActionClient {
constructor(private readonly options: CladoActionClientOptions) {}
async requestActionPrediction(
input: CladoActionPredictionInput,
): Promise<CladoActionResponse> {
if (!this.options.baseUrl) {
throw new Error('executor.baseUrl must be set for clado-action provider')
}
const requestController = new AbortController()
const onAbort = () => requestController.abort()
input.signal?.addEventListener('abort', onAbort, { once: true })
const timeoutHandle = setTimeout(() => {
requestController.abort()
}, CLADO_REQUEST_TIMEOUT_MS)
try {
const headers: Record<string, string> = {
'Content-Type': 'application/json',
}
if (this.options.apiKey) {
headers.Authorization = `Bearer ${this.options.apiKey}`
}
const response = await fetch(this.options.baseUrl, {
method: 'POST',
headers,
body: JSON.stringify({
instruction: input.instruction,
image_base64: input.imageBase64,
history: formatCladoHistory(input.actionHistory),
}),
signal: requestController.signal,
})
if (!response.ok) {
const body = await response.text()
throw new Error(
`HTTP ${response.status} ${response.statusText}: ${body.slice(0, 400)}`,
)
}
return (await response.json()) as CladoActionResponse
} finally {
clearTimeout(timeoutHandle)
input.signal?.removeEventListener('abort', onAbort)
}
}
}

Some files were not shown because too many files have changed in this diff Show More