fix: prepare wxt before typecheck in browseros-agent

The typecheck and compile scripts failed on fresh checkouts with TS5083 because tsconfig.json extends .wxt/tsconfig.json, which is gitignored and only generated by 'wxt prepare'. Run wxt prepare before tsgo so the extended config and wxt.d.ts are always in place.
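
The ordering described above can be sketched as follows. This is a minimal illustration, not the repository's actual script definition: the `chain` helper, the `scripts` object, and the `--noEmit` flag are assumptions, and only `wxt prepare` and `tsgo` come from the commit message.

```typescript
// Hypothetical sketch: helper names and flags are illustrative assumptions,
// not copied from the repository's package.json.

// Generates .wxt/tsconfig.json and wxt.d.ts (both gitignored).
const PREPARE = 'wxt prepare'
// Assumed invocation; the real script may pass different flags.
const TYPECHECK = 'tsgo --noEmit'

/** Join steps with `&&` so a failing step stops the chain. */
function chain(...steps: string[]): string {
  return steps.join(' && ')
}

const scripts = {
  // Prepare always runs first, so the extended tsconfig exists on a fresh checkout.
  typecheck: chain(PREPARE, TYPECHECK),
}

console.log(scripts.typecheck)
```

Running the prepare step unconditionally, rather than only when `.wxt/` is missing, keeps the generated config in sync with the current sources.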
feat: role aware agents (#704)
2026-05-14 16:14:28 +00:00 · 2026-04-15 09:20:08 -07:00 · 2026-04-14 19:13:23 +05:30 · 2026-04-14 18:22:47 +05:30 · 2026-04-14 17:53:33 +05:30 · 2026-04-14 17:32:05 +05:30
170 changed files with 13316 additions and 2304 deletions
--- a/.github/workflows/code-quality.yml
+++ b/.github/workflows/code-quality.yml
@@ -4,6 +4,7 @@ on:
  pull_request:
    branches:
      - main
+      - dev
    paths:
      - "packages/browseros-agent/**"

--- a/.github/workflows/eval-weekly.yml
+++ b/.github/workflows/eval-weekly.yml
@@ -43,12 +43,6 @@ jobs:
        working-directory: packages/browseros-agent
        run: bun install --ignore-scripts && bun run build:agent-sdk

-      - name: Install Python eval dependencies
-        run: pip install agisdk requests
-
-      - name: Clone WebArena-Infinity
-        run: git clone --depth 1 https://github.com/web-arena-x/webarena-infinity.git /tmp/webarena-infinity
-
      - name: Install xvfb
        run: sudo apt-get update && sudo apt-get install -y xvfb

@@ -63,11 +57,9 @@ jobs:
        working-directory: packages/browseros-agent/apps/eval
        env:
          FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
-          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
          CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          NOPECHA_API_KEY: ${{ secrets.NOPECHA_API_KEY }}
          BROWSEROS_BINARY: /usr/bin/browseros
-          WEBARENA_INFINITY_DIR: /tmp/webarena-infinity
          EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/browseros-agent-weekly.json' }}
        run: |
          echo "Running eval with config: $EVAL_CONFIG"
@@ -89,8 +81,6 @@ jobs:

      - name: Generate trend report
        if: success()
-        timeout-minutes: 5
-        continue-on-error: true
        working-directory: packages/browseros-agent
        env:
          EVAL_R2_ACCOUNT_ID: ${{ secrets.EVAL_R2_ACCOUNT_ID }}
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,4 @@
 **/.DS_Store
-**.auctor/**
-.auctor.json
 .gcs_entries
 **/dmg
 **/env
@@ -25,6 +23,7 @@ nxtscape-cli-access.json
 gclient.json
 .env
 .grove/
+AGENTS.md
 **/resources/binaries/

 packages/browseros/build/tools/
--- a/docs/features/bring-your-own-llm.mdx
+++ b/docs/features/bring-your-own-llm.mdx
@@ -131,29 +131,6 @@ Connect to powerful AI models using your API keys. Your keys stay on your machin
    ![Gemini config](/images/byollm--gemini-provider-config.png)
  </Accordion>

-  <div id="nvidia" />
-  <Accordion title="NVIDIA (Free)" icon="microchip">
-    NVIDIA's [build.nvidia.com](https://build.nvidia.com/models) hosts 80+ models — including GLM 5.1, MiniMax M2.7, GPT-OSS-120B, Qwen 3.5, Mistral, and Nemotron — behind a **free OpenAI-compatible API endpoint**. Great for chatting, prototyping, and personal projects.
-
-    **Get your API key:**
-    1. Go to [build.nvidia.com/models](https://build.nvidia.com/models) and sign in with a free NVIDIA developer account
-    2. Pick any model tagged **Free Endpoint** (e.g. [`minimaxai/minimax-m2.7`](https://build.nvidia.com/minimaxai/minimax-m2.7), [`z-ai/glm-5.1`](https://build.nvidia.com/z-ai/glm-5.1), [`qwen/qwen3.5-122b-a10b`](https://build.nvidia.com/qwen/qwen3.5-122b-a10b))
-    3. Click **Get API Key** on the model page and copy the `nvapi-...` key
-
-    **Add to BrowserOS:**
-    1. Go to `chrome://browseros/settings`
-    2. Click **USE** on the **OpenAI Compatible** card
-    3. Set **Base URL** to `https://integrate.api.nvidia.com/v1`
-    4. Set **Model ID** to a model from the catalog (e.g. `minimaxai/minimax-m2.7`, `z-ai/glm-5.1`, `qwen/qwen3.5-122b-a10b`)
-    5. Paste your NVIDIA API key
-    6. Set **Context Window** based on the model (most are `128000` or higher)
-    7. Click **Save**
-
-    <Tip>
-    NVIDIA's free endpoints share GPU capacity across all developers, so throughput is slower than a paid API. They're best for Chat Mode, exploring new open-source models, and personal projects. For production agent workloads, use a paid provider like Claude or Kimi.
-    </Tip>
-  </Accordion>
-
  <div id="claude" />
  <Accordion title="Claude (Best for Agents)" icon="message-bot">
    Claude Opus 4.5 gives the best results for Agent Mode.
--- a/packages/browseros-agent/.gitignore
+++ b/packages/browseros-agent/.gitignore
@@ -1,3 +1,4 @@
+CLAUDE.md
 # Logs
 logs
 *.log
--- a/packages/browseros-agent/CLAUDE.md
+++ b/packages/browseros-agent/CLAUDE.md
@@ -148,7 +148,7 @@ When creating new packages in this monorepo:
 ## Test Organization

 Tests are in `apps/server/tests/`:
- `tools/` - Tool tests (require BrowserOS running with CDP)
+- `tools/` - Tool tests (require BrowserOS running with CDP), plus ACL scorer tests (standalone)
 - `browser/` - Browser backend tests
 - `agent/` - Agent tests (compaction, rate limiter)
 - `sdk/` - Agent SDK tests
--- a/packages/browseros-agent/apps/agent/components/execution-history/ExecutionStepItem.tsx
+++ b/packages/browseros-agent/apps/agent/components/execution-history/ExecutionStepItem.tsx
@@ -0,0 +1,143 @@
+import {
+  CheckCircle2,
+  ChevronDown,
+  CircleDotDashed,
+  Clock3,
+  ShieldAlert,
+  ShieldCheck,
+  XCircle,
+} from 'lucide-react'
+import { type FC, useState } from 'react'
+import { ToolInput, ToolOutput } from '@/components/ai-elements/tool'
+import { Badge } from '@/components/ui/badge'
+import {
+  Collapsible,
+  CollapsibleContent,
+  CollapsibleTrigger,
+} from '@/components/ui/collapsible'
+import type { ExecutionStepRecord } from '@/lib/execution-history/types'
+import { cn } from '@/lib/utils'
+
+const formatToolName = (name: string) =>
+  name
+    .replace(/_/g, ' ')
+    .replace(/([a-z])([A-Z])/g, '$1 $2')
+    .replace(/^./, (value) => value.toUpperCase())
+
+const formatStateLabel = (state: ExecutionStepRecord['state']) => {
+  if (state === 'input-streaming') return 'Preparing'
+  if (state === 'input-available') return 'Running'
+  if (state === 'approval-requested') return 'Approval Needed'
+  if (state === 'approval-responded') return 'Approval Responded'
+  if (state === 'output-available') return 'Completed'
+  if (state === 'output-denied') return 'Denied'
+  return 'Error'
+}
+
+const getStateIcon = (step: ExecutionStepRecord) => {
+  if (step.state === 'output-available') {
+    return <CheckCircle2 className="h-4 w-4 text-green-500" />
+  }
+
+  if (
+    step.state === 'input-streaming' ||
+    step.state === 'input-available' ||
+    step.state === 'approval-requested'
+  ) {
+    return <Clock3 className="h-4 w-4 text-[var(--accent-orange)]" />
+  }
+
+  if (step.state === 'approval-responded') {
+    return <ShieldCheck className="h-4 w-4 text-blue-500" />
+  }
+
+  if (step.state === 'output-denied') {
+    return <ShieldAlert className="h-4 w-4 text-orange-500" />
+  }
+
+  if (step.state === 'output-error') {
+    return <XCircle className="h-4 w-4 text-destructive" />
+  }
+
+  return <CircleDotDashed className="h-4 w-4 text-muted-foreground" />
+}
+
+const isAclBlocked = (step: ExecutionStepRecord) =>
+  Boolean(
+    step.errorText?.includes('Action blocked by ACL rule') ||
+      step.approval?.reason?.includes('Action blocked by ACL rule') ||
+      step.previewText === 'Blocked by ACL rule',
+  )
+
+const shouldShowPreview = (step: ExecutionStepRecord) =>
+  step.state === 'input-streaming' ||
+  step.state === 'input-available' ||
+  step.state === 'approval-requested' ||
+  step.state === 'approval-responded'
+
+export const ExecutionStepItem: FC<{
+  step: ExecutionStepRecord
+  defaultOpen?: boolean
+}> = ({ step, defaultOpen = false }) => {
+  const [open, setOpen] = useState(defaultOpen)
+  const deniedReason =
+    step.state === 'output-denied' ? step.approval?.reason : undefined
+
+  return (
+    <Collapsible open={open} onOpenChange={setOpen}>
+      <div className="rounded-xl border border-border/60 bg-card/60">
+        <CollapsibleTrigger asChild>
+          <button
+            type="button"
+            className="flex w-full items-start gap-3 px-4 py-3 text-left"
+          >
+            <div className="mt-0.5 shrink-0">{getStateIcon(step)}</div>
+            <div className="min-w-0 flex-1">
+              <div className="flex flex-wrap items-center gap-2">
+                <p className="font-medium text-foreground text-sm">
+                  {formatToolName(step.toolName)}
+                </p>
+                <Badge variant="secondary">
+                  {formatStateLabel(step.state)}
+                </Badge>
+                {isAclBlocked(step) && (
+                  <Badge variant="outline">ACL Blocked</Badge>
+                )}
+              </div>
+              {shouldShowPreview(step) && (
+                <p className="mt-1 text-muted-foreground text-xs">
+                  {step.previewText}
+                </p>
+              )}
+            </div>
+            <ChevronDown
+              className={cn(
+                'mt-0.5 h-4 w-4 shrink-0 text-muted-foreground transition-transform',
+                open && 'rotate-180',
+              )}
+            />
+          </button>
+        </CollapsibleTrigger>
+        <CollapsibleContent className="border-border/60 border-t">
+          {step.input !== undefined && <ToolInput input={step.input} />}
+          {step.state === 'output-denied' ? (
+            <div className="space-y-2 p-4">
+              <h4 className="font-medium text-muted-foreground text-xs uppercase tracking-wide">
+                Result
+              </h4>
+              <div className="rounded-md bg-orange-500/10 p-3 text-orange-700 text-sm dark:text-orange-300">
+                {deniedReason ?? 'The requested action was denied.'}
+              </div>
+            </div>
+          ) : (
+            <ToolOutput
+              output={step.output}
+              errorText={step.errorText}
+              className="pt-0"
+            />
+          )}
+        </CollapsibleContent>
+      </div>
+    </Collapsible>
+  )
+}
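
The `formatToolName` helper added in the file above turns snake_case and camelCase tool identifiers into display labels. A standalone copy shows its behavior; the sample tool names are illustrative assumptions, not names taken from the codebase.

```typescript
// Standalone copy of the helper from ExecutionStepItem.tsx.
const formatToolName = (name: string): string =>
  name
    .replace(/_/g, ' ') // underscores become spaces
    .replace(/([a-z])([A-Z])/g, '$1 $2') // split camelCase boundaries
    .replace(/^./, (value) => value.toUpperCase()) // capitalize the first character only

// Hypothetical tool names, used only to illustrate the transformation.
const snake = formatToolName('take_screenshot')
const camel = formatToolName('getPageContent')

console.log(snake) // "Take screenshot"
console.log(camel) // "Get Page Content"
```

Only the first character is upper-cased, so snake_case names keep their later words in lowercase while camelCase names keep their original capitals.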
--- a/packages/browseros-agent/apps/agent/components/execution-history/ExecutionTaskCard.tsx
+++ b/packages/browseros-agent/apps/agent/components/execution-history/ExecutionTaskCard.tsx
@@ -0,0 +1,168 @@
+import dayjs from 'dayjs'
+import duration from 'dayjs/plugin/duration'
+import relativeTime from 'dayjs/plugin/relativeTime'
+import {
+  CheckCircle2,
+  ChevronDown,
+  CircleDot,
+  CircleSlash2,
+  MessageSquareText,
+  Trash2,
+  XCircle,
+} from 'lucide-react'
+import { type FC, useMemo, useState } from 'react'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import {
+  Collapsible,
+  CollapsibleContent,
+  CollapsibleTrigger,
+} from '@/components/ui/collapsible'
+import type { ExecutionTaskRecord } from '@/lib/execution-history/types'
+import { cn } from '@/lib/utils'
+import { ExecutionStepItem } from './ExecutionStepItem'
+
+dayjs.extend(relativeTime)
+dayjs.extend(duration)
+
+function getTaskStatusIcon(status: ExecutionTaskRecord['status']) {
+  if (status === 'completed') {
+    return <CheckCircle2 className="h-4 w-4 text-green-500" />
+  }
+
+  if (status === 'running') {
+    return <CircleDot className="h-4 w-4 text-[var(--accent-orange)]" />
+  }
+
+  if (status === 'stopped') {
+    return <CircleSlash2 className="h-4 w-4 text-orange-500" />
+  }
+
+  return <XCircle className="h-4 w-4 text-destructive" />
+}
+
+function getTaskStatusLabel(status: ExecutionTaskRecord['status']) {
+  if (status === 'completed') return 'Completed'
+  if (status === 'running') return 'Running'
+  if (status === 'stopped') return 'Stopped'
+  if (status === 'interrupted') return 'Interrupted'
+  return 'Failed'
+}
+
+function formatDuration(task: ExecutionTaskRecord): string | null {
+  if (!task.completedAt) return null
+  const diff = dayjs(task.completedAt).diff(task.startedAt)
+  const parsed = dayjs.duration(diff)
+  const minutes = Math.floor(parsed.asMinutes())
+  const seconds = parsed.seconds()
+  if (minutes === 0) return `${seconds}s`
+  return `${minutes}m ${seconds}s`
+}
+
+export const ExecutionTaskCard: FC<{
+  task: ExecutionTaskRecord
+  defaultOpen?: boolean
+  onDelete?: (task: ExecutionTaskRecord) => void
+}> = ({ task, defaultOpen = false, onDelete }) => {
+  const [open, setOpen] = useState(defaultOpen)
+  const startedAgo = useMemo(
+    () => dayjs(task.startedAt).fromNow(),
+    [task.startedAt],
+  )
+
+  return (
+    <Collapsible open={open} onOpenChange={setOpen}>
+      <div className="rounded-2xl border border-border/60 bg-card shadow-sm">
+        <div className="flex items-start gap-2 px-5 py-5">
+          <CollapsibleTrigger asChild>
+            <button
+              type="button"
+              className="flex min-w-0 flex-1 items-start gap-3 text-left"
+            >
+              <div className="mt-0.5 shrink-0">
+                {getTaskStatusIcon(task.status)}
+              </div>
+              <div className="min-w-0 flex-1">
+                <div className="flex flex-wrap items-center gap-2">
+                  <p className="line-clamp-2 font-medium text-base text-foreground">
+                    {task.promptText}
+                  </p>
+                  <Badge variant="secondary">
+                    {getTaskStatusLabel(task.status)}
+                  </Badge>
+                </div>
+                <div className="mt-2 flex flex-wrap items-center gap-2 text-muted-foreground text-xs">
+                  <span>{startedAgo}</span>
+                  <span>•</span>
+                  <span>
+                    {task.actionCount} action{task.actionCount === 1 ? '' : 's'}
+                  </span>
+                  {formatDuration(task) && (
+                    <>
+                      <span>•</span>
+                      <span>{formatDuration(task)}</span>
+                    </>
+                  )}
+                  {task.deniedCount > 0 && (
+                    <Badge variant="outline" className="h-5 rounded-full px-2">
+                      {task.deniedCount} denied
+                    </Badge>
+                  )}
+                  {task.errorCount > 0 && (
+                    <Badge variant="outline" className="h-5 rounded-full px-2">
+                      {task.errorCount} error
+                      {task.errorCount === 1 ? '' : 's'}
+                    </Badge>
+                  )}
+                </div>
+                {task.responsePreview ? (
+                  <div className="mt-4 flex items-start gap-2 rounded-xl bg-muted/40 px-3 py-2 text-muted-foreground text-sm">
+                    <MessageSquareText className="mt-0.5 h-4 w-4 shrink-0" />
+                    <p className="line-clamp-2">{task.responsePreview}</p>
+                  </div>
+                ) : null}
+              </div>
+              <ChevronDown
+                className={cn(
+                  'mt-1 h-4 w-4 shrink-0 text-muted-foreground transition-transform',
+                  open && 'rotate-180',
+                )}
+              />
+            </button>
+          </CollapsibleTrigger>
+          {onDelete ? (
+            <Button
+              type="button"
+              variant="ghost"
+              size="icon-sm"
+              className="mt-0.5 shrink-0 text-muted-foreground hover:text-foreground"
+              onClick={() => onDelete(task)}
+              aria-label={`Delete ${task.promptText}`}
+            >
+              <Trash2 className="size-4" />
+            </Button>
+          ) : null}
+        </div>
+        <CollapsibleContent className="border-border/60 border-t px-5 py-5">
+          {task.steps.length === 0 ? (
+            <div className="rounded-xl border border-border/70 border-dashed bg-muted/30 px-4 py-6 text-center text-muted-foreground text-sm">
+              No tool actions were recorded for this task.
+            </div>
+          ) : (
+            <div className="space-y-3">
+              {task.steps.map((step, index) => (
+                <ExecutionStepItem
+                  key={step.id}
+                  step={step}
+                  defaultOpen={
+                    task.status === 'running' && index === task.steps.length - 1
+                  }
+                />
+              ))}
+            </div>
+          )}
+        </CollapsibleContent>
+      </div>
+    </Collapsible>
+  )
+}
--- a/packages/browseros-agent/apps/agent/components/referral/ShareForCredits.tsx
+++ b/packages/browseros-agent/apps/agent/components/referral/ShareForCredits.tsx
@@ -1,148 +0,0 @@
-import { REFERRAL_LIMITS } from '@browseros/shared/constants/limits'
-import { ExternalLink, Loader2, Send } from 'lucide-react'
-import type { FC } from 'react'
-import { useState } from 'react'
-import { Button } from '@/components/ui/button'
-import { Input } from '@/components/ui/input'
-import { useCredits, useInvalidateCredits } from '@/lib/credits/useCredits'
-import {
-  getShareOnTwitterUrl,
-  submitReferral,
-} from '@/lib/referral/submit-referral'
-
-interface ShareForCreditsProps {
-  compact?: boolean
-}
-
-export const ShareForCredits: FC<ShareForCreditsProps> = ({ compact }) => {
-  const [tweetUrl, setTweetUrl] = useState('')
-  const [isSubmitting, setIsSubmitting] = useState(false)
-  const [result, setResult] = useState<{
-    success: boolean
-    message: string
-  } | null>(null)
-
-  const { data } = useCredits()
-  const invalidateCredits = useInvalidateCredits()
-
-  const credits = data?.credits ?? 0
-  const atDailyMax = credits >= REFERRAL_LIMITS.MAX_DAILY_CREDITS
-
-  const handleSubmit = async () => {
-    if (!tweetUrl.trim() || !data?.browserosId || atDailyMax) return
-
-    setIsSubmitting(true)
-    setResult(null)
-
-    try {
-      const res = await submitReferral(tweetUrl.trim(), data.browserosId)
-      if (res.success) {
-        setResult({
-          success: true,
-          message: `${res.creditsAdded ?? 200} credits added!`,
-        })
-        setTweetUrl('')
-        invalidateCredits()
-      } else {
-        setResult({
-          success: false,
-          message: res.reason ?? 'Submission failed. Please try again.',
-        })
-      }
-    } catch {
-      setResult({
-        success: false,
-        message: 'Network error. Please try again.',
-      })
-    } finally {
-      setIsSubmitting(false)
-    }
-  }
-
-  if (atDailyMax) {
-    return (
-      <div className={compact ? 'space-y-2' : 'space-y-3'}>
-        <p className={compact ? 'text-muted-foreground text-xs' : 'text-sm'}>
-          You've reached the daily cap of {REFERRAL_LIMITS.MAX_DAILY_CREDITS}{' '}
-          credits. Come back tomorrow to earn more!
-        </p>
-      </div>
-    )
-  }
-
-  return (
-    <div className={compact ? 'space-y-2' : 'space-y-3'}>
-      <p className={compact ? 'text-muted-foreground text-xs' : 'text-sm'}>
-        Share BrowserOS on Twitter to earn{' '}
-        {REFERRAL_LIMITS.CREDITS_PER_REFERRAL} bonus credits!
-      </p>
-
-      <ul className="list-disc space-y-0.5 pl-4 text-muted-foreground text-xs">
-        <li>
-          Tweet must mention <span className="font-medium">@browserOS_ai</span>
-        </li>
-        <li>Tweet must be posted within the last 30 minutes</li>
-        <li>Each tweet can only be submitted once</li>
-        <li>
-          Daily cap of {REFERRAL_LIMITS.MAX_DAILY_CREDITS} credits — resets at
-          midnight UTC
-        </li>
-      </ul>
-
-      <Button variant="outline" size="sm" className="w-full gap-2" asChild>
-        <a
-          href={getShareOnTwitterUrl()}
-          target="_blank"
-          rel="noopener noreferrer"
-          onClick={(e) => {
-            e.currentTarget.href = getShareOnTwitterUrl()
-          }}
-        >
-          <ExternalLink className="h-3.5 w-3.5" />
-          Share on Twitter
-        </a>
-      </Button>
-
-      <p className="text-muted-foreground text-xs">
-        Already shared? Paste your tweet link:
-      </p>
-
-      <div className="flex gap-2">
-        <Input
-          type="url"
-          placeholder="https://x.com/..."
-          value={tweetUrl}
-          onChange={(e) => setTweetUrl(e.target.value)}
-          className="h-8 text-xs"
-          disabled={isSubmitting}
-        />
-        <Button
-          variant="default"
-          size="sm"
-          onClick={handleSubmit}
-          disabled={isSubmitting || !tweetUrl.trim()}
-          className="shrink-0 gap-1.5"
-        >
-          {isSubmitting ? (
-            <Loader2 className="h-3.5 w-3.5 animate-spin" />
-          ) : (
-            <Send className="h-3.5 w-3.5" />
-          )}
-          Submit
-        </Button>
-      </div>
-
-      {result && (
-        <p
-          className={
-            result.success
-              ? 'text-green-600 text-xs dark:text-green-400'
-              : 'text-destructive text-xs'
-          }
-        >
-          {result.message}
-        </p>
-      )}
-    </div>
-  )
-}
--- a/packages/browseros-agent/apps/agent/components/sidebar/SettingsSidebar.tsx
+++ b/packages/browseros-agent/apps/agent/components/sidebar/SettingsSidebar.tsx
@@ -9,6 +9,8 @@ import {
  RotateCcw,
  Search,
  Server,
+  ShieldAlert,
+  ShieldCheck,
 } from 'lucide-react'
 import type { FC } from 'react'
 import { NavLink } from 'react-router'
@@ -78,7 +80,9 @@ const primarySettingsSections: NavSection[] = [
        icon: Palette,
        feature: Feature.CUSTOMIZATION_SUPPORT,
      },
+      { name: 'Tool Approvals', to: '/settings/approvals', icon: ShieldCheck },
      { name: 'BrowserOS as MCP', to: '/settings/mcp', icon: Server },
+      { name: 'ACL Rules', to: '/settings/acl', icon: ShieldAlert },
      {
        name: 'Usage & Billing',
        to: '/settings/usage',
--- a/packages/browseros-agent/apps/agent/components/sidebar/SidebarNavigation.tsx
+++ b/packages/browseros-agent/apps/agent/components/sidebar/SidebarNavigation.tsx
@@ -1,9 +1,11 @@
 import {
  Brain,
  CalendarClock,
+  Cpu,
  Home,
  PlugZap,
  Settings,
+  Shield,
  Sparkles,
  Wand2,
 } from 'lucide-react'
@@ -39,6 +41,7 @@ const primaryNavItems: NavItem[] = [
    feature: Feature.MANAGED_MCP_SUPPORT,
  },
  { name: 'Scheduled Tasks', to: '/scheduled', icon: CalendarClock },
+  { name: 'Agents', to: '/agents', icon: Cpu },
  {
    name: 'Skills',
    to: '/home/skills',
@@ -57,6 +60,7 @@ const primaryNavItems: NavItem[] = [
    icon: Sparkles,
    feature: Feature.SOUL_SUPPORT,
  },
+  { name: 'Governance', to: '/admin', icon: Shield },
  { name: 'Settings', to: '/settings/ai', icon: Settings },
 ]

--- a/packages/browseros-agent/apps/agent/entrypoints/app/App.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/App.tsx
@@ -1,7 +1,5 @@
 import type { FC } from 'react'
 import { HashRouter, Navigate, Route, Routes, useParams } from 'react-router'
-
-import { NewTab } from '../newtab/index/NewTab'
 import { NewTabChat } from '../newtab/index/NewTabChat'
 import { NewTabLayout } from '../newtab/layout/NewTabLayout'
 import { Personalize } from '../newtab/personalize/Personalize'
@@ -9,6 +7,12 @@ import { OnboardingDemo } from '../onboarding/demo/OnboardingDemo'
 import { FeaturesPage } from '../onboarding/features/Features'
 import { Onboarding } from '../onboarding/index/Onboarding'
 import { StepsLayout } from '../onboarding/steps/StepsLayout'
+import { AclSettingsPage } from './acl-settings/AclSettingsPage'
+import { AdminDashboardPage } from './admin-dashboard/AdminDashboardPage'
+import { AgentCommandConversation } from './agent-command/AgentCommandConversation'
+import { AgentCommandHome } from './agent-command/AgentCommandHome'
+import { AgentCommandLayout } from './agent-command/agent-command-layout'
+import { AgentsPage } from './agents/AgentsPage'
 import { AISettingsPage } from './ai-settings/AISettingsPage'
 import { ConnectMCP } from './connect-mcp/ConnectMCP'
 import { CustomizationPage } from './customization/CustomizationPage'
@@ -27,6 +31,7 @@ import { ScheduledTasksPage } from './scheduled-tasks/ScheduledTasksPage'
 import { SearchProviderPage } from './search-provider/SearchProviderPage'
 import { SkillsPage } from './skills/SkillsPage'
 import { SoulPage } from './soul/SoulPage'
+import { ToolApprovalsPage } from './tool-approvals/ToolApprovalsPage'
 import { UsagePage } from './usage/UsagePage'

 function getSurveyParams(): { maxTurns?: number; experimentId?: string } {
@@ -76,7 +81,13 @@ export const App: FC = () => {
        <Route element={<SidebarLayout />}>
          {/* Home routes */}
          <Route path="home" element={<NewTabLayout />}>
-            <Route index element={<NewTab />} />
+            <Route element={<AgentCommandLayout />}>
+              <Route index element={<AgentCommandHome />} />
+              <Route
+                path="agents/:agentId"
+                element={<AgentCommandConversation />}
+              />
+            </Route>
            <Route path="chat" element={<NewTabChat />} />
            <Route path="personalize" element={<Personalize />} />
            <Route path="soul" element={<SoulPage />} />
@@ -87,6 +98,8 @@ export const App: FC = () => {
          {/* Primary nav routes */}
          <Route path="connect-apps" element={<ConnectMCP />} />
          <Route path="scheduled" element={<ScheduledTasksPage />} />
+          <Route path="agents" element={<AgentsPage />} />
+          <Route path="admin" element={<AdminDashboardPage />} />
        </Route>

        {/* Settings with dedicated sidebar */}
@@ -100,6 +113,8 @@ export const App: FC = () => {
            <Route path="search" element={<SearchProviderPage />} />
            <Route path="survey" element={<SurveyPage {...surveyParams} />} />
            <Route path="usage" element={<UsagePage />} />
+            <Route path="acl" element={<AclSettingsPage />} />
+            <Route path="approvals" element={<ToolApprovalsPage />} />
          </Route>
        </Route>

@@ -129,6 +144,12 @@ export const App: FC = () => {
          path="/settings/skills"
          element={<Navigate to="/home/skills" replace />}
        />
+        <Route path="/audit" element={<Navigate to="/admin" replace />} />
+        <Route
+          path="/observability"
+          element={<Navigate to="/admin" replace />}
+        />
+        <Route path="/executions" element={<Navigate to="/admin" replace />} />
        <Route path="/options/*" element={<OptionsRedirect />} />

        {/* Fallback to home */}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/AclRuleCard.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/AclRuleCard.tsx
@@ -0,0 +1,57 @@
+import type { AclRule } from '@browseros/shared/types/acl'
+import { Globe, Trash2 } from 'lucide-react'
+import type { FC } from 'react'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import { Switch } from '@/components/ui/switch'
+import { cn } from '@/lib/utils'
+
+interface AclRuleCardProps {
+  rule: AclRule
+  onToggle: (id: string, enabled: boolean) => void
+  onDelete: (id: string) => void
+}
+
+export const AclRuleCard: FC<AclRuleCardProps> = ({
+  rule,
+  onToggle,
+  onDelete,
+}) => {
+  const summary =
+    rule.description ?? rule.textMatch ?? rule.selector ?? 'Block actions'
+
+  return (
+    <div
+      className={cn(
+        'flex items-center gap-4 rounded-xl border p-4 transition-all',
+        rule.enabled
+          ? 'border-red-300 bg-red-50/50 dark:border-red-800 dark:bg-red-950/20'
+          : 'border-border bg-card opacity-60',
+      )}
+    >
+      <Switch
+        checked={rule.enabled}
+        onCheckedChange={(checked) => onToggle(rule.id, checked)}
+      />
+
+      <div className="flex min-w-0 flex-1 flex-col gap-1">
+        <span className="truncate font-medium text-sm">{summary}</span>
+        <div className="flex flex-wrap items-center gap-2">
+          <Badge variant="secondary" className="gap-1 font-mono text-xs">
+            <Globe className="size-3" />
+            {rule.sitePattern}
+          </Badge>
+        </div>
+      </div>
+
+      <Button
+        variant="ghost"
+        size="icon"
+        className="shrink-0 text-muted-foreground hover:text-destructive"
+        onClick={() => onDelete(rule.id)}
+      >
+        <Trash2 className="size-4" />
+      </Button>
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/AclSettingsPage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/AclSettingsPage.tsx
@@ -0,0 +1,85 @@
+import type { AclRule } from '@browseros/shared/types/acl'
+import { Plus, ShieldAlert } from 'lucide-react'
+import { type FC, useEffect, useState } from 'react'
+import { Button } from '@/components/ui/button'
+import { aclRulesStorage } from '@/lib/acl/storage'
+import { AclRuleCard } from './AclRuleCard'
+import { NewAclRuleDialog } from './NewAclRuleDialog'
+
+export const AclSettingsPage: FC = () => {
+  const [rules, setRules] = useState<AclRule[]>([])
+
+  useEffect(() => {
+    aclRulesStorage.getValue().then(setRules)
+    const unwatch = aclRulesStorage.watch(setRules)
+    return () => unwatch()
+  }, [])
+
+  const saveRules = (next: AclRule[]) => {
+    setRules(next)
+    aclRulesStorage.setValue(next)
+  }
+
+  const handleAddRule = (rule: AclRule) => {
+    saveRules([...rules, rule])
+  }
+
+  const handleToggle = (id: string, enabled: boolean) => {
+    saveRules(rules.map((r) => (r.id === id ? { ...r, enabled } : r)))
+  }
+
+  const handleDelete = (id: string) => {
+    saveRules(rules.filter((r) => r.id !== id))
+  }
+
+  return (
+    <div className="mx-auto max-w-2xl p-6">
+      <div className="mb-6 flex items-center justify-between">
+        <div>
+          <h1 className="font-semibold text-xl">ACL Rules</h1>
+          <p className="mt-1 text-muted-foreground text-sm">
+            Describe what the agent should avoid on a site and BrowserOS will
+            block matching actions.
+          </p>
+        </div>
+        <NewAclRuleDialog onSave={handleAddRule}>
+          <Button size="sm">
+            <Plus className="mr-1 size-4" />
+            Add Rule
+          </Button>
+        </NewAclRuleDialog>
+      </div>
+
+      {rules.length === 0 ? (
+        <div className="flex flex-col items-center gap-3 rounded-xl border border-dashed p-12 text-center">
+          <ShieldAlert className="size-10 text-muted-foreground" />
+          <div>
+            <p className="font-medium">No ACL rules defined</p>
+            <p className="mt-1 text-muted-foreground text-sm">
+              Add a plain-English rule like &ldquo;payments and checkout&rdquo;
+              or &ldquo;send email&rdquo; and BrowserOS will apply broad safety
+              blocking on that site.
+            </p>
+          </div>
+          <NewAclRuleDialog onSave={handleAddRule}>
+            <Button variant="outline" size="sm">
+              <Plus className="mr-1 size-4" />
+              Add your first rule
+            </Button>
+          </NewAclRuleDialog>
+        </div>
+      ) : (
+        <div className="flex flex-col gap-3">
+          {rules.map((rule) => (
+            <AclRuleCard
+              key={rule.id}
+              rule={rule}
+              onToggle={handleToggle}
+              onDelete={handleDelete}
+            />
+          ))}
+        </div>
+      )}
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/NewAclRuleDialog.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/acl-settings/NewAclRuleDialog.tsx
@@ -0,0 +1,98 @@
+import type { AclRule } from '@browseros/shared/types/acl'
+import { type FC, useState } from 'react'
+import { Button } from '@/components/ui/button'
+import {
+  Dialog,
+  DialogContent,
+  DialogFooter,
+  DialogHeader,
+  DialogTitle,
+  DialogTrigger,
+} from '@/components/ui/dialog'
+import { Input } from '@/components/ui/input'
+import { Label } from '@/components/ui/label'
+
+interface NewAclRuleDialogProps {
+  onSave: (rule: AclRule) => void
+  children: React.ReactNode
+}
+
+export const NewAclRuleDialog: FC<NewAclRuleDialogProps> = ({
+  onSave,
+  children,
+}) => {
+  const [open, setOpen] = useState(false)
+  const [sitePattern, setSitePattern] = useState('')
+  const [intent, setIntent] = useState('')
+
+  const reset = () => {
+    setSitePattern('')
+    setIntent('')
+  }
+
+  const handleSave = () => {
+    if (!sitePattern.trim() || !intent.trim()) return
+    onSave({
+      id: crypto.randomUUID(),
+      sitePattern: sitePattern.trim(),
+      description: intent.trim(),
+      enabled: true,
+    })
+    reset()
+    setOpen(false)
+  }
+
+  return (
+    <Dialog open={open} onOpenChange={setOpen}>
+      <DialogTrigger asChild>{children}</DialogTrigger>
+      <DialogContent>
+        <DialogHeader>
+          <DialogTitle>Add ACL Rule</DialogTitle>
+        </DialogHeader>
+        <div className="flex flex-col gap-4 py-4">
+          <div className="flex flex-col gap-2">
+            <Label htmlFor="site-pattern">
+              Domain <span className="text-destructive">*</span>
+            </Label>
+            <Input
+              id="site-pattern"
+              placeholder="amazon.com"
+              value={sitePattern}
+              onChange={(e) => setSitePattern(e.target.value)}
+            />
+            <p className="text-muted-foreground text-xs">
+              Matches the domain and all subdomains.
+            </p>
+          </div>
+          <div className="flex flex-col gap-2">
+            <Label htmlFor="intent">
+              What should BrowserOS block?{' '}
+              <span className="text-destructive">*</span>
+            </Label>
+            <Input
+              id="intent"
+              placeholder="Payments and checkout"
+              value={intent}
+              onChange={(e) => setIntent(e.target.value)}
+            />
+            <p className="text-muted-foreground text-xs">
+              Use plain English. BrowserOS will block matching actions on this
+              site.
+            </p>
+          </div>
+        </div>
+        <DialogFooter>
+          <Button variant="outline" onClick={() => setOpen(false)}>
+            Cancel
+          </Button>
+          <Button
+            onClick={handleSave}
+            disabled={!sitePattern.trim() || !intent.trim()}
+          >
+            Add Rule
+          </Button>
+        </DialogFooter>
+      </DialogContent>
+    </Dialog>
+  )
+}
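Review note: the help text in this dialog promises that a rule "Matches the domain and all subdomains." The matcher itself is not part of this file, so the following is only a minimal sketch of that behaviour. It assumes a hypothetical `matchesSitePattern` helper and plain hostnames (no scheme, port, or wildcard syntax); it is not the actual BrowserOS implementation.

```typescript
// Hypothetical helper, not part of this diff: one way to read
// "matches the domain and all subdomains".
function matchesSitePattern(hostname: string, sitePattern: string): boolean {
  const host = hostname.trim().toLowerCase()
  const pattern = sitePattern.trim().toLowerCase()
  if (!host || !pattern) return false
  // Accept an exact match, or a match on a dot boundary so that
  // "notamazon.com" is not treated as a subdomain of "amazon.com".
  return host === pattern || host.endsWith(`.${pattern}`)
}
```

Checking the dot boundary rather than a bare suffix is the design choice that matters here: a plain `endsWith(pattern)` would also block unrelated domains that happen to share the ending.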
--- a/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/AdminDashboardHeader.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/AdminDashboardHeader.tsx
@@ -0,0 +1,41 @@
+import { Shield } from 'lucide-react'
+import type { FC } from 'react'
+import { Badge } from '@/components/ui/badge'
+
+interface AdminDashboardHeaderProps {
+  pendingCount: number
+  runningCount: number
+}
+
+export const AdminDashboardHeader: FC<AdminDashboardHeaderProps> = ({
+  pendingCount,
+  runningCount,
+}) => {
+  return (
+    <div className="rounded-xl border border-border bg-card p-6 shadow-sm transition-all hover:shadow-md">
+      <div className="flex items-start gap-4">
+        <div className="flex h-12 w-12 shrink-0 items-center justify-center rounded-xl bg-[var(--accent-orange)]/10">
+          <Shield className="h-6 w-6 text-[var(--accent-orange)]" />
+        </div>
+        <div className="flex-1">
+          <div className="mb-1 flex flex-wrap items-center gap-2">
+            <h2 className="font-semibold text-xl">Governance</h2>
+            {pendingCount > 0 && (
+              <Badge className="gap-1.5 rounded-full bg-yellow-500/10 text-yellow-600">
+                {pendingCount} pending
+              </Badge>
+            )}
+            {runningCount > 0 && (
+              <Badge className="gap-1.5 rounded-full">
+                {runningCount} live
+              </Badge>
+            )}
+          </div>
+          <p className="text-muted-foreground text-sm">
+            Control agent permissions and audit every action.
+          </p>
+        </div>
+      </div>
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/AdminDashboardPage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/AdminDashboardPage.tsx
@@ -0,0 +1,199 @@
+import dayjs from 'dayjs'
+import { Shield } from 'lucide-react'
+import { type FC, useEffect, useMemo, useState } from 'react'
+import { toast } from 'sonner'
+import { ExecutionTaskCard } from '@/components/execution-history/ExecutionTaskCard'
+import {
+  AlertDialog,
+  AlertDialogAction,
+  AlertDialogCancel,
+  AlertDialogContent,
+  AlertDialogDescription,
+  AlertDialogFooter,
+  AlertDialogHeader,
+  AlertDialogTitle,
+} from '@/components/ui/alert-dialog'
+import {
+  removeConversationExecutionTask,
+  useExecutionHistoryByConversation,
+} from '@/lib/execution-history/storage'
+import type { ExecutionTaskRecord } from '@/lib/execution-history/types'
+import { pendingToolApprovalsStorage } from '@/lib/tool-approvals/approval-sync-storage'
+import { AdminDashboardHeader } from './AdminDashboardHeader'
+import { PendingApprovals } from './PendingApprovals'
+
+type TaskGroup = {
+  label: string
+  tasks: ExecutionTaskRecord[]
+}
+
+function getGroupLabel(date: string) {
+  const startedAt = dayjs(date)
+  if (startedAt.isSame(dayjs(), 'day')) return 'Today'
+  if (startedAt.isSame(dayjs().subtract(1, 'day'), 'day')) return 'Yesterday'
+  return startedAt.format('MMMM D, YYYY')
+}
+
+function groupTasks(tasks: ExecutionTaskRecord[]): TaskGroup[] {
+  const grouped = new Map<string, ExecutionTaskRecord[]>()
+
+  for (const task of tasks) {
+    const label = getGroupLabel(task.startedAt)
+    const existing = grouped.get(label) ?? []
+    grouped.set(label, [...existing, task])
+  }
+
+  return Array.from(grouped.entries()).map(([label, groupItems]) => ({
+    label,
+    tasks: groupItems,
+  }))
+}
+
+export const AdminDashboardPage: FC = () => {
+  const [pendingCount, setPendingCount] = useState(0)
+  const historyByConversation = useExecutionHistoryByConversation()
+  const [taskToDelete, setTaskToDelete] = useState<ExecutionTaskRecord | null>(
+    null,
+  )
+
+  useEffect(() => {
+    pendingToolApprovalsStorage
+      .getValue()
+      .then((v) => setPendingCount(v.length))
+    const unwatch = pendingToolApprovalsStorage.watch((v) =>
+      setPendingCount(v.length),
+    )
+    return () => unwatch()
+  }, [])
+
+  const historyList = useMemo(
+    () => Object.values(historyByConversation),
+    [historyByConversation],
+  )
+
+  const tasks = useMemo(() => {
+    return historyList
+      .flatMap((history) => history.tasks)
+      .sort(
+        (left, right) =>
+          new Date(right.startedAt).getTime() -
+          new Date(left.startedAt).getTime(),
+      )
+  }, [historyList])
+
+  const groupedTasks = useMemo(() => groupTasks(tasks), [tasks])
+  const runningCount = useMemo(
+    () => tasks.filter((task) => task.status === 'running').length,
+    [tasks],
+  )
+  const conversationCount = historyList.length
+
+  const handleDeleteTask = async () => {
+    if (!taskToDelete) return
+
+    try {
+      await removeConversationExecutionTask({
+        conversationId: taskToDelete.conversationId,
+        taskId: taskToDelete.id,
+      })
+      toast.success('Run removed')
+    } catch {
+      toast.error('Failed to remove run')
+    } finally {
+      setTaskToDelete(null)
+    }
+  }
+
+  return (
+    <div className="fade-in slide-in-from-bottom-5 animate-in space-y-6 duration-500">
+      <AdminDashboardHeader
+        pendingCount={pendingCount}
+        runningCount={runningCount}
+      />
+
+      <section className="space-y-3">
+        <h3 className="font-semibold text-sm">Approvals</h3>
+        <PendingApprovals />
+      </section>
+
+      <section className="space-y-4">
+        <div>
+          <h3 className="font-semibold text-sm">Audit Trail</h3>
+          {tasks.length > 0 && (
+            <p className="mt-1 text-muted-foreground text-sm">
+              {tasks.length} recorded run{tasks.length === 1 ? '' : 's'}
+              {conversationCount > 1
+                ? ` across ${conversationCount} chats`
+                : ''}
+              . Newest first.
+            </p>
+          )}
+        </div>
+
+        {tasks.length === 0 ? (
+          <div className="rounded-xl border border-dashed px-6 py-14 text-center">
+            <div className="mx-auto mb-4 flex size-12 items-center justify-center rounded-2xl bg-[var(--accent-orange)]/10">
+              <Shield className="size-5 text-[var(--accent-orange)]" />
+            </div>
+            <h3 className="mb-1 font-medium text-lg">No agent runs yet</h3>
+            <p className="mx-auto max-w-sm text-muted-foreground text-sm">
+              Run a task in BrowserOS and the execution history will appear
+              here.
+            </p>
+          </div>
+        ) : (
+          <div className="space-y-6">
+            {groupedTasks.map((group, groupIndex) => (
+              <section key={group.label} className="space-y-3">
+                <div className="flex items-center gap-3">
+                  <h4 className="font-medium text-muted-foreground text-xs">
+                    {group.label}
+                  </h4>
+                  <div className="h-px flex-1 bg-border/60" />
+                  <span className="text-muted-foreground text-xs">
+                    {group.tasks.length} run
+                    {group.tasks.length === 1 ? '' : 's'}
+                  </span>
+                </div>
+                <div className="space-y-3">
+                  {group.tasks.map((task, index) => (
+                    <ExecutionTaskCard
+                      key={task.id}
+                      task={task}
+                      defaultOpen={
+                        task.status === 'running' ||
+                        (groupIndex === 0 && index === 0)
+                      }
+                      onDelete={setTaskToDelete}
+                    />
+                  ))}
+                </div>
+              </section>
+            ))}
+          </div>
+        )}
+      </section>
+
+      <AlertDialog
+        open={taskToDelete !== null}
+        onOpenChange={(open) => !open && setTaskToDelete(null)}
+      >
+        <AlertDialogContent>
+          <AlertDialogHeader>
+            <AlertDialogTitle>Delete Run</AlertDialogTitle>
+            <AlertDialogDescription>
+              Remove &ldquo;{taskToDelete?.promptText}&rdquo; from local history? This only
+              clears the recorded run on this device.
+            </AlertDialogDescription>
+          </AlertDialogHeader>
+          <AlertDialogFooter>
+            <AlertDialogCancel>Cancel</AlertDialogCancel>
+            <AlertDialogAction onClick={handleDeleteTask}>
+              Delete
+            </AlertDialogAction>
+          </AlertDialogFooter>
+        </AlertDialogContent>
+      </AlertDialog>
+    </div>
+  )
+}
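Review note: `groupTasks` depends on two things that are easy to miss. The tasks arrive sorted newest first, and a `Map` iterates in insertion order, so the groups come out newest first as well. A self-contained sketch of the same pattern follows, assuming a simplified `Task` shape and a hypothetical `dayKey` label (the UTC calendar date) in place of the dayjs-based `getGroupLabel`.

```typescript
// Simplified stand-in for ExecutionTaskRecord (assumption for this sketch).
type Task = { id: string; startedAt: string }

// Hypothetical label function: UTC calendar date instead of dayjs labels.
function dayKey(iso: string): string {
  return new Date(iso).toISOString().slice(0, 10)
}

function groupByDay(tasks: Task[]): { label: string; tasks: Task[] }[] {
  const grouped = new Map<string, Task[]>()
  for (const task of tasks) {
    const label = dayKey(task.startedAt)
    const bucket = grouped.get(label)
    // Push into the existing bucket instead of copying the array each time.
    if (bucket) bucket.push(task)
    else grouped.set(label, [task])
  }
  // Map preserves insertion order, so sorted input yields sorted groups.
  return Array.from(grouped.entries()).map(([label, items]) => ({
    label,
    tasks: items,
  }))
}
```

If the input were not already sorted, the group order would follow the first appearance of each label, so the sort in the `tasks` memo is what keeps the audit trail newest first.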
--- a/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/PendingApprovals.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/admin-dashboard/PendingApprovals.tsx
@@ -0,0 +1,103 @@
+import { Clock, ShieldCheck, ShieldX } from 'lucide-react'
+import { type FC, useEffect, useState } from 'react'
+import { Badge } from '@/components/ui/badge'
+import { Button } from '@/components/ui/button'
+import {
+  type ApprovalResponse,
+  approvalResponsesStorage,
+  type PendingApproval,
+  pendingToolApprovalsStorage,
+  queueApprovalResponse,
+} from '@/lib/tool-approvals/approval-sync-storage'
+
+const formatToolName = (name: string) =>
+  name
+    .replace(/_/g, ' ')
+    .replace(/([a-z])([A-Z])/g, '$1 $2')
+    .replace(/^./, (s) => s.toUpperCase())
+
+export const PendingApprovals: FC = () => {
+  const [pending, setPending] = useState<PendingApproval[]>([])
+
+  useEffect(() => {
+    pendingToolApprovalsStorage.getValue().then(setPending)
+    const unwatch = pendingToolApprovalsStorage.watch(setPending)
+    return () => unwatch()
+  }, [])
+
+  const respond = async (approvalId: string, approved: boolean) => {
+    const response: ApprovalResponse = {
+      approvalId,
+      approved,
+      timestamp: Date.now(),
+    }
+    const existing = (await approvalResponsesStorage.getValue()) ?? []
+    await approvalResponsesStorage.setValue(
+      queueApprovalResponse(existing, response),
+    )
+  }
+
+  if (pending.length === 0) {
+    return (
+      <div className="rounded-xl border border-dashed px-6 py-14 text-center">
+        <div className="mx-auto mb-4 flex size-12 items-center justify-center rounded-2xl bg-[var(--accent-orange)]/10">
+          <ShieldCheck className="size-5 text-[var(--accent-orange)]" />
+        </div>
+        <h3 className="mb-1 font-medium text-lg">No pending approvals</h3>
+        <p className="mx-auto max-w-sm text-muted-foreground text-sm">
+          When the agent needs permission to execute a tool, approval requests
+          will appear here.
+        </p>
+      </div>
+    )
+  }
+
+  return (
+    <div className="space-y-3">
+      {pending.map((item) => (
+        <div
+          key={item.approvalId}
+          className="flex items-start gap-4 rounded-xl border border-yellow-500/20 bg-yellow-500/5 p-4"
+        >
+          <div className="mt-0.5 flex size-9 shrink-0 items-center justify-center rounded-full bg-yellow-500/10">
+            <Clock className="size-4 text-yellow-600" />
+          </div>
+          <div className="min-w-0 flex-1">
+            <div className="flex items-center gap-2">
+              <span className="font-medium text-sm">
+                {formatToolName(item.toolName)}
+              </span>
+              <Badge variant="outline" className="text-[10px]">
+                awaiting
+              </Badge>
+            </div>
+            {Object.keys(item.input).length > 0 && (
+              <pre className="mt-1 max-h-20 overflow-auto rounded bg-muted/50 p-2 font-mono text-muted-foreground text-xs">
+                {JSON.stringify(item.input, null, 2)}
+              </pre>
+            )}
+            <div className="mt-3 flex gap-2">
+              <Button
+                size="sm"
+                className="h-7 gap-1 px-3 text-xs"
+                onClick={() => respond(item.approvalId, true)}
+              >
+                <ShieldCheck className="size-3" />
+                Approve
+              </Button>
+              <Button
+                size="sm"
+                variant="outline"
+                className="h-7 gap-1 px-3 text-xs"
+                onClick={() => respond(item.approvalId, false)}
+              >
+                <ShieldX className="size-3" />
+                Deny
+              </Button>
+            </div>
+          </div>
+        </div>
+      ))}
+    </div>
+  )
+}
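Review note: `formatToolName` turns snake_case and camelCase tool identifiers into display text, but it only capitalises the first character of the result. The function below is copied from this file; the tool names passed to it are made up for illustration.

```typescript
const formatToolName = (name: string) =>
  name
    .replace(/_/g, ' ') // snake_case separators become spaces
    .replace(/([a-z])([A-Z])/g, '$1 $2') // split camelCase boundaries
    .replace(/^./, (s) => s.toUpperCase()) // capitalise the first character only

// Hypothetical tool names, used only to show the three replacement steps.
const examples = ['browser_click', 'getPageContent', 'take_screenShot'].map(
  formatToolName,
)
// examples is ['Browser click', 'Get Page Content', 'Take screen Shot']
```

The mixed casing in the output comes from the input: snake_case words stay lowercase, while camelCase words keep their capitals.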
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCard.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCard.tsx
@@ -0,0 +1,114 @@
+import { Bot } from 'lucide-react'
+import type { FC } from 'react'
+import type { AgentCardData } from '@/lib/agent-conversations/types'
+import { cn } from '@/lib/utils'
+
+interface AgentCardProps {
+  agent: AgentCardData
+  onClick: () => void
+  active?: boolean
+}
+
+function formatTimestamp(timestamp?: number): string {
+  if (!timestamp) return 'No activity yet'
+  const diff = Date.now() - timestamp
+  const minutes = Math.floor(diff / 60000)
+  if (minutes < 1) return 'just now'
+  if (minutes < 60) return `${minutes}m ago`
+  const hours = Math.floor(minutes / 60)
+  if (hours < 24) return `${hours}h ago`
+  return `${Math.floor(hours / 24)}d ago`
+}
+
+function getStatusLabel(status: AgentCardData['status']): string {
+  if (status === 'working') return 'Working'
+  if (status === 'error') return 'Error'
+  return 'Ready'
+}
+
+function getStatusTone(status: AgentCardData['status']): string {
+  if (status === 'working') return 'bg-amber-500'
+  if (status === 'error') return 'bg-destructive'
+  return 'bg-emerald-500'
+}
+
+export const AgentCardExpanded: FC<AgentCardProps> = ({
+  agent,
+  onClick,
+  active,
+}) => (
+  <button
+    type="button"
+    onClick={onClick}
+    className={cn(
+      'group flex min-h-32 w-full min-w-0 flex-col rounded-2xl border p-4 text-left shadow-sm transition-all duration-200',
+      active
+        ? 'border-border/80 bg-card shadow-md ring-1 ring-[var(--accent-orange)]/20'
+        : 'border-border/60 bg-card/85 hover:border-border hover:bg-card hover:shadow-md',
+    )}
+  >
+    <div className="flex items-start justify-between gap-3">
+      <div className="flex min-w-0 items-center gap-3">
+        <div
+          className={cn(
+            'flex size-10 shrink-0 items-center justify-center rounded-xl',
+            active
+              ? 'bg-[var(--accent-orange)]/10 text-[var(--accent-orange)]'
+              : 'bg-muted text-muted-foreground',
+          )}
+        >
+          <Bot className="size-5" />
+        </div>
+        <div className="min-w-0">
+          <div className="truncate font-semibold text-sm">{agent.name}</div>
+          <div className="truncate text-muted-foreground text-xs">
+            {agent.model ?? 'OpenClaw agent'}
+          </div>
+        </div>
+      </div>
+      <div className="flex items-center gap-2 rounded-full border border-border/60 bg-background/70 px-2.5 py-1 text-[11px] text-muted-foreground">
+        <span
+          className={cn('size-2 rounded-full', getStatusTone(agent.status))}
+        />
+        <span>{getStatusLabel(agent.status)}</span>
+      </div>
+    </div>
+
+    <div className="mt-4 flex-1">
+      <p className="line-clamp-2 text-foreground/90 text-sm">
+        {agent.lastMessage ??
+          'Start a conversation to see recent work and summaries.'}
+      </p>
+    </div>
+
+    <div className="mt-4 flex items-center justify-between gap-3 text-muted-foreground text-xs">
+      <span>{formatTimestamp(agent.lastMessageTimestamp)}</span>
+      <span>Open conversation</span>
+    </div>
+  </button>
+)
+
+export const AgentCardCompact: FC<AgentCardProps> = ({
+  agent,
+  onClick,
+  active,
+}) => (
+  <button
+    type="button"
+    onClick={onClick}
+    className={cn(
+      'inline-flex items-center gap-2 rounded-full border px-3 py-2 text-sm transition-colors',
+      active
+        ? 'border-border bg-card shadow-sm ring-1 ring-[var(--accent-orange)]/20'
+        : 'border-border/60 bg-card/85 text-foreground hover:border-border hover:bg-card',
+    )}
+  >
+    <span
+      className={cn(
+        'size-2 rounded-full',
+        active ? 'bg-[var(--accent-orange)]' : getStatusTone(agent.status),
+      )}
+    />
+    <span className="truncate">{agent.name}</span>
+  </button>
+)
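Review note: `formatTimestamp` reads `Date.now()` directly, which makes its bucket boundaries awkward to check. The sketch below keeps the same thresholds but takes the current time as a second argument; that `now` parameter and the name `formatRelative` are assumptions added for illustration, not part of this diff.

```typescript
// Same thresholds as formatTimestamp, with an injectable clock (assumption).
function formatRelative(timestamp: number | undefined, now: number): string {
  // A missing timestamp and a timestamp of 0 both count as "no activity".
  if (!timestamp) return 'No activity yet'
  const minutes = Math.floor((now - timestamp) / 60000)
  if (minutes < 1) return 'just now'
  if (minutes < 60) return `${minutes}m ago`
  const hours = Math.floor(minutes / 60)
  if (hours < 24) return `${hours}h ago`
  return `${Math.floor(hours / 24)}d ago`
}

const now = Date.UTC(2026, 3, 15, 12, 0, 0)
const samples = [
  formatRelative(undefined, now), // 'No activity yet'
  formatRelative(now - 30_000, now), // 'just now'
  formatRelative(now - 90_000, now), // '1m ago'
  formatRelative(now - 3 * 3_600_000, now), // '3h ago'
  formatRelative(now - 48 * 3_600_000, now), // '2d ago'
]
```

A timestamp slightly in the future also falls into the "just now" bucket, because the minute count is then negative and therefore below 1.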
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCardDock.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCardDock.tsx
@@ -0,0 +1,71 @@
+import { Plus } from 'lucide-react'
+import type { FC } from 'react'
+import type { AgentCardData } from '@/lib/agent-conversations/types'
+import { cn } from '@/lib/utils'
+import { AgentCardCompact, AgentCardExpanded } from './AgentCard'
+
+interface AgentCardDockProps {
+  agents: AgentCardData[]
+  activeAgentId?: string
+  onSelectAgent: (agentId: string) => void
+  onCreateAgent?: () => void
+  compact?: boolean
+}
+
+function CreateAgentButton({
+  compact,
+  onCreateAgent,
+}: {
+  compact?: boolean
+  onCreateAgent: () => void
+}) {
+  return (
+    <button
+      type="button"
+      onClick={onCreateAgent}
+      className={cn(
+        'flex shrink-0 items-center justify-center gap-2 border border-dashed text-muted-foreground transition-colors hover:border-[var(--accent-orange)] hover:text-[var(--accent-orange)]',
+        compact
+          ? 'rounded-full px-3 py-2 text-sm'
+          : 'min-h-32 rounded-2xl px-5 py-4',
+      )}
+    >
+      <Plus className={compact ? 'size-3.5' : 'size-5'} />
+      <span>{compact ? 'New' : 'Create agent'}</span>
+    </button>
+  )
+}
+
+export const AgentCardDock: FC<AgentCardDockProps> = ({
+  agents,
+  activeAgentId,
+  onSelectAgent,
+  onCreateAgent,
+  compact,
+}) => {
+  if (agents.length === 0 && !onCreateAgent) return null
+
+  const Card = compact ? AgentCardCompact : AgentCardExpanded
+
+  return (
+    <div
+      className={cn(
+        compact
+          ? 'flex items-center gap-2 overflow-x-auto pb-1'
+          : 'grid gap-4 md:grid-cols-3',
+      )}
+    >
+      {agents.map((agent) => (
+        <Card
+          key={agent.agentId}
+          agent={agent}
+          active={agent.agentId === activeAgentId}
+          onClick={() => onSelectAgent(agent.agentId)}
+        />
+      ))}
+      {onCreateAgent ? (
+        <CreateAgentButton compact={compact} onCreateAgent={onCreateAgent} />
+      ) : null}
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandConversation.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandConversation.tsx
@@ -0,0 +1,194 @@
+import { Bot, Home, RotateCcw } from 'lucide-react'
+import { type FC, useEffect, useRef } from 'react'
+import { Navigate, useNavigate, useParams, useSearchParams } from 'react-router'
+import { Button } from '@/components/ui/button'
+import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
+import { cn } from '@/lib/utils'
+import { useAgentCommandData } from './agent-command-layout'
+import { ConversationInput } from './ConversationInput'
+import { ConversationMessage } from './ConversationMessage'
+import { useAgentConversation } from './useAgentConversation'
+
+function ConversationHeader({
+  agentName,
+  status,
+  onGoHome,
+  onReset,
+}: {
+  agentName: string
+  status: string
+  onGoHome: () => void
+  onReset: () => void
+}) {
+  return (
+    <div className="overflow-hidden rounded-[1.5rem] border border-border/60 bg-card/95 shadow-sm backdrop-blur">
+      <div className="flex items-center justify-between gap-3 px-5 py-4">
+        <div className="flex min-w-0 items-center gap-3">
+          <Button
+            variant="ghost"
+            size="icon"
+            onClick={onGoHome}
+            className="rounded-xl"
+            title="Back to home"
+          >
+            <Home className="size-4" />
+          </Button>
+          <div className="flex size-11 shrink-0 items-center justify-center rounded-2xl bg-muted text-muted-foreground">
+            <Bot className="size-5" />
+          </div>
+          <div className="min-w-0">
+            <div className="truncate font-semibold text-sm">{agentName}</div>
+            <div className="truncate text-muted-foreground text-sm">
+              {status}
+            </div>
+          </div>
+        </div>
+        <Button
+          variant="ghost"
+          size="sm"
+          onClick={onReset}
+          className="rounded-xl text-muted-foreground"
+        >
+          <RotateCcw className="mr-2 size-4" />
+          New conversation
+        </Button>
+      </div>
+    </div>
+  )
+}
+
+function EmptyConversationState({ agentName }: { agentName: string }) {
+  return (
+    <div className="flex min-h-full items-center justify-center py-10">
+      <div className="max-w-md rounded-[1.5rem] border border-border/60 bg-card/90 px-8 py-10 text-center shadow-sm backdrop-blur">
+        <div className="mx-auto flex size-14 items-center justify-center rounded-2xl bg-muted text-muted-foreground">
+          <Bot className="size-6" />
+        </div>
+        <h2 className="mt-4 font-semibold text-lg">{agentName}</h2>
+        <p className="mt-2 text-muted-foreground text-sm">
+          Send a message to start a focused conversation with this agent.
+        </p>
+      </div>
+    </div>
+  )
+}
+
+function getConversationStatusCopy(
+  status: string | undefined,
+  streaming: boolean,
+): string {
+  if (streaming) return 'Working on your request'
+  if (status === 'running') return 'Ready for the next task'
+  if (status === 'starting') return 'Connecting to OpenClaw'
+  if (status === 'error') return 'OpenClaw needs attention'
+  if (status === 'stopped') return 'OpenClaw is offline'
+  return 'Open agent setup to continue'
+}
+
+export const AgentCommandConversation: FC = () => {
+  const { agentId } = useParams<{ agentId: string }>()
+  const [searchParams, setSearchParams] = useSearchParams()
+  const navigate = useNavigate()
+  const scrollRef = useRef<HTMLDivElement>(null)
+  const initialQuerySent = useRef(false)
+  const { status, agents } = useAgentCommandData()
+  const shouldRedirectHome = !agentId
+  const resolvedAgentId = agentId ?? ''
+  const agent = agents.find((entry) => entry.agentId === resolvedAgentId)
+  const agentName = agent?.name || resolvedAgentId || 'Agent'
+  const { turns, streaming, loading, send, resetConversation } =
+    useAgentConversation(resolvedAgentId, agentName)
+  const lastTurn = turns[turns.length - 1]
+  const lastTurnPartCount = lastTurn?.parts.length ?? 0
+
+  useEffect(() => {
+    if (shouldRedirectHome) return
+
+    const query = searchParams.get('q')
+    if (query && !initialQuerySent.current && !loading) {
+      initialQuerySent.current = true
+      setSearchParams({}, { replace: true })
+      void send(query)
+    }
+  }, [loading, searchParams, send, setSearchParams, shouldRedirectHome])
+
+  useEffect(() => {
+    if (
+      shouldRedirectHome ||
+      (turns.length === 0 && lastTurnPartCount === 0 && !streaming)
+    ) {
+      return
+    }
+
+    scrollRef.current?.scrollTo({
+      top: scrollRef.current.scrollHeight,
+      behavior: 'smooth',
+    })
+  }, [lastTurnPartCount, shouldRedirectHome, streaming, turns.length])
+
+  if (shouldRedirectHome) {
+    return <Navigate to="/home" replace />
+  }
+
+  const handleSelectAgent = (entry: AgentEntry) => {
+    navigate(`/home/agents/${entry.agentId}`)
+  }
+
+  const statusCopy = getConversationStatusCopy(status?.status, streaming)
+
+  return (
+    <div className="absolute inset-0 overflow-hidden">
+      <div className="fade-in slide-in-from-bottom-5 mx-auto flex h-full w-full max-w-3xl animate-in flex-col gap-3 px-4 pt-4 pb-2 duration-300">
+        <ConversationHeader
+          agentName={agentName}
+          status={statusCopy}
+          onGoHome={() => navigate('/home')}
+          onReset={resetConversation}
+        />
+
+        <main
+          ref={scrollRef}
+          className={cn(
+            'styled-scrollbar min-h-0 flex-1 overflow-y-auto overflow-x-hidden rounded-[1.5rem] border border-border/50 bg-card/85 px-5 py-5 shadow-sm',
+            '[&_[data-streamdown="code-block"]]:!max-w-full [&_[data-streamdown="table-wrapper"]]:!max-w-full [&_[data-streamdown="code-block"]]:overflow-x-auto [&_[data-streamdown="table-wrapper"]]:overflow-x-auto',
+          )}
+        >
+          {loading ? (
+            <div className="flex h-full items-center justify-center text-muted-foreground text-sm">
+              Loading conversation...
+            </div>
+          ) : turns.length === 0 ? (
+            <EmptyConversationState agentName={agentName} />
+          ) : (
+            <div className="w-full space-y-4">
+              {turns.map((turn, index) => (
+                <ConversationMessage
+                  key={turn.id}
+                  turn={turn}
+                  streaming={streaming && index === turns.length - 1}
+                />
+              ))}
+            </div>
+          )}
+        </main>
+
+        <div className="w-full flex-shrink-0">
+          <ConversationInput
+            variant="conversation"
+            agents={agents}
+            selectedAgentId={resolvedAgentId}
+            onSelectAgent={handleSelectAgent}
+            onSend={(text) => {
+              void send(text)
+            }}
+            onCreateAgent={() => navigate('/agents')}
+            streaming={streaming}
+            disabled={status?.status !== 'running'}
+            status={status?.status}
+            placeholder={`Message ${agentName}...`}
+          />
+        </div>
+      </div>
+    </div>
+  )
+}
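Review note: the conversation view picks up its first prompt from the `q` search parameter, which `handleSend` in AgentCommandHome.tsx writes with `encodeURIComponent`. A minimal sketch of that round trip, assuming two hypothetical helpers (`buildConversationPath` and `readInitialQuery`) and the standard `URLSearchParams` API rather than React Router's hook:

```typescript
// Hypothetical writer, mirroring the template string used by handleSend.
function buildConversationPath(agentId: string, text: string): string {
  return `/home/agents/${agentId}?q=${encodeURIComponent(text)}`
}

// Hypothetical reader, standing in for searchParams.get('q').
function readInitialQuery(path: string): string | null {
  const queryStart = path.indexOf('?')
  if (queryStart === -1) return null
  return new URLSearchParams(path.slice(queryStart + 1)).get('q')
}

const path = buildConversationPath('agent-1', 'compare prices & book?')
const restored = readInitialQuery(path)
// restored is 'compare prices & book?'
```

Encoding matters because characters such as `&`, `?`, and `+` would otherwise be read as query syntax and the prompt would arrive truncated or altered.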
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandHome.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentCommandHome.tsx
@@ -0,0 +1,182 @@
+import { ArrowRight } from 'lucide-react'
+import { type FC, useEffect, useState } from 'react'
+import { useNavigate } from 'react-router'
+import { Button } from '@/components/ui/button'
+import { Card, CardContent } from '@/components/ui/card'
+import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
+import { ImportDataHint } from '@/entrypoints/newtab/index/ImportDataHint'
+import { NewTabBranding } from '@/entrypoints/newtab/index/NewTabBranding'
+import { NewTabTip } from '@/entrypoints/newtab/index/NewTabTip'
+import { ScheduleResults } from '@/entrypoints/newtab/index/ScheduleResults'
+import { SignInHint } from '@/entrypoints/newtab/index/SignInHint'
+import { TopSites } from '@/entrypoints/newtab/index/TopSites'
+import { useActiveHint } from '@/entrypoints/newtab/index/useActiveHint'
+import { AgentCardDock } from './AgentCardDock'
+import { useAgentCommandData } from './agent-command-layout'
+import { ConversationInput } from './ConversationInput'
+import { useAgentCardData } from './useAgentCardData'
+
+function AgentCommandSetupState({
+  onOpenAgents,
+}: {
+  onOpenAgents: () => void
+}) {
+  return (
+    <Card className="border-border/60 bg-card/85 shadow-sm">
+      <CardContent className="flex flex-col items-center gap-4 p-6 text-center">
+        <p className="max-w-xl text-muted-foreground text-sm">
+          Set up OpenClaw agents to turn your new tab into an agent command
+          center.
+        </p>
+        <Button onClick={onOpenAgents} className="gap-2">
+          Open Agent Setup
+          <ArrowRight className="size-4" />
+        </Button>
+      </CardContent>
+    </Card>
+  )
+}
+
+function EmptyAgentsState({ onOpenAgents }: { onOpenAgents: () => void }) {
+  return (
+    <Card className="border-border/60 bg-card/85 shadow-sm">
+      <CardContent className="flex flex-col items-center gap-4 p-6 text-center">
+        <p className="max-w-xl text-muted-foreground text-sm">
+          OpenClaw is running, but you do not have any agents yet.
+        </p>
+        <Button variant="outline" onClick={onOpenAgents}>
+          Create your first agent
+        </Button>
+      </CardContent>
+    </Card>
+  )
+}
+
+function OpenClawUnavailableState({
+  onOpenAgents,
+}: {
+  onOpenAgents: () => void
+}) {
+  return (
+    <Card className="border-border/60 bg-card/85 shadow-sm">
+      <CardContent className="flex flex-col items-center gap-4 p-6 text-center">
+        <p className="max-w-xl text-muted-foreground text-sm">
+          OpenClaw is unavailable right now. Open the Agents page to restart the
+          gateway or review setup.
+        </p>
+        <Button onClick={onOpenAgents} className="gap-2">
+          Open Agent Setup
+          <ArrowRight className="size-4" />
+        </Button>
+      </CardContent>
+    </Card>
+  )
+}
+
+export const AgentCommandHome: FC = () => {
+  const navigate = useNavigate()
+  const activeHint = useActiveHint()
+  const { status, agents } = useAgentCommandData()
+  const [mounted, setMounted] = useState(false)
+  const [selectedAgentId, setSelectedAgentId] = useState<string | null>(null)
+  const cardData = useAgentCardData(agents, status?.status)
+
+  useEffect(() => {
+    setMounted(true)
+  }, [])
+
+  useEffect(() => {
+    if (agents.length === 0) {
+      if (selectedAgentId) {
+        setSelectedAgentId(null)
+      }
+      return
+    }
+
+    if (
+      !selectedAgentId ||
+      !agents.some((agent) => agent.agentId === selectedAgentId)
+    ) {
+      setSelectedAgentId(agents[0].agentId)
+    }
+  }, [agents, selectedAgentId])
+
+  const handleSend = (text: string) => {
+    if (!selectedAgentId) return
+    navigate(`/home/agents/${selectedAgentId}?q=${encodeURIComponent(text)}`)
+  }
+
+  const handleSelectAgent = (agent: AgentEntry) => {
+    setSelectedAgentId(agent.agentId)
+  }
+
+  const openClawStatus = status?.status
+  const isSetup = openClawStatus != null && openClawStatus !== 'uninitialized'
+  const shouldShowUnavailableState =
+    openClawStatus != null &&
+    openClawStatus !== 'running' &&
+    openClawStatus !== 'uninitialized' &&
+    cardData.length === 0
+
+  return (
+    <div className="pt-[max(25vh,16px)]">
+      <div className="relative w-full space-y-8 md:w-3xl">
+        <NewTabBranding />
+
+        <ConversationInput
+          variant="home"
+          agents={agents}
+          selectedAgentId={selectedAgentId}
+          onSelectAgent={handleSelectAgent}
+          onSend={handleSend}
+          onCreateAgent={() => navigate('/agents')}
+          streaming={false}
+          disabled={openClawStatus !== 'running'}
+          status={openClawStatus}
+          placeholder={
+            openClawStatus === 'running'
+              ? undefined
+              : 'OpenClaw is not running...'
+          }
+        />
+
+        {mounted ? <NewTabTip /> : null}
+
+        {isSetup ? (
+          shouldShowUnavailableState ? (
+            <OpenClawUnavailableState
+              onOpenAgents={() => navigate('/agents')}
+            />
+          ) : cardData.length > 0 ? (
+            <section className="space-y-3">
+              <div className="flex items-center justify-between">
+                <div>
+                  <h2 className="font-semibold text-base">Agents</h2>
+                  <p className="text-muted-foreground text-sm">
+                    Pick up where your agents left off.
+                  </p>
+                </div>
+              </div>
+              <AgentCardDock
+                agents={cardData}
+                activeAgentId={selectedAgentId ?? undefined}
+                onSelectAgent={(agentId) => navigate(`/home/agents/${agentId}`)}
+                onCreateAgent={() => navigate('/agents')}
+              />
+            </section>
+          ) : (
+            <EmptyAgentsState onOpenAgents={() => navigate('/agents')} />
+          )
+        ) : (
+          <AgentCommandSetupState onOpenAgents={() => navigate('/agents')} />
+        )}
+
+        {mounted ? <TopSites /> : null}
+        {mounted ? <ScheduleResults /> : null}
+      </div>
+
+      {activeHint === 'signin' ? <SignInHint /> : null}
+      {activeHint === 'import' ? <ImportDataHint /> : null}
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentSelector.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/AgentSelector.tsx
@@ -0,0 +1,132 @@
+import { Bot, Check, ChevronDown, Plus } from 'lucide-react'
+import type { FC } from 'react'
+import { useState } from 'react'
+import { Button } from '@/components/ui/button'
+import {
+  Command,
+  CommandEmpty,
+  CommandGroup,
+  CommandInput,
+  CommandItem,
+  CommandList,
+} from '@/components/ui/command'
+import {
+  Popover,
+  PopoverContent,
+  PopoverTrigger,
+} from '@/components/ui/popover'
+import {
+  type AgentEntry,
+  getModelDisplayName,
+} from '@/entrypoints/app/agents/useOpenClaw'
+import { cn } from '@/lib/utils'
+
+interface AgentSelectorProps {
+  agents: AgentEntry[]
+  selectedAgentId: string | null
+  onSelectAgent: (agent: AgentEntry) => void
+  onCreateAgent?: () => void
+  status?: string
+}
+
+function getStatusDot(status?: string) {
+  if (status === 'running') return 'bg-emerald-500'
+  if (status === 'starting') return 'bg-amber-500 animate-pulse'
+  if (status === 'error') return 'bg-destructive'
+  return 'bg-muted-foreground/50'
+}
+
+export const AgentSelector: FC<AgentSelectorProps> = ({
+  agents,
+  selectedAgentId,
+  onSelectAgent,
+  onCreateAgent,
+  status,
+}) => {
+  const [open, setOpen] = useState(false)
+  const selectedAgent = agents.find(
+    (agent) => agent.agentId === selectedAgentId,
+  )
+
+  return (
+    <Popover open={open} onOpenChange={setOpen}>
+      <PopoverTrigger asChild>
+        <Button
+          variant="ghost"
+          className={cn(
+            'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+            'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+            'data-[state=open]:bg-accent',
+          )}
+        >
+          <Bot className="h-4 w-4" />
+          <span className={cn('size-2 rounded-full', getStatusDot(status))} />
+          <span className="max-w-32 truncate">
+            {selectedAgent?.name ?? 'Select agent'}
+          </span>
+          <ChevronDown className="h-3 w-3" />
+        </Button>
+      </PopoverTrigger>
+      <PopoverContent side="bottom" align="start" className="w-72 p-0">
+        <Command>
+          <CommandInput placeholder="Search agents..." className="h-9" />
+          <CommandList>
+            <CommandEmpty>No agents found</CommandEmpty>
+            <CommandGroup>
+              {agents.map((agent) => {
+                const isSelected = selectedAgentId === agent.agentId
+                const modelLabel = getModelDisplayName(agent.model)
+                return (
+                  <CommandItem
+                    key={agent.agentId}
+                    value={`${agent.agentId} ${agent.name}`}
+                    onSelect={() => {
+                      onSelectAgent(agent)
+                      setOpen(false)
+                    }}
+                    className={cn(
+                      'flex w-full items-center gap-3 rounded-md px-3 py-2',
+                      isSelected && 'bg-[var(--accent-orange)]/10',
+                    )}
+                  >
+                    <div className="flex size-8 shrink-0 items-center justify-center rounded-lg bg-[var(--accent-orange)]/10 text-[var(--accent-orange)]">
+                      <Bot className="size-4" />
+                    </div>
+                    <div className="min-w-0 flex-1">
+                      <span className="block truncate font-medium text-sm">
+                        {agent.name}
+                      </span>
+                      {modelLabel ? (
+                        <span className="block truncate text-muted-foreground text-xs">
+                          {modelLabel}
+                        </span>
+                      ) : null}
+                    </div>
+                    {isSelected ? (
+                      <Check className="size-4 shrink-0 text-[var(--accent-orange)]" />
+                    ) : null}
+                  </CommandItem>
+                )
+              })}
+            </CommandGroup>
+            {onCreateAgent ? (
+              <div className="border-border border-t p-1">
+                <button
+                  type="button"
+                  className="flex w-full items-center gap-3 rounded-md px-3 py-2 text-left text-muted-foreground text-sm transition-colors hover:bg-accent hover:text-foreground"
+                  onClick={() => {
+                    onCreateAgent()
+                    setOpen(false)
+                  }}
+                >
+                  <Plus className="size-4" />
+                  <span>Create agent</span>
+                </button>
+              </div>
+            ) : null}
+          </CommandList>
+        </Command>
+      </PopoverContent>
+    </Popover>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationInput.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationInput.tsx
@@ -0,0 +1,371 @@
+import {
+  ArrowRight,
+  Bot,
+  ChevronDown,
+  Folder,
+  Layers,
+  Loader2,
+  Mic,
+  Square,
+} from 'lucide-react'
+import { type FC, type ReactNode, useEffect, useState } from 'react'
+import { AppSelector } from '@/components/elements/AppSelector'
+import { TabPickerPopover } from '@/components/elements/tab-picker-popover'
+import { WorkspaceSelector } from '@/components/elements/workspace-selector'
+import { Button } from '@/components/ui/button'
+import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
+import { McpServerIcon } from '@/entrypoints/app/connect-mcp/McpServerIcon'
+import { useGetUserMCPIntegrations } from '@/entrypoints/app/connect-mcp/useGetUserMCPIntegrations'
+import { Feature } from '@/lib/browseros/capabilities'
+import { useCapabilities } from '@/lib/browseros/useCapabilities'
+import { useMcpServers } from '@/lib/mcp/mcpServerStorage'
+import { cn } from '@/lib/utils'
+import { useVoiceInput } from '@/lib/voice/useVoiceInput'
+import { useWorkspace } from '@/lib/workspace/use-workspace'
+import { AgentSelector } from './AgentSelector'
+
+interface ConversationInputProps {
+  agents: AgentEntry[]
+  selectedAgentId: string | null
+  onSelectAgent: (agent: AgentEntry) => void
+  onSend: (text: string) => void
+  onCreateAgent?: () => void
+  streaming: boolean
+  disabled?: boolean
+  status?: string
+  placeholder?: string
+  variant?: 'home' | 'conversation'
+}
+
+function InputActionButton({
+  disabled,
+  onClick,
+  streaming,
+}: {
+  disabled: boolean
+  onClick: () => void
+  streaming: boolean
+}) {
+  return (
+    <Button
+      onClick={onClick}
+      size="icon"
+      disabled={disabled}
+      className="h-10 w-10 flex-shrink-0 rounded-xl bg-primary text-primary-foreground hover:bg-primary/90"
+    >
+      {streaming ? (
+        <Loader2 className="h-5 w-5 animate-spin" />
+      ) : (
+        <ArrowRight className="h-5 w-5" />
+      )}
+    </Button>
+  )
+}
+
+function VoiceButton({
+  isRecording,
+  isTranscribing,
+  onStart,
+  onStop,
+}: {
+  isRecording: boolean
+  isTranscribing: boolean
+  onStart: () => void
+  onStop: () => void
+}) {
+  if (isRecording) {
+    return (
+      <Button
+        type="button"
+        size="icon"
+        onClick={onStop}
+        className="h-10 w-10 flex-shrink-0 rounded-xl bg-red-600 text-white hover:bg-red-700"
+      >
+        <Square className="h-4 w-4" />
+      </Button>
+    )
+  }
+
+  if (isTranscribing) {
+    return (
+      <Button
+        type="button"
+        variant="ghost"
+        size="icon"
+        disabled
+        className="h-10 w-10 flex-shrink-0 rounded-xl"
+      >
+        <Loader2 className="h-5 w-5 animate-spin" />
+      </Button>
+    )
+  }
+
+  return (
+    <Button
+      type="button"
+      variant="ghost"
+      size="icon"
+      onClick={onStart}
+      className="h-10 w-10 flex-shrink-0 rounded-xl text-muted-foreground transition-colors hover:text-foreground"
+      title="Voice input"
+    >
+      <Mic className="h-5 w-5" />
+    </Button>
+  )
+}
+
+function ContextControls({
+  agents,
+  onCreateAgent,
+  onSelectAgent,
+  selectedAgentId,
+  selectedTabs,
+  onToggleTab,
+  showAgentSelector,
+  status,
+}: {
+  agents: AgentEntry[]
+  onCreateAgent?: () => void
+  onSelectAgent: (agent: AgentEntry) => void
+  selectedAgentId: string | null
+  selectedTabs: chrome.tabs.Tab[]
+  onToggleTab: (tab: chrome.tabs.Tab) => void
+  showAgentSelector: boolean
+  status?: string
+}) {
+  const { supports } = useCapabilities()
+  const { selectedFolder } = useWorkspace()
+  const { servers: mcpServers } = useMcpServers()
+  const { data: userMCPIntegrations } = useGetUserMCPIntegrations()
+
+  const connectedManagedServers = mcpServers.filter((server) => {
+    if (server.type !== 'managed' || !server.managedServerName) return false
+    return userMCPIntegrations?.integrations?.find(
+      (integration) => integration.name === server.managedServerName,
+    )?.is_authenticated
+  })
+
+  return (
+    <div className="flex items-center justify-between border-border/50 border-t px-5 py-3">
+      <div className="flex items-center gap-1">
+        {showAgentSelector ? (
+          <AgentSelector
+            agents={agents}
+            selectedAgentId={selectedAgentId}
+            onSelectAgent={onSelectAgent}
+            onCreateAgent={onCreateAgent}
+            status={status}
+          />
+        ) : null}
+        {supports(Feature.WORKSPACE_FOLDER_SUPPORT) ? (
+          <WorkspaceSelector>
+            <Button
+              variant="ghost"
+              className={cn(
+                'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+                'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+                'data-[state=open]:bg-accent',
+              )}
+            >
+              <Folder className="h-4 w-4" />
+              <span>{selectedFolder?.name || 'Add workspace'}</span>
+              <ChevronDown className="h-3 w-3" />
+            </Button>
+          </WorkspaceSelector>
+        ) : null}
+        <TabPickerPopover
+          variant="selector"
+          selectedTabs={selectedTabs}
+          onToggleTab={onToggleTab}
+        >
+          <Button
+            className={cn(
+              'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+              selectedTabs.length > 0
+                ? 'bg-[var(--accent-orange)]! text-white shadow-sm'
+                : 'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+              'data-[state=open]:bg-accent',
+            )}
+          >
+            <Layers className="h-4 w-4" />
+            <span>Tabs</span>
+          </Button>
+        </TabPickerPopover>
+      </div>
+
+      {supports(Feature.MANAGED_MCP_SUPPORT) ? (
+        <div className="ml-auto flex items-center gap-1.5">
+          <AppSelector side="bottom">
+            <Button
+              variant="ghost"
+              className={cn(
+                'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
+                'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
+                'data-[state=open]:bg-accent',
+              )}
+            >
+              <div className="flex items-center -space-x-1.5">
+                {connectedManagedServers.slice(0, 4).map((server) => (
+                  <div
+                    key={server.id}
+                    className="rounded-full ring-2 ring-card"
+                  >
+                    <McpServerIcon
+                      serverName={server.managedServerName ?? ''}
+                      size={16}
+                    />
+                  </div>
+                ))}
+              </div>
+              {connectedManagedServers.length > 4 ? (
+                <span className="text-xs">
+                  +{connectedManagedServers.length - 4}
+                </span>
+              ) : null}
+              <span>Apps</span>
+              <ChevronDown className="h-3 w-3" />
+            </Button>
+          </AppSelector>
+        </div>
+      ) : null}
+    </div>
+  )
+}
+
+function HomeShell({ children }: { children: ReactNode }) {
+  return (
+    <div className="overflow-hidden rounded-[1.5rem] border border-border/60 bg-card/95 shadow-sm backdrop-blur">
+      {children}
+    </div>
+  )
+}
+
+function ConversationShell({ children }: { children: ReactNode }) {
+  return (
+    <div className="overflow-hidden rounded-[1.5rem] border border-border/60 bg-card/95 shadow-sm backdrop-blur">
+      {children}
+    </div>
+  )
+}
+
+export const ConversationInput: FC<ConversationInputProps> = ({
+  agents,
+  selectedAgentId,
+  onSelectAgent,
+  onSend,
+  onCreateAgent,
+  streaming,
+  disabled,
+  status,
+  placeholder,
+  variant = 'conversation',
+}) => {
+  const [input, setInput] = useState('')
+  const [selectedTabs, setSelectedTabs] = useState<chrome.tabs.Tab[]>([])
+  const voice = useVoiceInput()
+  const selectedAgent = agents.find(
+    (agent) => agent.agentId === selectedAgentId,
+  )
+
+  useEffect(() => {
+    if (voice.transcript && !voice.isTranscribing) {
+      setInput(voice.transcript)
+      voice.clearTranscript()
+    }
+  }, [voice.transcript, voice.isTranscribing, voice])
+
+  const toggleTab = (tab: chrome.tabs.Tab) => {
+    setSelectedTabs((prev) => {
+      const isSelected = prev.some((selected) => selected.id === tab.id)
+      if (isSelected) {
+        return prev.filter((selected) => selected.id !== tab.id)
+      }
+      return [...prev, tab]
+    })
+  }
+
+  const handleSend = () => {
+    const text = input.trim()
+    if (!text || streaming || disabled) return
+    onSend(text)
+    setInput('')
+  }
+
+  // HomeShell and ConversationShell currently render identical markup.
+  const Shell = variant === 'home' ? HomeShell : ConversationShell
+
+  return (
+    <Shell>
+      <div className="flex items-center gap-3 px-5 py-4">
+        <BotInputIcon variant={variant} />
+        <input
+          type="text"
+          value={input}
+          onChange={(event) => setInput(event.currentTarget.value)}
+          onKeyDown={(event) => {
+            if (event.key === 'Enter') {
+              event.preventDefault()
+              handleSend()
+            }
+          }}
+          placeholder={
+            voice.isTranscribing
+              ? 'Transcribing...'
+              : (placeholder ?? `Message ${selectedAgent?.name ?? 'agent'}...`)
+          }
+          disabled={disabled || voice.isTranscribing}
+          className="flex-1 border-none bg-transparent text-base text-foreground outline-none placeholder:text-muted-foreground disabled:opacity-60"
+        />
+        <VoiceButton
+          isRecording={voice.isRecording}
+          isTranscribing={voice.isTranscribing}
+          onStart={() => {
+            void voice.startRecording()
+          }}
+          onStop={() => {
+            void voice.stopRecording()
+          }}
+        />
+        <InputActionButton
+          disabled={
+            !input.trim() ||
+            streaming ||
+            !!disabled ||
+            voice.isRecording ||
+            voice.isTranscribing
+          }
+          onClick={handleSend}
+          streaming={streaming}
+        />
+      </div>
+      {voice.error ? (
+        <div className="px-5 pb-2 text-destructive text-xs">{voice.error}</div>
+      ) : null}
+      <ContextControls
+        agents={agents}
+        onCreateAgent={onCreateAgent}
+        onSelectAgent={onSelectAgent}
+        selectedAgentId={selectedAgentId}
+        selectedTabs={selectedTabs}
+        onToggleTab={toggleTab}
+        showAgentSelector={variant === 'home'}
+        status={status}
+      />
+    </Shell>
+  )
+}
+
+function BotInputIcon({ variant }: { variant: 'home' | 'conversation' }) {
+  return (
+    <div
+      className={cn(
+        'flex items-center justify-center text-[var(--accent-orange)]',
+        variant === 'home'
+          ? 'h-10 w-10 rounded-xl bg-[var(--accent-orange)]/10'
+          : 'h-9 w-9 rounded-xl bg-[var(--accent-orange)]/12',
+      )}
+    >
+      <Bot className="h-4 w-4" />
+    </div>
+  )
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationMessage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationMessage.tsx
@@ -0,0 +1,105 @@
+import { Bot, CheckCircle2, Loader2, XCircle } from 'lucide-react'
+import type { FC } from 'react'
+import {
+  Message,
+  MessageContent,
+  MessageResponse,
+} from '@/components/ai-elements/message'
+import {
+  Reasoning,
+  ReasoningContent,
+  ReasoningTrigger,
+} from '@/components/ai-elements/reasoning'
+import type { AgentConversationTurn } from '@/lib/agent-conversations/types'
+
+interface ConversationMessageProps {
+  turn: AgentConversationTurn
+  streaming: boolean
+}
+
+export const ConversationMessage: FC<ConversationMessageProps> = ({
+  turn,
+  streaming,
+}) => (
+  <div className="space-y-3">
+    <Message from="user">
+      <MessageContent>
+        <pre className="whitespace-pre-wrap font-sans text-sm">
+          {turn.userText}
+        </pre>
+      </MessageContent>
+    </Message>
+
+    {turn.parts.length > 0 && (
+      <Message from="assistant">
+        <MessageContent>
+          {turn.parts.map((part, i) => {
+            const key = `${turn.id}-part-${i}`
+
+            switch (part.kind) {
+              case 'thinking':
+                return (
+                  <Reasoning
+                    key={key}
+                    className="w-full"
+                    isStreaming={!part.done}
+                    defaultOpen={!part.done}
+                  >
+                    <ReasoningTrigger />
+                    <ReasoningContent>{part.text}</ReasoningContent>
+                  </Reasoning>
+                )
+
+              case 'tool-batch':
+                return (
+                  <div key={key} className="w-full space-y-1">
+                    {part.tools.map((tool) => (
+                      <div
+                        key={tool.id}
+                        className="flex items-center gap-2 rounded-md border px-3 py-2 text-sm"
+                      >
+                        {tool.status === 'running' && (
+                          <Loader2 className="size-3.5 animate-spin text-muted-foreground" />
+                        )}
+                        {tool.status === 'completed' && (
+                          <CheckCircle2 className="size-3.5 text-green-500" />
+                        )}
+                        {tool.status === 'error' && (
+                          <XCircle className="size-3.5 text-destructive" />
+                        )}
+                        <span className="font-mono text-xs">{tool.name}</span>
+                        {tool.durationMs != null && (
+                          <span className="ml-auto text-muted-foreground text-xs">
+                            {(tool.durationMs / 1000).toFixed(1)}s
+                          </span>
+                        )}
+                      </div>
+                    ))}
+                  </div>
+                )
+
+              case 'text':
+                return <MessageResponse key={key}>{part.text}</MessageResponse>
+
+              default:
+                return null
+            }
+          })}
+        </MessageContent>
+      </Message>
+    )}
+
+    {!turn.done && turn.parts.length === 0 && streaming && (
+      <div className="flex gap-2">
+        <div className="flex size-7 shrink-0 items-center justify-center rounded-full bg-[var(--accent-orange)] text-white">
+          <Bot className="size-3.5" />
+        </div>
+        <div className="flex items-center gap-1 rounded-xl rounded-tl-none border border-border/50 bg-card px-3 py-2.5 shadow-sm">
+          <span className="size-1.5 animate-bounce rounded-full bg-[var(--accent-orange)] [animation-delay:-0.3s]" />
+          <span className="size-1.5 animate-bounce rounded-full bg-[var(--accent-orange)] [animation-delay:-0.15s]" />
+          <span className="size-1.5 animate-bounce rounded-full bg-[var(--accent-orange)]" />
+        </div>
+      </div>
+    )}
+  </div>
+)
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-command-layout.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/agent-command-layout.tsx
@@ -0,0 +1,39 @@
+import type { FC } from 'react'
+import { Outlet, useOutletContext } from 'react-router'
+import {
+  type AgentEntry,
+  type OpenClawStatus,
+  useOpenClawAgents,
+  useOpenClawStatus,
+} from '@/entrypoints/app/agents/useOpenClaw'
+
+interface AgentCommandContextValue {
+  agents: AgentEntry[]
+  agentsLoading: boolean
+  status: OpenClawStatus | null
+  statusLoading: boolean
+}
+
+export const AgentCommandLayout: FC = () => {
+  const { status, loading: statusLoading } = useOpenClawStatus(5000)
+  const { agents, loading: agentsLoading } = useOpenClawAgents(
+    status?.status === 'running' && status.controlPlaneStatus === 'connected',
+  )
+
+  return (
+    <Outlet
+      context={
+        {
+          agents,
+          agentsLoading,
+          status,
+          statusLoading,
+        } satisfies AgentCommandContextValue
+      }
+    />
+  )
+}
+
+export function useAgentCommandData(): AgentCommandContextValue {
+  return useOutletContext<AgentCommandContextValue>()
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentCardData.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentCardData.ts
@@ -0,0 +1,69 @@
+import { useEffect, useState } from 'react'
+import {
+  type AgentEntry,
+  getModelDisplayName,
+  type OpenClawStatus,
+} from '@/entrypoints/app/agents/useOpenClaw'
+import { getLatestConversation } from '@/lib/agent-conversations/storage'
+import type { AgentCardData } from '@/lib/agent-conversations/types'
+
+function getAgentStatusTone(
+  status: OpenClawStatus['status'] | undefined,
+): AgentCardData['status'] {
+  if (status === 'error') return 'error'
+  if (status === 'starting') return 'working'
+  return 'idle'
+}
+
+async function getAgentCardData(
+  agent: AgentEntry,
+  status: OpenClawStatus['status'] | undefined,
+): Promise<AgentCardData> {
+  const conversation = await getLatestConversation(agent.agentId)
+  const lastTurn = conversation?.turns[conversation.turns.length - 1]
+  const lastTextPart = lastTurn?.parts.findLast((part) => part.kind === 'text')
+
+  return {
+    agentId: agent.agentId,
+    name: agent.name,
+    model: getModelDisplayName(agent.model),
+    status: getAgentStatusTone(status),
+    lastMessage:
+      lastTextPart?.kind === 'text'
+        ? lastTextPart.text.slice(0, 120)
+        : undefined,
+    lastMessageTimestamp: lastTurn?.timestamp,
+  }
+}
+
+export function useAgentCardData(
+  agents: AgentEntry[],
+  status: OpenClawStatus['status'] | undefined,
+) {
+  const [cardData, setCardData] = useState<AgentCardData[]>([])
+
+  useEffect(() => {
+    let active = true
+
+    const loadCardData = async () => {
+      const nextCardData = await Promise.all(
+        agents.map((agent) => getAgentCardData(agent, status)),
+      )
+      if (active) {
+        setCardData(nextCardData)
+      }
+    }
+
+    if (agents.length > 0) {
+      void loadCardData()
+    } else {
+      setCardData([])
+    }
+
+    return () => {
+      active = false
+    }
+  }, [agents, status])
+
+  return cardData
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentConversation.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/useAgentConversation.ts
@@ -0,0 +1,256 @@
+import { useEffect, useRef, useState } from 'react'
+import {
+  chatWithAgent,
+  type OpenClawStreamEvent,
+} from '@/entrypoints/app/agents/useOpenClaw'
+import {
+  getLatestConversation,
+  saveConversation,
+} from '@/lib/agent-conversations/storage'
+import type {
+  AgentConversation,
+  AgentConversationTurn,
+  AssistantPart,
+} from '@/lib/agent-conversations/types'
+import { consumeSSEStream } from '@/lib/sse'
+
+export function useAgentConversation(agentId: string, agentName: string) {
+  const [turns, setTurns] = useState<AgentConversationTurn[]>([])
+  const [streaming, setStreaming] = useState(false)
+  const [loading, setLoading] = useState(true)
+  const sessionKeyRef = useRef('')
+  const textAccRef = useRef('')
+  const thinkAccRef = useRef('')
+  const streamAbortRef = useRef<AbortController | null>(null)
+
+  useEffect(() => {
+    let active = true
+    getLatestConversation(agentId)
+      .then((conv) => {
+        if (!active) return
+        if (conv) {
+          setTurns(conv.turns)
+          sessionKeyRef.current = conv.sessionKey
+        } else {
+          sessionKeyRef.current = crypto.randomUUID()
+        }
+        setLoading(false)
+      })
+      .catch(() => {
+        if (active) {
+          sessionKeyRef.current = crypto.randomUUID()
+          setLoading(false)
+        }
+      })
+    return () => {
+      active = false
+    }
+  }, [agentId])
+
+  useEffect(() => {
+    return () => {
+      streamAbortRef.current?.abort()
+    }
+  }, [])
+
+  const persistTurns = (updatedTurns: AgentConversationTurn[]) => {
+    const conv: AgentConversation = {
+      agentId,
+      agentName,
+      sessionKey: sessionKeyRef.current,
+      turns: updatedTurns,
+      createdAt: updatedTurns[0]?.timestamp ?? Date.now(),
+      updatedAt: Date.now(),
+    }
+    saveConversation(conv).catch(() => {})
+  }
+
+  const updateCurrentTurnParts = (
+    updater: (parts: AssistantPart[]) => AssistantPart[],
+  ) => {
+    setTurns((prev) => {
+      const last = prev[prev.length - 1]
+      if (!last) return prev
+      return [...prev.slice(0, -1), { ...last, parts: updater(last.parts) }]
+    })
+  }
+
+  const processStreamEvent = (event: OpenClawStreamEvent) => {
+    switch (event.type) {
+      case 'text-delta': {
+        const delta = (event.data.text as string) ?? ''
+        textAccRef.current += delta
+        const text = textAccRef.current
+        updateCurrentTurnParts((parts) => {
+          const last = parts[parts.length - 1]
+          if (last?.kind === 'text') {
+            return [...parts.slice(0, -1), { ...last, text }]
+          }
+          return [...parts, { kind: 'text', text }]
+        })
+        break
+      }
+
+      case 'thinking': {
+        const delta = (event.data.text as string) ?? ''
+        thinkAccRef.current += delta
+        const text = thinkAccRef.current
+        updateCurrentTurnParts((parts) => {
+          const idx = parts.findIndex((p) => p.kind === 'thinking' && !p.done)
+          if (idx >= 0) {
+            return [
+              ...parts.slice(0, idx),
+              { ...parts[idx], text, done: false },
+              ...parts.slice(idx + 1),
+            ]
+          }
+          return [...parts, { kind: 'thinking', text, done: false }]
+        })
+        break
+      }
+
+      case 'tool-start': {
+        const tool = {
+          id: (event.data.toolCallId as string) ?? crypto.randomUUID(),
+          name: (event.data.toolName as string) ?? 'unknown',
+          status: 'running' as const,
+        }
+        updateCurrentTurnParts((parts) => {
+          const last = parts[parts.length - 1]
+          if (last?.kind === 'tool-batch') {
+            return [
+              ...parts.slice(0, -1),
+              { ...last, tools: [...last.tools, tool] },
+            ]
+          }
+          return [...parts, { kind: 'tool-batch', tools: [tool] }]
+        })
+        break
+      }
+
+      case 'tool-end': {
+        const toolId = event.data.toolCallId as string
+        const toolStatus: 'completed' | 'error' =
+          (event.data.status as string) === 'error' ? 'error' : 'completed'
+        const durationMs = event.data.durationMs as number | undefined
+        updateCurrentTurnParts((parts) => {
+          for (let i = parts.length - 1; i >= 0; i--) {
+            const part = parts[i]
+            if (
+              part.kind === 'tool-batch' &&
+              part.tools.some((t) => t.id === toolId)
+            ) {
+              const updatedTools = part.tools.map((t) =>
+                t.id === toolId ? { ...t, status: toolStatus, durationMs } : t,
+              )
+              return [
+                ...parts.slice(0, i),
+                { ...part, tools: updatedTools },
+                ...parts.slice(i + 1),
+              ]
+            }
+          }
+          return parts
+        })
+        break
+      }
+
+      case 'done': {
+        updateCurrentTurnParts((parts) =>
+          parts.map((part) =>
+            part.kind === 'thinking' ? { ...part, done: true } : part,
+          ),
+        )
+        setTurns((prev) => {
+          const last = prev[prev.length - 1]
+          if (!last) return prev
+          const updated = [...prev.slice(0, -1), { ...last, done: true }]
+          persistTurns(updated)
+          return updated
+        })
+        break
+      }
+
+      case 'error': {
+        const msg =
+          (event.data.message as string) ??
+          (event.data.error as string) ??
+          'Unknown error'
+        updateCurrentTurnParts((parts) => [
+          ...parts,
+          { kind: 'text', text: `Error: ${msg}` },
+        ])
+        break
+      }
+    }
+  }
+
+  const send = async (text: string) => {
+    if (!text.trim() || streaming) return
+
+    const turn: AgentConversationTurn = {
+      id: crypto.randomUUID(),
+      userText: text.trim(),
+      parts: [],
+      done: false,
+      timestamp: Date.now(),
+    }
+    setTurns((prev) => [...prev, turn])
+    setStreaming(true)
+    textAccRef.current = ''
+    thinkAccRef.current = ''
+    const abortController = new AbortController()
+    streamAbortRef.current = abortController
+
+    try {
+      const response = await chatWithAgent(
+        agentId,
+        text.trim(),
+        sessionKeyRef.current,
+        abortController.signal,
+      )
+      if (!response.ok) {
+        const err = await response.text()
+        updateCurrentTurnParts((parts) => [
+          ...parts,
+          { kind: 'text', text: `Error: ${err}` },
+        ])
+        return
+      }
+      await consumeSSEStream(
+        response,
+        processStreamEvent,
+        abortController.signal,
+      )
+    } catch (err) {
+      if (abortController.signal.aborted) return
+      const msg = err instanceof Error ? err.message : String(err)
+      updateCurrentTurnParts((parts) => [
+        ...parts,
+        { kind: 'text', text: `Error: ${msg}` },
+      ])
+    } finally {
+      if (streamAbortRef.current === abortController) {
+        streamAbortRef.current = null
+      }
+      setStreaming(false)
+    }
+  }
+
+  const resetConversation = () => {
+    streamAbortRef.current?.abort()
+    streamAbortRef.current = null
+    setTurns([])
+    setStreaming(false)
+    sessionKeyRef.current = crypto.randomUUID()
+  }
+
+  return {
+    turns,
+    streaming,
+    loading,
+    sessionKey: sessionKeyRef.current,
+    send,
+    resetConversation,
+  }
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentChat.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentChat.tsx
@@ -0,0 +1,393 @@
+import {
+  ArrowLeft,
+  Bot,
+  CheckCircle2,
+  Loader2,
+  Send,
+  XCircle,
+} from 'lucide-react'
+import { type FC, useEffect, useRef, useState } from 'react'
+import {
+  Message,
+  MessageContent,
+  MessageResponse,
+} from '@/components/ai-elements/message'
+import {
+  Reasoning,
+  ReasoningContent,
+  ReasoningTrigger,
+} from '@/components/ai-elements/reasoning'
+import { Button } from '@/components/ui/button'
+import { Textarea } from '@/components/ui/textarea'
+import { consumeSSEStream } from '@/lib/sse'
+import { chatWithAgent, type OpenClawStreamEvent } from './useOpenClaw'
+
+interface ToolEntry {
+  id: string
+  name: string
+  status: 'running' | 'completed' | 'error'
+  durationMs?: number
+}
+
+type AssistantPart =
+  | { kind: 'thinking'; text: string; done: boolean }
+  | { kind: 'tool-batch'; tools: ToolEntry[] }
+  | { kind: 'text'; text: string }
+
+interface ChatTurn {
+  id: string
+  userText: string
+  parts: AssistantPart[]
+  done: boolean
+}
+
+interface AgentChatProps {
+  agentId: string
+  agentName: string
+  onBack: () => void
+}
+
+export const AgentChat: FC<AgentChatProps> = ({
+  agentId,
+  agentName,
+  onBack,
+}) => {
+  const [turns, setTurns] = useState<ChatTurn[]>([])
+  const [input, setInput] = useState('')
+  const [streaming, setStreaming] = useState(false)
+  const scrollRef = useRef<HTMLDivElement>(null)
+  const sessionKeyRef = useRef(crypto.randomUUID())
+  const streamAbortRef = useRef<AbortController | null>(null)
+
+  const textAccRef = useRef('')
+  const thinkAccRef = useRef('')
+
+  const scrollToBottom = () => {
+    scrollRef.current?.scrollTo(0, scrollRef.current.scrollHeight)
+  }
+
+  // biome-ignore lint/correctness/useExhaustiveDependencies: scroll on every turns change
+  useEffect(() => {
+    scrollToBottom()
+  }, [turns])
+
+  useEffect(() => {
+    return () => {
+      streamAbortRef.current?.abort()
+    }
+  }, [])
+
+  const updateCurrentTurnParts = (
+    updater: (parts: AssistantPart[]) => AssistantPart[],
+  ) => {
+    setTurns((prev) => {
+      const last = prev[prev.length - 1]
+      if (!last) return prev
+      return [...prev.slice(0, -1), { ...last, parts: updater(last.parts) }]
+    })
+  }
+
+  const processStreamEvent = (event: OpenClawStreamEvent) => {
+    switch (event.type) {
+      case 'text-delta': {
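+        const delta = (event.data.text as string) ?? ''
+        // Append the delta to the trailing text part. Writing the whole
+        // accumulated turn text into a part created after a tool batch would
+        // repeat text that an earlier part already shows.
+        updateCurrentTurnParts((parts) => {
+          const last = parts[parts.length - 1]
+          const head = last?.kind === 'text' ? parts.slice(0, -1) : parts
+          const prior = last?.kind === 'text' ? last.text : ''
+          return [...head, { kind: 'text', text: prior + delta }]
+        })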
+        break
+      }
+
+      case 'thinking': {
+        const delta = (event.data.text as string) ?? ''
+        thinkAccRef.current += delta
+        const text = thinkAccRef.current
+        updateCurrentTurnParts((parts) => {
+          const idx = parts.findIndex((p) => p.kind === 'thinking' && !p.done)
+          if (idx >= 0) {
+            return [
+              ...parts.slice(0, idx),
+              { ...parts[idx], text, done: false },
+              ...parts.slice(idx + 1),
+            ]
+          }
+          return [...parts, { kind: 'thinking', text, done: false }]
+        })
+        break
+      }
+
+      case 'tool-start': {
+        const tool: ToolEntry = {
+          id: (event.data.toolCallId as string) ?? crypto.randomUUID(),
+          name: (event.data.toolName as string) ?? 'unknown',
+          status: 'running',
+        }
+        updateCurrentTurnParts((parts) => {
+          const last = parts[parts.length - 1]
+          if (last?.kind === 'tool-batch') {
+            return [
+              ...parts.slice(0, -1),
+              { ...last, tools: [...last.tools, tool] },
+            ]
+          }
+          return [...parts, { kind: 'tool-batch', tools: [tool] }]
+        })
+        break
+      }
+
+      case 'tool-end': {
+        const toolId = event.data.toolCallId as string
+        const status =
+          (event.data.status as string) === 'error' ? 'error' : 'completed'
+        const durationMs = event.data.durationMs as number | undefined
+        updateCurrentTurnParts((parts) => {
+          for (let i = parts.length - 1; i >= 0; i--) {
+            const part = parts[i]
+            if (
+              part.kind === 'tool-batch' &&
+              part.tools.some((t) => t.id === toolId)
+            ) {
+              const updatedTools = part.tools.map((t) =>
+                t.id === toolId
+                  ? {
+                      ...t,
+                      status: status as ToolEntry['status'],
+                      durationMs,
+                    }
+                  : t,
+              )
+              return [
+                ...parts.slice(0, i),
+                { ...part, tools: updatedTools },
+                ...parts.slice(i + 1),
+              ]
+            }
+          }
+          return parts
+        })
+        break
+      }
+
+      case 'done': {
+        updateCurrentTurnParts((parts) =>
+          parts.map((part) =>
+            part.kind === 'thinking' ? { ...part, done: true } : part,
+          ),
+        )
+        setTurns((prev) => {
+          const last = prev[prev.length - 1]
+          if (!last) return prev
+          return [...prev.slice(0, -1), { ...last, done: true }]
+        })
+        break
+      }
+
+      case 'error': {
+        const msg =
+          (event.data.message as string) ??
+          (event.data.error as string) ??
+          'Unknown error'
+        updateCurrentTurnParts((parts) => [
+          ...parts,
+          { kind: 'text', text: `Error: ${msg}` },
+        ])
+        break
+      }
+    }
+  }
+
+  const handleSend = async () => {
+    const text = input.trim()
+    if (!text || streaming) return
+
+    const turn: ChatTurn = {
+      id: crypto.randomUUID(),
+      userText: text,
+      parts: [],
+      done: false,
+    }
+    setTurns((prev) => [...prev, turn])
+    setInput('')
+    setStreaming(true)
+
+    textAccRef.current = ''
+    thinkAccRef.current = ''
+    const abortController = new AbortController()
+    streamAbortRef.current = abortController
+
+    try {
+      const response = await chatWithAgent(
+        agentId,
+        text,
+        sessionKeyRef.current,
+        abortController.signal,
+      )
+
+      if (!response.ok) {
+        const err = await response.text()
+        updateCurrentTurnParts((parts) => [
+          ...parts,
+          { kind: 'text', text: `Error: ${err}` },
+        ])
+        return
+      }
+
+      await consumeSSEStream(
+        response,
+        processStreamEvent,
+        abortController.signal,
+      )
+    } catch (err) {
+      if (abortController.signal.aborted) return
+      const msg = err instanceof Error ? err.message : String(err)
+      updateCurrentTurnParts((parts) => [
+        ...parts,
+        { kind: 'text', text: `Error: ${msg}` },
+      ])
+    } finally {
+      if (streamAbortRef.current === abortController) {
+        streamAbortRef.current = null
+      }
+      setStreaming(false)
+    }
+  }
+
+  return (
+    <div className="flex h-[calc(100vh-4rem)] flex-col">
+      <div className="flex items-center gap-2 border-b px-4 py-3">
+        <Button variant="ghost" size="icon" onClick={onBack}>
+          <ArrowLeft className="size-4" />
+        </Button>
+        <h2 className="font-semibold text-lg">{agentName}</h2>
+      </div>
+
+      <div ref={scrollRef} className="flex-1 space-y-4 overflow-y-auto p-4">
+        {turns.map((turn) => (
+          <div key={turn.id} className="space-y-3">
+            {/* User message */}
+            <Message from="user">
+              <MessageContent>
+                <pre className="whitespace-pre-wrap font-sans text-sm">
+                  {turn.userText}
+                </pre>
+              </MessageContent>
+            </Message>
+
+            {/* Assistant response — all parts grouped */}
+            {turn.parts.length > 0 && (
+              <Message from="assistant">
+                <MessageContent>
+                  {turn.parts.map((part, i) => {
+                    const key = `${turn.id}-part-${i}`
+
+                    switch (part.kind) {
+                      case 'thinking':
+                        return (
+                          <Reasoning
+                            key={key}
+                            className="w-full"
+                            isStreaming={!part.done}
+                            defaultOpen={!part.done}
+                          >
+                            <ReasoningTrigger />
+                            <ReasoningContent>{part.text}</ReasoningContent>
+                          </Reasoning>
+                        )
+
+                      case 'tool-batch':
+                        return (
+                          <div key={key} className="w-full space-y-1">
+                            {part.tools.map((tool) => (
+                              <div
+                                key={tool.id}
+                                className="flex items-center gap-2 rounded-md border px-3 py-2 text-sm"
+                              >
+                                {tool.status === 'running' && (
+                                  <Loader2 className="size-3.5 animate-spin text-muted-foreground" />
+                                )}
+                                {tool.status === 'completed' && (
+                                  <CheckCircle2 className="size-3.5 text-green-500" />
+                                )}
+                                {tool.status === 'error' && (
+                                  <XCircle className="size-3.5 text-destructive" />
+                                )}
+                                <span className="font-mono text-xs">
+                                  {tool.name}
+                                </span>
+                                {tool.durationMs != null && (
+                                  <span className="ml-auto text-muted-foreground text-xs">
+                                    {(tool.durationMs / 1000).toFixed(1)}s
+                                  </span>
+                                )}
+                              </div>
+                            ))}
+                          </div>
+                        )
+
+                      case 'text':
+                        return (
+                          <MessageResponse key={key}>
+                            {part.text}
+                          </MessageResponse>
+                        )
+                      default:
+                        return null
+                    }
+                  })}
+                </MessageContent>
+              </Message>
+            )}
+
+            {/* Streaming indicator when waiting for first part */}
+            {!turn.done && turn.parts.length === 0 && streaming && (
+              <div className="flex gap-2">
+                <div className="flex h-7 w-7 shrink-0 items-center justify-center rounded-full bg-[var(--accent-orange)] text-white">
+                  <Bot className="h-3.5 w-3.5" />
+                </div>
+                <div className="flex items-center gap-1 rounded-xl rounded-tl-none border border-border/50 bg-card px-3 py-2.5 shadow-sm">
+                  <span className="h-1.5 w-1.5 animate-bounce rounded-full bg-[var(--accent-orange)] [animation-delay:-0.3s]" />
+                  <span className="h-1.5 w-1.5 animate-bounce rounded-full bg-[var(--accent-orange)] [animation-delay:-0.15s]" />
+                  <span className="h-1.5 w-1.5 animate-bounce rounded-full bg-[var(--accent-orange)]" />
+                </div>
+              </div>
+            )}
+          </div>
+        ))}
+      </div>
+
+      <div className="border-t p-4">
+        <div className="flex gap-2">
+          <Textarea
+            value={input}
+            onChange={(e) => setInput(e.target.value)}
+            onKeyDown={(e) => {
+              if (e.key === 'Enter' && !e.shiftKey) {
+                e.preventDefault()
+                handleSend()
+              }
+            }}
+            placeholder="Send a message..."
+            className="min-h-[44px] resize-none"
+            rows={1}
+          />
+          <Button
+            onClick={handleSend}
+            disabled={!input.trim() || streaming}
+            size="icon"
+          >
+            {streaming ? (
+              <Loader2 className="size-4 animate-spin" />
+            ) : (
+              <Send className="size-4" />
+            )}
+          </Button>
+        </div>
+      </div>
+    </div>
+  )
+}
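Review note on AgentChat.tsx: the part-merging rules in processStreamEvent (consecutive text deltas extend the trailing text part, consecutive tool starts extend the trailing tool batch, a tool end updates the matching entry) can be read as a pure function over the parts array. The sketch below is a minimal illustration under that reading, not code from this PR. `applyStreamEvent` and the flattened `StreamEvent` shape are hypothetical names introduced here; the diff itself reads fields from `event.data`.

```typescript
// Hypothetical helper, not part of the diff. Types mirror AgentChat.tsx.
type ToolEntry = {
  id: string
  name: string
  status: 'running' | 'completed' | 'error'
  durationMs?: number
}

type AssistantPart =
  | { kind: 'thinking'; text: string; done: boolean }
  | { kind: 'tool-batch'; tools: ToolEntry[] }
  | { kind: 'text'; text: string }

// Assumed, simplified event shape used only for this sketch.
type StreamEvent =
  | { type: 'text-delta'; text: string }
  | { type: 'tool-start'; id: string; name: string }
  | { type: 'tool-end'; id: string; status: 'completed' | 'error'; durationMs?: number }

function applyStreamEvent(
  parts: AssistantPart[],
  event: StreamEvent,
): AssistantPart[] {
  const last = parts[parts.length - 1]

  if (event.type === 'text-delta') {
    // Extend the trailing text part, or open a new one holding only the delta.
    if (last?.kind === 'text') {
      return [...parts.slice(0, -1), { kind: 'text', text: last.text + event.text }]
    }
    return [...parts, { kind: 'text', text: event.text }]
  }

  if (event.type === 'tool-start') {
    const tool: ToolEntry = { id: event.id, name: event.name, status: 'running' }
    // Group back-to-back tool calls into one batch.
    if (last?.kind === 'tool-batch') {
      return [...parts.slice(0, -1), { kind: 'tool-batch', tools: [...last.tools, tool] }]
    }
    return [...parts, { kind: 'tool-batch', tools: [tool] }]
  }

  // Remaining case: tool-end. Unlike the diff, which stops at the last
  // matching batch, this sketch updates every entry with a matching id.
  const { id, status, durationMs } = event
  return parts.map((part): AssistantPart => {
    if (part.kind !== 'tool-batch') return part
    const tools = part.tools.map((t) =>
      t.id === id ? { ...t, status, durationMs } : t,
    )
    return { kind: 'tool-batch', tools }
  })
}
```

Because the function depends only on its arguments, it can be passed to a state updater or unit-tested without rendering the component.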
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentTerminal.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentTerminal.tsx
@@ -0,0 +1,280 @@
+import {
+  OPENCLAW_CONTAINER_HOME,
+  OPENCLAW_TERMINAL_SHELL,
+} from '@browseros/shared/constants/openclaw'
+import { FitAddon } from '@xterm/addon-fit'
+import { WebLinksAddon } from '@xterm/addon-web-links'
+import { Terminal } from '@xterm/xterm'
+import { ArrowLeft } from 'lucide-react'
+import { type FC, useEffect, useRef } from 'react'
+import '@xterm/xterm/css/xterm.css'
+import { Button } from '@/components/ui/button'
+import { getAgentServerUrl } from '@/lib/browseros/helpers'
+
+interface AgentTerminalProps {
+  onBack: () => void
+}
+
+type TerminalServerMessage =
+  | { type: 'output'; data: string }
+  | { type: 'exit'; exitCode: number }
+  | { type: 'error'; message: string }
+
+const TERMINAL_HOME_DIR = OPENCLAW_CONTAINER_HOME
+const TERMINAL_FONT_FAMILY =
+  '"Geist Mono", Menlo, Monaco, "Courier New", monospace'
+
+function resolveCssColor(variableName: string): string {
+  const probe = document.createElement('div')
+  probe.style.position = 'fixed'
+  probe.style.visibility = 'hidden'
+  probe.style.pointerEvents = 'none'
+  probe.style.color = `var(${variableName})`
+  document.body.append(probe)
+  const color = window.getComputedStyle(probe).color
+  probe.remove()
+  return color
+}
+
+function withAlpha(color: string, alpha: number): string {
+  const channels = color.match(/[\d.]+/g)
+  if (!channels || channels.length < 3) return color
+  const [red, green, blue] = channels
+  return `rgb(${red} ${green} ${blue} / ${alpha})`
+}
+
+function createTerminalTheme() {
+  const isDark = document.documentElement.classList.contains('dark')
+  const background = resolveCssColor('--background')
+  const foreground = resolveCssColor('--foreground')
+  const muted = resolveCssColor('--muted-foreground')
+  const accent = resolveCssColor('--accent-orange')
+
+  return {
+    background,
+    foreground,
+    cursor: foreground,
+    cursorAccent: background,
+    selectionBackground: withAlpha(accent, isDark ? 0.3 : 0.2),
+    selectionForeground: foreground,
+    black: isDark ? '#16131a' : '#1f1b22',
+    red: isDark ? '#ef8c7c' : '#c25544',
+    green: isDark ? '#9ac67c' : '#5c8754',
+    yellow: isDark ? '#e5c07b' : '#b7791f',
+    blue: isDark ? '#8ba9ff' : '#4667d8',
+    magenta: isDark ? '#d2a8ff' : '#955ec7',
+    cyan: isDark ? '#7fd4d1' : '#0f766e',
+    white: isDark ? '#e8e0d9' : '#f7f1eb',
+    brightBlack: muted,
+    brightRed: isDark ? '#ffb0a4' : '#dc7b6d',
+    brightGreen: isDark ? '#b6d99e' : '#7bab74',
+    brightYellow: isDark ? '#f2d59b' : '#d49a44',
+    brightBlue: isDark ? '#b3c4ff' : '#6f8cf0',
+    brightMagenta: isDark ? '#e2c6ff' : '#b789dd',
+    brightCyan: isDark ? '#a6ece7' : '#3aa5a0',
+    brightWhite: isDark ? '#fff8f1' : '#ffffff',
+  }
+}
+
+function parseTerminalMessage(data: unknown): TerminalServerMessage | null {
+  if (typeof data !== 'string') return null
+
+  let parsed: unknown
+  try {
+    parsed = JSON.parse(data) as unknown
+  } catch {
+    return null
+  }
+  if (
+    parsed &&
+    typeof parsed === 'object' &&
+    'type' in parsed &&
+    parsed.type === 'output' &&
+    'data' in parsed &&
+    typeof parsed.data === 'string'
+  ) {
+    return { type: 'output', data: parsed.data }
+  }
+  if (
+    parsed &&
+    typeof parsed === 'object' &&
+    'type' in parsed &&
+    parsed.type === 'error' &&
+    'message' in parsed &&
+    typeof parsed.message === 'string'
+  ) {
+    return { type: 'error', message: parsed.message }
+  }
+  if (
+    parsed &&
+    typeof parsed === 'object' &&
+    'type' in parsed &&
+    parsed.type === 'exit' &&
+    'exitCode' in parsed &&
+    typeof parsed.exitCode === 'number'
+  ) {
+    return { type: 'exit', exitCode: parsed.exitCode }
+  }
+  return null
+}
+
+export const AgentTerminal: FC<AgentTerminalProps> = ({ onBack }) => {
+  const containerRef = useRef<HTMLDivElement>(null)
+
+  useEffect(() => {
+    if (!containerRef.current) return
+
+    const terminal = new Terminal({
+      fontSize: 14,
+      fontFamily: TERMINAL_FONT_FAMILY,
+      cursorBlink: true,
+      cursorStyle: 'block',
+      lineHeight: 1.25,
+      scrollback: 8000,
+      theme: createTerminalTheme(),
+    })
+
+    const fitAddon = new FitAddon()
+    terminal.loadAddon(fitAddon)
+    terminal.loadAddon(new WebLinksAddon())
+    terminal.open(containerRef.current)
+
+    let ws: WebSocket | null = null
+    let sawExit = false
+
+    const applyTheme = (): void => {
+      terminal.options.theme = createTerminalTheme()
+    }
+
+    const sendMessage = (
+      message:
+        | { type: 'input'; data: string }
+        | { type: 'resize'; cols: number; rows: number },
+    ): void => {
+      if (ws?.readyState !== WebSocket.OPEN) return
+      ws.send(JSON.stringify(message))
+    }
+
+    const sendResize = (cols = terminal.cols, rows = terminal.rows): void => {
+      sendMessage({ type: 'resize', cols, rows })
+    }
+
+    const connect = async () => {
+      const baseUrl = await getAgentServerUrl()
+      const wsUrl = new URL('/terminal/ws', baseUrl)
+      wsUrl.protocol = wsUrl.protocol === 'https:' ? 'wss:' : 'ws:'
+
+      ws = new WebSocket(wsUrl)
+
+      ws.onopen = () => {
+        fitAddon.fit()
+        terminal.focus()
+        sendResize()
+      }
+
+      ws.onmessage = (event) => {
+        const message = parseTerminalMessage(event.data)
+        if (!message) return
+
+        if (message.type === 'output') {
+          terminal.write(message.data)
+        } else if (message.type === 'error') {
+          terminal.write(`\r\n\x1b[31m${message.message}\x1b[0m\r\n`)
+        } else {
+          sawExit = true
+          terminal.write(
+            `\r\n\x1b[90m[session ended with exit ${message.exitCode}]\x1b[0m\r\n`,
+          )
+        }
+      }
+
+      ws.onclose = () => {
+        if (sawExit) return
+        terminal.write('\r\n\x1b[90m[session ended]\x1b[0m\r\n')
+      }
+
+      ws.onerror = () => {
+        terminal.write('\r\n\x1b[31m[connection error]\x1b[0m\r\n')
+      }
+
+      const inputDisposable = terminal.onData((data) => {
+        sendMessage({ type: 'input', data })
+      })
+
+      const resizeDisposable = terminal.onResize(({ cols, rows }) => {
+        sendResize(cols, rows)
+      })
+
+      return () => {
+        inputDisposable.dispose()
+        resizeDisposable.dispose()
+      }
+    }
+
+    let disposeSocketBindings: (() => void) | undefined
+    void connect().then((disposeBindings) => {
+      disposeSocketBindings = disposeBindings
+    })
+
+    const resizeObserver = new ResizeObserver(() => {
+      fitAddon.fit()
+      sendResize()
+    })
+    resizeObserver.observe(containerRef.current)
+
+    const themeObserver = new MutationObserver(() => {
+      applyTheme()
+    })
+    themeObserver.observe(document.documentElement, {
+      attributes: true,
+      attributeFilter: ['class'],
+    })
+
+    return () => {
+      resizeObserver.disconnect()
+      themeObserver.disconnect()
+      disposeSocketBindings?.()
+      ws?.close()
+      terminal.dispose()
+    }
+  }, [])
+
+  return (
+    <div className="flex h-[calc(100dvh-10rem)] min-h-[32rem] w-full flex-col py-2 sm:min-h-[42rem] sm:py-4">
+      <div className="flex min-h-0 flex-1 flex-col overflow-hidden rounded-xl border border-border bg-card shadow-sm">
+        <div className="flex items-center gap-3 border-border border-b px-4 py-3 sm:px-6">
+          <div className="flex min-w-0 items-center gap-3">
+            <Button variant="ghost" size="icon" onClick={onBack}>
+              <ArrowLeft className="size-4" />
+            </Button>
+            <div className="min-w-0">
+              <div className="truncate font-semibold text-sm">
+                Container Terminal
+              </div>
+              <div className="truncate text-muted-foreground text-sm">
+                OpenClaw shell in {TERMINAL_HOME_DIR}
+              </div>
+            </div>
+          </div>
+        </div>
+
+        <div className="min-h-0 flex-1 p-4 sm:p-6">
+          <div className="agent-terminal-shell flex h-full min-h-0 flex-col overflow-hidden rounded-lg border border-border bg-background">
+            <div className="flex items-center justify-between gap-3 border-border border-b px-4 py-2.5">
+              <div className="truncate font-mono text-muted-foreground text-xs">
+                {TERMINAL_HOME_DIR}
+              </div>
+              <div className="font-mono text-[11px] text-muted-foreground">
+                {OPENCLAW_TERMINAL_SHELL.split('/').pop()}
+              </div>
+            </div>
+
+            <div className="min-h-0 flex-1 px-4 py-4 sm:px-5 sm:py-5">
+              <div ref={containerRef} className="h-full w-full" />
+            </div>
+          </div>
+        </div>
+      </div>
+    </div>
+  )
+}
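Review note on AgentTerminal.tsx: the three branches of parseTerminalMessage repeat the same object check. One possible tightening, shown here only as a sketch, is a shared record guard followed by per-type field checks. `isRecord` is a hypothetical helper that does not appear in the diff; the accepted message shapes are the ones declared in TerminalServerMessage.

```typescript
type TerminalServerMessage =
  | { type: 'output'; data: string }
  | { type: 'exit'; exitCode: number }
  | { type: 'error'; message: string }

// Hypothetical helper: narrows unknown to a plain key/value record.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

function parseTerminalMessage(data: unknown): TerminalServerMessage | null {
  if (typeof data !== 'string') return null

  let parsed: unknown
  try {
    parsed = JSON.parse(data)
  } catch {
    // Not JSON: treat as an unrecognized frame.
    return null
  }
  if (!isRecord(parsed)) return null

  // Pull fields into locals so the typeof checks narrow them.
  const { type, data: payload, message, exitCode } = parsed
  if (type === 'output' && typeof payload === 'string') {
    return { type: 'output', data: payload }
  }
  if (type === 'error' && typeof message === 'string') {
    return { type: 'error', message }
  }
  if (type === 'exit' && typeof exitCode === 'number') {
    return { type: 'exit', exitCode }
  }
  return null
}
```

Frames that are not strings, not JSON, or that carry a field of the wrong type all map to null, which matches how the onmessage handler in the diff silently skips unknown messages.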
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx
--- a/packages/browseros-agent/apps/agent/entrypoints/app/agents/useOpenClaw.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/agents/useOpenClaw.ts
@@ -0,0 +1,330 @@
+import type {
+  BrowserOSAgentRoleId,
+  BrowserOSCustomRoleInput,
+} from '@browseros/shared/types/role-aware-agents'
+import { useMutation, useQuery, useQueryClient } from '@tanstack/react-query'
+import { getAgentServerUrl } from '@/lib/browseros/helpers'
+import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
+
+export interface AgentEntry {
+  agentId: string
+  name: string
+  workspace: string
+  model?: unknown
+  role?: {
+    roleSource: 'builtin' | 'custom'
+    roleId?: BrowserOSAgentRoleId
+    roleName: string
+    shortDescription: string
+  }
+}
+
+export interface RoleTemplateSummary {
+  id: BrowserOSAgentRoleId
+  name: string
+  shortDescription: string
+  longDescription: string
+  recommendedApps: string[]
+  defaultAgentName: string
+  boundaries: Array<{
+    key: string
+    label: string
+    description: string
+    defaultMode: 'allow' | 'ask' | 'block'
+  }>
+}
+
+export interface OpenClawStatus {
+  status: 'uninitialized' | 'starting' | 'running' | 'stopped' | 'error'
+  podmanAvailable: boolean
+  machineReady: boolean
+  port: number | null
+  agentCount: number
+  error: string | null
+  controlPlaneStatus:
+    | 'disconnected'
+    | 'connecting'
+    | 'connected'
+    | 'reconnecting'
+    | 'recovering'
+    | 'failed'
+  lastGatewayError: string | null
+  lastRecoveryReason:
+    | 'transient_disconnect'
+    | 'signature_expired'
+    | 'pairing_required'
+    | 'token_mismatch'
+    | 'container_not_ready'
+    | 'unknown'
+    | null
+}
+
+export interface OpenClawAgentMutationInput {
+  name: string
+  roleId?: BrowserOSAgentRoleId
+  customRole?: BrowserOSCustomRoleInput
+  providerType?: string
+  providerName?: string
+  baseUrl?: string
+  apiKey?: string
+  modelId?: string
+}
+
+export interface OpenClawSetupInput {
+  providerType?: string
+  providerName?: string
+  baseUrl?: string
+  apiKey?: string
+  modelId?: string
+}
+
+export function getModelDisplayName(model: unknown): string | undefined {
+  if (typeof model === 'string') return model.split('/').pop()
+  return undefined
+}
+
+export const OPENCLAW_QUERY_KEYS = {
+  status: 'openclaw-status',
+  agents: 'openclaw-agents',
+  roles: 'openclaw-roles',
+} as const
+
+async function clawFetch<T>(
+  baseUrl: string,
+  path: string,
+  init?: RequestInit,
+): Promise<T> {
+  const res = await fetch(`${baseUrl}/claw${path}`, init)
+  if (!res.ok) {
+    let message = `Request failed with status ${res.status}`
+    try {
+      const body = (await res.json()) as { error?: string }
+      if (body.error) {
+        message = body.error
+      }
+    } catch {}
+    throw new Error(message)
+  }
+  return res.json() as Promise<T>
+}
+
+async function fetchOpenClawStatus(baseUrl: string): Promise<OpenClawStatus> {
+  return clawFetch<OpenClawStatus>(baseUrl, '/status')
+}
+
+async function fetchOpenClawAgents(baseUrl: string): Promise<AgentEntry[]> {
+  const data = await clawFetch<{ agents: AgentEntry[] }>(baseUrl, '/agents')
+  return data.agents ?? []
+}
+
+async function fetchOpenClawRoles(
+  baseUrl: string,
+): Promise<RoleTemplateSummary[]> {
+  const data = await clawFetch<{ roles: RoleTemplateSummary[] }>(
+    baseUrl,
+    '/roles',
+  )
+  return data.roles ?? []
+}
+
+async function invalidateOpenClawQueries(
+  queryClient: ReturnType<typeof useQueryClient>,
+): Promise<void> {
+  await Promise.all([
+    queryClient.invalidateQueries({ queryKey: [OPENCLAW_QUERY_KEYS.status] }),
+    queryClient.invalidateQueries({ queryKey: [OPENCLAW_QUERY_KEYS.agents] }),
+  ])
+}
+
+export function useOpenClawStatus(pollMs = 5000) {
+  const {
+    baseUrl,
+    isLoading: urlLoading,
+    error: urlError,
+  } = useAgentServerUrl()
+
+  const query = useQuery<OpenClawStatus, Error>({
+    queryKey: [OPENCLAW_QUERY_KEYS.status, baseUrl],
+    queryFn: () => fetchOpenClawStatus(baseUrl as string),
+    enabled: !!baseUrl && !urlLoading,
+    refetchInterval: pollMs,
+  })
+
+  return {
+    status: query.data ?? null,
+    loading: query.isLoading || urlLoading,
+    error: query.error ?? urlError,
+    refetch: query.refetch,
+  }
+}
+
+export function useOpenClawAgents(enabled = true) {
+  const {
+    baseUrl,
+    isLoading: urlLoading,
+    error: urlError,
+  } = useAgentServerUrl()
+
+  const query = useQuery<AgentEntry[], Error>({
+    queryKey: [OPENCLAW_QUERY_KEYS.agents, baseUrl],
+    queryFn: () => fetchOpenClawAgents(baseUrl as string),
+    enabled: !!baseUrl && !urlLoading && enabled,
+  })
+
+  return {
+    agents: query.data ?? [],
+    loading: query.isLoading || urlLoading,
+    error: query.error ?? urlError,
+    refetch: query.refetch,
+  }
+}
+
+export function useOpenClawRoles() {
+  const {
+    baseUrl,
+    isLoading: urlLoading,
+    error: urlError,
+  } = useAgentServerUrl()
+
+  const query = useQuery<RoleTemplateSummary[], Error>({
+    queryKey: [OPENCLAW_QUERY_KEYS.roles, baseUrl],
+    queryFn: () => fetchOpenClawRoles(baseUrl as string),
+    enabled: !!baseUrl && !urlLoading,
+    staleTime: 60_000,
+  })
+
+  return {
+    roles: query.data ?? [],
+    loading: query.isLoading || urlLoading,
+    error: query.error ?? urlError,
+    refetch: query.refetch,
+  }
+}
+
+export function useOpenClawMutations() {
+  const { baseUrl, isLoading: urlLoading } = useAgentServerUrl()
+  const queryClient = useQueryClient()
+
+  const ensureBaseUrl = () => {
+    if (!baseUrl || urlLoading) {
+      throw new Error('BrowserOS agent server URL is not ready')
+    }
+    return baseUrl
+  }
+
+  const onSuccess = () => invalidateOpenClawQueries(queryClient)
+
+  const setupMutation = useMutation({
+    mutationFn: async (input: OpenClawSetupInput) =>
+      clawFetch<{ status: string; agents: AgentEntry[] }>(
+        ensureBaseUrl(),
+        '/setup',
+        {
+          method: 'POST',
+          headers: { 'Content-Type': 'application/json' },
+          body: JSON.stringify(input),
+        },
+      ),
+    onSuccess,
+  })
+
+  const createMutation = useMutation({
+    mutationFn: async (input: OpenClawAgentMutationInput) =>
+      clawFetch<{ agent: AgentEntry }>(ensureBaseUrl(), '/agents', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify(input),
+      }),
+    onSuccess,
+  })
+
+  const deleteMutation = useMutation({
+    mutationFn: async (id: string) =>
+      clawFetch<{ success: boolean }>(ensureBaseUrl(), `/agents/${id}`, {
+        method: 'DELETE',
+      }),
+    onSuccess,
+  })
+
+  const startMutation = useMutation({
+    mutationFn: async () =>
+      clawFetch<{ status: string }>(ensureBaseUrl(), '/start', {
+        method: 'POST',
+      }),
+    onSuccess,
+  })
+
+  const stopMutation = useMutation({
+    mutationFn: async () =>
+      clawFetch<{ status: string }>(ensureBaseUrl(), '/stop', {
+        method: 'POST',
+      }),
+    onSuccess,
+  })
+
+  const restartMutation = useMutation({
+    mutationFn: async () =>
+      clawFetch<{ status: string }>(ensureBaseUrl(), '/restart', {
+        method: 'POST',
+      }),
+    onSuccess,
+  })
+
+  const reconnectMutation = useMutation({
+    mutationFn: async () =>
+      clawFetch<{ status: string }>(ensureBaseUrl(), '/reconnect', {
+        method: 'POST',
+      }),
+    onSuccess,
+  })
+
+  return {
+    setupOpenClaw: setupMutation.mutateAsync,
+    createAgent: createMutation.mutateAsync,
+    deleteAgent: deleteMutation.mutateAsync,
+    startOpenClaw: startMutation.mutateAsync,
+    stopOpenClaw: stopMutation.mutateAsync,
+    restartOpenClaw: restartMutation.mutateAsync,
+    reconnectOpenClaw: reconnectMutation.mutateAsync,
+    actionInProgress:
+      setupMutation.isPending ||
+      createMutation.isPending ||
+      deleteMutation.isPending ||
+      startMutation.isPending ||
+      stopMutation.isPending ||
+      restartMutation.isPending ||
+      reconnectMutation.isPending,
+    settingUp: setupMutation.isPending,
+    creating: createMutation.isPending,
+    deleting: deleteMutation.isPending,
+    reconnecting: reconnectMutation.isPending,
+  }
+}
+
+export interface OpenClawStreamEvent {
+  type:
+    | 'text-delta'
+    | 'thinking'
+    | 'tool-start'
+    | 'tool-end'
+    | 'tool-output'
+    | 'lifecycle'
+    | 'done'
+    | 'error'
+  data: Record<string, unknown>
+}
+
+export async function chatWithAgent(
+  agentId: string,
+  message: string,
+  sessionKey?: string,
+  signal?: AbortSignal,
+): Promise<Response> {
+  const baseUrl = await getAgentServerUrl()
+  return fetch(`${baseUrl}/claw/agents/${encodeURIComponent(agentId)}/chat`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ message, sessionKey }),
+    signal,
+  })
+}
--- a/packages/browseros-agent/apps/agent/entrypoints/app/ai-settings/NewProviderDialog.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/ai-settings/NewProviderDialog.tsx
@@ -64,7 +64,6 @@ import {
 import {
  getDefaultBaseUrlForProviders,
  getProviderTemplate,
-  MINIMAX_REGIONS,
  providerTypeOptions,
 } from '@/lib/llm-providers/providerTemplates'
 import { type TestResult, testProvider } from '@/lib/llm-providers/testProvider'
@@ -88,7 +87,6 @@ const providerTypeEnum = z.enum([
  'chatgpt-pro',
  'github-copilot',
  'qwen-code',
-  'minimax',
 ])

 /**
@@ -107,7 +105,7 @@ export const providerFormSchema = z
    temperature: z.number().min(0).max(2),
    // Azure-specific
    resourceName: z.string().optional(),
-    // Bedrock-specific / MiniMax region
+    // Bedrock-specific
    accessKeyId: z.string().optional(),
    secretAccessKey: z.string().optional(),
    region: z.string().optional(),
@@ -166,30 +164,6 @@ export const providerFormSchema = z
    ) {
      // No validation needed — OAuth tokens are on the server
    }
-    // MiniMax: require baseUrl + apiKey
-    else if (data.type === 'minimax') {
-      if (!data.baseUrl) {
-        ctx.addIssue({
-          code: z.ZodIssueCode.custom,
-          message: 'Base URL is required',
-          path: ['baseUrl'],
-        })
-      } else if (!/^https?:\/\/.+/.test(data.baseUrl)) {
-        ctx.addIssue({
-          code: z.ZodIssueCode.custom,
-          message: 'Must be a valid URL',
-          path: ['baseUrl'],
-        })
-      }
-
-      if (!data.apiKey?.trim()) {
-        ctx.addIssue({
-          code: z.ZodIssueCode.custom,
-          message: 'API Key is required',
-          path: ['apiKey'],
-        })
-      }
-    }
    // Other providers: require baseUrl
    else if (!data.baseUrl) {
      ctx.addIssue({
@@ -342,9 +316,6 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
    if (defaultUrl) {
      form.setValue('baseUrl', defaultUrl)
    }
-    if (newType === 'minimax') {
-      form.setValue('region', 'chinese')
-    }
    form.setValue('modelId', '')
  }

@@ -751,94 +722,6 @@ export const NewProviderDialog: FC<NewProviderDialogProps> = ({
      )
    }

-    // Minimax: region selector
-    if (watchedType === 'minimax') {
-      return (
-        <>
-          <FormField
-            control={form.control}
-            name="region"
-            render={({ field }) => (
-              <FormItem>
-                <FormLabel>Region *</FormLabel>
-                <Select
-                  onValueChange={(v) => {
-                    field.onChange(v)
-                    form.setValue(
-                      'baseUrl',
-                      MINIMAX_REGIONS[v as keyof typeof MINIMAX_REGIONS].api,
-                    )
-                  }}
-                  value={field.value || 'chinese'}
-                >
-                  <FormControl>
-                    <SelectTrigger className="w-full">
-                      <SelectValue />
-                    </SelectTrigger>
-                  </FormControl>
-                  <SelectContent>
-                    <SelectItem value="chinese">
-                      Chinese (api.minimaxi.com)
-                    </SelectItem>
-                    <SelectItem value="international">
-                      International (api.minimax.io)
-                    </SelectItem>
-                  </SelectContent>
-                </Select>
-                <FormDescription>
-                  Choose the endpoint closest to your location
-                </FormDescription>
-                <FormMessage />
-              </FormItem>
-            )}
-          />
-          <FormField
-            control={form.control}
-            name="baseUrl"
-            render={({ field }) => (
-              <FormItem>
-                <FormLabel>Base URL *</FormLabel>
-                <FormControl>
-                  <Input placeholder="https://api.minimaxi.com/v1" {...field} />
-                </FormControl>
-                <FormMessage />
-              </FormItem>
-            )}
-          />
-          <FormField
-            control={form.control}
-            name="apiKey"
-            render={({ field }) => (
-              <FormItem>
-                <FormLabel>API Key *</FormLabel>
-                <FormControl>
-                  <Input
-                    type="password"
-                    placeholder="Enter your MiniMax API key"
-                    {...field}
-                  />
-                </FormControl>
-                <FormDescription>
-                  Your API key is encrypted and stored locally.{' '}
-                  {setupGuideUrl && (
-                    <a
-                      href={setupGuideUrl}
-                      onClick={handleSetupGuideClick}
-                      className="inline-flex cursor-pointer items-center gap-1 text-primary hover:underline"
-                    >
-                      <ExternalLink className="h-3 w-3" />
-                      {setupGuideText}
-                    </a>
-                  )}
-                </FormDescription>
-                <FormMessage />
-              </FormItem>
-            )}
-          />
-        </>
-      )
-    }
-
    // Standard providers (OpenAI, Anthropic, Google, etc.)
    return (
      <>
--- a/packages/browseros-agent/apps/agent/entrypoints/app/tool-approvals/ToolApprovalsPage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/tool-approvals/ToolApprovalsPage.tsx
@@ -0,0 +1,120 @@
+import {
+  Bot,
+  Camera,
+  Code,
+  Database,
+  Eye,
+  Hand,
+  MousePointerClick,
+  Navigation,
+} from 'lucide-react'
+import { type FC, useEffect, useState } from 'react'
+import { Switch } from '@/components/ui/switch'
+import {
+  normalizeToolApprovalConfig,
+  toolApprovalConfigStorage,
+} from '@/lib/tool-approvals/storage'
+import {
+  TOOL_CATEGORIES,
+  type ToolApprovalConfig,
+} from '@/lib/tool-approvals/types'
+
+const CATEGORY_ICONS: Record<string, typeof Hand> = {
+  input: MousePointerClick,
+  navigation: Navigation,
+  observation: Eye,
+  screenshots: Camera,
+  scripts: Code,
+  'data-modification': Database,
+  assistant: Bot,
+}
+
+export const ToolApprovalsPage: FC = () => {
+  const [config, setConfig] = useState<ToolApprovalConfig>({ categories: {} })
+
+  useEffect(() => {
+    const applyConfig = (value: ToolApprovalConfig) =>
+      setConfig(normalizeToolApprovalConfig(value))
+
+    toolApprovalConfigStorage.getValue().then(applyConfig)
+    const unwatch = toolApprovalConfigStorage.watch(applyConfig)
+    return () => unwatch()
+  }, [])
+
+  const allEnabled =
+    TOOL_CATEGORIES.length > 0 &&
+    TOOL_CATEGORIES.every((category) => config.categories[category.id] === true)
+
+  const toggleCategory = (categoryId: string, enabled: boolean) => {
+    const next = {
+      ...config,
+      categories: { ...config.categories, [categoryId]: enabled },
+    }
+    setConfig(next)
+    toolApprovalConfigStorage.setValue(normalizeToolApprovalConfig(next))
+  }
+
+  const toggleAll = (enabled: boolean) => {
+    const categories: Record<string, boolean> = {}
+    for (const cat of TOOL_CATEGORIES) {
+      categories[cat.id] = enabled
+    }
+    const next = { ...config, categories }
+    setConfig(next)
+    toolApprovalConfigStorage.setValue(normalizeToolApprovalConfig(next))
+  }
+
+  return (
+    <div className="space-y-6">
+      <div>
+        <h2 className="font-semibold text-xl tracking-tight">Tool Approvals</h2>
+        <p className="text-muted-foreground text-sm">
+          Require human approval before the agent executes certain actions.
+          Changes apply immediately.
+        </p>
+      </div>
+
+      <div className="flex items-center justify-between rounded-lg border bg-card p-4">
+        <div className="space-y-0.5">
+          <div className="font-medium text-sm">Require approval for all</div>
+          <div className="text-muted-foreground text-xs">
+            Toggle all categories at once
+          </div>
+        </div>
+        <Switch checked={allEnabled} onCheckedChange={toggleAll} />
+      </div>
+
+      <div className="space-y-3">
+        {TOOL_CATEGORIES.map((category) => {
+          const Icon = CATEGORY_ICONS[category.id] ?? Hand
+          const enabled = config.categories[category.id] ?? false
+
+          return (
+            <div
+              key={category.id}
+              className="flex items-start gap-4 rounded-lg border bg-card p-4 transition-colors"
+            >
+              <div className="mt-0.5 flex size-9 shrink-0 items-center justify-center rounded-md bg-muted">
+                <Icon className="size-4 text-muted-foreground" />
+              </div>
+              <div className="min-w-0 flex-1 space-y-1">
+                <div className="flex items-center gap-2">
+                  <span className="font-medium text-sm">{category.name}</span>
+                </div>
+                <p className="text-muted-foreground text-xs">
+                  {category.description}
+                </p>
+              </div>
+              <Switch
+                checked={enabled}
+                onCheckedChange={(checked) =>
+                  toggleCategory(category.id, checked)
+                }
+              />
+            </div>
+          )
+        })}
+      </div>
+    </div>
+  )
+}
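Review note: the toggle logic in ToolApprovalsPage can be read as three pure operations on the config object. The block below is a minimal sketch of that logic, not code from this change. The helper names (`setOne`, `setAll`, `allEnabled`) are hypothetical, the `ToolApprovalConfig` shape is assumed to carry only the `categories` map used by the page, and the category ids are copied from the CATEGORY_ICONS keys rather than from the real TOOL_CATEGORIES list.

```typescript
// Assumed shape: the page only reads and writes `categories`.
type ToolApprovalConfig = { categories: Record<string, boolean> }
type ToolCategory = { id: string }

// Assumed ids, taken from the CATEGORY_ICONS keys in the page.
const CATEGORIES: ToolCategory[] = [
  { id: 'input' },
  { id: 'navigation' },
  { id: 'observation' },
  { id: 'screenshots' },
  { id: 'scripts' },
  { id: 'data-modification' },
  { id: 'assistant' },
]

// Hypothetical helper: flip one category without mutating the input.
function setOne(
  config: ToolApprovalConfig,
  categoryId: string,
  enabled: boolean,
): ToolApprovalConfig {
  return {
    ...config,
    categories: { ...config.categories, [categoryId]: enabled },
  }
}

// Hypothetical helper: set every known category to the same value.
function setAll(
  config: ToolApprovalConfig,
  categories: ToolCategory[],
  enabled: boolean,
): ToolApprovalConfig {
  const next: Record<string, boolean> = {}
  for (const cat of categories) {
    next[cat.id] = enabled
  }
  return { ...config, categories: next }
}

// Mirrors the page's `allEnabled`: false for an empty category list,
// and a missing entry counts as "not enabled".
function allEnabled(
  config: ToolApprovalConfig,
  categories: ToolCategory[],
): boolean {
  return (
    categories.length > 0 &&
    categories.every((cat) => config.categories[cat.id] === true)
  )
}
```

Keeping these as pure functions would let the master switch and the per-category switches be unit-tested without rendering the page or touching storage.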
--- a/packages/browseros-agent/apps/agent/entrypoints/app/usage/UsagePage.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/app/usage/UsagePage.tsx
@@ -1,6 +1,5 @@
-import { AlertCircle, Clock, Coins, Gift, Zap } from 'lucide-react'
+import { AlertCircle, Clock, Coins, CreditCard, Zap } from 'lucide-react'
 import type { FC } from 'react'
-import { ShareForCredits } from '@/components/referral/ShareForCredits'
 import { Button } from '@/components/ui/button'
 import {
  getCreditBarColor,
@@ -44,10 +43,8 @@ export const UsagePage: FC = () => {
  }

  const credits = data?.credits ?? 0
-  const total = data?.dailyLimit ?? 50
+  const total = data?.dailyLimit ?? 100
  const percentage = Math.min((credits / total) * 100, 100)
-  const bonusCredits = Math.max(0, credits - total)
-  const creditsUsed = Math.max(0, total - credits)

  return (
    <div className="space-y-6 p-6">
@@ -98,32 +95,30 @@ export const UsagePage: FC = () => {
          <div className="flex items-center gap-2.5 rounded-lg bg-muted/50 px-3 py-2.5">
            <Zap className="h-4 w-4 shrink-0 text-muted-foreground" />
            <div>
-              {bonusCredits > 0 ? (
-                <>
-                  <p className="font-medium text-xs">Bonus credits</p>
-                  <p className="text-muted-foreground text-xs">
-                    +{bonusCredits} from referrals
-                  </p>
-                </>
-              ) : (
-                <>
-                  <p className="font-medium text-xs">Credits used today</p>
-                  <p className="text-muted-foreground text-xs">
-                    {creditsUsed} of {total}
-                  </p>
-                </>
-              )}
+              <p className="font-medium text-xs">Credits used today</p>
+              <p className="text-muted-foreground text-xs">
+                {Math.max(0, total - credits)} of {total}
+              </p>
            </div>
          </div>
        </div>
      </div>

      <div className="rounded-xl border p-5">
-        <div className="mb-4 flex items-center gap-2">
-          <Gift className="h-5 w-5 text-muted-foreground" />
-          <span className="font-semibold text-sm">Earn More Credits</span>
+        <div className="flex items-center gap-3">
+          <CreditCard className="h-5 w-5 text-muted-foreground" />
+          <div>
+            <p className="flex items-center gap-2 font-semibold text-sm">
+              Need more credits?
+              <span className="rounded-full bg-muted px-2 py-0.5 font-medium text-[10px] text-muted-foreground uppercase tracking-wide">
+                Coming soon
+              </span>
+            </p>
+            <p className="text-muted-foreground text-xs">
+              Additional credit packages will be available soon
+            </p>
+          </div>
        </div>
-        <ShareForCredits />
      </div>

      <div className="rounded-xl border border-[var(--accent-orange)]/30 bg-[var(--accent-orange)]/5 p-5">
--- a/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/NewTabLayout.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/NewTabLayout.tsx
@@ -2,21 +2,20 @@ import type { FC } from 'react'
 import { Outlet, useLocation } from 'react-router'
 import { ChatSessionProvider } from '@/entrypoints/sidepanel/layout/ChatSessionContext'
 import { NewTabFocusGrid } from './NewTabFocusGrid'
-
-const HIDE_FOCUS_GRID_PATHS = new Set([
-  '/home/soul',
-  '/home/memory',
-  '/home/skills',
-  '/home/chat',
-])
+import { shouldHideFocusGrid, shouldUseChatSession } from './route-utils'

 export const NewTabLayout: FC = () => {
  const location = useLocation()
-
-  return (
-    <ChatSessionProvider origin="newtab">
-      {!HIDE_FOCUS_GRID_PATHS.has(location.pathname) && <NewTabFocusGrid />}
+  const hideGrid = shouldHideFocusGrid(location.pathname)
+  const useChatSession = shouldUseChatSession(location.pathname)
+  const content = (
+    <>
+      {!hideGrid && <NewTabFocusGrid />}
      <Outlet />
-    </ChatSessionProvider>
+    </>
  )
+
+  if (!useChatSession) return content
+
+  return <ChatSessionProvider origin="newtab">{content}</ChatSessionProvider>
 }
--- a/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/route-utils.test.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/route-utils.test.ts
@@ -0,0 +1,27 @@
+import { describe, expect, it } from 'bun:test'
+import {
+  isAgentCommandPath,
+  isAgentConversationPath,
+  shouldHideFocusGrid,
+  shouldUseChatSession,
+} from './route-utils'
+
+describe('route-utils', () => {
+  it('treats command center routes as non-chat-session paths', () => {
+    expect(isAgentCommandPath('/home')).toBe(true)
+    expect(isAgentCommandPath('/home/agents/main')).toBe(true)
+    expect(isAgentConversationPath('/home')).toBe(false)
+    expect(isAgentConversationPath('/home/agents/main')).toBe(true)
+    expect(shouldUseChatSession('/home')).toBe(false)
+    expect(shouldUseChatSession('/home/agents/main')).toBe(false)
+    expect(shouldUseChatSession('/home/chat')).toBe(true)
+  })
+
+  it('keeps the focus grid on home while hiding it on dedicated full-screen routes', () => {
+    expect(shouldHideFocusGrid('/home')).toBe(false)
+    expect(shouldHideFocusGrid('/home/agents/main')).toBe(true)
+    expect(shouldHideFocusGrid('/home/chat')).toBe(true)
+    expect(shouldHideFocusGrid('/home/skills')).toBe(true)
+    expect(shouldHideFocusGrid('/home/personalize')).toBe(false)
+  })
+})
--- a/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/route-utils.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/newtab/layout/route-utils.ts
@@ -0,0 +1,24 @@
+const HIDE_FOCUS_GRID_PATHS = new Set([
+  '/home/soul',
+  '/home/memory',
+  '/home/skills',
+  '/home/chat',
+])
+
+export function isAgentCommandPath(pathname: string): boolean {
+  return pathname === '/home' || isAgentConversationPath(pathname)
+}
+
+export function isAgentConversationPath(pathname: string): boolean {
+  return pathname.startsWith('/home/agents/')
+}
+
+export function shouldHideFocusGrid(pathname: string): boolean {
+  return (
+    HIDE_FOCUS_GRID_PATHS.has(pathname) || isAgentConversationPath(pathname)
+  )
+}
+
+export function shouldUseChatSession(pathname: string): boolean {
+  return pathname === '/home/chat'
+}
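Review note: a minimal sketch of how the two predicates combine in NewTabLayout. The helpers are restated so the snippet runs on its own; `describeRoute` is a hypothetical wrapper added only for illustration and is not part of this change.

```typescript
const HIDE_FOCUS_GRID_PATHS = new Set([
  '/home/soul',
  '/home/memory',
  '/home/skills',
  '/home/chat',
])

function isAgentConversationPath(pathname: string): boolean {
  return pathname.startsWith('/home/agents/')
}

function shouldHideFocusGrid(pathname: string): boolean {
  return (
    HIDE_FOCUS_GRID_PATHS.has(pathname) || isAgentConversationPath(pathname)
  )
}

function shouldUseChatSession(pathname: string): boolean {
  return pathname === '/home/chat'
}

// Hypothetical helper: summarizes what the layout renders for a path.
function describeRoute(pathname: string): {
  showGrid: boolean
  chatSession: boolean
} {
  return {
    showGrid: !shouldHideFocusGrid(pathname),
    chatSession: shouldUseChatSession(pathname),
  }
}
```

Under these definitions only `/home/chat` is wrapped in the chat session provider, while agent conversation routes hide the grid without mounting a session.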
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/Chat.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/Chat.tsx
@@ -43,6 +43,7 @@ export const Chat = () => {
    disliked,
    onClickDislike,
    isRestoringConversation,
+    addToolApprovalResponse,
  } = useChatSessionContext()

  const {
@@ -222,6 +223,12 @@ export const Chat = () => {
            showDontShowAgain={showDontShowAgain}
            onTakeSurvey={onTakeSurvey}
            onDismissJtbdPopup={onDismissJtbdPopup}
+            onToolApprove={(id) =>
+              addToolApprovalResponse({ id, approved: true })
+            }
+            onToolDeny={(id) =>
+              addToolApprovalResponse({ id, approved: false })
+            }
          />
        )}
        {agentUrlError && (
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ChatError.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ChatError.tsx
@@ -1,9 +1,7 @@
 import { AlertCircle, RefreshCw } from 'lucide-react'
 import type { FC } from 'react'
 import { useMemo } from 'react'
-import { ShareForCredits } from '@/components/referral/ShareForCredits'
 import { Button } from '@/components/ui/button'
-import type { ProviderType } from '@/lib/llm-providers/types'

 const SURVEY_DIRECTIONS = [
  'competitor',
@@ -16,45 +14,6 @@ function pickRandomDirection(): string {
  return SURVEY_DIRECTIONS[Math.floor(Math.random() * SURVEY_DIRECTIONS.length)]
 }

-const PROVIDER_DISPLAY_NAMES: Record<ProviderType, string> = {
-  anthropic: 'Anthropic',
-  openai: 'OpenAI',
-  'openai-compatible': 'OpenAI-compatible',
-  google: 'Google',
-  openrouter: 'OpenRouter',
-  azure: 'Azure OpenAI',
-  ollama: 'Ollama',
-  lmstudio: 'LM Studio',
-  bedrock: 'AWS Bedrock',
-  browseros: 'BrowserOS',
-  moonshot: 'Moonshot',
-  'chatgpt-pro': 'ChatGPT Pro',
-  'github-copilot': 'GitHub Copilot',
-  'qwen-code': 'Qwen Code',
-  minimax: 'MiniMax',
-}
-
-const UPSTREAM_RATE_LIMIT_PATTERNS: Array<string | RegExp> = [
-  'usage limit',
-  'rate limit',
-  'rate-limit',
-  'quota',
-  /\b429\b/,
-  'too many requests',
-  'insufficient_quota',
-]
-
-function getProviderDisplayName(providerType?: string): string {
-  if (providerType && providerType in PROVIDER_DISPLAY_NAMES) {
-    return PROVIDER_DISPLAY_NAMES[providerType as ProviderType]
-  }
-  return 'your provider'
-}
-
-function stripRetryPrefix(message: string): string {
-  return message.replace(/^Failed after \d+ attempts?\.\s*Last error:\s*/i, '')
-}
-
 interface ChatErrorProps {
  error: Error
  onRetry?: () => void
@@ -70,8 +29,6 @@ function parseErrorMessage(
  isRateLimit?: boolean
  isCreditsExhausted?: boolean
  isConnectionError?: boolean
-  isUpstreamRateLimit?: boolean
-  providerName?: string
 } {
  const isBrowserosProvider = providerType === 'browseros'

@@ -112,28 +69,6 @@ function parseErrorMessage(
    }
  }

-  // Detect rate limits from non-BrowserOS upstream providers. Users were
-  // confused that a quota/429 from OpenAI/Anthropic/etc. looked like a
-  // BrowserOS-imposed limit.
-  if (!isBrowserosProvider && providerType) {
-    const lower = message.toLowerCase()
-    const matchesRateLimit = UPSTREAM_RATE_LIMIT_PATTERNS.some((p) =>
-      typeof p === 'string' ? lower.includes(p) : p.test(lower),
-    )
-    if (matchesRateLimit) {
-      let stripped = stripRetryPrefix(message).trim()
-      try {
-        const parsed = JSON.parse(stripped)
-        if (parsed?.error?.message) stripped = parsed.error.message
-      } catch {}
-      return {
-        text: stripped || message,
-        isUpstreamRateLimit: true,
-        providerName: getProviderDisplayName(providerType),
-      }
-    }
-  }
-
  let text = message
  try {
    const parsed = JSON.parse(message)
@@ -155,15 +90,8 @@ export const ChatError: FC<ChatErrorProps> = ({
  onRetry,
  providerType,
 }) => {
-  const {
-    text,
-    url,
-    isRateLimit,
-    isCreditsExhausted,
-    isConnectionError,
-    isUpstreamRateLimit,
-    providerName,
-  } = parseErrorMessage(error.message, providerType)
+  const { text, url, isRateLimit, isCreditsExhausted, isConnectionError } =
+    parseErrorMessage(error.message, providerType)

  const surveyUrl = useMemo(
    () =>
@@ -172,11 +100,6 @@ export const ChatError: FC<ChatErrorProps> = ({
  )

  const getTitle = () => {
-    if (isUpstreamRateLimit) {
-      return providerName && providerName !== 'your provider'
-        ? `${providerName} rate limit reached`
-        : 'Upstream rate limit reached'
-    }
    if (isRateLimit) return 'Daily limit reached'
    if (isConnectionError) return 'Connection failed'
    return 'Something went wrong'
@@ -189,14 +112,6 @@ export const ChatError: FC<ChatErrorProps> = ({
        <span className="font-medium text-sm">{getTitle()}</span>
      </div>
      <p className="text-center text-destructive text-xs">{text}</p>
-      {isUpstreamRateLimit && (
-        <p className="text-center text-muted-foreground text-xs">
-          This is a limit from{' '}
-          <span className="font-medium">{providerName}</span>
-          {' — your configured model provider — not BrowserOS. Check your '}
-          provider's dashboard for quota, usage, or billing details.
-        </p>
-      )}
      {isConnectionError && url && (
        <a
          href={url}
@@ -207,22 +122,15 @@ export const ChatError: FC<ChatErrorProps> = ({
          View troubleshooting guide
        </a>
      )}
-      {isCreditsExhausted && (
-        <>
-          <div className="w-full border-border/50 border-t pt-3">
-            <ShareForCredits compact />
-          </div>
-          {url && (
-            <a
-              href={url}
-              target="_blank"
-              rel="noopener noreferrer"
-              className="text-muted-foreground text-xs underline hover:text-foreground"
-            >
-              View Usage & Billing
-            </a>
-          )}
-        </>
+      {isCreditsExhausted && url && (
+        <a
+          href={url}
+          target="_blank"
+          rel="noopener noreferrer"
+          className="text-muted-foreground text-xs underline hover:text-foreground"
+        >
+          View Usage & Billing
+        </a>
      )}
      {isRateLimit && !isCreditsExhausted && (
        <p className="text-muted-foreground text-xs">
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ChatMessages.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ChatMessages.tsx
@@ -37,6 +37,8 @@ interface ChatMessagesProps {
  showDontShowAgain: boolean
  onTakeSurvey: (opts?: { dontShowAgain?: boolean }) => void
  onDismissJtbdPopup: (dontShowAgain: boolean) => void
+  onToolApprove?: (approvalId: string) => void
+  onToolDeny?: (approvalId: string) => void
 }

 export const ChatMessages: FC<ChatMessagesProps> = ({
@@ -51,6 +53,8 @@ export const ChatMessages: FC<ChatMessagesProps> = ({
  showDontShowAgain,
  onTakeSurvey,
  onDismissJtbdPopup,
+  onToolApprove,
+  onToolDeny,
 }) => {
  const isStreaming = status === 'streaming' || status === 'submitted'

@@ -114,6 +118,8 @@ export const ChatMessages: FC<ChatMessagesProps> = ({
                                isLastBatch={segment.key === lastToolBatchKey}
                                isLastMessage={isLastMessage}
                                isStreaming={isStreaming}
+                                onApprove={onToolApprove}
+                                onDeny={onToolDeny}
                              />
                            )
                          case 'nudge':
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ToolBatch.tsx
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/ToolBatch.tsx
@@ -2,16 +2,20 @@ import {
  BotIcon,
  CheckCircle2,
  CircleDashed,
+  Clock,
  Loader2,
+  ShieldCheck,
+  ShieldX,
  XCircle,
 } from 'lucide-react'
-import type { FC } from 'react'
+import { type FC, useEffect, useState } from 'react'
 import {
  Task,
  TaskContent,
  TaskItem,
  TaskTrigger,
 } from '@/components/ai-elements/task'
+import { Button } from '@/components/ui/button'
 import type {
  ToolInvocationInfo,
  ToolInvocationState,
@@ -22,6 +26,8 @@ interface ToolBatchProps {
  isLastBatch: boolean
  isLastMessage: boolean
  isStreaming: boolean
+  onApprove?: (approvalId: string) => void
+  onDeny?: (approvalId: string) => void
 }

 export const ToolBatch: FC<ToolBatchProps> = ({
@@ -29,12 +35,20 @@ export const ToolBatch: FC<ToolBatchProps> = ({
  isLastBatch,
  isLastMessage,
  isStreaming,
+  onApprove,
+  onDeny,
 }) => {
-  const shouldBeOpen = isLastMessage && isLastBatch && isStreaming
+  const hasPendingApproval = tools.some((t) => t.state === 'approval-requested')
+  const shouldBeOpen =
+    (isLastMessage && isLastBatch && isStreaming) || hasPendingApproval
  const [isOpen, setIsOpen] = useState(shouldBeOpen)
  const [hasUserInteracted, setHasUserInteracted] = useState(false)

  useEffect(() => {
+    if (hasPendingApproval) {
+      setIsOpen(true)
+      return
+    }
    if (isLastMessage && !hasUserInteracted) {
      if (isLastBatch) {
        setIsOpen(isStreaming)
@@ -42,9 +56,18 @@ export const ToolBatch: FC<ToolBatchProps> = ({
        setIsOpen(false)
      }
    }
-  }, [isStreaming, isLastMessage, isLastBatch, hasUserInteracted])
+  }, [
+    isStreaming,
+    isLastMessage,
+    isLastBatch,
+    hasUserInteracted,
+    hasPendingApproval,
+  ])

  const completedCount = tools.filter((t) => isToolCompleted(t.state)).length
+  const triggerTitle = hasPendingApproval
+    ? 'Waiting for approval...'
+    : `${completedCount}/${tools.length} actions completed`

  const onManualToggle = (newState: boolean) => {
    setHasUserInteracted(true)
@@ -53,16 +76,23 @@ export const ToolBatch: FC<ToolBatchProps> = ({

  return (
    <Task open={isOpen} onOpenChange={onManualToggle}>
-      <TaskTrigger
-        title={`${completedCount}/${tools.length} actions completed`}
-        TriggerIcon={BotIcon}
-      />
+      <TaskTrigger title={triggerTitle} TriggerIcon={BotIcon} />
      <TaskContent>
        {tools.map((tool) => (
-          <TaskItem key={tool.toolCallId} className="flex items-center gap-2">
-            <ToolStatusIcon state={tool.state} />
-            <span>{formatToolName(tool.toolName)}</span>
-          </TaskItem>
+          <div key={tool.toolCallId}>
+            <TaskItem className="flex items-center gap-2">
+              <ToolStatusIcon state={tool.state} />
+              <span className="flex-1">{formatToolName(tool.toolName)}</span>
+            </TaskItem>
+            {tool.state === 'approval-requested' &&
+              tool.approval?.id != null && (
+                <ApprovalButtons
+                  approvalId={tool.approval.id}
+                  onApprove={onApprove}
+                  onDeny={onDeny}
+                />
+              )}
+          </div>
        ))}
      </TaskContent>
    </Task>
@@ -84,10 +114,47 @@ const isToolInProgress = (state: ToolInvocationState) =>

 const isToolError = (state: ToolInvocationState) => state === 'output-error'

+const isToolDenied = (state: ToolInvocationState) => state === 'output-denied'
+
+const isToolApprovalPending = (state: ToolInvocationState) =>
+  state === 'approval-requested'
+
+const ApprovalButtons: FC<{
+  approvalId: string
+  onApprove?: (id: string) => void
+  onDeny?: (id: string) => void
+}> = ({ approvalId, onApprove, onDeny }) => (
+  <div className="mt-1 mb-2 ml-6 flex items-center gap-2">
+    <Button
+      size="sm"
+      className="h-7 gap-1 px-2.5 text-xs"
+      onClick={() => onApprove?.(approvalId)}
+    >
+      <ShieldCheck className="size-3" />
+      Approve
+    </Button>
+    <Button
+      size="sm"
+      variant="outline"
+      className="h-7 gap-1 px-2.5 text-xs"
+      onClick={() => onDeny?.(approvalId)}
+    >
+      <ShieldX className="size-3" />
+      Deny
+    </Button>
+  </div>
+)
+
 const ToolStatusIcon: FC<{ state: ToolInvocationState }> = ({ state }) => {
  if (isToolCompleted(state)) {
    return <CheckCircle2 className="h-3.5 w-3.5 text-green-500" />
  }
+  if (isToolApprovalPending(state)) {
+    return <Clock className="h-3.5 w-3.5 text-yellow-500" />
+  }
+  if (isToolDenied(state)) {
+    return <ShieldX className="h-3.5 w-3.5 text-red-400" />
+  }
  if (isToolInProgress(state)) {
    return (
      <Loader2 className="h-3.5 w-3.5 animate-spin text-[var(--accent-orange)]" />
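Review note: a minimal sketch of the batch title and approval lookup that ToolBatch now performs. It is not code from this change. The state union lists only the members visible in this diff, `isCompleted` assumes that only 'output-available' counts as completed (the real `isToolCompleted` is not shown here), and `batchTitle` and `pendingApprovalIds` are hypothetical names.

```typescript
// Only the states visible in this diff; the real union may have more.
type ToolInvocationState =
  | 'input-available'
  | 'output-available'
  | 'output-error'
  | 'approval-requested'
  | 'approval-responded'
  | 'output-denied'

interface ToolInfo {
  toolCallId: string
  state: ToolInvocationState
  approval?: { id: string; approved?: boolean; reason?: string }
}

// Assumption: only 'output-available' counts as completed.
const isCompleted = (state: ToolInvocationState): boolean =>
  state === 'output-available'

// A pending approval takes precedence over the progress counter.
function batchTitle(tools: ToolInfo[]): string {
  const hasPendingApproval = tools.some(
    (t) => t.state === 'approval-requested',
  )
  if (hasPendingApproval) return 'Waiting for approval...'
  const completed = tools.filter((t) => isCompleted(t.state)).length
  return `${completed}/${tools.length} actions completed`
}

// Approval buttons render only when the state is pending AND an id exists.
function pendingApprovalIds(tools: ToolInfo[]): string[] {
  return tools.flatMap((t) =>
    t.state === 'approval-requested' && t.approval?.id != null
      ? [t.approval.id]
      : [],
  )
}
```

A tool in 'approval-requested' without an `approval.id` still switches the title to the waiting message but offers no buttons, which is worth keeping in mind when the server omits the id.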
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/getMessageSegments.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/getMessageSegments.ts
@@ -8,6 +8,9 @@ export type ToolInvocationState =
  | 'input-available'
  | 'output-available'
  | 'output-error'
+  | 'approval-requested'
+  | 'approval-responded'
+  | 'output-denied'

 export interface ToolInvocationInfo {
  state: ToolInvocationState
@@ -15,6 +18,7 @@ export interface ToolInvocationInfo {
  toolName: string
  input: Record<string, unknown>
  output: unknown[]
+  approval?: { id: string; approved?: boolean; reason?: string }
 }

 export type NudgeType = 'schedule_suggestion' | 'app_connection'
@@ -106,6 +110,7 @@ export const getMessageSegments = (
        state: ToolInvocationState
        input: Record<string, unknown>
        output: unknown
+        approval?: { id: string; approved?: boolean; reason?: string }
      }
      const toolName = toolPart.type?.replace('tool-', '')

@@ -127,6 +132,7 @@ export const getMessageSegments = (
          toolName,
          input: toolPart?.input ?? {},
          output: (toolPart?.output as unknown[]) ?? [],
+          approval: toolPart?.approval,
        })
      }
    }
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useChatSession.ts
@@ -1,10 +1,11 @@
 import { useChat } from '@ai-sdk/react'
 import { DefaultChatTransport, type UIMessage } from 'ai'
 import { compact } from 'es-toolkit/array'
-import { useEffect, useRef, useState } from 'react'
+import { useCallback, useEffect, useRef, useState } from 'react'
 import { useSearchParams } from 'react-router'
 import useDeepCompareEffect from 'use-deep-compare-effect'
 import type { Provider } from '@/components/chat/chatComponentTypes'
+import { aclRulesStorage } from '@/lib/acl/storage'
 import { Capabilities, Feature } from '@/lib/browseros/capabilities'
 import { useAgentServerUrl } from '@/lib/browseros/useBrowserOSProviders'
 import type { ChatAction } from '@/lib/chat-actions/types'
@@ -26,17 +27,59 @@ import { declinedAppsStorage } from '@/lib/declined-apps/storage'
 import { useGraphqlQuery } from '@/lib/graphql/useGraphqlQuery'
 import { createDefaultBrowserOSProvider } from '@/lib/llm-providers/storage'
 import { useLlmProviders } from '@/lib/llm-providers/useLlmProviders'
+import {
+  type ApprovalResponseData,
+  buildChatRequestBody,
+  type ChatRequestBrowserContext,
+} from '@/lib/messaging/server/buildChatRequestBody'
 import { track } from '@/lib/metrics/track'
 import { searchActionsStorage } from '@/lib/search-actions/searchActionsStorage'
 import { selectedTextStorage } from '@/lib/selected-text/selectedTextStorage'
 import { stopAgentStorage } from '@/lib/stop-agent/stop-agent-storage'
+import {
+  type ApprovalResponse,
+  approvalResponsesStorage,
+  extractPendingApprovals,
+  pendingToolApprovalsStorage,
+  removeApprovalResponsesById,
+  removePendingApprovalsById,
+  replacePendingApprovalsForConversation,
+} from '@/lib/tool-approvals/approval-sync-storage'
+import {
+  normalizeToolApprovalConfig,
+  toolApprovalConfigStorage,
+} from '@/lib/tool-approvals/storage'
 import { selectedWorkspaceStorage } from '@/lib/workspace/workspace-storage'
 import type { ChatMode } from './chatTypes'
 import { GetConversationWithMessagesDocument } from './graphql/chatSessionDocument'
 import { useChatRefs } from './useChatRefs'
+import { useExecutionHistoryTracker } from './useExecutionHistoryTracker'
 import { useNotifyActiveTab } from './useNotifyActiveTab'
 import { useRemoteConversationSave } from './useRemoteConversationSave'

+const extractApprovalResponses = (
+  messages: UIMessage[],
+): ApprovalResponseData[] | null => {
+  const lastMsg = messages[messages.length - 1]
+  if (lastMsg?.role !== 'assistant') return null
+
+  const approvals: ApprovalResponseData[] = []
+  for (const part of lastMsg.parts) {
+    const p = part as {
+      state?: string
+      approval?: { id: string; approved?: boolean; reason?: string }
+    }
+    if (p.state === 'approval-responded' && p.approval?.approved != null) {
+      approvals.push({
+        approvalId: p.approval.id,
+        approved: p.approval.approved,
+        reason: p.approval.reason,
+      })
+    }
+  }
+  return approvals.length > 0 ? approvals : null
+}
+
 const getLastMessageText = (messages: UIMessage[]) => {
  const lastMessage = messages[messages.length - 1]
  if (!lastMessage) return ''
@@ -46,6 +89,15 @@ const getLastMessageText = (messages: UIMessage[]) => {
    .join('')
 }

+const getLastUserMessageText = (messages: UIMessage[]) => {
+  for (let i = messages.length - 1; i >= 0; i -= 1) {
+    if (messages[i]?.role === 'user') {
+      return getLastMessageText([messages[i]])
+    }
+  }
+  return ''
+}
+
 export const getResponseAndQueryFromMessageId = (
  messages: UIMessage[],
  messageId: string,
@@ -76,6 +128,61 @@ export interface ChatSessionOptions {
  isIntegrationsSynced?: boolean
 }

+const NEWTAB_SYSTEM_PROMPT = `IMPORTANT: The user is chatting from the New Tab page. When performing browser actions, ALWAYS open content in a NEW TAB rather than navigating the current tab. The user's new tab page should remain accessible.`
+
+const getUserSystemPrompt = (
+  origin: ChatOrigin | undefined,
+  personalization: string,
+) =>
+  origin === 'newtab'
+    ? [personalization, NEWTAB_SYSTEM_PROMPT].filter(Boolean).join('\n\n')
+    : personalization
+
+const buildRequestBrowserContext = ({
+  activeTab,
+  action,
+  enabledMcpServers,
+  customMcpServers,
+}: {
+  activeTab?: chrome.tabs.Tab
+  action?: ChatAction
+  enabledMcpServers: Array<string | undefined>
+  customMcpServers: {
+    name: string
+    url?: string
+  }[]
+}): ChatRequestBrowserContext | undefined => {
+  const browserContext: ChatRequestBrowserContext = {}
+
+  if (activeTab) {
+    browserContext.windowId = activeTab.windowId
+    browserContext.activeTab = {
+      id: activeTab.id,
+      url: activeTab.url,
+      title: activeTab.title,
+    }
+  }
+
+  if (action?.tabs?.length) {
+    browserContext.selectedTabs = action.tabs.map((tab) => ({
+      id: tab.id,
+      url: tab.url,
+      title: tab.title,
+    }))
+  }
+
+  const managedMcpServers = compact(enabledMcpServers)
+  if (managedMcpServers.length) {
+    browserContext.enabledMcpServers = managedMcpServers
+  }
+
+  if (customMcpServers.length) {
+    browserContext.customMcpServers = customMcpServers
+  }
+
+  return Object.keys(browserContext).length ? browserContext : undefined
+}
+
 export const useChatSession = (options?: ChatSessionOptions) => {
  const {
    selectedLlmProviderRef,
@@ -130,6 +237,12 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    conversationIdRef.current = conversationId
  }, [conversationId])

+  const {
+    startTask: startExecutionTask,
+    syncFromMessages: syncExecutionHistory,
+    finishTask: finishExecutionTask,
+  } = useExecutionHistoryTracker()
+
  const onClickLike = (messageId: string) => {
    const { responseText, queryText } = getResponseAndQueryFromMessageId(
      messages,
@@ -164,6 +277,7 @@ export const useChatSession = (options?: ChatSessionOptions) => {
  }

  const modeRef = useRef<ChatMode>(mode)
+  const approvalJustRespondedRef = useRef(false)
  const textToActionRef = useRef<Map<string, ChatAction>>(textToAction)
  const workingDirRef = useRef<string | undefined>(undefined)
  const selectionMapRef = useRef<
@@ -228,10 +342,12 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    status,
    stop,
    error: chatError,
+    addToolApprovalResponse,
  } = useChat({
    transport: new DefaultChatTransport({
-      // Important: this chat logic is also used in apps/agent/lib/schedules/getChatServerResponse.ts for scheduled jobs. Make sure to keep them in sync for any future changes.
      prepareSendMessagesRequest: async ({ messages }) => {
+        const provider =
+          selectedLlmProviderRef.current ?? createDefaultBrowserOSProvider()
        const activeTabsList = await chrome.tabs.query({
          active: true,
          currentWindow: true,
@@ -240,67 +356,24 @@ export const useChatSession = (options?: ChatSessionOptions) => {
        const activeTabSelection = activeTab?.id
          ? (selectionMapRef.current[String(activeTab.id)] ?? null)
          : null
-        const message = getLastMessageText(messages)
-        const provider =
-          selectedLlmProviderRef.current ?? createDefaultBrowserOSProvider()
        const currentMode = modeRef.current
        const enabledMcpServers = enabledMcpServersRef.current
        const customMcpServers = enabledCustomServersRef.current
-
-        const getActionForMessage = (messageText: string) => {
-          return textToActionRef.current.get(messageText)
-        }
-
-        const action = getActionForMessage(message)
-
-        const browserContext: {
-          windowId?: number
-          activeTab?: {
-            id?: number
-            url?: string
-            title?: string
-          }
-          selectedTabs?: {
-            id?: number
-            url?: string
-            title?: string
-          }[]
-          enabledMcpServers?: string[]
-          customMcpServers?: {
-            name: string
-            url: string
-          }[]
-        } = {}
-
-        if (activeTab) {
-          browserContext.windowId = activeTab.windowId
-          browserContext.activeTab = {
-            id: activeTab.id,
-            url: activeTab.url,
-            title: activeTab.title,
-          }
-        }
-
-        if (action?.tabs?.length) {
-          browserContext.selectedTabs = action?.tabs?.map((tab) => ({
-            id: tab.id,
-            url: tab.url,
-            title: tab.title,
-          }))
-        }
-
-        if (enabledMcpServers.length) {
-          browserContext.enabledMcpServers = compact(enabledMcpServers)
-        }
-
-        if (customMcpServers.length) {
-          browserContext.customMcpServers = customMcpServers as {
-            name: string
-            url: string
-          }[]
-        }
+        const lastUserMessage = getLastUserMessageText(messages)
+        const action = textToActionRef.current.get(lastUserMessage)
+        const requestBrowserContext = buildRequestBrowserContext({
+          activeTab,
+          action,
+          enabledMcpServers,
+          customMcpServers,
+        })

        const declinedApps = await declinedAppsStorage.getValue()
+        const allAclRules = await aclRulesStorage.getValue()
+        const enabledAclRules = allAclRules.filter((r) => r.enabled)
+        const approvalConfig = normalizeToolApprovalConfig(
+          await toolApprovalConfigStorage.getValue(),
+        )

        const supportsArrayConversation = await Capabilities.supports(
          Feature.PREVIOUS_CONVERSATION_ARRAY,
@@ -317,37 +390,46 @@ export const useChatSession = (options?: ChatSessionOptions) => {
            : history.map((m) => `${m.role}: ${m.content}`).join('\n')
          : undefined

+        const userSystemPrompt = getUserSystemPrompt(
+          options?.origin,
+          personalizationRef.current,
+        )
+
+        const approvalResponses = extractApprovalResponses(messages)
+        if (approvalResponses) {
+          return {
+            api: `${agentUrlRef.current}/chat`,
+            body: buildChatRequestBody({
+              conversationId: conversationIdRef.current,
+              provider,
+              mode: currentMode,
+              browserContext: requestBrowserContext,
+              userSystemPrompt,
+              userWorkingDir: workingDirRef.current,
+              previousConversation,
+              declinedApps,
+              aclRules: enabledAclRules,
+              toolApprovalConfig: approvalConfig,
+              toolApprovalResponses: approvalResponses,
+            }),
+          }
+        }
+
+        const message = getLastMessageText(messages)
+
        const result = {
          api: `${agentUrlRef.current}/chat`,
-          body: {
+          body: buildChatRequestBody({
            message,
-            provider: provider?.type,
-            providerType: provider?.type,
-            providerName: provider?.name,
-            apiKey: provider?.apiKey,
-            baseUrl: provider?.baseUrl,
            conversationId: conversationIdRef.current,
-            model: provider?.modelId ?? 'default',
+            provider,
            mode: currentMode,
-            contextWindowSize: provider?.contextWindow,
-            temperature: provider?.temperature,
-            // Azure-specific
-            resourceName: provider?.resourceName,
-            // Bedrock-specific
-            accessKeyId: provider?.accessKeyId,
-            secretAccessKey: provider?.secretAccessKey,
-            region: provider?.region,
-            sessionToken: provider?.sessionToken,
-            // ChatGPT Pro (Codex)
-            reasoningEffort: provider?.reasoningEffort,
-            reasoningSummary: provider?.reasoningSummary,
-            browserContext,
-            origin: options?.origin ?? 'sidepanel',
-            userSystemPrompt: personalizationRef.current,
+            browserContext: requestBrowserContext,
+            userSystemPrompt,
            userWorkingDir: workingDirRef.current,
-            supportsImages: provider?.supportsImages,
            previousConversation,
-            declinedApps: declinedApps.length > 0 ? declinedApps : undefined,
+            declinedApps,
+            aclRules: enabledAclRules,
            selectedText: activeTabSelection?.text,
            selectedTextSource: activeTabSelection
              ? {
@@ -355,7 +437,8 @@ export const useChatSession = (options?: ChatSessionOptions) => {
                  title: activeTabSelection.title,
                }
              : undefined,
-          },
+            toolApprovalConfig: approvalConfig,
+          }),
        }

        // Track which tab's selection was sent so we can clear it on success
@@ -365,6 +448,20 @@ export const useChatSession = (options?: ChatSessionOptions) => {
        return result
      },
    }),
+    sendAutomaticallyWhen: () => {
+      if (approvalJustRespondedRef.current) {
+        approvalJustRespondedRef.current = false
+        return true
+      }
+      return false
+    },
+    onFinish: async ({ message, isAbort, isError }) => {
+      await finishExecutionTask({
+        responseText: getLastMessageText([message]),
+        isAbort,
+        isError,
+      })
+    },
  })

  // Remove messages with empty parts (e.g. interrupted assistant responses)
@@ -442,7 +539,8 @@ export const useChatSession = (options?: ChatSessionOptions) => {
  // Keep messagesRef in sync on every change (cheap ref assignment)
  useEffect(() => {
    messagesRef.current = messages
-  }, [messages])
+    syncExecutionHistory(messages, status)
+  }, [messages, status, syncExecutionHistory])

  // Save conversation only after streaming completes — not on every token
  const previousStatusRef = useRef(status)
@@ -485,6 +583,69 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    if (chatError) invalidateCredits()
  }, [chatError, invalidateCredits])

+  // Sync pending tool approvals to shared storage for the admin dashboard
+  useEffect(() => {
+    let isCancelled = false
+
+    const syncPendingApprovals = async () => {
+      const pending = extractPendingApprovals(
+        messages,
+        conversationIdRef.current,
+      )
+      const current = (await pendingToolApprovalsStorage.getValue()) ?? []
+      if (isCancelled) return
+
+      await pendingToolApprovalsStorage.setValue(
+        replacePendingApprovalsForConversation(
+          current,
+          conversationIdRef.current,
+          pending,
+        ),
+      )
+    }
+
+    syncPendingApprovals()
+
+    return () => {
+      isCancelled = true
+    }
+  }, [messages])
+
+  // Watch for approval responses from the admin dashboard
+  // biome-ignore lint/correctness/useExhaustiveDependencies: only set up once
+  useEffect(() => {
+    const handleResponses = async (responses: ApprovalResponse[]) => {
+      if (!responses?.length) return
+      try {
+        for (const resp of responses) {
+          respondToToolApproval({
+            id: resp.approvalId,
+            approved: resp.approved,
+            reason: resp.reason,
+          })
+        }
+        const approvalIds = responses.map((resp) => resp.approvalId)
+        const currentResponses =
+          (await approvalResponsesStorage.getValue()) ?? []
+        const currentPending =
+          (await pendingToolApprovalsStorage.getValue()) ?? []
+
+        await approvalResponsesStorage.setValue(
+          removeApprovalResponsesById(currentResponses, approvalIds),
+        )
+        await pendingToolApprovalsStorage.setValue(
+          removePendingApprovalsById(currentPending, approvalIds),
+        )
+      } catch {
+        // Leave storage intact so the dashboard can retry
+      }
+    }
+
+    approvalResponsesStorage.getValue().then(handleResponses)
+    const unwatch = approvalResponsesStorage.watch(handleResponses)
+    return () => unwatch()
+  }, [])
+
  const isIntegrationsSynced = options?.isIntegrationsSynced ?? true
  const isIntegrationsSyncedRef = useRef(isIntegrationsSynced)
  const pendingMessageRef = useRef<{
@@ -492,6 +653,17 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    action?: ChatAction
  } | null>(null)

+  const dispatchMessage = useCallback(
+    (text: string) => {
+      startExecutionTask({
+        conversationId: conversationIdRef.current,
+        promptText: text,
+      })
+      baseSendMessage({ text })
+    },
+    [baseSendMessage, startExecutionTask],
+  )
+
  useEffect(() => {
    isIntegrationsSyncedRef.current = isIntegrationsSynced
  }, [isIntegrationsSynced])
@@ -509,9 +681,9 @@ export const useChatSession = (options?: ChatSessionOptions) => {
          return next
        })
      }
-      baseSendMessage({ text: pending.text })
+      dispatchMessage(pending.text)
    }
-  }, [isIntegrationsSynced, baseSendMessage])
+  }, [dispatchMessage, isIntegrationsSynced])

  const sendMessage = (params: { text: string; action?: ChatAction }) => {
    track(MESSAGE_SENT_EVENT, {
@@ -534,7 +706,7 @@ export const useChatSession = (options?: ChatSessionOptions) => {
        return next
      })
    }
-    baseSendMessage({ text: params.text })
+    dispatchMessage(params.text)
  }

  // biome-ignore lint/correctness/useExhaustiveDependencies: only need to run this once
@@ -560,6 +732,15 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    return () => unwatch()
  }, [])

+  const respondToToolApproval = (params: {
+    id: string
+    approved: boolean
+    reason?: string
+  }) => {
+    approvalJustRespondedRef.current = true
+    addToolApprovalResponse(params)
+  }
+
  const handleSelectProvider = (provider: Provider) => {
    const fullProvider = llmProviders.find((p) => p.id === provider.id)
    track(PROVIDER_SELECTED_EVENT, {
@@ -582,6 +763,7 @@ export const useChatSession = (options?: ChatSessionOptions) => {
  const resetConversation = () => {
    track(CONVERSATION_RESET_EVENT, { message_count: messages.length })
    stop()
+    void finishExecutionTask({ isAbort: true })
    setConversationId(crypto.randomUUID())
    setMessages([])
    setTextToAction(new Map())
@@ -616,5 +798,6 @@ export const useChatSession = (options?: ChatSessionOptions) => {
    disliked,
    onClickDislike,
    conversationId,
+    addToolApprovalResponse: respondToToolApproval,
  }
 }
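
Note on the approval resend path above: the hook scans only the last assistant message for tool parts in the `approval-responded` state and sends those decisions to the server instead of a new user message. A minimal standalone sketch of that scan follows. The `Sketch*` types and `collectApprovalResponses` are hypothetical, simplified stand-ins for illustration; the real `UIMessage` type from the `ai` package and the `ApprovalResponseData` type in this PR carry more fields.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface SketchApproval {
  id: string
  approved?: boolean
  reason?: string
}

interface SketchPart {
  type: string
  state?: string
  approval?: SketchApproval
}

interface SketchMessage {
  role: 'user' | 'assistant'
  parts: SketchPart[]
}

interface SketchApprovalResponse {
  approvalId: string
  approved: boolean
  reason?: string
}

// Returns the decisions recorded on the last assistant message,
// or null when there is nothing to resend.
function collectApprovalResponses(
  messages: SketchMessage[],
): SketchApprovalResponse[] | null {
  const last = messages[messages.length - 1]
  if (last?.role !== 'assistant') return null

  const responses: SketchApprovalResponse[] = []
  for (const part of last.parts) {
    const approval = part.approval
    const approved = approval?.approved
    // Skip parts that are not answered yet or carry no decision.
    if (part.state !== 'approval-responded') continue
    if (approval === undefined || approved === undefined) continue
    responses.push({
      approvalId: approval.id,
      approved,
      reason: approval.reason,
    })
  }
  return responses.length > 0 ? responses : null
}
```

Returning `null` rather than an empty array lets the caller use a plain truthiness check to choose between the approval request body and the regular message body, which appears to be the intent of the branch in `prepareSendMessagesRequest`.
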
--- a/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useExecutionHistoryTracker.ts
+++ b/packages/browseros-agent/apps/agent/entrypoints/sidepanel/index/useExecutionHistoryTracker.ts
@@ -0,0 +1,164 @@
+import type { ChatStatus, UIMessage } from 'ai'
+import { useCallback, useRef } from 'react'
+import {
+  getResponsePreview,
+  normalizeExecutionSteps,
+} from '@/lib/execution-history/normalize'
+import { upsertConversationExecutionTask } from '@/lib/execution-history/storage'
+import type {
+  ExecutionTaskRecord,
+  ExecutionTaskStatus,
+} from '@/lib/execution-history/types'
+import { sentry } from '@/lib/sentry/sentry'
+
+interface StartExecutionTaskInput {
+  conversationId: string
+  promptText: string
+}
+
+interface FinishExecutionTaskInput {
+  responseText?: string
+  isAbort?: boolean
+  isError?: boolean
+}
+
+function createTask(input: StartExecutionTaskInput): ExecutionTaskRecord {
+  return {
+    id: crypto.randomUUID(),
+    conversationId: input.conversationId,
+    promptText: input.promptText,
+    startedAt: new Date().toISOString(),
+    status: 'running',
+    actionCount: 0,
+    approvalCount: 0,
+    deniedCount: 0,
+    errorCount: 0,
+    steps: [],
+  }
+}
+
+function getLastUserMessage(messages: UIMessage[]): UIMessage | undefined {
+  for (let index = messages.length - 1; index >= 0; index--) {
+    if (messages[index]?.role === 'user') {
+      return messages[index]
+    }
+  }
+}
+
+function getLastAssistantMessage(messages: UIMessage[]): UIMessage | undefined {
+  const lastMessage = messages[messages.length - 1]
+  if (lastMessage?.role === 'assistant') {
+    return lastMessage
+  }
+}
+
+function getFinishedStatus(
+  input: FinishExecutionTaskInput,
+): ExecutionTaskStatus {
+  if (input.isError) return 'failed'
+  if (input.isAbort) return 'stopped'
+  return 'completed'
+}
+
+export function useExecutionHistoryTracker() {
+  const activeTaskRef = useRef<ExecutionTaskRecord | null>(null)
+  const lastSavedHashRef = useRef('')
+  const writeQueueRef = useRef(Promise.resolve())
+
+  const persistTask = useCallback((task: ExecutionTaskRecord) => {
+    const taskHash = JSON.stringify(task)
+    if (taskHash === lastSavedHashRef.current) return
+
+    activeTaskRef.current = task
+    writeQueueRef.current = writeQueueRef.current
+      .then(async () => {
+        await upsertConversationExecutionTask(task)
+        lastSavedHashRef.current = taskHash
+      })
+      .catch((error) => {
+        sentry.captureException(error, {
+          extra: {
+            message: 'Failed to persist execution history task',
+            conversationId: task.conversationId,
+            taskId: task.id,
+          },
+        })
+      })
+  }, [])
+
+  const startTask = useCallback(
+    (input: StartExecutionTaskInput) => {
+      const task = createTask(input)
+      lastSavedHashRef.current = ''
+      persistTask(task)
+      return task.id
+    },
+    [persistTask],
+  )
+
+  const syncFromMessages = useCallback(
+    (messages: UIMessage[], _status: ChatStatus) => {
+      const activeTask = activeTaskRef.current
+      if (!activeTask) return
+
+      const promptMessage = getLastUserMessage(messages)
+      const assistantMessage = getLastAssistantMessage(messages)
+      const normalized = normalizeExecutionSteps({
+        assistantMessage,
+        previousSteps: activeTask.steps,
+        nowIso: new Date().toISOString(),
+      })
+
+      persistTask({
+        ...activeTask,
+        promptMessageId: activeTask.promptMessageId ?? promptMessage?.id,
+        assistantMessageId:
+          normalized.assistantMessageId ?? activeTask.assistantMessageId,
+        responsePreview:
+          getResponsePreview(assistantMessage) || activeTask.responsePreview,
+        actionCount: normalized.actionCount,
+        approvalCount: normalized.approvalCount,
+        deniedCount: normalized.deniedCount,
+        errorCount: normalized.errorCount,
+        steps: normalized.steps,
+      })
+    },
+    [persistTask],
+  )
+
+  const finishTask = useCallback(
+    async (input: FinishExecutionTaskInput) => {
+      const activeTask = activeTaskRef.current
+      if (!activeTask) return
+
+      const responseText = input.responseText?.trim() || activeTask.responseText
+      const nextTask: ExecutionTaskRecord = {
+        ...activeTask,
+        completedAt: new Date().toISOString(),
+        status: getFinishedStatus(input),
+        responseText,
+        responsePreview: responseText
+          ? getResponsePreview({
+              parts: [{ type: 'text', text: responseText }],
+            } as Pick<UIMessage, 'parts'>)
+          : activeTask.responsePreview,
+      }
+
+      persistTask(nextTask)
+      activeTaskRef.current = null
+    },
+    [persistTask],
+  )
+
+  const clearActiveTask = useCallback(() => {
+    activeTaskRef.current = null
+    lastSavedHashRef.current = ''
+  }, [])
+
+  return {
+    startTask,
+    syncFromMessages,
+    finishTask,
+    clearActiveTask,
+  }
+}
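
Note on `persistTask` above: it combines two ideas, chaining every write onto a single promise so writes land in order, and skipping a write whose serialized payload matches the last one saved. The sketch below isolates that pattern outside React. `createDedupedWriteQueue`, `Writer`, and `idle` are hypothetical names introduced for illustration and do not exist in this PR; the `write` callback stands in for `upsertConversationExecutionTask`.

```typescript
// Hypothetical stand-in for an async storage write.
type Writer<T> = (value: T) => Promise<void>

function createDedupedWriteQueue<T>(
  write: Writer<T>,
  onError: (error: unknown) => void = () => {},
) {
  let lastSavedHash = ''
  let queue: Promise<void> = Promise.resolve()

  return {
    // Queue a write unless the payload matches the last completed write.
    enqueue(value: T): void {
      const hash = JSON.stringify(value)
      if (hash === lastSavedHash) return

      queue = queue
        .then(async () => {
          await write(value)
          // Record the hash only after the write succeeded.
          lastSavedHash = hash
        })
        // Swallow the failure here so later writes still run.
        .catch(onError)
    },
    // Resolves once everything queued so far has settled.
    idle(): Promise<void> {
      return queue
    },
  }
}
```

Because the hash is recorded only after a successful write, an identical payload enqueued while the first write is still in flight is written again. That matches the behavior of the hook as written, and it is harmless as long as the underlying write is an upsert.
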
--- a/packages/browseros-agent/apps/agent/lib/acl/storage.ts
+++ b/packages/browseros-agent/apps/agent/lib/acl/storage.ts
@@ -0,0 +1,7 @@
+import type { AclRule } from '@browseros/shared/types/acl'
+import { storage } from '#imports'
+
+export const aclRulesStorage = storage.defineItem<AclRule[]>(
+  'local:acl-rules',
+  { fallback: [] },
+)
--- a/packages/browseros-agent/apps/agent/lib/agent-conversations/storage.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-conversations/storage.ts
@@ -0,0 +1,31 @@
+import { del, get, keys, set } from 'idb-keyval'
+import type { AgentConversation } from './types'
+
+const PREFIX = 'agent-conv:'
+
+export async function saveConversation(conv: AgentConversation): Promise<void> {
+  await set(`${PREFIX}${conv.agentId}:${conv.sessionKey}`, conv)
+}
+
+export async function getLatestConversation(
+  agentId: string,
+): Promise<AgentConversation | undefined> {
+  const allKeys = await keys()
+  const agentKeys = (allKeys as string[]).filter((k) =>
+    k.startsWith(`${PREFIX}${agentId}:`),
+  )
+  if (!agentKeys.length) return undefined
+
+  const conversations = await Promise.all(
+    agentKeys.map((k) => get<AgentConversation>(k)),
+  )
+  const valid = conversations.filter((c): c is AgentConversation => c != null)
+  return valid.sort((a, b) => b.updatedAt - a.updatedAt)[0]
+}
+
+export async function deleteConversation(
+  agentId: string,
+  sessionKey: string,
+): Promise<void> {
+  await del(`${PREFIX}${agentId}:${sessionKey}`)
+}
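
Note on `getLatestConversation` above: keys follow the layout `agent-conv:<agentId>:<sessionKey>`, so the latest conversation for an agent is found by prefix-filtering the keys and taking the entry with the largest `updatedAt`. The sketch below shows the same lookup against an in-memory `Map` used as a stand-in for `idb-keyval`. `SketchConversation`, `SKETCH_PREFIX`, and `latestForAgent` are hypothetical names for illustration, and the single-pass maximum is a swapped-in alternative to the sort in the diff.

```typescript
// Hypothetical, reduced record shape for illustration only.
interface SketchConversation {
  agentId: string
  sessionKey: string
  updatedAt: number
}

const SKETCH_PREFIX = 'agent-conv:'

// Single pass over the store: keep the newest record whose key
// belongs to the given agent.
function latestForAgent(
  store: Map<string, SketchConversation>,
  agentId: string,
): SketchConversation | undefined {
  // The trailing ':' keeps agent "a1" from matching keys of agent "a10".
  const prefix = `${SKETCH_PREFIX}${agentId}:`
  let latest: SketchConversation | undefined
  for (const [key, conversation] of store) {
    if (!key.startsWith(prefix)) continue
    if (latest === undefined || conversation.updatedAt > latest.updatedAt) {
      latest = conversation
    }
  }
  return latest
}
```

The trailing separator in the prefix matters: without it, an agent ID that is a prefix of another agent ID would pick up the other agent's conversations.
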
--- a/packages/browseros-agent/apps/agent/lib/agent-conversations/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/agent-conversations/types.ts
@@ -0,0 +1,53 @@
+export interface AssistantTextPart {
+  kind: 'text'
+  text: string
+}
+
+export interface AssistantThinkingPart {
+  kind: 'thinking'
+  text: string
+  done: boolean
+}
+
+export interface ToolEntry {
+  id: string
+  name: string
+  status: 'running' | 'completed' | 'error'
+  durationMs?: number
+}
+
+export interface AssistantToolBatchPart {
+  kind: 'tool-batch'
+  tools: ToolEntry[]
+}
+
+export type AssistantPart =
+  | AssistantTextPart
+  | AssistantThinkingPart
+  | AssistantToolBatchPart
+
+export interface AgentConversationTurn {
+  id: string
+  userText: string
+  parts: AssistantPart[]
+  done: boolean
+  timestamp: number
+}
+
+export interface AgentConversation {
+  agentId: string
+  agentName: string
+  sessionKey: string
+  turns: AgentConversationTurn[]
+  createdAt: number
+  updatedAt: number
+}
+
+export interface AgentCardData {
+  agentId: string
+  name: string
+  model?: string
+  status: 'idle' | 'working' | 'error'
+  lastMessage?: string
+  lastMessageTimestamp?: number
+}
--- a/packages/browseros-agent/apps/agent/lib/conversations/conversationStorage.ts
+++ b/packages/browseros-agent/apps/agent/lib/conversations/conversationStorage.ts
@@ -2,6 +2,7 @@ import { storage } from '@wxt-dev/storage'
 import type { UIMessage } from 'ai'
 import { useEffect, useState } from 'react'
 import { useSessionInfo } from '../auth/sessionStorage'
+import { removeConversationExecutionHistory } from '../execution-history/storage'
 import { uploadConversationsToGraphql } from './uploadConversationsToGraphql'

 const MAX_CONVERSATIONS = 50
@@ -42,6 +43,7 @@ export function useConversations() {
  const removeConversation = async (id: string) => {
    const current = (await conversationStorage.getValue()) ?? []
    await conversationStorage.setValue(current.filter((c) => c.id !== id))
+    await removeConversationExecutionHistory(id)
  }

  const saveConversation = async (id: string, messages: UIMessage[]) => {
@@ -68,11 +70,16 @@ export function useConversations() {
        messages,
        lastMessagedAt: Date.now(),
      }
-      let updated = [newConversation, ...current]
-      if (updated.length > MAX_CONVERSATIONS) {
-        updated = updated.slice(0, MAX_CONVERSATIONS)
-      }
-      await conversationStorage.setValue(updated)
+      const nextConversations = [newConversation, ...current]
+      const removedConversations = nextConversations.slice(MAX_CONVERSATIONS)
+      await conversationStorage.setValue(
+        nextConversations.slice(0, MAX_CONVERSATIONS),
+      )
+      await Promise.all(
+        removedConversations.map((conversation) =>
+          removeConversationExecutionHistory(conversation.id),
+        ),
+      )
    }
  }

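
Note on the `saveConversation` change above: the list is capped at `MAX_CONVERSATIONS`, and the entries that fall off the end are now captured so their execution history can be removed as well. A minimal sketch of that split follows. `capNewestFirst` is a hypothetical helper name for illustration, and it assumes the list is already ordered newest first, as it is in the diff where the new conversation is prepended.

```typescript
// Split a newest-first list into the entries to keep and the
// entries evicted by the cap. Neither slice mutates the input.
function capNewestFirst<T>(
  items: T[],
  max: number,
): { kept: T[]; evicted: T[] } {
  return {
    kept: items.slice(0, max),
    evicted: items.slice(max),
  }
}

// Usage: prepend the new entry, then cap at three items.
const sketchResult = capNewestFirst(['new', 'c', 'b', 'a'], 3)
// sketchResult.kept is ['new', 'c', 'b'] and sketchResult.evicted is ['a']
```

Computing both slices from the same array before any write keeps the kept and evicted sets consistent with each other, which is what allows the follow-up cleanup to run per evicted ID.
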
--- a/packages/browseros-agent/apps/agent/lib/credits/browseros-id.ts
+++ b/packages/browseros-agent/apps/agent/lib/credits/browseros-id.ts
@@ -1,15 +0,0 @@
-import { getBrowserOSAdapter } from '@/lib/browseros/adapter'
-import { BROWSEROS_PREFS } from '@/lib/browseros/prefs'
-
-// TODO(credits-identity): temporary shim — reuses the BrowserOS metrics
-// install_id as the credits/referral identifier. Replace with a dedicated
-// identity module once we have one.
-export async function getBrowserosId(): Promise<string> {
-  const adapter = getBrowserOSAdapter()
-  const pref = await adapter.getPref(BROWSEROS_PREFS.INSTALL_ID)
-  const id = pref.value
-  if (typeof id !== 'string' || id.length === 0) {
-    throw new Error('browseros.metrics_install_id is not set')
-  }
-  return id
-}
--- a/packages/browseros-agent/apps/agent/lib/credits/useCredits.ts
+++ b/packages/browseros-agent/apps/agent/lib/credits/useCredits.ts
@@ -1,25 +1,20 @@
-import { EXTERNAL_URLS } from '@browseros/shared/constants/urls'
 import { useQuery, useQueryClient } from '@tanstack/react-query'
-import { getBrowserosId } from './browseros-id'
+import { getAgentServerUrl } from '@/lib/browseros/helpers'

 export interface CreditsInfo {
  credits: number
  dailyLimit: number
  lastResetAt?: string
-  browserosId?: string
 }

 const CREDITS_QUERY_KEY = ['credits']

 async function fetchCredits(): Promise<CreditsInfo> {
-  const browserosId = await getBrowserosId()
-  const response = await fetch(
-    `${EXTERNAL_URLS.CREDITS_GATEWAY}/credits/${browserosId}`,
-  )
+  const baseUrl = await getAgentServerUrl()
+  const response = await fetch(`${baseUrl}/credits`)
  if (!response.ok)
    throw new Error(`Failed to fetch credits: ${response.status}`)
-  const data = (await response.json()) as CreditsInfo
-  return { ...data, browserosId }
+  return response.json()
 }

 export function useCredits() {
--- a/packages/browseros-agent/apps/agent/lib/execution-history/normalize.test.ts
+++ b/packages/browseros-agent/apps/agent/lib/execution-history/normalize.test.ts
@@ -0,0 +1,161 @@
+import { describe, expect, it } from 'bun:test'
+import type { UIMessage } from 'ai'
+import {
+  getMessageText,
+  getResponsePreview,
+  normalizeExecutionSteps,
+} from './normalize'
+
+function asMessagePart(
+  part: Record<string, unknown>,
+): UIMessage['parts'][number] {
+  return part as unknown as UIMessage['parts'][number]
+}
+
+function createAssistantMessage(parts: UIMessage['parts']): UIMessage {
+  return {
+    id: 'assistant-1',
+    role: 'assistant',
+    parts,
+  } as UIMessage
+}
+
+describe('normalizeExecutionSteps', () => {
+  it('filters nudge tools from the execution history', () => {
+    const message = createAssistantMessage([
+      asMessagePart({ type: 'text', text: 'I checked that for you.' }),
+      asMessagePart({
+        type: 'tool-suggest_schedule',
+        toolCallId: 'nudge-1',
+        state: 'output-available',
+        input: { scheduleType: 'daily' },
+        output: { suggestedName: 'Morning briefing' },
+      }),
+      asMessagePart({
+        type: 'tool-open',
+        toolCallId: 'tool-1',
+        state: 'output-available',
+        input: { ref_id: 'page-1' },
+        output: { pageId: 1 },
+      }),
+    ])
+
+    const normalized = normalizeExecutionSteps({
+      assistantMessage: message,
+      nowIso: '2026-03-26T10:00:00.000Z',
+    })
+
+    expect(normalized.assistantMessageId).toBe('assistant-1')
+    expect(normalized.actionCount).toBe(1)
+    expect(normalized.steps).toHaveLength(1)
+    expect(normalized.steps[0]).toMatchObject({
+      id: 'tool-1',
+      toolName: 'open',
+      state: 'output-available',
+    })
+  })
+
+  it('preserves the original start time when a tool step reaches a terminal state', () => {
+    const initialTimestamp = '2026-03-26T10:00:00.000Z'
+    const completedTimestamp = '2026-03-26T10:00:04.000Z'
+
+    const running = normalizeExecutionSteps({
+      assistantMessage: createAssistantMessage([
+        asMessagePart({
+          type: 'tool-open',
+          toolCallId: 'tool-1',
+          state: 'input-available',
+          input: { ref_id: 'page-1' },
+        }),
+      ]),
+      nowIso: initialTimestamp,
+    })
+
+    const completed = normalizeExecutionSteps({
+      assistantMessage: createAssistantMessage([
+        asMessagePart({
+          type: 'tool-open',
+          toolCallId: 'tool-1',
+          state: 'output-available',
+          input: { ref_id: 'page-1' },
+          output: { title: 'Example Domain' },
+        }),
+      ]),
+      previousSteps: running.steps,
+      nowIso: completedTimestamp,
+    })
+
+    expect(completed.steps[0]?.startedAt).toBe(initialTimestamp)
+    expect(completed.steps[0]?.completedAt).toBe(completedTimestamp)
+  })
+
+  it('uses a compact preview for completed tool output', () => {
+    const normalized = normalizeExecutionSteps({
+      assistantMessage: createAssistantMessage([
+        asMessagePart({
+          type: 'tool-open',
+          toolCallId: 'tool-1',
+          state: 'output-available',
+          input: { ref_id: 'page-1' },
+          output: {
+            content: [
+              {
+                type: 'text',
+                text: 'Navigated to https://amazon.com. Additional context: page snapshot follows.',
+              },
+            ],
+          },
+        }),
+      ]),
+      nowIso: '2026-03-26T10:00:00.000Z',
+    })
+
+    expect(normalized.steps[0]?.previewText).toBe('Completed successfully')
+  })
+
+  it('surfaces ACL blocks as a compact issue label', () => {
+    const normalized = normalizeExecutionSteps({
+      assistantMessage: createAssistantMessage([
+        asMessagePart({
+          type: 'tool-click',
+          toolCallId: 'tool-1',
+          state: 'output-available',
+          input: { x: 10, y: 20 },
+          output: {
+            content: [
+              {
+                type: 'text',
+                text: "Action blocked by ACL rule: 'add to cart'. The element on this page is restricted.",
+              },
+            ],
+          },
+        }),
+      ]),
+      nowIso: '2026-03-26T10:00:00.000Z',
+    })
+
+    expect(normalized.steps[0]?.previewText).toBe('Blocked by ACL rule')
+  })
+})
+
+describe('execution history text helpers', () => {
+  it('joins text parts into a single response body', () => {
+    const text = getMessageText({
+      parts: [
+        asMessagePart({ type: 'text', text: 'First line' }),
+        asMessagePart({ type: 'text', text: 'Second line' }),
+      ],
+    })
+
+    expect(text).toBe('First line\n\nSecond line')
+  })
+
+  it('truncates long response previews', () => {
+    const preview = getResponsePreview({
+      parts: [asMessagePart({ type: 'text', text: 'a'.repeat(220) })],
+    })
+
+    expect(preview).toHaveLength(180)
+    expect(preview.endsWith('...')).toBe(true)
+  })
+})
--- a/packages/browseros-agent/apps/agent/lib/execution-history/normalize.ts
+++ b/packages/browseros-agent/apps/agent/lib/execution-history/normalize.ts
@@ -0,0 +1,215 @@
+import type { DynamicToolUIPart, ToolUIPart, UIMessage } from 'ai'
+import type {
+  ExecutionStepApproval,
+  ExecutionStepRecord,
+  ExecutionStepState,
+} from './types'
+
+const NUDGE_TOOL_NAMES = new Set(['suggest_schedule', 'suggest_app_connection'])
+const TERMINAL_STEP_STATES = new Set<ExecutionStepState>([
+  'output-available',
+  'output-error',
+  'output-denied',
+])
+const MAX_PREVIEW_CHARS = 180
+
+type ToolLikePart = ToolUIPart | DynamicToolUIPart
+
+function truncateText(value: string): string {
+  if (value.length <= MAX_PREVIEW_CHARS) return value
+  return `${value.slice(0, MAX_PREVIEW_CHARS - 3)}...`
+}
+
+function stringifyValue(value: unknown): string {
+  if (typeof value === 'string') return value
+  if (value == null) return ''
+  try {
+    return JSON.stringify(value)
+  } catch {
+    return String(value)
+  }
+}
+
+function normalizeText(value: string): string {
+  return value.replace(/\s+/g, ' ').trim()
+}
+
+function getNestedText(value: unknown, depth = 0): string | undefined {
+  if (depth > 5 || value == null) return undefined
+
+  if (typeof value === 'string') {
+    const text = normalizeText(value)
+    return text || undefined
+  }
+
+  if (Array.isArray(value)) {
+    for (const item of value) {
+      const text = getNestedText(item, depth + 1)
+      if (text) return text
+    }
+    return undefined
+  }
+
+  if (typeof value !== 'object') return undefined
+
+  const record = value as Record<string, unknown>
+  for (const key of ['text', 'message', 'reason', 'content']) {
+    const text = getNestedText(record[key], depth + 1)
+    if (text) return text
+  }
+
+  for (const nestedValue of Object.values(record)) {
+    const text = getNestedText(nestedValue, depth + 1)
+    if (text) return text
+  }
+
+  return undefined
+}
+
+function getCompactIssueLabel(value?: string): string | undefined {
+  if (!value) return undefined
+
+  if (value.includes('Action blocked by ACL rule')) {
+    return 'Blocked by ACL rule'
+  }
+
+  return undefined
+}
+
+function getToolName(part: ToolLikePart): string {
+  if (part.type === 'dynamic-tool') {
+    return part.toolName
+  }
+
+  return part.type.replace('tool-', '')
+}
+
+function isToolPart(part: UIMessage['parts'][number]): part is ToolLikePart {
+  return part.type === 'dynamic-tool' || part.type.startsWith('tool-')
+}
+
+function isExecutionToolPart(
+  part: UIMessage['parts'][number],
+): part is ToolLikePart {
+  if (!isToolPart(part)) return false
+  return !NUDGE_TOOL_NAMES.has(getToolName(part))
+}
+
+function getPreviewText(part: ToolLikePart): string {
+  if (part.state === 'approval-requested') {
+    return 'Waiting for approval'
+  }
+
+  if (part.state === 'approval-responded') {
+    return part.approval?.approved === false
+      ? 'Approval rejected'
+      : 'Approval granted'
+  }
+
+  if (part.state === 'output-denied') {
+    return getCompactIssueLabel(part.approval?.reason) ?? 'Action denied'
+  }
+
+  if (part.state === 'output-error') {
+    return getCompactIssueLabel(part.errorText) ?? 'Action failed'
+  }
+
+  if (part.state === 'output-available') {
+    const preview =
+      getCompactIssueLabel(getNestedText(part.output)) ??
+      getCompactIssueLabel(stringifyValue(part.output))
+    return preview ?? 'Completed successfully'
+  }
+
+  if (part.state === 'input-available') {
+    return 'Action running'
+  }
+
+  return 'Preparing action'
+}
+
+function getApproval(part: ToolLikePart): ExecutionStepApproval | undefined {
+  return part.approval
+    ? {
+        id: part.approval.id,
+        approved: part.approval.approved,
+        reason: part.approval.reason,
+      }
+    : undefined
+}
+
+function getCompletedAt(
+  existingStep: ExecutionStepRecord | undefined,
+  state: ExecutionStepState,
+  nowIso: string,
+): string | undefined {
+  if (existingStep?.completedAt) return existingStep.completedAt
+  if (!TERMINAL_STEP_STATES.has(state)) return undefined
+  return nowIso
+}
+
+function createStepRecord(
+  part: ToolLikePart,
+  order: number,
+  nowIso: string,
+  existingStep?: ExecutionStepRecord,
+): ExecutionStepRecord {
+  const state = part.state as ExecutionStepState
+  return {
+    id: part.toolCallId,
+    toolName: getToolName(part),
+    order,
+    state,
+    startedAt: existingStep?.startedAt ?? nowIso,
+    completedAt: getCompletedAt(existingStep, state, nowIso),
+    input: part.input,
+    output: 'output' in part ? part.output : undefined,
+    errorText: 'errorText' in part ? part.errorText : undefined,
+    previewText: getPreviewText(part),
+    approval: getApproval(part),
+  }
+}
+
+export function getMessageText(
+  message?: Pick<UIMessage, 'parts'> | null,
+): string {
+  if (!message) return ''
+
+  return message.parts
+    .filter((part) => part.type === 'text')
+    .map((part) => part.text)
+    .join('\n\n')
+    .trim()
+}
+
+export function getResponsePreview(message?: Pick<UIMessage, 'parts'> | null) {
+  return truncateText(getMessageText(message))
+}
+
+export function normalizeExecutionSteps(args: {
+  assistantMessage?: UIMessage | null
+  previousSteps?: ExecutionStepRecord[]
+  nowIso: string
+}) {
+  const { assistantMessage, previousSteps = [], nowIso } = args
+  const previousStepsById = new Map(
+    previousSteps.map((step) => [step.id, step]),
+  )
+
+  const steps = assistantMessage
+    ? assistantMessage.parts.flatMap((part, index) => {
+        if (!isExecutionToolPart(part)) return []
+        const existingStep = previousStepsById.get(part.toolCallId)
+        return [createStepRecord(part, index, nowIso, existingStep)]
+      })
+    : []
+
+  return {
+    assistantMessageId: assistantMessage?.id,
+    steps,
+    actionCount: steps.length,
+    approvalCount: steps.filter((step) => step.approval).length,
+    deniedCount: steps.filter((step) => step.state === 'output-denied').length,
+    errorCount: steps.filter((step) => step.state === 'output-error').length,
+  }
+}
--- a/packages/browseros-agent/apps/agent/lib/execution-history/storage.ts
+++ b/packages/browseros-agent/apps/agent/lib/execution-history/storage.ts
@@ -0,0 +1,146 @@
+import { storage } from '@wxt-dev/storage'
+import { useEffect, useState } from 'react'
+import type {
+  ConversationExecutionHistory,
+  ExecutionHistoryByConversation,
+  ExecutionTaskRecord,
+} from './types'
+
+export const executionHistoryStorage =
+  storage.defineItem<ExecutionHistoryByConversation>(
+    'local:executionHistoryByConversation',
+    {
+      fallback: {},
+      version: 1,
+    },
+  )
+
+function upsertTaskInHistory(
+  history: ConversationExecutionHistory,
+  task: ExecutionTaskRecord,
+): ConversationExecutionHistory {
+  const existingIndex = history.tasks.findIndex((item) => item.id === task.id)
+  if (existingIndex === -1) {
+    return {
+      ...history,
+      updatedAt: Date.now(),
+      tasks: [...history.tasks, task],
+    }
+  }
+
+  const nextTasks = [...history.tasks]
+  nextTasks[existingIndex] = task
+  return {
+    ...history,
+    updatedAt: Date.now(),
+    tasks: nextTasks,
+  }
+}
+
+function createConversationHistory(
+  conversationId: string,
+): ConversationExecutionHistory {
+  return {
+    conversationId,
+    updatedAt: Date.now(),
+    tasks: [],
+  }
+}
+
+export async function upsertConversationExecutionTask(
+  task: ExecutionTaskRecord,
+): Promise<void> {
+  const current = (await executionHistoryStorage.getValue()) ?? {}
+  const history =
+    current[task.conversationId] ??
+    createConversationHistory(task.conversationId)
+
+  await executionHistoryStorage.setValue({
+    ...current,
+    [task.conversationId]: upsertTaskInHistory(history, task),
+  })
+}
+
+export async function getConversationExecutionHistory(
+  conversationId: string,
+): Promise<ConversationExecutionHistory | null> {
+  const current = (await executionHistoryStorage.getValue()) ?? {}
+  return current[conversationId] ?? null
+}
+
+export async function getExecutionHistoryByConversation(): Promise<ExecutionHistoryByConversation> {
+  return (await executionHistoryStorage.getValue()) ?? {}
+}
+
+export async function removeConversationExecutionHistory(
+  conversationId: string,
+): Promise<void> {
+  const current = (await executionHistoryStorage.getValue()) ?? {}
+  if (!(conversationId in current)) return
+
+  const { [conversationId]: _removed, ...rest } = current
+  await executionHistoryStorage.setValue(rest)
+}
+
+export async function removeConversationExecutionTask(args: {
+  conversationId: string
+  taskId: string
+}): Promise<void> {
+  const current = (await executionHistoryStorage.getValue()) ?? {}
+  const history = current[args.conversationId]
+  if (!history) return
+
+  const nextTasks = history.tasks.filter((task) => task.id !== args.taskId)
+  if (nextTasks.length === history.tasks.length) return
+
+  if (nextTasks.length === 0) {
+    const { [args.conversationId]: _removed, ...rest } = current
+    await executionHistoryStorage.setValue(rest)
+    return
+  }
+
+  await executionHistoryStorage.setValue({
+    ...current,
+    [args.conversationId]: {
+      ...history,
+      updatedAt: Date.now(),
+      tasks: nextTasks,
+    },
+  })
+}
+
+export function useConversationExecutionHistory(conversationId?: string) {
+  const [history, setHistory] = useState<ConversationExecutionHistory | null>(
+    null,
+  )
+
+  useEffect(() => {
+    if (!conversationId) {
+      setHistory(null)
+      return
+    }
+
+    getConversationExecutionHistory(conversationId).then(setHistory)
+    const unwatch = executionHistoryStorage.watch((nextValue) => {
+      setHistory(nextValue?.[conversationId] ?? null)
+    })
+    return () => unwatch()
+  }, [conversationId])
+
+  return history
+}
+
+export function useExecutionHistoryByConversation() {
+  const [historyByConversation, setHistoryByConversation] =
+    useState<ExecutionHistoryByConversation>({})
+
+  useEffect(() => {
+    getExecutionHistoryByConversation().then(setHistoryByConversation)
+    const unwatch = executionHistoryStorage.watch((nextValue) => {
+      setHistoryByConversation(nextValue ?? {})
+    })
+    return () => unwatch()
+  }, [])
+
+  return historyByConversation
+}
--- a/packages/browseros-agent/apps/agent/lib/execution-history/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/execution-history/types.ts
@@ -0,0 +1,64 @@
+export type ExecutionTaskStatus =
+  | 'running'
+  | 'completed'
+  | 'stopped'
+  | 'failed'
+  | 'interrupted'
+
+export type ExecutionStepState =
+  | 'input-streaming'
+  | 'input-available'
+  | 'approval-requested'
+  | 'approval-responded'
+  | 'output-available'
+  | 'output-error'
+  | 'output-denied'
+
+export interface ExecutionStepApproval {
+  id: string
+  approved?: boolean
+  reason?: string
+}
+
+export interface ExecutionStepRecord {
+  id: string
+  toolName: string
+  order: number
+  state: ExecutionStepState
+  startedAt: string
+  completedAt?: string
+  input?: unknown
+  output?: unknown
+  errorText?: string
+  previewText: string
+  approval?: ExecutionStepApproval
+}
+
+export interface ExecutionTaskRecord {
+  id: string
+  conversationId: string
+  promptText: string
+  promptMessageId?: string
+  assistantMessageId?: string
+  startedAt: string
+  completedAt?: string
+  status: ExecutionTaskStatus
+  responseText?: string
+  responsePreview?: string
+  actionCount: number
+  approvalCount: number
+  deniedCount: number
+  errorCount: number
+  steps: ExecutionStepRecord[]
+}
+
+export interface ConversationExecutionHistory {
+  conversationId: string
+  updatedAt: number
+  tasks: ExecutionTaskRecord[]
+}
+
+export type ExecutionHistoryByConversation = Record<
+  string,
+  ConversationExecutionHistory
+>
--- a/packages/browseros-agent/apps/agent/lib/llm-providers/models-dev-data.json
+++ b/packages/browseros-agent/apps/agent/lib/llm-providers/models-dev-data.json
@@ -5402,89 +5402,5 @@
        "outputCost": 0
      }
    ]
-  },
-  "minimax": {
-    "name": "MiniMax",
-    "api": "https://api.minimaxi.com/v1",
-    "doc": "https://platform.minimax.io",
-    "models": [
-      {
-        "id": "MiniMax-M2.7",
-        "name": "MiniMax M2.7",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.3,
-        "outputCost": 1.2
-      },
-      {
-        "id": "MiniMax-M2.7-highspeed",
-        "name": "MiniMax M2.7 Highspeed",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.6,
-        "outputCost": 2.4
-      },
-      {
-        "id": "MiniMax-M2.5",
-        "name": "MiniMax M2.5",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.3,
-        "outputCost": 1.2
-      },
-      {
-        "id": "MiniMax-M2.5-highspeed",
-        "name": "MiniMax M2.5 Highspeed",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.6,
-        "outputCost": 2.4
-      },
-      {
-        "id": "MiniMax-M2.1",
-        "name": "MiniMax M2.1",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.3,
-        "outputCost": 1.2
-      },
-      {
-        "id": "MiniMax-M2.1-highspeed",
-        "name": "MiniMax M2.1 Highspeed",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": true,
-        "supportsToolCall": true,
-        "inputCost": 0.6,
-        "outputCost": 2.4
-      },
-      {
-        "id": "M2-her",
-        "name": "M2-her",
-        "contextWindow": 204800,
-        "maxOutput": 8192,
-        "supportsImages": false,
-        "supportsReasoning": false,
-        "supportsToolCall": true,
-        "inputCost": 0.3,
-        "outputCost": 1.2
-      }
-    ]
  }
 }
--- a/packages/browseros-agent/apps/agent/lib/llm-providers/providerIcons.tsx
+++ b/packages/browseros-agent/apps/agent/lib/llm-providers/providerIcons.tsx
@@ -5,7 +5,6 @@ import {
  Gemini,
  Kimi,
  LmStudio,
-  Minimax,
  Ollama,
  OpenAI,
  OpenRouter,
@@ -37,7 +36,6 @@ const providerIconMap: Record<ProviderType, IconComponent | null> = {
  'chatgpt-pro': OpenAI,
  'github-copilot': Github,
  'qwen-code': Qwen,
-  minimax: Minimax,
 }

 interface ProviderIconProps {
--- a/packages/browseros-agent/apps/agent/lib/llm-providers/providerTemplates.ts
+++ b/packages/browseros-agent/apps/agent/lib/llm-providers/providerTemplates.ts
@@ -140,31 +140,8 @@ export const providerTemplates: ProviderTemplate[] = [
    setupGuideUrl:
      'https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html',
  }),
-  enrichTemplate('minimax', {
-    defaultModelId: 'MiniMax-M2.7',
-    apiKeyUrl:
-      'https://platform.minimax.io/user-center/basic-information/interface-key',
-    setupGuideUrl: 'https://platform.minimax.io/docs/guides/models-intro',
-  }),
 ]

-export const MINIMAX_REGIONS = {
-  chinese: {
-    api: 'https://api.minimaxi.com/v1',
-    apiKeyUrl:
-      'https://platform.minimaxi.com/user-center/basic-information/interface-key',
-    setupGuideUrl: 'https://platform.minimaxi.com/document',
-  },
-  international: {
-    api: 'https://api.minimax.io/v1',
-    apiKeyUrl:
-      'https://platform.minimax.io/user-center/basic-information/interface-key',
-    setupGuideUrl: 'https://platform.minimax.io/docs/guides/models-intro',
-  },
-} as const
-
-export type MinimaxRegion = keyof typeof MINIMAX_REGIONS
-
 /**
 * Provider type options for select dropdowns
 * @public
@@ -184,7 +161,6 @@ export const providerTypeOptions: { value: ProviderType; label: string }[] = [
  { value: 'lmstudio', label: 'LM Studio' },
  { value: 'bedrock', label: 'AWS Bedrock' },
  { value: 'browseros', label: 'BrowserOS' },
-  { value: 'minimax', label: 'MiniMax' },
 ]

 /**
@@ -216,7 +192,6 @@ export const DEFAULT_BASE_URLS: Record<ProviderType, string> = {
  lmstudio: 'http://localhost:1234/v1',
  bedrock: '',
  browseros: '',
-  minimax: MINIMAX_REGIONS.chinese.api,
 }

 /**
--- a/packages/browseros-agent/apps/agent/lib/llm-providers/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/llm-providers/types.ts
@@ -17,7 +17,6 @@ export type ProviderType =
  | 'chatgpt-pro'
  | 'github-copilot'
  | 'qwen-code'
-  | 'minimax'

 /**
 * LLM Provider configuration
--- a/packages/browseros-agent/apps/agent/lib/messaging/server/buildChatRequestBody.test.ts
+++ b/packages/browseros-agent/apps/agent/lib/messaging/server/buildChatRequestBody.test.ts
@@ -0,0 +1,84 @@
+import { describe, expect, it } from 'bun:test'
+import type { LlmProviderConfig } from '@/lib/llm-providers/types'
+import type { ToolApprovalConfig } from '@/lib/tool-approvals/types'
+import { buildChatRequestBody } from './buildChatRequestBody'
+
+const provider: LlmProviderConfig = {
+  id: 'browseros',
+  type: 'browseros',
+  name: 'BrowserOS',
+  modelId: 'browseros-auto',
+  supportsImages: true,
+  contextWindow: 200000,
+  temperature: 0,
+  createdAt: 0,
+  updatedAt: 0,
+}
+
+describe('buildChatRequestBody', () => {
+  it('preserves approval config and browser context on approval resumes', () => {
+    const toolApprovalConfig: ToolApprovalConfig = {
+      categories: {
+        input: true,
+        navigation: true,
+        observation: true,
+        screenshots: true,
+        scripts: true,
+        'data-modification': true,
+        assistant: true,
+      },
+    }
+
+    const body = buildChatRequestBody({
+      conversationId: '6ff46e3b-e45a-40a4-9157-ca520e800f43',
+      provider,
+      mode: 'agent',
+      browserContext: {
+        windowId: 2,
+        activeTab: {
+          id: 10,
+          url: 'https://amazon.com',
+          title: 'Amazon',
+        },
+        enabledMcpServers: ['slack'],
+      },
+      userSystemPrompt: 'Stay in the current tab.',
+      toolApprovalConfig,
+      toolApprovalResponses: [
+        {
+          approvalId: 'approval-1',
+          approved: true,
+        },
+      ],
+    })
+
+    expect(body.toolApprovalConfig).toEqual(toolApprovalConfig)
+    expect(body.browserContext).toEqual({
+      windowId: 2,
+      activeTab: {
+        id: 10,
+        url: 'https://amazon.com',
+        title: 'Amazon',
+      },
+      enabledMcpServers: ['slack'],
+    })
+    expect(body.toolApprovalResponses).toEqual([
+      {
+        approvalId: 'approval-1',
+        approved: true,
+      },
+    ])
+  })
+
+  it('omits empty approval configs from requests', () => {
+    const body = buildChatRequestBody({
+      conversationId: '6ff46e3b-e45a-40a4-9157-ca520e800f43',
+      provider,
+      toolApprovalConfig: {
+        categories: {},
+      },
+    })
+
+    expect(body.toolApprovalConfig).toBeUndefined()
+  })
+})
--- a/packages/browseros-agent/apps/agent/lib/messaging/server/buildChatRequestBody.ts
+++ b/packages/browseros-agent/apps/agent/lib/messaging/server/buildChatRequestBody.ts
@@ -0,0 +1,115 @@
+import type { AclRule } from '@browseros/shared/types/acl'
+import type { ChatMode } from '@/entrypoints/sidepanel/index/chatTypes'
+import type { LlmProviderConfig } from '@/lib/llm-providers/types'
+import type { ToolApprovalConfig } from '@/lib/tool-approvals/types'
+
+export interface ApprovalResponseData {
+  approvalId: string
+  approved: boolean
+  reason?: string
+}
+
+export interface ChatHistoryEntry {
+  role: 'user' | 'assistant'
+  content: string
+}
+
+export interface ChatRequestBrowserContext {
+  windowId?: number
+  activeTab?: {
+    id?: number
+    url?: string
+    title?: string
+  }
+  selectedTabs?: {
+    id?: number
+    url?: string
+    title?: string
+  }[]
+  enabledMcpServers?: string[]
+  customMcpServers?: {
+    name: string
+    url?: string
+  }[]
+}
+
+interface ChatRequestBodyParams {
+  conversationId: string
+  provider: LlmProviderConfig
+  message?: string
+  mode?: ChatMode
+  browserContext?: ChatRequestBrowserContext
+  userSystemPrompt?: string
+  userWorkingDir?: string
+  supportsImages?: boolean
+  previousConversation?: ChatHistoryEntry[] | string
+  declinedApps?: string[]
+  aclRules?: AclRule[]
+  selectedText?: string
+  selectedTextSource?: {
+    url: string
+    title: string
+  }
+  toolApprovalConfig?: ToolApprovalConfig
+  toolApprovalResponses?: ApprovalResponseData[]
+  isScheduledTask?: boolean
+}
+
+export const toRequestToolApprovalConfig = (
+  approvalConfig?: ToolApprovalConfig,
+): ToolApprovalConfig | undefined => {
+  if (!approvalConfig) return undefined
+  return Object.values(approvalConfig.categories).some(Boolean)
+    ? approvalConfig
+    : undefined
+}
+
+export const buildChatRequestBody = ({
+  conversationId,
+  provider,
+  message = '',
+  mode,
+  browserContext,
+  userSystemPrompt,
+  userWorkingDir,
+  supportsImages,
+  previousConversation,
+  declinedApps,
+  aclRules,
+  selectedText,
+  selectedTextSource,
+  toolApprovalConfig,
+  toolApprovalResponses,
+  isScheduledTask,
+}: ChatRequestBodyParams) => ({
+  message,
+  provider: provider.type,
+  providerType: provider.type,
+  providerName: provider.name,
+  apiKey: provider.apiKey,
+  baseUrl: provider.baseUrl,
+  conversationId,
+  model: provider.modelId ?? 'default',
+  mode,
+  contextWindowSize: provider.contextWindow,
+  temperature: provider.temperature,
+  resourceName: provider.resourceName,
+  accessKeyId: provider.accessKeyId,
+  secretAccessKey: provider.secretAccessKey,
+  region: provider.region,
+  sessionToken: provider.sessionToken,
+  reasoningEffort: provider.reasoningEffort,
+  reasoningSummary: provider.reasoningSummary,
+  browserContext,
+  userSystemPrompt,
+  userWorkingDir,
+  supportsImages: supportsImages ?? provider.supportsImages,
+  previousConversation,
+  declinedApps: declinedApps?.length ? declinedApps : undefined,
+  aclRules: aclRules?.length ? aclRules : undefined,
+  selectedText,
+  selectedTextSource,
+  toolApprovalConfig: toRequestToolApprovalConfig(toolApprovalConfig),
+  toolApprovalResponses,
+  isScheduledTask,
+})
--- a/packages/browseros-agent/apps/agent/lib/referral/submit-referral.ts
+++ b/packages/browseros-agent/apps/agent/lib/referral/submit-referral.ts
@@ -1,108 +0,0 @@
-import { EXTERNAL_URLS } from '@browseros/shared/constants/urls'
-
-interface ReferralResult {
-  success: boolean
-  creditsAdded?: number
-  reason?: string
-}
-
-export async function submitReferral(
-  tweetUrl: string,
-  browserosId: string,
-): Promise<ReferralResult> {
-  const response = await fetch(
-    `${EXTERNAL_URLS.REFERRAL_SERVICE}/referral/submit`,
-    {
-      method: 'POST',
-      headers: { 'Content-Type': 'application/json' },
-      body: JSON.stringify({ tweetUrl, browserosId }),
-    },
-  )
-  if (!response.ok) {
-    return {
-      success: false,
-      reason: `Request failed with status ${response.status}`,
-    }
-  }
-  return response.json()
-}
-
-const TWEET_VARIATIONS = [
-  `ngl @browseros_ai is kinda wild
-
-just type what u want in plain english and it handles the annoying web shit
-
-forms, research, data pulls... all automated
-
-actually works`,
-
-  `been using @browseros_ai to chat with webpages lately
-
-summarize articles, pull data, translate stuff
-
-all happens in the same tab
-
-no copy/paste, no switching windows
-
-just ask and it does it`,
-
-  `wake up to @browseros_ai having already read ur emails and calendar while u were sleeping
-
-scheduled agents are lowkey magic`,
-
-  `ngl @browseros_ai is kinda crazy
-
-connects gmail, slack, linear, notion + 40 other apps into one ai assistant
-
-just talk to it in plain english and it handles cross-app workflows for u
-
-no more switching between tabs like a psycho`,
-
-  `i use @browseros_ai to automate research
-
-it handles the browser work and drops reports straight into local folders
-
-no switching between tools or manually saving files
-
-just one task instead of three`,
-
-  `been messing with @browseros_ai lately
-
-it comes with a prebuilt MCP server and I connect it claude code or codex and it just runs things for you
-
-set it up once, use it whenever
-
-way better than clicking through the same shit manually every time`,
-
-  `the ai actually remembers what we talked about yesterday
-
-no more "here's the context again" every single conversation
-
-@browseros_ai just picks up where we left off
-
-feels like talking to someone who actually pays attention`,
-
-  `i built a skill library for my ai agent
-
-now when i need it to do something specific, i just load the recipe i made earlier
-
-@browseros_ai MCP is very handy`,
-
-  `been running @browseros_ai with ollama locally
-
-everything stays on my machine, nothing gets sent out
-
-kinda nice not having to think about what data i'm sharing`,
-
-  `switched to @browseros_ai from chrome
-
-blocks 10x more ads and runs full ublock origin (not the lite version)
-
-check it out`,
-]
-
-export function getShareOnTwitterUrl(): string {
-  const text =
-    TWEET_VARIATIONS[Math.floor(Math.random() * TWEET_VARIATIONS.length)]
-  return `https://x.com/intent/tweet?text=${encodeURIComponent(text)}`
-}
--- a/packages/browseros-agent/apps/agent/lib/schedules/getChatServerResponse.ts
+++ b/packages/browseros-agent/apps/agent/lib/schedules/getChatServerResponse.ts
@@ -8,6 +8,7 @@ import {
 } from '@/lib/llm-providers/storage'
 import type { LlmProviderConfig } from '@/lib/llm-providers/types'
 import { mcpServerStorage } from '@/lib/mcp/mcpServerStorage'
+import { buildChatRequestBody } from '@/lib/messaging/server/buildChatRequestBody'
 import { personalizationStorage } from '../personalization/personalizationStorage'
 import { scheduleSystemPrompt } from './scheduleSystemPrompt'
 import type { ToolCallExecution } from './scheduleTypes'
@@ -112,42 +113,31 @@ export async function getChatServerResponse(
    headers: {
      'Content-Type': 'application/json',
    },
-    // Important: this chat logic is also used in apps/agent/entrypoints/sidepanel/index/useChatSession.ts for sidepanel conversation. Make sure to keep them in sync for any future changes.
    body: JSON.stringify({
      messages: [{ role: 'user', content: request.message }],
-      message: request.message,
-      provider: provider?.type,
-      providerType: provider?.type,
-      providerName: provider?.name,
-      apiKey: provider?.apiKey,
-      baseUrl: provider?.baseUrl,
-      conversationId,
-      model: provider?.modelId ?? 'default',
-      mode: request.mode ?? 'agent',
-      contextWindowSize: provider?.contextWindow,
-      temperature: provider?.temperature,
-      resourceName: provider?.resourceName,
-      accessKeyId: provider?.accessKeyId,
-      secretAccessKey: provider?.secretAccessKey,
-      region: provider?.region,
-      sessionToken: provider?.sessionToken,
-      browserContext:
-        request.activeTab ||
-        request.windowId ||
-        enabledMcpServers.length ||
-        customMcpServers.length
-          ? {
-              windowId: request.windowId,
-              activeTab: request.activeTab,
-              enabledMcpServers:
-                enabledMcpServers.length > 0 ? enabledMcpServers : undefined,
-              customMcpServers:
-                customMcpServers.length > 0 ? customMcpServers : undefined,
-            }
-          : undefined,
-      userSystemPrompt: `${personalization}\n${scheduleSystemPrompt}`,
-      isScheduledTask: true,
-      supportsImages: provider?.supportsImages,
+      ...buildChatRequestBody({
+        message: request.message,
+        conversationId,
+        provider,
+        mode: request.mode ?? 'agent',
+        browserContext:
+          request.activeTab ||
+          request.windowId ||
+          enabledMcpServers.length ||
+          customMcpServers.length
+            ? {
+                windowId: request.windowId,
+                activeTab: request.activeTab,
+                enabledMcpServers:
+                  enabledMcpServers.length > 0 ? enabledMcpServers : undefined,
+                customMcpServers:
+                  customMcpServers.length > 0 ? customMcpServers : undefined,
+              }
+            : undefined,
+        userSystemPrompt: `${personalization}\n${scheduleSystemPrompt}`,
+        supportsImages: provider.supportsImages,
+        isScheduledTask: true,
+      }),
    }),
  })

--- a/packages/browseros-agent/apps/agent/lib/sse.ts
+++ b/packages/browseros-agent/apps/agent/lib/sse.ts
@@ -0,0 +1,71 @@
+function isAbortError(error: unknown): boolean {
+  return error instanceof DOMException && error.name === 'AbortError'
+}
+
+export function parseSSELines<T>(buffer: string): {
+  events: T[]
+  remainder: string
+} {
+  const lines = buffer.split('\n')
+  const remainder = lines.pop() ?? ''
+  const events: T[] = []
+
+  for (const line of lines) {
+    if (!line.startsWith('data: ')) continue
+    const payload = line.slice(6)
+    if (payload === '[DONE]') continue
+    try {
+      events.push(JSON.parse(payload) as T)
+    } catch {} // Skip malformed JSON payloads instead of aborting the stream.
+  }
+
+  return { events, remainder }
+}
+
+export async function consumeSSEStream<T>(
+  response: Response,
+  onEvent: (event: T) => void,
+  signal?: AbortSignal,
+): Promise<void> {
+  const reader = response.body?.getReader()
+  if (!reader) return
+
+  const decoder = new TextDecoder()
+  let buffer = ''
+
+  const abortReader = () => {
+    void reader.cancel()
+  }
+
+  signal?.addEventListener('abort', abortReader, { once: true })
+
+  try {
+    while (true) {
+      const { done, value } = await reader.read()
+      if (done) break
+
+      buffer += decoder.decode(value, { stream: true })
+      const { events, remainder } = parseSSELines<T>(buffer)
+      buffer = remainder
+
+      for (const event of events) {
+        onEvent(event)
+      }
+    }
+  } catch (error) {
+    if (signal?.aborted || isAbortError(error)) return
+    throw error
+  } finally {
+    signal?.removeEventListener('abort', abortReader)
+    const trailing = decoder.decode()
+    if (trailing) {
+      buffer += trailing
+    }
+    if (buffer) {
+      const { events } = parseSSELines<T>(`${buffer}\n`)
+      for (const event of events) {
+        onEvent(event)
+      }
+    }
+  }
+}
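Review note on `parseSSELines`: the sketch below shows how the `remainder` value carries a partial line from one chunk to the next. It uses a local copy of the function so it runs on its own; the `Delta` event shape and the sample chunks are hypothetical, chosen only for illustration.

```typescript
// Local copy of parseSSELines from lib/sse.ts so the sketch runs standalone.
function parseSSELines<T>(buffer: string): { events: T[]; remainder: string } {
  const lines = buffer.split('\n')
  const remainder = lines.pop() ?? ''
  const events: T[] = []

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue
    const payload = line.slice(6)
    if (payload === '[DONE]') continue
    try {
      events.push(JSON.parse(payload) as T)
    } catch {
      // Malformed payloads are skipped, as in the original.
    }
  }

  return { events, remainder }
}

// Hypothetical event shape, for illustration only.
type Delta = { text: string }

// Two chunks that split one event in the middle of its JSON payload.
// The last line has no terminating newline.
const chunks = [
  'data: {"text":"he',
  'llo"}\ndata: [DONE]\ndata: {"text":"tail"}',
]

const received: Delta[] = []
let pendingBuffer = ''

for (const chunk of chunks) {
  pendingBuffer += chunk
  const { events, remainder } = parseSSELines<Delta>(pendingBuffer)
  pendingBuffer = remainder
  received.push(...events)
}

console.log(received) // the single completed event
console.log(pendingBuffer) // the unterminated last line, still buffered
```

The split event is reassembled once its newline arrives, and the `[DONE]` sentinel is dropped. The last line stays in the buffer because no newline follows it, so a caller that wants that event has to terminate the line before parsing it again.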
--- a/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-helpers.ts
+++ b/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-helpers.ts
@@ -0,0 +1,84 @@
+import type { UIMessage } from 'ai'
+import type { ApprovalResponse, PendingApproval } from './approval-sync-storage'
+
+export function extractPendingApprovals(
+  messages: UIMessage[],
+  conversationId: string,
+  timestamp = Date.now(),
+): PendingApproval[] {
+  const pending: PendingApproval[] = []
+
+  for (const msg of messages) {
+    for (const part of msg.parts) {
+      const toolPart = part as {
+        type?: string
+        state?: string
+        toolCallId?: string
+        input?: Record<string, unknown>
+        approval?: { id: string }
+      }
+
+      if (
+        toolPart.state === 'approval-requested' &&
+        toolPart.approval?.id &&
+        toolPart.toolCallId
+      ) {
+        pending.push({
+          approvalId: toolPart.approval.id,
+          toolCallId: toolPart.toolCallId,
+          toolName: (toolPart.type ?? '').replace('tool-', ''),
+          input: toolPart.input ?? {},
+          conversationId,
+          timestamp,
+        })
+      }
+    }
+  }
+
+  return pending
+}
+
+export function replacePendingApprovalsForConversation(
+  existing: PendingApproval[],
+  conversationId: string,
+  next: PendingApproval[],
+): PendingApproval[] {
+  const existingByApprovalId = new Map(
+    existing.map((item) => [item.approvalId, item]),
+  )
+  const preserved = next.map((item) => {
+    const current = existingByApprovalId.get(item.approvalId)
+    return current ? { ...item, timestamp: current.timestamp } : item
+  })
+
+  return [
+    ...existing.filter((item) => item.conversationId !== conversationId),
+    ...preserved,
+  ]
+}
+
+export function queueApprovalResponse(
+  existing: ApprovalResponse[],
+  response: ApprovalResponse,
+): ApprovalResponse[] {
+  return [
+    ...existing.filter((item) => item.approvalId !== response.approvalId),
+    response,
+  ]
+}
+
+export function removePendingApprovalsById(
+  existing: PendingApproval[],
+  approvalIds: string[],
+): PendingApproval[] {
+  const ids = new Set(approvalIds)
+  return existing.filter((item) => !ids.has(item.approvalId))
+}
+
+export function removeApprovalResponsesById(
+  existing: ApprovalResponse[],
+  approvalIds: string[],
+): ApprovalResponse[] {
+  const ids = new Set(approvalIds)
+  return existing.filter((item) => !ids.has(item.approvalId))
+}
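Review note on `replacePendingApprovalsForConversation`: the test file that follows checks that other conversations are left alone, but not that an approval already in storage keeps its original timestamp. A minimal sketch of that behavior, using local copies of the interface and the function; the sample approvals are made up for illustration.

```typescript
// Local copy of PendingApproval from approval-sync-storage.ts.
interface PendingApproval {
  approvalId: string
  toolCallId: string
  toolName: string
  input: Record<string, unknown>
  conversationId: string
  timestamp: number
}

// Local copy of the helper so the sketch runs standalone.
function replacePendingApprovalsForConversation(
  existing: PendingApproval[],
  conversationId: string,
  next: PendingApproval[],
): PendingApproval[] {
  const existingByApprovalId = new Map(
    existing.map((item) => [item.approvalId, item]),
  )
  const preserved = next.map((item) => {
    const current = existingByApprovalId.get(item.approvalId)
    return current ? { ...item, timestamp: current.timestamp } : item
  })

  return [
    ...existing.filter((item) => item.conversationId !== conversationId),
    ...preserved,
  ]
}

// Hypothetical stored state: one approval per conversation.
const approvalA: PendingApproval = {
  approvalId: 'approval-a',
  toolCallId: 'tool-a',
  toolName: 'click',
  input: {},
  conversationId: 'conversation-a',
  timestamp: 1,
}
const approvalB: PendingApproval = {
  ...approvalA,
  approvalId: 'approval-b',
  toolCallId: 'tool-b',
  conversationId: 'conversation-b',
  timestamp: 2,
}

// A later sync re-extracts approval-a with a fresh timestamp and adds approval-c.
const nextForA: PendingApproval[] = [
  { ...approvalA, timestamp: 100 },
  {
    ...approvalA,
    approvalId: 'approval-c',
    toolCallId: 'tool-c',
    toolName: 'fill',
    timestamp: 100,
  },
]

const merged = replacePendingApprovalsForConversation(
  [approvalA, approvalB],
  'conversation-a',
  nextForA,
)

// approval-b is untouched, approval-a keeps timestamp 1, approval-c is new.
console.log(merged.map((item) => `${item.approvalId}:${item.timestamp}`))
```

Entries from other conversations come first in the result, followed by the replacement list for the target conversation.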
--- a/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-storage.test.ts
+++ b/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-storage.test.ts
@@ -0,0 +1,128 @@
+import { describe, expect, it } from 'bun:test'
+import type { UIMessage } from 'ai'
+import {
+  extractPendingApprovals,
+  queueApprovalResponse,
+  removeApprovalResponsesById,
+  removePendingApprovalsById,
+  replacePendingApprovalsForConversation,
+} from './approval-sync-helpers'
+
+describe('approval sync storage helpers', () => {
+  it('extracts pending approvals from assistant tool parts', () => {
+    const messages = [
+      {
+        id: 'assistant-1',
+        role: 'assistant',
+        parts: [
+          {
+            type: 'tool-click',
+            state: 'approval-requested',
+            toolCallId: 'tool-1',
+            input: { selector: '#buy-now' },
+            approval: { id: 'approval-1' },
+          },
+        ],
+      },
+    ] as UIMessage[]
+
+    expect(extractPendingApprovals(messages, 'conversation-1', 123)).toEqual([
+      {
+        approvalId: 'approval-1',
+        toolCallId: 'tool-1',
+        toolName: 'click',
+        input: { selector: '#buy-now' },
+        conversationId: 'conversation-1',
+        timestamp: 123,
+      },
+    ])
+  })
+
+  it('replaces pending approvals for one conversation without clearing others', () => {
+    const existing = [
+      {
+        approvalId: 'approval-a',
+        toolCallId: 'tool-a',
+        toolName: 'click',
+        input: {},
+        conversationId: 'conversation-a',
+        timestamp: 1,
+      },
+      {
+        approvalId: 'approval-b',
+        toolCallId: 'tool-b',
+        toolName: 'navigate_page',
+        input: {},
+        conversationId: 'conversation-b',
+        timestamp: 2,
+      },
+    ]
+
+    expect(
+      replacePendingApprovalsForConversation(existing, 'conversation-a', []),
+    ).toEqual([existing[1]])
+  })
+
+  it('queues and removes approval responses by approval id', () => {
+    const queued = queueApprovalResponse(
+      [
+        {
+          approvalId: 'approval-a',
+          approved: true,
+          timestamp: 1,
+        },
+      ],
+      {
+        approvalId: 'approval-b',
+        approved: false,
+        timestamp: 2,
+      },
+    )
+
+    expect(queued).toEqual([
+      {
+        approvalId: 'approval-a',
+        approved: true,
+        timestamp: 1,
+      },
+      {
+        approvalId: 'approval-b',
+        approved: false,
+        timestamp: 2,
+      },
+    ])
+
+    expect(removeApprovalResponsesById(queued, ['approval-a'])).toEqual([
+      {
+        approvalId: 'approval-b',
+        approved: false,
+        timestamp: 2,
+      },
+    ])
+  })
+
+  it('removes only handled pending approvals', () => {
+    const pending = [
+      {
+        approvalId: 'approval-a',
+        toolCallId: 'tool-a',
+        toolName: 'click',
+        input: {},
+        conversationId: 'conversation-a',
+        timestamp: 1,
+      },
+      {
+        approvalId: 'approval-b',
+        toolCallId: 'tool-b',
+        toolName: 'fill',
+        input: {},
+        conversationId: 'conversation-b',
+        timestamp: 2,
+      },
+    ]
+
+    expect(removePendingApprovalsById(pending, ['approval-b'])).toEqual([
+      pending[0],
+    ])
+  })
+})
--- a/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-storage.ts
+++ b/packages/browseros-agent/apps/agent/lib/tool-approvals/approval-sync-storage.ts
@@ -0,0 +1,47 @@
+import { storage } from '@wxt-dev/storage'
+
+export {
+  extractPendingApprovals,
+  queueApprovalResponse,
+  removeApprovalResponsesById,
+  removePendingApprovalsById,
+  replacePendingApprovalsForConversation,
+} from './approval-sync-helpers'
+
+export interface PendingApproval {
+  approvalId: string
+  toolCallId: string
+  toolName: string
+  input: Record<string, unknown>
+  conversationId: string
+  timestamp: number
+}
+
+export interface ApprovalResponse {
+  approvalId: string
+  approved: boolean
+  reason?: string
+  timestamp: number
+}
+
+export interface ToolExecutionLogEntry {
+  toolCallId: string
+  toolName: string
+  status: 'auto-allowed' | 'approved' | 'denied' | 'error'
+  conversationId: string
+  timestamp: number
+  input?: Record<string, unknown>
+}
+
+export const pendingToolApprovalsStorage = storage.defineItem<
+  PendingApproval[]
+>('local:pending-tool-approvals', { fallback: [] })
+
+export const approvalResponsesStorage = storage.defineItem<ApprovalResponse[]>(
+  'local:approval-responses',
+  { fallback: [] },
+)
+
+export const toolExecutionLogStorage = storage.defineItem<
+  ToolExecutionLogEntry[]
+>('local:tool-execution-log', { fallback: [] })
--- a/packages/browseros-agent/apps/agent/lib/tool-approvals/storage.ts
+++ b/packages/browseros-agent/apps/agent/lib/tool-approvals/storage.ts
@@ -0,0 +1,38 @@
+import { storage } from '@wxt-dev/storage'
+import type { ToolApprovalCategoryId, ToolApprovalConfig } from './types'
+
+export const toolApprovalConfigStorage = storage.defineItem<ToolApprovalConfig>(
+  'local:tool-approval-config',
+  {
+    fallback: {
+      categories: {},
+    },
+  },
+)
+
+const LEGACY_ALL_CATEGORY_IDS: ToolApprovalCategoryId[] = [
+  'input',
+  'navigation',
+  'screenshots',
+  'scripts',
+  'data-modification',
+]
+
+const NEW_CATEGORY_IDS: ToolApprovalCategoryId[] = ['observation', 'assistant']
+
+export function normalizeToolApprovalConfig(
+  config: ToolApprovalConfig,
+): ToolApprovalConfig {
+  const categories = { ...config.categories }
+  const shouldMigrateLegacyAll =
+    LEGACY_ALL_CATEGORY_IDS.every((id) => categories[id] === true) &&
+    NEW_CATEGORY_IDS.every((id) => categories[id] === undefined)
+
+  if (shouldMigrateLegacyAll) {
+    for (const id of NEW_CATEGORY_IDS) {
+      categories[id] = true
+    }
+  }
+
+  return { categories }
+}
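Review note on `normalizeToolApprovalConfig`: the migration only fires when all five legacy categories are `true` and neither new category has ever been set. The sketch below walks through three inputs. The `CategoryId` union and the `Config` shape are assumptions standing in for the types imported from `@browseros/shared/constants/tool-approval`, which are not shown in this diff.

```typescript
// Assumed stand-ins for ToolApprovalCategoryId and ToolApprovalConfig.
type CategoryId =
  | 'input'
  | 'navigation'
  | 'screenshots'
  | 'scripts'
  | 'data-modification'
  | 'observation'
  | 'assistant'

interface Config {
  categories: Partial<Record<CategoryId, boolean>>
}

const LEGACY_ALL_CATEGORY_IDS: CategoryId[] = [
  'input',
  'navigation',
  'screenshots',
  'scripts',
  'data-modification',
]
const NEW_CATEGORY_IDS: CategoryId[] = ['observation', 'assistant']

// Same logic as the function in storage.ts, typed against the assumed shapes.
function normalizeToolApprovalConfig(config: Config): Config {
  const categories = { ...config.categories }
  const shouldMigrateLegacyAll =
    LEGACY_ALL_CATEGORY_IDS.every((id) => categories[id] === true) &&
    NEW_CATEGORY_IDS.every((id) => categories[id] === undefined)

  if (shouldMigrateLegacyAll) {
    for (const id of NEW_CATEGORY_IDS) {
      categories[id] = true
    }
  }

  return { categories }
}

const allLegacyOn: Config = {
  categories: {
    input: true,
    navigation: true,
    screenshots: true,
    scripts: true,
    'data-modification': true,
  },
}

// Case 1: every legacy category on, new ones never set -> both new ones turn on.
const migrated = normalizeToolApprovalConfig(allLegacyOn)

// Case 2: only some legacy categories on -> nothing is added.
const partial = normalizeToolApprovalConfig({
  categories: { input: true, navigation: true },
})

// Case 3: a new category was explicitly set -> the explicit choice is kept.
const explicitOff = normalizeToolApprovalConfig({
  categories: { ...allLegacyOn.categories, observation: false },
})

console.log(migrated.categories, partial.categories, explicitOff.categories)
```

Because the check requires the new categories to be `undefined`, a value the user has already set is not overwritten on a later normalization.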
--- a/packages/browseros-agent/apps/agent/lib/tool-approvals/types.ts
+++ b/packages/browseros-agent/apps/agent/lib/tool-approvals/types.ts
@@ -0,0 +1,7 @@
+export {
+  TOOL_APPROVAL_CATEGORIES as TOOL_CATEGORIES,
+  TOOL_APPROVAL_CATEGORIES,
+  type ToolApprovalCategory,
+  type ToolApprovalCategoryId,
+  type ToolApprovalConfig,
+} from '@browseros/shared/constants/tool-approval'
--- a/packages/browseros-agent/apps/agent/package.json
+++ b/packages/browseros-agent/apps/agent/package.json
@@ -9,9 +9,9 @@
    "build": "bun run codegen && wxt build",
    "build:dev": "bun --env-file=.env.development wxt build --mode development",
    "zip": "wxt zip",
-    "compile": "tsgo --noEmit",
+    "compile": "bun --env-file=.env.development wxt prepare && tsgo --noEmit",
    "lint": "bunx biome check",
-    "typecheck": "tsgo --noEmit",
+    "typecheck": "bun --env-file=.env.development wxt prepare && tsgo --noEmit",
    "lint:fix": "bunx biome check --write --unsafe",
    "clean:cache": "rm -rf node_modules/.cache && rm -rf .output/ && rm -rf .wxt/",
    "codegen": "bun --env-file=.env.development graphql-codegen --config codegen.ts",
@@ -52,6 +52,9 @@
    "@types/dompurify": "^3.2.0",
    "@webext-core/messaging": "^2.3.0",
    "@wxt-dev/storage": "^1.2.8",
+    "@xterm/addon-fit": "^0.11.0",
+    "@xterm/addon-web-links": "^0.12.0",
+    "@xterm/xterm": "^6.0.0",
    "@xyflow/react": "^12.9.3",
    "ai": "^6.0.94",
    "better-auth": "^1.4.17",
--- a/packages/browseros-agent/apps/agent/styles/global.css
+++ b/packages/browseros-agent/apps/agent/styles/global.css
@@ -221,6 +221,37 @@
    background-clip: padding-box;
  }

+  .agent-terminal-shell .xterm {
+    height: 100%;
+  }
+
+  .agent-terminal-shell .xterm-viewport {
+    overscroll-behavior: contain;
+    scrollbar-width: thin;
+    scrollbar-color: oklch(from var(--border) l c h / 0.45) transparent;
+  }
+
+  .agent-terminal-shell .xterm-viewport::-webkit-scrollbar {
+    width: 8px;
+    height: 8px;
+  }
+
+  .agent-terminal-shell .xterm-viewport::-webkit-scrollbar-track {
+    background: transparent;
+  }
+
+  .agent-terminal-shell .xterm-viewport::-webkit-scrollbar-thumb {
+    background: oklch(from var(--border) l c h / 0.45);
+    border: 2px solid transparent;
+    border-radius: 999px;
+    background-clip: padding-box;
+  }
+
+  .agent-terminal-shell .xterm-viewport::-webkit-scrollbar-thumb:hover {
+    background: oklch(from var(--border) l c h / 0.7);
+    background-clip: padding-box;
+  }
+
  /* Custom animation keyframes for minimal, elegant hero animations */
  @keyframes float {
    0%,
--- a/packages/browseros-agent/apps/agent/tsconfig.json
+++ b/packages/browseros-agent/apps/agent/tsconfig.json
@@ -5,7 +5,8 @@
    "allowImportingTsExtensions": true,
    "jsx": "react-jsx",
    "paths": {
-      "@/*": ["./*"]
+      "@/*": ["./*"],
+      "@browseros/shared/*": ["../../packages/shared/src/*"]
    },
    "plugins": [
      {
--- a/packages/browseros-agent/apps/cli/cmd/root.go
+++ b/packages/browseros-agent/apps/cli/cmd/root.go
@@ -56,6 +56,7 @@ var groupOrder = []string{
 	"Observe:",
 	"Input:",
 	"Resources:",
+	"Integrations:",
 	"Setup:",
 }

--- a/packages/browseros-agent/apps/cli/cmd/root_test.go
+++ b/packages/browseros-agent/apps/cli/cmd/root_test.go
@@ -33,6 +33,7 @@ func TestCommandName(t *testing.T) {
 		{"known command", []string{"health"}, "browseros-cli health"},
 		{"unknown command", []string{"nonexistent"}, "unknown"},
 		{"subcommand", []string{"bookmark", "search"}, "browseros-cli bookmark search"},
+		{"strata subcommand", []string{"strata", "check"}, "browseros-cli strata check"},
 		{"known with extra args", []string{"snap", "--enhanced"}, "browseros-cli snap"},
 	}
 	for _, tt := range tests {
--- a/packages/browseros-agent/apps/cli/cmd/strata.go
+++ b/packages/browseros-agent/apps/cli/cmd/strata.go
@@ -0,0 +1,235 @@
+package cmd
+
+import (
+	"browseros-cli/output"
+
+	"github.com/spf13/cobra"
+)
+
+func init() {
+	strataCmd := &cobra.Command{
+		Use:         "strata",
+		Annotations: map[string]string{"group": "Integrations:"},
+		Short:       "Manage Strata MCP integrations (Gmail, Slack, GitHub, etc.)",
+		Long: `Interact with 40+ external services via Strata MCP integrations.
+
+Supported services:
+  gmail, google calendar, google docs, google drive, google sheets, slack,
+  linkedin, notion, airtable, confluence, github, gitlab, linear, jira,
+  figma, salesforce, hubspot, stripe, discord, asana, clickup, zendesk,
+  monday, shopify, dropbox, onedrive, box, youtube, whatsapp, resend,
+  posthog, mixpanel, vercel, supabase, cloudflare, wordpress, postman,
+  intercom, cal.com, brave search, microsoft teams, outlook mail,
+  outlook calendar, google forms, mem0
+
+Discovery flow — do not guess action names:
+  1. check     → verify the service is connected (get auth URL if not)
+  2. discover  → find categories or actions for a service
+  3. actions   → expand categories into specific actions
+  4. details   → get the parameter schema before executing
+  5. exec      → execute the action with parameters
+  6. search    → fallback keyword search if discover doesn't find it
+
+Authentication:
+  If a service is not connected, "check" returns an authUrl.
+  Open that URL in a browser to authenticate, then retry.
+  If "exec" fails with an auth error, use "auth" to get a fresh authUrl.
+
+Example — search Gmail:
+  browseros-cli strata check gmail
+  browseros-cli strata discover "search emails" gmail
+  browseros-cli strata actions GMAIL_EMAIL
+  browseros-cli strata details GMAIL_EMAIL gmail_search_emails
+  browseros-cli strata exec gmail GMAIL_EMAIL gmail_search_emails \
+    --body '{"query":"from:user@example.com","maxResults":5}'`,
+	}
+
+	checkCmd := &cobra.Command{
+		Use:   "check <server-name>",
+		Short: "Check if a service is connected and ready",
+		Args:  cobra.ExactArgs(1),
+		Run: func(cmd *cobra.Command, args []string) {
+			c := newClient()
+			result, err := c.CallTool("connector_mcp_servers", map[string]any{
+				"server_name": args[0],
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+
+	discoverCmd := &cobra.Command{
+		Use:   "discover <query> <server> [servers...]",
+		Short: "Discover available categories or actions for servers",
+		Args:  cobra.MinimumNArgs(2),
+		Run: func(cmd *cobra.Command, args []string) {
+			c := newClient()
+			result, err := c.CallTool("discover_server_categories_or_actions", map[string]any{
+				"user_query":   args[0],
+				"server_names": args[1:],
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+
+	actionsCmd := &cobra.Command{
+		Use:   "actions <category> [categories...]",
+		Short: "Get actions within categories",
+		Args:  cobra.MinimumNArgs(1),
+		Run: func(cmd *cobra.Command, args []string) {
+			c := newClient()
+			result, err := c.CallTool("get_category_actions", map[string]any{
+				"category_names": args,
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+
+	detailsCmd := &cobra.Command{
+		Use:   "details <category> <action>",
+		Short: "Get parameter schema for an action",
+		Args:  cobra.ExactArgs(2),
+		Run: func(cmd *cobra.Command, args []string) {
+			c := newClient()
+			result, err := c.CallTool("get_action_details", map[string]any{
+				"category_name": args[0],
+				"action_name":   args[1],
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+
+	execCmd := &cobra.Command{
+		Use:   "exec <server> <category> <action>",
+		Short: "Execute an action on a connected service",
+		Long: `Execute an action on a connected service.
+
+Pass request body as a JSON string with --body.
+Use --query and --path for query/path parameters.
+Use --output-field to limit response fields.
+
+Example:
+  browseros-cli strata exec gmail GMAIL_EMAIL gmail_search_emails \
+    --body '{"query":"from:user@example.com","maxResults":5}'`,
+		Args: cobra.ExactArgs(3),
+		Run: func(cmd *cobra.Command, args []string) {
+			bodySchema, _ := cmd.Flags().GetString("body")
+			queryParams, _ := cmd.Flags().GetString("query")
+			pathParams, _ := cmd.Flags().GetString("path")
+			outputFields, _ := cmd.Flags().GetStringArray("output-field")
+			maxChars, _ := cmd.Flags().GetInt("max-chars")
+
+			toolArgs := map[string]any{
+				"server_name":   args[0],
+				"category_name": args[1],
+				"action_name":   args[2],
+			}
+
+			if bodySchema != "" {
+				toolArgs["body_schema"] = bodySchema
+			}
+			if queryParams != "" {
+				toolArgs["query_params"] = queryParams
+			}
+			if pathParams != "" {
+				toolArgs["path_params"] = pathParams
+			}
+			if len(outputFields) > 0 {
+				toolArgs["include_output_fields"] = outputFields
+			}
+			if cmd.Flags().Changed("max-chars") {
+				toolArgs["maximum_output_characters"] = maxChars
+			}
+
+			c := newClient()
+			result, err := c.CallTool("execute_action", toolArgs)
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+	execCmd.Flags().String("body", "", "Request body as JSON string")
+	execCmd.Flags().String("query", "", "Query parameters as JSON string")
+	execCmd.Flags().String("path", "", "Path parameters as JSON string")
+	execCmd.Flags().StringArray("output-field", nil, "Limit response to these fields (repeatable)")
+	execCmd.Flags().Int("max-chars", 0, "Maximum output characters")
+
+	searchCmd := &cobra.Command{
+		Use:   "search <query> <server>",
+		Short: "Search documentation for a service",
+		Args:  cobra.ExactArgs(2),
+		Run: func(cmd *cobra.Command, args []string) {
+			c := newClient()
+			result, err := c.CallTool("search_documentation", map[string]any{
+				"query":       args[0],
+				"server_name": args[1],
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+
+	authCmd := &cobra.Command{
+		Use:   "auth <server-name>",
+		Short: "Handle authentication failure for a service",
+		Args:  cobra.ExactArgs(1),
+		Run: func(cmd *cobra.Command, args []string) {
+			intention, _ := cmd.Flags().GetString("intention")
+			c := newClient()
+			result, err := c.CallTool("handle_auth_failure", map[string]any{
+				"server_name": args[0],
+				"intention":   intention,
+			})
+			if err != nil {
+				output.Error(err.Error(), 1)
+			}
+			if jsonOut {
+				output.JSON(result)
+			} else {
+				output.Text(result)
+			}
+		},
+	}
+	authCmd.Flags().String("intention", "get_auth_url", "Auth intention")
+
+	strataCmd.AddCommand(checkCmd, discoverCmd, actionsCmd, detailsCmd, execCmd, searchCmd, authCmd)
+	rootCmd.AddCommand(strataCmd)
+}
--- a/packages/browseros-agent/apps/cli/integration_test.go
+++ b/packages/browseros-agent/apps/cli/integration_test.go
@@ -270,3 +270,84 @@ func TestInvalidPage(t *testing.T) {
 		t.Errorf("expected snap with invalid page ID to exit non-zero")
 	}
 }
+
+func TestStrataCheck(t *testing.T) {
+	r := run(t, "--json", "strata", "check", "Gmail")
+	// Klavis may not be configured — accept success or structured error
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata check produced no output")
+	}
+	if r.ExitCode == 0 {
+		var data map[string]any
+		if err := json.Unmarshal([]byte(strings.TrimSpace(r.Stdout)), &data); err != nil {
+			t.Fatalf("strata check returned non-JSON: %s", r.Stdout)
+		}
+	}
+}
+
+func TestStrataDiscover(t *testing.T) {
+	r := run(t, "--json", "strata", "discover", "send email", "Gmail")
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata discover produced no output")
+	}
+	if r.ExitCode == 0 {
+		var data map[string]any
+		if err := json.Unmarshal([]byte(strings.TrimSpace(r.Stdout)), &data); err != nil {
+			t.Fatalf("strata discover returned non-JSON: %s", r.Stdout)
+		}
+	}
+}
+
+func TestStrataSearch(t *testing.T) {
+	r := run(t, "--json", "strata", "search", "send email", "Gmail")
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata search produced no output")
+	}
+	if r.ExitCode == 0 {
+		var data map[string]any
+		if err := json.Unmarshal([]byte(strings.TrimSpace(r.Stdout)), &data); err != nil {
+			t.Fatalf("strata search returned non-JSON: %s", r.Stdout)
+		}
+	}
+}
+
+func TestStrataActions(t *testing.T) {
+	r := run(t, "--json", "strata", "actions", "Gmail")
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata actions produced no output")
+	}
+}
+
+func TestStrataDetails(t *testing.T) {
+	r := run(t, "--json", "strata", "details", "Gmail", "send_email")
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata details produced no output")
+	}
+}
+
+func TestStrataAuth(t *testing.T) {
+	r := run(t, "--json", "strata", "auth", "Gmail")
+	out := strings.TrimSpace(r.Stdout + r.Stderr)
+	if out == "" {
+		t.Fatal("strata auth produced no output")
+	}
+}
+
+func TestStrataExecMissingArgs(t *testing.T) {
+	r := run(t, "strata", "exec")
+	if r.ExitCode == 0 {
+		t.Error("expected strata exec without args to exit non-zero")
+	}
+}
+
+func TestStrataCheckMissingArgs(t *testing.T) {
+	r := run(t, "strata", "check")
+	if r.ExitCode == 0 {
+		t.Error("expected strata check without args to exit non-zero")
+	}
+}
--- a/packages/browseros-agent/apps/eval/configs/agisdk-real-smoke.json
+++ b/packages/browseros-agent/apps/eval/configs/agisdk-real-smoke.json
@@ -1,26 +0,0 @@
-{
-  "agent": {
-    "type": "single",
-    "provider": "openai-compatible",
-    "model": "accounts/fireworks/models/kimi-k2p5",
-    "apiKey": "FIREWORKS_API_KEY",
-    "baseUrl": "https://api.fireworks.ai/inference/v1",
-    "supportsImages": true
-  },
-  "dataset": "../data/agisdk-real.jsonl",
-  "num_workers": 10,
-  "restart_server_per_task": true,
-  "browseros": {
-    "server_url": "http://127.0.0.1:9110",
-    "base_cdp_port": 9010,
-    "base_server_port": 9110,
-    "base_extension_port": 9310,
-    "load_extensions": false,
-    "headless": false
-  },
-  "captcha": {
-    "api_key_env": "NOPECHA_API_KEY"
-  },
-  "graders": ["agisdk_state_diff"],
-  "timeout_ms": 1800000
-}
--- a/packages/browseros-agent/apps/eval/configs/infinity-hard-50.json
+++ b/packages/browseros-agent/apps/eval/configs/infinity-hard-50.json
@@ -1,26 +0,0 @@
-{
-  "agent": {
-    "type": "single",
-    "provider": "openai-compatible",
-    "model": "accounts/fireworks/models/kimi-k2p5",
-    "apiKey": "FIREWORKS_API_KEY",
-    "baseUrl": "https://api.fireworks.ai/inference/v1",
-    "supportsImages": true
-  },
-  "dataset": "../data/webarena-infinity-hard-50.jsonl",
-  "num_workers": 10,
-  "restart_server_per_task": true,
-  "browseros": {
-    "server_url": "http://127.0.0.1:9110",
-    "base_cdp_port": 9010,
-    "base_server_port": 9110,
-    "base_extension_port": 9310,
-    "load_extensions": false,
-    "headless": false
-  },
-  "captcha": {
-    "api_key_env": "NOPECHA_API_KEY"
-  },
-  "graders": ["infinity_state"],
-  "timeout_ms": 1800000
-}
--- a/packages/browseros-agent/apps/eval/data/agisdk-real.jsonl
+++ b/packages/browseros-agent/apps/eval/data/agisdk-real.jsonl
@@ -1,52 +0,0 @@
-{"query_id": "agisdk-dashdish-10", "dataset": "agisdk-real", "query": "Place an order from \"Souvla\" for a \"Medium Classic Cheeseburger\" and a \"Small Bacon Double Cheeseburger\" with \"Standard Delivery\" as the method with the default charged options.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-10", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Doordash"}}}
-{"query_id": "agisdk-fly-unified-5", "dataset": "agisdk-real", "query": "Find me the cheapest fare for a flight from Orlando to Milwaukee on December 5th, 2024 and book it.\nPassenger: John Doe\nDate of Birth: 01/01/1990\nSex: Male\nSeat Selection: No\nPayment: Credit Card (378342143523967), Exp: 12/25, Security Code: 420 Address: 123 Main St, San Francisco, CA, 94105, USA, Phone: 555-123-4567, Email: johndoe@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-5", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-5", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "United Airlines"}}}
-{"query_id": "agisdk-udriver-10", "dataset": "agisdk-real", "query": "Order me a ride for 4pm, I'll be at the de Young muesum headed to the Waterbar, fanciest option possible please.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-10", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
-{"query_id": "agisdk-udriver-9", "dataset": "agisdk-real", "query": "Book me a ride from the thai restaurant I last took a ride to for later today at 2pm, I'll be at 333 Apartments on Fremont", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-9", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-9", "challenge_type": "retrieval-action", "difficulty": "hard", "similar_to": "Uber"}}}
-{"query_id": "agisdk-topwork-4", "dataset": "agisdk-real", "query": "Create a job post for a UI/UX Designer with expertise in Figma, Sketch, and Adobe Creative Suite, including project details, timeline, and required skills (Wireframing, Prototyping, Responsive Design).", "graders": ["agisdk_state_diff"], "start_url": "https://evals-topwork.vercel.app", "metadata": {"original_task_id": "topwork-4", "website": "TopWork", "category": "agisdk-real", "additional": {"agisdk_task_id": "topwork-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Upwork"}}}
-{"query_id": "agisdk-gocalendar-4", "dataset": "agisdk-real", "query": "Change the \"Team Check-In\" event on July 18, 2024, name to \"Project Kickoff\" and update the location to \"Zoom\"", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-4", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Google Calendar"}}}
-{"query_id": "agisdk-staynb-6", "dataset": "agisdk-real", "query": "Find and book the stay with the best value for money (cheapest stay with the best reviews) for 1 day. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-6", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-6", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "Airbnb"}}}
-{"query_id": "agisdk-omnizon-10", "dataset": "agisdk-real", "query": "Click on \"buy now\" on any product, increase its quantity to the maximum allowed, update the delivery date to the last available, and place the order.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-omnizon.vercel.app", "metadata": {"original_task_id": "omnizon-10", "website": "Omnizon", "category": "agisdk-real", "additional": {"agisdk_task_id": "omnizon-10", "challenge_type": "action", "difficulty": "hard", "similar_to": "Amazon"}}}
-{"query_id": "agisdk-fly-unified-9", "dataset": "agisdk-real", "query": "Book me a flight from San Francisco to Chicago in Basic Economy on December 18th at 10:00. Ensure no seat selection is made.\nPassenger: David Lee\nDate of Birth: 07/22/1985\nSex: Male\nSeat Selection: No\nPayment: Credit Card (9999 8888 7777), Exp: 03/30, Address: 987 Cedar St, Chicago, IL, 60601, USA, Phone: 555-987-1234, Email: davidlee@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-9", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-9", "challenge_type": "action", "difficulty": "hard", "similar_to": "United Airlines"}}}
-{"query_id": "agisdk-networkin-9", "dataset": "agisdk-real", "query": "Find a professional who attended Stanford and send them a connection request and a message.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-9", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-9", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-udriver-11", "dataset": "agisdk-real", "query": "I need to go from Pacific Catch on Chestnut back home to 333 Fremont now. If the fancy version is within ten dollars of the regular one, book that.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-11", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-11", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
-{"query_id": "agisdk-fly-unified-4", "dataset": "agisdk-real", "query": "Book me a round-trip flight from Providence (Rhode Island) to Indianapolis, departing on December 5th, 2024 at 08:00 and returning on December 9th at 14:00.\nPassenger: Jane Smith\nDate of Birth: 02/14/1995\nSex: Female\nSeat Selection: Yes (Window seat)\nPayment: Credit Card (378342143523967), Exp: 06/26, security code: 345 Address: 456 Elm St, Miami, FL, 33101, USA, Phone: 555-987-6543, Email: janesmith@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-4", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "United Airlines"}}}
-{"query_id": "agisdk-networkin-5", "dataset": "agisdk-real", "query": "Send a connection request to John Smith.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-5", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-5", "challenge_type": "action", "difficulty": "easy", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-zilloft-6", "dataset": "agisdk-real", "query": "Select a property listed in San Francisco as \"Condos\" within a price range under $300,000 and request a tour for tomorrow at 4:00 PM. Use these contact details: Name: Sarah Brown, Email: sarahbrown@example.com, Phone: 555-987-6543.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-zilloft.vercel.app", "metadata": {"original_task_id": "zilloft-6", "website": "Zilloft", "category": "agisdk-real", "additional": {"agisdk_task_id": "zilloft-6", "challenge_type": "action", "difficulty": "medium", "similar_to": "Zillow"}}}
-{"query_id": "agisdk-topwork-2", "dataset": "agisdk-real", "query": "Create a job posting for a Backend Developer specializing in Python, Django, and Flask to develop a high-performance web application. Include project details such as required skills (PostgreSQL, Docker, AWS, CI/CD), estimated project timeline, and budget.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-topwork.vercel.app", "metadata": {"original_task_id": "topwork-2", "website": "TopWork", "category": "agisdk-real", "additional": {"agisdk_task_id": "topwork-2", "challenge_type": "action", "difficulty": "medium", "similar_to": "Upwork"}}}
-{"query_id": "agisdk-gocalendar-3", "dataset": "agisdk-real", "query": "Delete the event titled \"Breakfast Meeting with Client\" scheduled for July 19, 2024", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-3", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-3", "challenge_type": "action", "difficulty": "easy", "similar_to": "Google Calendar"}}}
-{"query_id": "agisdk-topwork-3", "dataset": "agisdk-real", "query": "Create a job listing for a Full-Stack Developer with expertise in Java, Spring Boot, and Angular, outlining the project scope, estimated duration, and required skills (MySQL, Docker, Kubernetes, and Jenkins). The ideal candidate should have experience in enterprise-level applications and building scalable microservices. After creating the job post, please describe what you included in the job listing.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-topwork.vercel.app", "metadata": {"original_task_id": "topwork-3", "website": "TopWork", "category": "agisdk-real", "additional": {"agisdk_task_id": "topwork-3", "challenge_type": "retrieval", "difficulty": "medium", "similar_to": "Upwork"}}}
-{"query_id": "agisdk-fly-unified-2", "dataset": "agisdk-real", "query": "Book me a one-way flight from Indiana to New York on December 2nd 2024 at 12:00.\nPassenger: John Doe\nDate of Birth: 01/01/1990\nSex: Male\nSeat Selection: No\nPayment: Credit Card (378342143523967), Exp: 12/25, Security Code: 245, Address: 123 Main St, San Francisco, CA, 94105, USA, Phone: 555-123-4567, Email: johndoe@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-2", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-2", "challenge_type": "action", "difficulty": "easy", "similar_to": "United Airlines"}}}
-{"query_id": "agisdk-dashdish-7", "dataset": "agisdk-real", "query": "Select \"Express Delivery\" for an order from \"DragonEats\" of \"Mushroom Swiss Burger\" and complete the checkout with the pre-loaded Visa card.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-7", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-7", "challenge_type": "action", "difficulty": "hard", "similar_to": "Doordash"}}}
-{"query_id": "agisdk-networkin-3", "dataset": "agisdk-real", "query": "Write a post inviting users to a networking event, including details about the event's purpose, date, and target audience.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-3", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-3", "challenge_type": "action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-gomail-7", "dataset": "agisdk-real", "query": "Delete the email with the subject \"New Leadership Articles You Can't Miss\" from the Inbox.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-7", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-7", "challenge_type": "retrieval-action", "difficulty": "hard", "similar_to": "Gmail"}}}
-{"query_id": "agisdk-opendining-8", "dataset": "agisdk-real", "query": "Identify and book the restaurant with the lowest rating. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-8", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-8", "challenge_type": "retrieval-action", "difficulty": "easy", "similar_to": "OpenTable"}}}
-{"query_id": "agisdk-omnizon-2", "dataset": "agisdk-real", "query": "Search for \"smartphones\" using the search bar, add the first two to your cart, view the details of the third product, click on \"Buy Now,\" and proceed through the checkout process.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-omnizon.vercel.app", "metadata": {"original_task_id": "omnizon-2", "website": "Omnizon", "category": "agisdk-real", "additional": {"agisdk_task_id": "omnizon-2", "challenge_type": "action", "difficulty": "medium", "similar_to": "Amazon"}}}
-{"query_id": "agisdk-udriver-1", "dataset": "agisdk-real", "query": "Book a ride from Fitness Urbano to Pacific Cafe", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-1", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-1", "challenge_type": "action", "difficulty": "easy", "similar_to": "Uber"}}}
-{"query_id": "agisdk-staynb-2", "dataset": "agisdk-real", "query": "Click on one of the stays displayed on the homepage and book it for a family of 4 (2 adults and 2 children). For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-2", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-2", "challenge_type": "action", "difficulty": "easy", "similar_to": "Airbnb"}}}
-{"query_id": "agisdk-opendining-10", "dataset": "agisdk-real", "query": "Check the menus of all restaurants for vegetarian options and make a reservation at the one with the most vegetarian choices. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-10", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-10", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "OpenTable"}}}
-{"query_id": "agisdk-opendining-4", "dataset": "agisdk-real", "query": "Use the search bar to search for a restaurant on September 2nd at 4:30 PM for 7 people, using \"Japanese\" as the search term, and book the first result. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-4", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-4", "challenge_type": "action", "difficulty": "hard", "similar_to": "OpenTable"}}}
-{"query_id": "agisdk-gomail-8", "dataset": "agisdk-real", "query": "Clear all emails from \"GitHub\" in the inbox to trash.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-8", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-8", "challenge_type": "action", "difficulty": "medium", "similar_to": "Gmail"}}}
-{"query_id": "agisdk-dashdish-4", "dataset": "agisdk-real", "query": "Schedule a delivery order from \"Taco Bell\" adding a \"Classic Cheeseburger\" large size for later and add the note \"Leave at the front door\".", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-4", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Doordash"}}}
-{"query_id": "agisdk-networkin-1", "dataset": "agisdk-real", "query": "Create a new text post for the feed with a professional update about AI trends in 2025, mentioning three key advancements and their impact on the job market.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-1", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-1", "challenge_type": "action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-dashdish-5", "dataset": "agisdk-real", "query": "Add three \"Loaded Bacon Cheese Fries\" to the shopping cart from \"Man vs. Fries\". Proceed to checkout and select \"Pickup\" as the delivery method.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-5", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-5", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "Doordash"}}}
-{"query_id": "agisdk-opendining-5", "dataset": "agisdk-real", "query": "Scroll through the homepage carousel until \"Ocean Breeze\" is visible, select the second available time slot, and complete the reservation. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-5", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-5", "challenge_type": "action", "difficulty": "medium", "similar_to": "OpenTable"}}}
-{"query_id": "agisdk-topwork-1", "dataset": "agisdk-real", "query": "Create a new job post for a Frontend Developer with expertise in React and TypeScript, specifying project details such as estimated duration, required skills, and budget.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-topwork.vercel.app", "metadata": {"original_task_id": "topwork-1", "website": "TopWork", "category": "agisdk-real", "additional": {"agisdk_task_id": "topwork-1", "challenge_type": "action", "difficulty": "medium", "similar_to": "Upwork"}}}
-{"query_id": "agisdk-gocalendar-1", "dataset": "agisdk-real", "query": "Create a new event titled \"Team Meeting\" on July 19, 2024, from 2 PM to 2:30 PM, and include \"Conference Room A\" as the location", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-1", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-1", "challenge_type": "action", "difficulty": "medium", "similar_to": "Google Calendar"}}}
-{"query_id": "agisdk-gomail-5", "dataset": "agisdk-real", "query": "Schedule an email to jane.doe@example.com with the subject \"Weekly Update\" to be sent next Monday at 9:00 AM.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-5", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-5", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "Gmail"}}}
-{"query_id": "agisdk-staynb-4", "dataset": "agisdk-real", "query": "Book a stay for 2 children with 1 adult. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-4", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-4", "challenge_type": "action", "difficulty": "medium", "similar_to": "Airbnb"}}}
-{"query_id": "agisdk-omnizon-8", "dataset": "agisdk-real", "query": "Search for \"Automatic Espresso Machine,\" click on the cheapest one, change the quantity to 5, use \"buy now\" to purchase them and complete the checkout.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-omnizon.vercel.app", "metadata": {"original_task_id": "omnizon-8", "website": "Omnizon", "category": "agisdk-real", "additional": {"agisdk_task_id": "omnizon-8", "challenge_type": "retrieval-action", "difficulty": "easy", "similar_to": "Amazon"}}}
-{"query_id": "agisdk-networkin-6", "dataset": "agisdk-real", "query": "Choose a random person who you haven't connected with, connect with them, and send them a message saying, 'howdy, partner'.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-6", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-6", "challenge_type": "action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-dashdish-2", "dataset": "agisdk-real", "query": "Add a \"Medium Pepperoni Pizza\" from the restaurant \"Papa Johns Pizza\" to the shopping cart and purchase it.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-dashdish.vercel.app", "metadata": {"original_task_id": "dashdish-2", "website": "DashDish", "category": "agisdk-real", "additional": {"agisdk_task_id": "dashdish-2", "challenge_type": "action", "difficulty": "easy", "similar_to": "Doordash"}}}
-{"query_id": "agisdk-staynb-8", "dataset": "agisdk-real", "query": "Scroll through the homepage and book the last stay located in Paris.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-8", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-8", "challenge_type": "retrieval-action", "difficulty": "medium", "similar_to": "Airbnb"}}}
-{"query_id": "agisdk-omnizon-4", "dataset": "agisdk-real", "query": "Search for a \"Marshall Emberton II Portable Bluetooth Speaker\" and add it to your cart, then search for the \"Michael Kors Oversized Slim Runway Men's Watch,\" add it to the cart, and complete the checkout with both items.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-omnizon.vercel.app", "metadata": {"original_task_id": "omnizon-4", "website": "Omnizon", "category": "agisdk-real", "additional": {"agisdk_task_id": "omnizon-4", "challenge_type": "action", "difficulty": "hard", "similar_to": "Amazon"}}}
-{"query_id": "agisdk-gomail-2", "dataset": "agisdk-real", "query": "Mark the first email in the Inbox as \"read\".", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-2", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-2", "challenge_type": "action", "difficulty": "easy", "similar_to": "Gmail"}}}
-{"query_id": "agisdk-networkin-10", "dataset": "agisdk-real", "query": "Generate a polite follow-up message for a previous unanswered chat, starting with \"Following up on\".", "graders": ["agisdk_state_diff"], "start_url": "https://evals-networkin.vercel.app", "metadata": {"original_task_id": "networkin-10", "website": "Networkin", "category": "agisdk-real", "additional": {"agisdk_task_id": "networkin-10", "challenge_type": "action", "difficulty": "medium", "similar_to": "LinkedIn"}}}
-{"query_id": "agisdk-gomail-3", "dataset": "agisdk-real", "query": "Compose a new email to jonathan.smith@example.com with the subject \"Meeting Notes\" and body \"Please find the meeting notes attached.\"", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gomail.vercel.app", "metadata": {"original_task_id": "gomail-3", "website": "GoMail", "category": "agisdk-real", "additional": {"agisdk_task_id": "gomail-3", "challenge_type": "action", "difficulty": "easy", "similar_to": "Gmail"}}}
-{"query_id": "agisdk-udriver-6", "dataset": "agisdk-real", "query": "Me and 4 friends need a ride from the Palace Hotel to dinner at Osha Thai leaving now", "graders": ["agisdk_state_diff"], "start_url": "https://evals-udriver.vercel.app", "metadata": {"original_task_id": "udriver-6", "website": "UDriver", "category": "agisdk-real", "additional": {"agisdk_task_id": "udriver-6", "challenge_type": "action", "difficulty": "hard", "similar_to": "Uber"}}}
-{"query_id": "agisdk-staynb-9", "dataset": "agisdk-real", "query": "Book a stay with the maximum number of guests supported. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-9", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-9", "challenge_type": "action", "difficulty": "hard", "similar_to": "Airbnb"}}}
-{"query_id": "agisdk-zilloft-3", "dataset": "agisdk-real", "query": "Find a home in San Diego priced under $150,000 with at least 2 bedrooms and request a tour. Use these details: Contact Name: John Doe, Email: johndoe@example.com, Phone: 555-123-4567, Tour Time: 2:00 PM, Tour Date: First available.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-zilloft.vercel.app", "metadata": {"original_task_id": "zilloft-3", "website": "Zilloft", "category": "agisdk-real", "additional": {"agisdk_task_id": "zilloft-3", "challenge_type": "retrieval-action", "difficulty": "easy", "similar_to": "Zillow"}}}
-{"query_id": "agisdk-fly-unified-6", "dataset": "agisdk-real", "query": "Reserve me a seat for the flight from Austin to Pittsburgh departing on December 11th, 2024 at 8:00 in Basic Economy.\nPassenger: Alice Brown\nDate of Birth: 05/20/1992\nSex: Female\nSeat Selection: Yes (Aisle seat)\nPayment: Credit Card (378342143523967), Exp: 09/27, security code: 332 Address: 789 Pine St, Los Angeles, CA, 90012, USA, Phone: 555-456-7890, Email: alicebrown@example.com.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-fly-unified.vercel.app", "metadata": {"original_task_id": "fly-unified-6", "website": "Fly Unified", "category": "agisdk-real", "additional": {"agisdk_task_id": "fly-unified-6", "challenge_type": "action", "difficulty": "medium", "similar_to": "United Airlines"}}}
-{"query_id": "agisdk-opendining-3", "dataset": "agisdk-real", "query": "Book a table at \"The Royal Dine\" for a party of 4 on July 20, 2024, at 7 PM. For fields you don't know the answer for, just fill them in with anything of your choice.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-opendining.vercel.app", "metadata": {"original_task_id": "opendining-3", "website": "OpenDining", "category": "agisdk-real", "additional": {"agisdk_task_id": "opendining-3", "challenge_type": "action", "difficulty": "easy", "similar_to": "OpenTable"}}}
-{"query_id": "agisdk-omnizon-9", "dataset": "agisdk-real", "query": "Search for \"PlayStation DualSense\", purchase it using the \"buy now\" button after opening the first result and change the default payment method to:\nname: Jack Fulton\ncard number: 9231 3432 8927 7764\nexp date: 1/2029\nsecurity code: 128\n before placing your order. ", "graders": ["agisdk_state_diff"], "start_url": "https://evals-omnizon.vercel.app", "metadata": {"original_task_id": "omnizon-9", "website": "Omnizon", "category": "agisdk-real", "additional": {"agisdk_task_id": "omnizon-9", "challenge_type": "action", "difficulty": "hard", "similar_to": "Amazon"}}}
-{"query_id": "agisdk-gocalendar-7", "dataset": "agisdk-real", "query": "Reschedule the \"Morning Coffee with sister\" event from July 18, 2024, at 9 AM to July 19, 2024, at 10AM using drag-and-drop functionality", "graders": ["agisdk_state_diff"], "start_url": "https://evals-gocalendar.vercel.app", "metadata": {"original_task_id": "gocalendar-7", "website": "GoCalendar", "category": "agisdk-real", "additional": {"agisdk_task_id": "gocalendar-7", "challenge_type": "action", "difficulty": "medium", "similar_to": "Google Calendar"}}}
-{"query_id": "agisdk-staynb-5", "dataset": "agisdk-real", "query": "Use the search bar to look for a stay. For the \"Where\" section, use the \"Search by region\" popover and select \"Europe\". Set the check-in date to October 13th and the check-out date to October 23rd. For the \"Who\" section, select 1 infant, 2 children, and 2 adults. Press the search button, select the first stay, and book it.", "graders": ["agisdk_state_diff"], "start_url": "https://evals-staynb.vercel.app", "metadata": {"original_task_id": "staynb-5", "website": "StayNB", "category": "agisdk-real", "additional": {"agisdk_task_id": "staynb-5", "challenge_type": "action", "difficulty": "medium", "similar_to": "Airbnb"}}}
--- a/packages/browseros-agent/apps/eval/data/webarena-infinity-hard-50.jsonl
+++ b/packages/browseros-agent/apps/eval/data/webarena-infinity-hard-50.jsonl
@@ -1,50 +0,0 @@
-{"query_id": "infinity-elation-prescriptions-task_h69", "dataset": "webarena-infinity", "query": "Approve all pending refill requests except for any medication that is involved in a major drug-drug interaction with another of the patient's active medications. Deny those with the reason 'Drug interaction \u2014 needs provider review before renewal'.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h69", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h69.py", "app_base_port": 8020}}}
-{"query_id": "infinity-elation-clinical-records-task_h52", "dataset": "webarena-infinity", "query": "Add the document tag 'Provider-Reviewed' to every visit note template that was created by the current logged-in provider. Do not modify templates created by other providers.", "graders": ["infinity_state"], "start_url": "http://localhost:8000", "metadata": {"original_task_id": "elation-clinical-records-task_h52", "website": "elation-clinical-records", "category": "webarena-infinity", "additional": {"app_name": "elation-clinical-records", "difficulty": "hard", "verifier_path": "real-tasks/task_h52.py", "app_base_port": 8000}}}
-{"query_id": "infinity-gmail-accounts-and-contacts-task_h44", "dataset": "webarena-infinity", "query": "Your sister's husband is one of your contacts. Find him, star his entry, and add the Friends label.", "graders": ["infinity_state"], "start_url": "http://localhost:8070", "metadata": {"original_task_id": "gmail-accounts-and-contacts-task_h44", "website": "gmail-accounts-and-contacts", "category": "webarena-infinity", "additional": {"app_name": "gmail-accounts-and-contacts", "difficulty": "hard", "verifier_path": "real-tasks/task_h44.py", "app_base_port": 8070}}}
-{"query_id": "infinity-gmail-task_h2", "dataset": "webarena-infinity", "query": "Update the Datadog alerts filter to also archive matching emails and forward them to priya.sharma@cloudnine.dev instead of nate.patel@devops.tools.", "graders": ["infinity_state"], "start_url": "http://localhost:8060", "metadata": {"original_task_id": "gmail-task_h2", "website": "gmail", "category": "webarena-infinity", "additional": {"app_name": "gmail", "difficulty": "hard", "verifier_path": "real-tasks/task_h2.py", "app_base_port": 8060}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h58", "dataset": "webarena-infinity", "query": "The Performance Initiative epic has two child epics. For the child epic with more open issues, set the weight of every issue in it to 13. For the other child epic, close all its open issues.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h58", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h58.py", "app_base_port": 8050}}}
-{"query_id": "infinity-figma-slides-task_h46", "dataset": "webarena-infinity", "query": "There are two slides with tables in the deck. Lock the table that compares competitors, and change the font size to 16 on the table that tracks quarterly feature adoption.", "graders": ["infinity_state"], "start_url": "http://localhost:8030", "metadata": {"original_task_id": "figma-slides-task_h46", "website": "figma-slides", "category": "webarena-infinity", "additional": {"app_name": "figma-slides", "difficulty": "hard", "verifier_path": "real-tasks/task_h46.py", "app_base_port": 8030}}}
-{"query_id": "infinity-elation-prescriptions-task_h50", "dataset": "webarena-infinity", "query": "Deny the pending refill for the patient's cholesterol medication because his lipid panel is overdue. Then deny the Lisinopril refill as well \u2014 he needs a follow-up blood pressure check first.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h50", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h50.py", "app_base_port": 8020}}}
-{"query_id": "infinity-elation-prescriptions-task_h19", "dataset": "webarena-infinity", "query": "Discontinue the Omeprazole and prescribe Famotidine 20mg tablet twice daily as a replacement for GERD \u2014 qty 60, 3 refills, send to CVS #4521.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h19", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h19.py", "app_base_port": 8020}}}
-{"query_id": "infinity-paypal-my-wallet-task_h25", "dataset": "webarena-infinity", "query": "Convert all of my Australian dollars to euros.", "graders": ["infinity_state"], "start_url": "http://localhost:8100", "metadata": {"original_task_id": "paypal-my-wallet-task_h25", "website": "paypal-my-wallet", "category": "webarena-infinity", "additional": {"app_name": "paypal-my-wallet", "difficulty": "hard", "verifier_path": "real-tasks/task_h25.py", "app_base_port": 8100}}}
-{"query_id": "infinity-elation-clinical-records-task_h66", "dataset": "webarena-infinity", "query": "Create a new template called 'Anxiety Management' with HPI and Assessment sections, and billing code 99213 with description 'Office visit, established, low complexity'. Then create a visit note for Emily Nakamura using that new template and the Telehealth category, add a Psychological Status block to the note, and sign it.", "graders": ["infinity_state"], "start_url": "http://localhost:8000", "metadata": {"original_task_id": "elation-clinical-records-task_h66", "website": "elation-clinical-records", "category": "webarena-infinity", "additional": {"app_name": "elation-clinical-records", "difficulty": "hard", "verifier_path": "real-tasks/task_h66.py", "app_base_port": 8000}}}
-{"query_id": "infinity-elation-clinical-records-task_h62", "dataset": "webarena-infinity", "query": "Look up which template is assigned to the COVID Vaccine appointment type. Remove all its existing document tags and replace them with the single tag 'COVID-Protocol'. Then also assign that same template to the Urgent Same-Day appointment type.", "graders": ["infinity_state"], "start_url": "http://localhost:8000", "metadata": {"original_task_id": "elation-clinical-records-task_h62", "website": "elation-clinical-records", "category": "webarena-infinity", "additional": {"app_name": "elation-clinical-records", "difficulty": "hard", "verifier_path": "real-tasks/task_h62.py", "app_base_port": 8000}}}
-{"query_id": "infinity-elation-prescriptions-task_h32", "dataset": "webarena-infinity", "query": "The patient has a medication that's being dispensed as written (brand name only). Discontinue that prescription and replace it with a new one \u2014 same medication, same sig, same pharmacy \u2014 but allow generic substitution this time. Qty 30, 3 refills, 30 days supply.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h32", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h32.py", "app_base_port": 8020}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h48", "dataset": "webarena-infinity", "query": "Add the 'breaking-change' label to every open issue in the API v3 Migration epic and remove any existing workflow-scoped labels from those issues.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h48", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h48.py", "app_base_port": 8050}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h77", "dataset": "webarena-infinity", "query": "Rename the 'UX' label to 'user-experience', change its type to 'group', and then add it to every open issue in the Frontend Modernization epic that doesn't already have it.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h77", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h77.py", "app_base_port": 8050}}}
-{"query_id": "infinity-xero-invoicing-task_h15", "dataset": "webarena-infinity", "query": "Create a new invoice for Summit Health Group for an annual software license and 12 months of support with a 10% discount on support.", "graders": ["infinity_state"], "start_url": "http://localhost:8120", "metadata": {"original_task_id": "xero-invoicing-task_h15", "website": "xero-invoicing", "category": "webarena-infinity", "additional": {"app_name": "xero-invoicing", "difficulty": "hard", "verifier_path": "real-tasks/task_h15.py", "app_base_port": 8120}}}
-{"query_id": "infinity-elation-clinical-records-task_h55", "dataset": "webarena-infinity", "query": "Resolve every problem across all patients in the system that currently has a status of Controlled.", "graders": ["infinity_state"], "start_url": "http://localhost:8000", "metadata": {"original_task_id": "elation-clinical-records-task_h55", "website": "elation-clinical-records", "category": "webarena-infinity", "additional": {"app_name": "elation-clinical-records", "difficulty": "hard", "verifier_path": "real-tasks/task_h55.py", "app_base_port": 8000}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h8", "dataset": "webarena-infinity", "query": "Create a confidential issue titled 'Emergency security patch' with priority::critical and the 'security' label, assigned to James O'Brien and Oliver Schmidt, with weight 2 in the Security Hardening milestone.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h8", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h8.py", "app_base_port": 8050}}}
-{"query_id": "infinity-paypal-my-wallet-task_h20", "dataset": "webarena-infinity", "query": "Make a $200 payment on PayPal Credit and change autopay to pay the full balance.", "graders": ["infinity_state"], "start_url": "http://localhost:8100", "metadata": {"original_task_id": "paypal-my-wallet-task_h20", "website": "paypal-my-wallet", "category": "webarena-infinity", "additional": {"app_name": "paypal-my-wallet", "difficulty": "hard", "verifier_path": "real-tasks/task_h20.py", "app_base_port": 8100}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h52", "dataset": "webarena-infinity", "query": "Create a new board called 'Performance Tracker' with lists for the priority::critical, priority::high, and priority::medium labels. Then add the 'priority::high' label to every open issue in the v4.1 milestone that has the 'performance' label.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h52", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h52.py", "app_base_port": 8050}}}
-{"query_id": "infinity-paypal-my-wallet-task_h80", "dataset": "webarena-infinity", "query": "Save all available Food & Drink offers, buy a $25 DoorDash gift card for yourself, and switch currency conversion to use my card issuer.", "graders": ["infinity_state"], "start_url": "http://localhost:8100", "metadata": {"original_task_id": "paypal-my-wallet-task_h80", "website": "paypal-my-wallet", "category": "webarena-infinity", "additional": {"app_name": "paypal-my-wallet", "difficulty": "hard", "verifier_path": "real-tasks/task_h80.py", "app_base_port": 8100}}}
-{"query_id": "infinity-gmail-accounts-and-contacts-task_h50", "dataset": "webarena-infinity", "query": "Add the Emergency label to every contact who is currently listed as a delegate (active, pending, or expired). Then remove all delegates whose status is not 'active'.", "graders": ["infinity_state"], "start_url": "http://localhost:8070", "metadata": {"original_task_id": "gmail-accounts-and-contacts-task_h50", "website": "gmail-accounts-and-contacts", "category": "webarena-infinity", "additional": {"app_name": "gmail-accounts-and-contacts", "difficulty": "hard", "verifier_path": "real-tasks/task_h50.py", "app_base_port": 8070}}}
-{"query_id": "infinity-elation-clinical-records-task_h14", "dataset": "webarena-infinity", "query": "Add the tag 'Flu-Season' to every patient whose primary provider is Dr. Sarah Chen.", "graders": ["infinity_state"], "start_url": "http://localhost:8000", "metadata": {"original_task_id": "elation-clinical-records-task_h14", "website": "elation-clinical-records", "category": "webarena-infinity", "additional": {"app_name": "elation-clinical-records", "difficulty": "hard", "verifier_path": "real-tasks/task_h14.py", "app_base_port": 8000}}}
-{"query_id": "infinity-figma-text-and-typography-task_h7", "dataset": "webarena-infinity", "query": "Remove all list formatting from every layer.", "graders": ["infinity_state"], "start_url": "http://localhost:8040", "metadata": {"original_task_id": "figma-text-and-typography-task_h7", "website": "figma-text-and-typography", "category": "webarena-infinity", "additional": {"app_name": "figma-text-and-typography", "difficulty": "hard", "verifier_path": "real-tasks/task_h7.py", "app_base_port": 8040}}}
-{"query_id": "infinity-paypal-my-wallet-task_h26", "dataset": "webarena-infinity", "query": "Send a $50 Amazon gift card to sarah.chen@email.com with 'Thank you!' as the message, and save the Amazon cashback offer.", "graders": ["infinity_state"], "start_url": "http://localhost:8100", "metadata": {"original_task_id": "paypal-my-wallet-task_h26", "website": "paypal-my-wallet", "category": "webarena-infinity", "additional": {"app_name": "paypal-my-wallet", "difficulty": "hard", "verifier_path": "real-tasks/task_h26.py", "app_base_port": 8100}}}
-{"query_id": "infinity-handshake-career-exploration-task_h97", "dataset": "webarena-infinity", "query": "Find the single most helpful answer across all Q&A questions and mark it helpful. Then find the most-viewed question and submit your own answer to it.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h97", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h97.py", "app_base_port": 8080}}}
-{"query_id": "infinity-figma-slides-task_h79", "dataset": "webarena-infinity", "query": "In the adoption table, find the feature with the highest Target Q4 percentage. In the competitive table, change DesignCraft's entry for that same feature to 'Market Leader'. Then update that feature's Target Q4 to '95%'.", "graders": ["infinity_state"], "start_url": "http://localhost:8030", "metadata": {"original_task_id": "figma-slides-task_h79", "website": "figma-slides", "category": "webarena-infinity", "additional": {"app_name": "figma-slides", "difficulty": "hard", "verifier_path": "real-tasks/task_h79.py", "app_base_port": 8030}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h41", "dataset": "webarena-infinity", "query": "For every open issue in the v4.2 - Security Hardening milestone: if it is already confidential, set its health status to 'at risk'. If it is not confidential, make it confidential and set its health status to 'needs attention'.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h41", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h41.py", "app_base_port": 8050}}}
-{"query_id": "infinity-handshake-career-exploration-task_h90", "dataset": "webarena-infinity", "query": "A student in the feed mentioned attending the NSBE conference. That student also answered a Q&A question about diversity programs in tech. Submit your own answer to that same question sharing your experience, then bookmark that student's feed post.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h90", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h90.py", "app_base_port": 8080}}}
-{"query_id": "infinity-elation-prescriptions-task_h30", "dataset": "webarena-infinity", "query": "The patient has three temporary medications. Discontinue the corticosteroid taper and the penicillin antibiotic \u2014 the patient completed both courses. Move the remaining temporary medication to permanent Rx.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h30", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h30.py", "app_base_port": 8020}}}
-{"query_id": "infinity-linear-account-settings-task_h19", "dataset": "webarena-infinity", "query": "Turn off all desktop application settings: open in desktop app, notification badge, and spell check.", "graders": ["infinity_state"], "start_url": "http://localhost:8090", "metadata": {"original_task_id": "linear-account-settings-task_h19", "website": "linear-account-settings", "category": "webarena-infinity", "additional": {"app_name": "linear-account-settings", "difficulty": "hard", "verifier_path": "real-tasks/task_h19.py", "app_base_port": 8090}}}
-{"query_id": "infinity-elation-prescriptions-task_h39", "dataset": "webarena-infinity", "query": "Change the default pharmacy to Express Scripts Mail Pharmacy for mail-order prescriptions. Then document that the patient takes Magnesium Citrate 400mg tablet as an OTC supplement \u2014 once daily at bedtime, 30-day supply.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h39", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h39.py", "app_base_port": 8020}}}
-{"query_id": "infinity-handshake-career-exploration-task_h136", "dataset": "webarena-infinity", "query": "Your earliest completed appointment was a specific type. Schedule a follow-up appointment of the same category and type with the same staff member, for March 28, 2026 at 9:00 AM, in person.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h136", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h136.py", "app_base_port": 8080}}}
-{"query_id": "infinity-handshake-career-exploration-task_h105", "dataset": "webarena-infinity", "query": "Find the second-most-viewed question in Q&A. It has two answers \u2014 mark the one with fewer helpful votes as helpful.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h105", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h105.py", "app_base_port": 8080}}}
-{"query_id": "infinity-gmail-accounts-and-contacts-task_h22", "dataset": "webarena-infinity", "query": "The Engineering Manager at TechCorp is listed as one of your delegates. Remove her delegation and unstar her contact.", "graders": ["infinity_state"], "start_url": "http://localhost:8070", "metadata": {"original_task_id": "gmail-accounts-and-contacts-task_h22", "website": "gmail-accounts-and-contacts", "category": "webarena-infinity", "additional": {"app_name": "gmail-accounts-and-contacts", "difficulty": "hard", "verifier_path": "real-tasks/task_h22.py", "app_base_port": 8070}}}
-{"query_id": "infinity-elation-patient-communication-task_h9", "dataset": "webarena-infinity", "query": "Acknowledge all unacknowledged reminders in the system.", "graders": ["infinity_state"], "start_url": "http://localhost:8010", "metadata": {"original_task_id": "elation-patient-communication-task_h9", "website": "elation-patient-communication", "category": "webarena-infinity", "additional": {"app_name": "elation-patient-communication", "difficulty": "hard", "verifier_path": "real-tasks/task_h9.py", "app_base_port": 8010}}}
-{"query_id": "infinity-superhuman-general-task_h1", "dataset": "webarena-infinity", "query": "Label the FinancePlus partnership email and the QuantumLab prototype email as 'Clients'.", "graders": ["infinity_state"], "start_url": "http://localhost:8110", "metadata": {"original_task_id": "superhuman-general-task_h1", "website": "superhuman-general", "category": "webarena-infinity", "additional": {"app_name": "superhuman-general", "difficulty": "hard", "verifier_path": "real-tasks/task_h1.py", "app_base_port": 8110}}}
-{"query_id": "infinity-xero-invoicing-task_h79", "dataset": "webarena-infinity", "query": "Change the invoice prefix to 'AUS-' and the next number to 100, then create a new invoice for CloudNine Analytics for 8 hours of UI/UX design work.", "graders": ["infinity_state"], "start_url": "http://localhost:8120", "metadata": {"original_task_id": "xero-invoicing-task_h79", "website": "xero-invoicing", "category": "webarena-infinity", "additional": {"app_name": "xero-invoicing", "difficulty": "hard", "verifier_path": "real-tasks/task_h79.py", "app_base_port": 8120}}}
-{"query_id": "infinity-figma-slides-task_h16", "dataset": "webarena-infinity", "query": "Enable slide numbers on every slide using the 'with total' format and change the aspect ratio to 4:3.", "graders": ["infinity_state"], "start_url": "http://localhost:8030", "metadata": {"original_task_id": "figma-slides-task_h16", "website": "figma-slides", "category": "webarena-infinity", "additional": {"app_name": "figma-slides", "difficulty": "hard", "verifier_path": "real-tasks/task_h16.py", "app_base_port": 8030}}}
-{"query_id": "infinity-linear-account-settings-task_h16", "dataset": "webarena-infinity", "query": "Revoke all API keys that have an expiration date.", "graders": ["infinity_state"], "start_url": "http://localhost:8090", "metadata": {"original_task_id": "linear-account-settings-task_h16", "website": "linear-account-settings", "category": "webarena-infinity", "additional": {"app_name": "linear-account-settings", "difficulty": "hard", "verifier_path": "real-tasks/task_h16.py", "app_base_port": 8090}}}
-{"query_id": "infinity-elation-prescriptions-task_h2", "dataset": "webarena-infinity", "query": "Prescribe Buspirone 10mg for the patient's anxiety \u2014 once daily in the morning, qty 30, 5 refills. Send it to the same pharmacy that fills his Sertraline.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h2", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h2.py", "app_base_port": 8020}}}
-{"query_id": "infinity-handshake-career-exploration-task_h1", "dataset": "webarena-infinity", "query": "Follow all consulting firms on Handshake.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h1", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h1.py", "app_base_port": 8080}}}
-{"query_id": "infinity-handshake-career-exploration-task_h141", "dataset": "webarena-infinity", "query": "Some of your saved jobs are from employers you haven't followed yet. Find and follow each of those employers.", "graders": ["infinity_state"], "start_url": "http://localhost:8080", "metadata": {"original_task_id": "handshake-career-exploration-task_h141", "website": "handshake-career-exploration", "category": "webarena-infinity", "additional": {"app_name": "handshake-career-exploration", "difficulty": "hard", "verifier_path": "real-tasks/task_h141.py", "app_base_port": 8080}}}
-{"query_id": "infinity-figma-text-and-typography-task_h74", "dataset": "webarena-infinity", "query": "Set the spelling language to Japanese, the big nudge amount to 50, and the default horizontal alignment to right.", "graders": ["infinity_state"], "start_url": "http://localhost:8040", "metadata": {"original_task_id": "figma-text-and-typography-task_h74", "website": "figma-text-and-typography", "category": "webarena-infinity", "additional": {"app_name": "figma-text-and-typography", "difficulty": "hard", "verifier_path": "real-tasks/task_h74.py", "app_base_port": 8040}}}
-{"query_id": "infinity-elation-patient-communication-task_h63", "dataset": "webarena-infinity", "query": "Check the visit summaries to find the patient whose BNP level improved. Reply to their most recent message confirming they can resume light activity, then update their emergency contact's phone number to (650) 555-0001.", "graders": ["infinity_state"], "start_url": "http://localhost:8010", "metadata": {"original_task_id": "elation-patient-communication-task_h63", "website": "elation-patient-communication", "category": "webarena-infinity", "additional": {"app_name": "elation-patient-communication", "difficulty": "hard", "verifier_path": "real-tasks/task_h63.py", "app_base_port": 8010}}}
-{"query_id": "infinity-elation-patient-communication-task_h14", "dataset": "webarena-infinity", "query": "Change Dr. Torres's notification timeframe to 'Do not notify me' and remove Dr. Torres from Dr. Chen's General Question routing.", "graders": ["infinity_state"], "start_url": "http://localhost:8010", "metadata": {"original_task_id": "elation-patient-communication-task_h14", "website": "elation-patient-communication", "category": "webarena-infinity", "additional": {"app_name": "elation-patient-communication", "difficulty": "hard", "verifier_path": "real-tasks/task_h14.py", "app_base_port": 8010}}}
-{"query_id": "infinity-gitlab-plan-and-track-task_h67", "dataset": "webarena-infinity", "query": "Delete all time entries from the GraphQL gateway issue, add a single new entry of 16 hours with summary 'Complete rewrite estimate', and set its time estimate to 40 hours.", "graders": ["infinity_state"], "start_url": "http://localhost:8050", "metadata": {"original_task_id": "gitlab-plan-and-track-task_h67", "website": "gitlab-plan-and-track", "category": "webarena-infinity", "additional": {"app_name": "gitlab-plan-and-track", "difficulty": "hard", "verifier_path": "real-tasks/task_h67.py", "app_base_port": 8050}}}
-{"query_id": "infinity-gmail-accounts-and-contacts-task_h73", "dataset": "webarena-infinity", "query": "Among the individual people in your other contacts (those with a first and last name), find the one who was saved most recently. Move them to your main contacts, set their company to 'Salesforce', job title to 'Account Executive', and add the Work label.", "graders": ["infinity_state"], "start_url": "http://localhost:8070", "metadata": {"original_task_id": "gmail-accounts-and-contacts-task_h73", "website": "gmail-accounts-and-contacts", "category": "webarena-infinity", "additional": {"app_name": "gmail-accounts-and-contacts", "difficulty": "hard", "verifier_path": "real-tasks/task_h73.py", "app_base_port": 8070}}}
-{"query_id": "infinity-elation-prescriptions-task_h4", "dataset": "webarena-infinity", "query": "Run a medication reconciliation and mark the Calcium+D3 supplement for discontinuation during the review.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h4", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h4.py", "app_base_port": 8020}}}
-{"query_id": "infinity-elation-prescriptions-task_h47", "dataset": "webarena-infinity", "query": "The patient's SSRI is currently dispensed at a different pharmacy than most of his other medications. Prescribe a refill of the same SSRI at the same dose and sig, but send it to CVS #4521 instead \u2014 qty 30, 5 refills, 30 days supply.", "graders": ["infinity_state"], "start_url": "http://localhost:8020", "metadata": {"original_task_id": "elation-prescriptions-task_h47", "website": "elation-prescriptions", "category": "webarena-infinity", "additional": {"app_name": "elation-prescriptions", "difficulty": "hard", "verifier_path": "real-tasks/task_h47.py", "app_base_port": 8020}}}
-{"query_id": "infinity-paypal-my-wallet-task_h89", "dataset": "webarena-infinity", "query": "If your USD PayPal balance is above $2,500, convert $500 to Japanese Yen. If it is $2,500 or below, first add $500 from your Chase bank account, then convert $500 to JPY. Either way, set the debit card cash back category to Fuel.", "graders": ["infinity_state"], "start_url": "http://localhost:8100", "metadata": {"original_task_id": "paypal-my-wallet-task_h89", "website": "paypal-my-wallet", "category": "webarena-infinity", "additional": {"app_name": "paypal-my-wallet", "difficulty": "hard", "verifier_path": "real-tasks/task_h89.py", "app_base_port": 8100}}}
--- a/packages/browseros-agent/apps/eval/scripts/agisdk-evaluate.py
+++ b/packages/browseros-agent/apps/eval/scripts/agisdk-evaluate.py
@@ -1,88 +0,0 @@
-#!/usr/bin/env python3
-"""
-AGI SDK evaluation helper for BrowserOS eval framework.
-
-Reads JSON from stdin with task_id and env_state, runs the agisdk
-evaluator, and outputs the result as JSON to stdout.
-
-Input format:
-    {"task_id": "dashdish-1", "env_state": {...}, "model_response": ""}
-
-Output format:
-    {"reward": 0.0, "pass": false, "message": "...", "per_criterion": [...]}
-"""
-
-import json
-import sys
-
-
-def main():
-    data = json.loads(sys.stdin.read())
-    task_id = data["task_id"]
-    env_state = data["env_state"]
-    model_response = data.get("model_response", "")
-
-    try:
-        from agisdk.REAL.browsergym.webclones.evaluate import WebCloneEvaluator
-        from agisdk.REAL.browsergym.webclones.task_config import TaskConfig
-    except ImportError:
-        print(
-            json.dumps(
-                {
-                    "reward": 0,
-                    "pass": False,
-                    "message": "agisdk package not installed. Run: pip install agisdk",
-                    "per_criterion": [],
-                }
-            )
-        )
-        sys.exit(0)
-
-    try:
-        # Redirect stdout to stderr during evaluation — agisdk's rich logger
-        # prints directly to stdout, which would corrupt our JSON output
-        real_stdout = sys.stdout
-        sys.stdout = sys.stderr
-
-        tc = TaskConfig(task_id)
-        evaluator = WebCloneEvaluator(tc)
-        reward_val, _done, message, info = evaluator.evaluate(
-            env_state=env_state, model_response=model_response
-        )
-
-        sys.stdout = real_stdout
-
-        reward_val = float(reward_val) if reward_val is not None else 0.0
-        results = info.get("results", [])
-        per_criterion = [
-            {"passed": r[0], "detail": str(r[1]) if len(r) > 1 else ""}
-            for r in results
-        ]
-
-        print(
-            json.dumps(
-                {
-                    "reward": reward_val,
-                    "pass": reward_val == 1.0,
-                    "message": str(message),
-                    "per_criterion": per_criterion,
-                }
-            )
-        )
-
-    except Exception as e:
-        sys.stdout = real_stdout if "real_stdout" in dir() else sys.__stdout__
-        print(
-            json.dumps(
-                {
-                    "reward": 0,
-                    "pass": False,
-                    "message": f"Evaluation error: {str(e)}",
-                    "per_criterion": [],
-                }
-            )
-        )
-
-
-if __name__ == "__main__":
-    main()
--- a/packages/browseros-agent/apps/eval/scripts/build-agisdk-dataset.py
+++ b/packages/browseros-agent/apps/eval/scripts/build-agisdk-dataset.py
@@ -1,83 +0,0 @@
-#!/usr/bin/env python3
-"""
-Build JSONL dataset for AGI SDK / REAL Bench evaluation.
-
-Reads task definitions from the agisdk package, filters to feasible
-action-only tasks (excludes llm_boolean evaluators), and outputs JSONL
-to stdout in the BrowserOS eval framework format.
-
-Usage:
-    python scripts/build-agisdk-dataset.py > data/agisdk-real.jsonl
-"""
-
-import json
-import sys
-
-
-def has_llm_eval(task: dict) -> bool:
-    return any(e.get("type") == "llm_boolean" for e in task.get("evals", []))
-
-
-def main():
-    try:
-        from agisdk.REAL.tasks import all_tasks
-    except ImportError:
-        print(
-            "Error: agisdk package not installed. Run: pip install agisdk",
-            file=sys.stderr,
-        )
-        sys.exit(1)
-
-    count = 0
-    skipped_infeasible = 0
-    skipped_llm = 0
-
-    for task in all_tasks:
-        if not task.get("possible", True):
-            skipped_infeasible += 1
-            continue
-
-        if has_llm_eval(task):
-            skipped_llm += 1
-            continue
-
-        task_id = task["id"]
-        website = task.get("website", {})
-        goal = task.get("goal", "")
-        start_url = website.get("url", "")
-
-        if not start_url or not goal:
-            print(f"Warning: Skipping {task_id} — missing url or goal", file=sys.stderr)
-            continue
-
-        entry = {
-            "query_id": f"agisdk-{task_id}",
-            "dataset": "agisdk-real",
-            "query": goal,
-            "graders": ["agisdk_state_diff"],
-            "start_url": start_url,
-            "metadata": {
-                "original_task_id": task_id,
-                "website": website.get("name", ""),
-                "category": "agisdk-real",
-                "additional": {
-                    "agisdk_task_id": task_id,
-                    "challenge_type": task.get("challengeType", "action"),
-                    "difficulty": task.get("difficulty", "unknown"),
-                    "similar_to": website.get("similarTo", ""),
-                },
-            },
-        }
-
-        print(json.dumps(entry))
-        count += 1
-
-    print(
-        f"Generated {count} tasks (skipped {skipped_infeasible} infeasible, "
-        f"{skipped_llm} llm_boolean)",
-        file=sys.stderr,
-    )
-
-
-if __name__ == "__main__":
-    main()
--- a/packages/browseros-agent/apps/eval/scripts/build-infinity-dataset.py
+++ b/packages/browseros-agent/apps/eval/scripts/build-infinity-dataset.py
@@ -1,118 +0,0 @@
-#!/usr/bin/env python3
-"""
-Dataset generator for WebArena-Infinity benchmark.
-
-Reads real-tasks.json from each app directory and outputs JSONL
-in the eval framework's TaskSchema format.
-
-Usage:
-    python build-infinity-dataset.py --apps-dir /path/to/webarena-infinity/apps
-    python build-infinity-dataset.py --apps-dir /path/to/apps --apps gmail linear --difficulty medium
-"""
-
-import argparse
-import json
-import os
-import sys
-
-
-def load_tasks(app_dir: str, app_name: str) -> list[dict]:
-    tasks_file = os.path.join(app_dir, "real-tasks.json")
-    if not os.path.exists(tasks_file):
-        print(f"Warning: No real-tasks.json found in {app_dir}", file=sys.stderr)
-        return []
-    with open(tasks_file) as f:
-        return json.load(f)
-
-
-def build_task_entry(
-    app_name: str,
-    task: dict,
-    base_port: int,
-) -> dict:
-    task_id = task.get("id", task.get("task_id", "unknown"))
-    difficulty = task.get("difficulty", "unknown")
-    query = task.get("query", task.get("instruction", task.get("task", "")))
-    verifier_path = task.get(
-        "verify",
-        task.get("verifier_path", f"real-tasks/{task_id}.py"),
-    )
-
-    return {
-        "query_id": f"infinity-{app_name}-{task_id}",
-        "dataset": "webarena-infinity",
-        "query": query,
-        "graders": ["infinity_state"],
-        "start_url": f"http://localhost:{base_port}",
-        "setup_script": f"POST http://localhost:{base_port}/api/reset",
-        "metadata": {
-            "original_task_id": f"{app_name}-{task_id}",
-            "website": app_name,
-            "category": "webarena-infinity",
-            "additional": {
-                "app_name": app_name,
-                "difficulty": difficulty,
-                "verifier_path": verifier_path,
-                "app_port": base_port,
-            },
-        },
-    }
-
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="Generate JSONL dataset from WebArena-Infinity apps"
-    )
-    parser.add_argument(
-        "--apps-dir",
-        required=True,
-        help="Path to webarena-infinity/apps/ directory",
-    )
-    parser.add_argument(
-        "--apps",
-        nargs="*",
-        default=None,
-        help="Filter to specific app names (default: all)",
-    )
-    parser.add_argument(
-        "--difficulty",
-        choices=["easy", "medium", "hard"],
-        default=None,
-        help="Filter by difficulty tier",
-    )
-    parser.add_argument(
-        "--base-port",
-        type=int,
-        default=8000,
-        help="Starting port number for apps (default: 8000)",
-    )
-    args = parser.parse_args()
-
-    if not os.path.isdir(args.apps_dir):
-        print(f"Error: {args.apps_dir} is not a directory", file=sys.stderr)
-        sys.exit(1)
-
-    app_dirs = sorted(os.listdir(args.apps_dir))
-    if args.apps:
-        app_dirs = [d for d in app_dirs if d in args.apps]
-
-    port = args.base_port
-    for app_name in app_dirs:
-        app_path = os.path.join(args.apps_dir, app_name)
-        if not os.path.isdir(app_path):
-            continue
-
-        tasks = load_tasks(app_path, app_name)
-        for task in tasks:
-            difficulty = task.get("difficulty", "unknown")
-            if args.difficulty and difficulty != args.difficulty:
-                continue
-
-            entry = build_task_entry(app_name, task, port)
-            print(json.dumps(entry))
-
-        port += 1
-
-
-if __name__ == "__main__":
-    main()
--- a/packages/browseros-agent/apps/eval/scripts/infinity-evaluate.py
+++ b/packages/browseros-agent/apps/eval/scripts/infinity-evaluate.py
@@ -1,82 +0,0 @@
-#!/usr/bin/env python3
-"""
-Evaluation helper for WebArena-Infinity verifier scripts.
-
-Reads JSON from stdin with app_server_url, verifier_path, and task_id.
-Runs the verifier against the app server and outputs a JSON result.
-
-Verifiers have the signature: verify(server_url: str) -> tuple[bool, str]
-They fetch /api/state internally and return (passed, message).
-
-Usage:
-    echo '{"app_server_url": "http://localhost:8000", "verifier_path": "/path/to/verify.py"}' | python infinity-evaluate.py
-"""
-
-import importlib.util
-import json
-import sys
-import traceback
-
-
-def load_verifier(verifier_path: str):
-    spec = importlib.util.spec_from_file_location("verifier", verifier_path)
-    if spec is None or spec.loader is None:
-        raise ImportError(f"Cannot load verifier from {verifier_path}")
-    module = importlib.util.module_from_spec(spec)
-    spec.loader.exec_module(module)
-    return module
-
-
-def main():
-    try:
-        data = json.loads(sys.stdin.read())
-    except json.JSONDecodeError as e:
-        print(json.dumps({"pass": False, "reward": 0.0, "message": f"Invalid JSON input: {e}"}))
-        sys.exit(1)
-
-    server_url = data.get("app_server_url", "")
-    verifier_path = data.get("verifier_path", "")
-
-    if not server_url or not verifier_path:
-        print(json.dumps({
-            "pass": False,
-            "reward": 0.0,
-            "message": "Missing app_server_url or verifier_path",
-        }))
-        sys.exit(1)
-
-    try:
-        verifier = load_verifier(verifier_path)
-        fn = getattr(verifier, "verify", None)
-        if not callable(fn):
-            raise AttributeError(
-                f"Verifier has no verify() function. "
-                f"Available: {[a for a in dir(verifier) if not a.startswith('_')]}"
-            )
-
-        # Verifiers take server_url and fetch state internally
-        result = fn(server_url)
-
-        # Return is tuple[bool, str]
-        if isinstance(result, tuple) and len(result) >= 2:
-            passed, message = result[0], str(result[1])
-        else:
-            passed, message = bool(result), str(result)
-
-    except Exception as e:
-        print(json.dumps({
-            "pass": False,
-            "reward": 0.0,
-            "message": f"Verifier error: {e}\n{traceback.format_exc()}",
-        }))
-        sys.exit(1)
-
-    print(json.dumps({
-        "pass": passed,
-        "reward": 1.0 if passed else 0.0,
-        "message": message,
-    }))
-
-
-if __name__ == "__main__":
-    main()
--- a/packages/browseros-agent/apps/eval/scripts/weekly-report.ts
+++ b/packages/browseros-agent/apps/eval/scripts/weekly-report.ts
@@ -59,8 +59,6 @@ interface RunSummary {
 }

 const PASS_FAIL_GRADER_ORDER = [
-  'agisdk_state_diff',
-  'infinity_state',
  'performance_grader',
  'webvoyager_grader',
  'fara_combined',
--- a/packages/browseros-agent/apps/eval/src/graders/benchmark/agisdk-state-diff.ts
+++ b/packages/browseros-agent/apps/eval/src/graders/benchmark/agisdk-state-diff.ts
@@ -1,202 +0,0 @@
-import { spawn } from 'node:child_process'
-import { join } from 'node:path'
-import type { GraderResult } from '../../types'
-import { callMcpTool } from '../../utils/mcp-client'
-import type { Grader, GraderInput } from '../types'
-
-const EVAL_SCRIPT = join(
-  import.meta.dirname,
-  '..',
-  '..',
-  '..',
-  'scripts',
-  'agisdk-evaluate.py',
-)
-
-export class AgisdkStateDiffGrader implements Grader {
-  name = 'agisdk_state_diff'
-
-  async grade(input: GraderInput): Promise<GraderResult> {
-    const taskId = this.extractTaskId(input.task.query_id)
-    const startUrl = this.extractStartUrl(input)
-    const mcpEndpoint =
-      input.mcpUrl ||
-      `${process.env.BROWSEROS_SERVER_URL || 'http://127.0.0.1:9110'}/mcp`
-
-    if (!startUrl) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: 'Could not determine clone site URL from task',
-      }
-    }
-
-    const origin = new URL(startUrl).origin
-
-    let envState: Record<string, unknown>
-    try {
-      envState = await this.fetchFinishState(origin, mcpEndpoint)
-    } catch (error) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: `Failed to fetch /finish endpoint: ${error instanceof Error ? error.message : String(error)}`,
-        details: { origin, error: true },
-      }
-    }
-
-    try {
-      const result = await this.runPythonEvaluator(
-        taskId,
-        envState,
-        input.finalAnswer || '',
-      )
-      return {
-        score: result.reward,
-        pass: result.pass,
-        reasoning:
-          result.message ||
-          (result.pass ? 'All criteria passed' : 'Some criteria failed'),
-        details: {
-          reward: result.reward,
-          per_criterion: result.per_criterion,
-          origin,
-          agisdk_task_id: taskId,
-        },
-      }
-    } catch (error) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: `Python evaluator error: ${error instanceof Error ? error.message : String(error)}`,
-        details: { error: true },
-      }
-    }
-  }
-
-  private extractTaskId(queryId: string): string {
-    return queryId.replace(/^agisdk-/, '')
-  }
-
-  private extractStartUrl(input: GraderInput): string | null {
-    // Derive from task_id: "dashdish-10" → "https://evals-dashdish.vercel.app"
-    // Task IDs are "{site}-{number}" where site may contain hyphens (e.g. "fly-unified-5")
-    const taskId = this.extractTaskId(input.task.query_id)
-    const siteId = taskId.replace(/-\d+$/, '')
-    if (siteId) return `https://evals-${siteId}.vercel.app`
-
-    // Fallback: search messages for vercel.app URLs
-    for (const msg of input.messages) {
-      const text =
-        msg.type === 'user'
-          ? msg.content
-          : msg.type === 'tool-input-available'
-            ? JSON.stringify(msg.input)
-            : ''
-      const urlMatch = text.match(/https?:\/\/[^\s"']+\.vercel\.app/)
-      if (urlMatch) return urlMatch[0]
-    }
-
-    return null
-  }
-
-  private async fetchFinishState(
-    origin: string,
-    mcpEndpoint: string,
-  ): Promise<Record<string, unknown>> {
-    const finishUrl = `${origin}/finish`
-
-    // Navigate browser to /finish page (state diff is rendered client-side)
-    await callMcpTool(mcpEndpoint, 'navigate_page', {
-      url: finishUrl,
-      page: 1,
-    })
-
-    // Wait for the page to render, then extract JSON from <pre> element
-    const result = await callMcpTool(mcpEndpoint, 'evaluate_script', {
-      page: 1,
-      expression: `
-        new Promise((resolve, reject) => {
-          let attempts = 0;
-          const check = () => {
-            const pre = document.querySelector('pre');
-            if (pre && pre.textContent.trim().startsWith('{')) {
-              resolve(pre.textContent);
-            } else if (++attempts > 20) {
-              reject(new Error('Timed out waiting for <pre> JSON on /finish'));
-            } else {
-              setTimeout(check, 500);
-            }
-          };
-          check();
-        })
-      `,
-    })
-
-    const textContent = result.content?.find(
-      (c: { type: string }) => c.type === 'text',
-    )
-    if (!textContent?.text) {
-      throw new Error('No text content returned from /finish page')
-    }
-
-    return JSON.parse(textContent.text) as Record<string, unknown>
-  }
-
-  private runPythonEvaluator(
-    taskId: string,
-    envState: Record<string, unknown>,
-    modelResponse: string,
-  ): Promise<{
-    reward: number
-    pass: boolean
-    message: string
-    per_criterion: unknown[]
-  }> {
-    return new Promise((resolve, reject) => {
-      const proc = spawn('python3', [EVAL_SCRIPT], {
-        stdio: ['pipe', 'pipe', 'pipe'],
-      })
-
-      const inputData = JSON.stringify({
-        task_id: taskId,
-        env_state: envState,
-        model_response: modelResponse,
-      })
-
-      let stdout = ''
-      let stderr = ''
-
-      proc.stdout.on('data', (data: Buffer) => {
-        stdout += data.toString()
-      })
-
-      proc.stderr.on('data', (data: Buffer) => {
-        stderr += data.toString()
-      })
-
-      proc.on('close', (code) => {
-        if (code !== 0) {
-          reject(
-            new Error(`Python evaluator exited with code ${code}: ${stderr}`),
-          )
-          return
-        }
-
-        try {
-          const result = JSON.parse(stdout.trim())
-          resolve(result)
-        } catch {
-          reject(new Error(`Failed to parse evaluator output: ${stdout}`))
-        }
-      })
-
-      proc.on('error', (err) => {
-        reject(new Error(`Failed to spawn Python evaluator: ${err.message}`))
-      })
-
-      proc.stdin.write(inputData)
-      proc.stdin.end()
-    })
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/graders/benchmark/infinity-state.ts
+++ b/packages/browseros-agent/apps/eval/src/graders/benchmark/infinity-state.ts
@@ -1,134 +0,0 @@
-import { join, resolve } from 'node:path'
-import type { GraderResult } from '../../types'
-import type { Grader, GraderInput } from '../types'
-
-interface InfinityEvalInput {
-  app_server_url: string
-  verifier_path: string
-  task_id: string
-}
-
-interface InfinityEvalOutput {
-  pass: boolean
-  reward: number
-  message: string
-}
-
-const EVAL_SCRIPT = resolve(
-  import.meta.dir,
-  '../../../scripts/infinity-evaluate.py',
-)
-
-export class InfinityStateGrader implements Grader {
-  name = 'infinity_state'
-
-  async grade(input: GraderInput): Promise<GraderResult> {
-    const parsed = this.parseQueryId(input.task.query_id)
-    if (!parsed) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: `Cannot parse query_id "${input.task.query_id}" — expected format: infinity-{app}-{task_id}`,
-      }
-    }
-
-    const appServerUrl = this.resolveAppServerUrl(input)
-    if (!appServerUrl) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: 'Cannot determine app server URL',
-      }
-    }
-
-    const infinityDir = process.env.WEBARENA_INFINITY_DIR
-    if (!infinityDir) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning:
-          'WEBARENA_INFINITY_DIR env var not set. Point it to the webarena-infinity repo root.',
-      }
-    }
-
-    const verifierPath = join(
-      infinityDir,
-      'apps',
-      parsed.appName,
-      'real-tasks',
-      `${parsed.taskId}.py`,
-    )
-
-    const evalInput: InfinityEvalInput = {
-      app_server_url: appServerUrl,
-      verifier_path: verifierPath,
-      task_id: input.task.query_id,
-    }
-
-    try {
-      const result = await this.runPythonEvaluator(evalInput)
-      return {
-        score: result.pass ? 1 : 0,
-        pass: result.pass,
-        reasoning: result.message,
-        details: {
-          reward: result.reward,
-          app_name: parsed.appName,
-          app_server_url: appServerUrl,
-        },
-      }
-    } catch (error) {
-      return {
-        score: 0,
-        pass: false,
-        reasoning: `Evaluator process error: ${error instanceof Error ? error.message : String(error)}`,
-      }
-    }
-  }
-
-  private parseQueryId(
-    queryId: string,
-  ): { appName: string; taskId: string } | null {
-    // Task IDs start with "task_", app names may contain hyphens
-    // e.g. "infinity-elation-prescriptions-task_h69"
-    const match = queryId.match(/^infinity-(.+)-(task_.+)$/)
-    if (!match) return null
-    return { appName: match[1], taskId: match[2] }
-  }
-
-  private resolveAppServerUrl(input: GraderInput): string | null {
-    // Passed directly from task executor (started by InfinityAppManager)
-    if (input.infinityAppUrl) return input.infinityAppUrl
-
-    // Fallback: env var for manual testing
-    if (process.env.INFINITY_APP_URL) return process.env.INFINITY_APP_URL
-
-    return null
-  }
-
-  private async runPythonEvaluator(
-    evalInput: InfinityEvalInput,
-  ): Promise<InfinityEvalOutput> {
-    const proc = Bun.spawn(['python3', EVAL_SCRIPT], {
-      stdin: 'pipe',
-      stdout: 'pipe',
-      stderr: 'pipe',
-    })
-
-    const inputJson = JSON.stringify(evalInput)
-    proc.stdin.write(inputJson)
-    proc.stdin.end()
-
-    const stdout = await new Response(proc.stdout).text()
-    const stderr = await new Response(proc.stderr).text()
-    const exitCode = await proc.exited
-
-    if (exitCode !== 0) {
-      throw new Error(
-        `Python evaluator exited with code ${exitCode}: ${stderr || stdout}`,
-      )
-    }
-
-    return JSON.parse(stdout.trim()) as InfinityEvalOutput
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/graders/registry.ts
+++ b/packages/browseros-agent/apps/eval/src/graders/registry.ts
@@ -1,6 +1,4 @@
 import type { GraderResult } from '../types'
-import { AgisdkStateDiffGrader } from './benchmark/agisdk-state-diff'
-import { InfinityStateGrader } from './benchmark/infinity-state'
 import { Mind2WebJudgeGrader } from './benchmark/mind2web'
 import { WebVoyagerGrader } from './benchmark/webvoyager'
 import { FaraAlignmentGrader } from './fara/alignment'
@@ -21,13 +19,7 @@ export function createGrader(
  options: GraderOptions | null,
 ): Grader | null {
  switch (name) {
-    // Deterministic benchmark graders (no LLM judge)
-    case 'agisdk_state_diff':
-      return new AgisdkStateDiffGrader()
-    case 'infinity_state':
-      return new InfinityStateGrader()
-
-    // LLM-based benchmark graders
+    // Benchmark graders
    case 'webvoyager_grader':
      if (!options?.apiKey) return null
      return new WebVoyagerGrader(
@@ -115,12 +107,10 @@ export async function runGraders(

 // Export grader classes for direct use
 export {
-  AgisdkStateDiffGrader,
  FaraAlignmentGrader,
  FaraCombinedGrader,
  FaraMultimodalGrader,
  FaraRubricGrader,
-  InfinityStateGrader,
  Mind2WebJudgeGrader,
  PerformanceGrader,
  WebVoyagerGrader,
--- a/packages/browseros-agent/apps/eval/src/graders/types.ts
+++ b/packages/browseros-agent/apps/eval/src/graders/types.ts
@@ -11,8 +11,6 @@ export interface GraderInput {
  finalAnswer: string | null
  expectedAnswer?: string | null
  outputDir: string
-  mcpUrl?: string
-  infinityAppUrl?: string
 }

 export interface Grader {
--- a/packages/browseros-agent/apps/eval/src/runner/infinity-app-manager.ts
+++ b/packages/browseros-agent/apps/eval/src/runner/infinity-app-manager.ts
@@ -1,89 +0,0 @@
-/**
- * Manages WebArena-Infinity app server lifecycle per task.
- *
- * Each worker gets a unique port: base_port + worker_index.
- * Server is started fresh before each task and killed after,
- * guaranteeing clean state.
- */
-
-import { type ChildProcess, spawn } from 'node:child_process'
-import { join } from 'node:path'
-
-export class InfinityAppManager {
-  private proc: ChildProcess | null = null
-  private port: number
-  private infinityDir: string
-
-  constructor(
-    private workerIndex: number,
-    private basePort: number = 8000,
-  ) {
-    this.port = basePort + workerIndex
-    this.infinityDir = process.env.WEBARENA_INFINITY_DIR || ''
-  }
-
-  async startApp(appName: string): Promise<string> {
-    await this.stop()
-
-    if (!this.infinityDir) {
-      throw new Error('WEBARENA_INFINITY_DIR env var not set')
-    }
-
-    const serverScript = join(this.infinityDir, 'apps', appName, 'server.py')
-    this.proc = spawn('python3', [serverScript, '--port', String(this.port)], {
-      stdio: ['ignore', 'pipe', 'pipe'],
-      cwd: join(this.infinityDir, 'apps', appName),
-    })
-
-    // Wait for server to be ready
-    const url = `http://localhost:${this.port}`
-    await this.waitForReady(url)
-    return url
-  }
-
-  async stop(): Promise<void> {
-    if (this.proc) {
-      this.proc.kill('SIGTERM')
-      await new Promise<void>((resolve) => {
-        const timeout = setTimeout(() => {
-          this.proc?.kill('SIGKILL')
-          resolve()
-        }, 3000)
-        this.proc?.on('exit', () => {
-          clearTimeout(timeout)
-          resolve()
-        })
-      })
-      this.proc = null
-    }
-  }
-
-  getPort(): number {
-    return this.port
-  }
-
-  getUrl(): string {
-    return `http://localhost:${this.port}`
-  }
-
-  private async waitForReady(
-    url: string,
-    maxAttempts = 30,
-    intervalMs = 500,
-  ): Promise<void> {
-    for (let i = 0; i < maxAttempts; i++) {
-      try {
-        const resp = await fetch(url, {
-          signal: AbortSignal.timeout(2000),
-        })
-        if (resp.ok) return
-      } catch {
-        // Server not ready yet
-      }
-      await new Promise((r) => setTimeout(r, intervalMs))
-    }
-    throw new Error(
-      `Infinity app server not ready after ${maxAttempts * intervalMs}ms on port ${this.port}`,
-    )
-  }
-}
--- a/packages/browseros-agent/apps/eval/src/runner/task-executor.ts
+++ b/packages/browseros-agent/apps/eval/src/runner/task-executor.ts
@@ -9,7 +9,6 @@ import {
 import { runGraders } from '../graders/registry'
 import type { ErrorSource, EvalConfig, GraderResult, Task } from '../types'
 import { callMcpTool } from '../utils/mcp-client'
-import { InfinityAppManager } from './infinity-app-manager'
 import type { GraderOptions, TaskResult } from './types'

 // ============================================================================
@@ -102,36 +101,6 @@ export class TaskExecutor {
    // Resolve page ID once — fresh browser has exactly one page
    const pageId = await this.resolveInitialPageId(mcpUrl)

-    // For Infinity tasks, start a fresh app server per task
-    let infinityManager: InfinityAppManager | null = null
-    let actualStartUrl = task.start_url
-
-    if (task.dataset === 'webarena-infinity') {
-      const appName = (task.metadata?.additional as Record<string, unknown>)
-        ?.app_name as string
-      const appBasePort =
-        ((task.metadata?.additional as Record<string, unknown>)
-          ?.app_base_port as number) || 8000
-      const workerIndex = this.config.browseros.base_server_port - 9110 // derive from port offset
-
-      if (appName && process.env.WEBARENA_INFINITY_DIR) {
-        infinityManager = new InfinityAppManager(workerIndex, appBasePort)
-        try {
-          actualStartUrl = await infinityManager.startApp(appName)
-          console.log(
-            `  Infinity app "${appName}" started on port ${infinityManager.getPort()}`,
-          )
-        } catch (error) {
-          throw new TaskExecutionError(
-            `Failed to start Infinity app: ${error instanceof Error ? error.message : String(error)}`,
-            task,
-            'navigation',
-            error instanceof Error ? error : undefined,
-          )
-        }
-      }
-    }
-
    try {
      // Phase 1: Set viewport + navigate to start URL
      try {
@@ -145,10 +114,10 @@ export class TaskExecutor {
        )
      }

-      if (actualStartUrl && actualStartUrl !== 'about:blank') {
+      if (task.start_url && task.start_url !== 'about:blank') {
        try {
          await callMcpTool(mcpUrl, 'navigate_page', {
-            url: actualStartUrl,
+            url: task.start_url,
            page: pageId,
          })
        } catch (error) {
@@ -165,11 +134,7 @@ export class TaskExecutor {
      const agentResult = await this.executeAgent(task, pageId)

      // Phase 3: Run graders
-      const graderResults = await this.runGraders(
-        task,
-        agentResult,
-        infinityManager?.getUrl(),
-      )
+      const graderResults = await this.runGraders(task, agentResult)

      const status =
        agentResult.metadata.termination_reason === 'timeout'
@@ -204,11 +169,6 @@ export class TaskExecutor {
      } catch {
        // Ignore cleanup errors
      }
-
-      // Stop Infinity app server if running
-      if (infinityManager) {
-        await infinityManager.stop().catch(() => {})
-      }
    }
  }

@@ -249,7 +209,6 @@ export class TaskExecutor {
  private async runGraders(
    task: Task,
    agentResult: AgentResult,
-    infinityAppUrl?: string,
  ): Promise<Record<string, GraderResult>> {
    const configGraders = this.config.graders ?? []
    const taskGraders = task.graders ?? []
@@ -275,8 +234,6 @@ export class TaskExecutor {
          expectedAnswer: (task.metadata?.additional as Record<string, unknown>)
            ?.answer as string | undefined,
          outputDir: join(this.outputDir, task.query_id),
-          mcpUrl: `${this.config.browseros.server_url}/mcp`,
-          infinityAppUrl,
        },
        this.deps.graderOptions,
      )
--- a/packages/browseros-agent/apps/eval/src/runner/types.ts
+++ b/packages/browseros-agent/apps/eval/src/runner/types.ts
@@ -100,8 +100,6 @@ export interface TaskResultSummary {
 // ============================================================================

 export const PASS_FAIL_GRADER_ORDER = [
-  'agisdk_state_diff',
-  'infinity_state',
  'performance_grader',
  'webvoyager_grader',
  'fara_combined',
--- a/packages/browseros-agent/apps/server/.env.example
+++ b/packages/browseros-agent/apps/server/.env.example
@@ -13,8 +13,6 @@ BROWSEROS_VERSION=
 BROWSEROS_INSTALL_ID=
 BROWSEROS_CLIENT_ID=

-BROWSEROS_TRUSTED_ORIGINS=
-
 # Graph service
 CODEGEN_SERVICE_URL=

--- a/packages/browseros-agent/apps/server/package.json
+++ b/packages/browseros-agent/apps/server/package.json
@@ -1,6 +1,6 @@
 {
  "name": "@browseros/server",
-  "version": "0.0.82",
+  "version": "0.0.83",
  "description": "BrowserOS server",
  "type": "module",
  "main": "./src/index.ts",
@@ -70,6 +70,7 @@
    "@ai-sdk/openai-compatible": "^2.0.30",
    "@ai-sdk/provider": "^3.0.8",
    "@browseros-ai/agent-sdk": "workspace:*",
+    "@huggingface/transformers": "^3.4.0",
    "@browseros/cdp-protocol": "workspace:*",
    "@browseros/shared": "workspace:*",
    "@google/gemini-cli-core": "^0.16.0",
--- a/packages/browseros-agent/apps/server/resources/openclaw-compose.yml
+++ b/packages/browseros-agent/apps/server/resources/openclaw-compose.yml
@@ -0,0 +1,37 @@
+services:
+  openclaw-gateway:
+    image: ${OPENCLAW_IMAGE:-ghcr.io/openclaw/openclaw:latest}
+    ports:
+      - "127.0.0.1:${OPENCLAW_GATEWAY_PORT:-18789}:18789"
+    environment:
+      - HOME=/home/node
+      - NODE_ENV=production
+      - OPENCLAW_GATEWAY_TOKEN=${OPENCLAW_GATEWAY_TOKEN}
+      - OPENCLAW_GATEWAY_BIND=lan
+      - TZ=${TZ}
+      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
+      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
+      - GEMINI_API_KEY=${GEMINI_API_KEY:-}
+      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY:-}
+      - GROQ_API_KEY=${GROQ_API_KEY:-}
+      - MISTRAL_API_KEY=${MISTRAL_API_KEY:-}
+      - MOONSHOT_API_KEY=${MOONSHOT_API_KEY:-}
+    volumes:
+      - ${OPENCLAW_CONFIG_DIR}:/home/node/.openclaw
+    extra_hosts:
+      - "host.containers.internal:host-gateway"
+    command:
+      - node
+      - dist/index.js
+      - gateway
+      - --bind
+      - lan
+      - --port
+      - "18789"
+      - --allow-unconfigured
+    healthcheck:
+      test: ["CMD", "curl", "-sf", "http://127.0.0.1:18789/healthz"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+    restart: unless-stopped
--- a/packages/browseros-agent/apps/server/src/agent/ai-sdk-agent.ts
+++ b/packages/browseros-agent/apps/server/src/agent/ai-sdk-agent.ts
@@ -6,6 +6,7 @@ import type {
 import { AGENT_LIMITS } from '@browseros/shared/constants/limits'
 import type { BrowserContext } from '@browseros/shared/schemas/browser-context'
 import { LLM_PROVIDERS } from '@browseros/shared/schemas/llm'
+import type { AclRule } from '@browseros/shared/types/acl'
 import {
  type LanguageModel,
  type ModelMessage,
@@ -23,6 +24,7 @@ import { isSoulBootstrap, readSoul } from '../lib/soul'
 import { buildSkillsCatalog } from '../skills/catalog'
 import { loadSkills } from '../skills/loader'
 import { buildFilesystemToolSet } from '../tools/filesystem/build-toolset'
+import type { ToolContext } from '../tools/framework'
 import { buildMemoryToolSet } from '../tools/memory/build-toolset'
 import type { ToolRegistry } from '../tools/tool-registry'
 import { CHAT_MODE_ALLOWED_TOOLS } from './chat-mode'
@@ -46,6 +48,7 @@ export interface AiSdkAgentConfig {
  klavisClient?: KlavisClient
  browserosId?: string
  aiSdkDevtoolsEnabled?: boolean
+  aclRules?: AclRule[]
 }

 export class AiSdkAgent {
@@ -55,6 +58,7 @@ export class AiSdkAgent {
    private _mcpClients: Array<{ close(): Promise<void> }>,
    private conversationId: string,
    private _toolNames: Set<string>,
+    private toolContext: ToolContext,
  ) {}

  /** Tool names registered on this agent — used to sanitize messages during session rebuilds. */
@@ -99,14 +103,19 @@ export class AiSdkAgent {

    // Build browser tools from the unified tool registry
    const originPageId = config.browserContext?.activeTab?.pageId
-    const allBrowserTools = buildBrowserToolSet(
-      config.registry,
-      config.browser,
-      config.resolvedConfig.workingDir,
-      {
+    const toolContext: ToolContext = {
+      browser: config.browser,
+      directories: { workingDir: config.resolvedConfig.workingDir },
+      session: {
        origin: config.resolvedConfig.origin,
        originPageId,
      },
+      aclRules: config.aclRules,
+    }
+    const allBrowserTools = buildBrowserToolSet(
+      config.registry,
+      toolContext,
+      config.resolvedConfig.toolApprovalConfig,
    )
    const browserTools = config.resolvedConfig.chatMode
      ? Object.fromEntries(
@@ -277,6 +286,7 @@ export class AiSdkAgent {
      clients,
      config.resolvedConfig.conversationId,
      new Set(Object.keys(tools)),
+      toolContext,
    )
  }

@@ -300,6 +310,10 @@ export class AiSdkAgent {
    })
  }

+  updateAclRules(rules?: AclRule[]): void {
+    this.toolContext.aclRules = rules
+  }
+
  async dispose(): Promise<void> {
    for (const client of this._mcpClients) {
      await client.close().catch(() => {})
--- a/packages/browseros-agent/apps/server/src/agent/session-store.ts
+++ b/packages/browseros-agent/apps/server/src/agent/session-store.ts
@@ -11,8 +11,8 @@ export interface AgentSession {
  mcpServerKey?: string
  /** Workspace directory when the session was created, for change detection. */
  workingDir?: string
-  /** LLM config used when the session was created, for provider/model changes. */
-  llmConfigKey?: string
+  /** Tool approval category key for change detection. */
+  approvalConfigKey?: string
 }

 export class SessionStore {
--- a/packages/browseros-agent/apps/server/src/agent/tool-adapter.ts
+++ b/packages/browseros-agent/apps/server/src/agent/tool-adapter.ts
@@ -1,6 +1,6 @@
 import type { LanguageModelV2ToolResultOutput } from '@ai-sdk/provider'
+import type { ToolApprovalConfig } from '@browseros/shared/constants/tool-approval'
 import { type ToolSet, tool } from 'ai'
-import type { Browser } from '../browser/browser'
 import { logger } from '../lib/logger'
 import { metrics } from '../lib/metrics'
 import { executeTool, type ToolContext } from '../tools/framework'
@@ -35,23 +35,29 @@ function contentToModelOutput(
  }
 }

+export function getApprovedBrowserToolNames(
+  registry: ToolRegistry,
+  approvalConfig?: ToolApprovalConfig,
+): string[] {
+  if (!approvalConfig) return []
+  return registry
+    .all()
+    .filter((def) => approvalConfig.categories[def.approvalCategory] === true)
+    .map((def) => def.name)
+}
+
 export function buildBrowserToolSet(
  registry: ToolRegistry,
-  browser: Browser,
-  workingDir: string | undefined,
-  session?: { origin?: 'sidepanel' | 'newtab'; originPageId?: number },
+  ctx: ToolContext,
+  approvalConfig?: ToolApprovalConfig,
 ): ToolSet {
  const toolSet: ToolSet = {}
-  const ctx: ToolContext = {
-    browser,
-    directories: { workingDir },
-    session,
-  }

  for (const def of registry.all()) {
    toolSet[def.name] = tool({
      description: def.description,
      inputSchema: def.input,
+      needsApproval: approvalConfig?.categories[def.approvalCategory] === true,
      execute: async (params) => {
        const startTime = performance.now()
        try {
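
Note on the hunk above: the same category check (`approvalConfig.categories[def.approvalCategory] === true`) drives both `getApprovedBrowserToolNames` and the `needsApproval` flag. A minimal standalone sketch of that filter follows. The `ToolDefSketch` and `ToolApprovalConfigSketch` shapes, the `click` tool name, and the category names are assumptions made for illustration; the real types come from `@browseros/shared/constants/tool-approval` and the tool registry, which this diff does not show in full.

```typescript
// Hypothetical, simplified stand-ins for the real config and tool definitions.
interface ToolApprovalConfigSketch {
  categories: Record<string, boolean | undefined>
}

interface ToolDefSketch {
  name: string
  approvalCategory: string
}

// Mirrors the filter in the hunk: a tool is listed only when its category
// is explicitly set to true; a missing config yields an empty list.
function toolsInGatedCategories(
  defs: ToolDefSketch[],
  approvalConfig?: ToolApprovalConfigSketch,
): string[] {
  if (!approvalConfig) return []
  return defs
    .filter((def) => approvalConfig.categories[def.approvalCategory] === true)
    .map((def) => def.name)
}

// Illustrative data only: category names are invented for this sketch.
const defs: ToolDefSketch[] = [
  { name: 'click', approvalCategory: 'input' },
  { name: 'take_snapshot', approvalCategory: 'observation' },
  { name: 'navigate_page', approvalCategory: 'navigation' },
]
const config: ToolApprovalConfigSketch = {
  categories: { input: true, navigation: false },
}

console.log(toolsInGatedCategories(defs, config))
```

Because the comparison is strict (`=== true`), a category that is absent from the config behaves the same as one set to `false`, so tools in unlisted categories are not gated.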
--- a/packages/browseros-agent/apps/server/src/agent/types.ts
+++ b/packages/browseros-agent/apps/server/src/agent/types.ts
@@ -3,6 +3,7 @@
 * Copyright 2025 BrowserOS
 * SPDX-License-Identifier: AGPL-3.0-or-later
 */
+import type { ToolApprovalConfig } from '@browseros/shared/constants/tool-approval'
 import type { LLMProvider } from '@browseros/shared/schemas/llm'

 export interface ProviderConfig {
@@ -50,4 +51,6 @@ export interface ResolvedAgentConfig {
  origin?: 'sidepanel' | 'newtab'
  /** BrowserOS installation ID for credit-based tracking. */
  browserosId?: string
+  /** Tool approval configuration — which categories require human approval. */
+  toolApprovalConfig?: ToolApprovalConfig
 }
--- a/packages/browseros-agent/apps/server/src/api/middleware/require-trusted-origin.ts
+++ b/packages/browseros-agent/apps/server/src/api/middleware/require-trusted-origin.ts
@@ -1,28 +0,0 @@
-/**
- * @license
- * Copyright 2025 BrowserOS
- * SPDX-License-Identifier: AGPL-3.0-or-later
- */
-
-import type { MiddlewareHandler } from 'hono'
-import { isAllowedOrigin } from '../utils/cors'
-
-export function requireTrustedOrigin(): MiddlewareHandler {
-  return async (c, next) => {
-    const origin = c.req.header('Origin')
-    if (origin !== undefined && !isAllowedOrigin(origin)) {
-      return c.json(
-        {
-          error: {
-            name: 'ForbiddenOrigin',
-            message: 'Origin not allowed',
-            code: 'FORBIDDEN_ORIGIN',
-            statusCode: 403,
-          },
-        },
-        403,
-      )
-    }
-    return next()
-  }
-}
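
Note on the file removed above: the middleware rejected a request only when an `Origin` header was present and not allowed, and passed requests with no `Origin` header straight through. The decision can be sketched as a pure function, as below. The allowlist contents and the `decideOrigin` name are hypothetical; the real `isAllowedOrigin` lives in `../utils/cors` and its rules are not part of this diff.

```typescript
type OriginDecision = 'allow' | 'reject'

// Same branching as the removed middleware: an absent Origin header is
// allowed, a present one must pass the supplied predicate.
function decideOrigin(
  origin: string | undefined,
  isAllowed: (origin: string) => boolean,
): OriginDecision {
  if (origin === undefined) return 'allow'
  return isAllowed(origin) ? 'allow' : 'reject'
}

// Hypothetical allowlist standing in for isAllowedOrigin.
const allowlist = new Set(['chrome-extension://example-extension-id'])
const isAllowed = (origin: string): boolean => allowlist.has(origin)

console.log(decideOrigin(undefined, isAllowed)) // allow
console.log(decideOrigin('chrome-extension://example-extension-id', isAllowed)) // allow
console.log(decideOrigin('https://untrusted.example', isAllowed)) // reject
```

In the middleware itself the `reject` branch corresponded to the 403 `FORBIDDEN_ORIGIN` JSON response and the `allow` branch to `next()`.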
Author	SHA1	Message	Date
Nikhil Sonti	5ec6480d3f	fix: prepare wxt before typecheck in browseros-agent The typecheck and compile scripts failed on fresh checkouts with TS5083 because tsconfig.json extends .wxt/tsconfig.json, which is gitignored and only generated by 'wxt prepare'. Run wxt prepare before tsgo so the extended config and wxt.d.ts are always in place.	2026-04-15 09:20:08 -07:00
Dani Akash	aff8afd9a4	feat: role aware agents (#704 ) * feat: add role aware agent creation * feat: support custom role aware agents * feat: add plain agent creation mode * fix: validate custom role arrays	2026-04-14 19:13:23 +05:30
Dani Akash	0c96002cf5	fix: complete openclaw gateway recovery UX (#703 ) * fix: complete openclaw gateway recovery ui * fix: guard unknown gateway ui state * fix: guard unknown openclaw status badge	2026-04-14 18:22:47 +05:30
Dani Akash	76e5dcb801	fix: harden openclaw gateway recovery (#702 )	2026-04-14 17:53:33 +05:30
shivammittal274	a85f94de40	feat(cli): add strata commands for Klavis MCP integrations (#700 ) Expose the 7 Klavis Strata MCP tools as CLI subcommands under `browseros-cli strata`, so CLI users (claude-code, gemini-cli) can discover and execute actions on 40+ external services. Commands: check, discover, actions, details, exec, search, auth. Includes discovery flow guidance in help text, integration tests, and an "Integrations:" group in the root help output.	2026-04-14 17:32:05 +05:30
Dani Akash	6708ab834b	fix: restore openai compatible openclaw providers (#699 )	2026-04-14 14:15:11 +05:30
shivammittal274	007208d54b	feat: add connector_mcp_servers tool for strata MCP server discovery (#698 ) Agents connecting over MCP URL/CLI (like claude-code) had no way to know which Klavis connectors were available or authenticated, causing them to fall back to browser automation. This adds a connector_mcp_servers tool that checks connection status and returns an auth URL when needed.	2026-04-14 13:09:30 +05:30
shivammittal274	dd85ae503f	fix(openclaw): compose file path and extension auth (#697 ) * fix(openclaw): compose file path after service dir move, loopback auth fallback - Fix COMPOSE_RESOURCE path: services moved to api/services/openclaw/ so the relative path needs one more parent directory traversal - Fix requireTrustedAppOrigin middleware: Chrome extensions cannot set the Origin header (forbidden header name). When Origin is absent, fall back to checking the Host header is a loopback address. The server only binds to loopback so only local processes can reach it. Requests with an explicit non-trusted Origin are still rejected. * fix: request header check * chore: remove setup openclaw button --------- Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>	2026-04-14 12:53:02 +05:30
Dani Akash	452906d3ca	fix: first time run (#696 ) * fix: openclaw creation * fix: request formats * ci: extend code quality to dev	2026-04-14 12:29:53 +05:30
Nikhil	0397d3e393	chore: release alpha: 0.0.83 (#695 )	2026-04-13 18:00:52 -07:00
Nikhil	edd681012c	refactor: consolidate services under api/services/ (#693 ) Move openclaw/ and terminal/ service modules from src/services/ into src/api/services/ so all server-side services live in one directory alongside chat-service, klavis, mcp, and sdk. Update relative imports in moved files and all callers.	2026-04-13 17:21:45 -07:00
Nikhil	ce7c209ba6	feat: add OpenClaw agent command center and terminal (#692 ) * feat: agent command center new tab with OpenClaw conversation history * feat: add web terminal for Podman container shell access * feat: align agent command center with new tab * fix: simplify agent command center styling * style: polish agent terminal layout and theming * style: simplify agent terminal styling * fix: address PR review comments for OpenClaw routes * fix: handle OpenClaw client start and error states * fix: resolve remaining OpenClaw review comments	2026-04-13 17:06:48 -07:00
Nikhil	6548220bcb	chore: merge pull request #690 (feat/acls-approvals) feat: acl approvals	2026-04-13 09:45:46 -07:00
Neel Gupta	14eeba7c20	Feat: Improved ACL robustness with semantic and fuzzy matching (#665 ) * feat: Add enhanced python-based ACL * fix: Port enhanced ACL to TypeScript * fix: greptile suggested bugs	2026-04-13 09:43:33 -07:00
Nikhil Sonti	3c629c5929	feat: tool approvals, governance dashboard, and execution history - Add tool approval system with per-category approval configuration - Build unified Governance dashboard (renamed from Admin) with pending approvals view and execution audit log - Move execution history tracking into the app shell - Extract buildChatRequestBody helper and add newtab system prompt - Add approval config change detection for mid-conversation rebuilds	2026-04-13 09:43:30 -07:00
Nikhil	77dcd37000	feat: ACLs and support enforcing (#583) * feat: add ACL rules for per-site element-level agent restrictions Implement Access Control List (ACL) rules that let users block the agent from interacting with specific elements on specific websites. Rules are defined in a new Settings > ACL Rules page and enforced server-side in executeTool() before any input tool handler runs. - Shared ACL types and site pattern matching (packages/shared) - Extension storage, settings UI with rule cards and add dialog - Server-side guard in executeTool() checking tool+page+element - Browser class extensions for element property resolution via CDP - Visual overlay injection (red "BLOCKED" mask) via Runtime.evaluate - Rules transported in chat request body alongside declinedApps * fix: address review comments for ACL rules - Add selector-to-property matching in matchesElement (tag, id, class) - Remove scroll from guarded tools set (read-like action) * fix: ACL site pattern matching fails on multi-segment URL paths The glob-to-regex conversion used [^/]* for the wildcard (*), which only matches a single path segment. "*.amazon.com/*" failed to match "www.amazon.com/cart/smart-wagon" because the trailing * couldn't cross the slash between "cart" and "smart-wagon". Fix: Split URL matching into hostname vs path parts. Path wildcards now use .* to match across slashes. Also add simple domain matching so users can just type "amazon.com" instead of "*.amazon.com/*". * fix: wire up ACL overlay injection after take_snapshot applyAclOverlays was defined but never called. Now triggers after take_snapshot completes on pages matching ACL rules, so the agent sees red "BLOCKED" overlays on restricted elements. * refactor: rework 0326-acl_rules based on feedback	2026-04-13 09:42:45 -07:00
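The site-pattern fix in 77dcd37000 (hostname and path matched separately, path wildcards crossing slashes, bare domains accepted) might look something like the sketch below. The function names and the exact bare-domain rule (exact host or any subdomain) are assumptions for illustration; the shared package's real implementation is not shown in the log.

```typescript
// Escape regex metacharacters in a literal fragment.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Convert a glob to an anchored regex, expanding each "*" to the given wildcard.
function globToRegex(glob: string, wildcard: string): RegExp {
  const body = glob.split("*").map(escapeRegex).join(wildcard);
  return new RegExp(`^${body}$`);
}

// Hypothetical matcher: returns true when `url` falls under `pattern`.
function matchesSitePattern(pattern: string, url: string): boolean {
  const scheme = /^[a-z][a-z0-9+.-]*:\/\//i;

  // Split the URL into hostname and path parts.
  const target = url.replace(scheme, "");
  const urlSlash = target.indexOf("/");
  const host = (urlSlash === -1 ? target : target.slice(0, urlSlash)).toLowerCase();
  const path = urlSlash === -1 ? "/" : target.slice(urlSlash);

  const cleaned = pattern.trim().replace(scheme, "");

  // Simple domain form ("amazon.com"): match the domain itself or any subdomain.
  if (!cleaned.includes("*") && !cleaned.includes("/")) {
    const domain = cleaned.toLowerCase();
    return host === domain || host.endsWith(`.${domain}`);
  }

  // Glob form: host wildcards stay within the hostname, path wildcards may cross "/".
  const patSlash = cleaned.indexOf("/");
  const hostGlob = (patSlash === -1 ? cleaned : cleaned.slice(0, patSlash)).toLowerCase();
  const pathGlob = patSlash === -1 ? "/*" : cleaned.slice(patSlash);
  return globToRegex(hostGlob, "[^/]*").test(host) && globToRegex(pathGlob, ".*").test(path);
}
```

With this split, the failing case from the commit message, "*.amazon.com/*" against "www.amazon.com/cart/smart-wagon", matches because the path wildcard expands to `.*` rather than `[^/]*`.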
Nikhil	6d0dff7b1a	feat: claw integration with browseros (#688 ) * feat(openclaw): add foundation — paths constant, browseros-dir helper, static compose file Add OPENCLAW_DIR_NAME to shared paths constant, getOpenClawDir() to browseros-dir.ts, and a static docker-compose.yml resource file that uses native .env variable substitution instead of YAML template strings. * feat(openclaw): add PodmanRuntime container engine abstraction Manages Podman CLI interactions: machine lifecycle (init/start/stop), availability checks, command execution with streaming output, and running container enumeration. Linux skips machine ops since Podman runs natively. * feat(openclaw): add config builder and container runtime openclaw-config.ts: pure functions to build openclaw.json and .env files from BrowserOS settings. Maps provider keys, sets permissive defaults (full exec, cron, web search, MCP bridge to BrowserOS). container-runtime.ts: compose-level abstraction over PodmanRuntime for the browseros-openclaw project. Handles up/down/restart/pull, health checks, .env file writes, and safe machine shutdown. * feat(openclaw): add OpenClawService orchestrator Main service managing the single OpenClaw container. Handles full lifecycle (setup/start/stop/restart/shutdown), agent CRUD with config rewrites and gateway restarts, chat proxy to /v1/chat/completions, provider key updates, auto-start on BrowserOS boot, and status reporting. * feat(openclaw): add API routes and server wiring Add /api/claw/* routes for container lifecycle (setup/start/stop/restart), agent CRUD (list/create/delete), chat proxy with SSE streaming, provider key management, and log retrieval. Register routes in server.ts, add OpenClaw auto-start on BrowserOS boot and graceful shutdown in main.ts. * fix(openclaw): resolve type errors in service and podman runtime Fix TIMEOUTS.TOOL_EXECUTION → TIMEOUTS.TOOL_CALL to match shared constants. 
Fix ReadableStream undefined/null type mismatch in PodmanRuntime.runCommand stream draining. * feat(openclaw): add agents page UI with chat, create, and lifecycle controls Add /agents route with AgentsPage showing OpenClaw status, agent list, create dialog, and per-agent chat. Includes useOpenClaw hook for server communication, AgentChat component with SSE streaming, and sidebar navigation entry. * feat(openclaw): add provider selector to setup flow Add LLM provider selector using useLlmProviders hook. Filters out OAuth-only providers, pre-selects the user's default, and passes providerType/apiKey/modelId to the setup endpoint so OpenClaw gets a working LLM configuration on first setup. * feat(openclaw): per-agent provider selection Each agent can now have its own LLM provider. The Create Agent dialog includes a provider selector that passes providerType/apiKey/modelId to the backend. The service writes per-agent model config to openclaw.json and merges the API key into the container's .env file. * fix(openclaw): write gateway auth token to openclaw.json The gateway was returning 401 because auth.mode was set to "token" without providing the actual token value. Now the token is written to gateway.auth.token in openclaw.json so the gateway and our chat proxy agree on the same token. * feat(openclaw): add GatewayClient WebSocket RPC client Persistent WS client for the OpenClaw Gateway protocol. Handles the challenge → connect → hello-ok handshake (as openclaw-control-ui with operator.admin scope), JSON-RPC with pending map + timeouts, and auto-reconnect. Exposes typed methods for agents.list, agents.create, agents.delete, and health. * refactor(openclaw): simplify config to bootstrap-only, add /readyz health Config no longer contains agents.list — agent CRUD is handled via WS RPC. buildOpenClawConfig → buildBootstrapConfig, removed makeAgentEntry and AgentEntry (agents managed by OpenClaw runtime). 
Added isReady() and waitForReady() using /readyz for gateway readiness checks. * refactor(openclaw): agent CRUD via WS RPC, per-agent chat targeting Replace JSON mutation + restart with GatewayClient WS RPC calls for agents.create, agents.delete, agents.list. Chat proxy now uses model: "openclaw/<agentId>" for per-agent targeting. Setup writes bootstrap config once then creates "main" agent via WS after gateway starts. Container restarts only when a new provider env var is added. * fix(openclaw): use agentId field in setup response mapping Fix type error: GatewayAgentEntry uses agentId not id. * fix(openclaw): log service progress through server logger * feat(openclaw): WS streaming, device auth, MCP port fix (#687) * feat(openclaw): WS streaming, device auth, MCP port fix - Fix GatewayClient WS handshake: add Ed25519 device identity signing, Origin header, mode: cli (mode: ui requires device identity always) - Add auto device pairing flow: generate client identity, attempt WS connect (triggers pending), approve via openclaw CLI, reconnect - Replace HTTP /v1/chat/completions proxy with WS-based streaming that surfaces tool calls, thinking blocks, and text deltas - Add chatStream() to GatewayClient returning ReadableStream of typed OpenClawStreamEvent (text-delta, thinking, tool-start/end, lifecycle) - Update chat route to stream WS events as SSE to the extension - Pass actual server port to OpenClaw config (fixes MCP bridge in dev) - Rewrite AgentChat.tsx with turn-based model using Message/MessageContent components matching sidepanel pattern, with tool batching logic that groups consecutive tools and breaks on text/thinking (same as sidepanel) - Add execInContainer() to ContainerRuntime for CLI commands - Fix gateway response field mapping (id→agentId, agents.list/create) - Skip creating main agent if gateway auto-creates it * fix(openclaw): retry WS connect on signature expired (Podman clock skew) Podman VM clock drifts when Mac sleeps, causing Ed25519 signature 
validation to fail with "device signature expired" on auto-start. Add connectGatewayWithRetry() that restarts the container (resyncs clock) and re-approves the device if needed. * fix(openclaw): address PR review — stream cleanup, error handling - Fix silent catch in setup(): only swallow "pairing required" and "signature expired" errors, re-throw everything else - Guard JSON.parse in approvePendingDevice(): check exit code and wrap parse in try/catch with descriptive error messages - Add try/finally in chat SSE route: reader.cancel() on disconnect - Add cancel callback to chatStream ReadableStream: restores ws.onmessage when stream is cancelled (prevents handler leak) --------- Co-authored-by: shivammittal274 <56757235+shivammittal274@users.noreply.github.com>	2026-04-13 09:13:40 -07:00
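The tool batching rule mentioned for AgentChat.tsx in 6d0dff7b1a (group consecutive tools, break on text or thinking) can be illustrated with a small reducer. The event type names follow the commit message (text-delta, thinking, tool-start/end, lifecycle); the field names, the Segment shape, and the batchEvents function are hypothetical and are not taken from the actual component.

```typescript
// Hypothetical event payloads; only the `type` values come from the commit message.
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "thinking"; text: string }
  | { type: "tool-start"; toolName: string }
  | { type: "tool-end"; toolName: string }
  | { type: "lifecycle"; phase: string };

// Hypothetical render segments for a single assistant turn.
type Segment =
  | { kind: "text"; text: string }
  | { kind: "thinking"; text: string }
  | { kind: "tools"; toolNames: string[] };

function batchEvents(events: StreamEvent[]): Segment[] {
  const segments: Segment[] = [];
  for (const event of events) {
    const last = segments[segments.length - 1];
    switch (event.type) {
      case "tool-start":
        // Consecutive tool calls join the current batch.
        if (last !== undefined && last.kind === "tools") {
          last.toolNames.push(event.toolName);
        } else {
          segments.push({ kind: "tools", toolNames: [event.toolName] });
        }
        break;
      case "text-delta":
        // Text ends any open tool batch; adjacent deltas are concatenated.
        if (last !== undefined && last.kind === "text") {
          last.text += event.text;
        } else {
          segments.push({ kind: "text", text: event.text });
        }
        break;
      case "thinking":
        // Thinking also ends any open tool batch.
        if (last !== undefined && last.kind === "thinking") {
          last.text += event.text;
        } else {
          segments.push({ kind: "thinking", text: event.text });
        }
        break;
      default:
        // In this sketch, tool-end and lifecycle events neither open nor close a batch.
        break;
    }
  }
  return segments;
}
```

Because tool-end is ignored here, two tools separated only by a tool-end event still land in the same batch; a text or thinking event between them starts a new one.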