mirror of
https://github.com/browseros-ai/BrowserOS.git
synced 2026-05-21 12:55:09 +00:00
* feat: pass per-turn cost and token data through chat history items
- Add costUsd, tokensIn, tokensOut to BrowserOSChatHistoryItem (server)
- Pass through from JSONL agent.message events in jsonlEventsToHistoryItems()
- Add same fields to client-side BrowserOSChatHistoryItem and ClawChatMessage
- Map cost/token data in mapHistoryItemToClawMessage()
Data flows: JSONL message.usage → server history item → API response →
client ClawChatMessage. Available for rendering in ClawChatMessage
component (message toolbar, cost badges).
* feat: add message toolbar with copy button and per-turn cost display
Add MessageToolbar to historical assistant messages in ClawChatMessage:
- Copy button copies message text to clipboard via MessageAction
- Per-turn token count (22.7K → 238) and cost ($0.003) shown as muted
tabular-nums text on the right side of the toolbar
- Toolbar appears on hover (opacity transition via group-hover)
- Only shown when the message has text content
- Cost/token display only shown when data is available from JSONL
* fix: toolbar only on assistant messages, always visible, cost only
- Only render toolbar on assistant messages (not user messages)
- Remove hover-only opacity — toolbar is always visible
- Remove token counts (22.7K → 238 is meaningless to users)
- Show only cost as a budget signal ($0.003)
* feat: group all tool activity into single Task collapsible per turn
Replace flat tool rows with a single ai-elements Task collapsible per
assistant turn that lists every tool/MCP call in sequence.
Live streaming (ConversationMessage):
- Aggregate all tool-batch parts into one Task
- Title: "Working… (N actions)" while running, "Agent activity (N actions)" when done
- Default open while turn is in progress
- Wrench icon in trigger
Historical (ClawChatMessage):
- Group all tool-call parts into one Task
- Title includes failed count if any tools errored
- Default collapsed — expandable on click
- Tool name + status icon + error text per row
Both views show one clean collapsible per turn instead of N individual
tool cards. Collapsed reads "5 actions"; expanded shows the timeline.
* feat: include tool calls in chat history responses
Server: jsonlEventsToHistoryItems() now walks ALL events (not just
messages) and pairs agent.tool_use with agent.tool_result by toolCallId.
The resulting tool call list is attached to the next assistant text
message as toolCalls[]. Each entry includes status, input arguments,
output text, error string, and duration computed from event timestamps.
Client:
- BrowserOSChatHistoryItem gets optional toolCalls field
- Tool-call message part type gets durationMs field
- mapHistoryItemToClawMessage() emits tool-call parts BEFORE the text
part (the order the agent produced them)
- ClawChatMessage Task view now shows tool duration in seconds
Result: historical messages now display the full tool activity
timeline grouped into the single Task collapsible per turn (designed
in step 3), instead of showing only the final text response.
* feat: render activity rows as human verbs sourced from tool registry
Tool calls in the chat activity view now read as sentences:
"Opened tab · news.ycombinator.com" instead of "browseros__new_page".
Server (tool-label-registry.ts):
- Curated verb override map for ~70 BrowserOS first-party tools
- Per-tool subject extractors that pull the meaningful argument from
input (URL → host, query → quoted, element → ID, etc.)
- Generic fallback humanizes snake_case for any unmapped tool
- Strips MCP namespace prefixes (browseros__, mcp_)
Server (openclaw-service.ts):
- jsonlEventsToHistoryItems calls buildToolLabel for each tool_use,
attaches label and subject to the BrowserOSChatHistoryToolCall
Client:
- Mirrored label module at lib/tool-labels.ts
- useAgentConversation tool-start handler computes label/subject
from the SSE tool args
- ClawChatMessage and ConversationMessage render label · subject
with foreground/muted styling, no font-mono
- ToolEntry, BrowserOSChatHistoryToolCall, and tool-call message
part types all carry label and optional subject
* fix: drop meaningless tab N subject from page-read tool rows
Page IDs are internal numbers, not URLs. 'Took screenshot · tab 4'
tells the user nothing. Removed subject extractors for take_snapshot,
take_enhanced_snapshot, get_page_content, get_page_links, get_dom,
and take_screenshot. The verb alone is the right signal.
* fix: gate initial loading on historyQuery.isFetched not isLoading
The session and history queries are sequential: the history query is
disabled until session resolves. After session resolves, there's a render
frame where historyQuery.isLoading is still false (the query hasn't
been kicked off yet). isInitialLoading flipped to false during that
window, exposing an empty chat shell with just Task collapsibles and
copy buttons before the messages filled in.
Switching the guard to isFetched closes that window — the loading state
stays true until the first history fetch actually completes.
* fix: render historical messages immediately instead of through Streamdown's idle-callback debounce
Streamdown defaults to mode="streaming" which uses requestIdleCallback (300ms
debounce, 500ms idle timeout) and lazy/Suspense to optimize for token-by-token
live streams. For finalized historical messages this caused tool collapsibles
and copy buttons to paint while text bodies stayed blank for ~300-500ms after
load. Pass mode="static" + parseIncompleteMarkdown=false on the historical
MessageResponse so completed text paints in the same frame as the surrounding
chrome. Live streaming turns still use the default streaming mode.
Also collapse the redundant /agents/:id/session round-trip into the existing
/history endpoint (server already resolves the most recent user-chat session
when sessionKey is omitted) and tighten the initial-loading gate to stay true
across the render frame where the query is enabled but hasn't started fetching.
* feat: surface thinking duration on historical reasoning collapsibles
Server accumulates agent.thinking events per turn from JSONL and attaches a
single reasoning block (joined text + durationMs from first thinking event
to the closing agent.message) on each assistant history item. Reasoning
buffer resets on user.message alongside the tool-call buffer.
Client mirrors the type, emits the reasoning part before tool calls in
mapHistoryItemToClawMessage (chronological: think → act → answer), and
passes duration in seconds to <Reasoning> so the trigger reads "Thought
for N seconds" instead of just "Thinking" on collapsed historical turns.
* fix: read thinking blocks from the correct JSONL field name
OpenClaw stores reasoning blocks as {type:'thinking', thinking:'...'} but
the JSONL parser was reading block.text, so every thinking event was
silently dropped before it ever reached jsonlEventsToHistoryItems. As a
result the reasoning field on history items was always empty even though
the new accumulator was wired up correctly.
Also guard the client mapping: when durationMs is 0 (think + answer
emitted in the same JSONL line, no real elapsed wall-clock) pass
undefined to <Reasoning> so it renders the static "Thinking" trigger
instead of the streaming shimmer / "Thought for 0 seconds".
* fix: reset reasoning buffer on discarded turns and drop dead session hook
Two cleanups from PR review:
1. jsonlEventsToHistoryItems: when an agent.message is discarded (the
"[Chat messages since your last reply" wrapper without a current-message
marker) the tool buffers were already reset but the reasoning buffer
was not. Accumulated thinking from the discarded turn would bleed onto
the next assistant message. Reset pendingReasoningTexts and
pendingReasoningFirstAt alongside the tool buffers.
2. useClawAgentSession, the AgentSessionResponse type, and the unused
session entry in CLAW_CHAT_QUERY_KEYS became dead code after the
session round-trip was folded into the history endpoint. Removed.
326 lines
11 KiB
TypeScript
326 lines
11 KiB
TypeScript
/**
|
|
* @license
|
|
* Copyright 2025 BrowserOS
|
|
* SPDX-License-Identifier: AGPL-3.0-or-later
|
|
*
|
|
* Maps raw tool names + arguments to human-readable activity labels for
|
|
* the chat UI activity view. The MCP ToolRegistry is the source of truth
|
|
* for tool *existence*; this file is the editorial layer that turns
|
|
* snake_case identifiers into agent-speak verbs.
|
|
*/
|
|
|
|
const VERB_OVERRIDES: Record<string, string> = {
|
|
// Navigation
|
|
navigate_page: 'Navigated to',
|
|
new_page: 'Opened tab',
|
|
new_hidden_page: 'Opened tab',
|
|
show_page: 'Showed tab',
|
|
close_page: 'Closed tab',
|
|
list_pages: 'Listed open tabs',
|
|
get_active_page: 'Got active tab',
|
|
move_page: 'Moved tab',
|
|
group_tabs: 'Grouped tabs',
|
|
|
|
// Page reading
|
|
take_snapshot: 'Captured page snapshot',
|
|
take_enhanced_snapshot: 'Captured detailed snapshot',
|
|
get_page_content: 'Read page content',
|
|
get_page_links: 'Extracted page links',
|
|
get_dom: 'Read page DOM',
|
|
search_dom: 'Searched page DOM',
|
|
take_screenshot: 'Took screenshot',
|
|
|
|
// Input
|
|
click: 'Clicked',
|
|
click_at: 'Clicked at coordinates',
|
|
hover: 'Hovered',
|
|
hover_at: 'Hovered at coordinates',
|
|
type_at: 'Typed at coordinates',
|
|
drag_at: 'Dragged',
|
|
focus: 'Focused element',
|
|
fill: 'Filled field',
|
|
clear: 'Cleared field',
|
|
check: 'Checked box',
|
|
uncheck: 'Unchecked box',
|
|
press_key: 'Pressed key',
|
|
upload_file: 'Uploaded file',
|
|
|
|
// Console / scripts
|
|
evaluate_script: 'Ran script',
|
|
get_console_logs: 'Read console logs',
|
|
|
|
// History / bookmarks
|
|
search_history: 'Searched history',
|
|
get_recent_history: 'Read recent history',
|
|
delete_history_url: 'Deleted history entry',
|
|
delete_history_range: 'Deleted history range',
|
|
get_bookmarks: 'Listed bookmarks',
|
|
create_bookmark: 'Created bookmark',
|
|
remove_bookmark: 'Removed bookmark',
|
|
update_bookmark: 'Updated bookmark',
|
|
move_bookmark: 'Moved bookmark',
|
|
search_bookmarks: 'Searched bookmarks',
|
|
|
|
// Filesystem (sandboxed)
|
|
read_file: 'Read file',
|
|
write_file: 'Wrote file',
|
|
find_files: 'Searched files',
|
|
|
|
// Memory
|
|
read_soul: 'Read soul memory',
|
|
read_core: 'Read core memory',
|
|
write_memory: 'Wrote memory',
|
|
search_memory: 'Searched memory',
|
|
update_soul: 'Updated soul memory',
|
|
update_core: 'Updated core memory',
|
|
|
|
// Web
|
|
web_search: 'Searched the web',
|
|
web_fetch: 'Fetched URL',
|
|
|
|
// Klavis / external apps (Strata)
|
|
connector_mcp_servers: 'Listed connected apps',
|
|
discover_server_categories_or_actions: 'Browsed available actions',
|
|
get_category_actions: 'Listed actions',
|
|
get_action_details: 'Looked up action',
|
|
execute_action: 'Ran external action',
|
|
search_documentation: 'Searched docs',
|
|
handle_auth_failure: 'Handled auth issue',
|
|
|
|
// Suggestions
|
|
suggest_schedule: 'Suggested schedule',
|
|
suggest_app_connection: 'Suggested app connect',
|
|
|
|
// BrowserOS info
|
|
browseros_info: 'Read BrowserOS info',
|
|
|
|
// Windows
|
|
list_windows: 'Listed windows',
|
|
focus_window: 'Focused window',
|
|
close_window: 'Closed window',
|
|
create_window: 'Created window',
|
|
}
|
|
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
// Helpers
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
|
|
function asString(value: unknown): string | undefined {
|
|
return typeof value === 'string' && value.length > 0 ? value : undefined
|
|
}
|
|
|
|
function stringField(
|
|
input: Record<string, unknown>,
|
|
...keys: string[]
|
|
): string | undefined {
|
|
for (const k of keys) {
|
|
const v = asString(input[k])
|
|
if (v) return v
|
|
}
|
|
return undefined
|
|
}
|
|
|
|
function truncate(text: string | undefined, max: number): string | undefined {
|
|
if (!text) return undefined
|
|
return text.length > max ? `${text.slice(0, max - 1)}…` : text
|
|
}
|
|
|
|
function quote(value: string | undefined): string | undefined {
|
|
if (!value) return undefined
|
|
return `"${truncate(value, 60)}"`
|
|
}
|
|
|
|
function basename(path: string | undefined): string | undefined {
|
|
if (!path) return undefined
|
|
const parts = path.split(/[/\\]/).filter(Boolean)
|
|
return parts[parts.length - 1] ?? path
|
|
}
|
|
|
|
function formatUrl(value: unknown): string | undefined {
|
|
const url = asString(value)
|
|
if (!url) return undefined
|
|
try {
|
|
const parsed = new URL(url)
|
|
const host = parsed.host
|
|
const path = parsed.pathname === '/' ? '' : parsed.pathname
|
|
const display = path && path.length > 0 ? `${host}${path}` : host
|
|
return truncate(display, 60)
|
|
} catch {
|
|
return truncate(url, 60)
|
|
}
|
|
}
|
|
|
|
function coords(x: unknown, y: unknown): string | undefined {
|
|
if (typeof x === 'number' && typeof y === 'number') {
|
|
return `${Math.round(x)}, ${Math.round(y)}`
|
|
}
|
|
return undefined
|
|
}
|
|
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
// Subject extractors
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
|
|
type SubjectExtractor = (input: Record<string, unknown>) => string | undefined
|
|
|
|
const SUBJECT_EXTRACTORS: Record<string, SubjectExtractor> = {
|
|
// URL-bearing tools
|
|
new_page: (i) => formatUrl(i.url),
|
|
new_hidden_page: (i) => formatUrl(i.url),
|
|
navigate_page: (i) => {
|
|
const action = asString(i.action)
|
|
if (action === 'back') return 'back'
|
|
if (action === 'forward') return 'forward'
|
|
if (action === 'reload') return 'reload'
|
|
return formatUrl(i.url)
|
|
},
|
|
web_fetch: (i) => formatUrl(i.url),
|
|
|
|
// Search queries
|
|
web_search: (i) => quote(stringField(i, 'query', 'q')),
|
|
search_history: (i) => quote(stringField(i, 'query', 'text')),
|
|
search_bookmarks: (i) => quote(stringField(i, 'query', 'text')),
|
|
search_memory: (i) => quote(stringField(i, 'query', 'q')),
|
|
search_dom: (i) => quote(stringField(i, 'query', 'selector')),
|
|
search_documentation: (i) => quote(stringField(i, 'query', 'q')),
|
|
find_files: (i) => quote(stringField(i, 'pattern', 'query')),
|
|
|
|
// Element interactions
|
|
click: (i) => stringField(i, 'element'),
|
|
hover: (i) => stringField(i, 'element'),
|
|
focus: (i) => stringField(i, 'element'),
|
|
clear: (i) => stringField(i, 'element'),
|
|
check: (i) => stringField(i, 'element'),
|
|
uncheck: (i) => stringField(i, 'element'),
|
|
fill: (i) => {
|
|
const target = stringField(i, 'element')
|
|
const text = stringField(i, 'text')
|
|
if (target && text) return `${target}: ${truncate(text, 40)}`
|
|
return target ?? truncate(text, 40)
|
|
},
|
|
press_key: (i) => stringField(i, 'key'),
|
|
|
|
// Coordinate-based input
|
|
click_at: (i) => coords(i.x, i.y),
|
|
hover_at: (i) => coords(i.x, i.y),
|
|
type_at: (i) => {
|
|
const at = coords(i.x, i.y)
|
|
const text = stringField(i, 'text')
|
|
if (at && text) return `${at}: ${truncate(text, 40)}`
|
|
return at ?? truncate(text, 40)
|
|
},
|
|
drag_at: (i) => {
|
|
const from = coords(i.fromX, i.fromY)
|
|
const to = coords(i.toX, i.toY)
|
|
if (from && to) return `${from} → ${to}`
|
|
return from ?? to
|
|
},
|
|
|
|
// Tab management
|
|
show_page: (i) => {
|
|
const page = i.page
|
|
return typeof page === 'number' ? `tab ${page}` : asString(page)
|
|
},
|
|
close_page: (i) => {
|
|
const page = i.page
|
|
return typeof page === 'number' ? `tab ${page}` : asString(page)
|
|
},
|
|
move_page: (i) => {
|
|
const page = i.page
|
|
return typeof page === 'number' ? `tab ${page}` : asString(page)
|
|
},
|
|
|
|
// Page reads (take_snapshot, take_enhanced_snapshot, get_page_content,
|
|
// get_page_links, get_dom, take_screenshot) intentionally omit a
|
|
// subject — the only argument is a numeric page ID that's internal
|
|
// to the agent and meaningless to the user ("tab 4" tells them nothing).
|
|
// The verb alone communicates what happened.
|
|
|
|
// External actions via Strata
|
|
execute_action: (i) => {
|
|
const server = stringField(i, 'server_name')
|
|
const action = stringField(i, 'action_name')
|
|
if (server && action) return `${server} · ${action}`
|
|
return action ?? server
|
|
},
|
|
get_category_actions: (i) => stringField(i, 'category_name', 'server_name'),
|
|
get_action_details: (i) => stringField(i, 'action_name'),
|
|
discover_server_categories_or_actions: (i) =>
|
|
stringField(i, 'server_name', 'category_name'),
|
|
connector_mcp_servers: (i) => stringField(i, 'server_name'),
|
|
|
|
// Filesystem
|
|
read_file: (i) => basename(stringField(i, 'path')),
|
|
write_file: (i) => basename(stringField(i, 'path')),
|
|
|
|
// Memory writes — show first chars of content
|
|
write_memory: (i) => truncate(stringField(i, 'content', 'text'), 40),
|
|
update_soul: (i) => truncate(stringField(i, 'content'), 40),
|
|
update_core: (i) => truncate(stringField(i, 'content'), 40),
|
|
|
|
// Bookmarks
|
|
create_bookmark: (i) => stringField(i, 'title') ?? formatUrl(i.url),
|
|
remove_bookmark: (i) => stringField(i, 'id', 'title'),
|
|
update_bookmark: (i) => stringField(i, 'id', 'title'),
|
|
move_bookmark: (i) => stringField(i, 'id', 'title'),
|
|
|
|
// History
|
|
delete_history_url: (i) => formatUrl(i.url),
|
|
}
|
|
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
// Public API
|
|
// ──────────────────────────────────────────────────────────────────────
|
|
|
|
export interface ToolLabelResult {
|
|
label: string
|
|
subject?: string
|
|
}
|
|
|
|
/**
|
|
* Strip MCP namespace prefixes (e.g. "browseros__", "mcp_") to find the
|
|
* canonical tool name used in the override maps.
|
|
*/
|
|
function canonicalName(rawName: string): string {
|
|
return rawName.replace(/^browseros__/, '').replace(/^mcp_/, '')
|
|
}
|
|
|
|
/**
|
|
* Convert a snake_case tool name into Sentence-case English as a fallback
|
|
* when no curated override exists.
|
|
*/
|
|
function humanizeToolName(rawName: string): string {
|
|
const stripped = canonicalName(rawName)
|
|
const words = stripped.split(/[_-]/).filter((w) => w.length > 0)
|
|
if (words.length === 0) return rawName
|
|
const first = words[0]!
|
|
return [
|
|
first.charAt(0).toUpperCase() + first.slice(1),
|
|
...words.slice(1),
|
|
].join(' ')
|
|
}
|
|
|
|
/**
|
|
* Build a human-readable label and subject string for a tool call,
|
|
* suitable for rendering in the chat activity view.
|
|
*/
|
|
export function buildToolLabel(
|
|
rawName: string,
|
|
input?: Record<string, unknown>,
|
|
): ToolLabelResult {
|
|
const canonical = canonicalName(rawName)
|
|
const label =
|
|
VERB_OVERRIDES[canonical] ??
|
|
VERB_OVERRIDES[rawName] ??
|
|
humanizeToolName(rawName)
|
|
|
|
const extractor = Object.hasOwn(SUBJECT_EXTRACTORS, canonical)
|
|
? SUBJECT_EXTRACTORS[canonical]
|
|
: Object.hasOwn(SUBJECT_EXTRACTORS, rawName)
|
|
? SUBJECT_EXTRACTORS[rawName]
|
|
: undefined
|
|
const subject = extractor && input ? extractor(input) : undefined
|
|
|
|
return { label, subject }
|
|
}
|