Files
BrowserOS/packages/browseros-agent/apps/agent/entrypoints/app/agent-command/ConversationInput.tsx
Dani Akash 8712f89f18 feat(agents): durable per-agent chat message queue + composer Stop (#880)
* feat(agents): durable per-agent chat message queue + composer Stop button

* fix(agents): tighten queue UI — smaller Stop, drop empty indicator, live drain attach

User feedback round 1 on the message-queue UX:

1) The Stop button matched the send/voice mics at h-10 w-10 with a
   solid destructive fill, which read as alarming. Shrunk to h-8 w-8,
   ghost variant with a soft destructive/10 background, smaller
   filled square glyph. Reads as a calm 'stop' affordance instead of
   a panic button.

2) The QueueItem's leading <QueueItemIndicator> dot was decorative
   only — no state, no interaction. Dropped it from QueuePanel along
   with the import; queue items now render as a clean preview line
   with the trailing X remove action.

3) When the server drained the queue and started the next turn, the
   chat panel didn't pick up the live stream until the user
   navigated away and back. The hook's resume effect previously
   only fired on agent change, not on listing-observed activeTurnId
   change. Surface activeTurnId from useHarnessAgents into
   useAgentConversation; effect now re-runs when the id changes,
   calls /chat/active, and attaches to the new turn — so a queued
   message starts streaming the moment the server drain pops it.

* fix(agents): don't reset streaming state from the resume effect's no-op paths

The Stop button was disappearing while the agent was actively
streaming, even though events were still flowing into the chat. Root
cause: the resume effect's `finally` block reset `streaming`,
`turnIdRef`, and `lastSeqRef` unconditionally — including on the
early-return paths (no active turn, or another mechanism already
owns the stream).

Sequence that triggered it:
  1) User sends a message → send() sets streamAbortRef + streaming=true
     and starts consuming the SSE.
  2) User enqueues another message → enqueue mutation invalidates the
     listing query.
  3) Listing refetches with the live activeTurnId → the resume
     effect re-fires (deps include activeTurnIdDep).
  4) attemptResume hits `if (streamAbortRef.current) return` because
     send() owns it.
  5) The finally clause fires anyway and calls setStreaming(false),
     clobbering the live state set by send(). The SSE consumer keeps
     running (refs are intact) so text keeps streaming, but the React
     flag is wrong, so the Stop button gates off.

Fix: track whether *this* run actually started a stream
(`weStartedStream`). The finally only resets state when it does.
Early-return / no-active-turn paths now leave streaming/turnIdRef/
lastSeqRef alone for whoever does own them.

Also widens the Stop button's visibility (`canStop` prop on
ConversationInput) so it stays steady across the brief gap between
turns when a queue drain is mid-flight; the parent computes
`streaming || activeTurnId !== null || queue.length > 0`. The
visibility widening is independent of the streaming-state fix above
— both are now in place.

* revert: drop canStop widening — Stop only shows while streaming

Reverts the canStop prop on ConversationInput and the OR-with-queue
visibility from AgentCommandConversation. Stop is gated solely on
`streaming` again. Between turns (queue draining) the button stays
hidden — only the actively-streaming turn is interruptible from the
composer, which matches what the user actually expects.

* fix(agents): persist the kicking-off prompt on active turns so the resume placeholder isn't empty

When a queued message drained and started a new turn, the chat
panel's resume effect staged a placeholder turn with userText: ''
because the hook had no way to know what message kicked off the
turn — only the agent-side stream was visible, and the user bubble
above it was blank until the user navigated away and back (at which
point the session record's history loaded normally).

Fix: ActiveTurnRegistry.register now accepts an optional `prompt`
that's stashed on the turn and surfaced via describe() / the
ActiveTurnInfo response. AgentHarnessService.startTurn passes the
incoming message into register. /chat/active returns it. The chat
hook's resume effect uses active.prompt as the placeholder
turn's userText, so the user bubble shows the queued message text
the moment streaming begins. Falls back to '' for older clients
that haven't been refetched yet.

* fix(agents): always release streamAbortRef on resume cleanup, even when cancelled

Greptile P1 follow-up. The previous `weStartedStream` guard correctly
stopped the resume effect's no-op early-returns from clobbering an
in-flight `send()` stream — but it also stopped a *cancelled*
mid-stream resume from clearing its own `streamAbortRef`. When the
cleanup fires (e.g. the 5s listing poll captures a new queue-drain
turn id while the SSE for the prior turn is still finishing), the
next effect run hits the `if (streamAbortRef.current) return` guard
against the now-aborted controller and never reattaches, leaving
`streaming === true` with no live stream until the user navigates
away.

Split the finally block: always release `streamAbortRef` when we
owned the controller (so the next run can take over), but only
reset the streaming flag / turn id / lastSeq on a clean exit (the
new run will set those itself, so resetting on cancel would just
flicker).
2026-04-30 18:26:56 +05:30

694 lines
21 KiB
TypeScript

import {
ArrowRight,
Bot,
ChevronDown,
FileText,
Folder,
Layers,
Loader2,
Mic,
Paperclip,
Square,
X,
} from 'lucide-react'
import {
type DragEvent,
type FC,
type ReactNode,
useEffect,
useLayoutEffect,
useRef,
useState,
} from 'react'
import { AppSelector } from '@/components/elements/AppSelector'
import { TabPickerPopover } from '@/components/elements/tab-picker-popover'
import { WorkspaceSelector } from '@/components/elements/workspace-selector'
import { Button } from '@/components/ui/button'
import { Textarea } from '@/components/ui/textarea'
import type { AgentEntry } from '@/entrypoints/app/agents/useOpenClaw'
import { McpServerIcon } from '@/entrypoints/app/connect-mcp/McpServerIcon'
import { useGetUserMCPIntegrations } from '@/entrypoints/app/connect-mcp/useGetUserMCPIntegrations'
import { type StagedAttachment, stageAttachments } from '@/lib/attachments'
import { Feature } from '@/lib/browseros/capabilities'
import { useCapabilities } from '@/lib/browseros/useCapabilities'
import { useMcpServers } from '@/lib/mcp/mcpServerStorage'
import { cn } from '@/lib/utils'
import { useVoiceInput } from '@/lib/voice/useVoiceInput'
import { useWorkspace } from '@/lib/workspace/use-workspace'
import { AgentSelector } from './AgentSelector'
export interface ConversationInputSendInput {
text: string
attachments: StagedAttachment[]
}
interface ConversationInputProps {
agents: AgentEntry[]
selectedAgentId: string | null
onSelectAgent: (agent: AgentEntry) => void
onSend: (input: ConversationInputSendInput) => void
onCreateAgent?: () => void
streaming: boolean
disabled?: boolean
status?: string
placeholder?: string
attachmentsEnabled?: boolean
variant?: 'home' | 'conversation'
/**
* When set, a Stop button surfaces to the left of the voice mic
* while `streaming === true`. Click cancels the active turn
* server-side via the chat-cancel endpoint. Absent → no Stop
* button (legacy behaviour for the home composer).
*/
onStop?: () => void
}
function InputActionButton({
disabled,
onClick,
streaming,
hasContent,
}: {
disabled: boolean
onClick: () => void
streaming: boolean
hasContent: boolean
}) {
// Show the spinner while streaming only when there's nothing to
// send — once the user types something, the icon flips back to the
// paper-plane so it reads as "queue this message" instead of
// "still working".
const showSpinner = streaming && !hasContent
return (
<Button
onClick={onClick}
size="icon"
disabled={disabled}
title={streaming && hasContent ? 'Queue message' : undefined}
className="h-10 w-10 flex-shrink-0 rounded-xl bg-primary text-primary-foreground hover:bg-primary/90"
>
{showSpinner ? (
<Loader2 className="h-5 w-5 animate-spin" />
) : (
<ArrowRight className="h-5 w-5" />
)}
</Button>
)
}
function StopButton({ onStop }: { onStop: () => void }) {
return (
<Button
type="button"
size="icon"
variant="ghost"
onClick={onStop}
title="Stop current turn — queued messages will start next."
aria-label="Stop current turn"
className="h-8 w-8 flex-shrink-0 rounded-lg bg-destructive/10 text-destructive transition-colors hover:bg-destructive/15 hover:text-destructive"
>
<Square className="h-3.5 w-3.5 fill-current" />
</Button>
)
}
function VoiceButton({
isRecording,
isTranscribing,
onStart,
onStop,
}: {
isRecording: boolean
isTranscribing: boolean
onStart: () => void
onStop: () => void
}) {
if (isRecording) {
return (
<Button
type="button"
size="icon"
onClick={onStop}
className="h-10 w-10 flex-shrink-0 rounded-xl bg-red-600 text-white hover:bg-red-700"
>
<Square className="h-4 w-4" />
</Button>
)
}
if (isTranscribing) {
return (
<Button
type="button"
variant="ghost"
size="icon"
disabled
className="h-10 w-10 flex-shrink-0 rounded-xl"
>
<Loader2 className="h-5 w-5 animate-spin" />
</Button>
)
}
return (
<Button
type="button"
variant="ghost"
size="icon"
onClick={onStart}
className="h-10 w-10 flex-shrink-0 rounded-xl text-muted-foreground transition-colors hover:text-foreground"
title="Voice input"
>
<Mic className="h-5 w-5" />
</Button>
)
}
function ContextControls({
agents,
onCreateAgent,
onSelectAgent,
selectedAgentId,
selectedTabs,
onToggleTab,
showAgentSelector,
status,
onAttachClick,
attachDisabled,
attachmentsEnabled,
}: {
agents: AgentEntry[]
onCreateAgent?: () => void
onSelectAgent: (agent: AgentEntry) => void
selectedAgentId: string | null
selectedTabs: chrome.tabs.Tab[]
onToggleTab: (tab: chrome.tabs.Tab) => void
showAgentSelector: boolean
status?: string
onAttachClick: () => void
attachDisabled: boolean
attachmentsEnabled: boolean
}) {
const { supports } = useCapabilities()
const { selectedFolder } = useWorkspace()
const { servers: mcpServers } = useMcpServers()
const { data: userMCPIntegrations } = useGetUserMCPIntegrations()
const connectedManagedServers = mcpServers.filter((server) => {
if (server.type !== 'managed' || !server.managedServerName) return false
return userMCPIntegrations?.integrations?.find(
(integration) => integration.name === server.managedServerName,
)?.is_authenticated
})
return (
<div className="flex items-center justify-between border-border/40 border-t px-4 py-2.5">
<div className="flex items-center gap-1">
{showAgentSelector ? (
<AgentSelector
agents={agents}
selectedAgentId={selectedAgentId}
onSelectAgent={onSelectAgent}
onCreateAgent={onCreateAgent}
status={status}
/>
) : null}
{supports(Feature.WORKSPACE_FOLDER_SUPPORT) ? (
<WorkspaceSelector>
<Button
variant="ghost"
className={cn(
'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
'data-[state=open]:bg-accent',
)}
>
<Folder className="h-4 w-4" />
<span>{selectedFolder?.name || 'Add workspace'}</span>
<ChevronDown className="h-3 w-3" />
</Button>
</WorkspaceSelector>
) : null}
<TabPickerPopover
variant="selector"
selectedTabs={selectedTabs}
onToggleTab={onToggleTab}
>
<Button
className={cn(
'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
selectedTabs.length > 0
? 'bg-[var(--accent-orange)]! text-white shadow-sm'
: 'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
'data-[state=open]:bg-accent',
)}
>
<Layers className="h-4 w-4" />
<span>Tabs</span>
</Button>
</TabPickerPopover>
<Button
type="button"
variant="ghost"
onClick={onAttachClick}
disabled={attachDisabled || !attachmentsEnabled}
title="Attach files"
className={cn(
'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
)}
>
<Paperclip className="h-4 w-4" />
<span>Attach</span>
</Button>
</div>
{supports(Feature.MANAGED_MCP_SUPPORT) ? (
<div className="ml-auto flex items-center gap-1.5">
<AppSelector side="bottom">
<Button
variant="ghost"
className={cn(
'flex items-center gap-2 rounded-lg px-3 py-1.5 font-medium text-sm transition-all',
'bg-transparent text-muted-foreground hover:bg-accent hover:text-accent-foreground',
'data-[state=open]:bg-accent',
)}
>
<div className="flex items-center -space-x-1.5">
{connectedManagedServers.slice(0, 4).map((server) => (
<div
key={server.id}
className="rounded-full ring-2 ring-card"
>
<McpServerIcon
serverName={server.managedServerName ?? ''}
size={16}
/>
</div>
))}
</div>
{connectedManagedServers.length > 4 ? (
<span className="text-xs">
+{connectedManagedServers.length - 4}
</span>
) : null}
<span>Apps</span>
<ChevronDown className="h-3 w-3" />
</Button>
</AppSelector>
</div>
) : null}
</div>
)
}
function HomeShell({ children }: { children: ReactNode }) {
return (
<div className="overflow-hidden rounded-[1.55rem] border border-border/60 bg-card/95 shadow-sm">
{children}
</div>
)
}
function ConversationShell({ children }: { children: ReactNode }) {
return (
<div className="overflow-hidden rounded-[1.35rem] border border-border/50 bg-background/95 shadow-[0_10px_30px_rgba(15,23,42,0.06)] backdrop-blur-md">
{children}
</div>
)
}
export const ConversationInput: FC<ConversationInputProps> = ({
agents,
selectedAgentId,
onSelectAgent,
onSend,
onCreateAgent,
streaming,
disabled,
status,
placeholder,
attachmentsEnabled = true,
variant = 'conversation',
onStop,
}) => {
const [input, setInput] = useState('')
const [selectedTabs, setSelectedTabs] = useState<chrome.tabs.Tab[]>([])
const [isExpandedDraft, setIsExpandedDraft] = useState(false)
const [attachments, setAttachments] = useState<StagedAttachment[]>([])
const [attachmentError, setAttachmentError] = useState<string | null>(null)
const [isStaging, setIsStaging] = useState(false)
const [isDragOver, setIsDragOver] = useState(false)
const fileInputRef = useRef<HTMLInputElement>(null)
const voice = useVoiceInput()
const textareaRef = useRef<HTMLTextAreaElement>(null)
const selectedAgent = agents.find(
(agent) => agent.agentId === selectedAgentId,
)
const isConversation = variant === 'conversation'
const stageFiles = async (files: File[]) => {
if (files.length === 0) return
if (!attachmentsEnabled) {
setAttachmentError('Attachments are not supported for this agent yet.')
return
}
setIsStaging(true)
setAttachmentError(null)
try {
const result = await stageAttachments(files, attachments.length)
if (result.staged.length > 0) {
setAttachments((prev) => [...prev, ...result.staged])
}
if (result.errors.length > 0) {
setAttachmentError(result.errors.map((e) => e.message).join(' \u2022 '))
}
} finally {
setIsStaging(false)
}
}
const removeAttachment = (id: string) => {
setAttachments((prev) => prev.filter((a) => a.id !== id))
setAttachmentError(null)
}
useLayoutEffect(() => {
const element = textareaRef.current
if (!element) return
const maxHeight = isConversation ? 176 : 100
const collapsedHeight = isConversation ? 56 : 72
element.style.height = '0px'
const nextHeight = Math.min(element.scrollHeight, maxHeight)
element.style.height = `${nextHeight}px`
element.style.overflowY =
element.scrollHeight > maxHeight ? 'auto' : 'hidden'
setIsExpandedDraft(nextHeight > collapsedHeight)
})
useEffect(() => {
if (voice.transcript && !voice.isTranscribing) {
setInput(voice.transcript)
voice.clearTranscript()
}
}, [voice.transcript, voice.isTranscribing, voice])
useEffect(() => {
if (attachmentsEnabled) return
setAttachments([])
setAttachmentError(null)
}, [attachmentsEnabled])
const toggleTab = (tab: chrome.tabs.Tab) => {
setSelectedTabs((prev) => {
const isSelected = prev.some((selected) => selected.id === tab.id)
if (isSelected) {
return prev.filter((selected) => selected.id !== tab.id)
}
return [...prev, tab]
})
}
const hasContent = input.trim().length > 0 || attachments.length > 0
// Queue-aware composers (the conversation panel passes `onStop`)
// accept input while streaming — the parent decides whether the
// submission opens a new turn or enqueues onto the active one.
// Surfaces without a Stop hook (home) keep the legacy behaviour
// and block input until the current turn finishes.
const queueAware = Boolean(onStop)
const handleSend = () => {
const text = input.trim()
if (disabled || isStaging) return
if (streaming && !queueAware) return
if (!text && attachments.length === 0) return
onSend({ text, attachments })
setInput('')
setAttachments([])
setAttachmentError(null)
}
const handlePaste = (event: React.ClipboardEvent<HTMLTextAreaElement>) => {
const items = event.clipboardData?.items
if (!items) return
const files: File[] = []
for (const item of items) {
if (item.kind === 'file') {
const file = item.getAsFile()
if (file) files.push(file)
}
}
if (files.length > 0) {
event.preventDefault()
void stageFiles(files)
}
}
const handleDrop = (event: DragEvent<HTMLDivElement>) => {
event.preventDefault()
setIsDragOver(false)
const files = Array.from(event.dataTransfer?.files ?? [])
if (files.length > 0) {
void stageFiles(files)
}
}
const handleDragOver = (event: DragEvent<HTMLDivElement>) => {
if (!event.dataTransfer?.types.includes('Files')) return
event.preventDefault()
setIsDragOver(true)
}
const handleDragLeave = (event: DragEvent<HTMLDivElement>) => {
if (event.currentTarget.contains(event.relatedTarget as Node | null)) {
return
}
setIsDragOver(false)
}
const openFilePicker = () => {
if (!attachmentsEnabled) {
setAttachmentError('Attachments are not supported for this agent yet.')
return
}
fileInputRef.current?.click()
}
const handleFileInputChange = (
event: React.ChangeEvent<HTMLInputElement>,
) => {
const files = Array.from(event.target.files ?? [])
event.target.value = ''
if (files.length > 0) void stageFiles(files)
}
const shell = variant === 'home' ? HomeShell : ConversationShell
const Shell = shell
return (
<Shell>
<section
// Drag/drop on a region isn't a click affordance — wrap the
// composer in a labeled <section> so the a11y rule is satisfied
// without misrepresenting the surface as interactive.
aria-label="Message composer"
className={cn('relative', isDragOver && 'ring-2 ring-primary/60')}
onDragOver={handleDragOver}
onDragLeave={handleDragLeave}
onDrop={handleDrop}
>
<input
ref={fileInputRef}
type="file"
multiple
accept="image/png,image/jpeg,image/webp,image/gif,text/*,application/json"
className="hidden"
onChange={handleFileInputChange}
/>
{attachments.length > 0 || attachmentError ? (
<AttachmentStrip
attachments={attachments}
onRemove={removeAttachment}
error={attachmentError}
/>
) : null}
<div
className={cn(
'flex gap-3',
variant === 'home' ? 'px-4 py-3' : 'px-4 py-3',
isExpandedDraft ? 'items-end' : 'items-center',
)}
>
<BotInputIcon variant={variant} />
<div className="flex-1">
<Textarea
ref={textareaRef}
value={input}
onChange={(event) => setInput(event.currentTarget.value)}
onKeyDown={(event) => {
if (event.key === 'Enter' && !event.shiftKey) {
event.preventDefault()
handleSend()
}
}}
onPaste={handlePaste}
rows={1}
placeholder={
voice.isTranscribing
? 'Transcribing...'
: (placeholder ??
`Message ${selectedAgent?.name ?? 'agent'}...`)
}
disabled={disabled || voice.isTranscribing}
className={cn(
'resize-none border-none bg-transparent px-0 text-[15px] shadow-none focus-visible:ring-0',
'[field-sizing:fixed]',
variant === 'home'
? 'min-h-[40px] py-2 leading-6'
: 'min-h-[40px] py-2 leading-6',
'placeholder:text-muted-foreground/80',
)}
/>
</div>
{streaming && onStop ? <StopButton onStop={onStop} /> : null}
<VoiceButton
isRecording={voice.isRecording}
isTranscribing={voice.isTranscribing}
onStart={() => {
void voice.startRecording()
}}
onStop={() => {
void voice.stopRecording()
}}
/>
<InputActionButton
disabled={
!hasContent ||
isStaging ||
!!disabled ||
voice.isRecording ||
voice.isTranscribing ||
(streaming && !queueAware)
}
onClick={handleSend}
// Spinner stays the user-facing "agent is busy" hint; with the
// queue active we still spin while a turn is in flight.
streaming={streaming}
hasContent={hasContent}
/>
</div>
{voice.error ? (
<div className="px-5 pb-2 text-destructive text-xs">
{voice.error}
</div>
) : null}
<ContextControls
agents={agents}
onCreateAgent={onCreateAgent}
onSelectAgent={onSelectAgent}
selectedAgentId={selectedAgentId}
selectedTabs={selectedTabs}
onToggleTab={toggleTab}
showAgentSelector={variant === 'home'}
status={status}
onAttachClick={openFilePicker}
attachDisabled={attachments.length >= 10 || isStaging || !!disabled}
attachmentsEnabled={attachmentsEnabled}
/>
{isDragOver ? (
<div className="pointer-events-none absolute inset-0 flex items-center justify-center rounded-[inherit] bg-background/80 font-medium text-foreground text-sm backdrop-blur-sm">
Drop files to attach
</div>
) : null}
</section>
</Shell>
)
}
function AttachmentStrip({
attachments,
onRemove,
error,
}: {
attachments: StagedAttachment[]
onRemove: (id: string) => void
error: string | null
}) {
return (
<div className="border-border/40 border-b px-4 pt-3 pb-2">
{attachments.length > 0 ? (
<div className="flex flex-wrap gap-2">
{attachments.map((attachment) => (
<AttachmentChip
key={attachment.id}
attachment={attachment}
onRemove={() => onRemove(attachment.id)}
/>
))}
</div>
) : null}
{error ? (
<div className="mt-2 text-destructive text-xs">{error}</div>
) : null}
</div>
)
}
function AttachmentChip({
attachment,
onRemove,
}: {
attachment: StagedAttachment
onRemove: () => void
}) {
if (attachment.kind === 'image' && attachment.dataUrl) {
return (
<div className="group relative size-16 overflow-hidden rounded-md border border-border/60">
<img
src={attachment.dataUrl}
alt={attachment.name}
className="size-full object-cover"
/>
<button
type="button"
onClick={onRemove}
className="absolute top-1 right-1 inline-flex size-5 items-center justify-center rounded-full bg-background/80 text-muted-foreground opacity-0 transition-opacity hover:text-foreground group-hover:opacity-100"
aria-label={`Remove ${attachment.name}`}
>
<X className="size-3" />
</button>
</div>
)
}
return (
<div className="group flex max-w-[220px] items-center gap-2 rounded-md border border-border/60 bg-background/60 px-2 py-1.5">
<FileText className="size-4 shrink-0 text-muted-foreground" />
<span className="truncate text-xs">{attachment.name}</span>
<button
type="button"
onClick={onRemove}
className="ml-1 inline-flex size-4 items-center justify-center text-muted-foreground hover:text-foreground"
aria-label={`Remove ${attachment.name}`}
>
<X className="size-3" />
</button>
</div>
)
}
function BotInputIcon({ variant }: { variant: 'home' | 'conversation' }) {
return (
<div
className={cn(
'flex items-center justify-center text-[var(--accent-orange)]',
variant === 'home'
? 'h-8 w-8 rounded-lg bg-[var(--accent-orange)]/10'
: 'h-8 w-8 rounded-lg bg-[var(--accent-orange)]/10',
)}
>
<Bot className="h-4 w-4" />
</div>
)
}