Commit Graph

703 Commits

Author SHA1 Message Date
shivammittal274
d9e249bb23 fix: switch from chrome.storage.session to WXT local storage
chrome.storage.session and its onChanged listener may not work
reliably across all extension page contexts in BrowserOS. Switched
to local storage with WXT's storage.defineItem and .watch() which
are proven to work in this codebase (used by stopAgentStorage,
searchActionsStorage, etc). Simpler: single map, WXT handles
cross-context change notifications.
2026-03-19 00:48:29 +05:30
shivammittal274
9c949a3014 fix: stabilize follower watcher — prevent effect re-runs
The linter had added [setMessages] as a dependency which caused the
effect to tear down and re-setup on re-renders, creating windows
where the storage watcher was disconnected. Fixed with empty deps
array and biome-ignore — this effect must run exactly once on mount.
2026-03-19 00:44:21 +05:30
shivammittal274
ef2be606af fix: follower adopts final messages when stream completes
When the stream finishes (status: 'ready'), the follower now finds
the completed stream by conversationId and calls setMessages() to
adopt the final conversation. Previously it only searched for active
streams, missed the completed one, and reverted to empty state.
2026-03-19 00:41:36 +05:30
shivammittal274
f29d596c6d fix: simplify follower detection — no tab ID resolution needed
The follower no longer tries to resolve its own tab ID via
chrome.tabs.query (which returned the wrong tab when the agent
had already moved on). Instead: any fresh side panel (zero messages)
that opens during an active stream automatically follows it.
This works because the background script only opens side panels
on tabs from followerTabIds — so a fresh panel IS a follower.
2026-03-19 00:22:47 +05:30
shivammittal274
30178d6e07 fix: address review feedback — storage races, stale leader, null guard
- Per-conversation storage keys: each leader writes its own
  `session:stream:<id>` key instead of a shared map, eliminating
  read-modify-write races for parallel agents.
- Stale leader detection: follower schedules a re-check timer
  (STALE_THRESHOLD + 500ms) so it exits following mode even when
  the leader tab closes and stops writing.
- Null guard: findStreamForTab is now async and reads storage
  internally, never receives a raw map that could be null.
- Tab ID race: sequenced initial storage read after chrome.tabs.query
  resolves, so ownTabIdRef is set before the first follower check.
- Added .catch() on all chrome.storage.session writes to surface
  quota errors.
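
Illustrative sketch (not the committed code) of the per-conversation key and the stale-leader check described in the bullets above. The key format and the 500ms margin come from this message; the 10 second threshold is taken from the original feature commit, and the type and function names are hypothetical.

```typescript
// Hypothetical names; only the key format and the timing rule come from the commit log.
const STALE_THRESHOLD_MS = 10_000; // assumed from the "10s timeout" in the feature commit

interface StreamRecord {
  conversationId: string;
  updatedAt: number; // epoch milliseconds of the leader's last write
}

// Each leader owns exactly one key, so parallel agents never
// read-modify-write a shared map.
function streamKey(conversationId: string): string {
  return `session:stream:${conversationId}`;
}

// A record is stale once the leader has been silent for longer than the threshold.
function isLeaderStale(record: StreamRecord, now: number): boolean {
  return now - record.updatedAt > STALE_THRESHOLD_MS;
}

// Delay for the follower's re-check timer: just past the point where
// an abandoned record would turn stale.
function recheckDelayMs(): number {
  return STALE_THRESHOLD_MS + 500;
}
```

Because the check depends only on the last write time, a follower can leave following mode even if the leader tab closes without writing a final record.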
2026-03-19 00:00:29 +05:30
shivammittal274
092003c90c feat: multi-tab stream sync for agent side panel
When the agent opens multiple tabs during a task, all tabs now show the
same streaming conversation in their side panels instead of independent
empty conversations.

Server: framework.ts resolves tabId from structuredContent.pageId in
addition to args.page, so new_page and new_hidden_page tools now get
tabId in their output metadata.

Client: leader panel extracts followerTabIds from tool output metadata
and writes them to shared session storage (keyed by conversationId for
parallel execution support). Background script watches for new
followerTabIds and auto-opens side panels on those tabs. Follower
panels detect their tab in the list and sync messages in real-time.

Handles edge cases: follower opt-out on reset, stale leader detection
(10s timeout), cleanup after completion, and stop signal forwarding.
2026-03-18 23:07:59 +05:30
shivammittal274
59b00a6837 feat: remote skill download and auto-sync (#468)
* feat: add remote skill download and auto-sync

Download default skills from remote catalog on first setup with
bundled fallback when offline. Background sync every 45 minutes
checks for new/updated skills without overwriting user-customized
ones. Tracks installed defaults via content hashes in a local
manifest file.

* feat: make skills catalog URL configurable and add generation script

Add SKILLS_CATALOG_URL env var (following CODEGEN_SERVICE_URL pattern)
with fallback to the default constant. Add script to generate
catalog.json from bundled defaults for static hosting.

* feat: add R2 upload script and use cdn.browseros.com for catalog URL

Add upload-skills-catalog.ts that generates and uploads catalog.json
to Cloudflare R2 (same infra as existing build artifacts). Update
default catalog URL to cdn.browseros.com/skills/v1/catalog.json.

* test: add E2E tests for remote skill sync against live CDN

* fix: address code review findings — security, validation, DRY

- Add path traversal protection via safeSkillDir in writeSkillFile
  and readSkillContent (reuses existing validation from service.ts)
- Add runtime type guards for catalog JSON and manifest JSON parsing
- Fix seedFromRemote to return false on partial failure so bundled
  fallback kicks in
- Add per-skill error handling in syncRemoteSkills so one bad skill
  doesn't crash the entire sync
- Wire stopSkillSync into Application.stop() shutdown path
- Extract version from frontmatter in seedFromBundled instead of
  hardcoding '1.0'
- Consolidate duplicated logic: reuse installSkill/writeSkillFile/
  contentHash/saveManifest from remote-sync.ts in seed.ts
- Extract shared catalog generation into scripts/catalog-utils.ts
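
Illustrative sketch of the path traversal protection mentioned in the first bullet above. The actual safeSkillDir in service.ts is not shown in this log, so the validator below is an assumption: it accepts only lowercase kebab-case IDs, which rules out separators and ".." segments. All names here are hypothetical.

```typescript
// Hypothetical validator; the real safeSkillDir may apply different rules.
const SKILL_ID_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Kebab-case IDs cannot contain "/", "\" or "." and so cannot escape the base directory.
function isSafeSkillId(id: string): boolean {
  return id.length > 0 && id.length <= 64 && SKILL_ID_PATTERN.test(id);
}

// Builds the skill directory path, refusing any ID that fails validation.
function skillDirFor(baseDir: string, id: string): string {
  if (!isSafeSkillId(id)) {
    throw new Error(`Unsafe skill id: ${JSON.stringify(id)}`);
  }
  return `${baseDir}/${id}`;
}
```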

* test: add flow tests for all four sync scenarios against live CDN

* refactor: remove redundant scripts and inline catalog generation

Drop generate-skills-catalog.ts, catalog-utils.ts, and
e2e-remote-sync.test.ts (covered by flows.test.ts). Inline
catalog generation into upload-skills-catalog.ts.

* test: add full E2E server flow test against live CDN

Tests all 7 steps of the real server lifecycle: fresh seed from CDN,
no-op sync, user edit preservation, skill reinstall, custom skill
protection, background timer firing, and second startup skip.

* chore: remove e2e-server-flow test

* fix: address Greptile review — entry validation, size limit, DRY, no-op saves

- Validate individual skill entries in catalog (id, version, content
  must all be strings) not just the top-level shape
- Add 1MB response size limit on catalog fetch to prevent resource
  exhaustion from compromised/misconfigured CDN
- Skip manifest save when sync cycle had no changes (avoids
  unnecessary disk I/O every 45 minutes)
- Share extractVersion via remote-sync.ts export, remove duplicate
  from seed.ts
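
Illustrative sketch of the per-entry validation described in the first bullet above: id, version, and content must all be strings. The top-level catalog shape (a `skills` array) is an assumption, since the log does not show the catalog format.

```typescript
interface SkillEntry {
  id: string;
  version: string;
  content: string;
}

// Assumed top-level shape; the real catalog.json layout is not shown in the log.
interface Catalog {
  skills: SkillEntry[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Checks each field the commit message names, not just the presence of the object.
function isSkillEntry(value: unknown): value is SkillEntry {
  return (
    isRecord(value) &&
    typeof value.id === "string" &&
    typeof value.version === "string" &&
    typeof value.content === "string"
  );
}

// Validates the top-level shape and every individual entry.
function isCatalog(value: unknown): value is Catalog {
  return (
    isRecord(value) &&
    Array.isArray(value.skills) &&
    value.skills.every((entry) => isSkillEntry(entry))
  );
}
```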

* fix: prevent bundled fallback from overwriting partial remote seeds

When seedFromRemote partially fails, the bundled fallback now skips
skills already in the manifest (installed by the partial remote
seed). Also adds Content-Length early check before downloading the
full catalog response body.

* fix: run sync immediately on startup, not just on interval

Previously the first sync fired 45 minutes after boot. Now
startSkillSync runs one sync immediately so returning users
get skill updates right away.

* refactor: simplify sync — remote always wins, remove manifest

Remote catalog is the source of truth. If a skill exists in the
catalog, its version is compared against local frontmatter and
overwritten when newer. No manifest file, no content hashes.

User-created skills (IDs not in catalog) are never touched.
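
Illustrative sketch of the "remote always wins" rule above. It assumes versions are dotted numeric strings such as "1.2"; the log does not say how the server compares versions, and the function names are hypothetical.

```typescript
interface CatalogEntry {
  id: string;
  version: string;
}

// Compares dotted numeric versions; missing or non-numeric parts count as 0 (an assumption).
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map((part) => Number.parseInt(part, 10) || 0);
  const pb = b.split(".").map((part) => Number.parseInt(part, 10) || 0);
  const length = Math.max(pa.length, pb.length);
  for (let i = 0; i < length; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Returns the IDs to (re)write: catalog skills that are missing locally or older locally.
// Local skills whose IDs are not in the catalog are never visited, so they are never touched.
function planSync(catalog: CatalogEntry[], localVersions: Map<string, string>): string[] {
  return catalog
    .filter((entry) => {
      const local = localVersions.get(entry.id);
      return local === undefined || compareVersions(entry.version, local) > 0;
    })
    .map((entry) => entry.id);
}
```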

* fix: skip bundled skills already installed by partial remote seed

* chore: remove unreliable Content-Length check

* chore: remove size limit checks, fetch timeout is sufficient
2026-03-17 21:40:45 +05:30
Nikhil
1779e1e7bd fix: create user-data dir if missing (#473) 2026-03-17 08:30:39 -07:00
shivammittal274
2597cdbc70 feat: add Rewrite with AI for scheduled task prompts (#465)
* feat: add "Rewrite with AI" prompt refinement for scheduled tasks

Add a lightweight /refine-prompt endpoint that uses generateText to
rewrite rough scheduled task prompts into clear, actionable instructions.
The UI adds a sparkle-icon button next to the Prompt label in the
NewScheduledTaskDialog with loading state, undo support, and disabled
state when the textarea is empty.

* fix: clear stale undo ref on dialog re-open and pass providerId to refinePrompt

- Reset originalPromptRef when dialog opens and on form submit to
  prevent stale "Undo rewrite" button on re-open
- Accept optional providerId in refinePrompt() so the form's selected
  provider is used for refinement instead of always the system default

* fix: hide undo rewrite link while refinement is in flight

* fix: reset isRefining state on dialog re-open

* fix: ignore stale refine-prompt responses after dialog re-open

Use a request generation counter so that if the dialog is closed and
re-opened while a rewrite is in flight, the stale response is silently
discarded instead of overwriting the fresh form state.
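
Illustrative sketch of a request generation counter as described above. The class and function names are hypothetical; the committed code uses a React ref (refineRequestIdRef) rather than a class.

```typescript
// Tracks which request is the latest; anything older is considered stale.
class RequestGate {
  private current = 0;

  // Starts a new request and returns its generation ID.
  begin(): number {
    this.current += 1;
    return this.current;
  }

  // Invalidates every in-flight request, for example when the dialog re-opens.
  invalidate(): void {
    this.current += 1;
  }

  isCurrent(id: number): boolean {
    return id === this.current;
  }
}

// Applies the refined text only if no newer request or invalidation happened meanwhile.
async function refineWithGate(
  gate: RequestGate,
  refine: () => Promise<string>,
  apply: (text: string) => void,
): Promise<boolean> {
  const id = gate.begin();
  const text = await refine();
  if (!gate.isCurrent(id)) return false; // stale response: drop it silently
  apply(text);
  return true;
}
```

The counter needs no cancellation support from the network layer: the stale response still arrives, but it is ignored.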

* fix: invalidate stale refine requests on dialog reopen and rename to kebab-case

- Increment refineRequestIdRef on dialog open so in-flight requests
  from a previous session are discarded when they complete
- Rename refinePrompt.ts to refine-prompt.ts per CLAUDE.md file naming
2026-03-17 19:40:56 +05:30
shivammittal274
515ad44826 fix: resolve biome v2 config and lint errors (#471)
Migrate `files.ignore` to `files.includes` for Biome v2 compatibility,
fix forEach callback return value, unused variable, import ordering,
and formatting violations.
2026-03-17 19:14:01 +05:30
Dani Akash
2a6848bc1d feat: improved system prompt (#466)
* feat: added ai-sdk dev tools

* feat: new system prompt section

* feat: tests to maintain prompt integrity

* feat: update mcp sync to use react query

* fix: refetch logic for sync

* chore: remove limits on fetching integrations

* fix: refetch integrations on delete

* fix: review comment

* chore: update tests

* fix: improved memory classification

* fix: lint issues

* fix: core memory prompts

* fix: handle scenario where soul file is empty
2026-03-17 19:01:10 +05:30
Dani Akash
74f6a2dff1 fix: issue with fill tool (#469) 2026-03-17 18:58:17 +05:30
Dani Akash
58adac17db feat: new workflows (#470) 2026-03-17 18:56:55 +05:30
shivammittal274
e67c17a0f8 feat: add voice input to agent chat sidebar (#467)
* feat: add voice input to agent chat sidebar

Allow users to record voice and transcribe to text in the chat input.
Mic button shows when input is empty, waveform visualizer during recording,
transcription via OpenAI (llm.browseros.com/api/transcribe).

- Extract shared useVoiceInput hook to lib/voice/
- Time-domain waveform bars that bounce per-frequency-band
- Bar height capped to fit input container
- Analytics events for recording lifecycle

* fix: address review — add fetch timeout, await stopRecording, deduplicate VoiceInputState

- Add AbortSignal.timeout(30s) to transcription fetch
- Await stopRecording() and track analytics after completion
- Export VoiceInputState from useVoiceInput, import in consumers

* fix: await startRecording before tracking, narrow SurveyChat effect deps

- Await startRecording() so analytics only fires after mic permission granted
- Narrow SurveyChat useEffect dependency from [voice] to [voice.transcript, voice.isTranscribing]

* fix: analytics only tracks on success, clean up stream on failure, type API response

- startRecording returns boolean; track(RECORDING_STARTED) only fires on success
- Catch block cleans up MediaStream tracks and AudioContext on partial failure
- Type transcription API response with TranscribeResponse interface

* fix: keep mic button always visible alongside send button

Mic and send are now separate buttons, both always visible.
Mic is disabled while AI is streaming. Send is disabled during
recording/transcribing. Buttons are no longer absolutely positioned
inside the textarea — they sit beside it in the flex row.

* fix: keep mic button always visible inside input alongside send

Both mic and send buttons are always visible inside the input field,
positioned on the right side (ChatGPT-style). Mic is disabled while
AI is streaming. Send is disabled during recording/transcribing.

* fix: remove unreachable CSS branch in recording waveform div
2026-03-17 18:28:19 +05:30
shivammittal274
94e3f99adb feat: add test-ui skill for visual testing of agent extension via CDP (#464)
* feat: add CDP UI inspector script for dev self-testing

* fix: address code review feedback for inspect-ui script

- Use Delete key (not Backspace) to match server's keyboard.ts clearField
- Add windowId resolution to open-sidepanel (chrome.sidePanel.open requires it)
- Make target matching case-insensitive
- Replace process.exit(1) in eval with thrown error for proper cleanup
- Add comment referencing DEV_PORTS source of truth

* docs: add self-testing workflow for UI changes via CDP inspector

* fix: runtime fixes for inspect-ui discovered during live testing

- Remove Input.enable (domain has no enable method)
- Add DOM.getDocument before DOM operations (required by protocol)
- Use BrowserOS-specific sidePanel.browserosToggle API instead of
  standard chrome.sidePanel.open (side panel starts disabled)
- Enable side panel with setOptions before toggling

* feat: add test-ui skill for visual testing of agent extension UI

Adds a Claude Code skill that lets the agent visually test both
surfaces of the BrowserOS extension:
- New tab page (app.html) — left sidebar with Home, Scheduled Tasks,
  Settings, Skills, Memory, Soul, Connect Apps
- Right side panel (sidepanel.html) — chat interface

Includes all gotchas discovered through real testing: randomized ports,
fresh profile onboarding redirect, stale element IDs after navigation,
BrowserOS-specific sidePanel APIs, DOM.getDocument requirement.

* feat: add press_key, scroll, hover, select_option, wait_for to inspect-ui

Brings inspect-ui.ts to parity with server's MCP input tools:
- press_key: key combos like Enter, Control+A, Meta+Shift+P
  (ported from keyboard.ts pressCombo)
- scroll: up/down/left/right with configurable amount
- hover: hover over element by ID for tooltip/hover state testing
- select_option: select dropdown option by value or visible text
  (ported from browser.ts selectOption)
- wait_for: poll for text or CSS selector with 10s timeout

Updated skill documentation with new commands and examples.

* docs: prefer snapshot over screenshot, add holistic debugging guidance

- Add snapshot vs screenshot guidance table — prefer snapshot for
  structural checks, screenshot only for visual/layout verification
- Add server log checking instructions ([agent], [server], [build] tags)
- Add JS error checking via eval
- Add API connectivity verification
- Add common issues troubleshooting table
- Update all examples to use snapshot as default verification

* fix: address Greptile review feedback

- Replace process.exit(1) with process.exitCode + return in cmdWaitFor
  to allow async CDP cleanup in finally blocks
- Fix cmdScroll enabling Runtime instead of Page domain
- Add BROWSEROS_EXTENSION_ID env var override for extension ID
- Align CLAUDE.md dev server command with SKILL.md canonical command
2026-03-17 15:18:00 +05:30
Nikhil
e2069bc999 chore: bump server version (#459) 2026-03-16 16:42:54 -07:00
shivammittal274
2d51c82722 fix: detect custom clickable elements in take_snapshot (#452)
take_snapshot only used the AX tree, which misses custom components
(cursor:pointer divs, onclick handlers, etc.) that lack ARIA roles.
These elements appeared as role="generic" and were invisible to the agent.

Changes:
- Merge findCursorInteractiveElements into snapshot() so take_snapshot
  catches cursor:pointer, onclick, and tabindex elements
- Add DisclosureTriangle to INTERACTIVE_ROLES for <summary> elements
- Use aria-label as text fallback in cursor detection for icon-only buttons
- Fix dedup bug in enhancedSnapshot that was silently dropping all
  cursor-detected elements by checking against all AX node IDs instead
  of only already-included output IDs
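
Illustrative sketch of the dedup fix in the last bullet above. The element shape and function name are hypothetical. The point is that the "seen" set is built from the elements already in the output, not from every node in the AX tree, so a cursor-detected element is dropped only when it would be a true duplicate.

```typescript
interface SnapshotElement {
  nodeId: number;
  role: string;
  text: string;
}

// Merges cursor-detected elements into the snapshot output, skipping true duplicates.
function mergeCursorElements(
  included: SnapshotElement[],
  cursorDetected: SnapshotElement[],
): SnapshotElement[] {
  // Dedup against output IDs only. Using all AX node IDs here would drop
  // every cursor-detected element, since each one also exists in the AX tree.
  const seen = new Set(included.map((element) => element.nodeId));
  const merged = [...included];
  for (const element of cursorDetected) {
    if (seen.has(element.nodeId)) continue;
    seen.add(element.nodeId);
    merged.push(element);
  }
  return merged;
}
```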
2026-03-17 02:01:15 +05:30
shivammittal274
29056226bb feat: add eval framework and coordinate-based input tools (#453)
- Add hover_at, type_at, drag_at coordinate tools to server
- Add hoverAt, typeAt, dragAt methods to Browser class
- Export server internals (browser, tool-loop, registry) for eval imports
- Copy eval app from enterprise repo with agents, graders, runner, dashboard
- Nest eval-targets inside apps/eval
- Adapt sessionExecutionDir → workingDir for current server API
- Add biome ignore for dashboard HTML to prevent lint breaking onclick handlers
2026-03-16 23:12:23 +05:30
shivammittal274
d1d2074abc feat: add get_console_logs tool for browser console output (#454)
* feat: add get_console_logs tool to surface browser console output

Captures Runtime.consoleAPICalled, Runtime.exceptionThrown, and
Log.entryAdded CDP events per page with a FIFO ring buffer (500 entries).

- ConsoleCollector: per-page buffers with O(1) session routing via Map lookup
- Session-aware CDP event dispatching (onSessionEvent) in CdpBackend
- Log.enable() added alongside Runtime.enable() in attachToPage
- Single tool with level hierarchy, text search, limit, and clear params
- Buffer clears on main-frame navigation, cleaned up on page close
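
Illustrative sketch of a FIFO ring buffer with a 500-entry default, as mentioned above. This is a generic implementation under assumed names, not the ConsoleCollector code itself.

```typescript
// Fixed-capacity FIFO buffer: once full, each push overwrites the oldest entry.
class RingBuffer<T> {
  private readonly slots: (T | undefined)[];
  private head = 0; // index of the oldest entry
  private size = 0;

  constructor(private readonly capacity: number = 500) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  push(item: T): void {
    const tail = (this.head + this.size) % this.capacity;
    this.slots[tail] = item;
    if (this.size < this.capacity) {
      this.size += 1;
    } else {
      // Buffer was full: the slot just written held the oldest entry.
      this.head = (this.head + 1) % this.capacity;
    }
  }

  // Returns entries from oldest to newest.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.size; i++) {
      out.push(this.slots[(this.head + i) % this.capacity] as T);
    }
    return out;
  }

  // Empties the buffer, e.g. on main-frame navigation.
  clear(): void {
    this.slots.fill(undefined);
    this.head = 0;
    this.size = 0;
  }
}
```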

* fix: address review — handle session re-attach, remove dead code

- ConsoleCollector.attach() now updates session mapping on re-attach
  instead of early-returning, preventing silent event drops after
  target detach/re-attach (e.g. tab crash, cross-process navigation)
- Remove unused clearConsoleLogs() and ConsoleCollector.clear()
2026-03-16 22:20:40 +05:30
shivammittal274
41c9b1547c feat: add per-task LLM provider selection for scheduled tasks (#450)
* feat: add per-task LLM provider selection for scheduled tasks

Allow users to choose which AI provider a scheduled task runs with,
using the same ChatProviderSelector component from the new-tab page.
Falls back to the global default provider when none is selected or
if the selected provider has been deleted.

* fix: lint issues

* chore: updated to latest schema.graphql file

---------

Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>
2026-03-16 18:03:21 +05:30
shivammittal274
46031ed573 fix: filter empty messages from conversation history to prevent validation errors
The AI SDK can produce assistant messages with empty parts (parts:[]) when
a stream is aborted, and providers reject assistant messages with empty text
content. This adds a validation utility that filters both cases before
sending messages to createAgentUIStreamResponse and when persisting them.
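
Illustrative sketch of the two filters described above, using a simplified message shape. The real AI SDK message type has more part kinds and fields; the interfaces and function names below are assumptions for illustration.

```typescript
interface MessagePart {
  type: string; // "text" for text parts; other values stand for tool calls, files, etc.
  text?: string;
}

interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  parts: MessagePart[];
}

// An assistant message is kept only if it has at least one non-text part
// or at least one text part with non-whitespace content.
function hasUsableContent(message: ChatMessage): boolean {
  if (message.role !== "assistant") return true;
  return message.parts.some(
    (part) => part.type !== "text" || (part.text ?? "").trim().length > 0,
  );
}

// Drops assistant messages with parts: [] and assistant messages with only empty text.
function filterEmptyMessages(messages: ChatMessage[]): ChatMessage[] {
  return messages.filter(hasUsableContent);
}
```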
2026-03-15 17:42:34 +05:30
Felarof
4bee76253d fix: prevent undefined provider in chat requests on fresh install (#442)
* fix: fallback to default BrowserOS provider when provider is null

When the extension first loads, provider config is loaded async from
storage. If a chat request fires before loading completes (race
condition), provider is null and the server receives provider: undefined,
causing a Zod validation error. This adds a fallback to
createDefaultBrowserOSProvider() in both chat paths (sidepanel and
scheduled tasks) so provider.type is always defined.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: fallback to first provider when default provider ID is stale

When defaultProviderId in storage doesn't match any loaded provider
(e.g. after Kimi/Moonshot rollout), selectedProvider was null causing
provider: undefined in chat requests. Now falls back to providers[0].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: repair stale defaultProviderId in storage on load

When the stored default provider ID doesn't match any loaded provider,
write back the corrected ID (providers[0].id) to storage so it doesn't
silently persist across sessions.
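
Illustrative sketch of the fallback chain described across the three changes above: use the stored default if it still exists, otherwise the first loaded provider (and report the ID to write back), otherwise a built-in default. The types and the function name are hypothetical; in the source the last fallback is createDefaultBrowserOSProvider().

```typescript
interface Provider {
  id: string;
  type: string;
}

interface ProviderResolution {
  provider: Provider;
  // ID to write back to storage when the stored default was stale, otherwise null.
  repairedDefaultId: string | null;
}

function resolveProvider(
  providers: Provider[],
  storedDefaultId: string | null,
  builtInDefault: Provider,
): ProviderResolution {
  const match = providers.find((p) => p.id === storedDefaultId);
  if (match) return { provider: match, repairedDefaultId: null };

  const first = providers[0];
  if (first) return { provider: first, repairedDefaultId: first.id };

  // Nothing loaded yet (for example, storage is still being read): never return null.
  return { provider: builtInDefault, repairedDefaultId: null };
}
```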

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 09:05:27 -07:00
Felarof
5b1b4e22cb chore: disable Canva and Exa from Klavis MCP server list
Comment out non-working Canva and Exa integrations from the OAuth MCP
servers list and remove their imports/icon mappings from the UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 15:30:50 -07:00
Felarof
95c855a091 feat: replace rate limit CTAs with Kimi/Moonshot partnership links (#437)
* feat: replace rate limit CTAs with Kimi/Moonshot partnership links

Comment out old "Learn more" and "take a quick survey" links on the
daily limit error banner. Replace with Kimi API key docs link and
direct Moonshot AI platform link for conversion tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove partnership tagline from rate limit banner

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 12:45:41 -07:00
Felarof
2c04d79830 fix: use BookOpen icon for Docs button in settings sidebar
The Docs link in the settings sidebar was using the Info icon (circle
with "i"). Changed it to BookOpen which is the standard icon for
documentation links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:36:19 -07:00
Dani Akash
290ee91a8b Add 'packages/browseros-agent/' from commit '90bd4be3008285bf3825aad3702aff98f872671a'
git-subtree-dir: packages/browseros-agent
git-subtree-mainline: 8f148d0918
git-subtree-split: 90bd4be300
2026-03-13 21:22:09 +05:30
Dani Akash
8f148d0918 chore(repo): remove BrowserOS-agent submodule 2026-03-13 21:21:51 +05:30
Nikhil
8a38e90e24 fix: move session dirs to ~/.browseros/sessions and update skill paths (#494)
* chore: bump server version

* fix: move session dirs to ~/.browseros/sessions and update skill paths

Session directories now live under ~/.browseros/sessions/{conversationId}/
instead of executionDir/sessions/. Adds 30-day cleanup for stale sessions
at server startup. Updates 6 default skills to reference the working
directory instead of hardcoding ~/Downloads/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: rename sessionExecutionDir to workingDir across server

Consistent naming for the per-conversation working directory:
- ResolvedAgentConfig.sessionExecutionDir → workingDir
- ToolDirectories.executionDir → workingDir
- resolveExecutionPath() → resolveWorkingPath()
- buildBrowserToolSet param: executionDir → workingDir

Server-level executionDir (DB, logs) unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — restore emoji folder name, refresh session mtime

- Revert "Read Later" back to "📚 Read Later" to avoid creating
  duplicate bookmark folders for existing users
- Touch session dir mtime on each message via utimes() so cleanup
  correctly reflects last activity, not just directory creation time

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review round 2 — remove dead executionDir, fix emoji

- Remove executionDir from ChatServiceDeps and ChatRouteDeps since
  resolveSessionDir now uses getSessionsDir() directly
- Fix missed emoji in notification format template

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:41:47 -07:00
Nikhil Sonti
2c8cbbb77f chore: update patch 2026-03-12 14:24:07 -07:00
Nikhil Sonti
a12b3b4ffc chore: bump PATCH and OFFSET 2026-03-11 17:22:42 -07:00
Nikhil Sonti
038ae259f0 feat: show updates immediately on macos 2026-03-11 17:01:14 -07:00
Nikhil Sonti
90400e3fcf fix: sparkle notification for update fix duplication 2026-03-11 16:38:48 -07:00
Nikhil
58a216fde3 fix: sparkle crash + notification fix for macos (#425)
* fix: sparkle crash

* feat: sparkle notification fix

* feat: new tab focus fix
2026-03-11 14:50:54 -07:00
Nikhil
38cc388894 feat: add missing patches and split sparkle in features.yaml (#424)
* feat: add missing patches to features.yaml

Add 37 patch files from chromium_patches/ that were not tracked in
features.yaml. Creates 3 new features (cdp-api, vertical-tabs,
crash-reporter) and adds missing files to 3 existing features
(chromium-ui-fixes, side-panel-fixes, first-run).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: split sparkle third-party from mac-sparkle-updater

Move third_party/sparkle/ into its own feature since the Sparkle
framework is downloaded on-the-fly during build, not a permanent
patch in the tree.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: minor

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 14:32:53 -07:00
Nikhil Sonti
32ce02b59f fix: hidden windows fix 2026-03-10 18:40:10 -07:00
Nikhil Sonti
7566f0ee82 fix: sidepanel request focus fix 2026-03-10 18:39:19 -07:00
Nikhil Sonti
ffe1f8a469 chore: server ota 2026-03-10 18:31:37 -07:00
Nikhil Sonti
a5e7c359e3 chore: Merge branch 'main' 2026-03-10 18:22:19 -07:00
Nikhil Sonti
3f4cccdf12 chore: bump PATCH and OFFSET 2026-03-10 18:22:15 -07:00
Nikhil
866fe88acd feat: fix hidden window and tab tools (#417) 2026-03-10 18:21:10 -07:00
Nikhil
a824078f6d fix: compaction config for small context windows (≤32K) (#466)
* fix: compaction config for small context windows (≤32K)

Raise COMPACTION_SMALL_CONTEXT_WINDOW from 16K to 32K so models like
Haiku 4.5 (30K context) use proportional 50% reserve instead of the
fixed 20K reserve. Also scale fixedOverhead for small contexts (capped
at 40% of context window) to prevent the doom loop where overhead alone
triggers compaction on every step.
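
Illustrative sketch of the two rules described above: a proportional 50% reserve for context windows at or below the small-window threshold, a fixed 20K reserve above it, and a fixed overhead capped at 40% of a small window. Treating "32K" and "20K" as 32,000 and 20,000 tokens is an assumption, the function names are hypothetical, and the real fixedOverhead value is not given in this log, so it is a parameter here.

```typescript
const SMALL_CONTEXT_WINDOW = 32_000; // assumed token count for "32K"
const FIXED_RESERVE = 20_000; // assumed token count for "20K"
const SMALL_RESERVE_PERCENT = 50;
const MAX_OVERHEAD_PERCENT = 40;

// Small windows reserve half their size; larger ones use the fixed reserve.
function reserveTokens(contextWindow: number): number {
  if (contextWindow <= SMALL_CONTEXT_WINDOW) {
    return Math.floor((contextWindow * SMALL_RESERVE_PERCENT) / 100);
  }
  return FIXED_RESERVE;
}

// For small windows the overhead estimate is capped so that overhead alone
// cannot push every step over the compaction threshold.
function effectiveOverhead(contextWindow: number, fixedOverhead: number): number {
  if (contextWindow > SMALL_CONTEXT_WINDOW) return fixedOverhead;
  const cap = Math.floor((contextWindow * MAX_OVERHEAD_PERCENT) / 100);
  return Math.min(fixedOverhead, cap);
}
```

Under these assumptions a 30K-context model reserves 15,000 tokens rather than 20,000.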

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add compaction tuning guidance to limits constants

Explain the relationship between SMALL_CONTEXT_WINDOW and
FIXED_OVERHEAD so devs know the 24K minimum constraint when
tweaking these values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 18:12:20 -07:00
Nikhil Sonti
ae49da6e09 fix: sidepanel request focus fix 2026-03-10 17:27:54 -07:00
Nikhil Sonti
4472c2b890 chore: bump PATCH and OFFSET 2026-03-10 15:12:18 -07:00
Nikhil
de70525889 fix: grab handle size (#414) 2026-03-10 12:26:08 -07:00
Nikhil
5b27933c63 feat: add 2-stage pruning to compaction pipeline (#455)
* feat: add 2-stage pruning to compaction pipeline before LLM summarization

Add two new lightweight stages to the compaction prepareStep pipeline that
recover context tokens cheaply before falling back to expensive LLM
summarization:

- Stage 2: Use AI SDK's pruneMessages to remove old tool call/result
  pairs beyond the last 6 messages entirely
- Stage 3: Replace remaining tool output values with short placeholders
  ("[Cleared — N chars]") while preserving tool call structure and IDs

Both stages re-estimate tokens from message content (not stale step
usage) after modifying messages. The existing LLM summarization and
sliding window fallback remain as Stage 4.

Also adds estimateTokensForThreshold() helper, clearToolOutputs()
function, and COMPACTION_PRUNE_KEEP_RECENT_MESSAGES /
COMPACTION_CLEAR_OUTPUT_MIN_CHARS constants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: reorder compaction pipeline — truncate before clear, protect recent tools

- Stage 0: Check threshold, return untouched when under (no data loss)
- Stage 1: Prune old tool call/result pairs beyond last 6 messages
- Stage 2: Truncate large tool outputs to 15K chars (keeps partial content)
- Stage 3: Clear old tool outputs with placeholders, protect last 2
- Stage 4: LLM-based compaction with sliding window fallback

clearToolOutputs now accepts keepRecentCount parameter (default 2) to
skip the N most recent tool messages from clearing.
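
Illustrative sketch of clearing old tool outputs while protecting the most recent ones, as described above. The message shape is simplified and hypothetical; the placeholder text follows the format quoted in the first change of this PR.

```typescript
interface ToolMessage {
  role: "tool";
  toolCallId: string;
  output: string;
}

interface TextMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

type Message = ToolMessage | TextMessage;

// Replaces tool outputs with a short placeholder, except for the
// keepRecentCount most recent tool messages. Call structure and IDs are preserved.
function clearToolOutputs(messages: Message[], keepRecentCount = 2): Message[] {
  const toolIndexes: number[] = [];
  messages.forEach((message, index) => {
    if (message.role === "tool") toolIndexes.push(index);
  });
  // slice(-0) would return everything, so handle a zero count explicitly.
  const protectedIndexes = new Set(
    keepRecentCount > 0 ? toolIndexes.slice(-keepRecentCount) : [],
  );

  return messages.map((message, index) => {
    if (message.role !== "tool" || protectedIndexes.has(index)) return message;
    // \u2014 is the dash used in the placeholder format from the commit message.
    return { ...message, output: `[Cleared \u2014 ${message.output.length} chars]` };
  });
}
```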

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: limits fixes

* fix: address review — preserve toKeep context, derive test values from constants

- When Stage 3 (clearToolOutputs) doesn't resolve overflow, pass
  truncated (not cleared) messages to Stage 4 so toKeep retains
  meaningful tool outputs for the agent's immediate context
- Add comment explaining intentional conservatism in post-prune
  token estimation (step usage is stale, must re-estimate safely)
- Refactor computeConfig tests to derive expected values from
  AGENT_LIMITS constants instead of hardcoding magic numbers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 10:41:34 -07:00
Nikhil
7d20768d8e feat: persist large tool outputs to disk (#452)
* feat: persist large tool outputs to disk

* fix: address PR review comments for tool output limits

* chore: raise filesystem read line limit to 500
2026-03-10 09:25:19 -07:00
Felarof
1e6b5ac7a8 chore: sync packages/browseros-agent submodule (to f35ac0d) 2026-03-10 12:20:28 +00:00
Dani Akash
f35ac0ddd3 feat: new onboarding tools (#385)
* feat: new tools for breadcrumbs

* feat: setup scheduled task card

* feat: added dismiss cooldown

* chore: update prompt

* fix: support api key tool

* fix: prompt text to limit nudges

* fix: scheduled tasks card

* fix: update nudges prompt

* feat: skip nudges when user dismisses nudge

* fix: ensure nudges only show if they are not dismissed

* Revert "fix: ensure nudges only show if they are not dismissed"

This reverts commit d825254698829b8e9941aae7873bd440027d0c74.

* Revert "feat: skip nudges when user dismisses nudge"

This reverts commit 12b552b454d10ec4209b88668fc48681423ff6fc.

* Revert "fix: update nudges prompt"

This reverts commit 80b7520b953b4d3cbed2ed477b9e508e39938dca.

* feat: update agent with mcp when new mcp connection is added

* feat: created connect apps option as a blocking card system

* feat: schedule tasks passive without dismiss

* fix: nudges and prompt texts

* fix: biome lint errors

* fix: review comments

* fix: resolve comments

* fix: review comments

* fix: review comments

* fix: auto resolve state

* fix: eliminate the race where the async delete could resolve after the
new session

* feat: track ignored apps list

* fix: empty response text object on message reply

* feat: sync previously connected mcps

* feat: sync integrations with klavis

* feat: account for unauthenticated connections

* fix: analytics events

* fix: typescript issues

* fix: klavis client issue

* fix: invalid mcps causing entire responses to fail

* fix: prompt with card for integrations when the integration fails

* fix: prompt structure to support declined apps

* fix: refresh session on mcp changes
2026-03-10 17:44:10 +05:30
shivammittal274
b6b45404ee feat: add agent skills system with catalog, loader, and UI (#450)
* feat: add agent skills system with catalog, loader, and UI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: return 500 for server errors in PUT/DELETE skill routes

Previously both handlers returned 404 for all errors, masking filesystem
failures (disk full, permission denied) as "not found". Now only
"not found" errors return 404; everything else returns 500.

* fix: align SKILL.md format with agentskills.io spec

- Move `enabled` and `version` into `metadata` field (spec only allows
  name, description, license, compatibility, metadata, allowed-tools)
- Frontmatter `name` now matches directory name (lowercase kebab-case)
- Human-readable name stored in `metadata.display-name`
- Add index signature to SkillMetadata for arbitrary string keys
- Validate frontmatter with type guard in getSkill (remove unsafe cast)
- updateSkill now preserves existing frontmatter fields (license, etc.)
- Tighten buildSkillMd param from Record<string, unknown> to SkillFrontmatter

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 17:24:05 +05:30
Felarof
797c75baee chore: sync packages/browseros-agent submodule (to 44071cb) 2026-03-09 21:13:22 +00:00