BrowserOS

mirror of https://github.com/browseros-ai/BrowserOS.git synced 2026-05-20 20:39:10 +00:00

Author	SHA1	Message	Date
Nikhil Sonti	9a36c09577	fix: address review feedback for PR #602 - Make loadProdEnv return empty map when .env.production is absent so pickEnv falls through to process.env in CI (Greptile P1) - Add semver format validation for version string in install.sh and install.ps1 to guard against malformed CDN responses - Pass inputs.version via env var instead of inline ${{ }} interpolation to prevent command injection in workflow shell	2026-03-27 11:36:49 -07:00
Nikhil Sonti	2ce6379f8c	feat: upload CLI binaries to CDN during release and gate workflow to core team - Extend scripts/build/cli/upload.ts with uploadCliRelease() that pushes archives + checksums to R2 under versioned (cli/v{VERSION}/) and latest (cli/latest/) paths, plus a version.txt for lightweight latest resolution - Update scripts/build/cli.ts entry point with --release/--version/--binaries-dir flags (existing no-args behavior preserved for upload:cli-installers) - Rewrite install.sh and install.ps1 to fetch from cdn.browseros.com instead of GitHub releases API — eliminates rate limits and API dependency - Add environment: release-core to release-cli.yml for core-team gating via GitHub environment protection rules - Add Bun setup + CDN upload step to the workflow between build and GitHub release	2026-03-27 11:36:49 -07:00
Nikhil	1c5ffdf878	fix: harden cli installer bootstrap (#601) * fix: harden cli installer bootstrap * refactor: rework 0327-harden_cli_installers based on feedback	2026-03-27 11:24:16 -07:00
shivammittal274	ed948f4b59	Feat/cli launch ready v2 (#600 ) * test: temporarily allow release workflow on any branch * fix(cli): restore main-only guard, remove goreleaser dependency Replaces GoReleaser (Pro-only monorepo feature) with plain go build. Tested: RC release created successfully on branch with all 6 binaries. * fix(cli): fix hdiutil mount detection, update README with install/launch/init flow	2026-03-27 20:20:17 +05:30
shivammittal274	aad5bc16fd	Feat/cli launch ready v2 (#599 ) * test: temporarily allow release workflow on any branch * fix(cli): restore main-only guard, remove goreleaser dependency Replaces GoReleaser (Pro-only monorepo feature) with plain go build. Tested: RC release created successfully on branch with all 6 binaries. * fix(cli): remove -quiet from hdiutil so mount point is detected	2026-03-27 20:17:13 +05:30
Dani Akash	cee318a40b	fix: improve chat history freshness and reduce query payload (#598 ) * fix: add refresh indicator to chat history when fetching latest conversations Show a non-blocking "Fetching latest conversations" indicator at the top of the history list while the cached data is being refreshed. Users can still interact with the cached conversation list during the refresh. * perf: reduce chat history query payload — fetch last 2 messages instead of 5 The conversation list only displays the last user message as a preview. Fetching 5 messages per conversation was wasteful — each message contains the full UIMessage object (tool calls, reasoning, etc.) multiplied by 50 conversations per page. Reduced to last 2 which is sufficient to find the last user message in a user→assistant exchange. * perf: use first+DESC instead of last+ASC to push LIMIT down to SQL PostGraphile's `last: N` doesn't map to SQL LIMIT — it uses a padded LIMIT 10 and slices in application code. Changing to `first: 2` with ORDER_INDEX_DESC generates a true SQL LIMIT 2, reducing rows scanned from 500 to 100 per page (50 conversations × 2 vs 10 messages each). No UX impact — extractLastUserMessage() filters by role regardless of message order. * chore: update react query packages * feat: replace localforage with idb-keyval	2026-03-27 19:49:47 +05:30
Dani Akash	febaf58f91	fix: guard filesystem tools behind workspace selection and handle mid-conversation changes (#595) * fix: remove filesystem tools when no workspace is selected - Make workingDir optional on ResolvedAgentConfig - Remove resolveSessionDir() fallback that always created a session dir, masking the no-workspace state and keeping filesystem tools available - Gate buildFilesystemToolSet() on workingDir being defined - Add workspace change detection mid-conversation — rebuilds the agent session when workspace is added, removed, or switched (same pattern as existing MCP server change detection) - download_file falls back to tmpdir() when no workspace is set - Memory/soul tools are unaffected — they use ~/BrowserOS/ paths * fix: sanitize message history when session rebuilds with different tools When a session is rebuilt due to workspace or MCP changes, the carried-over message history may contain tool parts for tools that no longer exist in the new session. The AI SDK validates messages against the current toolset and rejects parts with no matching schema. - Add toolNames getter to AiSdkAgent exposing registered tool names - Add sanitizeMessagesForToolset() to strip tool parts referencing removed tools from carried-over messages - Apply sanitization in both MCP and workspace session rebuilds * fix: prepend tool-change context to user message on session rebuild When workspace or MCP integrations change mid-conversation, prepend a [Context: ...] block to the user's message explaining what changed. This prevents the LLM from hallucinating tool usage based on patterns in the carried-over conversation history. Context messages vary by change type: - Workspace removed: lists unavailable filesystem tools, suggests selecting a working directory - Workspace added: confirms filesystem tools are available with path - Workspace switched: notes the new working directory - MCP changed: notes that some integration tools may have changed Only fires on the first message after a rebuild. Invisible in the UI. * fix: make MCP change context specific about which apps were added/removed Diff the old and new MCP server keys to produce specific context like: - "The following app integrations were disconnected: Gmail, Slack." - "The following app integrations were connected: Linear." instead of a generic "some tools may no longer be available" message. * refactor: extract shared rebuildSession helper in ChatService Eliminates the duplicated 20-line dispose→create→sanitize→store flow that existed separately in both the MCP and workspace change-detection blocks. Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com> * test: add sanitizeMessagesForToolset test suite Tests for the message sanitization that runs when a session rebuilds with a different toolset (workspace or MCP change mid-conversation): - Preserves messages with no tool parts - Preserves tool parts when tool is in the toolset - Strips tool parts when tool is NOT in the toolset - Strips multiple removed tool parts from same message - Keeps browser tools while removing filesystem tools - Removes messages that become empty after stripping - Preserves non-tool parts (reasoning, step-start, file) - Returns same references when no filtering needed - Handles empty message array and empty toolset * style: fix biome formatting in chat-service.ts --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>	2026-03-27 18:30:25 +05:30
Dani Akash	aacb47f7ee	feat: isolate new-tab agent navigation from origin tab (#593) * feat: isolate new-tab agent navigation from origin tab Add origin-aware navigation isolation so the agent never navigates away from the new-tab chat UI. This is a two-layer defense: 1. Prompt adaptation: When origin is 'newtab', the system prompt's execution and tool-selection sections are rewritten to prohibit navigating the active tab and default all lookups to new_page. 2. Tool-level guards: navigate_page and close_page reject attempts to act on the origin tab when in newtab mode, returning an error that teaches the agent to self-correct. The client now sends an `origin` field ('sidepanel' | 'newtab') instead of injecting a soft NEWTAB_SYSTEM_PROMPT that LLMs could ignore. Backwards compatible — defaults to 'sidepanel'. Closes TKT-592, addresses TKT-564 * test: add newtab origin navigation guard tests - 14 new prompt tests verifying the system prompt adapts correctly for newtab vs sidepanel origin (execution rules, tool selection table, absence of conflicting single-tab guidance) - 6 new integration tests for navigate_page and close_page guards: rejects origin tab in newtab mode, allows non-origin tabs, allows all tabs in sidepanel mode, backwards compatible with no session	2026-03-27 12:06:32 +05:30
Dani Akash	b3003542d8	docs: overhaul READMEs across all major packages (#594 ) * docs: overhaul READMEs across all major packages - Root README: restructure with feature table, LLM provider table, comparison matrix, architecture map, and docs link - New: packages/browseros/README.md (Chromium fork build system) - New: apps/server/README.md (MCP server + agent loop) - New: packages/cdp-protocol/README.md (CDP type bindings) - Polish: agent-sdk (badges, prerequisites, multi-step example, links) - Polish: cli (badges, install section, MCP server section, links) - Polish: agent extension (badges, WXT mention, architecture context) - Polish: eval (badges, paper links) * fix: address review — consistent tool count and correct default port - CLI README: "54 MCP tools" → "53+ MCP tools" to match root and server docs - Agent SDK README: localhost:3000 → localhost:9100 to match documented default * docs: add detailed comparison links to How We Compare section * docs: update comparison table with verified competitor data Research all 5 competitors via official websites and docs: - Chrome: no AI agent, Gemini Nano only, MV3 weakening ad blocking - Brave: BYOM feature, local models via BYOM, Shields ad blocking, MV2+MV3 - Dia: Skills-based AI, no BYOK, cloud AI, acquired by Atlassian - Comet: full cloud-based agent, built-in ad blocking, extensions on desktop - Atlas: standalone Chromium browser with Agent Mode, 30-day cloud memory Renamed Arc/Dia column to just Dia (Arc is sunset). * docs: simplify comparison table with clean checkmarks and key differentiators * docs: update browseros-agent README — remove submodule note, add missing packages	2026-03-27 11:59:04 +05:30
Nikhil	aba7a10430	chore: server release (#592)	2026-03-26 19:13:56 -07:00
Nikhil	220577b41c	feat: add CDN-hosted CLI installer flow (#588) * feat: add CDN upload flow for cli installers * fix: move cli install docs to top-level readme * fix: bun.lock update	2026-03-26 17:41:03 -07:00
Nikhil	03b45013a6	feat(cli): add install scripts for macOS, Linux, and Windows (#587 ) * feat(cli): add install scripts for macOS, Linux, and Windows Bash script (install.sh) for macOS/Linux and PowerShell script (install.ps1) for Windows. Both download the correct platform binary from GitHub Releases with checksum verification, version resolution, and PATH setup. * fix(cli): address PR review comments for install scripts - Add checksum verification to install.ps1 using Get-FileHash - Add warnings on all checksum skip paths in install.sh - Use grep -F (fixed-string) instead of regex for filename matching - Add ?per_page=100 to GitHub API call in install.ps1 - Use random temp directory name in install.ps1 to avoid collisions * fix(cli): address installer review feedback	2026-03-26 17:05:21 -07:00
Nikhil	085352a6f0	fix(ui): resolve MCP promo banner dismiss button overlapping with text (#581) Move dismiss button from absolute positioning to inline flex child, preventing it from overlapping with the "Set up" button.	2026-03-26 12:54:00 -07:00
shivammittal274	663c18ee97	fix(cli): update goreleaser tag_prefix to match browseros-cli-v* format (#579)	2026-03-27 01:07:36 +05:30
Dani Akash	48727750b4	fix: change CLI tag format from cli/v* to browseros-cli-v* (#578) GoReleaser free cannot parse slash-prefixed tags (cli/v0.0.1) as semver. Switch to browseros-cli-v0.0.1 format which is valid semver after stripping the prefix. Remove the monorepo config (GoReleaser Pro only).	2026-03-27 00:58:13 +05:30
Dani Akash	30a3a96a57	fix: add monorepo tag prefix for goreleaser to parse cli/ tags (#576)	2026-03-27 00:50:38 +05:30
shivammittal274	6773ce39da	ci(cli): manual dispatch release workflow (#574 ) * ci(cli): change release workflow to manual dispatch from main - Trigger via Actions UI with a version input (e.g. "0.1.0") - Only runs on main branch - Creates git tag cli/v<version> automatically - Then GoReleaser builds all 6 binaries and creates the GitHub Release * feat: add scoped release notes, changelog PR, and idempotent tags to CLI workflow - Add concurrency group to prevent parallel releases - Add scoped release notes from commits touching the CLI directory - Pass release notes to goreleaser via --release-notes flag - Make tag creation idempotent for safe re-runs - Tag the saved release SHA, not HEAD after branching - Add CHANGELOG.md and auto-update via PR with auto-merge - Add pull-requests: write permission --------- Co-authored-by: Dani Akash <DaniAkash@users.noreply.github.com>	2026-03-27 00:41:08 +05:30
github-actions[bot]	342a3e4a07	docs: update agent extension changelog for v0.0.52 (#573) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-03-26 19:01:46 +00:00
Dani Akash	1f00cbc9cc	feat: add release workflow for agent extension (#566 ) * feat: add release workflow for agent extension Adds a workflow_dispatch workflow that builds the WXT extension, creates a .zip for sideloading, generates scoped release notes with contributors and PR links, creates a GitHub release with the zip attached, and opens an auto-merge PR to update CHANGELOG.md. * fix: correct API URL to api.browseros.com * fix: remove duplicate PR numbers and contributors from extension release notes Apply the same fixes from the agent-sdk workflow: - Skip PR number if already in commit subject (squash merges) - Remove custom Contributors section (GitHub auto-generates one) - Clean up unused variables * fix: use absolute path for extension zip in release upload * fix: wxt zip already builds, use correct output path - Remove separate build step since wxt zip runs the build internally - Fix zip path from .output/.zip to dist/-chrome.zip	2026-03-27 00:23:04 +05:30
shivammittal274	e3d57e5347	feat(cli): production-ready CLI with auto-launch, install, and cross-platform builds (#555) * feat(cli): production-ready CLI with auto-launch, install, and cross-platform builds - init: accept URL argument and --auto flag for non-interactive setup - install: new command to download BrowserOS app for current platform - launch: auto-detect and launch BrowserOS when server is not running - discovery: prefer server.json (live) over config.yaml (may be stale) - errors: actionable messages guiding users to init/install - goreleaser: cross-platform builds for 6 targets (darwin/linux/windows × amd64/arm64) - ci: GitHub Actions workflow to release CLI binaries on cli/v* tag push * fix(cli): check health status code and add progress dots during launch - Health check in newClient() now verifies HTTP 200, not just no error - waitForServer prints dots during the 30s poll so users know it's working * refactor(cli): make launch an explicit command, remove auto-launch from newClient - launch: new explicit command to find and open BrowserOS app - launch: probes server.json, config, and common ports before launching - launch: if already running, reports URL instead of launching again - init --auto: uses port probing to find running servers - install --deb: errors on non-Linux instead of silently downloading DMG - error messages: guide users to launch/install/init explicitly - removed: auto-launch from newClient() — CLI never does something surprising * fix(cli): platform-native detection, launch, and install for all OSes Detection (isBrowserOSInstalled): - macOS: uses `open -Ra` to query Launch Services (no hardcoded paths) - Linux: checks /usr/bin/browseros (.deb), browseros.desktop, AppImage search - Windows: checks %LOCALAPPDATA%\BrowserOS\Application\BrowserOS.exe and HKCU/HKLM uninstall registry keys Launch (startBrowserOS): - macOS: `open -b com.browseros.BrowserOS` (bundle ID, not path) - Linux: `browseros` binary, AppImage, or `gtk-launch browseros` (fixed: was using xdg-open which opens by MIME type, not desktop files) - Windows: runs BrowserOS.exe from known Chromium per-user install path (fixed: was using `cmd /c start BrowserOS` which doesn't resolve) Install (runPostInstall): - macOS: hdiutil attach → cp -R to /Applications → hdiutil detach - Linux: chmod +x for AppImage, dpkg -i instruction for .deb - Windows: launches installer exe - --deb flag now errors on non-Linux platforms Removed auto-launch from newClient() — CLI never does surprising things. Sources verified from: - packages/browseros/build/common/context.py (binary names per platform) - packages/browseros/build/modules/package/linux.py (.deb structure, .desktop file) - packages/browseros/chromium_patches/chrome/install_static/chromium_install_modes.h (Windows base_app_name="BrowserOS", registry GUID, install paths) - /Applications/BrowserOS.app/Contents/Info.plist (bundle ID)	2026-03-26 23:12:55 +05:30
Dani Akash	0f193055c7	fix: broaden connection error detection for main page and sidepanel (#563 ) * fix: broaden connection error detection for main page and sidepanel The connection error check required both "Failed to fetch" AND "127.0.0.1" in the error message. On the main page, the browser only produces "Failed to fetch" without the IP, so users saw a generic "Something went wrong" instead of the troubleshooting link. Broaden detection to also match "localhost" and bare "Failed to fetch" errors that don't contain an external URL. Also pass providerType in NewTabChat so provider-specific errors render correctly. Closes #526 * fix: simplify connection error detection All chat requests go through the local BrowserOS agent server, so any "Failed to fetch" error is always a local connection issue. Remove the unnecessary 127.0.0.1/localhost/URL checks. * fix: pass providerType to agentUrlError ChatError instances	2026-03-26 20:55:40 +05:30
Dani Akash	f45cb58889	fix: stop sending port-in-use errors to Sentry (#558 ) Port conflicts are expected — Chromium retries with a different port. These errors were flooding Sentry (14k+ events) without user impact. - handleStartupError: move Sentry.captureException below the port-in-use check so it only fires for unexpected startup errors - handleControllerStartupError: skip Sentry capture for port errors - index.ts: exit early for port errors before Sentry capture	2026-03-26 09:32:18 +05:30
shivammittal274	37ead6d129	fix: add cursor-pointer to credit badge in sidepanel (#554)	2026-03-26 00:09:58 +05:30
Nikhil	5ea9463030	fix: widen scheduled task results dialog and add horizontal scroll for tables (#549 ) - Change dialog width from sm:max-w-2xl (672px) to sm:w-[70vw] sm:max-w-4xl so it takes 70% of viewport width, capped at 896px - Add overflow-x-auto on table wrappers so wide tables scroll horizontally instead of being clipped Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 16:27:46 -07:00
shivammittal274	dde35ccbd5	feat: integrate models.dev for dynamic LLM provider/model data (#547) * feat: integrate models.dev for dynamic LLM provider/model data (#TKT-657) Replace hardcoded model lists with data sourced from models.dev so new providers and models appear automatically when the community adds them. - Add build script (scripts/generate-models.ts) that fetches models.dev/api.json and outputs a compact JSON with 10 providers and 520 models - Replace hardcoded MODELS_DATA (50 models) with dynamic models.dev lookups - Add searchable model combobox (Popover + Command) replacing plain Select dropdown - Enrich provider templates with models.dev metadata (context window, image support) - Keep chatgpt-pro, qwen-code, browseros, openai-compatible as hardcoded providers * fix: address review — remove ollama-cloud mapping, fix default models, remove dead code - Remove ollama from PROVIDER_MAP (ollama-cloud has cloud models, not local) - Add ollama to CUSTOM_PROVIDER_MODELS with empty list (users type custom IDs) - Update defaultModelIds to ones that exist in models.dev data: openrouter → anthropic/claude-sonnet-4.5 lmstudio → openai/gpt-oss-20b bedrock → anthropic.claude-sonnet-4-6 - Remove dead isCustomModel export - Regenerate models-dev-data.json (9 providers, 486 models) * fix: model suggestion list focus/dismiss behavior - List only opens when input is focused or user types - Clicking a model selects it and closes the list - Clicking outside (blur) dismisses the list - onMouseDown preventDefault on list items prevents blur race condition * refactor: extract ModelPickerList component with proper open/close UX - Collapsed state: Select-like trigger showing selected model + chevron - Expanded state: search input + scrollable filtered list, inline - Click outside or Escape to close, Enter to submit custom model - Extracted as separate component (reduces dialog nesting, testable) - No more setTimeout hacks for blur handling * chore: remove plan doc from repo	2026-03-25 02:41:07 +05:30
shivammittal274	c8204efab6	feat: improve rate limit UX, usage page, and provider selector (#544 ) * feat: improve rate limit UX, usage page, and provider selector - Show "Add your own provider for unlimited usage" CTA when BrowserOS credits are exhausted or daily limit is reached - Fix credit exhaustion detection to match actual error message - Improve Usage page: remove disabled Add Credits button, add "Coming soon" badge, add "Want unlimited usage?" section linking to providers - Add "+ Add Provider" button at bottom of chat provider selector dropdown * fix: use asChild pattern for Button+anchor in usage page Replace nested <a><Button> (invalid HTML) with Button asChild pattern per shadcn/ui convention.	2026-03-24 18:01:42 +05:30
shivammittal274	fb5143b563	feat: UI improvements for OAuth dialog, provider badges, and events docs (#543 ) * feat: UI improvements for OAuth dialog, provider badges, and events docs - Replace OAuth device code toast with a proper Dialog showing the code prominently with a copy button (GitHub Copilot, Qwen Code, ChatGPT Pro) - Add "New" badge on provider template cards for ChatGPT Plus/Pro, GitHub Copilot, and Qwen Code with orange border highlight - Add events.md documenting all analytics events across the platform * fix: add verificationUri to DeviceCodeDialog for popup-blocked fallback Add verificationUri to PendingDeviceCode interface and pass it from both handleClientAuth and handleServerAuth. Render a fallback "Open verification page" link in DeviceCodeDialog so users can navigate to the auth page if the popup was blocked.	2026-03-24 17:27:27 +05:30
Dani Akash	fe257cd8d1	feat: only parse browseros provider errors (#542)	2026-03-24 14:43:05 +05:30
shivammittal274	890d3406dd	feat: promote BrowserOS as MCP with UI improvements (#541 ) - Add MCP promo banner on AI providers page with "New" badge and "66+ tools" highlight, linking to /settings/mcp - Add Quick Setup section on MCP settings page with copy-paste commands for Claude Code, Gemini CLI, Codex, Claude Desktop, OpenClaw - Consolidate MCP settings: move restart button inline with server URL, remove separate MCP Server Settings card - Add analytics event for promo banner clicks	2026-03-24 03:08:08 +05:30
shivammittal274	c316e09c11	feat: add source tag to tool_executed PostHog events (#538) Add `source: 'mcp' | 'chat'` property to all `tool_executed` metrics events so we can distinguish tool calls from external MCP clients (Claude Code, Cursor) vs the built-in BrowserOS agent in PostHog. - register-mcp.ts: source='mcp' (browser tools via MCP endpoint) - register-klavis-mcp.ts: source='mcp' (Klavis tools via MCP endpoint) - tool-adapter.ts: source='chat' (browser tools via chat agent) - ai-sdk-agent.ts: source='chat' (Klavis/external MCP tools via chat agent, previously untracked) - filesystem/utils.ts: source='chat' (filesystem tools via chat agent)	2026-03-24 02:03:18 +05:30
shivammittal274	65547c60c0	fix(eval): clean up eval configs and add test-clado-api script (#540 ) Consolidate 13 configs down to 7 with uniform settings: - 3 weekly (CI): browseros-agent, browseros-oe-agent, browseros-oe-clado - 4 test (local): test_gemini-computer-use, test_yutori-navigator, test_webvoyager, test_mind2web - All configs: headless=false, captcha block, full browseros ports, restart_server_per_task Deleted: debug-test, mind2web-test, tool-loop-test, orchestrator-executor-test, orchestrator-executor-clado-test, fireworks-minimax-m2, webvoyager-test Added: test-clado-api.ts script, browseros-oe-agent-weekly.json (OE with AI SDK executor)	2026-03-24 01:28:05 +05:30
shivammittal274	0babc05077	feat(eval): NopeCHA CAPTCHA solver integration (#537 ) * feat(eval): show mean score instead of pass/fail in report and viewer * feat(eval): integrate NopeCHA CAPTCHA solver into eval pipeline Add CAPTCHA detection and waiting so screenshots capture post-solve state. Run headed with xvfb on CI since headless breaks extension content scripts. - Add CaptchaWaiter module (detect reCAPTCHA/hCaptcha/Turnstile, poll until solved) - Add optional `captcha` config block to EvalConfigSchema - Wait for CAPTCHA solve before screenshot in single-agent and orchestrator-executor - Patch NopeCHA manifest with API key before launching workers - Fix CAPTCHA_EXT_DIR path (was pointing one level too high) - Remove --incognito (extensions don't run in incognito; fresh user-data-dir isolates) - CI: install xvfb, run headed via xvfb-run, pass NOPECHA_API_KEY secret	2026-03-24 00:14:16 +05:30
Nikhil	1270b5b55c	feat: new manifest perms (#536) * feat: new manifest perms * fix: minor * fix: minor	2026-03-23 09:31:07 -07:00
Nikhil	e97d8bc1cb	fix: remove daily rate-limit middleware (#535 ) * fix: remove daily rate-limit middleware The daily conversation rate limit is no longer needed. Remove the middleware, RateLimiter class, fetch-config, error type, shared constants, DB schema table, and integration tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove unused getDb() method No longer needed after rate-limiter removal. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 08:31:20 -07:00
Dani Akash	5109ca4347	feat: added scope in server error logs (#533) * feat: added scope in server error logs * fix: prevent double capture on chat request	2026-03-23 20:47:28 +05:30
shivammittal274	f14942c6f9	feat(eval): show mean score instead of pass/fail in report and viewer (#534)	2026-03-23 20:28:34 +05:30
Dani Akash	86ec88ed80	feat: sentry improvements (#532) * feat: process request record from sentry locally * feat: added analytics for logged in users	2026-03-23 19:45:28 +05:30
Dani Akash	4928b7e84b	fix: no current window and sentry context (#531) * fix: error reporting and better breadcrumbs * fix: lint issues	2026-03-23 18:46:39 +05:30
shivammittal274	94a1a701f6	fix(eval): include browser context in agent prompt (#530 ) The eval's single-agent was passing raw task.query as the prompt, without browser context (active tab URL, title). The agent didn't know which page it was on, causing it to ask "which website?" instead of browsing. Use formatUserMessage() (same as chat-service.ts) to include browser context in the prompt. Re-export formatUserMessage from agent/tool-loop.	2026-03-23 17:42:03 +05:30
Dani Akash	ecf2efa857	fix: add unlimited storage permission to agent (#529)	2026-03-23 17:36:26 +05:30
Nikhil	2b53daf641	fix: prevent deleted scheduled tasks from reappearing after sync (#518 ) * fix: prevent deleted scheduled tasks from reappearing after sync When a scheduled task was deleted, the sync function would see the remote job missing locally and re-add it, undoing the delete. Fix by tracking pending deletions in storage so the sync function deletes them from the backend instead of re-adding them locally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use read-modify-write for pending deletions to prevent concurrent clobber Re-read pendingDeletionStorage before write-back and only remove resolved IDs, preserving any new entries added by concurrent removeJob calls during the sync's network I/O. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 11:31:57 -07:00
shivammittal274	70be5c5c21	fix(eval): log agent errors in task progress for CI visibility (#523)	2026-03-21 23:33:19 +05:30
shivammittal274	f157436e7d	feat(eval): switch to Linux GitHub-hosted runner (#519 ) * feat(eval): switch to ubuntu-latest runner, add OE-Clado config - Switch workflow from self-hosted Mac Studio to ubuntu-latest - Install BrowserOS Linux .deb in CI (no self-hosted runner needed) - Add browseros-oe-clado-weekly.json config for orchestrator-executor - Fix report chart to show date+time (not just date) - Make BROWSEROS_BINARY configurable via env var * feat(eval): add NopeCHA captcha solver extension to eval runs - Auto-load NopeCHA extension in eval Chrome instances - Works in incognito + headless mode - CI workflow downloads NopeCHA before eval - extensions/ directory gitignored (downloaded at runtime) * feat(eval): per-config concurrency — different configs run in parallel * feat(eval): remove concurrency limit — all runs execute in parallel	2026-03-21 23:04:45 +05:30
Nikhil	ba7892322b	ci: run BrowserOS test suites on PRs (#514 ) * ci: run browseros tests on pull requests * refactor: rework 0320-github_action_for_tests based on feedback * refactor: rework 0320-github_action_for_tests based on feedback * chore: add CI artifacts to .gitignore Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove mikepenz/action-junit-report to fix check suite misattribution The JUnit report action creates check runs that GitHub associates with the CLA check suite instead of the Tests check suite, causing test reports to appear under "CLA Assistant" in the PR checks UI. Remove the action and rely on job status + step summary + artifact upload for test result visibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 09:46:36 -07:00
shivammittal274	4e90b4561a	feat(eval): weekly eval pipeline with R2 uploads and trend dashboard (#516) * feat(eval): weekly eval pipeline with R2 uploads and trend dashboard Add infrastructure for running weekly evaluations and tracking score trends over time: - Auto-generated output dirs: results/{config-name}/{timestamp}/ Each eval run gets its own timestamped folder, nothing is overwritten. - upload-run.ts: uploads eval results to Cloudflare R2. Supports uploading a specific run or all un-uploaded runs for a config. - weekly-report.ts: generates an interactive HTML dashboard from R2 data. Config dropdown, trend chart with hover tooltips, searchable runs table. Groups runs by config name. - viewer.html: client-facing 3-column run viewer (task list, screenshots with autoplay, agent stream with messages.jsonl). Shows performance grader axis breakdown with per-axis scores. - browseros-agent-weekly.json: weekly benchmark config (kimi-k2p5, webbench-2of4-50, 10 workers, performance grader, headless). - eval-weekly.yml: GitHub Actions workflow with cron (Saturday 6am) and manual trigger. Runs on self-hosted Mac Studio runner. Concurrency group ensures only one eval runs at a time. - Dashboard updates: load previous runs, messages.jsonl viewer, grade badges show percentages, async stream loading. - Grader updates: timeout 30min, max turns 100, DOM content verification guidance for performance grader. * fix(eval): address Greptile review — injection, nested dirs, escaping - Fix script injection in eval-weekly.yml: pass github.event.inputs through env var instead of interpolating into shell - Fix /api/runs to enumerate nested results/{config}/{timestamp}/ dirs - Fix /api/load-run to allow single-slash run names (config/timestamp) - Add HTML escaping for R2-sourced values in weekly-report.ts - Escape axis names in viewer.html renderAxesBreakdown * fix(eval): fix biome lint — non-null assertion, template literals * fix(eval): fix biome errors — replace var with let, fix inner function declaration * fix(eval): address Greptile P2 issues - isRunDir: check all subdirs for metadata.json, not just first 3 - eval-runner: guard configPath for dashboard-driven runs (fallback to 'eval') - load-run: default unknown termination_reason to 'failed' not 'completed' * feat(eval): make BROWSEROS_BINARY configurable via env var	2026-03-21 22:12:52 +05:30
shivammittal274	86eed82350	fix: lazy OAuth callback server with cancel+retry (Codex CLI pattern) (#515) The OAuth callback server on port 1455 was bound eagerly at startup, crashing the server if another BrowserOS instance was already running. Rewrite as a lazy class (OAuthCallbackServer) that: - Only binds port 1455 when the user initiates a ChatGPT Pro login - Sends GET /cancel to any existing server on the port first, then retries up to 5 times (follows Codex CLI's cancel+retry pattern) - Exposes /cancel endpoint so other instances/tools can cancel us - Releases the port after the OAuth callback arrives - Device-code providers (GitHub Copilot, Qwen) never touch port 1455 This allows running eval, dev instances, and multiple BrowserOS instances without port conflicts. OAuth login works on whichever instance initiates it; the others continue without OAuth.	2026-03-21 16:44:03 +05:30
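The cancel+retry flow in the commit above can be sketched roughly as follows. This is not the OAuthCallbackServer implementation, which the log does not show. The `PortDeps` interface, the `bindWithCancelRetry` name, and the 200 ms delay are assumptions made for illustration, and the sketch assumes the cancel request is sent once before the first bind attempt. Network access is injected so the control flow can be read and exercised without opening a socket.

```typescript
// Hypothetical dependencies; a real version would wrap an HTTP GET to /cancel
// on the callback port and an attempt to listen on that port.
interface PortDeps {
  sendCancel: () => Promise<void>;      // ask any current holder to release the port
  tryBind: () => Promise<boolean>;      // true if the port was bound successfully
  sleep: (ms: number) => Promise<void>; // pause between attempts
}

// Returns the number of the attempt that succeeded; throws if all attempts fail.
async function bindWithCancelRetry(
  deps: PortDeps,
  maxAttempts = 5,
  delayMs = 200, // assumed value, not taken from the source
): Promise<number> {
  // A failed cancel is ignored: most likely nobody is listening on the port.
  await deps.sendCancel().catch(() => undefined);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await deps.tryBind()) return attempt;
    if (attempt < maxAttempts) await deps.sleep(delayMs);
  }
  throw new Error(`port still in use after ${maxAttempts} attempts`);
}
```

Because this function would be called only when a login starts, an instance that never performs an OAuth login never competes for the port, which is the point of making the server lazy.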
Nikhil	be6ed22af4	test: fix BrowserOS tool test harness regressions (#513) * test: fix browseros tool test harness regressions * test: align working directory naming in page action tests	2026-03-20 12:05:39 -07:00
Nikhil	149cde118d	chore: bump server version, offset and patch for release (#512)	2026-03-20 11:45:12 -07:00
Nikhil	9bc5e666c4	feat: auto-discover server port via ~/.browseros/server.json (#504) * feat: auto-discover server port via ~/.browseros/server.json Server writes its port to ~/.browseros/server.json on startup so the CLI can auto-discover the server URL without requiring `browseros-cli init`. Discovery chain: BROWSEROS_URL env > config.yaml > server.json > error Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review feedback for PR #504 - Use synchronous unlinkSync in stop() since process.exit() fires immediately after, abandoning any pending async operations - Wrap writeServerConfig in try/catch so a write failure doesn't crash a healthy server for a convenience feature Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: type server discovery config and add version metadata Add ServerDiscoveryConfig interface to @browseros/shared and enrich server.json with server_version, browseros_version, and chromium_version. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: normalize URL from server.json for consistency All other URL sources (env var, config.yaml) pass through normalizeServerURL; apply the same to the server.json path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 11:37:00 -07:00
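The discovery chain in the commit above (BROWSEROS_URL env > config.yaml > server.json > error) can be sketched as a simple precedence function. Everything here is illustrative: the `DiscoverySources` shape, the `serverJsonPort` field, the 127.0.0.1 host, and the body of `normalizeServerUrl` are assumptions. The log names a `normalizeServerURL` function but does not show what it does, so a plausible stand-in is used.

```typescript
// Hypothetical inputs, already read from their respective sources.
interface DiscoverySources {
  envUrl?: string;         // value of the BROWSEROS_URL environment variable
  configUrl?: string;      // server URL from config.yaml
  serverJsonPort?: number; // port recorded in ~/.browseros/server.json (field name assumed)
}

// Stand-in normalizer: adds a scheme if missing and strips trailing slashes.
function normalizeServerUrl(raw: string): string {
  const trimmed = raw.trim();
  const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `http://${trimmed}`;
  return withScheme.replace(/\/+$/, "");
}

// Applies the precedence order; every source goes through the same normalizer.
function resolveServerUrl(src: DiscoverySources): string {
  const fromServerJson =
    src.serverJsonPort !== undefined ? `http://127.0.0.1:${src.serverJsonPort}` : undefined;
  const candidate = src.envUrl || src.configUrl || fromServerJson;
  if (!candidate) {
    throw new Error("no server URL found in BROWSEROS_URL, config.yaml, or server.json");
  }
  return normalizeServerUrl(candidate);
}
```

Routing all three sources through one normalizer mirrors the last fix in the commit, where the server.json path was brought in line with the env var and config.yaml paths.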
Nikhil	2271277b4d	feat: add voice input to new tab search bar (#509) * feat: add voice recording UI with waveform overlay to new tab search bar Add a microphone button to the NewTab search bar that opens a fullscreen recording overlay powered by react-voice-visualizer. The overlay shows a real-time waveform visualization during recording, recording time, and a stop button. On completion, the audio is transcribed via the existing gateway endpoint and the transcript auto-navigates to inline chat. Changes: - Extract transcribeAudio() to shared lib/voice/transcribe-audio.ts - Add VoiceRecordingOverlay component with react-voice-visualizer - Add Mic button to NewTab search bar - Track analytics via existing NEWTAB_VOICE_* events - Handle cancel (backdrop click) vs submit (stop button) correctly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address PR review comments for voice recording overlay - Reset processingRef on transcription error to prevent stuck state - Use stable callback refs to prevent useEffect re-runs from inline arrow function props (fixes timer reset and unnecessary re-processing) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: replace voice overlay with inline sidepanel-style voice UI Remove react-voice-visualizer dependency and VoiceRecordingOverlay. Instead use the same inline voice pattern as the sidepanel ChatInput: - Waveform bars replace the search input during recording - Mic/stop/loading button states in the search bar - Transcript populates the search input on completion - Voice error shown inline below the search bar Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 11:33:01 -07:00


92 Commits