diff --git a/migration/LOGIC_INCONSISTENCIES_BY_PACKAGE.md b/migration/LOGIC_INCONSISTENCIES_BY_PACKAGE.md new file mode 100644 index 0000000..eba66f2 --- /dev/null +++ b/migration/LOGIC_INCONSISTENCIES_BY_PACKAGE.md @@ -0,0 +1,358 @@ +# AIPex vs new-aipex: Logic Inconsistencies by Package + +> **Purpose**: This document enumerates every confirmed logic/functionality gap between the legacy `aipex/` codebase and the new `new-aipex/packages/*` architecture. Each entry includes evidence paths, impact assessment, suggested migration target, and priority. + +--- + +## Baseline + + +| Codebase | Root Path | +| ---------------------- | ---------------------- | +| Legacy (full-featured) | `aipex/` | +| New (restructured) | `new-aipex/packages/*` | + + +**Focus areas**: Tools, Context/Summarization, Skill system, UI components, Use-cases, Hosted services (auth, uploads, version-check). + +--- + +## 1. `packages/core` (`@aipexstudio/aipex-core`) + +### 1.1 Conversation Compression/Summarization Strategy Differs Significantly + +**Status**: ⚠️ Acceptable Difference | **No action planned** + + +| Aspect | Legacy | New | +| --------------------------- | -------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| Summary prompt | High-density structured Markdown prompt (`aipex/src/lib/context/context-optimizer.ts`) | Simple character-length-capped summarizer instruction (`new-aipex/packages/core/src/conversation/compressor.ts`) | +| Trigger condition | Real `totalTokens` from `BackgroundContextManager.getTokenUsage()` hitting watermark | Item count **or** optional token watermark | +| Tool-pair boundary handling | Explicit `adjustProtectedBoundary()` to avoid splitting assistant↔tool pairs | `expandForToolCallClosure()` exists but logic is simpler | + + +**Impact**: Long sessions may lose critical context more aggressively in new 
architecture. + +**Priority**: N/A — Closed + +**Migration target**: N/A + +**Resolution**: The new architecture's compression approach is intentionally simpler and acceptable for the current use case. The `expandForToolCallClosure()` provides adequate tool-pair protection. No migration required. + +--- + +### 1.2 Token Usage Tracking & UI Hook Missing + +**Status**: ✅ Resolved + + +| Aspect | Legacy | New | +| --------------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- | +| Usage recording | `BackgroundContextManager.recordUsage()` aggregates real API usage | `AIPex.runExecution()` emits `metrics_update` event with `AgentMetrics`; `Session.addMetrics()` aggregates into `SessionMetrics` | +| UI integration | `TokenUsageIndicator.tsx` consumes usage | `TokenUsageIndicator` component in `@aipexstudio/aipex-react` consumes `useChatContext().metrics` | + + +**Impact**: ~~Cannot display real-time token consumption in UI.~~ Resolved. 
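The event flow described in the table above amounts to a fold over agent events. A minimal sketch, assuming simplified stand-ins for the real `AgentMetrics`/`AgentEvent` types in `@aipexstudio/aipex-core` (the field names here are illustrative, not the actual shape):

```typescript
// Simplified stand-ins; the real types live in @aipexstudio/aipex-core's
// types.ts and carry more fields than shown here.
interface AgentMetrics {
  inputTokens: number;
  outputTokens: number;
}

type AgentEvent =
  | { type: "metrics_update"; metrics: AgentMetrics; sessionId?: string }
  | { type: "text_delta"; delta: string };

// Fold metrics_update events into a running session total, roughly what a
// UI hook does before handing the numbers to a token-usage indicator.
function aggregateMetrics(events: AgentEvent[]): AgentMetrics {
  return events.reduce<AgentMetrics>(
    (acc, event) =>
      event.type === "metrics_update"
        ? {
            inputTokens: acc.inputTokens + event.metrics.inputTokens,
            outputTokens: acc.outputTokens + event.metrics.outputTokens,
          }
        : acc,
    { inputTokens: 0, outputTokens: 0 },
  );
}
```

Keeping the aggregation pure like this is what lets `Session.addMetrics()`-style accumulation be tested without a live provider.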
**Priority**: P1 (Completed) **Migration target**: `packages/core` + `packages/aipex-react` + `packages/browser-ext` **Evidence**: - Core: `types.ts` — `AgentEvent.metrics_update` now includes optional `sessionId` - Core: `aipex.ts` — yields `{ type: "metrics_update", metrics, sessionId }` on success and error paths - React: `use-chat.ts` — exposes `metrics: AgentMetrics | null` in return value and processes `metrics_update` events - React: `context.ts` — `ChatContextValue` includes `metrics` field - React: `components/chatbot/components/token-usage-indicator.tsx` — new component with compact/full modes - Browser-ext: `browser-chat-header.tsx` — integrates `<TokenUsageIndicator />` in header --- ### 1.3 MCP System Absent **Status**: ⚠️ Superseded | **No action needed** | Legacy | New | | ------------------------------------------------------------------- | ---------------------------------------------- | | `aipex/src/mcp/*` (UnifiedToolManager, tool converters, MCP server) | No `mcp/` directory; 0 matches for `**/mcp/**` | **Impact**: Dynamic tool registration / MCP-to-OpenAI conversion path does not exist. **Priority**: N/A — Closed **Migration target**: N/A **Resolution**: The new architecture provides direct tool registration via `@aipexstudio/aipex-core` tool definitions passed to `AIPex.create()`. The MCP abstraction layer is superseded by this simpler pattern; no migration required. --- ## 2. `packages/browser-runtime` (`@aipexstudio/browser-runtime`) ### 2.1 Default `allBrowserTools` Surface Area Reduced + README Drift | Aspect | Legacy | New | | ------------------------------ | ------------- | ------------------------------------------------------------------------------------------------ | | Approx. 
tool count | 70+ MCP tools | 32 tools in `allBrowserTools` | +| Disabled tools (code comments) | N/A | `switch_to_tab`, `duplicate_tab`, `wait`, `capture_screenshot_to_clipboard`, `download_text_as_markdown`, `download_current_chat_images` | +| README claim | N/A | "31 tools" – contradicts code | + + +**Evidence**: + +- New: `new-aipex/packages/browser-runtime/src/tools/index.ts` (lines 31-88) +- README: `new-aipex/packages/browser-runtime/README.md` (line 26, 40-45) + +**Impact**: Many commonly-used tools unavailable; documentation misleading. + +**Priority**: P0 + +**Migration target**: `packages/browser-runtime` + +--- + +### 2.2 Tool Implementations Exist but Not Exported by Default + + +| Category | Implemented Path | Exported in `allBrowserTools`? | +| -------------------------- | -------------------------------------------- | ------------------------------------------------------------- | +| Clipboard | `src/tools/tools/clipboard/index.ts` | No | +| Context Menus | `src/tools/tools/context-menus/index.ts` | No | +| Downloads (extended) | `src/tools/tools/downloads/index.ts` | Partial (`downloadImageTool`, `downloadChatImagesTool` only) | +| Extensions | `src/tools/tools/extensions/index.ts` | No | +| Sessions | `src/tools/tools/sessions/index.ts` | No | +| Tab Groups | `src/tools/tools/tab-groups/index.ts` | No (only `organizeTabsTool`, `ungroupTabsTool` from `tab.ts`) | +| Window Management | `src/tools/tools/window-management/index.ts` | No | +| Bookmarks | `src/tools/bookmark.ts` | No | +| History | `src/tools/history.ts` | No | +| Snapshot (`take_snapshot`) | `src/tools/snapshot.ts` | No (intentional, internal use) | + + +**Impact**: Features exist in code but are invisible to the extension/agent. 
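One low-risk way to surface these implemented-but-hidden tools is the risk-tiered bundling that the Security Review Card at the end of this document calls for. A hypothetical sketch; the tool names are real, but the bundle names and `riskLevel` field are assumptions, not the current `index.ts` layout:

```typescript
// Assumed minimal tool shape; real definitions come from the runtime package.
interface BrowserTool {
  name: string;
  riskLevel: "core" | "browser" | "system";
}

const coreTools: BrowserTool[] = [
  { name: "get_all_tabs", riskLevel: "core" },
  { name: "get_current_tab", riskLevel: "core" },
];
const browserTools: BrowserTool[] = [
  { name: "list_bookmarks", riskLevel: "browser" },
  { name: "get_recent_history", riskLevel: "browser" },
];
const systemTools: BrowserTool[] = [
  { name: "copy_to_clipboard", riskLevel: "system" },
  { name: "set_extension_enabled", riskLevel: "system" },
];

// High-risk tools stay off unless the caller opts in explicitly, so the
// default surface can grow without widening the attack surface by default.
function buildToolSurface(
  opts: { enableSystemTools?: boolean } = {},
): BrowserTool[] {
  return [
    ...coreTools,
    ...browserTools,
    ...(opts.enableSystemTools ? systemTools : []),
  ];
}
```

Callers that need clipboard or extension-management tools would pass `{ enableSystemTools: true }` when assembling their agent config.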
+ +**Priority**: P0 + +**Migration target**: `packages/browser-runtime/src/tools/index.ts` + +--- + +### 2.3 `organize_tabs` Is a Stub (AI Grouping Disabled) + +**Status**: ⚠️ Mitigated | **Tool removed from default bundle** + + +| Legacy | New | +| ------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------- | +| `aipex/src/mcp-servers/tab-groups.ts` → `groupTabsByAI()` with full LLM prompt | `new-aipex/packages/browser-runtime/src/tools/tab.ts` → returns `{ success: false, message: "...requires additional implementation..." }` | + + +**Impact**: ~~Core feature (smart tab grouping) non-functional.~~ Tool is no longer exposed to users. + +**Priority**: N/A — Mitigated + +**Migration target**: N/A (tool removed from `allBrowserTools`) + +**Resolution**: The `organize_tabs` tool has been removed from `allBrowserTools` in `packages/browser-runtime/src/tools/index.ts`. The implementation code is retained for future completion of AI-powered tab grouping. The tool is listed in the "Disabled tools" comment block. + +--- + +### 2.4 Tab-Group Tool Naming Inconsistency + +**Status**: ✅ Resolved + + +| Tool | Legacy name | New (tab.ts) | New (tab-groups/index.ts) | +| ----------- | -------------- | -------------- | ------------------------- | +| Ungroup all | `ungroup_tabs` | `ungroup_tabs` | `ungroup_tabs` | + + +**Impact**: ~~Skill scripts / prompts referencing old names may break.~~ Resolved. + +**Priority**: N/A — Resolved + +**Migration target**: N/A + +**Resolution**: The `ungroupAllTabsTool` in `packages/browser-runtime/src/tools/tools/tab-groups/index.ts` has been renamed from `ungroup_all_tabs` to `ungroup_tabs` for consistency with the legacy naming convention. A comment has been added warning against registering both tools simultaneously to avoid duplicate name conflicts. 
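The duplicate-name hazard mentioned in the resolution above can be made fail-fast at registration time. A sketch of a hypothetical helper (not an existing API in the runtime package):

```typescript
// Throws at startup if two tools share a name, e.g. if both ungroup_tabs
// variants end up registered at once.
function assertUniqueToolNames(tools: { name: string }[]): void {
  const seen = new Set<string>();
  for (const tool of tools) {
    if (seen.has(tool.name)) {
      throw new Error(`Duplicate tool name registered: ${tool.name}`);
    }
    seen.add(tool.name);
  }
}
```

Running this once over the final bundle turns a silent model-facing conflict into an immediate, debuggable error.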
+ +--- + +### 2.5 Bookmark/History/Window Tool Naming Changed + Default Off + + +| Category | Legacy names (sample) | New names (sample) | Default exported? | +| --------- | ------------------------------------------------------------- | ----------------------------------------------------------------------------------- | ----------------- | +| Bookmarks | `get_all_bookmarks`, `get_bookmark_folders` | `list_bookmarks`, `search_bookmarks`, `create_bookmark_folder` | No | +| History | `get_recent_history`, `search_history` | Same | No | +| Windows | `get_all_windows`, `minimize_window`, `maximize_window`, etc. | `get_all_windows`, `switch_to_window`, `create_new_window`, `close_window` (subset) | No | + + +**Impact**: Prompts/scripts using legacy names fail; features hidden. + +**Priority**: P1 + +**Migration target**: `packages/browser-runtime` + +--- + +### 2.6 Page Content Tools Missing + + +| Legacy | New | +| ------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------- | +| `get_page_images`, `get_page_performance`, `get_page_accessibility` (`aipex/src/mcp-servers/page-content.ts`) | Not found in `new-aipex/packages/browser-runtime/src/tools/page.ts` | + + +**Impact**: Accessibility audits, performance checks unavailable. 
+ +**Priority**: P2 + +**Migration target**: `packages/browser-runtime` + +--- + +### 2.7 Voice Input Degraded to Web Speech API Only + + +| Legacy | New | +| --------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- | +| Three-tier: Server STT → ElevenLabs → Web Speech (`aipex/src/lib/voice/voice-input-manager.ts`, `aipex/src/interventions/implementations/voice-input.ts`) | Web Speech API only (`new-aipex/packages/browser-runtime/src/intervention/implementations/voice-input.ts`) | + + +**Impact**: Non-BYOK users lose server-side STT; BYOK users lose ElevenLabs path. + +**Priority**: P1 + +**Migration target**: `packages/browser-runtime` + +--- + +### 2.8 Skill System: `refreshSkillMetadata()` Missing + +**Status**: ✅ Resolved + + +| Legacy | New | +| ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------- | +| `aipex/src/skill/lib/services/skill-manager.ts` has `refreshSkillMetadata()` | `new-aipex/packages/browser-runtime/src/skill/lib/services/skill-manager.ts` now has `refreshSkillMetadata()` | + + +**Impact**: ~~Skill metadata may become stale after updates.~~ Resolved. + +**Priority**: N/A — Resolved + +**Migration target**: N/A + +**Resolution**: The `refreshSkillMetadata(skillId: string)` method has been ported to the new `SkillManager`. It reads `SKILL.md` from ZenFS, parses frontmatter, updates IndexedDB metadata via `skillStorage.updateSkill()`, refreshes the registry cache via `skillRegistry.updateSkill()`, and emits a `skill_loaded` event with type `skill_metadata_refreshed`. Path traversal is guarded by rejecting skill IDs containing `/`, `\\`, or `..`. + +--- + +## 3. 
`packages/aipex-react` (`@aipexstudio/aipex-react`) + +### 3.1 UI Components Missing + +**Status**: ✅ Resolved + + +| Component | Legacy path | New status | +| --------------------- | ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------- | +| `TokenUsageIndicator` | `aipex/src/lib/components/chatbot/TokenUsageIndicator.tsx` | Found (`packages/aipex-react/src/components/chatbot/components/token-usage-indicator.tsx`) | +| `AuthProvider` | `aipex/src/lib/components/auth/AuthProvider.tsx` | Moved to browser-ext (`packages/browser-ext/src/auth/AuthProvider.tsx`) | +| `VoiceInput` (UI) | `aipex/src/lib/components/voice-mode/voice-input.tsx` | Found (`packages/aipex-react/src/components/voice/VoiceInput.tsx`) | + + +**Impact**: ~~Token monitor, login/user-state, voice-mode UI unavailable.~~ Resolved. + +**Priority**: N/A — Resolved + +**Migration target**: N/A + +**Resolution**: +- `TokenUsageIndicator` was already migrated and is exported from `@aipexstudio/aipex-react/components/chatbot`. +- `AuthProvider` and `useAuth` now live in `packages/browser-ext/src/auth/` since authentication logic requires browser-specific Chrome APIs (cookies, tabs, scripting). +- `VoiceInput` (3D particle UI + VAD + STT) migrated to `packages/aipex-react/src/components/voice/` with supporting voice engine code in `packages/aipex-react/src/lib/voice/`. + +--- + +## 4. `packages/browser-ext` (Extension Assembly) + +### 4.1 Tool Surface Defined Entirely by `allBrowserTools` + +- Extension agent config (`new-aipex/packages/browser-ext/src/lib/browser-agent-config.ts`) uses `allBrowserTools` directly. +- Any tool not in that bundle is invisible. + +**Impact**: See section 2.1 / 2.2. + +**Priority**: Addressed via P0 items above. 
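Because the extension's tool surface is exactly `allBrowserTools`, re-enabling focus-changing tools there implies the `automationMode` gating that the Security Review Card calls for. A hypothetical wrapper; the flag name matches the audit's terminology, but the helper and result shape are assumptions:

```typescript
// Generic success/failure envelope in the success: false style the tools use.
interface ToolResult<T> {
  success: boolean;
  message?: string;
  result?: T;
}

// Gate a focus-changing operation (e.g. switch_to_tab) behind automationMode
// so it fails as a business-level error rather than stealing focus.
function gateOnAutomationMode<T>(
  ctx: { automationMode: boolean },
  run: () => T,
): ToolResult<T> {
  if (!ctx.automationMode) {
    return {
      success: false,
      message: "This tool changes focus and requires automationMode",
    };
  }
  return { success: true, result: run() };
}
```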
+ +--- + +### 4.2 Legacy Services Not Migrated + + +| Service | Legacy path | New status | +| ---------------------- | --------------------------------------------- | ---------- | +| `version-checker.ts` | `aipex/src/lib/services/version-checker.ts` | Not found | +| `web-auth.ts` | `aipex/src/lib/services/web-auth.ts` | Not found | +| `recording-upload.ts` | `aipex/src/lib/services/recording-upload.ts` | Not found | +| `screenshot-upload.ts` | `aipex/src/lib/services/screenshot-upload.ts` | Not found | +| `user-manuals-api.ts` | `aipex/src/lib/services/user-manuals-api.ts` | Not found | +| `replay-controller.ts` | `aipex/src/lib/services/replay-controller.ts` | Not found | + + +**Impact**: Hosted login, version check, upload, manual retrieval, replay all missing. + +**Priority**: P1 (auth) / P2 (others) + +**Migration target**: `packages/browser-ext` or new `packages/services` + +--- + +## 5. `packages/dom-snapshot` (`@aipexstudio/dom-snapshot`) + + +| Status | Notes | +| ------------ | ------------------------------------------------------------------------------------------------------------------------------ | +| ✅ Consistent | Legacy `aipex/src/experimental/dom-automation/snapshot/*` is a compatibility wrapper around the new package. No action needed. | + + +--- + +## 6. `packages/use-cases` (Planned but Not Created) + + +| Legacy | New | +| --------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | +| `aipex/src/use-cases/*` (User Guide Generator, batch jobs, e2e testing templates) | `new-aipex/packages/use-cases/` does not exist; only mentioned in `MIGRATION_STRATEGY.md` | + + +**Impact**: High-level workflow templates (screen recording → GIF/PDF export) unavailable. 
+ +**Priority**: P0 + +**Migration target**: Create `packages/use-cases` + +--- + +## Migration Priority Summary + + +| Priority | Items | +| -------- | ---------------------------------------- | +| P0 | 2.1, 2.2, 6 | +| P1 | ~~1.1~~, ~~1.2~~ (completed), ~~2.4~~ (resolved), 2.5, 2.7, ~~3.1~~ (resolved), 4.2 (auth) | +| P2 | ~~1.3~~ (superseded), 2.6, ~~2.8~~ (resolved), 4.2 (non-auth) | +| Closed | 1.1 (acceptable difference), 1.2 (resolved), 1.3 (superseded), 2.3 (mitigated), 2.4 (resolved), 2.8 (resolved), 3.1 (resolved) | + + +--- + +## Security Review Card (Before Re-enabling High-Risk Tools) + +- Threat snapshot updated (entry points, trust boundaries, sensitive data) +- Tool input validated against schema with allowlists/bounds +- No tokens/PII written to logs +- BYOK token stored securely (chrome.storage.local, masked in UI) +- High-risk tools (storage/clipboard/extensions/downloads) default-off with explicit opt-in +- Minimal regression tests added (tool invocation, permission-denied scenarios, background vs focus mode) diff --git a/migration/TOOL_SURFACE_AUDIT.md b/migration/TOOL_SURFACE_AUDIT.md new file mode 100644 index 0000000..fdef624 --- /dev/null +++ b/migration/TOOL_SURFACE_AUDIT.md @@ -0,0 +1,297 @@ +# Tool Surface Audit: aipex vs new-aipex + +> **Purpose**: Compare legacy MCP tool set (`aipex/src/mcp/index.ts`) with new browser-runtime default tool bundle (`new-aipex/packages/browser-runtime/src/tools/index.ts`). 
+ +--- + +## Summary + +| Metric | Count | +|--------|-------| +| Legacy MCP tools | ~82 | +| New `allBrowserTools` (default) | 32 | +| New tools implemented but NOT registered | ~45 | +| Tools completely missing in new | ~15 | + +--- + +## Legend + +| Status | Meaning | +|--------|---------| +| ✅ Registered | Included in `allBrowserTools` | +| 🔧 Implemented (not registered) | Code exists in `browser-runtime` but not in default bundle | +| ❌ Missing | No implementation found in new codebase | +| 🔄 Renamed | Same functionality with different name | +| ⚠️ Stub | Implementation exists but returns failure/placeholder | + +--- + +## Tool Comparison by Category + +### 1. Tab Management + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_tabs` | ✅ Registered | `get_all_tabs` | | +| `get_current_tab` | ✅ Registered | `get_current_tab` | | +| `switch_to_tab` | ❌ Disabled | `switch_to_tab` (exists in tab.ts) | Commented out: "causes context switching issues" | +| `organize_tabs` | ⚠️ Stub | `organize_tabs` | Returns `success: false` - needs `groupTabsByAI()` migration | +| `ungroup_tabs` | ✅ Registered | `ungroup_tabs` | | +| `create_new_tab` | ✅ Registered | `create_new_tab` | | +| `get_tab_info` | ✅ Registered | `get_tab_info` | | +| `close_tab` | ✅ Registered | `close_tab` | | + +### 2. Tab Groups + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_tab_groups` | 🔧 Implemented | `tools/tab-groups/index.ts` | Not in default bundle | +| `create_tab_group` | 🔧 Implemented | `tools/tab-groups/index.ts` | Not in default bundle | +| `update_tab_group` | 🔧 Implemented | `tools/tab-groups/index.ts` | Not in default bundle | +| `ungroup_all_tabs` | 🔧 Implemented | `tools/tab-groups/index.ts` | Naming conflict with `ungroup_tabs` | + +### 3. 
Bookmarks + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_bookmarks` | 🔄 Renamed | `list_bookmarks` | `tools/bookmark.ts` - not registered | +| `get_bookmark_folders` | ❌ Missing | - | Replaced by `create_bookmark_folder` | +| `create_bookmark` | 🔧 Implemented | `create_bookmark` | `tools/bookmark.ts` - not registered | +| `delete_bookmark` | 🔧 Implemented | `delete_bookmark` | `tools/bookmark.ts` - not registered | +| `search_bookmarks` | 🔧 Implemented | `search_bookmarks` | `tools/bookmark.ts` - not registered | +| - | 🔧 New | `get_bookmark` | New tool, not in legacy | +| - | 🔧 New | `update_bookmark` | New tool, not in legacy | +| - | 🔧 New | `create_bookmark_folder` | New tool, not in legacy | +| - | 🔧 New | `delete_bookmark_folder` | New tool, not in legacy | + +### 4. History + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_recent_history` | 🔧 Implemented | `get_recent_history` | `tools/history.ts` - not registered | +| `search_history` | 🔧 Implemented | `search_history` | `tools/history.ts` - not registered | +| `delete_history_item` | 🔧 Implemented | `delete_history_item` | `tools/history.ts` - not registered | +| `clear_history` | 🔧 Implemented | `clear_history` | `tools/history.ts` - not registered | +| - | 🔧 New | `get_most_visited_sites` | New tool, not in legacy | +| - | 🔧 New | `get_history_stats` | New tool, not in legacy | + +### 5. 
Window Management + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_windows` | 🔧 Implemented | `get_all_windows` | `tools/window-management/index.ts` - not registered | +| `get_current_window` | 🔧 Implemented | `get_current_window` | `tools/window-management/index.ts` - not registered | +| `close_window` | 🔧 Implemented | `close_window` | `tools/window-management/index.ts` - not registered | +| `minimize_window` | ❌ Missing | - | TODO in window-management/index.ts | +| `maximize_window` | ❌ Missing | - | TODO in window-management/index.ts | +| `restore_window` | ❌ Missing | - | TODO in window-management/index.ts | +| `update_window` | ❌ Missing | - | TODO in window-management/index.ts | +| `arrange_windows_in_grid` | ❌ Missing | - | TODO in window-management/index.ts | +| `cascade_windows` | ❌ Missing | - | TODO in window-management/index.ts | +| - | 🔧 New | `switch_to_window` | New tool with automationMode gating | +| - | 🔧 New | `create_new_window` | New tool with automationMode gating | + +### 6. 
Clipboard + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `copy_to_clipboard` | 🔧 Implemented | `copy_to_clipboard` | `tools/clipboard/index.ts` - not registered | +| `read_from_clipboard` | 🔧 Implemented | `read_from_clipboard` | `tools/clipboard/index.ts` - not registered | +| `copy_current_page_url` | 🔧 Implemented | `copy_current_page_url` | `tools/clipboard/index.ts` - not registered | +| `copy_current_page_title` | 🔧 Implemented | `copy_current_page_title` | `tools/clipboard/index.ts` - not registered | +| `copy_selected_text` | 🔧 Implemented | `copy_selected_text` | `tools/clipboard/index.ts` - not registered | +| `copy_page_as_markdown` | 🔧 Implemented | `copy_page_as_markdown` | `tools/clipboard/index.ts` - not registered | +| `copy_page_as_text` | 🔧 Implemented | `copy_page_as_text` | `tools/clipboard/index.ts` - not registered | +| `copy_page_links` | 🔧 Implemented | `copy_page_links` | `tools/clipboard/index.ts` - not registered | +| `copy_page_metadata` | 🔧 Implemented | `copy_page_metadata` | `tools/clipboard/index.ts` - not registered | + +### 7. Storage + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_storage_value` | ❌ Missing | - | | +| `set_storage_value` | ❌ Missing | - | | +| `remove_storage_value` | ❌ Missing | - | | +| `get_all_storage_keys` | ❌ Missing | - | | +| `clear_all_storage` | ❌ Missing | - | | +| `get_extension_settings` | ❌ Missing | - | | +| `update_extension_settings` | ❌ Missing | - | | +| `get_ai_config` | ❌ Missing | - | | +| `set_ai_config` | ❌ Missing | - | | +| `export_storage_data` | ❌ Missing | - | | +| `import_storage_data` | ❌ Missing | - | | +| `get_storage_stats` | ❌ Missing | - | | + +### 8. 
Utilities + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_browser_info` | ❌ Missing | - | | +| `get_system_info` | ❌ Missing | - | | +| `get_current_datetime` | ❌ Missing | - | | +| `format_timestamp` | ❌ Missing | - | | +| `generate_random_string` | ❌ Missing | - | | +| `validate_url` | ❌ Missing | - | | +| `extract_domain` | ❌ Missing | - | | +| `get_url_parameters` | ❌ Missing | - | | +| `build_url` | ❌ Missing | - | | +| `get_text_stats` | ❌ Missing | - | | +| `convert_text_case` | ❌ Missing | - | | +| `check_permissions` | ❌ Missing | - | | +| `wait` | ❌ Deprecated | - | Replaced by `computer` tool's wait action | + +### 9. Extensions + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_extensions` | 🔧 Implemented | `get_all_extensions` | `tools/extensions/index.ts` - not registered | +| `get_extension` | 🔧 Implemented | `get_extension` | `tools/extensions/index.ts` - not registered | +| `set_extension_enabled` | 🔧 Implemented | `set_extension_enabled` | `tools/extensions/index.ts` - not registered | +| `uninstall_extension` | 🔧 Implemented | `uninstall_extension` | `tools/extensions/index.ts` - not registered | +| `get_extension_permissions` | ❌ Missing | - | | + +### 10. 
Downloads + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_downloads` | ❌ Missing | - | | +| `get_download` | ❌ Missing | - | | +| `pause_download` | ❌ Missing | - | | +| `resume_download` | ❌ Missing | - | | +| `cancel_download` | ❌ Missing | - | | +| `remove_download` | ❌ Missing | - | | +| `open_download` | ❌ Missing | - | | +| `show_download_in_folder` | ❌ Missing | - | | +| `get_download_stats` | ❌ Missing | - | | +| `download_text_as_markdown` | ❌ Disabled | - | Disabled in index.ts | +| `download_image` | ✅ Registered | `download_image` | `tools/downloads/index.ts` | +| `download_chat_images` | ✅ Registered | `download_chat_images` | `tools/downloads/index.ts` | +| `download_current_chat_images` | ❌ Disabled | - | Disabled in index.ts | + +### 11. Sessions + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_all_sessions` | 🔧 Implemented | `get_all_sessions` | `tools/sessions/index.ts` - not registered | +| `get_session` | 🔧 Implemented | `get_session` | `tools/sessions/index.ts` - not registered | +| `restore_session` | 🔧 Implemented | `restore_session` | `tools/sessions/index.ts` - not registered | +| `get_current_device` | 🔧 Implemented | `get_current_device` | `tools/sessions/index.ts` - not registered | +| `get_all_devices` | 🔧 Implemented | `get_all_devices` | `tools/sessions/index.ts` - not registered | + +### 12. 
Context Menus + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `create_context_menu_item` | 🔧 Implemented | - | `tools/context-menus/index.ts` - not registered | +| `update_context_menu_item` | 🔧 Implemented | - | `tools/context-menus/index.ts` - not registered | +| `remove_context_menu_item` | 🔧 Implemented | - | `tools/context-menus/index.ts` - not registered | +| `remove_all_context_menu_items` | 🔧 Implemented | - | `tools/context-menus/index.ts` - not registered | +| `get_context_menu_items` | 🔧 Implemented | - | `tools/context-menus/index.ts` - not registered | + +### 13. Screenshots + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `capture_screenshot` | ✅ Registered | `capture_screenshot` | | +| `capture_screenshot_with_highlight` | ❌ Missing | - | | +| `capture_tab_screenshot` | ✅ Registered | `capture_tab_screenshot` | | +| `capture_screenshot_to_clipboard` | ❌ Disabled | - | Disabled in index.ts | +| `read_clipboard_image` | ❌ Missing | - | | +| `get_clipboard_image_info` | ❌ Missing | - | | + +### 14. Page Content + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `get_page_metadata` | ✅ Registered | `get_page_metadata` | | +| `get_page_images` | ❌ Missing | - | | +| `get_page_performance` | ❌ Missing | - | | +| `get_page_accessibility` | ❌ Missing | - | | +| `scroll_to_element` | ✅ Registered | `scroll_to_element` | | +| `highlight_element` | ✅ Registered | `highlight_element` | | +| `highlight_text_inline` | ✅ Registered | `highlight_text_inline` | | + +### 15. 
UI Operations + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `take_snapshot` | ❌ Internal | - | Not in allBrowserTools (internal use) | +| `search_elements` | ✅ Registered | `search_elements` | | +| `click` | ✅ Registered | `click` | | +| `fill_element_by_uid` | ✅ Registered | `fill_element_by_uid` | | +| `get_editor_value` | ✅ Registered | `get_editor_value` | | +| `fill_form` | ✅ Registered | `fill_form` | | +| `hover_element_by_uid` | ✅ Registered | `hover_element_by_uid` | | +| `click_by_xy` | ❌ Deprecated | - | Replaced by `computer` tool | +| `hover_by_xy` | ❌ Deprecated | - | Replaced by `computer` tool | +| `fill_by_xy` | ❌ Deprecated | - | Replaced by `computer` tool | +| `computer` | ✅ Registered | `computer` | Unified tool | + +### 16. Interventions + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `list_interventions` | ✅ Registered | `list_interventions` | | +| `get_intervention_info` | ✅ Registered | `get_intervention_info` | | +| `request_intervention` | ✅ Registered | `request_intervention` | | +| `cancel_intervention` | ✅ Registered | `cancel_intervention` | | + +### 17. Skills + +| Legacy Tool | New Status | New Name/Path | Notes | +|-------------|------------|---------------|-------| +| `load_skill` | ✅ Registered | `load_skill` | | +| `execute_skill_script` | ✅ Registered | `execute_skill_script` | | +| (other skill tools) | ✅ Registered | - | 6 skill tools total | + +--- + +## Recommendations + +### P0: High Priority (Blocking Core Functionality) + +1. **Register existing tool implementations** in `allBrowserTools`: + - Bookmarks (8 tools) + - History (6 tools) + - Window management (5 tools) + - Clipboard (9 tools) + - Sessions (5 tools) + - Extensions (4 tools) + - Tab groups (4 tools) + +2. **Fix `organize_tabs` stub** by migrating `groupTabsByAI()` from legacy + +3. 
**Enable disabled tools** with proper security controls: + - `switch_to_tab` (with automationMode gating) + - `download_text_as_markdown` + - `capture_screenshot_to_clipboard` + +### P1: Medium Priority + +1. **Implement missing storage tools** (12 tools) +2. **Implement missing utility tools** (12 tools) +3. **Implement missing download management tools** (9 tools) + +### P2: Low Priority + +1. **Implement missing window management tools** (minimize, maximize, arrange) +2. **Implement missing page content tools** (images, performance, accessibility) +3. **Implement missing screenshot tools** (clipboard image, highlight capture) + +--- + +## Security Considerations + +Before registering high-risk tools, implement: + +1. **Tool bundles by risk level**: + - `coreTools` (safe, always enabled) + - `browserTools` (moderate, enabled by default) + - `systemTools` (high-risk, requires opt-in) + +2. **Per-tool permission checks**: + - `automationMode` gating for focus-changing operations + - User consent for destructive operations (clear history, uninstall extension) + +3. 
**Rate limiting** for sensitive tools diff --git a/packages/aipex-react/package.json b/packages/aipex-react/package.json index 022c55f..0813bbe 100644 --- a/packages/aipex-react/package.json +++ b/packages/aipex-react/package.json @@ -127,6 +127,7 @@ "type": "module", "dependencies": { "@aipexstudio/aipex-core": "workspace:*", + "@ricky0123/vad-web": "^0.0.27", "@radix-ui/react-avatar": "^1.1.11", "@radix-ui/react-collapsible": "^1.1.12", "@radix-ui/react-dialog": "^1.1.15", @@ -153,6 +154,7 @@ "remark-gfm": "^4.0.1", "streamdown": "^2.1.0", "tailwind-merge": "^3.4.0", + "three": "^0.177.0", "tokenlens": "^1.3.1", "use-stick-to-bottom": "^1.1.3" }, @@ -172,9 +174,11 @@ "@testing-library/dom": "^10.4.0", "@testing-library/jest-dom": "^6.9.1", "@testing-library/react": "^16.3.0", + "@types/chrome": "0.1.32", "@types/react": "19.2.8", "@types/react-dom": "19.2.3", "@types/react-syntax-highlighter": "^15.5.13", + "@types/three": "^0.177.0", "jsdom": "^27.4.0", "react": "19.2.3", "react-dom": "19.2.3", diff --git a/packages/aipex-react/src/adapters/chat-adapter.test.ts b/packages/aipex-react/src/adapters/chat-adapter.test.ts index 01cb4f1..03a2448 100644 --- a/packages/aipex-react/src/adapters/chat-adapter.test.ts +++ b/packages/aipex-react/src/adapters/chat-adapter.test.ts @@ -388,7 +388,8 @@ describe("ChatAdapter", () => { state: "error", errorText: "failed", }); - expect(adapter.getStatus()).toBe("error"); + // Status should be streaming, not error - agent may continue after tool error + expect(adapter.getStatus()).toBe("streaming"); }); it("should handle multiple calls for same tool sequentially", () => { @@ -442,6 +443,119 @@ describe("ChatAdapter", () => { adapter.getMessages()[0]?.parts.filter((p) => p.type === "tool") ?? 
[]; expect(toolParts).toHaveLength(0); }); + + it("should mark tool as error when result has success: false", () => { + adapter.processEvent({ + type: "tool_call_start", + toolName: "organize_tabs", + params: {}, + }); + adapter.processEvent({ + type: "tool_call_complete", + toolName: "organize_tabs", + result: { + success: false, + error: "Cannot organize tabs in incognito window", + }, + }); + + const toolPart = adapter + .getMessages()[0] + ?.parts.find((p) => p.type === "tool"); + expect(toolPart).toMatchObject({ + toolName: "organize_tabs", + state: "error", + errorText: "Cannot organize tabs in incognito window", + }); + // Status should remain streaming (not error) since this is a business failure + expect(adapter.getStatus()).toBe("streaming"); + }); + + it("should use message field when error field is missing in success: false result", () => { + adapter.processEvent({ + type: "tool_call_start", + toolName: "screenshot", + params: {}, + }); + adapter.processEvent({ + type: "tool_call_complete", + toolName: "screenshot", + result: { success: false, message: "No active tab found" }, + }); + + const toolPart = adapter + .getMessages()[0] + ?.parts.find((p) => p.type === "tool"); + expect(toolPart).toMatchObject({ + state: "error", + errorText: "No active tab found", + }); + }); + + it("should show generic error message when success: false has no error/message", () => { + adapter.processEvent({ + type: "tool_call_start", + toolName: "failing_tool", + params: {}, + }); + adapter.processEvent({ + type: "tool_call_complete", + toolName: "failing_tool", + result: { success: false }, + }); + + const toolPart = adapter + .getMessages()[0] + ?.parts.find((p) => p.type === "tool"); + expect(toolPart).toMatchObject({ + state: "error", + errorText: "Operation failed", + }); + }); + + it("should keep output in tool part when marking as error for debugging", () => { + adapter.processEvent({ + type: "tool_call_start", + toolName: "api_call", + params: {}, + }); + 
adapter.processEvent({ + type: "tool_call_complete", + toolName: "api_call", + result: { + success: false, + error: "API rate limit exceeded", + details: { remaining: 0 }, + }, + }); + + const toolPart = adapter + .getMessages()[0] + ?.parts.find((p) => p.type === "tool") as UIToolPart | undefined; + expect(toolPart?.state).toBe("error"); + expect(toolPart?.errorText).toBe("API rate limit exceeded"); + expect(toolPart?.output).toEqual({ + success: false, + error: "API rate limit exceeded", + details: { remaining: 0 }, + }); + }); + + it("should not set overall status to error on tool_call_error", () => { + adapter.processEvent({ + type: "tool_call_start", + toolName: "search", + params: {}, + }); + adapter.processEvent({ + type: "tool_call_error", + toolName: "search", + error: new Error("Tool execution failed"), + }); + + // Status should be streaming, not error - agent may continue + expect(adapter.getStatus()).toBe("streaming"); + }); }); describe("reset", () => { diff --git a/packages/aipex-react/src/adapters/chat-adapter.ts b/packages/aipex-react/src/adapters/chat-adapter.ts index e670238..48888e5 100644 --- a/packages/aipex-react/src/adapters/chat-adapter.ts +++ b/packages/aipex-react/src/adapters/chat-adapter.ts @@ -165,7 +165,9 @@ export class ChatAdapter { case "tool_call_error": this.updateToolError(event.toolName, event.error); - this.updateStatus("error"); + // Don't set overall status to "error" for tool errors - the agent may continue + // Only set to "error" for actual execution errors (event.type === "error") + this.updateStatus("streaming"); break; case "execution_complete": @@ -397,6 +399,19 @@ export class ChatAdapter { if (!callId) { return; } + + // Check if result indicates a business-level failure (success: false pattern) + const failureInfo = this.extractBusinessFailure(result); + if (failureInfo) { + this.updateToolPart(callId, (toolPart) => ({ + ...toolPart, + state: "error", + output: result, // Keep full output for debugging + errorText: 
failureInfo.errorMessage, + })); + return; + } + this.updateToolPart(callId, (toolPart) => ({ + ...toolPart, + state: "completed", @@ -404,6 +419,39 @@ })); } + /** + * Check if a tool result indicates a business-level failure. + * Many tools return { success: false, error: "..." } instead of throwing. + */ + private extractBusinessFailure( + result: unknown, + ): { errorMessage: string } | null { + if (result === null || result === undefined) { + return null; + } + + if (typeof result !== "object") { + return null; + } + + const obj = result as Record<string, unknown>; + + // Check for common failure patterns: { success: false, error: ... } + if (obj.success === false) { + // Extract error message + if (typeof obj.error === "string" && obj.error.length > 0) { + return { errorMessage: obj.error }; + } + if (typeof obj.message === "string" && obj.message.length > 0) { + return { errorMessage: obj.message }; + } + // Generic failure message + return { errorMessage: "Operation failed" }; + } + + return null; + } + private updateToolError(toolName: string, error: Error): void { const callId = this.dequeueToolCall(toolName); if (!callId) { diff --git a/packages/aipex-react/src/components/chatbot/components/chatbot.tsx b/packages/aipex-react/src/components/chatbot/components/chatbot.tsx index 3fbc51e..5339690 100644 --- a/packages/aipex-react/src/components/chatbot/components/chatbot.tsx +++ b/packages/aipex-react/src/components/chatbot/components/chatbot.tsx @@ -82,6 +82,7 @@ export function ChatbotProvider({ messages: chatState.messages, status: chatState.status, sessionId: chatState.sessionId, + metrics: chatState.metrics, sendMessage: chatState.sendMessage, continueConversation: chatState.continueConversation, interrupt: chatState.interrupt, diff --git a/packages/aipex-react/src/components/chatbot/components/index.ts b/packages/aipex-react/src/components/chatbot/components/index.ts index 6fd178e..31785fd 100644 ---
a/packages/aipex-react/src/components/chatbot/components/index.ts +++ b/packages/aipex-react/src/components/chatbot/components/index.ts @@ -14,4 +14,8 @@ export { } from "./input-area"; export { DefaultMessageItem, MessageItem } from "./message-item"; export { DefaultMessageList, MessageList } from "./message-list"; +export { + TokenUsageIndicator, + type TokenUsageIndicatorProps, +} from "./token-usage-indicator"; export { DefaultWelcomeScreen, WelcomeScreen } from "./welcome-screen"; diff --git a/packages/aipex-react/src/components/chatbot/components/slots/tool-display.tsx b/packages/aipex-react/src/components/chatbot/components/slots/tool-display.tsx index 6d5da25..5b48225 100644 --- a/packages/aipex-react/src/components/chatbot/components/slots/tool-display.tsx +++ b/packages/aipex-react/src/components/chatbot/components/slots/tool-display.tsx @@ -23,10 +23,14 @@ import { formatToolOutput, mapToolState } from "../../tools"; /** * Default tool display slot component + * Opens by default when there's an error so users can see the failure reason */ export function DefaultToolDisplay({ tool }: ToolDisplaySlotProps) { + // Expand by default when in error state to make failure reasons visible + const shouldExpandByDefault = tool.state === "error"; + return ( - + { @@ -63,8 +68,11 @@ export function CompactToolDisplay({ tool }: ToolDisplaySlotProps) { } }; + // Expand by default when in error state to make failure reasons visible + const shouldExpandByDefault = tool.state === "error"; + return ( - + {getStatusIcon()} {tool.toolName} diff --git a/packages/aipex-react/src/components/chatbot/components/token-usage-indicator.tsx b/packages/aipex-react/src/components/chatbot/components/token-usage-indicator.tsx new file mode 100644 index 0000000..96c82a8 --- /dev/null +++ b/packages/aipex-react/src/components/chatbot/components/token-usage-indicator.tsx @@ -0,0 +1,258 @@ +import type { AgentMetrics } from "@aipexstudio/aipex-core"; +import { useMemo } from "react"; +import 
{ cn } from "../../../lib/utils"; +import { Tooltip, TooltipContent, TooltipTrigger } from "../../ui/tooltip"; +import { useChatContext } from "../context"; + +// Default thresholds (matching legacy aipex behavior) +const DEFAULT_WATERMARK_TOKENS = 150_000; +const DEFAULT_UI_MAX_TOKENS = 180_000; + +export interface TokenUsageIndicatorProps { + /** Custom className */ + className?: string; + /** Compact mode for header/toolbar usage */ + compact?: boolean; + /** Whether the conversation is currently being summarized */ + isSummarizing?: boolean; + /** Token watermark threshold (when to show warning) */ + watermarkTokens?: number; + /** Maximum tokens for UI display (100% mark) */ + maxTokens?: number; + /** Override metrics (if not using context) */ + metrics?: AgentMetrics | null; +} + +/** + * Format token numbers for display (e.g., 150000 -> "150.0K") + */ +function formatTokens(tokens: number): string { + if (tokens >= 1_000_000) return `${(tokens / 1_000_000).toFixed(1)}M`; + if (tokens >= 1_000) return `${(tokens / 1_000).toFixed(1)}K`; + return tokens.toString(); +} + +/** + * TokenUsageIndicator - Displays current token usage with visual progress + * + * Shows token consumption as a circular progress indicator with color-coded + * thresholds. Supports compact mode for use in headers/toolbars. + * + * @example + * ```tsx + * // In a header (compact mode) + * <TokenUsageIndicator compact /> + * + * // Full display with custom thresholds (illustrative values) + * <TokenUsageIndicator watermarkTokens={100_000} maxTokens={128_000} /> + * ``` + */ +export function TokenUsageIndicator({ + className, + compact = false, + isSummarizing = false, + watermarkTokens = DEFAULT_WATERMARK_TOKENS, + maxTokens = DEFAULT_UI_MAX_TOKENS, + metrics: metricsProp, +}: TokenUsageIndicatorProps) { + // Get metrics from context if not provided via props + const chatContext = useChatContext(); + const metrics = metricsProp ?? chatContext.metrics; + + const usage = useMemo(() => { + // Use tokensUsed (total from latest response) instead of just promptTokens + const tokens = metrics?.tokensUsed ??
0; + const percentage = Math.min((tokens / maxTokens) * 100, 100); + return { tokens, percentage }; + }, [metrics, maxTokens]); + + // Hide the indicator when there's no usage data and not summarizing + if (!isSummarizing && usage.tokens === 0) { + return null; + } + + // Determine color based on usage percentage + const getColorClass = (percentage: number): string => { + if (percentage >= 90) return "text-red-500"; + if (percentage >= (watermarkTokens / maxTokens) * 100) + return "text-orange-500"; + if (percentage >= 60) return "text-yellow-500"; + return "text-gray-500"; + }; + + const getProgressColor = (percentage: number): string => { + if (percentage >= 90) return "stroke-red-500"; + if (percentage >= (watermarkTokens / maxTokens) * 100) + return "stroke-orange-500"; + if (percentage >= 60) return "stroke-yellow-500"; + return "stroke-gray-400"; + }; + + // Compact mode: only show percentage and circular progress, hover for details + if (compact) { + return ( + + +
+ {/* Circular Progress Indicator */} +
+ + {/* Background circle */} + + {/* Progress circle */} + + +
+ + {/* Percentage only */} + + {usage.percentage.toFixed(0)}% + +
+
+ +
+
Context Usage
+
+ {formatTokens(usage.tokens)} / {formatTokens(maxTokens)} tokens +
+ {metrics && ( +
+ Prompt: {formatTokens(metrics.promptTokens)} | Completion:{" "} + {formatTokens(metrics.completionTokens)} +
+ )} + {isSummarizing && ( +
Summarizing...
+ )} +
+
+
+ ); + } + + // Full mode: show all details inline + return ( +
+ {/* Circular Progress Indicator */} +
+ + {/* Background circle */} + + {/* Progress circle */} + + +
+ + {/* Token count and percentage */} +
+ + {usage.percentage.toFixed(1)}% + + + {formatTokens(usage.tokens)} / {formatTokens(maxTokens)} + +
+ + {/* Summary indicator */} + {isSummarizing ? ( +
+
+ + Summarizing... + +
+ ) : usage.percentage >= (watermarkTokens / maxTokens) * 100 ? ( +
+
+
+ ) : null} +
+ ); +} diff --git a/packages/aipex-react/src/components/chatbot/context.ts b/packages/aipex-react/src/components/chatbot/context.ts index 5ac7884..566f73d 100644 --- a/packages/aipex-react/src/components/chatbot/context.ts +++ b/packages/aipex-react/src/components/chatbot/context.ts @@ -1,4 +1,5 @@ import type { + AgentMetrics, AIPex, AppSettings, KeyValueStorage, @@ -24,6 +25,8 @@ export interface ChatContextValue { status: ChatStatus; /** Current session ID */ sessionId: string | null; + /** Latest token metrics from most recent execution */ + metrics: AgentMetrics | null; /** Send a message */ sendMessage: ( text: string, diff --git a/packages/aipex-react/src/components/chatbot/index.ts b/packages/aipex-react/src/components/chatbot/index.ts index ab797ec..ee0850d 100644 --- a/packages/aipex-react/src/components/chatbot/index.ts +++ b/packages/aipex-react/src/components/chatbot/index.ts @@ -51,6 +51,8 @@ export { InputArea, MessageItem, MessageList, + TokenUsageIndicator, + type TokenUsageIndicatorProps, WelcomeScreen, } from "./components"; // Default export for backward compatibility diff --git a/packages/aipex-react/src/components/voice/VoiceInput.tsx b/packages/aipex-react/src/components/voice/VoiceInput.tsx new file mode 100644 index 0000000..02b6969 --- /dev/null +++ b/packages/aipex-react/src/components/voice/VoiceInput.tsx @@ -0,0 +1,475 @@ +/** + * Voice Input Component + * Integrates VAD, audio recording and STT for voice input + */ + +import type React from "react"; +import { useCallback, useEffect, useRef, useState } from "react"; +import { useTranslation } from "../../i18n/hooks"; +import { cn } from "../../lib/utils"; +import { isByokUserSimple } from "../../lib/voice/ai-config"; +import { AudioRecorder } from "../../lib/voice/audio-recorder"; +import { useChromeStorage } from "../../lib/voice/chrome-storage"; +import { transcribeAudioWithRetry } from "../../lib/voice/elevenlabs-stt"; +import { transcribeAudioWithServerRetry } from 
"../../lib/voice/server-stt"; +import { VADDetector } from "../../lib/voice/vad-detector"; +import { Button } from "../ui/button"; +import { ParticleSystem } from "./particle-system"; + +export interface VoiceInputProps { + onTranscript: (text: string) => void; + onError?: (error: string) => void; + className?: string; + isPaused?: boolean; // External control for pause + onSwitchToText?: () => void; +} + +type VoiceStatus = "idle" | "listening" | "speaking" | "processing" | "error"; + +export const VoiceInput: React.FC<VoiceInputProps> = ({ + onTranscript, + onError, + className, + isPaused = false, +}) => { + const { t } = useTranslation(); + const [status, setStatus] = useState<VoiceStatus>("idle"); + const [statusText, setStatusText] = useState(""); + const [isPermissionError, setIsPermissionError] = useState(false); + + const vadRef = useRef<VADDetector | null>(null); + const isInitializingRef = useRef(false); + const prevPausedRef = useRef(isPaused); + const hasInitializedRef = useRef(false); + const onTranscriptRef = useRef(onTranscript); + const onErrorRef = useRef(onError); + + // Particle system refs + const canvasRef = useRef<HTMLCanvasElement | null>(null); + const particleSystemRef = useRef<ParticleSystem | null>(null); + + // Get ElevenLabs config from storage + const [elevenlabsApiKey, , isLoadingApiKey] = useChromeStorage( + "elevenlabsApiKey", + "", + ); + const [elevenlabsModelId] = useChromeStorage("elevenlabsModelId", ""); + + // BYOK state + const [isByokUser, setIsByokUser] = useState<boolean | null>(null); + + // Initialize Particle System + useEffect(() => { + if (canvasRef.current && !particleSystemRef.current) { + particleSystemRef.current = new ParticleSystem(canvasRef.current); + } + + return () => { + if (particleSystemRef.current) { + particleSystemRef.current.destroy(); + particleSystemRef.current = null; + } + }; + }, []); + + // Update Particle System State + useEffect(() => { + if (!particleSystemRef.current) return; + + if (isPaused) { + particleSystemRef.current.setState("idle"); + return; + } + + switch (status) { + case "idle": + case
"error": + particleSystemRef.current.setState("idle"); + break; + case "listening": + particleSystemRef.current.setState("listening"); + break; + case "speaking": + particleSystemRef.current.setState("speaking"); + break; + case "processing": + particleSystemRef.current.setState("processing"); + break; + } + }, [status, isPaused]); + + // Sync refs + useEffect(() => { + onTranscriptRef.current = onTranscript; + }, [onTranscript]); + + useEffect(() => { + onErrorRef.current = onError; + }, [onError]); + + // Check BYOK status + useEffect(() => { + isByokUserSimple().then(setIsByokUser); + }, []); + + const isPausedRef = useRef(isPaused); + + // Update isPausedRef whenever isPaused changes + useEffect(() => { + isPausedRef.current = isPaused; + }, [isPaused]); + + // Initialize VAD + const initializeVAD = useCallback(async () => { + if (isInitializingRef.current || vadRef.current?.isActive()) { + console.log( + "[VoiceInput] Skipping initialization - already initializing or active", + ); + return; + } + + isInitializingRef.current = true; + + try { + setStatus("idle"); + setStatusText( + t("interventions.voice.initializingMic") || "Initializing...", + ); + + const vad = new VADDetector({ + onSpeechStart: () => { + // Check if paused using ref to get latest value + if (isPausedRef.current) { + console.log("[VoiceInput] Speech started but ignored (paused)"); + return; + } + + console.log("[VoiceInput] Speech started"); + setStatus("speaking"); + setStatusText( + t("interventions.voice.recognizing") || "Recognizing...", + ); + }, + onSpeechEnd: async (audio: Float32Array) => { + // Check if paused using ref to get latest value + if (isPausedRef.current) { + console.log("[VoiceInput] Speech ended but ignored (paused)"); + return; + } + + console.log("[VoiceInput] Speech ended"); + setStatus("processing"); + setStatusText(t("common.processing") || "Processing..."); + + try { + // Convert audio format + const audioBlob = AudioRecorder.float32ArrayToWav(audio, 16000); + + 
let result: { text?: string; error?: string }; + + // Determine which STT service to use + if (isByokUser === false) { + // Non-BYOK user: use server API + console.log("[VoiceInput] Using server STT (non-BYOK user)"); + result = await transcribeAudioWithServerRetry(audioBlob); + } else if (isByokUser === true && elevenlabsApiKey) { + // BYOK user with ElevenLabs API: use ElevenLabs + console.log( + "[VoiceInput] Using ElevenLabs STT (BYOK user with API key)", + ); + result = await transcribeAudioWithRetry(audioBlob, { + apiKey: elevenlabsApiKey, + modelId: elevenlabsModelId, + }); + } else { + // BYOK user without an ElevenLabs API key: should fall back to the Web Speech API, + // but the current VoiceInput uses VAD and does not support Web Speech. + // This case should be handled by the intervention flow. + throw new Error( + "Please configure ElevenLabs API Key in settings or use browser speech recognition", + ); + } + + if (result.error) { + throw new Error(result.error); + } + + if (result.text) { + // Log success but not the actual transcript (privacy) + console.log("[VoiceInput] Transcription successful"); + + // Pause VAD while waiting for AI processing + if (vadRef.current) { + console.log( + "[VoiceInput] Pausing VAD after sending transcript", + ); + vadRef.current.pause(); + } + + setStatus("listening"); + setStatusText("Waiting for AI response..."); + + // Send transcript using ref + onTranscriptRef.current(result.text); + } else { + console.warn("[VoiceInput] Empty transcription"); + setStatus("listening"); + setStatusText("No speech detected, please try again"); + } + } catch (error) { + console.error("[VoiceInput] Transcription error"); + const errorMsg = + error instanceof Error ?
error.message : String(error); + setStatus("error"); + setStatusText(errorMsg); + onErrorRef.current?.(errorMsg); + + // Resume listening after 3 seconds + setTimeout(() => { + setStatus("listening"); + setStatusText( + t("interventions.voice.speakPrompt") || "Start speaking...", + ); + }, 3000); + } + }, + onVADMisfire: () => { + console.log("[VoiceInput] VAD misfire"); + setStatus("listening"); + setStatusText( + t("interventions.voice.speakPrompt") || "Continue speaking...", + ); + }, + onVolumeChange: (vol: number) => { + if (particleSystemRef.current && !isPaused) { + particleSystemRef.current.updateFrequency(vol); + } + }, + }); + + await vad.start(); + vadRef.current = vad; + + setStatus("listening"); + setStatusText( + t("interventions.voice.speakPrompt") || "Start speaking...", + ); + } catch (error) { + console.error("[VoiceInput] Failed to initialize VAD"); + + // Check if microphone permission was denied + const isPermissionDenied = + error instanceof Error && + (error.name === "NotAllowedError" || + error.name === "PermissionDeniedError" || + error.message.includes("Permission denied") || + error.message.includes("permission")); + + if (isPermissionDenied) { + console.log( + "[VoiceInput] Microphone permission denied, redirecting to guide...", + ); + setIsPermissionError(true); + // Open voice guide page + window.open("https://www.claudechrome.com/voice/guide", "_blank"); + } else { + setIsPermissionError(false); + } + + const errorMsg = + error instanceof Error ? 
error.message : "Unable to access microphone"; + setStatus("error"); + setStatusText(errorMsg); + onErrorRef.current?.(errorMsg); + } finally { + isInitializingRef.current = false; + } + }, [isByokUser, elevenlabsApiKey, elevenlabsModelId, t, isPaused]); + + // Initialize on component mount + useEffect(() => { + // Ensure only initialized once + if (hasInitializedRef.current) { + console.log("[VoiceInput] Already initialized, skipping"); + return; + } + + // Wait for BYOK status to load + if (isByokUser === null) { + setStatus("idle"); + setStatusText("Loading..."); + return; + } + + // Wait for storage to load + if (isLoadingApiKey) { + setStatus("idle"); + setStatusText("Loading..."); + return; + } + + // BYOK user needs to check for ElevenLabs API key + if (isByokUser && !elevenlabsApiKey) { + setStatus("error"); + setStatusText("Please configure ElevenLabs API Key in settings"); + return; + } + + console.log("[VoiceInput] Initializing VAD on mount"); + hasInitializedRef.current = true; + initializeVAD(); + }, [isByokUser, elevenlabsApiKey, isLoadingApiKey, initializeVAD]); + + // Cleanup on component unmount - separate effect to ensure it always runs + useEffect(() => { + return () => { + // Immediately sync cleanup on unmount + console.log( + "[VoiceInput] Component unmounting - stopping VAD immediately", + ); + + if (vadRef.current) { + // Stop immediately, don't wait for Promise + vadRef.current.stop().catch((_err) => { + console.error("[VoiceInput] Failed to stop VAD during unmount"); + }); + vadRef.current = null; + } + + // Reset all flags + isInitializingRef.current = false; + hasInitializedRef.current = false; + + console.log("[VoiceInput] Cleanup complete"); + }; + }, []); // Empty dependency ensures only runs on unmount + + // Monitor external pause state changes + useEffect(() => { + if (prevPausedRef.current !== isPaused) { + console.log("[VoiceInput] Pause state changed:", isPaused); + + if (isPaused) { + // Pause VAD but don't release resources + 
if (vadRef.current) { + console.log("[VoiceInput] Pausing VAD due to external pause"); + vadRef.current.pause(); + setStatusText("AI is processing..."); + } + } else { + // Resume VAD + if (vadRef.current) { + console.log("[VoiceInput] Resuming VAD after external pause"); + vadRef.current.resume(); + setStatus("listening"); + setStatusText( + t("interventions.voice.speakPrompt") || "Start speaking...", + ); + } + } + + prevPausedRef.current = isPaused; + } + }, [isPaused, t]); + + // Handle resize with ResizeObserver + useEffect(() => { + if (!canvasRef.current || !particleSystemRef.current) return; + + const resizeObserver = new ResizeObserver((entries) => { + for (const entry of entries) { + if (entry.target === canvasRef.current) { + particleSystemRef.current?.handleResize(); + } + } + }); + + resizeObserver.observe(canvasRef.current); + + return () => { + resizeObserver.disconnect(); + }; + }, []); + + return ( +
+ {/* Particle Ball - Full Screen Canvas */} + + + {/* Content layer - Overlay Content - Positioned at bottom */} +
+ {/* Status text - Gray and at bottom */} +
+

+ {statusText} +

+ + {/* Hint text */} + {!isPaused && status === "listening" && ( +

+ Start speaking, VAD will auto-detect your voice +

+ )} + + {isPaused && ( +

+ AI is processing, voice detection paused +

+ )} + + {/* Permission error prompt */} + {status === "error" && isPermissionError && ( +
+ +
+ )} + + {/* API Key error prompt */} + {status === "error" && !isPermissionError && !elevenlabsApiKey && ( +
+ +
+ )} +
+
+
+ ); +}; diff --git a/packages/aipex-react/src/components/voice/config.ts b/packages/aipex-react/src/components/voice/config.ts new file mode 100644 index 0000000..7861300 --- /dev/null +++ b/packages/aipex-react/src/components/voice/config.ts @@ -0,0 +1,125 @@ +/** + * Voice mode configuration options + */ + +export const VOICE_MODE_CONFIG = { + // Particle system settings + particles: { + count: 1000, // More particles for dense cloud effect + pointSize: 1.0, // Slightly larger for clear visibility of each particle + minRadius: 0.8, // Minimum sphere radius + maxRadius: 1.8, // Maximum sphere radius + }, + + // Animation settings + animation: { + rotationSpeed: 0.1, // Rotation speed multiplier + transitionSpeed: 0.05, // State transition smoothing (0-1) + pulseSpeed: 3.0, // Pulsing animation speed + }, + + // Audio settings + audio: { + fftSize: 256, // FFT size for frequency analysis + smoothing: 0.8, // Audio smoothing (0-1) + threshold: 0.01, // Audio activity threshold + }, + + // Speech recognition settings + speech: { + language: "en-US", // Default language + continuous: false, // Continuous recognition + interimResults: true, // Show interim results + }, + + // Speech synthesis settings + synthesis: { + rate: 1.0, // Speech rate (0.1-10) + pitch: 1.0, // Speech pitch (0-2) + volume: 1.0, // Speech volume (0-1) + }, + + // Visual settings + visual: { + backgroundColor: 0x0a0a0a, // Background color + blending: "additive", // Particle blending mode + antialias: true, // Enable antialiasing + maxPixelRatio: 2, // Maximum pixel ratio + }, + + // Performance presets + presets: { + low: { + particleCount: 3000, + fftSize: 128, + maxPixelRatio: 1, + }, + medium: { + particleCount: 8000, + fftSize: 256, + maxPixelRatio: 2, + }, + high: { + particleCount: 15000, + fftSize: 512, + maxPixelRatio: 2, + }, + }, +}; + +/** + * Color schemes for different states + */ +export const COLOR_SCHEMES = { + idle: { + primary: [0.6, 0.9, 1.0], // Bright cyan + secondary: 
[0.3, 0.8, 1.0], // Vivid cyan + }, + listening: { + primary: [1.0, 0.75, 0.3], // Bright orange (detecting input) + secondary: [1.0, 0.5, 0.0], // Vivid orange + }, + speaking: { + primary: [0.5, 1.0, 0.5], // Bright lime green (audio output) + secondary: [0.2, 1.0, 0.3], // Vivid green + }, + processing: { + primary: [0.6, 0.9, 1.0], // Bright cyan (same as idle) + secondary: [0.3, 0.8, 1.0], // Vivid cyan + }, +}; + +/** + * Helper function to get performance preset + */ +export function getPerformancePreset( + level: "low" | "medium" | "high" = "medium", +) { + return VOICE_MODE_CONFIG.presets[level]; +} + +/** + * Helper function to detect device performance + */ +export function detectPerformanceLevel(): "low" | "medium" | "high" { + // Check for mobile devices + const isMobile = + /Android|webOS|iPhone|iPad|iPod|BlackBerry|IEMobile|Opera Mini/i.test( + navigator.userAgent, + ); + + if (isMobile) { + return "low"; + } + + // Check for hardware concurrency (CPU cores) + const cores = navigator.hardwareConcurrency || 4; + + if (cores >= 8) { + return "high"; + } else if (cores >= 4) { + return "medium"; + } else { + return "low"; + } +} diff --git a/packages/aipex-react/src/components/voice/index.ts b/packages/aipex-react/src/components/voice/index.ts new file mode 100644 index 0000000..07384b9 --- /dev/null +++ b/packages/aipex-react/src/components/voice/index.ts @@ -0,0 +1,21 @@ +/** + * Voice mode component exports + */ + +export { + COLOR_SCHEMES, + detectPerformanceLevel, + getPerformancePreset, + VOICE_MODE_CONFIG, +} from "./config"; +export { ParticleSystem } from "./particle-system"; +export type { + AudioData, + ParticleUniforms, + SpeechRecognitionResult, + VoiceModeConfig, + VoiceModeProps, + VoiceState, +} from "./types"; +export type { VoiceInputProps } from "./VoiceInput"; +export { VoiceInput } from "./VoiceInput"; diff --git a/packages/aipex-react/src/components/voice/particle-system.ts 
b/packages/aipex-react/src/components/voice/particle-system.ts new file mode 100644 index 0000000..79caa6a --- /dev/null +++ b/packages/aipex-react/src/components/voice/particle-system.ts @@ -0,0 +1,218 @@ +import * as THREE from "three"; +import { VOICE_MODE_CONFIG } from "./config"; +import { fragmentShader, vertexShader } from "./shaders"; +import type { VoiceState } from "./types"; + +/** + * WebGL-based particle system for voice mode visualization + */ +export class ParticleSystem { + private scene: THREE.Scene; + private camera: THREE.PerspectiveCamera; + private renderer: THREE.WebGLRenderer; + private particles: THREE.Points; + private uniforms!: { + uTime: THREE.IUniform; + uState: THREE.IUniform; + uRadius: THREE.IUniform; + uFrequency: THREE.IUniform; + uPointSize: THREE.IUniform; + }; + private animationId: number | null = null; + private startTime: number; + private targetRadius: number = 2.2; + private currentRadius: number = 2.2; + + constructor(canvas: HTMLCanvasElement) { + this.startTime = Date.now(); + + // Setup scene + this.scene = new THREE.Scene(); + this.scene.background = new THREE.Color( + VOICE_MODE_CONFIG.visual.backgroundColor, + ); + + // Get canvas dimensions (fallback to window size if canvas size is 0) + const width = canvas.clientWidth || window.innerWidth; + const height = canvas.clientHeight || window.innerHeight; + + // Setup camera - closer for better cloud view + this.camera = new THREE.PerspectiveCamera(75, width / height, 0.1, 1000); + this.camera.position.z = 7; // Distance for optimal viewing + + // Setup renderer + this.renderer = new THREE.WebGLRenderer({ + canvas, + antialias: VOICE_MODE_CONFIG.visual.antialias, + alpha: true, + }); + + // Set renderer size and pixel ratio + this.renderer.setSize(width, height, false); // false = don't update canvas style + this.renderer.setPixelRatio( + Math.min(window.devicePixelRatio, VOICE_MODE_CONFIG.visual.maxPixelRatio), + ); + + // Create particles + this.uniforms = { + uTime: { 
value: 0 }, + uState: { value: 0 }, // 0=idle, 1=listening, 2=speaking + uRadius: { value: 3.0 }, + uFrequency: { value: 0 }, + uPointSize: { value: VOICE_MODE_CONFIG.particles.pointSize }, + }; + + this.particles = this.createParticles(); + this.scene.add(this.particles); + + // Start animation + this.animate(); + + // Handle window resize + window.addEventListener("resize", this.handleResize); + } + + private createParticles(): THREE.Points { + const particleCount = VOICE_MODE_CONFIG.particles.count; + const geometry = new THREE.BufferGeometry(); + + // Create initial positions on a sphere + const positions = new Float32Array(particleCount * 3); + const initialPositions = new Float32Array(particleCount * 3); + const randomOffsets = new Float32Array(particleCount); + + for (let i = 0; i < particleCount; i++) { + // Fibonacci sphere distribution for even distribution + const phi = Math.acos(1 - (2 * (i + 0.5)) / particleCount); + const theta = Math.PI * (1 + Math.sqrt(5)) * i; + + const x = Math.cos(theta) * Math.sin(phi); + const y = Math.sin(theta) * Math.sin(phi); + const z = Math.cos(phi); + + const i3 = i * 3; + positions[i3] = x; + positions[i3 + 1] = y; + positions[i3 + 2] = z; + + initialPositions[i3] = x; + initialPositions[i3 + 1] = y; + initialPositions[i3 + 2] = z; + + randomOffsets[i] = Math.random(); + } + + geometry.setAttribute("position", new THREE.BufferAttribute(positions, 3)); + geometry.setAttribute( + "initialPosition", + new THREE.BufferAttribute(initialPositions, 3), + ); + geometry.setAttribute( + "randomOffset", + new THREE.BufferAttribute(randomOffsets, 1), + ); + + // Create material with shaders + const material = new THREE.ShaderMaterial({ + uniforms: this.uniforms, + vertexShader, + fragmentShader, + transparent: true, + blending: THREE.AdditiveBlending, + depthWrite: false, + depthTest: true, + }); + + return new THREE.Points(geometry, material); + } + + private animate = () => { + this.animationId = 
requestAnimationFrame(this.animate); + + // Update time + const elapsed = (Date.now() - this.startTime) / 1000; + this.uniforms.uTime.value = elapsed; + + // Smooth radius transition + this.currentRadius += (this.targetRadius - this.currentRadius) * 0.08; + this.uniforms.uRadius.value = this.currentRadius; + + // Very slow rotation for subtle movement + this.particles.rotation.y = elapsed * 0.05; + + // Render the scene + this.renderer.render(this.scene, this.camera); + }; + + public handleResize = () => { + const canvas = this.renderer.domElement; + const width = canvas.clientWidth; + const height = canvas.clientHeight; + + this.camera.aspect = width / height; + this.camera.updateProjectionMatrix(); + this.renderer.setSize(width, height); + }; + + public setState(state: VoiceState) { + // Update uniform with smaller radius values to fit in canvas + switch (state) { + case "idle": + this.uniforms.uState.value = 0; + this.targetRadius = 1.5; + break; + case "listening": + this.uniforms.uState.value = 1; + this.targetRadius = 1.8; + break; + case "speaking": + this.uniforms.uState.value = 2; + this.targetRadius = 1.6; + break; + case "processing": + this.uniforms.uState.value = 0; + this.targetRadius = 1.4; + break; + } + } + + public updateFrequency(frequency: number) { + // Normalize frequency to 0-1 range + this.uniforms.uFrequency.value = Math.min(Math.max(frequency, 0), 1); + } + + public getDebugInfo() { + return { + isAnimating: this.animationId !== null, + particleCount: this.particles.geometry.attributes.position?.count ?? 
0, + currentRadius: this.currentRadius, + targetRadius: this.targetRadius, + rendererSize: { + width: this.renderer.domElement.width, + height: this.renderer.domElement.height, + }, + cameraPosition: { + x: this.camera.position.x, + y: this.camera.position.y, + z: this.camera.position.z, + }, + }; + } + + public destroy() { + if (this.animationId !== null) { + cancelAnimationFrame(this.animationId); + } + + window.removeEventListener("resize", this.handleResize); + + if (this.particles) { + this.particles.geometry.dispose(); + if (this.particles.material instanceof THREE.Material) { + this.particles.material.dispose(); + } + } + + this.renderer.dispose(); + } +} diff --git a/packages/aipex-react/src/components/voice/shaders.ts b/packages/aipex-react/src/components/voice/shaders.ts new file mode 100644 index 0000000..5be4181 --- /dev/null +++ b/packages/aipex-react/src/components/voice/shaders.ts @@ -0,0 +1,220 @@ +/** + * Vertex shader for particle system + */ +export const vertexShader = ` + uniform float uTime; + uniform float uState; // 0=idle, 1=listening, 2=speaking + uniform float uRadius; + uniform float uFrequency; + uniform float uPointSize; + + attribute vec3 initialPosition; + attribute float randomOffset; + + varying float vDistance; + varying float vState; + varying float vAlpha; + + // Simplex noise for organic movement + vec3 mod289(vec3 x) { return x - floor(x * (1.0 / 289.0)) * 289.0; } + vec4 mod289(vec4 x) { return x - floor(x * (1.0 / 289.0)) * 289.0; } + vec4 permute(vec4 x) { return mod289(((x*34.0)+1.0)*x); } + vec4 taylorInvSqrt(vec4 r) { return 1.79284291400159 - 0.85373472095314 * r; } + + float snoise(vec3 v) { + const vec2 C = vec2(1.0/6.0, 1.0/3.0); + const vec4 D = vec4(0.0, 0.5, 1.0, 2.0); + + vec3 i = floor(v + dot(v, C.yyy)); + vec3 x0 = v - i + dot(i, C.xxx); + + vec3 g = step(x0.yzx, x0.xyz); + vec3 l = 1.0 - g; + vec3 i1 = min(g.xyz, l.zxy); + vec3 i2 = max(g.xyz, l.zxy); + + vec3 x1 = x0 - i1 + C.xxx; + vec3 x2 = x0 - i2 + 
C.yyy; + vec3 x3 = x0 - D.yyy; + + i = mod289(i); + vec4 p = permute(permute(permute( + i.z + vec4(0.0, i1.z, i2.z, 1.0)) + + i.y + vec4(0.0, i1.y, i2.y, 1.0)) + + i.x + vec4(0.0, i1.x, i2.x, 1.0)); + + float n_ = 0.142857142857; + vec3 ns = n_ * D.wyz - D.xzx; + + vec4 j = p - 49.0 * floor(p * ns.z * ns.z); + + vec4 x_ = floor(j * ns.z); + vec4 y_ = floor(j - 7.0 * x_); + + vec4 x = x_ *ns.x + ns.yyyy; + vec4 y = y_ *ns.x + ns.yyyy; + vec4 h = 1.0 - abs(x) - abs(y); + + vec4 b0 = vec4(x.xy, y.xy); + vec4 b1 = vec4(x.zw, y.zw); + + vec4 s0 = floor(b0)*2.0 + 1.0; + vec4 s1 = floor(b1)*2.0 + 1.0; + vec4 sh = -step(h, vec4(0.0)); + + vec4 a0 = b0.xzyw + s0.xzyw*sh.xxyy; + vec4 a1 = b1.xzyw + s1.xzyw*sh.zzww; + + vec3 p0 = vec3(a0.xy, h.x); + vec3 p1 = vec3(a0.zw, h.y); + vec3 p2 = vec3(a1.xy, h.z); + vec3 p3 = vec3(a1.zw, h.w); + + vec4 norm = taylorInvSqrt(vec4(dot(p0,p0), dot(p1,p1), dot(p2,p2), dot(p3,p3))); + p0 *= norm.x; + p1 *= norm.y; + p2 *= norm.z; + p3 *= norm.w; + + vec4 m = max(0.6 - vec4(dot(x0,x0), dot(x1,x1), dot(x2,x2), dot(x3,x3)), 0.0); + m = m * m; + return 42.0 * dot(m*m, vec4(dot(p0,x0), dot(p1,x1), dot(p2,x2), dot(p3,x3))); + } + + void main() { + vState = uState; + + // Normalized position on sphere + vec3 pos = normalize(initialPosition); + + // Base radius with breathing effect + float breathe = sin(uTime * 0.8) * 0.1; + float baseRadius = uRadius + breathe; + + // Audio reactivity + float audioInfluence = uFrequency * 0.3; + + // Organic noise displacement + float noiseScale = 0.5; + float noiseTime = uTime * 0.2; + float noise = snoise(pos * noiseScale + vec3(noiseTime)); + + // State-based effects with controlled dispersion + float displacement = 0.0; + float radiusMultiplier = 1.0; + + if (uState < 0.5) { + // Idle - subtle cloud movement with audio reactivity + displacement = noise * 0.2 + sin(uTime * 0.5 + randomOffset * 6.28) * 0.08 + audioInfluence * 0.15; + radiusMultiplier = 0.9 + randomOffset * 0.15 + audioInfluence * 0.2; + } else 
if (uState < 1.5) { + // Listening - moderate expansion with strong audio reactivity + displacement = noise * 0.3 + audioInfluence * 0.3; + radiusMultiplier = 1.1 + audioInfluence * 0.5 + randomOffset * 0.1; + } else { + // Speaking - controlled pulsing with audio reactivity + float pulse = sin(uTime * 4.0 + randomOffset * 6.28) * 0.5 + 0.5; + displacement = noise * 0.25 + pulse * audioInfluence * 0.2; + radiusMultiplier = 0.95 + pulse * 0.2 + audioInfluence * 0.3 + randomOffset * 0.08; + } + + // Apply displacement + pos += pos * displacement; + pos = normalize(pos); + + // Final position + float finalRadius = baseRadius * radiusMultiplier; + pos *= finalRadius; + + // Calculate distance from center for effects + vDistance = length(pos) / finalRadius; + + // Alpha based on distance from center (cloud-like gradient) + // More visible at edges for cloud effect + vAlpha = 0.3 + 0.7 * (1.0 - smoothstep(0.0, 1.2, vDistance)); + + // Transform position + vec4 mvPosition = modelViewMatrix * vec4(pos, 1.0); + gl_Position = projectionMatrix * mvPosition; + + // Point size - consistent size for distinct particles + float depth = -mvPosition.z; + float sizeMultiplier = 1.0 + audioInfluence * 0.5; + // Scale for distinct, visible particles + gl_PointSize = (uPointSize * 90.0 / max(depth, 1.0)) * sizeMultiplier; + + // Clamp size for consistent appearance + gl_PointSize = clamp(gl_PointSize, 2.0, 8.0); + } +`; + +/** + * Fragment shader for particle system + */ +export const fragmentShader = ` + uniform float uState; + uniform float uFrequency; + + varying float vDistance; + varying float vState; + varying float vAlpha; + + void main() { + // Create sharp circular points - no soft glow + vec2 center = gl_PointCoord - vec2(0.5); + float dist = length(center) * 2.0; + + // Sharp edge for distinct particles + if (dist > 1.0) { + discard; // Cut off particles at edge for sharp circles + } + + // Strong intensity for bright particles + float intensity = 1.0 - smoothstep(0.5, 1.0, 
dist); + intensity = pow(intensity, 1.2); // Brighter particles + + // State-based colors - brighter and more vibrant + vec3 color; + vec3 coreColor; + vec3 edgeColor; + float audioBoost = uFrequency * 0.5; + + if (vState < 0.5) { + // Idle - bright cyan/blue + coreColor = vec3(0.6, 0.9, 1.0); // Bright cyan + edgeColor = vec3(0.3, 0.8, 1.0); // Vivid cyan + } else if (vState < 1.5) { + // Listening (detecting input) - bright orange + coreColor = vec3(1.0, 0.75, 0.3); // Bright orange + edgeColor = vec3(1.0, 0.5, 0.0); // Vivid orange + intensity *= (1.0 + audioBoost * 0.5); + } else { + // Speaking (audio output) - bright green + coreColor = vec3(0.5, 1.0, 0.5); // Bright lime green + edgeColor = vec3(0.2, 1.0, 0.3); // Vivid green + intensity *= (1.0 + audioBoost * 0.6); + } + + // Mix colors based on distance from particle center + color = mix(coreColor, edgeColor, dist); + + // Very high brightness for super bright particles + float brightnessMultiplier = 2.5 + audioBoost * 0.8; + color *= brightnessMultiplier; + + // Less fade for brighter overall appearance + float centerFade = mix(0.85, 0.5, vDistance); + + // Very strong alpha for extremely bright, distinct particles + float alpha = intensity * centerFade * vAlpha; + + // Much higher alpha for super bright particles + alpha = clamp(alpha * 2.0, 0.0, 1.0); + + // High minimum alpha - every particle should be clearly visible + alpha = max(alpha, 0.35); + + // Output with additive blending for bright particles + gl_FragColor = vec4(color, alpha); + } +`; diff --git a/packages/aipex-react/src/components/voice/types.ts b/packages/aipex-react/src/components/voice/types.ts new file mode 100644 index 0000000..a09a6a7 --- /dev/null +++ b/packages/aipex-react/src/components/voice/types.ts @@ -0,0 +1,52 @@ +/** + * Voice mode state types + */ +export type VoiceState = "idle" | "listening" | "speaking" | "processing"; + +/** + * Voice mode configuration + */ +export interface VoiceModeConfig { + onTextRecognized?: 
(text: string) => void; + onSpeechComplete?: () => void; + onError?: (error: Error) => void; + language?: string; + continuous?: boolean; +} + +/** + * Audio analysis data + */ +export interface AudioData { + frequencyData: Uint8Array; + averageFrequency: number; + isActive: boolean; +} + +/** + * Speech recognition result + */ +export interface SpeechRecognitionResult { + transcript: string; + isFinal: boolean; + confidence: number; +} + +/** + * Particle system uniforms + */ +export interface ParticleUniforms { + uTime: { value: number }; + uState: { value: number }; // 0=idle, 1=listening, 2=speaking + uRadius: { value: number }; + uFrequency: { value: number }; + uPointSize: { value: number }; +} + +/** + * Voice mode props + */ +export interface VoiceModeProps { + onClose: () => void; + onSubmit: (text: string) => void; +} diff --git a/packages/aipex-react/src/hooks/use-agent.ts b/packages/aipex-react/src/hooks/use-agent.ts index ecb7d70..fc5f03c 100644 --- a/packages/aipex-react/src/hooks/use-agent.ts +++ b/packages/aipex-react/src/hooks/use-agent.ts @@ -88,7 +88,7 @@ export function useAgent({ tools = [], instructions, name = "AIPex Assistant", - maxTurns = 10, + maxTurns = 2000, agentOptions = {}, }: UseAgentOptions): UseAgentReturn { const [agent, setAgent] = useState(undefined); diff --git a/packages/aipex-react/src/hooks/use-chat.test.ts b/packages/aipex-react/src/hooks/use-chat.test.ts index 20aef8f..54bd2d9 100644 --- a/packages/aipex-react/src/hooks/use-chat.test.ts +++ b/packages/aipex-react/src/hooks/use-chat.test.ts @@ -405,4 +405,119 @@ describe("useChat", () => { state: "completed", }); }); + + it("should update metrics state when metrics_update event is received", async () => { + const { agent } = setupMockAgent(); + const metricsEvent = { + type: "metrics_update" as const, + metrics: { + tokensUsed: 500, + promptTokens: 300, + completionTokens: 200, + itemCount: 2, + maxTurns: 10, + duration: 1500, + startTime: Date.now(), + }, + sessionId: 
"session-1", + }; + + (agent.chat as ReturnType).mockReturnValue( + createEventGenerator([ + { type: "session_created", sessionId: "session-1" }, + metricsEvent, + createExecutionCompleteEvent(), + ]), + ); + + const { result } = await renderUseChat(agent); + + // Initially null + expect(result.current.metrics).toBeNull(); + + await act(async () => { + await result.current.sendMessage("Hello"); + }); + + // After processing events, metrics should be updated + expect(result.current.metrics).toEqual(metricsEvent.metrics); + }); + + it("should call onMetricsUpdate handler when metrics_update event is received", async () => { + const { agent } = setupMockAgent(); + const onMetricsUpdate = vi.fn(); + const metricsEvent = { + type: "metrics_update" as const, + metrics: { + tokensUsed: 1000, + promptTokens: 600, + completionTokens: 400, + itemCount: 3, + maxTurns: 10, + duration: 2000, + startTime: Date.now(), + }, + sessionId: "session-123", + }; + + (agent.chat as ReturnType).mockReturnValue( + createEventGenerator([ + { type: "session_created", sessionId: "session-123" }, + metricsEvent, + createExecutionCompleteEvent(), + ]), + ); + + const { result } = await renderUseChat(agent, { + handlers: { onMetricsUpdate }, + }); + + await act(async () => { + await result.current.sendMessage("Test"); + }); + + expect(onMetricsUpdate).toHaveBeenCalledWith( + metricsEvent.metrics, + "session-123", + ); + }); + + it("should reset metrics to null on chat reset", async () => { + const { agent } = setupMockAgent(); + const metricsEvent = { + type: "metrics_update" as const, + metrics: { + tokensUsed: 100, + promptTokens: 60, + completionTokens: 40, + itemCount: 1, + maxTurns: 10, + duration: 500, + startTime: Date.now(), + }, + sessionId: "session-1", + }; + + (agent.chat as ReturnType).mockReturnValue( + createEventGenerator([ + { type: "session_created", sessionId: "session-1" }, + metricsEvent, + createExecutionCompleteEvent(), + ]), + ); + + const { result } = await 
renderUseChat(agent); + + await act(async () => { + await result.current.sendMessage("Hello"); + }); + + expect(result.current.metrics).not.toBeNull(); + + act(() => { + result.current.reset(); + }); + + expect(result.current.metrics).toBeNull(); + }); }); diff --git a/packages/aipex-react/src/hooks/use-chat.ts b/packages/aipex-react/src/hooks/use-chat.ts index 1ba57a4..4fb6a89 100644 --- a/packages/aipex-react/src/hooks/use-chat.ts +++ b/packages/aipex-react/src/hooks/use-chat.ts @@ -1,4 +1,9 @@ -import type { AgentEvent, AIPex, Context } from "@aipexstudio/aipex-core"; +import type { + AgentEvent, + AgentMetrics, + AIPex, + Context, +} from "@aipexstudio/aipex-core"; import { useCallback, useEffect, useMemo, useRef, useState } from "react"; import { ChatAdapter } from "../adapters/chat-adapter"; import type { @@ -23,6 +28,8 @@ export interface UseChatReturn { status: ChatStatus; /** Current session ID */ sessionId: string | null; + /** Latest token metrics from the most recent execution */ + metrics: AgentMetrics | null; /** Send a new message */ sendMessage: ( text: string, @@ -82,6 +89,7 @@ export function useChat( ); const [status, setStatus] = useState("idle"); const [sessionId, setSessionId] = useState(null); + const [metrics, setMetrics] = useState(null); // Refs for stable callbacks const handlersRef = useRef(handlers); @@ -145,6 +153,15 @@ export function useChat( handlersRef.current?.onError?.(event.error); } + // Handle metrics update + if (event.type === "metrics_update") { + setMetrics(event.metrics); + handlersRef.current?.onMetricsUpdate?.( + event.metrics, + event.sessionId, + ); + } + // Process the event through adapter adapter.processEvent(event); } @@ -245,6 +262,7 @@ export function useChat( } activeGeneratorRef.current = null; setSessionId(null); + setMetrics(null); adapter.reset(configRef.current?.initialMessages ?? 
[]); }, [adapter, agent, sessionId]); @@ -290,6 +308,7 @@ export function useChat( messages, status, sessionId, + metrics, sendMessage, continueConversation, interrupt, diff --git a/packages/aipex-react/src/index.ts b/packages/aipex-react/src/index.ts index 4179e8c..44d4598 100644 --- a/packages/aipex-react/src/index.ts +++ b/packages/aipex-react/src/index.ts @@ -6,6 +6,7 @@ export * from "./components/file-manager/index.js"; export * from "./components/intervention/index.js"; export * from "./components/omni/index.js"; export * from "./components/settings/index.js"; +export * from "./components/voice/index.js"; // Skill UI components moved to browser-ext - no longer exported from aipex-react // export * from "./components/skill/index.js"; export * from "./hooks/index.js"; diff --git a/packages/aipex-react/src/lib/index.ts b/packages/aipex-react/src/lib/index.ts index 67e3a32..a846273 100644 --- a/packages/aipex-react/src/lib/index.ts +++ b/packages/aipex-react/src/lib/index.ts @@ -10,3 +10,6 @@ export { LocalStorageKeyValueAdapter, localStorageKeyValueAdapter, } from "./storage.js"; + +// Voice module exports +export * from "./voice/index.js"; diff --git a/packages/aipex-react/src/lib/voice/ai-config.ts b/packages/aipex-react/src/lib/voice/ai-config.ts new file mode 100644 index 0000000..4ce2c93 --- /dev/null +++ b/packages/aipex-react/src/lib/voice/ai-config.ts @@ -0,0 +1,22 @@ +/** + * AI configuration utilities for voice mode + */ + +import { ChromeStorage } from "./chrome-storage"; + +/** + * Check if the user is a BYOK (Bring Your Own Key) user. + * Only checks the byokEnabled flag in Chrome storage. 
+ */ +export async function isByokUserSimple(): Promise<boolean> { + try { + const storage = new ChromeStorage("local"); + const byokValue = await storage.get("byokEnabled"); + const isByokEnabled = byokValue === "true" || Boolean(byokValue); + return isByokEnabled; + } catch (_error) { + // Avoid logging detailed error info for security + console.error("[AIConfig] Failed to check BYOK flag"); + return false; + } +} diff --git a/packages/aipex-react/src/lib/voice/audio-recorder.ts b/packages/aipex-react/src/lib/voice/audio-recorder.ts new file mode 100644 index 0000000..22471aa --- /dev/null +++ b/packages/aipex-react/src/lib/voice/audio-recorder.ts @@ -0,0 +1,197 @@ +/** + * Audio Recorder + * Manages audio recording, supports converting Float32Array to uploadable audio format + */ + +export interface AudioRecorderConfig { + sampleRate?: number; + mimeType?: string; +} + +export class AudioRecorder { + private mediaRecorder: MediaRecorder | null = null; + private audioChunks: Blob[] = []; + private stream: MediaStream | null = null; + private isRecording: boolean = false; + private config: AudioRecorderConfig; + + constructor(config: AudioRecorderConfig = {}) { + this.config = { + sampleRate: 16000, + mimeType: "audio/webm;codecs=opus", + ...config, + }; + } + + /** + * Start recording + */ + async startRecording(): Promise<void> { + if (this.isRecording) { + console.warn("[AudioRecorder] Already recording"); + return; + } + + try { + // Request microphone permission + this.stream = await navigator.mediaDevices.getUserMedia({ + audio: { + sampleRate: this.config.sampleRate, + echoCancellation: true, + noiseSuppression: true, + autoGainControl: true, + }, + }); + + // Create MediaRecorder + const options = this.getSupportedMimeType(); + this.mediaRecorder = new MediaRecorder(this.stream, options); + + // Listen for data available event + this.mediaRecorder.ondataavailable = (event) => { + if (event.data.size > 0) { + this.audioChunks.push(event.data); + } + }; + + // Start 
recording + this.audioChunks = []; + this.mediaRecorder.start(); + this.isRecording = true; + + console.log("[AudioRecorder] Recording started"); + } catch (error) { + console.error("[AudioRecorder] Failed to start recording"); + this.cleanup(); + throw error; + } + } + + /** + * Stop recording and return audio Blob + */ + async stopRecording(): Promise<Blob> { + return new Promise((resolve, reject) => { + if (!this.isRecording || !this.mediaRecorder) { + reject(new Error("Not recording")); + return; + } + + this.mediaRecorder.onstop = () => { + const mimeType = this.mediaRecorder?.mimeType || this.config.mimeType!; + const audioBlob = new Blob(this.audioChunks, { type: mimeType }); + this.cleanup(); + console.log( + "[AudioRecorder] Recording stopped, blob size:", + audioBlob.size, + ); + resolve(audioBlob); + }; + + this.mediaRecorder.stop(); + this.isRecording = false; + }); + } + + /** + * Convert Float32Array audio data to WAV Blob + * Used for processing audio data returned by VAD + */ + static float32ArrayToWav( + audioData: Float32Array, + sampleRate: number = 16000, + ): Blob { + const buffer = AudioRecorder.encodeWAV(audioData, sampleRate); + return new Blob([buffer], { type: "audio/wav" }); + } + + /** + * Encode WAV file + */ + private static encodeWAV( + samples: Float32Array, + sampleRate: number, + ): ArrayBuffer { + const buffer = new ArrayBuffer(44 + samples.length * 2); + const view = new DataView(buffer); + + // WAV file header + const writeString = (offset: number, str: string) => { + for (let i = 0; i < str.length; i++) { + view.setUint8(offset + i, str.charCodeAt(i)); + } + }; + + const floatTo16BitPCM = (offset: number, input: Float32Array) => { + for (let i = 0; i < input.length; i++, offset += 2) { + const s = Math.max(-1, Math.min(1, input[i] ?? 0)); + view.setInt16(offset, s < 0 ? 
s * 0x8000 : s * 0x7fff, true); + } + }; + + writeString(0, "RIFF"); + view.setUint32(4, 36 + samples.length * 2, true); + writeString(8, "WAVE"); + writeString(12, "fmt "); + view.setUint32(16, 16, true); // fmt chunk size + view.setUint16(20, 1, true); // audio format (PCM) + view.setUint16(22, 1, true); // number of channels + view.setUint32(24, sampleRate, true); + view.setUint32(28, sampleRate * 2, true); // byte rate + view.setUint16(32, 2, true); // block align + view.setUint16(34, 16, true); // bits per sample + writeString(36, "data"); + view.setUint32(40, samples.length * 2, true); + floatTo16BitPCM(44, samples); + + return buffer; + } + + /** + * Get supported MIME type + */ + private getSupportedMimeType(): MediaRecorderOptions { + const types = [ + "audio/webm;codecs=opus", + "audio/webm", + "audio/ogg;codecs=opus", + "audio/ogg", + "audio/mp4", + ]; + + for (const type of types) { + if (MediaRecorder.isTypeSupported(type)) { + console.log("[AudioRecorder] Using MIME type:", type); + return { mimeType: type }; + } + } + + console.warn( + "[AudioRecorder] No preferred MIME type supported, using default", + ); + return {}; + } + + /** + * Cleanup resources + */ + private cleanup(): void { + if (this.stream) { + for (const track of this.stream.getTracks()) { + track.stop(); + } + this.stream = null; + } + + this.mediaRecorder = null; + this.audioChunks = []; + this.isRecording = false; + } + + /** + * Check if currently recording + */ + isActive(): boolean { + return this.isRecording; + } +} diff --git a/packages/aipex-react/src/lib/voice/chrome-storage.ts b/packages/aipex-react/src/lib/voice/chrome-storage.ts new file mode 100644 index 0000000..43130bc --- /dev/null +++ b/packages/aipex-react/src/lib/voice/chrome-storage.ts @@ -0,0 +1,122 @@ +/** + * Chrome Storage adapter for voice mode + * Uses native Chrome Storage API for extension context + */ + +import { useEffect, useRef, useState } from "react"; + +/** + * Chrome Storage class for direct Chrome 
extension storage access + */ +export class ChromeStorage { + private area: chrome.storage.StorageArea; + + constructor(area: "local" | "sync" = "local") { + this.area = chrome.storage[area]; + } + + /** + * Get a value from storage + */ + async get<T>(key: string): Promise<T | undefined> { + const result = await this.area.get(key); + return result[key] as T | undefined; + } + + /** + * Set a value in storage + */ + async set(key: string, value: unknown): Promise<void> { + await this.area.set({ [key]: value }); + } + + /** + * Remove a value from storage + */ + async remove(key: string): Promise<void> { + await this.area.remove(key); + } + + /** + * Clear all storage + */ + async clear(): Promise<void> { + await this.area.clear(); + } + + /** + * Get all keys from storage + */ + async getAll(): Promise<Record<string, unknown>> { + return new Promise((resolve) => { + this.area.get(null, (items) => { + resolve(items || {}); + }); + }); + } + + /** + * Watch for changes to a specific key + */ + watch<T>( + key: string, + callback: (change: { newValue?: T; oldValue?: T }) => void, + ): () => void { + const listener = ( + changes: Record<string, chrome.storage.StorageChange>, + areaName: string, + ) => { + if (areaName === "local" && changes[key]) { + callback({ + newValue: changes[key].newValue as T | undefined, + oldValue: changes[key].oldValue as T | undefined, + }); + } + }; + + chrome.storage.onChanged.addListener(listener); + + // Return unsubscribe function + return () => { + chrome.storage.onChanged.removeListener(listener); + }; + } +} + +/** + * React hook for Chrome extension storage + * Returns [value, setValue, isLoading] + */ +export function useChromeStorage<T>( + key: string, + defaultValue?: T, +): [T | undefined, (value: T) => Promise<void>, boolean] { + const [value, setValue] = useState<T | undefined>(defaultValue); + const [isLoading, setIsLoading] = useState(true); + const defaultValueRef = useRef(defaultValue); + + useEffect(() => { + const storage = new ChromeStorage(); + + // Load initial value + storage.get<T>(key).then((storedValue) => { + setValue(storedValue ?? 
defaultValueRef.current); + setIsLoading(false); + }); + + // Watch for changes + const unwatch = storage.watch(key, ({ newValue }) => { + setValue(newValue ?? defaultValueRef.current); + }); + + return unwatch; + }, [key]); + + const setStoredValue = async (newValue: T) => { + const storage = new ChromeStorage(); + await storage.set(key, newValue); + setValue(newValue); + }; + + return [value, setStoredValue, isLoading]; +} diff --git a/packages/aipex-react/src/lib/voice/elevenlabs-stt.ts b/packages/aipex-react/src/lib/voice/elevenlabs-stt.ts new file mode 100644 index 0000000..c62f4eb --- /dev/null +++ b/packages/aipex-react/src/lib/voice/elevenlabs-stt.ts @@ -0,0 +1,188 @@ +/** + * ElevenLabs Speech-to-Text Integration + * Uses ElevenLabs API for speech-to-text transcription + */ + +export interface ElevenLabsSTTConfig { + apiKey: string; + modelId?: string; + language?: string; +} + +export interface TranscriptionResult { + text: string; + confidence?: number; + error?: string; +} + +/** + * Transcribe audio using ElevenLabs API + */ +export async function transcribeAudio( + audioBlob: Blob, + config: ElevenLabsSTTConfig, +): Promise<TranscriptionResult> { + const { apiKey, modelId } = config; + + if (!apiKey) { + throw new Error("ElevenLabs API key is required"); + } + + try { + console.log( + "[ElevenLabs STT] Starting transcription, audio size:", + audioBlob.size, + ); + + // Prepare FormData + const formData = new FormData(); + // Use 'file' field name, filename based on actual format + formData.append("file", audioBlob, "audio.wav"); + + // Only add modelId if provided + if (modelId) { + formData.append("model_id", modelId); + } + + // Call ElevenLabs API + const response = await fetch( + "https://api.elevenlabs.io/v1/speech-to-text", + { + method: "POST", + headers: { + "xi-api-key": apiKey, + }, + body: formData, + }, + ); + + if (!response.ok) { + const errorText = await response.text(); + // Do not log full error response for security + console.error("[ElevenLabs STT] API 
error:", response.status); + + let errorMessage = `ElevenLabs API error: ${response.status}`; + try { + const errorJson = JSON.parse(errorText); + errorMessage = errorJson.detail || errorJson.message || errorMessage; + } catch { + // Use generic error message + } + + return { + text: "", + error: errorMessage, + }; + } + + const result = await response.json(); + // Log only success status, not the actual transcript (PII) + console.log("[ElevenLabs STT] Transcription completed successfully"); + + // ElevenLabs STT API response format: + // { text: string, language: string, confidence: number, ... } + const text = result.text || ""; + const confidence = result.confidence || result.language_probability || 1.0; + + return { + text: text.trim(), + confidence, + }; + } catch (error) { + console.error("[ElevenLabs STT] Transcription failed"); + return { + text: "", + error: error instanceof Error ? error.message : String(error), + }; + } +} + +/** + * Transcribe with retry mechanism + */ +export async function transcribeAudioWithRetry( + audioBlob: Blob, + config: ElevenLabsSTTConfig, + maxRetries: number = 2, +): Promise<TranscriptionResult> { + let lastError: string | undefined; + + for (let i = 0; i <= maxRetries; i++) { + if (i > 0) { + console.log(`[ElevenLabs STT] Retry attempt ${i}/${maxRetries}`); + // Wait before retry + await new Promise((resolve) => setTimeout(resolve, 1000 * i)); + } + + const result = await transcribeAudio(audioBlob, config); + + if (!result.error && result.text) { + return result; + } + + lastError = result.error; + } + + return { + text: "", + error: lastError || "Transcription failed after retries", + }; +} + +/** + * Validate API key + */ +export async function validateApiKey(apiKey: string): Promise<boolean> { + if (!apiKey) { + return false; + } + + try { + // Try to call API to get model list or user info + const response = await fetch("https://api.elevenlabs.io/v1/models", { + method: "GET", + headers: { + "xi-api-key": apiKey, + }, + }); + + return response.ok; 
} catch (_error) { + console.error("[ElevenLabs STT] API key validation failed"); + return false; + } +} + +/** + * Supported languages list + * ElevenLabs STT supports multiple languages using standard ISO 639-1 language codes + */ +export const SUPPORTED_LANGUAGES = [ + { code: "en", name: "English" }, + { code: "zh", name: "Chinese (中文)" }, + { code: "es", name: "Spanish" }, + { code: "fr", name: "French" }, + { code: "de", name: "German" }, + { code: "it", name: "Italian" }, + { code: "pt", name: "Portuguese" }, + { code: "pl", name: "Polish" }, + { code: "tr", name: "Turkish" }, + { code: "ru", name: "Russian" }, + { code: "nl", name: "Dutch" }, + { code: "cs", name: "Czech" }, + { code: "ar", name: "Arabic" }, + { code: "ja", name: "Japanese" }, + { code: "ko", name: "Korean" }, + { code: "hi", name: "Hindi" }, +] as const; + +/** + * Available ElevenLabs STT models + */ +export const AVAILABLE_MODELS = [ + { + id: "scribe_v1", + name: "Scribe v1 (Default)", + description: "High quality general transcription model", + }, +] as const; diff --git a/packages/aipex-react/src/lib/voice/index.ts b/packages/aipex-react/src/lib/voice/index.ts new file mode 100644 index 0000000..9cb7271 --- /dev/null +++ b/packages/aipex-react/src/lib/voice/index.ts @@ -0,0 +1,10 @@ +/** + * Voice module exports + */ + +export * from "./ai-config"; +export * from "./audio-recorder"; +export * from "./chrome-storage"; +export * from "./elevenlabs-stt"; +export * from "./server-stt"; +export * from "./vad-detector"; diff --git a/packages/aipex-react/src/lib/voice/server-stt.ts b/packages/aipex-react/src/lib/voice/server-stt.ts new file mode 100644 index 0000000..717f754 --- /dev/null +++ b/packages/aipex-react/src/lib/voice/server-stt.ts @@ -0,0 +1,157 @@ +/** + * Server-side Speech-to-Text Integration + * Uses claudechrome.com server API for speech-to-text transcription + */ + +import type { TranscriptionResult } from "./elevenlabs-stt"; + +export type ServerSTTConfig = Record; + 
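Both STT modules end with the same linear-backoff retry loop (wait `1000 * attempt` ms, succeed only on a non-empty transcript with no error). A self-contained sketch of that shared pattern — `withLinearRetry` and `baseDelayMs` are hypothetical names added here for illustration only; the diff itself keeps two separate copies of the loop:

```typescript
// Minimal shape shared by both STT results (mirrors TranscriptionResult).
interface RetryResult {
  text: string;
  error?: string;
}

// Hypothetical generic helper mirroring transcribeAudioWithRetry /
// transcribeAudioWithServerRetry: linear backoff between attempts,
// returning the first success or the last error seen.
async function withLinearRetry(
  op: () => Promise<RetryResult>,
  maxRetries: number = 2,
  baseDelayMs: number = 1000,
): Promise<RetryResult> {
  let lastError: string | undefined;
  for (let i = 0; i <= maxRetries; i++) {
    if (i > 0) {
      // Linear backoff: wait baseDelayMs * attempt before each retry
      await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * i));
    }
    const result = await op();
    // Success requires non-empty text AND no error, as in both modules
    if (!result.error && result.text) {
      return result;
    }
    lastError = result.error;
  }
  return { text: "", error: lastError ?? "Transcription failed after retries" };
}
```

With the default `maxRetries = 2` the operation runs at most three times, waiting 1 s and then 2 s between attempts, which matches the `1000 * i` delay in both retry wrappers.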
+interface ServerSTTResponse { + success: boolean; + transcript: string; + duration: number; + cost: number; + language: string; + speakers: unknown[]; + timestamp: string; +} + +/** + * Transcribe audio using server API + */ +export async function transcribeAudioWithServer( + audioBlob: Blob, +): Promise<TranscriptionResult> { + try { + console.log( + "[Server STT] Starting transcription, audio size:", + audioBlob.size, + ); + + // Get authentication cookies + let cookieHeader = ""; + try { + const cookies = await chrome.cookies.getAll({ + url: "https://www.claudechrome.com", + }); + + const relevantCookies = cookies.filter( + (cookie) => + cookie.name.includes("better-auth") || + cookie.name.includes("session"), + ); + + // Only store cookie names for logging, not values (security) + const cookieNames = relevantCookies.map((c) => c.name); + console.log( + "[Server STT] Found cookies:", + cookieNames.length > 0 ? "yes" : "no", + ); + + cookieHeader = relevantCookies + .map((cookie) => `${cookie.name}=${cookie.value}`) + .join("; "); + } catch (_error) { + console.warn("[Server STT] Failed to get cookies"); + } + + // Prepare FormData + const formData = new FormData(); + formData.append("audio", audioBlob, "audio.wav"); + + // Call server API + const headers: Record<string, string> = {}; + if (cookieHeader) { + headers["Cookie"] = cookieHeader; + } + + const response = await fetch( + "https://www.claudechrome.com/api/speech-to-text", + { + method: "POST", + headers, + body: formData, + }, + ); + + if (!response.ok) { + // Do not log detailed error response for security + console.error("[Server STT] API error:", response.status); + + let errorMessage = `Server STT error: ${response.status}`; + try { + const errorText = await response.text(); + const errorJson = JSON.parse(errorText); + errorMessage = errorJson.message || errorJson.error || errorMessage; + } catch { + // Use generic error message + } + + return { + text: "", + error: errorMessage, + }; + } + + const result: ServerSTTResponse = await 
response.json(); + // Log only success status, not the actual transcript (PII) + console.log( + "[Server STT] Transcription completed, success:", + result.success, + ); + + if (!result.success) { + return { + text: "", + error: "Server STT failed", + }; + } + + const text = result.transcript || ""; + // Server may not return confidence, use default + const confidence = 1.0; + + return { + text: text.trim(), + confidence, + }; + } catch (error) { + console.error("[Server STT] Transcription failed"); + return { + text: "", + error: error instanceof Error ? error.message : String(error), + }; + } +} + +/** + * Transcribe with retry mechanism + */ +export async function transcribeAudioWithServerRetry( + audioBlob: Blob, + maxRetries: number = 2, +): Promise<TranscriptionResult> { + let lastError: string | undefined; + + for (let i = 0; i <= maxRetries; i++) { + if (i > 0) { + console.log(`[Server STT] Retry attempt ${i}/${maxRetries}`); + // Wait before retry + await new Promise((resolve) => setTimeout(resolve, 1000 * i)); + } + + const result = await transcribeAudioWithServer(audioBlob); + + if (!result.error && result.text) { + return result; + } + + lastError = result.error; + } + + return { + text: "", + error: lastError || "Transcription failed after retries", + }; +} diff --git a/packages/aipex-react/src/lib/voice/vad-detector.ts b/packages/aipex-react/src/lib/voice/vad-detector.ts new file mode 100644 index 0000000..827aa8d --- /dev/null +++ b/packages/aipex-react/src/lib/voice/vad-detector.ts @@ -0,0 +1,268 @@ +/** + * VAD (Voice Activity Detection) Detector + * Uses @ricky0123/vad-web for voice activity detection + */ + +import { MicVAD } from "@ricky0123/vad-web"; + +export interface VADConfig { + positiveSpeechThreshold?: number; + negativeSpeechThreshold?: number; + minSpeechMs?: number; + preSpeechPadMs?: number; + redemptionMs?: number; + onSpeechStart?: () => void; + onSpeechEnd?: (audio: Float32Array) => void; + onVADMisfire?: () => void; + onVolumeChange?: (volume: number) 
=> void; +} + +export class VADDetector { + private vad: MicVAD | null = null; + private audioContext: AudioContext | null = null; + private analyser: AnalyserNode | null = null; + private microphone: MediaStreamAudioSourceNode | null = null; + private volumeCheckInterval: number | null = null; + private isRunning: boolean = false; + private config: VADConfig; + + constructor(config: VADConfig = {}) { + this.config = { + positiveSpeechThreshold: 0.8, + negativeSpeechThreshold: 0.5, + minSpeechMs: 150, + preSpeechPadMs: 300, + redemptionMs: 600, + ...config, + }; + } + + /** + * Initialize and start VAD + */ + async start(): Promise<void> { + if (this.isRunning) { + console.warn("[VAD] Already running"); + return; + } + + try { + console.log("[VAD] Requesting microphone access..."); + + // Request microphone permission + const stream = await navigator.mediaDevices.getUserMedia({ + audio: { + echoCancellation: true, + noiseSuppression: true, + autoGainControl: true, + }, + }); + + // Create AudioContext for volume detection + this.audioContext = new AudioContext(); + this.analyser = this.audioContext.createAnalyser(); + this.analyser.fftSize = 256; + this.microphone = this.audioContext.createMediaStreamSource(stream); + this.microphone.connect(this.analyser); + + // Start volume monitoring + this.startVolumeMonitoring(); + + console.log("[VAD] Initializing VAD..."); + + // Get asset paths (Chrome extension context) + const vadBasePath = chrome.runtime.getURL("assets/vad/"); + const onnxBasePath = chrome.runtime.getURL("assets/onnx/"); + + console.log("[VAD] Asset paths configured"); + + // Verify resources are accessible + try { + const modelUrl = chrome.runtime.getURL( + "assets/vad/silero_vad_legacy.onnx", + ); + const wasmUrl = chrome.runtime.getURL("assets/onnx/ort-wasm-simd.wasm"); + + console.log("[VAD] Checking resources accessibility..."); + + const [modelResp, wasmResp] = await Promise.all([ + fetch(modelUrl, { method: "HEAD" }), + fetch(wasmUrl, { method: "HEAD" 
}), + ]); + + console.log("[VAD] Resources check:", { + model: modelResp.ok, + wasm: wasmResp.ok, + }); + } catch (_e) { + console.warn("[VAD] Resource check failed"); + } + + // Configure onnxruntime-web paths + // @ts-expect-error - MicVAD uses ort internally + if (window.ort) { + // @ts-expect-error + window.ort.env.wasm.wasmPaths = onnxBasePath; + // Force single thread to avoid threaded WASM loading issues and SharedArrayBuffer compatibility + // @ts-expect-error + window.ort.env.wasm.numThreads = 1; + // Disable eval usage (onnxruntime-web may try to use new Function) + // @ts-expect-error + window.ort.env.wasm.proxy = false; + } + + this.vad = await MicVAD.new({ + baseAssetPath: vadBasePath, + onnxWASMBasePath: onnxBasePath, + positiveSpeechThreshold: this.config.positiveSpeechThreshold!, + negativeSpeechThreshold: this.config.negativeSpeechThreshold!, + minSpeechMs: this.config.minSpeechMs!, + preSpeechPadMs: this.config.preSpeechPadMs!, + redemptionMs: this.config.redemptionMs!, + onSpeechStart: () => { + console.log("[VAD] Speech started"); + this.config.onSpeechStart?.(); + }, + onSpeechEnd: (audio) => { + // Log only audio length, not content (privacy) + console.log("[VAD] Speech ended, audio samples:", audio.length); + this.config.onSpeechEnd?.(audio); + }, + onVADMisfire: () => { + console.log("[VAD] Misfire detected"); + this.config.onVADMisfire?.(); + }, + }); + + this.vad.start(); + this.isRunning = true; + console.log("[VAD] Started successfully"); + } catch (error) { + console.error("[VAD] Failed to start"); + this.cleanup(); + throw error; + } + } + + /** + * Stop VAD + */ + async stop(): Promise<void> { + if (!this.isRunning) { + console.log("[VAD] Already stopped, skipping"); + return; + } + + console.log("[VAD] Stopping VAD..."); + + // Immediately mark as not running + this.isRunning = false; + + // Stop volume monitoring + this.stopVolumeMonitoring(); + + // Stop VAD + if (this.vad) { + console.log("[VAD] Pausing MicVAD..."); + this.vad.pause(); 
+ this.vad = null; + } + + // Cleanup audio resources + this.cleanup(); + + console.log("[VAD] VAD stopped completely"); + } + + /** + * Pause VAD (without releasing resources) + */ + pause(): void { + if (this.vad && this.isRunning) { + this.vad.pause(); + this.stopVolumeMonitoring(); + console.log("[VAD] Paused"); + } + } + + /** + * Resume VAD + */ + resume(): void { + if (this.vad && this.isRunning) { + this.vad.start(); + this.startVolumeMonitoring(); + console.log("[VAD] Resumed"); + } + } + + /** + * Start volume monitoring + */ + private startVolumeMonitoring(): void { + if (!this.analyser || this.volumeCheckInterval !== null) { + return; + } + + const bufferLength = this.analyser.frequencyBinCount; + const dataArray = new Uint8Array(bufferLength); + + const checkVolume = () => { + if (!this.analyser) return; + + this.analyser.getByteFrequencyData(dataArray); + + // Calculate average volume + let sum = 0; + for (let i = 0; i < bufferLength; i++) { + sum += dataArray[i] ?? 0; + } + const average = sum / bufferLength; + + // Normalize to 0-1 + const volume = average / 255; + + this.config.onVolumeChange?.(volume); + }; + + // Check volume every 50ms + this.volumeCheckInterval = window.setInterval(checkVolume, 50); + } + + /** + * Stop volume monitoring + */ + private stopVolumeMonitoring(): void { + if (this.volumeCheckInterval !== null) { + clearInterval(this.volumeCheckInterval); + this.volumeCheckInterval = null; + } + } + + /** + * Cleanup resources + */ + private cleanup(): void { + if (this.microphone) { + this.microphone.disconnect(); + this.microphone = null; + } + + if (this.analyser) { + this.analyser.disconnect(); + this.analyser = null; + } + + if (this.audioContext) { + this.audioContext.close(); + this.audioContext = null; + } + } + + /** + * Check if VAD is running + */ + isActive(): boolean { + return this.isRunning; + } +} diff --git a/packages/aipex-react/src/types/chat.ts b/packages/aipex-react/src/types/chat.ts index 500453d..a0d47ba 
100644 --- a/packages/aipex-react/src/types/chat.ts +++ b/packages/aipex-react/src/types/chat.ts @@ -1,3 +1,4 @@ +import type { AgentMetrics } from "@aipexstudio/aipex-core"; import type { ComponentType, HTMLAttributes, ReactNode } from "react"; import type { ChatStatus, @@ -197,4 +198,9 @@ export interface ChatbotEventHandlers { onToolExecute?: (toolName: string, input: unknown) => void; /** Called when a tool completes */ onToolComplete?: (toolName: string, result: unknown) => void; + /** Called when metrics are updated */ + onMetricsUpdate?: (metrics: AgentMetrics, sessionId?: string) => void; } + +// Re-export AgentMetrics for convenience +export type { AgentMetrics } from "@aipexstudio/aipex-core"; diff --git a/packages/aipex-react/src/types/index.ts b/packages/aipex-react/src/types/index.ts index 0561c7c..7e3f6e7 100644 --- a/packages/aipex-react/src/types/index.ts +++ b/packages/aipex-react/src/types/index.ts @@ -1,6 +1,6 @@ // UI Types -export type { AppSettings } from "@aipexstudio/aipex-core"; +export type { AgentMetrics, AppSettings } from "@aipexstudio/aipex-core"; // Adapter Types export type { ChatAdapterOptions, ChatAdapterState } from "./adapter"; diff --git a/packages/browser-ext/manifest.json b/packages/browser-ext/manifest.json index 46df41f..50cab94 100644 --- a/packages/browser-ext/manifest.json +++ b/packages/browser-ext/manifest.json @@ -40,7 +40,7 @@ }, "web_accessible_resources": [ { - "resources": ["assets/*"], + "resources": ["assets/*", "assets/vad/*", "assets/onnx/*"], "matches": ["<all_urls>"] } ], diff --git a/packages/browser-ext/package.json b/packages/browser-ext/package.json index 680a185..ca88c47 100644 --- a/packages/browser-ext/package.json +++ b/packages/browser-ext/package.json @@ -37,6 +37,8 @@ "@aipexstudio/browser-runtime": "workspace:*", "@aipexstudio/dom-snapshot": "workspace:*", "@modelcontextprotocol/sdk": "^1.26.0", + "@ricky0123/vad-web": "^0.0.27", + "onnxruntime-web": "^1.22.0", "@radix-ui/react-avatar": "^1.1.11", 
"@radix-ui/react-collapsible": "^1.1.12", "@radix-ui/react-dialog": "^1.1.15", diff --git a/packages/browser-ext/src/auth/AuthProvider.tsx b/packages/browser-ext/src/auth/AuthProvider.tsx new file mode 100644 index 0000000..650eef2 --- /dev/null +++ b/packages/browser-ext/src/auth/AuthProvider.tsx @@ -0,0 +1,498 @@ +/** + * Authentication Provider for browser extension + * Manages user authentication state, cookie sync, and login/logout flows + */ + +import React, { + createContext, + type ReactNode, + useCallback, + useContext, + useEffect, + useState, +} from "react"; +import { AUTH_COOKIE_NAMES, WEBSITE_URL } from "../services/web-auth"; + +/** + * User data structure + */ +export interface User { + id: string; + name: string; + email: string; + image: string; + provider: string; +} + +/** + * Auth context type + */ +export interface AuthContextType { + user: User | null; + isLoading: boolean; + authChecked: boolean; + login: () => Promise<void>; + logout: () => Promise<void>; +} + +const AuthContext = createContext<AuthContextType | null>(null); + +interface AuthProviderProps { + children: ReactNode; +} + +/** + * Chrome storage wrapper for user data + */ +class AuthStorage { + private area = chrome.storage.local; + + async getUser(): Promise<User | null> { + try { + const result = await this.area.get("user"); + const user = result.user as User | undefined; + return user ?? 
null; + } catch (_error) { + console.error("[AuthProvider] Failed to get user from storage"); + return null; + } + } + + async setUser(user: User): Promise<void> { + try { + await this.area.set({ user }); + } catch (_error) { + console.error("[AuthProvider] Failed to save user to storage"); + } + } + + async removeUser(): Promise<void> { + try { + await this.area.remove("user"); + } catch (_error) { + console.error("[AuthProvider] Failed to remove user from storage"); + } + } +} + +const storage = new AuthStorage(); + +export const AuthProvider: React.FC<AuthProviderProps> = ({ children }) => { + const [user, setUser] = useState<User | null>(null); + const [isLoading, setIsLoading] = useState(true); + const [authChecked, setAuthChecked] = useState(false); + + /** + * Save user data to storage and state + */ + const saveUserData = useCallback(async (newUser: User) => { + await storage.setUser(newUser); + setUser(newUser); + }, []); + + /** + * Clear authentication data + */ + const clearAuthData = useCallback(async () => { + await storage.removeUser(); + setUser(null); + }, []); + + /** + * Check authentication via API + */ + const checkCookieAuth = useCallback(async (): Promise<boolean> => { + try { + console.log("[AuthProvider] Checking authentication via API..."); + + // Get all claudechrome.com cookies + const cookies = await chrome.cookies.getAll({ + url: WEBSITE_URL, + }); + + // Log only cookie count, not values (security) + console.log("[AuthProvider] Found cookies:", cookies.length); + + // Check if there are auth-related cookies + const hasAuthCookie = cookies.some( + (c) => c.name.includes("better-auth") || c.name.includes("session"), + ); + + if (!hasAuthCookie) { + console.log("[AuthProvider] No authentication cookies found"); + return false; + } + + // Call website's auth verify API + try { + const response = await fetch(`${WEBSITE_URL}/api/auth/verify`, { + method: "GET", + credentials: "include", + }); + + if (response.ok) { + const sessionData = await response.json(); + // Log only success status, not 
user data (PII) + console.log( + "[AuthProvider] API check successful:", + sessionData?.authenticated, + ); + + if (sessionData?.authenticated && sessionData?.user) { + const userData: User = { + id: sessionData.user.id || sessionData.user.email, + name: sessionData.user.name || sessionData.user.email, + email: sessionData.user.email, + image: sessionData.user.image || "", + provider: sessionData.user.provider || "email", + }; + + await saveUserData(userData); + console.log("[AuthProvider] User loaded from API"); + return true; + } + } else { + console.log("[AuthProvider] API returned:", response.status); + } + } catch (_apiError) { + console.log("[AuthProvider] Direct API call failed"); + } + + // If direct API call fails, try tab injection method + const tabs = await chrome.tabs.query({ url: `${WEBSITE_URL}/*` }); + const targetTab = tabs[0]; + + if (targetTab?.id) { + // Inject script to get session data + const results = await chrome.scripting.executeScript({ + target: { tabId: targetTab.id }, + func: async () => { + try { + const response = await fetch("/api/auth/verify", { + method: "GET", + credentials: "include", + }); + if (response.ok) { + return await response.json(); + } + return null; + } catch { + return null; + } + }, + }); + + const sessionData = results?.[0]?.result; + if (sessionData?.authenticated && sessionData?.user) { + const userData: User = { + id: sessionData.user.id || sessionData.user.email, + name: sessionData.user.name || sessionData.user.email, + email: sessionData.user.email, + image: sessionData.user.image || "", + provider: sessionData.user.provider || "email", + }; + + await saveUserData(userData); + console.log("[AuthProvider] User loaded from tab injection"); + return true; + } + } + + return false; + } catch (_error) { + console.error("[AuthProvider] Failed to check cookie auth"); + return false; + } + }, [saveUserData]); + + // Listen for message from auth success page + useEffect(() => { + const handleMessage = (event: 
MessageEvent) => { + if (event.origin !== WEBSITE_URL) return; + + if (event.data.type === "AUTH_SUCCESS") { + const { user: newUser } = event.data; + // Validate user structure before saving + if ( + newUser && + typeof newUser.email === "string" && + newUser.email.length > 0 && + newUser.email.length < 256 + ) { + saveUserData(newUser); + } + } + }; + + window.addEventListener("message", handleMessage); + return () => window.removeEventListener("message", handleMessage); + }, [saveUserData]); + + // Listen for tab updates to detect auth success page + useEffect(() => { + const handleTabUpdate = ( + tabId: number, + changeInfo: chrome.tabs.OnUpdatedInfo, + tab: chrome.tabs.Tab, + ) => { + if ( + changeInfo.status === "complete" && + tab.url && + tab.url.includes("/auth/extension-success") + ) { + // Delay check to ensure localStorage is set + setTimeout(async () => { + try { + const results = await chrome.scripting.executeScript({ + target: { tabId }, + func: () => { + const token = localStorage.getItem("extension_auth_token"); + const userStr = localStorage.getItem("extension_user"); + if (token && userStr) { + try { + const user = JSON.parse(userStr); + // Clear localStorage + localStorage.removeItem("extension_auth_token"); + localStorage.removeItem("extension_user"); + return { token, user }; + } catch { + return null; + } + } + return null; + }, + }); + + const result = results?.[0]?.result; + if ( + result?.user && + typeof result.user.email === "string" && + result.user.email.length > 0 + ) { + console.log("[AuthProvider] Got auth data from tab"); + await saveUserData(result.user); + } + } catch (_error) { + console.error("[AuthProvider] Error checking auth on tab"); + } + }, 1000); + } + }; + + if (typeof chrome !== "undefined" && chrome.tabs) { + chrome.tabs.onUpdated.addListener(handleTabUpdate); + return () => { + chrome.tabs.onUpdated.removeListener(handleTabUpdate); + }; + } + }, [saveUserData]); + + // Listen for cookie changes to sync website login 
state + useEffect(() => { + if (typeof chrome === "undefined" || !chrome.cookies) return; + + const handleCookieChange = async ( + changeInfo: chrome.cookies.CookieChangeInfo, + ) => { + // Only care about claudechrome.com domain auth cookies + if (!changeInfo.cookie.domain.includes("claudechrome.com")) return; + if (!AUTH_COOKIE_NAMES.includes(changeInfo.cookie.name)) return; + + console.log("[AuthProvider] Auth cookie changed:", { + name: changeInfo.cookie.name, + removed: changeInfo.removed, + }); + + if (changeInfo.removed) { + // Cookie was removed, user may have logged out on website + console.log("[AuthProvider] Auth cookie removed, checking..."); + setTimeout(async () => { + const hasAuthCookie = await chrome.cookies + .getAll({ url: WEBSITE_URL }) + .then((cookies) => + cookies.some((c) => AUTH_COOKIE_NAMES.includes(c.name)), + ); + + if (!hasAuthCookie && user) { + console.log("[AuthProvider] No auth cookies found, logging out"); + await clearAuthData(); + } + }, 500); + } else { + // Cookie was set or updated, user may have logged in + console.log("[AuthProvider] Auth cookie set, checking auth..."); + setTimeout(async () => { + const success = await checkCookieAuth(); + if (success) { + console.log("[AuthProvider] Successfully synced auth"); + } + }, 500); + } + }; + + chrome.cookies.onChanged.addListener(handleCookieChange); + + return () => { + chrome.cookies.onChanged.removeListener(handleCookieChange); + }; + }, [user, checkCookieAuth, clearAuthData]); + + // Initialize: load auth data from storage, check cookie if needed + useEffect(() => { + const loadAuthData = async () => { + try { + const savedUser = await storage.getUser(); + + if (savedUser) { + setUser(savedUser); + // Async validate cookie, don't block UI + checkCookieAuth() + .then((isValid) => { + if (!isValid) { + setUser(null); + storage.removeUser(); + } + }) + .catch(() => { + console.error("[AuthProvider] Failed to validate cookie"); + }); + } else { + // Async check cookie + 
checkCookieAuth().catch(() => { + console.error("[AuthProvider] Failed to check cookie auth"); + }); + } + } catch (_error) { + console.error("[AuthProvider] Failed to load auth data"); + } finally { + setIsLoading(false); + setAuthChecked(true); + } + }; + + loadAuthData(); + }, [checkCookieAuth]); + + const login = useCallback(async () => { + console.log("[AuthProvider] Login function called"); + try { + const authUrl = `${WEBSITE_URL}/auth/login?source=extension`; + console.log("[AuthProvider] Opening auth URL"); + + let tabCreated = false; + + if (typeof chrome !== "undefined" && chrome.tabs) { + try { + await chrome.tabs.create({ url: authUrl }); + console.log("[AuthProvider] Tab created successfully"); + tabCreated = true; + } catch (_chromeError) { + console.error("[AuthProvider] chrome.tabs.create failed"); + } + } + + // Fallback if Chrome API fails + if (!tabCreated) { + console.log("[AuthProvider] Using fallback method"); + try { + window.open(authUrl, "_blank"); + } catch { + window.location.href = authUrl; + } + } + } catch (_error) { + console.error("[AuthProvider] Login failed"); + } + }, []); + + const logout = useCallback(async () => { + try { + // 1. Clear local extension data + await clearAuthData(); + + // 2. Clear website cookies + const cookies = await chrome.cookies.getAll({ + url: WEBSITE_URL, + }); + + for (const cookie of cookies) { + if ( + cookie.name.includes("better-auth") || + cookie.name.includes("session") + ) { + await chrome.cookies.remove({ + url: WEBSITE_URL, + name: cookie.name, + }); + } + } + + // 3. Notify website to sign out + try { + await fetch(`${WEBSITE_URL}/api/auth/signout`, { + method: "POST", + headers: { + "Content-Type": "application/json", + }, + }); + } catch { + console.warn("[AuthProvider] Failed to sign out from website"); + } + + // 4. 
Clear all related localStorage data + if (typeof chrome !== "undefined" && chrome.tabs) { + try { + const tabs = await chrome.tabs.query({}); + for (const tab of tabs) { + if (tab.id && tab.url && tab.url.includes(WEBSITE_URL)) { + try { + await chrome.scripting.executeScript({ + target: { tabId: tab.id }, + func: () => { + localStorage.removeItem("extension_user"); + Object.keys(localStorage).forEach((key) => { + if (key.startsWith("better-auth")) { + localStorage.removeItem(key); + } + }); + }, + }); + } catch { + // Ignore inaccessible tabs + } + } + } + } catch { + console.warn("[AuthProvider] Failed to clear localStorage"); + } + } + + console.log("[AuthProvider] Logout completed successfully"); + } catch (_error) { + console.error("[AuthProvider] Logout failed"); + } + }, [clearAuthData]); + + const contextValue: AuthContextType = { + user, + isLoading: isLoading && !authChecked, // Only show loading when not checked + authChecked, + login, + logout, + }; + + return ( + <AuthContext.Provider value={contextValue}>{children}</AuthContext.Provider> + ); +}; + +/** + * Hook to access auth context + */ +export const useAuth = (): AuthContextType => { + const context = useContext(AuthContext); + if (!context) { + throw new Error("useAuth must be used within AuthProvider"); + } + return context; +}; diff --git a/packages/browser-ext/src/auth/UserProfile.tsx b/packages/browser-ext/src/auth/UserProfile.tsx new file mode 100644 index 0000000..9cd15fc --- /dev/null +++ b/packages/browser-ext/src/auth/UserProfile.tsx @@ -0,0 +1,131 @@ +/** + * User Profile dropdown component + * Displays user avatar and provides account/logout options + */ + +import type React from "react"; +import { useState } from "react"; +import { WEBSITE_URL } from "../services/web-auth"; +import { useAuth } from "./AuthProvider"; + +export const UserProfile: React.FC = () => { + const { user, logout } = useAuth(); + const [showDropdown, setShowDropdown] = useState(false); + + if (!user) return null; + + const handleLogout = async () => { + setShowDropdown(false); 
+ await logout(); + }; + + const handleAccountClick = () => { + setShowDropdown(false); + chrome.tabs.create({ url: `${WEBSITE_URL}/settings/credits` }); + }; + + return ( + <div> + {/* User Avatar Button */} + <button type="button" onClick={() => setShowDropdown(!showDropdown)} aria-label="Open user menu"> + <img src={user.image} alt={user.name || user.email} /> + </button> + + {/* Dropdown Menu */} + {showDropdown && ( + <div> + {/* User Info */} + <div> + <div> + {user.name || user.email.split("@")[0]} + </div> + <div> + {user.email} + </div> + </div> + + {/* Account Button */} + <button type="button" onClick={handleAccountClick}> + Account + </button> + + {/* Logout Button */} + <button type="button" onClick={handleLogout}> + Logout + </button> + </div> + )} + + {/* Click outside to close dropdown */} + {showDropdown && ( + <div + onClick={() => setShowDropdown(false)} + onKeyDown={(e) => { + if (e.key === "Escape") { + setShowDropdown(false); + } + }} + role="button" + tabIndex={0} + aria-label="Close dropdown" + /> + )} + </div> + ); +}; + +export default UserProfile; diff --git a/packages/browser-ext/src/auth/index.ts b/packages/browser-ext/src/auth/index.ts new file mode 100644 index 0000000..fd70852 --- /dev/null +++ b/packages/browser-ext/src/auth/index.ts @@ -0,0 +1,7 @@ +/** + * Auth module exports + */ + +export type { AuthContextType, User } from "./AuthProvider"; +export { AuthProvider, useAuth } from "./AuthProvider"; +export { UserProfile } from "./UserProfile"; diff --git a/packages/browser-ext/src/lib/automation-mode-toolbar.tsx b/packages/browser-ext/src/lib/automation-mode-toolbar.tsx index 26aad09..0b96c37 100644 --- a/packages/browser-ext/src/lib/automation-mode-toolbar.tsx +++ b/packages/browser-ext/src/lib/automation-mode-toolbar.tsx @@ -11,6 +11,7 @@ import { validateAutomationMode, } from "@aipexstudio/aipex-core"; import type { InputToolbarSlotProps } from "@aipexstudio/aipex-react"; +import { TokenUsageIndicator } from "@aipexstudio/aipex-react/components/chatbot"; import { Button } from "@aipexstudio/aipex-react/components/ui/button"; import { DropdownMenu, @@ -78,6 +79,9 @@ export function AutomationModeInputToolbar({ return (
+ {/* Token Usage Indicator - compact mode next to automation mode */} + <TokenUsageIndicator /> + + {/* Automation Mode Selector */} diff --git a/packages/browser-ext/src/lib/browser-agent-config.ts b/packages/browser-ext/src/lib/browser-agent-config.ts index 36f652d..14f31c4 100644 --- a/packages/browser-ext/src/lib/browser-agent-config.ts +++ b/packages/browser-ext/src/lib/browser-agent-config.ts @@ -110,5 +110,5 @@ export function useBrowserTools(): FunctionTool[] { export const BROWSER_AGENT_CONFIG = { instructions: SYSTEM_PROMPT, name: "AIPex Browser Assistant", - maxTurns: 10, + maxTurns: 2000, } as const; diff --git a/packages/browser-ext/src/lib/browser-chat-header.tsx b/packages/browser-ext/src/lib/browser-chat-header.tsx index 46ff66e..5deb85c 100644 --- a/packages/browser-ext/src/lib/browser-chat-header.tsx +++ b/packages/browser-ext/src/lib/browser-chat-header.tsx @@ -1,6 +1,6 @@ /** * BrowserChatHeader - * Custom header with conversation persistence, history dropdown, intervention toggle + * Custom header with conversation persistence and history dropdown */ import { useChatContext } from "@aipexstudio/aipex-react/components/chatbot"; @@ -12,9 +12,8 @@ import type { HeaderProps } from "@aipexstudio/aipex-react/types"; import { conversationStorage } from "@aipexstudio/browser-runtime"; import { PlusIcon, SettingsIcon } from "lucide-react"; import { useCallback, useEffect, useRef, useState } from "react"; +import { UserProfile, useAuth } from "../auth"; import { ConversationHistory } from "./conversation-history"; -import { useInterventionMode } from "./intervention-mode-context"; -import { InterventionModeToggleHeader } from "./intervention-ui"; import { fromStorageFormat, toStorageFormat } from "./message-adapter"; export function BrowserChatHeader({ @@ -28,7 +27,7 @@ export function BrowserChatHeader({ const { t } = useTranslation(); const runtime = getRuntime(); const { messages, setMessages, interrupt } = useChatContext(); - const { mode, setMode } = useInterventionMode(); 
+ const { user, login, isLoading: isAuthLoading } = useAuth(); const [currentConversationId, setCurrentConversationId] = useState< string | undefined @@ -146,26 +145,35 @@ export function BrowserChatHeader({ {t("common.settings")} - {/* Center - Intervention toggle and History */} -
- - -
+ {/* Center - History */} + - {/* Right side - New Chat */} - + {/* Right side - New Chat and User Profile */} +
+ + + {/* User Profile or Login Button */} + {!isAuthLoading && + (user ? ( + + ) : ( + + ))} +
{children}
diff --git a/packages/browser-ext/src/lib/intervention-ui.tsx b/packages/browser-ext/src/lib/intervention-ui.tsx index 8ff7edf..0ec280d 100644 --- a/packages/browser-ext/src/lib/intervention-ui.tsx +++ b/packages/browser-ext/src/lib/intervention-ui.tsx @@ -65,8 +65,27 @@ export function InterventionUI({ mode }: InterventionUIProps) { setTimeout(() => setCurrentIntervention(null), 3000); }; - const handleInterventionCancel = () => { - setCurrentIntervention(null); + const handleInterventionCancel = (event: InterventionEvent) => { + // Update state to show the cancellation reason instead of immediately hiding + const current = interventionManager.getCurrentIntervention(); + if (current) { + setCurrentIntervention(current); + } else if ( + event.data && + typeof event.data === "object" && + "result" in event.data + ) { + // If we don't have current intervention but have result data, + // log the cancel reason for debugging + const result = (event.data as { result?: { error?: string } }).result; + if (result?.error) { + console.log( + `[InterventionUI] Intervention cancelled: ${result.error}`, + ); + } + } + // Keep visible briefly so user can see the cancellation, then hide + setTimeout(() => setCurrentIntervention(null), 2000); }; const handleInterventionTimeout = () => { diff --git a/packages/browser-ext/src/lib/message-adapter.ts b/packages/browser-ext/src/lib/message-adapter.ts index f479f65..558266c 100644 --- a/packages/browser-ext/src/lib/message-adapter.ts +++ b/packages/browser-ext/src/lib/message-adapter.ts @@ -29,10 +29,15 @@ export function toStorageFormat( case "tool": // Map tool to tool_use or tool_result based on state if (part.output !== undefined) { + // Avoid double-stringifying if output is already a string + const content = + typeof part.output === "string" + ? 
part.output + : JSON.stringify(part.output); return { type: "tool_result", tool_use_id: part.toolCallId, - content: JSON.stringify(part.output), + content, is_error: part.state === "error", }; } @@ -56,15 +61,79 @@ } /** - * Convert runtime UIMessage back to aipex-react UIMessage for display + * Safely parse a JSON string, returning undefined on failure + */ +function safeJsonParse<T>(value: unknown): T | undefined { + if (typeof value !== "string") { + return undefined; + } + try { + return JSON.parse(value) as T; + } catch { + return undefined; + } +} + +/** + * Check if a tool result indicates a business-level failure. + * Many tools return { success: false, error: "..." } instead of throwing. + */ +function extractBusinessFailure( + result: unknown, +): { errorMessage: string } | null { + if (result === null || result === undefined) { + return null; + } + + if (typeof result !== "object") { + return null; + } + + const obj = result as Record<string, unknown>; + + // Check for common failure patterns: { success: false, error: ... } + if (obj.success === false) { + // Extract error message + if (typeof obj.error === "string" && obj.error.length > 0) { + return { errorMessage: obj.error }; + } + if (typeof obj.message === "string" && obj.message.length > 0) { + return { errorMessage: obj.message }; + } + // Generic failure message + return { errorMessage: "Operation failed" }; + } + + return null; +} + +/** + * Convert runtime UIMessage back to aipex-react UIMessage for display. 
+ * This function: + * - Correlates tool_use and tool_result parts by id to restore proper toolName and input + * - Parses JSON-stringified tool content + * - Detects {success: false, error} patterns and sets state/errorText accordingly */ export function fromStorageFormat( messages: RuntimeUIMessage[], ): ReactUIMessage[] { - return messages.map((msg) => ({ - id: msg.id, - role: msg.role as ReactUIMessage["role"], - parts: msg.parts.map((part) => { + return messages.map((msg) => { + // First pass: build a map of tool_use parts by their ID + const toolUseMap = new Map< + string, + { name: string; input: Record<string, unknown> } + >(); + for (const part of msg.parts) { + if (part.type === "tool_use") { + toolUseMap.set(part.id, { + name: part.name, + input: part.input, + }); + } + } + + // Second pass: convert parts with proper correlation + const convertedParts = msg.parts.map((part) => { switch (part.type) { case "text": return { type: "text", text: part.text }; @@ -77,26 +146,126 @@ url: part.imageData, }; case "tool_use": + // We'll merge this with tool_result if both exist, + // but if no result, show as executing/pending return { type: "tool", toolName: part.name, toolCallId: part.id, input: part.input, - state: "completed" as const, + state: "pending" as const, }; - case "tool_result": + case "tool_result": { + // Correlate with tool_use to get proper toolName and input + const toolUseInfo = toolUseMap.get(part.tool_use_id); + const toolName = toolUseInfo?.name ?? "unknown"; + const input = toolUseInfo?.input ?? 
{}; + + // Parse the content - it may be JSON-stringified + let parsedOutput: unknown = part.content; + const parsed = safeJsonParse(part.content); + if (parsed !== undefined) { + parsedOutput = parsed; + } + + // Check for is_error flag first + if (part.is_error) { + // Extract error message from the parsed output if possible + let errorText = "Tool execution failed"; + if (typeof parsedOutput === "string" && parsedOutput.length > 0) { + errorText = parsedOutput; + } else if ( + typeof parsedOutput === "object" && + parsedOutput !== null + ) { + const obj = parsedOutput as Record<string, unknown>; + if (typeof obj.error === "string") { + errorText = obj.error; + } else if (typeof obj.message === "string") { + errorText = obj.message; + } + } + return { + type: "tool", + toolName, + toolCallId: part.tool_use_id, + input, + output: parsedOutput, + state: "error" as const, + errorText, + }; + } + + // Check for business-level failure ({success: false, error: ...}) + const failureInfo = extractBusinessFailure(parsedOutput); + if (failureInfo) { + return { + type: "tool", + toolName, + toolCallId: part.tool_use_id, + input, + output: parsedOutput, + state: "error" as const, + errorText: failureInfo.errorMessage, + }; + } + + // Normal successful completion return { type: "tool", - toolName: "unknown", + toolName, toolCallId: part.tool_use_id, - input: {}, - output: part.content, - state: part.is_error ? 
("error" as const) : ("completed" as const), + input, + output: parsedOutput, + state: "completed" as const, }; + } default: return { type: "text", text: "[unknown]" }; } - }), - timestamp: msg.timestamp, - })) as ReactUIMessage[]; + }); + + // Third pass: merge tool_use with tool_result if both exist for the same call + // This avoids showing duplicate tool parts + const mergedParts: (typeof convertedParts)[number][] = []; + const processedToolCallIds = new Set(); + + for (const part of convertedParts) { + if (part.type === "tool") { + const toolCallId = part.toolCallId; + // Skip if toolCallId is missing or we've already processed this tool call + if (!toolCallId || processedToolCallIds.has(toolCallId)) { + continue; + } + + // Find if there's a corresponding result for this tool call + const resultPart = convertedParts.find( + (p) => + p.type === "tool" && + p.toolCallId === toolCallId && + p.state !== "pending" && + p !== part, + ); + + if (resultPart && resultPart.type === "tool") { + // Use the result part (which has the full info) + mergedParts.push(resultPart); + } else { + // No result, use the original part + mergedParts.push(part); + } + + processedToolCallIds.add(toolCallId); + } else { + mergedParts.push(part); + } + } + + return { + id: msg.id, + role: msg.role as ReactUIMessage["role"], + parts: mergedParts, + timestamp: msg.timestamp, + }; + }) as ReactUIMessage[]; } diff --git a/packages/browser-ext/src/pages/common/app-root.tsx b/packages/browser-ext/src/pages/common/app-root.tsx index 1b6d784..5a0bc0c 100644 --- a/packages/browser-ext/src/pages/common/app-root.tsx +++ b/packages/browser-ext/src/pages/common/app-root.tsx @@ -13,6 +13,7 @@ import type { Theme } from "@aipexstudio/aipex-react/theme/types"; import { ChromeStorageAdapter } from "@aipexstudio/browser-runtime"; import React, { useState } from "react"; import ReactDOM from "react-dom/client"; +import { AuthProvider } from "../../auth"; import { chromeStorageAdapter } from "../../hooks"; 
import { AutomationModeInputToolbar } from "../../lib/automation-mode-toolbar"; import { @@ -97,7 +98,9 @@ export function renderChatApp() { const App = () => ( - + + + ); diff --git a/packages/browser-ext/src/services/web-auth.ts b/packages/browser-ext/src/services/web-auth.ts new file mode 100644 index 0000000..387f277 --- /dev/null +++ b/packages/browser-ext/src/services/web-auth.ts @@ -0,0 +1,63 @@ +/** + * Public website configuration and authentication cookie utilities + */ + +export const WEBSITE_URL = "https://www.claudechrome.com"; + +/** + * Aggregate claudechrome website authentication cookies and generate Cookie header content. + * Note: The returned header contains real cookie values; logs record only cookie presence, never values (security). + */ +export async function getAuthCookieHeader(): Promise<string | undefined> { + try { + const cookies = await chrome.cookies.getAll({ url: WEBSITE_URL }); + + const relevantCookies = cookies.filter( + (cookie) => + cookie.name.includes("better-auth") || cookie.name.includes("session"), + ); + + if (!relevantCookies.length) { + console.log("[web-auth] No auth cookies found"); + return undefined; + } + + // Log only cookie presence, not values + console.log("[web-auth] Found auth cookies:", relevantCookies.length); + + return relevantCookies + .map((cookie) => `${cookie.name}=${cookie.value}`) + .join("; "); + } catch (_error) { + console.warn("[web-auth] Failed to get cookies"); + return undefined; + } +} + +/** + * Check if user has authentication cookies (without reading values) + */ +export async function hasAuthCookies(): Promise<boolean> { + try { + const cookies = await chrome.cookies.getAll({ url: WEBSITE_URL }); + return cookies.some( + (cookie) => + cookie.name.includes("better-auth") || cookie.name.includes("session"), + ); + } catch (_error) { + console.warn("[web-auth] Failed to check cookies"); + return false; + } +} + +/** + * List of known auth cookie names + */ +export const AUTH_COOKIE_NAMES = [ + "__Secure-next-auth.session-token", 
"next-auth.session-token", + "__Secure-better-auth.session_token", + "better-auth.session_token", + "__Secure-better-auth.session_data", + "better-auth.session_data", +]; diff --git a/packages/browser-ext/vite.config.ts b/packages/browser-ext/vite.config.ts index 61ba59c..d125d8e 100644 --- a/packages/browser-ext/vite.config.ts +++ b/packages/browser-ext/vite.config.ts @@ -20,6 +20,28 @@ export default defineConfig({ src: "host-access-config.json", dest: ".", }, + // VAD assets for voice mode + { + src: "node_modules/@ricky0123/vad-web/dist/vad.worklet.bundle.min.js", + dest: "assets/vad", + }, + { + src: "node_modules/@ricky0123/vad-web/dist/silero_vad_legacy.onnx", + dest: "assets/vad", + }, + { + src: "node_modules/@ricky0123/vad-web/dist/silero_vad_v5.onnx", + dest: "assets/vad", + }, + // ONNX runtime WASM files for VAD + { + src: "node_modules/onnxruntime-web/dist/*.wasm", + dest: "assets/onnx", + }, + { + src: "node_modules/onnxruntime-web/dist/*.mjs", + dest: "assets/onnx", + }, ], }), ], diff --git a/packages/browser-runtime/README.md b/packages/browser-runtime/README.md index 0934092..ed58dc7 100644 --- a/packages/browser-runtime/README.md +++ b/packages/browser-runtime/README.md @@ -23,26 +23,48 @@ AIPex is split into layers so each stays focused: ## Features -### 1) `allBrowserTools` (31 tools) +### 1) `allBrowserTools` (32 tools) `allBrowserTools` is a curated bundle of `FunctionTool`s that an agent can call. 
It includes: -- **Tab management**: list/switch/open/duplicate/close, basic grouping helpers -- **UI operations**: locate elements, click, hover, fill inputs/forms, wait +- **Tab management**: list/open/close, basic grouping helpers +- **UI operations**: locate elements, click, hover, fill inputs/forms, computer tool - **Page content**: metadata, scrolling, highlighting -- **Screenshots**: capture to data URL / clipboard -- **Downloads**: save text/images from the agent workflow +- **Screenshots**: capture to data URL +- **Downloads**: save images from the agent workflow - **Human-in-the-loop interventions**: request/cancel interventions +- **Skills**: load/execute skill scripts Tool names included (strings used for tool-calling): -- Tabs: `get_all_tabs`, `get_current_tab`, `switch_to_tab`, `create_new_tab`, `get_tab_info`, `duplicate_tab`, `close_tab`, `organize_tabs`, `ungroup_tabs` -- UI ops: `search_elements`, `click`, `fill_element_by_uid`, `get_editor_value`, `fill_form`, `hover_element_by_uid`, `wait` -- Page: `get_page_metadata`, `scroll_to_element`, `highlight_element`, `highlight_text_inline` -- Screenshot: `capture_screenshot`, `capture_tab_screenshot`, `capture_screenshot_to_clipboard` -- Download: `download_text_as_markdown`, `download_image`, `download_chat_images`, `download_current_chat_images` -- Interventions: `list_interventions`, `get_intervention_info`, `request_intervention`, `cancel_intervention` +- Tabs (7): `get_all_tabs`, `get_current_tab`, `create_new_tab`, `get_tab_info`, `close_tab`, `organize_tabs`, `ungroup_tabs` +- UI ops (7): `search_elements`, `click`, `fill_element_by_uid`, `get_editor_value`, `fill_form`, `hover_element_by_uid`, `computer` +- Page (4): `get_page_metadata`, `scroll_to_element`, `highlight_element`, `highlight_text_inline` +- Screenshot (2): `capture_screenshot`, `capture_tab_screenshot` +- Download (2): `download_image`, `download_chat_images` +- Interventions (4): `list_interventions`, `get_intervention_info`, 
`request_intervention`, `cancel_intervention` +- Skills (6): `load_skill`, `execute_skill_script`, and 4 other skill tools + +**Disabled tools** (exist in code but not in default bundle): + +- `switch_to_tab`: causes context switching issues +- `duplicate_tab`: not enabled +- `wait`: deprecated, replaced by `computer` tool's wait action +- `capture_screenshot_to_clipboard`: not enabled +- `download_text_as_markdown`: not enabled +- `download_current_chat_images`: architecture issue + +**Available but not registered by default** (can be imported separately): + +- Bookmarks: `list_bookmarks`, `search_bookmarks`, `create_bookmark`, `delete_bookmark`, etc. (`tools/bookmark.ts`) +- History: `get_recent_history`, `search_history`, `delete_history_item`, `clear_history`, etc. (`tools/history.ts`) +- Clipboard: `copy_to_clipboard`, `read_from_clipboard`, `copy_page_as_markdown`, etc. (`tools/tools/clipboard/`) +- Window management: `get_all_windows`, `switch_to_window`, `create_new_window`, etc. (`tools/tools/window-management/`) +- Sessions: `get_all_sessions`, `restore_session`, etc. (`tools/tools/sessions/`) +- Extensions: `get_all_extensions`, `set_extension_enabled`, etc. (`tools/tools/extensions/`) +- Context menus: `create_context_menu_item`, etc. (`tools/tools/context-menus/`) +- Tab groups: `create_tab_group`, `get_all_tab_groups`, etc. (`tools/tools/tab-groups/`) > Note: `take_snapshot` exists but is intentionally not included in `allBrowserTools` because it is used internally. 
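
The opt-in tool groups listed above are combined with the default bundle by plain array concatenation on the embedder side. A minimal, self-contained sketch of that merge step (tool groups are represented as name arrays purely for illustration; the real exports are `FunctionTool` objects from the modules listed above, and the keep-first duplicate policy is an assumption of this sketch):

```typescript
// Default bundle names (a small subset of the default list above).
const defaultBundle: string[] = ["get_all_tabs", "click", "capture_screenshot"];

// Opt-in groups are imported separately and appended by the embedder.
const bookmarkTools: string[] = ["list_bookmarks", "search_bookmarks", "create_bookmark"];
const historyTools: string[] = ["get_recent_history", "search_history"];

// Combine groups in order, keeping the first occurrence of each name,
// since tool-calling dispatches on the name string.
function combine(...groups: string[][]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const group of groups) {
    for (const name of group) {
      if (seen.has(name)) continue;
      seen.add(name);
      out.push(name);
    }
  }
  return out;
}

const registered = combine(defaultBundle, bookmarkTools, historyTools);
console.log(registered.length); // → 8
```

A duplicate name is silently skipped here; a real registry may prefer to throw on duplicates instead.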
diff --git a/packages/browser-runtime/src/intervention/index.ts b/packages/browser-runtime/src/intervention/index.ts index 36963d8..2b7d1a2 100644 --- a/packages/browser-runtime/src/intervention/index.ts +++ b/packages/browser-runtime/src/intervention/index.ts @@ -13,7 +13,10 @@ export { userSelectionIntervention, } from "./implementations/user-selection.js"; export { voiceInputIntervention } from "./implementations/voice-input.js"; -export { interventionManager } from "./intervention-manager.js"; +export { + type CancelReason, + interventionManager, +} from "./intervention-manager.js"; export { interventionRegistry } from "./intervention-registry.js"; // Types export type { diff --git a/packages/browser-runtime/src/intervention/intervention-manager.ts b/packages/browser-runtime/src/intervention/intervention-manager.ts index aa32bf1..4b9b4b7 100644 --- a/packages/browser-runtime/src/intervention/intervention-manager.ts +++ b/packages/browser-runtime/src/intervention/intervention-manager.ts @@ -23,6 +23,34 @@ import type { type EventListener = (event: InterventionEvent) => void; +/** + * Reasons why an intervention can be cancelled. + * Used to provide more informative error messages to users. 
+ */ +export type CancelReason = + | "user" // User clicked cancel + | "tab_switched" // User switched to another tab + | "page_navigated" // Page URL changed + | "mode_disabled"; // Intervention mode was disabled + +/** + * Map cancel reasons to human-readable messages + */ +function getCancelMessage(reason: CancelReason): string { + switch (reason) { + case "user": + return "Cancelled by user"; + case "tab_switched": + return "Cancelled: browser tab switched"; + case "page_navigated": + return "Cancelled: page navigated to a different URL"; + case "mode_disabled": + return "Cancelled: intervention mode was disabled"; + default: + return "Intervention was cancelled"; + } +} + export class InterventionManager { private static instance: InterventionManager; private currentIntervention: InterventionState | null = null; @@ -69,7 +97,10 @@ export class InterventionManager { // If switching to disabled, cancel all ongoing interventions if (mode === "disabled" && this.currentIntervention) { - this.cancelIntervention(this.currentIntervention.request.id); + this.cancelIntervention( + this.currentIntervention.request.id, + "mode_disabled", + ); } } @@ -278,8 +309,10 @@ export class InterventionManager { /** * Cancel an intervention + * @param id - The intervention ID to cancel + * @param reason - Why the intervention is being cancelled (defaults to "user") */ - cancelIntervention(id: string): boolean { + cancelIntervention(id: string, reason: CancelReason = "user"): boolean { if ( !this.currentIntervention || this.currentIntervention.request.id !== id @@ -290,7 +323,10 @@ export class InterventionManager { return false; } - console.log(`[InterventionManager] Cancelling intervention: ${id}`); + const cancelMessage = getCancelMessage(reason); + console.log( + `[InterventionManager] Cancelling intervention: ${id} (reason: ${reason})`, + ); // Cancel operation if (this.abortController) { @@ -304,7 +340,7 @@ export class InterventionManager { const result: InterventionResult = { 
success: false, - error: "Cancelled by user", + error: cancelMessage, status: "cancelled", timestamp: Date.now(), duration: Date.now() - this.currentIntervention.startTime, @@ -314,7 +350,7 @@ export class InterventionManager { this.currentIntervention.result = result; this.currentIntervention.endTime = Date.now(); - this.emitEvent("cancel", id, { result }); + this.emitEvent("cancel", id, { result, reason }); this.processNextRequest(); return true; @@ -389,7 +425,10 @@ export class InterventionManager { console.log( "[InterventionManager] Tab switched, cancelling intervention", ); - this.cancelIntervention(this.currentIntervention.request.id); + this.cancelIntervention( + this.currentIntervention.request.id, + "tab_switched", + ); } } }); @@ -405,7 +444,10 @@ export class InterventionManager { console.log( "[InterventionManager] Page navigated, cancelling intervention", ); - this.cancelIntervention(this.currentIntervention.request.id); + this.cancelIntervention( + this.currentIntervention.request.id, + "page_navigated", + ); } } }); diff --git a/packages/browser-runtime/src/lib/vm/quickjs-manager.ts b/packages/browser-runtime/src/lib/vm/quickjs-manager.ts index 1b6a2d8..7d7e85e 100644 --- a/packages/browser-runtime/src/lib/vm/quickjs-manager.ts +++ b/packages/browser-runtime/src/lib/vm/quickjs-manager.ts @@ -7,6 +7,9 @@ */ import { default as RELEASE_SYNC } from "@jitl/quickjs-ng-wasmfile-release-sync"; +// Import the WASM file as a URL so Vite/bundler handles it correctly +// This ensures the wasm is properly bundled and the URL is correct at runtime +import quickjsWasmUrl from "@jitl/quickjs-ng-wasmfile-release-sync/wasm?url"; import fs from "@zenfs/core"; import type { QuickJSContext, @@ -17,6 +20,18 @@ import type { import { newQuickJSWASMModuleFromVariant, Scope } from "quickjs-emscripten"; import type { SkillAPIBridge } from "./skill-api"; +/** + * QuickJS sync variant interface - matches the structure expected by newQuickJSWASMModuleFromVariant + */ 
+interface QuickJSVariantLike { + importModuleLoader: () => Promise< + (options?: Record<string, unknown>) => unknown + >; +} + +// Type assertion for the variant - the default export type is not fully recognized +const variant = RELEASE_SYNC as unknown as QuickJSVariantLike; + interface ExecutionContext { skillId: string; workingDir: string; @@ -54,9 +69,45 @@ class QuickJSManager { "[QuickJS] Initializing runtime with RELEASE_SYNC variant...", ); + // Sanity check: ensure the WASM URL was properly resolved by Vite + if (!quickjsWasmUrl) { + throw new Error( + "[QuickJS] WASM URL is not defined. Vite may not have bundled the wasm file correctly.", + ); + } + console.log(`[QuickJS] WASM URL resolved to: ${quickjsWasmUrl}`); + // Use RELEASE_SYNC variant (required for Chrome extensions due to CSP restrictions) // Chrome extensions don't allow 'wasm-eval' which asyncify variants need - this.quickjs = await newQuickJSWASMModuleFromVariant(RELEASE_SYNC); + // Wrap the variant to override locateFile so the Emscripten loader can find the wasm + const variantWithLocateFile = { + ...variant, + importModuleLoader: async () => { + // Get the original module loader + const originalLoader = await variant.importModuleLoader(); + // Return a wrapped version that injects locateFile + return (moduleOptions?: Record<string, unknown>) => { + return originalLoader({ + ...moduleOptions, + // Override locateFile to return the correct URL for the wasm file + locateFile: (path: string, prefix: string) => { + if (path.endsWith(".wasm")) { + console.log( + `[QuickJS] locateFile intercepted for ${path}, returning: ${quickjsWasmUrl}`, + ); + return quickjsWasmUrl; + } + // For non-wasm files, use the default behavior + return prefix + path; + }, + }); + }; + }, + }; + + this.quickjs = await newQuickJSWASMModuleFromVariant( + variantWithLocateFile, + ); this.runtime = this.quickjs.newRuntime(); // Set memory and stack limits diff --git a/packages/browser-runtime/src/skill/lib/services/skill-manager.ts 
b/packages/browser-runtime/src/skill/lib/services/skill-manager.ts index acfe1fa..888db48 100644 --- a/packages/browser-runtime/src/skill/lib/services/skill-manager.ts +++ b/packages/browser-runtime/src/skill/lib/services/skill-manager.ts @@ -450,6 +450,106 @@ export class SkillManager { } } + /** + * Refresh skill metadata from SKILL.md file. + * This is called when SKILL.md is edited and saved via the file manager. + * It re-parses the frontmatter and updates both IndexedDB and the registry. + */ + async refreshSkillMetadata(skillId: string): Promise<void> { + if (!this.initialized) { + throw new Error("SkillManager not initialized"); + } + + // Validate skillId to prevent path traversal + if ( + !skillId || + skillId.includes("/") || + skillId.includes("\\") || + skillId.includes("..") + ) { + throw new Error(`Invalid skill ID: ${skillId}`); + } + + try { + // Get current metadata + const currentMetadata = await skillStorage.getSkillMetadata(skillId); + if (!currentMetadata) { + throw new Error(`Skill not found: ${skillId}`); + } + + // Read the SKILL.md content from ZenFS + const skillPath = zenfs.getSkillPath(skillId); + const skillMdPath = `${skillPath}/SKILL.md`; + + const skillMdExists = await zenfs.exists(skillMdPath); + if (!skillMdExists) { + throw new Error(`SKILL.md not found for skill: ${skillId}`); + } + + const skillMdContent = (await zenfs.readFile( + skillMdPath, + "utf8", + )) as string; + + // Parse the frontmatter to extract description and version + const parsedMetadata = skillRegistry.parseSkillMetadata(skillMdContent); + + // Check that name hasn't changed (we don't support rename) + if (parsedMetadata.name && parsedMetadata.name !== skillId) { + throw new Error( + `Skill name mismatch: expected "${skillId}" but found "${parsedMetadata.name}" in SKILL.md. 
Skill renaming is not supported.`, + ); + } + + // Build updates object (only update fields that are present in frontmatter) + const updates: Partial<SkillMetadata> = {}; + if (parsedMetadata.description !== undefined) { + updates.description = parsedMetadata.description; + } + if (parsedMetadata.version !== undefined) { + updates.version = parsedMetadata.version; + } + + // Update in IndexedDB if there are changes + if (Object.keys(updates).length > 0) { + await skillStorage.updateSkill(skillId, updates); + } + + // Get the updated metadata + const updatedMetadata = await skillStorage.getSkillMetadata(skillId); + if (!updatedMetadata) { + throw new Error( + `Failed to retrieve updated metadata for skill: ${skillId}`, + ); + } + + // Update the registry with updated metadata and refreshed content + const existingSkill = skillRegistry.getSkill(currentMetadata.name); + if (existingSkill) { + skillRegistry.updateSkill(currentMetadata.name, { + metadata: updatedMetadata, + skillMdContent: skillMdContent, + }); + } + + console.log(`✅ Skill metadata refreshed: ${skillId}`); + + // Emit an event so UI components can react + this._emit("skill_loaded", { + type: "skill_metadata_refreshed", + skillId, + skillName: currentMetadata.name, + skillMetadata: updatedMetadata, + }); + } catch (error) { + console.error( + `❌ Failed to refresh skill metadata for ${skillId}:`, + error, + ); + throw error; + } + } + getRegisteredTools(): any[] { return skillExecutor.getRegisteredTools(); } diff --git a/packages/browser-runtime/src/tools/index.ts b/packages/browser-runtime/src/tools/index.ts index ba00e87..056a29e 100644 --- a/packages/browser-runtime/src/tools/index.ts +++ b/packages/browser-runtime/src/tools/index.ts @@ -24,14 +24,13 @@ import { getAllTabsTool, getCurrentTabTool, getTabInfoTool, - organizeTabsTool, ungroupTabsTool, } from "./tab"; import { downloadChatImagesTool, downloadImageTool } from "./tools/downloads"; /** * All browser tools registered for AI use - * Total: 32 tools (28 core + 
4 intervention tools) + * Total: 31 tools (27 core + 4 intervention tools) * * Disabled tools (per aipex): * - switch_to_tab (causes context switching issues) @@ -40,6 +39,7 @@ import { downloadChatImagesTool, downloadImageTool } from "./tools/downloads"; * - capture_screenshot_to_clipboard (not enabled in aipex) * - download_text_as_markdown (not enabled in aipex) * - download_current_chat_images (architecture issue, not enabled in aipex) + * - organize_tabs (stub implementation, temporarily disabled until AI grouping is complete) */ type BrowserFunctionTool = FunctionTool< unknown, @@ -48,13 +48,13 @@ type BrowserFunctionTool = FunctionTool< >; const browserFunctionTools: BrowserFunctionTool[] = [ - // Browser/Tab Management (7 tools) + // Browser/Tab Management (6 tools) + // Note: organize_tabs temporarily disabled (stub/not shipped) getAllTabsTool, getCurrentTabTool, createNewTabTool, getTabInfoTool, closeTabTool, - organizeTabsTool, ungroupTabsTool, // UI Operations (7 tools) - computer tool replaces visual XY tools diff --git a/packages/browser-runtime/src/tools/interventions/index.ts b/packages/browser-runtime/src/tools/interventions/index.ts index 4cefe77..72636b7 100644 --- a/packages/browser-runtime/src/tools/interventions/index.ts +++ b/packages/browser-runtime/src/tools/interventions/index.ts @@ -256,8 +256,10 @@ export const cancelInterventionTool = tool({ }; } + // Pass "user" as the reason since this is an AI/user-initiated cancellation const cancelled = interventionManager.cancelIntervention( currentIntervention.request.id, + "user", ); if (cancelled) { diff --git a/packages/browser-runtime/src/tools/organize-tabs.ts b/packages/browser-runtime/src/tools/organize-tabs.ts new file mode 100644 index 0000000..ca47610 --- /dev/null +++ b/packages/browser-runtime/src/tools/organize-tabs.ts @@ -0,0 +1,515 @@ +/** + * AI-powered tab organization module + * + * This module provides the logic to group tabs using AI or fallback to domain-based grouping. 
+ * The AI integration uses a configurable callback pattern, allowing the extension to inject + * the actual LLM call implementation. + */ + +import { z } from "zod"; + +// ============================================================================ +// Types +// ============================================================================ + +/** + * Valid tab group color values (matching chrome.tabGroups.Color enum values) + */ +export type TabGroupColor = + | "blue" + | "red" + | "yellow" + | "green" + | "orange" + | "purple" + | "pink" + | "cyan" + | "grey"; + +export interface TabData { + id: number; + title: string; + url: string; + hostname: string; +} + +export interface TabGroupResult { + emoji: string; + category: string; + color: TabGroupColor; + tabIds: number[]; +} + +export interface OrganizeTabsResult { + success: boolean; + groupedTabs?: number; + groups?: number; + error?: string; +} + +/** + * LLM response schema for tab grouping + */ +export const TabGroupingResponseSchema = z.object({ + groups: z.array( + z.object({ + emoji: z.string(), + category: z.string(), + color: z.enum([ + "blue", + "red", + "yellow", + "green", + "orange", + "purple", + "pink", + "cyan", + "grey", + ]), + tabIds: z.array(z.number()), + }), + ), +}); + +export type TabGroupingResponse = z.infer<typeof TabGroupingResponseSchema>; + +/** + * Callback type for AI-powered tab classification. + * The extension should provide this when AI is available. + */ +export type TabClassificationCallback = ( + tabData: TabData[], + language: "en" | "zh", +) => Promise<TabGroupingResponse>; + +// ============================================================================ +// Module State +// ============================================================================ + +let aiClassificationCallback: TabClassificationCallback | null = null; + +/** + * Set the AI classification callback for tab organization. + * This should be called by the extension when the agent is ready. 
+ */ +export function setTabClassificationCallback( + callback: TabClassificationCallback | null, +): void { + aiClassificationCallback = callback; +} + +// ============================================================================ +// Helper Functions +// ============================================================================ + +const VALID_COLORS: TabGroupColor[] = [ + "blue", + "red", + "yellow", + "green", + "orange", + "purple", + "pink", + "cyan", + "grey", +]; + +function getRandomColor(): TabGroupColor { + return VALID_COLORS[Math.floor(Math.random() * VALID_COLORS.length)]!; +} + +// Regex patterns for character sanitization - using RegExp constructor to satisfy linter +// biome-ignore lint/suspicious/noControlCharactersInRegex: intentionally matching control characters for sanitization +const CONTROL_CHARS_REGEX = /[\u0000-\u001F\u007F-\u009F]/g; + +/** + * Sanitize string for AI request - remove problematic characters + */ +function sanitizeForAI(str: string): string { + return str + .replace(/[\uD800-\uDFFF]/g, "") // Remove surrogate pairs + .replace(CONTROL_CHARS_REGEX, "") // Remove control characters + .replace(/[\u{1F600}-\u{1F64F}]/gu, "") // Remove emoji ranges + .replace(/[\u{1F300}-\u{1F5FF}]/gu, "") + .replace(/[\u{1F680}-\u{1F6FF}]/gu, "") + .replace(/[\u{1F1E0}-\u{1F1FF}]/gu, "") + .replace(/[\u{2600}-\u{26FF}]/gu, "") + .replace(/[\u{2700}-\u{27BF}]/gu, "") + .replace(/[^\x20-\x7E\u4e00-\u9fff]/g, "") // Keep ASCII and Chinese + .trim(); +} + +/** + * Sanitize category string for display + */ +function sanitizeString(str: string): string { + return str + .replace(/[\uD800-\uDFFF]/g, "") + .replace(CONTROL_CHARS_REGEX, "") + .replace(/[^\x20-\x7E\u4e00-\u9fff]/g, "") + .trim(); +} + +/** + * Validate emoji - more permissive but safe + */ +function validateEmoji(emoji: string | undefined): string { + if (!emoji || typeof emoji !== "string") { + return "📁"; + } + const trimmed = emoji.trim(); + if (trimmed.length === 0 || 
trimmed.includes("\u0000")) { + return "📁"; + } + return trimmed; +} + +/** + * Extract hostname from URL safely + */ +function getHostname(url: string): string { + try { + return new URL(url).hostname; + } catch { + const match = url.match(/:\/\/([^/]+)/); + return match?.[1] || url.split("://")[0] || ""; + } +} + +// ============================================================================ +// Fallback: Domain-Based Grouping +// ============================================================================ + +interface DomainGroup { + domain: string; + category: string; + emoji: string; + color: TabGroupColor; +} + +const DOMAIN_CATEGORIES: DomainGroup[] = [ + // Development + { domain: "github.com", category: "Dev", emoji: "💻", color: "grey" }, + { domain: "gitlab.com", category: "Dev", emoji: "💻", color: "grey" }, + { domain: "stackoverflow.com", category: "Dev", emoji: "💻", color: "grey" }, + { domain: "npmjs.com", category: "Dev", emoji: "💻", color: "grey" }, + { domain: "vercel.com", category: "Dev", emoji: "💻", color: "grey" }, + + // Google + { domain: "google.com", category: "Google", emoji: "🔍", color: "blue" }, + { domain: "youtube.com", category: "Video", emoji: "🎬", color: "red" }, + { domain: "gmail.com", category: "Mail", emoji: "📧", color: "red" }, + { domain: "docs.google.com", category: "Docs", emoji: "📄", color: "blue" }, + + // Social + { domain: "twitter.com", category: "Social", emoji: "🐦", color: "cyan" }, + { domain: "x.com", category: "Social", emoji: "🐦", color: "cyan" }, + { domain: "linkedin.com", category: "Social", emoji: "💼", color: "blue" }, + { domain: "facebook.com", category: "Social", emoji: "👥", color: "blue" }, + { domain: "reddit.com", category: "Social", emoji: "🗨️", color: "orange" }, + + // Shopping + { domain: "amazon.com", category: "Shop", emoji: "🛒", color: "yellow" }, + { domain: "ebay.com", category: "Shop", emoji: "🛒", color: "yellow" }, + { domain: "taobao.com", category: "Shop", emoji: "🛒", color: "orange" }, + { 
domain: "jd.com", category: "Shop", emoji: "🛒", color: "red" },
+
+  // News
+  { domain: "cnn.com", category: "News", emoji: "📰", color: "red" },
+  { domain: "bbc.com", category: "News", emoji: "📰", color: "red" },
+  { domain: "reuters.com", category: "News", emoji: "📰", color: "blue" },
+
+  // AI
+  { domain: "openai.com", category: "AI", emoji: "🤖", color: "green" },
+  { domain: "anthropic.com", category: "AI", emoji: "🤖", color: "orange" },
+  { domain: "claude.ai", category: "AI", emoji: "🤖", color: "orange" },
+  { domain: "chatgpt.com", category: "AI", emoji: "🤖", color: "green" },
+];
+
+function getDomainCategory(hostname: string): DomainGroup | null {
+  const lowerHost = hostname.toLowerCase();
+  for (const category of DOMAIN_CATEGORIES) {
+    if (lowerHost.includes(category.domain)) {
+      return category;
+    }
+  }
+  return null;
+}
+
+/**
+ * Fallback grouping by domain when AI is not available
+ */
+function groupTabsByDomain(tabs: TabData[]): TabGroupResult[] {
+  const groups = new Map<
+    string,
+    {
+      category: string;
+      emoji: string;
+      color: TabGroupColor;
+      tabIds: number[];
+    }
+  >();
+
+  const otherTabs: number[] = [];
+
+  for (const tab of tabs) {
+    const domainInfo = getDomainCategory(tab.hostname);
+    if (domainInfo) {
+      const key = domainInfo.category;
+      const existing = groups.get(key);
+      if (existing) {
+        existing.tabIds.push(tab.id);
+      } else {
+        groups.set(key, {
+          category: domainInfo.category,
+          emoji: domainInfo.emoji,
+          color: domainInfo.color,
+          tabIds: [tab.id],
+        });
+      }
+    } else {
+      // Group remaining tabs by root domain
+      const rootDomain =
+        tab.hostname.split(".").slice(-2).join(".") || tab.hostname;
+      if (rootDomain) {
+        const key = `domain:${rootDomain}`;
+        const existing = groups.get(key);
+        if (existing) {
+          existing.tabIds.push(tab.id);
+        } else {
+          groups.set(key, {
+            category: rootDomain.split(".")[0] || "Other",
+            emoji: "🌐",
+            color: getRandomColor(),
+            tabIds: [tab.id],
+          });
+        }
+      } else {
+        otherTabs.push(tab.id);
+      }
+    }
+  }
+
+  // Convert to results, excluding single-tab groups
+  const results: TabGroupResult[] = [];
+  for (const [, group] of groups) {
+    if (group.tabIds.length >= 2) {
+      results.push(group);
+    } else {
+      otherTabs.push(...group.tabIds);
+    }
+  }
+
+  // Add "Other" group if there are uncategorized tabs
+  if (otherTabs.length >= 2) {
+    results.push({
+      category: "Other",
+      emoji: "📁",
+      color: "grey",
+      tabIds: otherTabs,
+    });
+  }
+
+  return results;
+}
+
+// ============================================================================
+// Main Implementation
+// ============================================================================
+
+/**
+ * Use AI to automatically group tabs by topic/purpose.
+ * Falls back to domain-based grouping if AI is not available.
+ */
+export async function groupTabsByAI(): Promise {
+  try {
+    // Get all tabs in current window
+    const tabs = await chrome.tabs.query({ currentWindow: true });
+    const validTabs = tabs.filter((tab) => tab.url && tab.id);
+
+    if (validTabs.length === 0) {
+      return { success: true, groupedTabs: 0, groups: 0 };
+    }
+
+    // Get active tab for collapse logic
+    const [activeTab] = await chrome.tabs.query({
+      active: true,
+      currentWindow: true,
+    });
+
+    // Prepare tab data
+    const tabData: TabData[] = validTabs.map((tab) => ({
+      id: tab.id!,
+      title: sanitizeForAI(tab.title || ""),
+      url: tab.url!,
+      hostname: sanitizeForAI(getHostname(tab.url!)),
+    }));
+
+    let groupingResult: TabGroupResult[];
+
+    // Try AI classification if callback is available
+    if (aiClassificationCallback) {
+      try {
+        // Detect language - simple heuristic
+        const hasChineseChars = tabData.some(
+          (t) => /[\u4e00-\u9fff]/.test(t.title) || t.hostname.endsWith(".cn"),
+        );
+        const language = hasChineseChars ? "zh" : "en";
+
+        const aiResponse = await aiClassificationCallback(tabData, language);
+        const parsed = TabGroupingResponseSchema.safeParse(aiResponse);
+
+        if (parsed.success) {
+          groupingResult = parsed.data.groups.map((g) => ({
+            emoji: validateEmoji(g.emoji),
+            category: sanitizeString(g.category),
+            color: VALID_COLORS.includes(g.color) ? g.color : getRandomColor(),
+            tabIds: g.tabIds.filter((id) => validTabs.some((t) => t.id === id)),
+          }));
+        } else {
+          console.warn(
+            "[organize_tabs] AI response parsing failed, using fallback:",
+            parsed.error,
+          );
+          groupingResult = groupTabsByDomain(tabData);
+        }
+      } catch (aiError) {
+        console.warn(
+          "[organize_tabs] AI classification failed, using fallback:",
+          aiError,
+        );
+        groupingResult = groupTabsByDomain(tabData);
+      }
+    } else {
+      // No AI available, use domain-based fallback
+      console.log(
+        "[organize_tabs] No AI callback set, using domain-based grouping",
+      );
+      groupingResult = groupTabsByDomain(tabData);
+    }
+
+    // Apply groups to tabs
+    const windowId = validTabs[0]!.windowId;
+    let groupCount = 0;
+
+    for (const group of groupingResult) {
+      if (group.tabIds.length === 0) continue;
+
+      const displayName = `${group.emoji} ${group.category}`;
+
+      try {
+        // Check for existing group with same name
+        const existingGroups = await chrome.tabGroups.query({ windowId });
+        const existingGroup = existingGroups.find(
+          (g) => g.title === displayName,
+        );
+
+        if (existingGroup) {
+          // Add tabs to existing group
+          await chrome.tabs.group({
+            tabIds: group.tabIds as [number, ...number[]],
+            groupId: existingGroup.id,
+          });
+          // Collapse unless it contains active tab
+          const containsActiveTab = group.tabIds.includes(activeTab?.id ?? -1);
+          await chrome.tabGroups.update(existingGroup.id, {
+            collapsed: !containsActiveTab,
+          });
+        } else {
+          // Create new group
+          const groupId = await chrome.tabs.group({
+            createProperties: { windowId },
+            tabIds: group.tabIds as [number, ...number[]],
+          });
+          await chrome.tabGroups.update(groupId, {
+            title: displayName,
+            color: group.color,
+          });
+          // Collapse unless it contains active tab
+          const containsActiveTab = group.tabIds.includes(activeTab?.id ?? -1);
+          await chrome.tabGroups.update(groupId, {
+            collapsed: !containsActiveTab,
+          });
+        }
+        groupCount++;
+      } catch (groupError) {
+        console.warn(`[organize_tabs] Failed to create group:`, groupError);
+      }
+    }
+
+    return {
+      success: true,
+      groupedTabs: validTabs.length,
+      groups: groupCount,
+    };
+  } catch (error) {
+    console.error("[organize_tabs] Error:", error);
+    return {
+      success: false,
+      error: error instanceof Error ? error.message : String(error),
+    };
+  }
+}
+
+// ============================================================================
+// Prompt Template for AI Classification
+// ============================================================================
+
+/**
+ * Generate the prompt for AI tab classification.
+ * This can be used by the extension to create the appropriate AI request.
+ */
+export function generateTabClassificationPrompt(
+  tabData: TabData[],
+  language: "en" | "zh",
+): string {
+  const languageInstructions =
+    language === "zh"
+      ? {
+          categoryInstruction: "简单的分类名称(1-2个中文字)",
+          exampleCategories: ["新闻", "购物", "工作"],
+          languageNote: "使用简单的中文词汇作为分类名称。",
+        }
+      : {
+          categoryInstruction: "A simple category name (1-2 words in English)",
+          exampleCategories: ["News", "Shopping", "Work"],
+          languageNote: "Use simple English words for categories.",
+        };
+
+  return `Classify these browser tabs into 3-7 meaningful groups based on their content, purpose, or topic. For each group, provide an appropriate emoji, color, and a simple category name.
+
+Tab data:
+${JSON.stringify(tabData, null, 2)}
+
+You must return a JSON object with a "groups" key containing an array where each item has:
+1. "emoji": A single emoji that represents the group content
+2. "category": ${languageInstructions.categoryInstruction}
+3. "color": A color from this list: blue, red, yellow, green, orange, purple, pink, cyan, grey
+4. "tabIds": Array of tab IDs that belong to this group
+
+Example response format:
+{
+  "groups": [
+    {
+      "emoji": "[emoji]",
+      "category": "${languageInstructions.exampleCategories[0]}",
+      "color": "blue",
+      "tabIds": [123, 124, 125]
+    },
+    {
+      "emoji": "[emoji]",
+      "category": "${languageInstructions.exampleCategories[1]}",
+      "color": "green",
+      "tabIds": [126, 127]
+    },
+    {
+      "emoji": "[emoji]",
+      "category": "${languageInstructions.exampleCategories[2]}",
+      "color": "purple",
+      "tabIds": [128, 129]
+    }
+  ]
+}
+
+Important: Use only common, standard emojis and ${languageInstructions.languageNote} Choose colors that match the content theme.`;
+}
diff --git a/packages/browser-runtime/src/tools/tools/tab-groups/index.ts b/packages/browser-runtime/src/tools/tools/tab-groups/index.ts
index 027f7c2..2a0a03a 100644
--- a/packages/browser-runtime/src/tools/tools/tab-groups/index.ts
+++ b/packages/browser-runtime/src/tools/tools/tab-groups/index.ts
@@ -135,8 +135,14 @@ export async function deleteTabGroup(groupId: number): Promise<{
   }
 }
 
+/**
+ * Tool to remove all tab groups in the current window.
+ * Note: This tool uses the name "ungroup_tabs" for consistency with legacy naming.
+ * Do not register this alongside the default ungroupTabsTool from ./tab.ts to avoid
+ * duplicate tool name registration.
+ */
 export const ungroupAllTabsTool = tool({
-  name: "ungroup_all_tabs",
+  name: "ungroup_tabs",
   description: "Remove all tab groups in the current window",
   parameters: z.object({}),
   execute: async () => {
diff --git a/packages/browser-runtime/src/types/external-modules.d.ts b/packages/browser-runtime/src/types/external-modules.d.ts
index 7cb4982..2111ac4 100644
--- a/packages/browser-runtime/src/types/external-modules.d.ts
+++ b/packages/browser-runtime/src/types/external-modules.d.ts
@@ -3,6 +3,12 @@ declare module "@jitl/quickjs-ng-wasmfile-release-sync" {
   export default releaseSyncVariant;
 }
 
+// Declaration for the wasm subpath export with Vite's ?url suffix
+declare module "@jitl/quickjs-ng-wasmfile-release-sync/wasm?url" {
+  const wasmUrl: string;
+  export default wasmUrl;
+}
+
 declare module "@zenfs/core" {
   export const configure: (...args: any[]) => Promise;
   export const fs: {
diff --git a/packages/browser-runtime/src/types/url.d.ts b/packages/browser-runtime/src/types/url.d.ts
new file mode 100644
index 0000000..6088414
--- /dev/null
+++ b/packages/browser-runtime/src/types/url.d.ts
@@ -0,0 +1,16 @@
+/**
+ * TypeScript declaration for Vite's ?url import suffix.
+ * When importing a file with ?url, Vite returns the public URL of the asset
+ * after bundling, rather than the file contents.
+ *
+ * This is used for importing WASM files and other assets that need to be
+ * fetched at runtime with their correct bundled URL.
+ *
+ * @example
+ * import wasmUrl from "@some-package/file.wasm?url";
+ * // wasmUrl is a string containing the URL to the wasm file
+ */
+declare module "*?url" {
+  const url: string;
+  export default url;
+}
diff --git a/packages/browser-runtime/vitest.config.ts b/packages/browser-runtime/vitest.config.ts
index c05c84f..fa3656d 100644
--- a/packages/browser-runtime/vitest.config.ts
+++ b/packages/browser-runtime/vitest.config.ts
@@ -9,6 +9,11 @@ export default defineConfig({
       concurrent: false,
     },
     silent: true,
-    exclude: ["**/node_modules/**", "**/dist/**"],
+    exclude: [
+      "**/node_modules/**",
+      "**/dist/**",
+      // Puppeteer tests require Chrome browser installation - run separately with: vitest run --config vitest.puppeteer.config.ts
+      "**/*.puppeteer.test.ts",
+    ],
   },
 });
diff --git a/packages/core/package.json b/packages/core/package.json
index 9c9229d..9f16882 100644
--- a/packages/core/package.json
+++ b/packages/core/package.json
@@ -56,7 +56,7 @@
     "@ai-sdk/anthropic": "^3.0.0",
     "@ai-sdk/google": "^3.0.0",
     "@ai-sdk/openai": "^3.0.0",
-    "@openrouter/ai-sdk-provider": "^0.4.0"
+    "@openrouter/ai-sdk-provider": "^2.0.0"
   },
   "peerDependenciesMeta": {
     "@ai-sdk/openai": {
diff --git a/packages/core/src/agent/aipex.test.ts b/packages/core/src/agent/aipex.test.ts
index 3377165..27f2901 100644
--- a/packages/core/src/agent/aipex.test.ts
+++ b/packages/core/src/agent/aipex.test.ts
@@ -19,22 +19,35 @@ function createMockRunResult(
   overrides: {
     finalOutput?: string;
     usage?: { promptTokens?: number; completionTokens?: number };
+    /** Multiple raw responses (for testing multi-turn within single execution) */
+    rawResponses?: Array<{
+      usage?: { inputTokens?: number; outputTokens?: number };
+    }>;
     streamEvents?: any[];
   } = {},
 ): StreamedRunResult {
   const events = overrides.streamEvents ??
[];
+
+  // Build rawResponses: if explicit rawResponses provided, use it; otherwise use usage shorthand
+  let rawResponses: Array<{
+    usage?: { inputTokens?: number; outputTokens?: number };
+  }> = [];
+  if (overrides.rawResponses) {
+    rawResponses = overrides.rawResponses;
+  } else if (overrides.usage) {
+    rawResponses = [
+      {
+        usage: {
+          inputTokens: overrides.usage.promptTokens ?? 0,
+          outputTokens: overrides.usage.completionTokens ?? 0,
+        },
+      },
+    ];
+  }
+
   return {
     finalOutput: overrides.finalOutput ?? "",
-    rawResponses: overrides.usage
-      ? [
-          {
-            usage: {
-              inputTokens: overrides.usage.promptTokens ?? 0,
-              outputTokens: overrides.usage.completionTokens ?? 0,
-            },
-          },
-        ]
-      : [],
+    rawResponses,
     async *[Symbol.asyncIterator]() {
       for (const event of events) {
         yield event;
@@ -388,6 +401,94 @@ describe("AIPex", () => {
     }
   });
 
+  it("should use last rawResponse usage when multiple responses exist", async () => {
+    // Simulate a multi-turn execution where multiple model responses occur
+    // (e.g., tool calls triggering additional model calls)
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "Final response",
+        rawResponses: [
+          // First response (e.g., tool call)
+          { usage: { inputTokens: 100, outputTokens: 50 } },
+          // Second response (e.g., another tool call)
+          { usage: { inputTokens: 200, outputTokens: 100 } },
+          // Final response - this should be used
+          { usage: { inputTokens: 500, outputTokens: 250 } },
+        ],
+        streamEvents: [
+          {
+            type: "raw_model_stream_event",
+            data: { type: "output_text_delta", delta: "Final response" },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Test",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("Input")) {
+      events.push(event);
+    }
+
+    const metricsEvent = events.find((e) => e.type === "metrics_update");
+    expect(metricsEvent).toBeDefined();
+    if (metricsEvent && metricsEvent.type === "metrics_update") {
+      // Should use the LAST response's usage, not the sum
+      expect(metricsEvent.metrics.promptTokens).toBe(500);
+      expect(metricsEvent.metrics.completionTokens).toBe(250);
+      expect(metricsEvent.metrics.tokensUsed).toBe(750);
+    }
+
+    const completeEvent = events.find((e) => e.type === "execution_complete");
+    expect(completeEvent).toBeDefined();
+    if (completeEvent && completeEvent.type === "execution_complete") {
+      expect(completeEvent.metrics.tokensUsed).toBe(750);
+    }
+  });
+
+  it("should handle rawResponses with some entries missing usage", async () => {
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "Response",
+        rawResponses: [
+          { usage: { inputTokens: 100, outputTokens: 50 } },
+          {}, // No usage
+          { usage: undefined },
+          { usage: { inputTokens: 300, outputTokens: 150 } }, // Last with usage
+        ],
+        streamEvents: [
+          {
+            type: "raw_model_stream_event",
+            data: { type: "output_text_delta", delta: "Response" },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Test",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("Input")) {
+      events.push(event);
+    }
+
+    const metricsEvent = events.find((e) => e.type === "metrics_update");
+    expect(metricsEvent).toBeDefined();
+    if (metricsEvent && metricsEvent.type === "metrics_update") {
+      // Should find the last response WITH usage data
+      expect(metricsEvent.metrics.promptTokens).toBe(300);
+      expect(metricsEvent.metrics.completionTokens).toBe(150);
+      expect(metricsEvent.metrics.tokensUsed).toBe(450);
+    }
+  });
+
   it("should accumulate session metrics across multiple conversations", async () => {
     vi.mocked(run).mockResolvedValue(
       createMockRunResult({
@@ -724,5 +825,182 @@ describe("AIPex", () => {
     await expect(runPromise).resolves.toBeUndefined();
     expect(events.some((event) => event.type === "error")).toBe(true);
   });
+
+  it("should extract real error message from tool failure", async () => {
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "",
+        streamEvents: [
+          {
+            type: "run_item_stream_event",
+            name: "tool_called",
+            item: { rawItem: { name: "screenshot", arguments: "{}" } },
+          },
+          {
+            type: "run_item_stream_event",
+            name: "tool_output",
+            item: {
+              rawItem: {
+                name: "screenshot",
+                status: "failed",
+                error: { message: "No active tab found" },
+              },
+              output: undefined,
+            },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Tools",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("take screenshot")) {
+      events.push(event);
+    }
+
+    const errorEvent = events.find(
+      (event) => event.type === "tool_call_error",
+    );
+    expect(errorEvent).toBeDefined();
+    if (errorEvent?.type === "tool_call_error") {
+      expect(errorEvent.error.message).toBe("No active tab found");
+    }
+  });
+
+  it("should extract error message from JSON output on failure", async () => {
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "",
+        streamEvents: [
+          {
+            type: "run_item_stream_event",
+            name: "tool_called",
+            item: { rawItem: { name: "organize_tabs", arguments: "{}" } },
+          },
+          {
+            type: "run_item_stream_event",
+            name: "tool_output",
+            item: {
+              rawItem: { name: "organize_tabs", status: "failed" },
+              output: JSON.stringify({
+                success: false,
+                error: "Cannot organize tabs in incognito window",
+              }),
+            },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Tools",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("organize tabs")) {
+      events.push(event);
+    }
+
+    const errorEvent = events.find(
+      (event) => event.type === "tool_call_error",
+    );
+    expect(errorEvent).toBeDefined();
+    if (errorEvent?.type === "tool_call_error") {
+      expect(errorEvent.error.message).toBe(
+        "Cannot organize tabs in incognito window",
+      );
+    }
+  });
+
+  it("should sanitize sensitive data from error messages", async () => {
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "",
+        streamEvents: [
+          {
+            type: "run_item_stream_event",
+            name: "tool_called",
+            item: { rawItem: { name: "api_call", arguments: "{}" } },
+          },
+          {
+            type: "run_item_stream_event",
+            name: "tool_output",
+            item: {
+              rawItem: { name: "api_call", status: "failed" },
+              output:
+                "Error: Request failed with Authorization: Bearer sk-1234567890abcdef",
+            },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Tools",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("make api call")) {
+      events.push(event);
+    }
+
+    const errorEvent = events.find(
+      (event) => event.type === "tool_call_error",
+    );
+    expect(errorEvent).toBeDefined();
+    if (errorEvent?.type === "tool_call_error") {
+      expect(errorEvent.error.message).toContain("[REDACTED]");
+      expect(errorEvent.error.message).not.toContain("sk-1234567890abcdef");
+    }
+  });
+
+  it("should truncate long error messages", async () => {
+    const longMessage = "x".repeat(1000);
+    vi.mocked(run).mockResolvedValue(
+      createMockRunResult({
+        finalOutput: "",
+        streamEvents: [
+          {
+            type: "run_item_stream_event",
+            name: "tool_called",
+            item: { rawItem: { name: "failing_tool", arguments: "{}" } },
+          },
+          {
+            type: "run_item_stream_event",
+            name: "tool_output",
+            item: {
+              rawItem: { name: "failing_tool", status: "failed" },
+              output: longMessage,
+            },
+          },
+        ],
+      }),
+    );
+
+    const agent = AIPex.create({
+      instructions: "Tools",
+      model: mockModel,
+    });
+
+    const events: AgentEvent[] = [];
+    for await (const event of agent.chat("run failing tool")) {
+      events.push(event);
+    }
+
+    const errorEvent = events.find(
+      (event) => event.type === "tool_call_error",
+    );
+    expect(errorEvent).toBeDefined();
+    if (errorEvent?.type === "tool_call_error") {
+      expect(errorEvent.error.message.length).toBeLessThanOrEqual(500);
+      expect(errorEvent.error.message.endsWith("...")).toBe(true);
+    }
+  });
 });
diff --git a/packages/core/src/agent/aipex.ts b/packages/core/src/agent/aipex.ts
index 0ba0ede..bd5f44c 100644
--- a/packages/core/src/agent/aipex.ts
+++ b/packages/core/src/agent/aipex.ts
@@ -45,7 +45,7 @@ export class AIPex {
     this.agent = agent;
     this.conversationManager = conversationManager;
     this.contextManager = contextManager;
-    this.maxTurns = maxTurns ?? 10;
+    this.maxTurns = maxTurns ?? 2000;
     this.plugins = plugins;
     this.pluginContext = { agent: this };
     this.initializePlugins();
@@ -237,7 +237,11 @@ export class AIPex {
       metrics: metricsSnapshot,
       sessionId: session?.id ?? undefined,
     });
-    yield { type: "metrics_update", metrics: metricsSnapshot };
+    yield {
+      type: "metrics_update",
+      metrics: metricsSnapshot,
+      sessionId: session?.id,
+    };
 
     if (session) {
       session.addMetrics(metrics);
@@ -267,7 +271,11 @@ export class AIPex {
       metrics: metricsSnapshot,
       sessionId: session?.id ?? undefined,
     });
-    yield { type: "metrics_update", metrics: { ...metrics } };
+    yield {
+      type: "metrics_update",
+      metrics: { ...metrics },
+      sessionId: session?.id,
+    };
     yield { type: "error", error: agentError };
     if (session) {
       session.addMetrics(metrics);
@@ -396,10 +404,12 @@ export class AIPex {
     const status = this.getToolStatus(event.item);
 
     if (status !== "completed") {
+      const toolName = this.extractToolName(event.item);
+      const failureMessage = this.extractToolFailureMessage(event.item, status);
       return {
         type: "tool_call_error",
-        toolName: this.extractToolName(event.item),
-        error: new Error(`Tool call ${status}`),
+        toolName,
+        error: new Error(failureMessage),
       };
     }
 
@@ -459,20 +469,168 @@ export class AIPex {
     return rawOutput;
   }
 
+  /**
+   * Extract a human-readable failure message from a tool execution.
+   * Attempts to find the real error message from various locations in the item,
+   * with basic truncation and sanitization.
+   */
+  private extractToolFailureMessage(
+    item: RunItemStreamEvent["item"],
+    status: string,
+  ): string {
+    const MAX_MESSAGE_LENGTH = 500;
+
+    // Try to extract error message from various sources
+    let message: string | undefined;
+
+    // Check item.output for error info
+    const outputCarrier = item as unknown as { output?: unknown };
+    if (outputCarrier.output !== undefined) {
+      message = this.extractErrorFromValue(outputCarrier.output);
+    }
+
+    // Check rawItem.output
+    if (!message) {
+      const rawOutput = (item as unknown as { rawItem?: { output?: unknown } })
+        .rawItem?.output;
+      if (rawOutput !== undefined) {
+        message = this.extractErrorFromValue(rawOutput);
+      }
+    }
+
+    // Check rawItem.error directly
+    if (!message) {
+      const rawError = (item as unknown as { rawItem?: { error?: unknown } })
+        .rawItem?.error;
+      if (rawError !== undefined) {
+        message = this.extractErrorFromValue(rawError);
+      }
+    }
+
+    // Fallback to status-based message
+    if (!message) {
+      message = `Tool call ${status}`;
+    }
+
+    // Truncate and sanitize
+    return this.sanitizeErrorMessage(message, MAX_MESSAGE_LENGTH);
+  }
+
+  /**
+   * Extract error message from a value that could be:
+   * - A string (possibly JSON)
+   * - An Error object
+   * - An object with error/message properties
+   */
+  private extractErrorFromValue(value: unknown): string | undefined {
+    if (value === undefined || value === null) {
+      return undefined;
+    }
+
+    // Handle Error objects
+    if (value instanceof Error) {
+      return value.message;
+    }
+
+    // Handle string values
+    if (typeof value === "string") {
+      // Try to parse as JSON
+      const parsed = safeJsonParse(value);
+      if (parsed !== undefined) {
+        return this.extractErrorFromValue(parsed);
+      }
+      return value;
+    }
+
+    // Handle objects with error-related properties
+    if (typeof value === "object") {
+      const obj = value as Record<string, unknown>;
+
+      // Check for common error patterns
+      if (typeof obj.error === "string" && obj.error.length > 0) {
+        return obj.error;
+      }
+      if (typeof obj.message === "string" && obj.message.length > 0) {
+        return obj.message;
+      }
+      if (
+        obj.error &&
+        typeof obj.error === "object" &&
+        typeof (obj.error as Record<string, unknown>).message === "string"
+      ) {
+        return (obj.error as Record<string, unknown>).message as string;
+      }
+
+      // If it's a failure result object, try to extract useful info
+      if (obj.success === false) {
+        if (typeof obj.error === "string") {
+          return obj.error;
+        }
+        // Return a stringified version as last resort
+        try {
+          return JSON.stringify(obj);
+        } catch {
+          return undefined;
+        }
+      }
+    }
+
+    return undefined;
+  }
+
+  /**
+   * Sanitize and truncate error message for safe display.
+   * - Truncates to maxLength
+   * - Masks potential sensitive patterns (tokens, auth headers)
+   */
+  private sanitizeErrorMessage(message: string, maxLength: number): string {
+    let sanitized = message;
+
+    // Mask potential sensitive patterns
+    // Authorization headers
+    sanitized = sanitized.replace(
+      /Authorization:\s*(Bearer\s+)?[^\s,}"\]]+/gi,
+      "Authorization: [REDACTED]",
+    );
+    // API keys patterns
+    sanitized = sanitized.replace(
+      /(['"](api[_-]?key|apikey|token|secret|password)['"]\s*[=:]\s*['"])[^'"]+(['"])/gi,
+      "$1[REDACTED]$3",
+    );
+    // Bearer tokens in JSON
+    sanitized = sanitized.replace(
+      /(bearer\s+)[a-zA-Z0-9._-]{20,}/gi,
+      "$1[REDACTED]",
+    );
+
+    // Truncate if needed
+    if (sanitized.length > maxLength) {
+      sanitized = `${sanitized.substring(0, maxLength - 3)}...`;
+    }
+
+    return sanitized;
+  }
+
   private applyUsageMetrics(
     metrics: AgentMetrics,
     result: { rawResponses?: Array<{ usage?: UsageShape }> },
   ): void {
     const responses = result.rawResponses ?? [];
-    let promptTokens = 0;
-    let completionTokens = 0;
-    for (const response of responses) {
-      if (!response.usage) continue;
-      promptTokens += response.usage.inputTokens ?? 0;
-      completionTokens += response.usage.outputTokens ?? 0;
+    // Use the LAST response with usage data (typically the final model response)
+    // This represents the total tokens for this execution, not a running sum
+    let lastUsage: UsageShape | undefined;
+    for (let i = responses.length - 1; i >= 0; i--) {
+      const response = responses[i];
+      if (response?.usage) {
+        lastUsage = response.usage;
+        break;
+      }
     }
+
+    const promptTokens = lastUsage?.inputTokens ?? 0;
+    const completionTokens = lastUsage?.outputTokens ?? 0;
+
     metrics.promptTokens = promptTokens;
     metrics.completionTokens = completionTokens;
     metrics.tokensUsed = promptTokens + completionTokens;
diff --git a/packages/core/src/types.ts b/packages/core/src/types.ts
index e4c3dfb..8bde618 100644
--- a/packages/core/src/types.ts
+++ b/packages/core/src/types.ts
@@ -126,7 +126,7 @@ export type AgentEvent =
   | { type: "contexts_attached"; contexts: Context[] }
   | { type: "contexts_loaded"; providerId: string; count: number }
   | { type: "context_error"; providerId: string; error: Error }
-  | { type: "metrics_update"; metrics: AgentMetrics }
+  | { type: "metrics_update"; metrics: AgentMetrics; sessionId?: string }
   | { type: "error"; error: AgentError }
   | {
       type: "execution_complete";
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 2e945d1..05bf863 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -80,6 +80,9 @@ importers:
       '@radix-ui/react-use-controllable-state':
         specifier: ^1.2.2
         version: 1.2.2(@types/react@19.2.8)(react@19.2.3)
+      '@ricky0123/vad-web':
+        specifier: ^0.0.27
+        version: 0.0.27
       ai:
         specifier: ^6.0.28
         version: 6.0.28(zod@4.3.6)
@@ -119,6 +122,9 @@ importers:
       tailwind-merge:
         specifier: ^3.4.0
         version: 3.4.0
+      three:
+        specifier: ^0.177.0
+        version: 0.177.0
       tokenlens:
         specifier: ^1.3.1
         version: 1.3.1
@@ -135,6 +141,9 @@ importers:
      '@testing-library/react':
        specifier: ^16.3.0
        version: 16.3.0(@testing-library/dom@10.4.1)(@types/react-dom@19.2.3(@types/react@19.2.8))(@types/react@19.2.8)(react-dom@19.2.3(react@19.2.3))(react@19.2.3)
+      '@types/chrome':
+        specifier:
0.1.32
+        version: 0.1.32
       '@types/react':
         specifier: 19.2.8
         version: 19.2.8
@@ -144,6 +153,9 @@ importers:
       '@types/react-syntax-highlighter':
         specifier: ^15.5.13
         version: 15.5.13
+      '@types/three':
+        specifier: ^0.177.0
+        version: 0.177.0
       jsdom:
         specifier: ^27.4.0
         version: 27.4.0(@noble/hashes@1.8.0)
@@ -222,6 +234,9 @@ importers:
       '@radix-ui/react-use-controllable-state':
         specifier: ^1.2.2
         version: 1.2.2(@types/react@19.2.8)(react@19.2.3)
+      '@ricky0123/vad-web':
+        specifier: ^0.0.27
+        version: 0.0.27
       ahooks:
         specifier: ^3.9.6
         version: 3.9.6(react-dom@19.2.3(react@19.2.3))(react@19.2.3)
@@ -255,6 +270,9 @@ importers:
       nanoid:
         specifier: ^5.1.6
         version: 5.1.6
+      onnxruntime-web:
+        specifier: ^1.22.0
+        version: 1.24.1
       react:
         specifier: 19.2.3
         version: 19.2.3
@@ -396,8 +414,8 @@ importers:
         specifier: ^0.4.6
         version: 0.4.6(@ai-sdk/provider@3.0.8)(@openai/agents@0.4.3(ws@8.19.0)(zod@4.3.6))(ai@6.0.28(zod@4.3.6))(ws@8.19.0)(zod@4.3.6)
       '@openrouter/ai-sdk-provider':
-        specifier: ^0.4.0
-        version: 0.4.6(zod@4.3.6)
+        specifier: ^2.0.0
+        version: 2.1.1(ai@6.0.28(zod@4.3.6))(zod@4.3.6)
       lru-cache:
         specifier: ^11.2.4
         version: 11.2.4
@@ -471,15 +489,6 @@ packages:
     peerDependencies:
       zod: ^3.25.76 || ^4.1.8
 
-  '@ai-sdk/provider-utils@2.1.10':
-    resolution: {integrity: sha512-4GZ8GHjOFxePFzkl3q42AU0DQOtTQ5w09vmaWUf/pKFXJPizlnzKSUkF0f+VkapIUfDugyMqPMT1ge8XQzVI7Q==}
-    engines: {node: '>=18'}
-    peerDependencies:
-      zod: ^3.0.0
-    peerDependenciesMeta:
-      zod:
-        optional: true
-
   '@ai-sdk/provider-utils@4.0.10':
     resolution: {integrity: sha512-VeDAiCH+ZK8Xs4hb9Cw7pHlujWNL52RKe8TExOkrw6Ir1AmfajBZTb9XUdKOZO08RwQElIKA8+Ltm+Gqfo8djQ==}
     engines: {node: '>=18'}
@@ -504,10 +513,6 @@ packages:
     peerDependencies:
       zod: ^3.25.76 || ^4.1.8
 
-  '@ai-sdk/provider@1.0.9':
-    resolution: {integrity: sha512-jie6ZJT2ZR0uVOVCDc9R2xCX5I/Dum/wEK28lx21PJx6ZnFAN9EzD2WsPhcDWfCgGx3OAZZ0GyM3CEobXpa9LA==}
-    engines: {node: '>=18'}
-
   '@ai-sdk/provider@3.0.2':
     resolution: {integrity: sha512-HrEmNt/BH/hkQ7zpi2o6N3k1ZR1QTb7z85WYhYygiTxOQuaml4CMtHCWRbric5WPU+RNsYI7r1EpyVQMKO1pYw==}
     engines: {node: '>=18'}
@@ -879,6 +884,9 @@ packages:
     resolution: {integrity: sha512-Vd/9EVDiu6PPJt9yAh6roZP6El1xHrdvIVGjyBsHR0RYwNHgL7FJPyIIW4fANJNG6FtyZfvlRPpFI4ZM/lubvw==}
     engines: {node: '>=18'}
 
+  '@dimforge/rapier3d-compat@0.12.0':
+    resolution: {integrity: sha512-uekIGetywIgopfD97oDL5PfeezkFpNhwlzlaEYNOA0N6ghdsOvh/HYjSMek5Q2O1PYvRSDFcqFVJl4r4ZBwOow==}
+
   '@emnapi/core@1.7.1':
     resolution: {integrity: sha512-o1uhUASyo921r2XtHYOHy7gdkGLge8ghBEQHMWmyJFoXlpU58kIrhhN3w26lpQb6dspetweapMn2CSNwQ8I4wg==}
@@ -1198,11 +1206,12 @@ packages:
     peerDependencies:
       zod: ^4.0.0
 
-  '@openrouter/ai-sdk-provider@0.4.6':
-    resolution: {integrity: sha512-oUa8xtssyUhiKEU/aW662lsZ0HUvIUTRk8vVIF3Ha3KI/DnqX54zmVIuzYnaDpermqhy18CHqblAY4dDt1JW3g==}
+  '@openrouter/ai-sdk-provider@2.1.1':
+    resolution: {integrity: sha512-UypPbVnSExxmG/4Zg0usRiit3auvQVrjUXSyEhm0sZ9GQnW/d8p/bKgCk2neh1W5YyRSo7PNQvCrAEBHZnqQkQ==}
     engines: {node: '>=18'}
     peerDependencies:
-      zod: ^3.0.0
+      ai: ^6.0.0
+      zod: ^3.25.0 || ^4.0.0
 
   '@opentelemetry/api@1.9.0':
     resolution: {integrity: sha512-3giAOQvZiH5F9bMlMiv8+GSPMeqg0dbaeo58/0SlA9sxSqZhnUtxzX9/2FzyhS9sWQf5S0GJE0AKBrFqjpeYcg==}
@@ -1308,6 +1317,36 @@ packages:
     cpu: [x64]
     os: [win32]
 
+  '@protobufjs/aspromise@1.1.2':
+    resolution: {integrity: sha512-j+gKExEuLmKwvz3OgROXtrJ2UG2x8Ch2YZUxahh+s1F2HZ+wAceUNLkvy6zKCPVRkU++ZWQrdxsUeQXmcg4uoQ==}
+
+  '@protobufjs/base64@1.1.2':
+    resolution: {integrity: sha512-AZkcAA5vnN/v4PDqKyMR5lx7hZttPDgClv83E//FMNhR2TMcLUhfRUBHCmSl0oi9zMgDDqRUJkSxO3wm85+XLg==}
+
+  '@protobufjs/codegen@2.0.4':
+    resolution: {integrity: sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg==}
+
+  '@protobufjs/eventemitter@1.1.0':
+    resolution: {integrity: sha512-j9ednRT81vYJ9OfVuXG6ERSTdEL1xVsNgqpkxMsbIabzSo3goCjDIveeGv5d03om39ML71RdmrGNjG5SReBP/Q==}
+
+  '@protobufjs/fetch@1.1.0':
+    resolution: {integrity: sha512-lljVXpqXebpsijW71PZaCYeIcE5on1w5DlQy5WH6GLbFryLUrBD4932W/E2BSpfRJWseIL4v/KPgBFxDOIdKpQ==}
+
+  '@protobufjs/float@1.0.2':
+    resolution: {integrity: sha512-Ddb+kVXlXst9d+R9PfTIxh1EdNkgoRe5tOX6t01f1lYWOvJnSPDBlG241QLzcyPdoNTsblLUdujGSE4RzrTZGQ==}
+
+  '@protobufjs/inquire@1.1.0':
+    resolution: {integrity: sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q==}
+
+  '@protobufjs/path@1.1.2':
+    resolution: {integrity: sha512-6JOcJ5Tm08dOHAbdR3GrvP+yUUfkjG5ePsHYczMFLq3ZmMkAD98cDgcT2iA1lJ9NVwFd4tH/iSSoe44YWkltEA==}
+
+  '@protobufjs/pool@1.1.0':
+    resolution: {integrity: sha512-0kELaGSIDBKvcgS4zkjz1PeddatrjYcmMWOlAuAPwAeccUrPHdUqo/J6LiymHHEiJT5NrF1UVwxY14f+fy4WQw==}
+
+  '@protobufjs/utf8@1.1.0':
+    resolution: {integrity: sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw==}
+
   '@puppeteer/browsers@2.11.1':
     resolution: {integrity: sha512-YmhAxs7XPuxN0j7LJloHpfD1ylhDuFmmwMvfy/+6nBSrETT2ycL53LrhgPtR+f+GcPSybQVuQ5inWWu5MrWCpA==}
     engines: {node: '>=18'}
@@ -1774,6 +1813,9 @@ packages:
   '@radix-ui/rect@1.1.1':
     resolution: {integrity: sha512-HPwpGIzkl28mWyZqG52jiqDJ12waP11Pa1lGoiyUkIEuMLBP0oeK/C89esbXrxsky5we7dfd8U58nm0SgAWpVw==}
 
+  '@ricky0123/vad-web@0.0.27':
+    resolution: {integrity: sha512-4XFng44oj7qFQUrVYFpMnwRYJDFYrGUL0FmPWcrkF0gPneubJbu8KJvp+WaKn+70GNw2gwGZUMvd9hHiCJkUNg==}
+
   '@rolldown/pluginutils@1.0.0-beta.47':
     resolution: {integrity: sha512-8QagwMH3kNCuzD8EWL8R2YPW5e4OrHNSAHRFDdmFqEwEaD/KcNKjVoumo+gP2vW5eKB2UPbM6vTYiGZX0ixLnw==}
@@ -2035,6 +2077,9 @@ packages:
   '@tootallnate/quickjs-emscripten@0.23.0':
     resolution: {integrity: sha512-C5Mc6rdnsaJDjO3UpGW/CQTHtCKaYlScZTly4JIu97Jxo/odCiH0ITnDXSJPTOrEKk/ycSZ0AOgTmkDtkOsvIA==}
 
+  '@tweenjs/tween.js@23.1.3':
+    resolution: {integrity: sha512-vJmvvwFxYuGnF2axRtPYocag6Clbb5YS7kLL+SO/TeVFzHqDIWrNKYtcsPMibjDx9O+bu+psAy9NKfWklassUA==}
+
   '@tybys/wasm-util@0.10.1':
     resolution: {integrity: sha512-9tTaPJLSiejZKx+Bmog4uSubteqTvFrVrURwkmHixBo0G4seD0zUxp98E1DzUBJxLQ3NPwXrGKDiVjwx/DpPsg==}
@@ -2115,12 +2160,21 @@ packages:
   '@types/react@19.2.8':
     resolution: {integrity: sha512-3MbSL37jEchWZz2p2mjntRZtPt837ij10ApxKfgmXCTuHWagYg7iA5bqPw6C8BMPfwidlvfPI/fxOc42HLhcyg==}
 
+  '@types/stats.js@0.17.4':
+    resolution: {integrity: sha512-jIBvWWShCvlBqBNIZt0KAshWpvSjhkwkEu4ZUcASoAvhmrgAUI2t1dXrjSL4xXVLB4FznPrIsX3nKXFl/Dt4vA==}
+
+  '@types/three@0.177.0':
+    resolution: {integrity: sha512-/ZAkn4OLUijKQySNci47lFO+4JLE1TihEjsGWPUT+4jWqxtwOPPEwJV1C3k5MEx0mcBPCdkFjzRzDOnHEI1R+A==}
+
   '@types/unist@2.0.11':
     resolution: {integrity: sha512-CmBKiL6NNo/OqgmMn95Fk9Whlp2mtvIv+KNpQKN2F4SjvrEesubTRWGYSg+BnWZOnlCaSTU1sMpsBOzgbYhnsA==}
 
   '@types/unist@3.0.3':
     resolution: {integrity: sha512-ko/gIFJRv177XgZsZcBwnqJN5x/Gien8qNOn0D5bQU/zAzVf9Zt3BlcUiLqhV9y4ARk0GbT3tnUiPNgnTXzc/Q==}
 
+  '@types/webxr@0.5.24':
+    resolution: {integrity: sha512-h8fgEd/DpoS9CBrjEQXR+dIDraopAEfu4wYVNY2tEPwk60stPWhvZMf4Foo5FakuQ7HFZoa8WceaWFervK2Ovg==}
+
   '@types/ws@8.18.1':
     resolution: {integrity: sha512-ThVF6DCVhA8kUGy+aazFQ4kXQ7E1Ty7A3ypFOe0IcJV8O/M511G99AW24irKrW56Wt44yG9+ij8FaqoBGkuBXg==}
@@ -2172,6 +2226,9 @@ packages:
   '@webcomponents/custom-elements@1.6.0':
     resolution: {integrity: sha512-CqTpxOlUCPWRNUPZDxT5v2NnHXA4oox612iUGnmTUGQFhZ1Gkj8kirtl/2wcF6MqX7+PqqicZzOCBKKfIn0dww==}
 
+  '@webgpu/types@0.1.69':
+    resolution: {integrity: sha512-RPmm6kgRbI8e98zSD3RVACvnuktIja5+yLgDAkTmxLr90BEwdTXRQWNLF3ETTTyH/8mKhznZuN5AveXYFEsMGQ==}
+
   '@xterm/xterm@5.5.0':
     resolution: {integrity: sha512-hqJHYaQb5OptNunnyAnkHyM8aCjZ1MEIDTQu1iIbbTD/xops91NB5yq1ZK/dC2JDbVWtF23zUtl9JE2NqwT87A==}
@@ -3034,6 +3091,9 @@ packages:
     resolution: {integrity: sha512-1yD6RmLI1XBfxugvORwlck6f75tYL+iR0jqwsOrOxMZyGYqUuDhJ0l4AXdO1iX/FTs9cBAMEk1gWSEx1kSbylg==}
     engines: {node: '>=6'}
 
+  flatbuffers@25.9.23:
+    resolution: {integrity: sha512-MI1qs7Lo4Syw0EOzUl0xjs2lsoeqFku44KpngfIduHBYvzm8h2+7K8YMQh1JtVVVrUvhLpNwqVi4DERegUJhPQ==}
+
   flow-parser@0.299.0:
     resolution: {integrity: sha512-phGMRoNt6SNglPHGRbCyWm9/pxfe6t/t4++EIYPaBGWT6e0lphLBgUMrvpL62NbRo9R549o3oqrbKHq82kANCw==}
     engines: {node: '>=0.4.0'}
@@ -3169,6 +3229,9 @@ packages:
   graceful-fs@4.2.11:
     resolution: {integrity: sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==}
 
+  guid-typescript@1.0.9:
+    resolution: {integrity: sha512-Y8T4vYhEfwJOTbouREvG+3XDsjr8E3kIr7uf+JZ0BYloFsttiHU0WfvANVsR7TxNUJa/WpCnw/Ino/p+DeBhBQ==}
+
   has-flag@4.0.0:
     resolution: {integrity: sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ==}
     engines: {node: '>=8'}
@@ -3564,6 +3627,9 @@ packages:
   lodash@4.17.21:
     resolution: {integrity: sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==}
 
+  long@5.3.2:
+    resolution: {integrity: sha512-mNAgZ1GmyNhD7AuqnTG3/VQ26o760+ZYBPKjPvugO8+nLbYfX6TVpJPseBvopbdY+qpZ/lKUnmEc1LeZYS3QAA==}
+
   longest-streak@3.1.0:
     resolution: {integrity: sha512-9Ri+o0JYgehTaVBBDoMqIl8GXtbWg711O3srftcHhZ0dqnETqLaoIK0x17fUw9rFSlK/0NlsKe0Ahhyl5pXE2g==}
@@ -3697,6 +3763,9 @@ packages:
     resolution: {integrity: sha512-8q7VEgMJW4J8tcfVPy8g09NcQwZdbwFEqhe/WZkoIzjn/3TGDwtOCYtXGxA3O8tPzpczCCDgv+P2P5y00ZJOOg==}
     engines: {node: '>= 8'}
 
+  meshoptimizer@0.18.1:
+    resolution: {integrity: sha512-ZhoIoL7TNV4s5B6+rx5mC//fw8/POGyNxS/DZyCJeiZ12ScLfVwRE/GfsxwiTkMYYD5DmK2/JXnEVXqL4rF+Sw==}
+
   micromark-core-commonmark@2.0.3:
     resolution: {integrity: sha512-RDBrHEMSxVFLg6xvnXmb1Ayr2WzLAWjeSATAoxwKYJV94TeNavgoIdA0a9ytzDSVzBy2YKFK+emCPOEibLeCrg==}
@@ -3922,6 +3991,12 @@ packages:
     resolution: {integrity: sha512-kbpaSSGJTWdAY5KPVeMOKXSrPtr8C8C7wodJbcsd51jRnmD+GZu8Y0VoU6Dm5Z4vWr0Ig/1NKuWRKf7j5aaYSg==}
     engines: {node: '>=6'}
 
+  onnxruntime-common@1.24.1:
+    resolution: {integrity: sha512-UnV15u4p4XxoIV+jFP4hXPsW93s3QrwLSpi20HUDYHoTfI4z4sjzex3L4XDOxGGZJ/M/catrwAG2go958UQq0w==}
+
+  onnxruntime-web@1.24.1:
+    resolution: {integrity: sha512-i2u395dv+ZEQBdH+aORvlu19Bzvlg5AXJ7wjxnL350hknOP9z0UeP3pVfjkpMEWMPy2T6nCQxetKTmNia6wSzg==}
+
   openai@6.15.0:
     resolution: {integrity: sha512-F1Lvs5BoVvmZtzkUEVyh8mDQPPFolq4F+xdsx/DO8Hee8YF3IGAlZqUIsF+DVGhqf4aU0a3bTghsxB6OIsRy1g==}
     hasBin: true
     peerDependencies:
       ws: ^8.18.0
       zod: ^3.25 || ^4.0
     peerDependenciesMeta:
       ws:
         optional: true
       zod:
         optional: true
 
-  openai@6.18.0:
-    resolution: {integrity: sha512-odLRYyz9rlzz6g8gKn61RM2oP5UUm428sE2zOxZqS9MzVfD5/XW8UoEjpnRkzTuScXP7ZbP/m7fC+bl8jCOZZw==}
-    hasBin: true
-    peerDependencies:
-      ws: ^8.18.0
-      zod: ^3.25 || ^4.0
-    peerDependenciesMeta:
-      ws:
-        optional: true
-      zod:
-        optional: true
-
   oxc-resolver@11.16.4:
     resolution: {integrity: sha512-nvJr3orFz1wNaBA4neRw7CAn0SsjgVaEw1UHpgO/lzVW12w+nsFnvU/S6vVX3kYyFaZdxZheTExi/fa8R8PrZA==}
@@ -4075,6 +4138,9 @@ packages:
   pkg-types@2.3.0:
     resolution: {integrity: sha512-SIqCzDRg0s9npO5XQ3tNZioRY1uK06lA41ynBC1YmFTmnY6FjUjVt6s4LoADmwoig1qqD0oK8h1p/8mlMx8Oig==}
 
+  platform@1.3.6:
+    resolution: {integrity: sha512-fnWVljUchTro6RiCFvCXBbNhJc2NijN7oIQxbwsyL0buWJPG85v81ehlHI9fXrJsMNgTofEoWIQeClKpgxFLrg==}
+
   plimit-lit@1.6.1:
     resolution: {integrity: sha512-B7+VDyb8Tl6oMJT9oSO2CW8XC/T4UcJGrwOVoNGwOQsQYhlpfajmrMj5xeejqaASq3V/EqThyOeATEOMuSEXiA==}
     engines: {node: '>=12'}
@@ -4171,6 +4237,10 @@ packages:
   property-information@7.1.0:
     resolution: {integrity: sha512-TwEZ+X+yCJmYfL7TPUOcvBZ4QfoT5YenQiJuX//0th53DE6w0xxLEtfK3iyryQFddXuvkIk51EEgrJQ0WJkOmQ==}
 
+  protobufjs@7.5.4:
+    resolution: {integrity: sha512-CvexbZtbov6jW2eXAvLukXjXUW1TzFaivC46BpWc/3BpcCysb5Vffu+B3XHMm8lVEuy2Mm4XGex8hBSg1yapPg==}
+    engines: {node: '>=12.0.0'}
+
   proxy-addr@2.0.7:
     resolution: {integrity: sha512-llQsMLSUDUPT44jdrU/O37qlnifitDP+ZwrmmZcoSKyLKvtZxpyV0n2/bD/N4tBAAZ/gJEdZU7KMraoK1+XYAg==}
     engines: {node: '>= 0.10'}
@@ -4423,9 +4493,6 @@ packages:
     resolution: {integrity: sha512-9BakfsO2aUQN2K9Fdbj87RJIEZ82Q9IGim7FqM5OsebfoFC6ZHXgDq/KvniuLTPdeM8wY2o6Dj3WQ7KeQCj3cA==}
     engines: {node: '>=0.10.0'}
 
-  secure-json-parse@2.7.0:
-    resolution: {integrity:
sha512-6aU+Rwsezw7VR8/nyvKTx8QpWH9FrcYiXXlqC4z5d5XQBDRqtbfsRjnwGyqbi3gddNtWHuEk9OANUotL26qKUw==} - semver@5.7.2: resolution: {integrity: sha512-cBznnQ9KjJqU67B52RMC65CMarK2600WFnbkcaiwWq3xy/5haFJlshgnpjovMVJ+Hff49d8GEn0b87C5pDQ10g==} hasBin: true @@ -4628,6 +4695,9 @@ packages: text-decoder@1.2.3: resolution: {integrity: sha512-3/o9z3X0X0fTupwsYvR03pJ/DjWuqqrfwBgTQzdWDiQSm9KitAyz/9WqsT2JQW7KV2m+bC2ol/zqpW37NHxLaA==} + three@0.177.0: + resolution: {integrity: sha512-EiXv5/qWAaGI+Vz2A+JfavwYCMdGjxVsrn3oBwllUoqYeaBO75J63ZfyaQKoiLrqNHoTlUc6PFgMXnS0kI45zg==} + tiny-invariant@1.3.3: resolution: {integrity: sha512-+FbBPE1o9QAYvviau/qC5SE3caw21q3xkvWKBtja5vgqOWIHHJ3ioaq1VPfn/Szqctz2bU/oYeKd9/z5BL+PVg==} @@ -5059,15 +5129,6 @@ snapshots: '@ai-sdk/provider-utils': 4.0.13(zod@4.3.6) zod: 4.3.6 - '@ai-sdk/provider-utils@2.1.10(zod@4.3.6)': - dependencies: - '@ai-sdk/provider': 1.0.9 - eventsource-parser: 3.0.6 - nanoid: 3.3.11 - secure-json-parse: 2.7.0 - optionalDependencies: - zod: 4.3.6 - '@ai-sdk/provider-utils@4.0.10(zod@4.3.6)': dependencies: '@ai-sdk/provider': 3.0.5 @@ -5096,10 +5157,6 @@ snapshots: eventsource-parser: 3.0.6 zod: 4.3.6 - '@ai-sdk/provider@1.0.9': - dependencies: - json-schema: 0.4.0 - '@ai-sdk/provider@3.0.2': dependencies: json-schema: 0.4.0 @@ -5573,6 +5630,8 @@ snapshots: '@csstools/css-tokenizer@3.0.4': {} + '@dimforge/rapier3d-compat@0.12.0': {} + '@emnapi/core@1.7.1': dependencies: '@emnapi/wasi-threads': 1.1.0 @@ -5807,7 +5866,7 @@ snapshots: '@openai/agents-core@0.4.6(ws@8.19.0)(zod@4.3.6)': dependencies: debug: 4.4.3 - openai: 6.18.0(ws@8.19.0)(zod@4.3.6) + openai: 6.15.0(ws@8.19.0)(zod@4.3.6) optionalDependencies: '@modelcontextprotocol/sdk': 1.26.0(zod@4.3.6) zod: 4.3.6 @@ -5870,10 +5929,9 @@ snapshots: - utf-8-validate - ws - '@openrouter/ai-sdk-provider@0.4.6(zod@4.3.6)': + '@openrouter/ai-sdk-provider@2.1.1(ai@6.0.28(zod@4.3.6))(zod@4.3.6)': dependencies: - '@ai-sdk/provider': 1.0.9 - '@ai-sdk/provider-utils': 2.1.10(zod@4.3.6) + ai: 
6.0.28(zod@4.3.6) zod: 4.3.6 '@opentelemetry/api@1.9.0': {} @@ -5940,6 +5998,29 @@ snapshots: '@oxc-resolver/binding-win32-x64-msvc@11.16.4': optional: true + '@protobufjs/aspromise@1.1.2': {} + + '@protobufjs/base64@1.1.2': {} + + '@protobufjs/codegen@2.0.4': {} + + '@protobufjs/eventemitter@1.1.0': {} + + '@protobufjs/fetch@1.1.0': + dependencies: + '@protobufjs/aspromise': 1.1.2 + '@protobufjs/inquire': 1.1.0 + + '@protobufjs/float@1.0.2': {} + + '@protobufjs/inquire@1.1.0': {} + + '@protobufjs/path@1.1.2': {} + + '@protobufjs/pool@1.1.0': {} + + '@protobufjs/utf8@1.1.0': {} + '@puppeteer/browsers@2.11.1': dependencies: debug: 4.4.3 @@ -6410,6 +6491,10 @@ snapshots: '@radix-ui/rect@1.1.1': {} + '@ricky0123/vad-web@0.0.27': + dependencies: + onnxruntime-web: 1.24.1 + '@rolldown/pluginutils@1.0.0-beta.47': {} '@rollup/pluginutils@4.2.1': @@ -6610,6 +6695,8 @@ snapshots: '@tootallnate/quickjs-emscripten@0.23.0': {} + '@tweenjs/tween.js@23.1.3': {} + '@tybys/wasm-util@0.10.1': dependencies: tslib: 2.8.1 @@ -6704,10 +6791,24 @@ snapshots: dependencies: csstype: 3.2.3 + '@types/stats.js@0.17.4': {} + + '@types/three@0.177.0': + dependencies: + '@dimforge/rapier3d-compat': 0.12.0 + '@tweenjs/tween.js': 23.1.3 + '@types/stats.js': 0.17.4 + '@types/webxr': 0.5.24 + '@webgpu/types': 0.1.69 + fflate: 0.8.2 + meshoptimizer: 0.18.1 + '@types/unist@2.0.11': {} '@types/unist@3.0.3': {} + '@types/webxr@0.5.24': {} + '@types/ws@8.18.1': dependencies: '@types/node': 25.2.2 @@ -6774,6 +6875,8 @@ snapshots: '@webcomponents/custom-elements@1.6.0': {} + '@webgpu/types@0.1.69': {} + '@xterm/xterm@5.5.0': optional: true @@ -7700,6 +7803,8 @@ snapshots: dependencies: locate-path: 3.0.0 + flatbuffers@25.9.23: {} + flow-parser@0.299.0: {} follow-redirects@1.15.11: {} @@ -7837,6 +7942,8 @@ snapshots: graceful-fs@4.2.11: {} + guid-typescript@1.0.9: {} + has-flag@4.0.0: {} has-symbols@1.1.0: {} @@ -8255,6 +8362,8 @@ snapshots: lodash@4.17.21: {} + long@5.3.2: {} + longest-streak@3.1.0: {} 
lowlight@1.20.0: @@ -8465,6 +8574,8 @@ snapshots: merge2@1.4.1: {} + meshoptimizer@0.18.1: {} + micromark-core-commonmark@2.0.3: dependencies: decode-named-character-reference: 1.3.0 @@ -8761,12 +8872,18 @@ snapshots: dependencies: mimic-fn: 2.1.0 - openai@6.15.0(ws@8.19.0)(zod@4.3.6): - optionalDependencies: - ws: 8.19.0 - zod: 4.3.6 + onnxruntime-common@1.24.1: {} - openai@6.18.0(ws@8.19.0)(zod@4.3.6): + onnxruntime-web@1.24.1: + dependencies: + flatbuffers: 25.9.23 + guid-typescript: 1.0.9 + long: 5.3.2 + onnxruntime-common: 1.24.1 + platform: 1.3.6 + protobufjs: 7.5.4 + + openai@6.15.0(ws@8.19.0)(zod@4.3.6): optionalDependencies: ws: 8.19.0 zod: 4.3.6 @@ -8917,6 +9034,8 @@ snapshots: exsolve: 1.0.8 pathe: 2.0.3 + platform@1.3.6: {} + plimit-lit@1.6.1: dependencies: queue-lit: 1.5.2 @@ -9007,6 +9126,21 @@ snapshots: property-information@7.1.0: {} + protobufjs@7.5.4: + dependencies: + '@protobufjs/aspromise': 1.1.2 + '@protobufjs/base64': 1.1.2 + '@protobufjs/codegen': 2.0.4 + '@protobufjs/eventemitter': 1.1.0 + '@protobufjs/fetch': 1.1.0 + '@protobufjs/float': 1.0.2 + '@protobufjs/inquire': 1.1.0 + '@protobufjs/path': 1.1.2 + '@protobufjs/pool': 1.1.0 + '@protobufjs/utf8': 1.1.0 + '@types/node': 25.2.2 + long: 5.3.2 + proxy-addr@2.0.7: dependencies: forwarded: 0.2.0 @@ -9347,8 +9481,6 @@ snapshots: screenfull@5.2.0: {} - secure-json-parse@2.7.0: {} - semver@5.7.2: {} semver@6.3.1: {} @@ -9624,6 +9756,8 @@ snapshots: transitivePeerDependencies: - react-native-b4a + three@0.177.0: {} + tiny-invariant@1.3.3: {} tinybench@2.9.0: {}