28 Commits

Author SHA1 Message Date
Rohit Kushwaha
9902cbfe13 Merge branch 'dev' into feat/deep-agents-backend 2026-03-21 01:24:12 +05:30
Rohit Kushwaha
d2dc43aeeb docs(discord): add Discord Docker deployment guide and update Discord docs
- Add new deployment/discord-docker.mdx with full Docker setup guide
  (LLM provider tabs, Claude Code OAuth, Coolify, volume config)
- Update channels/discord.mdx with missing config vars (conversation
  all channels, exclude channels, status, activity), Message Content
  Intent, server-wide conversation mode, bot presence section
- Update guides/discord-ai-bot.mdx with Docker deployment and
  server-wide conversation mode
- Update deployment/docker.mdx with Claude Code OAuth section and
  Discord-only deployment reference
- Add Discord Docker to sidebar navigation in docs-config.json
2026-03-21 00:30:29 +05:30
Rohit Kushwaha
976f9803a9 docs: add Deep Agents backend documentation
- New docs/backends/deep-agents.mdx with full configuration guide
  covering all providers (Anthropic, OpenAI, Ollama, OpenRouter,
  LiteLLM, OpenAI-compatible, Google)
- Updated backends index: comparison table, card grid, tool bridge note
- Added to sidebar navigation in docs-config.json
2026-03-19 18:07:48 +05:30
Rohit Kushwaha
aee5a88fb0 feat(a2a): add settings UI and documentation
- Add A2A Protocol section to System settings with toggle, agent name,
  description, task timeout, and trusted agents list
- Wire frontend state, WebSocket save/load, and backend handler
- Add docs/integrations/a2a-protocol.mdx with full protocol docs
- Add A2A to sidebar nav and integrations overview card
- Add live smoke test script (scripts/test_a2a_live.py)
2026-03-18 23:07:26 +05:30
Rohit Kushwaha
c44f3a1786 feat(cli): add 10 new subcommands for local agent management (#675)
* feat(cli): add 10 new subcommands for local agent management

New CLI commands:
- doctor: full diagnostics with connectivity checks
- health: quick startup-only health check (no network)
- channels: list channel status, start/stop adapters via REST
- skills: list and search available skills
- sessions: list, delete, and search chat sessions
- memory: show stats and search long-term memories
- config: show (masked), set, validate, path
- errors: show recent errors with --limit and --search
- logs: show/tail audit log with --follow

All commands support --json for scripting. Refactored __main__.py
to subcommand-based parser while keeping full backwards compat
with existing flags (--doctor, --telegram, --discord, etc).
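
A minimal sketch of how a subcommand-based parser can keep a legacy top-level flag working. This is an illustration, not the actual `__main__.py`: `build_parser` and `resolve_command` are hypothetical names, and only two of the subcommands are wired up.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pocketpaw")
    # Legacy top-level flag, kept so old invocations keep working.
    parser.add_argument("--doctor", action="store_true",
                        help="legacy alias for the 'doctor' subcommand")
    sub = parser.add_subparsers(dest="command")
    for name in ("doctor", "health"):
        cmd = sub.add_parser(name)
        # Every subcommand accepts --json for scripting.
        cmd.add_argument("--json", action="store_true",
                         help="emit machine-readable output")
    return parser


def resolve_command(argv: list[str]) -> tuple[str | None, bool]:
    """Return (command, json_output), mapping the legacy flag to its subcommand."""
    args = build_parser().parse_args(argv)
    if args.command is None and args.doctor:
        return "doctor", False
    return args.command, bool(getattr(args, "json", False))
```

With this shape, `resolve_command(["--doctor"])` and `resolve_command(["doctor"])` resolve to the same command, so old scripts do not need to change.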

* docs: add new CLI subcommands to CLAUDE.md

* docs: add CLI reference page with all subcommands

New docs page at getting-started/cli covering all 15+ CLI commands
with usage examples, flags, and descriptions. Added to sidebar nav
after Quick Start.
2026-03-18 17:32:26 +05:30
Rohit Kushwaha
5adeae1f9c feat: agent status API, SSE stream, and CLI command (#586)
* docs: add agent status API & CLI design

Design for GET /api/v1/agent/status endpoint, SSE stream, and
pocketpaw status CLI command. Exposes real-time agent state
(idle/thinking/tool_running/streaming/error) for external integrations.

* docs: add agent status API implementation plan

9-task plan covering StatusTracker, REST endpoint, SSE stream,
CLI command, auth, tests, and lifecycle wiring.

* feat: add agent status API, SSE stream, and CLI command

Adds a public status endpoint for external integrations (stream decks,
LED indicators, desktop widgets) to monitor PocketPaw agent state.

- GET /api/v1/agent/status returns global state (idle/active/degraded)
  and per-session breakdown (thinking/tool_running/streaming/error)
- GET /api/v1/agent/status/stream pushes SSE events on state changes
- pocketpaw status CLI with --json and --watch flags
- Optional API key auth via POCKETPAW_STATUS_API_KEY / X-Status-Key
- StatusTracker subscribes to bus events, no internal coupling
- 22 tests covering state transitions, auth, and CLI formatting
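
The optional key check could look roughly like the sketch below. It assumes that "optional" means requests are allowed when no key is configured, and that header names arrive in the exact `X-Status-Key` casing; `is_authorized` is a hypothetical helper, not the endpoint's real code.

```python
from __future__ import annotations

import hmac


def is_authorized(headers: dict[str, str], configured_key: str | None) -> bool:
    """Check the X-Status-Key header against the configured key, if any."""
    if not configured_key:
        # Assumption: auth is optional, so no configured key means open access.
        return True
    supplied = headers.get("X-Status-Key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode(), configured_key.encode())
```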

* fix(tests): update router count to 24 and fix lint in test files

Update test_v1_routers_count assertion from 23 to 24 for the new
agent_status router. Reformat test files and fix UP038 lint error.

* fix: address review issues in agent status API

- Cache status API key to avoid Settings.load() on every request
- Add client disconnect detection in SSE stream loop
- Fix wait_for_change race condition with version-based tracking
- Wire response_model=AgentStatusResponse and fix schema alias mismatch
- Move lifecycle registration from module scope to startup_event
- Extract session title enrichment into dedicated method
- Update tests for new caching and version tracking
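
One way version-based tracking can close a `wait_for_change` race, sketched under assumptions: the class name `VersionedState` and its fields are hypothetical, and the real StatusTracker may differ. The idea is that a waiter passes the last version it saw, so an update that lands between two reads is still observed instead of being missed.

```python
import asyncio


class VersionedState:
    """State holder whose waiters compare versions instead of waiting for a signal."""

    def __init__(self) -> None:
        self.version = 0
        self.state = "idle"
        self._cond = asyncio.Condition()

    async def set_state(self, state: str) -> None:
        async with self._cond:
            self.state = state
            self.version += 1  # every change bumps the version
            self._cond.notify_all()

    async def wait_for_change(self, last_seen: int) -> tuple:
        """Return (version, state) as soon as version differs from last_seen."""
        async with self._cond:
            # Returns immediately if a change already happened after last_seen.
            await self._cond.wait_for(lambda: self.version != last_seen)
            return self.version, self.state
```

A caller loops with `version, state = await tracker.wait_for_change(version)`, carrying the version forward on each iteration.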

* fix: emit agent_start/end events, fix SSE stream spamming, add docs

- AgentLoop now emits agent_start before processing and agent_end on
  completion/error so StatusTracker actually tracks sessions
- Skip redundant thinking notifications when state is already thinking
- Deduplicate SSE stream using state fingerprints (ignores timing fields)
- Increase SSE debounce from 200ms to 1s to coalesce rapid tool events
- Add API docs for GET /agent/status and GET /agent/status/stream
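
Fingerprint-based deduplication can be sketched as follows. The timing field names (`updated_at`, `elapsed_ms`) are assumptions for illustration; the commit only says that timing fields are ignored.

```python
import hashlib
import json

# Hypothetical names for fields that change on every tick and should not
# count as a state change.
TIMING_FIELDS = {"updated_at", "elapsed_ms"}


def fingerprint(state: dict) -> str:
    """Hash the state with timing fields removed."""
    stable = {k: v for k, v in state.items() if k not in TIMING_FIELDS}
    payload = json.dumps(stable, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


def dedupe(states):
    """Yield only states whose fingerprint differs from the previous one."""
    last = None
    for state in states:
        fp = fingerprint(state)
        if fp != last:
            last = fp
            yield state
```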
2026-03-14 10:14:12 +05:30
Rohit Kushwaha
d8bdc9ffb3 docs: add documentation for PII masking, streaming redaction, AGENTS.md, Discord conversation mode, identity drift prevention, and kill command
- New pages: PII detection/masking, streaming redaction, AGENTS.md support
- Updated: Discord (conversation mode, admin commands, kill command), agent loop (identity drift, kill, AGENTS.md), backends (full tool access, OpenRouter)
- Updated security overview and sidebar navigation
2026-03-10 21:58:07 +05:30
Rohit Kushwaha
423b66827a docs: add desktop client section and v1 API server documentation
New pages:
- desktop-client/index.mdx — overview, architecture, features
- desktop-client/installation.mdx — download, setup, requirements
- desktop-client/development.mdx — dev environment, project structure, conventions
- desktop-client/api-server.mdx — pocketpaw serve command, full endpoint reference, auth
2026-03-10 18:38:55 +05:30
Rohit Kushwaha
f20d320af6 Merge branch 'main' into dev 2026-03-10 18:16:41 +05:30
Prakash Dalai
42c9d15bbd Merge pull request #502 from pocketpaw/feat/client-v1-api
feat: Tauri v2 desktop client with v1 API
2026-03-09 09:27:05 +05:30
Prakash
3a147b64e2 fix(docs): split navbar into Guides and Reference dropdowns
Restore the Channels/Integrations/Tools/API Reference links that were
removed when the Guides dropdown was added. Now two separate dropdowns
so both tutorials and reference docs are one click away.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 17:31:21 +05:30
Prakash
0deaddec7c docs(seo): add guides section, FAQ schema, and internal linking
5 SEO-targeted tutorial guides for long-tail keyword traffic:
- Self-host an AI agent on your laptop
- Build a Telegram AI bot in 5 minutes
- Add AI to your Discord server
- Run AI with Ollama (no API key)
- AI agents vs chatbots comparison

Landing page: FAQ section with FAQPage JSON-LD schema markup
(targets Google "People Also Ask" snippets).

Internal linking: cross-links from channels, backends, and
introduction pages back to relevant guides. Updated navbar
dropdown and footer with guide links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 17:14:47 +05:30
Prakash
73e110aabe fix(seo): fix stale URLs, outdated schema, and missing internal links
- og:url and og:image pointed to pocketpaw.netlify.app, now pocketpaw.xyz
- JSON-LD softwareVersion was 0.1.0, now 0.4.7
- JSON-LD applicationCategory broadened to include Productivity and Utilities
- docs-config.json metadata URL was pocketpaw.com, now pocketpaw.xyz
- pyproject.toml Homepage now points to pocketpaw.xyz (not GitHub)
- pyproject.toml Documentation now points to pocketpaw.xyz/introduction
- Added footer links to /channels, /tools, /backends, /getting-started
- Updated all remaining dsc.gg/pocketpaw Discord links

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 12:33:13 +05:30
Prakash
a37268af8b fix(seo): fix stale URLs, outdated schema, and missing internal links
- og:url and og:image pointed to pocketpaw.netlify.app, now pocketpaw.xyz
- JSON-LD softwareVersion was 0.1.0, now 0.4.7
- JSON-LD applicationCategory broadened to include Productivity and Utilities
- docs-config.json metadata URL was pocketpaw.com, now pocketpaw.xyz
- pyproject.toml Homepage now points to pocketpaw.xyz (not GitHub)
- pyproject.toml Documentation now points to pocketpaw.xyz/introduction
- Added footer links to /channels, /tools, /backends, /getting-started
- Updated all remaining dsc.gg/pocketpaw Discord links

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 12:23:44 +05:30
Prakash Dalai
fdf128120d feat(deep-work): v2 engine — retry, timeout, cancel, output chaining, PawKit schema (#375)
* feat: merge desktop client into monorepo under client/

Move the Tauri 2.0 + SvelteKit desktop client from the separate
pocketpaw-client repo into client/ to create a monorepo. Update
.gitignore with client build artifact ignores and add Desktop Client
section to CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): unignore client/src/lib and broaden node/sveltekit ignores

The root .gitignore `lib/` pattern was blocking client/src/lib/ from
being tracked. Anchor it to `/lib/` so only the root Python lib dir is
ignored. Also generalize node_modules/ and .svelte-kit/ patterns to
work at any depth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): fix white OAuth screen on Windows

Load a local loading page first in the OAuth popup window, then
navigate to the external OAuth URL via eval(). WebView2 on Windows
may not be fully initialized when WebviewUrl::External is used
directly, resulting in a permanently white window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): replace WebView OAuth popup with system browser + localhost server

WebView2 secondary windows fail to render on Windows (white/transparent screen).
Replace the OAuth popup with a temporary localhost HTTP server that captures the
callback, and open the authorize URL in the system browser instead.

Also pre-create sidepanel and quickask windows in tauri.conf.json (with
visible: false) so they initialize alongside the main window, avoiding the
same WebView2 rendering issue with dynamically-created transparent windows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add multi-mode workspace client design

Design for transforming the desktop client into a full workspace IDE
with tabbed workspaces, tiling layout engine, pluggable widgets,
real-time agent collaboration, unified file management (local + remote
+ cloud), and responsive mobile support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add multi-mode workspace implementation plan

Detailed 6-phase implementation plan with ~49 new files across Rust
backend (fs commands, PTY, git, widget windows), SvelteKit frontend
(tiling layout, Monaco editor, file previews, widget system, terminal),
filesystem providers (local, remote, cloud), and co-work features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: rewrite design as AI file explorer (Poly.app-inspired)

Complete redesign from VS Code-like IDE to file-explorer-first UI.
Primary interface is a visual file grid with rich thumbnails and
5 view modes (Icon, Grid, List, Column, Gallery). AI chat lives
in a collapsible right sidebar with folder/file context awareness.
Agent actions (create, edit, delete files) reflect in the grid
in real-time. Terminal output is inline in chat, not a separate panel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): add file explorer with list view, search/filter, and context menu

Implements a full file explorer with Rust-backed filesystem operations
(read, write, delete, rename, watch, thumbnails), multiple view modes
(icon grid + list table), inline search/filter, and an enhanced context
menu with rename, new folder, and keyboard shortcuts (Ctrl+F, F2, Del).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(deep-work): add v2 engine — retry, timeout, cancel, output chaining, PawKit schema

Deep Work v2 hardens the execution engine and lays the foundation for
PawKit-based Command Centers. This is the first WIP PR toward the
blueprints-strategy vision.

Engine improvements:
- Task retry: auto-retry failed tasks up to configurable max_retries
- Task timeout: per-task asyncio.wait_for with timeout_minutes
- Output storage: task.output field for cross-task chaining
- Error tracking: task.error_message for diagnostics
- Project cancellation: CANCELLED status, stop_all_project_tasks,
  skip pending tasks with error_message
- Manual retry API: POST /projects/{id}/tasks/{tid}/retry
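
The retry and timeout behaviour listed above can be sketched like this. It is an illustration only: `run_with_retry`, the result dict, and the `done`/`failed` status strings are hypothetical, and the real executor stores these values on the Task model.

```python
import asyncio


async def run_with_retry(task_factory, *, max_retries: int = 2,
                         timeout_minutes: float = 1.0) -> dict:
    """Run task_factory() with a per-attempt timeout, retrying on failure."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            output = await asyncio.wait_for(
                task_factory(), timeout=timeout_minutes * 60
            )
            return {"status": "done", "output": output,
                    "error_message": None, "attempts": attempt + 1}
        except asyncio.TimeoutError:
            last_error = f"timed out after {timeout_minutes} min"
        except Exception as exc:  # cancellation is not caught here
            last_error = str(exc)
    return {"status": "failed", "output": None,
            "error_message": last_error, "attempts": max_retries + 1}
```

Passing a factory rather than a coroutine object matters here, because a coroutine can only be awaited once and each retry needs a fresh one.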

New PawKit config schema v0.1:
- Pydantic v2 models for Command Center templates
- Layout (panels, sections), workflows, user_config, integrations
- YAML load/save utilities with PyYAML

Files changed:
- src/pocketpaw/mission_control/models.py — 5 new Task fields
- src/pocketpaw/deep_work/models.py — TaskSpec v2 + CANCELLED status
- src/pocketpaw/mission_control/executor.py — retry, timeout, output, stop_all
- src/pocketpaw/deep_work/session.py — cancel(), materialize v2 fields
- src/pocketpaw/deep_work/api.py — cancel + retry endpoints
- src/pocketpaw/deep_work/__init__.py — cancel_project export
- src/pocketpaw/deep_work/pawkit.py — NEW: PawKit config schema
- tests/test_deep_work_v2.py — 71 new tests
- tests/test_mission_control_executor.py — adapted for retry defaults

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(deep-work): add 21 E2E tests for v2 engine features

API-level E2E tests using httpx.AsyncClient with ASGI transport.
Real file-based store, session, scheduler, and executor — only the
agent backend (LLM calls) is mocked.

Coverage: full lifecycle, output chaining, auto-retry, timeout,
cancellation, manual retry API, get plan API, skip task API,
and PawKit YAML round-trip.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(deep-work): add v2 feature docs — retry, timeout, cancel, output chaining, PawKit

Updated the Deep Work guide with new sections covering auto-retry,
task timeout, output chaining, project cancellation, and PawKit
template format. Added API endpoint pages for Cancel Project and
Retry Task. Updated sidebar navigation in docs-config.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add Command Centers concept page and PawKit technical reference

New standalone pages for the two biggest ideas in the v2 engine:

- docs/concepts/command-centers.mdx — the "why": agent-operated
  dashboards, PawKits as templates, conversational building,
  Kit Store marketplace, how it all connects
- docs/advanced/pawkit.mdx — the "how": full YAML schema reference
  with panel types, workflow triggers, user config fields,
  integrations, and Python API examples

Trimmed the PawKit section in deep-work.mdx to a cross-reference
since the content now lives in dedicated pages. Updated sidebar
navigation with both new entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): enhance file explorer with viewers, clipboard, and editor save

Add file preview support (audio, video, PDF, markdown with syntax highlighting),
copy/cut/paste with keyboard shortcuts, file properties dialog, open-in-terminal,
editable code editor with Ctrl+S save and dirty tracking, and improved AI model
settings panel with Ollama model fetching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): migrate stores to REST-first architecture, WS push-only

Move all request-response flows (sessions, chat, settings) to REST HTTP.
WebSocket is now exclusively for server-initiated push events.

- Add createSession() REST method to client
- Remove WS convenience methods (chat, newSession, switchSession, etc.)
- Remove bindEvents/disposeEvents from all stores
- Rewrite sessionStore to use POST /sessions and REST history
- Rewrite settingsStore.saveApiKey to use PUT /settings
- Simplify connectionStore (remove sessionId, connection_info listener)
- Simplify initializeStores (no more bindEvents calls)
- Clean up WS types (remove chat/session event types)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(api): REST-first architecture, fix WS dual-delivery and persistent client bugs (#384)

Move all request-response flows to REST HTTP. WebSocket is now push-only.
Fixes dual-delivery bug where SSE chat responses were broadcast to all WS
clients, and persistent Claude SDK client issues with stale data across sessions.

Backend changes:
- Add POST /sessions endpoint with SessionCreateResponse schema
- Fix WS adapter send() — drop messages for unknown chat_ids instead of broadcasting
- Fix WS adapter on_system_event() — route by session_key, not broadcast
- Enhance PUT /settings with runtime side-effects (reset_router + memory reload)
- Add cancel_task() to AgentLoop — lightweight cancellation without stopping router
- Fix _on_done callback race — don't remove newer task's entry from _active_tasks
- Add session lock contention logging for diagnostics
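
The `_on_done` race fix can be illustrated with an identity check before removal. `TaskTable` and `track` are hypothetical names; only `_active_tasks` and `_on_done` come from the commit message.

```python
import asyncio


class TaskTable:
    """Tracks one active task per session key."""

    def __init__(self) -> None:
        self._active_tasks = {}

    def track(self, session_key: str, task: asyncio.Task) -> None:
        self._active_tasks[session_key] = task

        def _on_done(done: asyncio.Task) -> None:
            # Only remove the entry if it still points at this task. A newer
            # task registered under the same key must not be dropped.
            if self._active_tasks.get(session_key) is done:
                del self._active_tasks[session_key]

        task.add_done_callback(_on_done)
```

Without the identity check, an old task finishing late would delete the entry that now belongs to its replacement, leaving the newer task untracked.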

Chat streaming fixes:
- Cancel in-flight SSE streams when new request arrives for same session
- Only cancel agent tasks when there IS a competing stream (not on every request)
- Add diagnostic logging to SSE bridge event delivery

Claude SDK persistent client fixes:
- Include session_key in persistent client key — different sessions get fresh subprocesses
- Drain trailing ResultMessage from subprocess pipe after each query to prevent
  stale data on subsequent queries (root cause of "second message shows first response")
- Add dispatch/event logging for persistent vs stateless path diagnostics

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(api): add PawKits system + MC/DW route mounting for serve command

- Add kits module (models, store, catalog, builtin YAML kits)
- Add kits API router with CRUD, catalog, and data resolution endpoints
- Add api:agents and api:standup source resolvers in kits data resolution
- Mount MC and DW routers in serve.py (was only in dashboard.py)
- Fix standup resolver to use manager.generate_standup()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: merge desktop client into monorepo (#290)

* feat: merge desktop client into monorepo under client/

Move the Tauri 2.0 + SvelteKit desktop client from the separate
pocketpaw-client repo into client/ to create a monorepo. Update
.gitignore with client build artifact ignores and add Desktop Client
section to CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): unignore client/src/lib and broaden node/sveltekit ignores

The root .gitignore `lib/` pattern was blocking client/src/lib/ from
being tracked. Anchor it to `/lib/` so only the root Python lib dir is
ignored. Also generalize node_modules/ and .svelte-kit/ patterns to
work at any depth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): fix white OAuth screen on Windows

Load a local loading page first in the OAuth popup window, then
navigate to the external OAuth URL via eval(). WebView2 on Windows
may not be fully initialized when WebviewUrl::External is used
directly, resulting in a permanently white window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(client): replace WebView OAuth popup with system browser + localhost server

WebView2 secondary windows fail to render on Windows (white/transparent screen).
Replace the OAuth popup with a temporary localhost HTTP server that captures the
callback, and open the authorize URL in the system browser instead.

Also pre-create sidepanel and quickask windows in tauri.conf.json (with
visible: false) so they initialize alongside the main window, avoiding the
same WebView2 rendering issue with dynamically-created transparent windows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add multi-mode workspace client design

Design for transforming the desktop client into a full workspace IDE
with tabbed workspaces, tiling layout engine, pluggable widgets,
real-time agent collaboration, unified file management (local + remote
+ cloud), and responsive mobile support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add multi-mode workspace implementation plan

Detailed 6-phase implementation plan with ~49 new files across Rust
backend (fs commands, PTY, git, widget windows), SvelteKit frontend
(tiling layout, Monaco editor, file previews, widget system, terminal),
filesystem providers (local, remote, cloud), and co-work features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: rewrite design as AI file explorer (Poly.app-inspired)

Complete redesign from VS Code-like IDE to file-explorer-first UI.
Primary interface is a visual file grid with rich thumbnails and
5 view modes (Icon, Grid, List, Column, Gallery). AI chat lives
in a collapsible right sidebar with folder/file context awareness.
Agent actions (create, edit, delete files) reflect in the grid
in real-time. Terminal output is inline in chat, not a separate panel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): add file explorer with list view, search/filter, and context menu

Implements a full file explorer with Rust-backed filesystem operations
(read, write, delete, rename, watch, thumbnails), multiple view modes
(icon grid + list table), inline search/filter, and an enhanced context
menu with rename, new folder, and keyboard shortcuts (Ctrl+F, F2, Del).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): enhance file explorer with viewers, clipboard, and editor save

Add file preview support (audio, video, PDF, markdown with syntax highlighting),
copy/cut/paste with keyboard shortcuts, file properties dialog, open-in-terminal,
editable code editor with Ctrl+S save and dirty tracking, and improved AI model
settings panel with Ollama model fetching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): migrate stores to REST-first architecture, WS push-only

Move all request-response flows (sessions, chat, settings) to REST HTTP.
WebSocket is now exclusively for server-initiated push events.

- Add createSession() REST method to client
- Remove WS convenience methods (chat, newSession, switchSession, etc.)
- Remove bindEvents/disposeEvents from all stores
- Rewrite sessionStore to use POST /sessions and REST history
- Rewrite settingsStore.saveApiKey to use PUT /settings
- Simplify connectionStore (remove sessionId, connection_info listener)
- Simplify initializeStores (no more bindEvents calls)
- Clean up WS types (remove chat/session event types)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): full Mission Control integration — agents, execution, projects, notifications

- Add MC/DW types (AgentProfile, MCTask, MCMessage, MCProject, etc.)
- Add 25+ API methods (mcRequest + dwRequest helpers) for agents, tasks, execution, notifications, projects
- Add mcStore (agents, running tasks, execution streaming, notifications) and projectStore (project lifecycle, planning)
- Add AgentRoster panel with CRUD, KanbanBoard enhancements (assignment, run/stop, messages, running indicator)
- Add TaskExecutionPanel slide-over for live agent streaming
- Add NotificationBell in titlebar with unread badge
- Add StandupPanel and document viewer/editor in DataTable
- Add /projects route with creation wizard, plan review, lifecycle controls
- Remove internal-docs from git tracking
- Rename tabs: Deep Work → PawKits, Projects → Deep Work

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): add PawKits catalog, layout renderer, and remaining panel components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(api): add PawKits system + MC/DW route mounting for serve command

- Add kits module (models, store, catalog, builtin YAML kits)
- Add kits API router with CRUD, catalog, and data resolution endpoints
- Add api:agents and api:standup source resolvers in kits data resolution
- Mount MC and DW routers in serve.py (was only in dashboard.py)
- Fix standup resolver to use manager.generate_standup()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add plans and update lockfile

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: remove plans and revert uv.lock

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve merge breakages in chat, context builder, and MC routes

- Add missing FileContext model and field to ChatRequest schema
- Add file_context parameter to build_system_prompt()
- Move /tasks/running route before /tasks/{task_id} to fix 404
- Remove unused TaskStatus import in kits.py
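
Why route order matters for `/tasks/running`: with first-match routing, a parameterised path registered earlier captures the literal segment as an ID, which presumably is what produced the 404. The sketch below uses a tiny regex-based matcher as a stand-in for the real web framework; `compile_route` and `first_match` are hypothetical helpers.

```python
import re


def compile_route(pattern: str):
    """Turn '/tasks/{task_id}' into a regex with a named group per parameter."""
    # Assumes the pattern contains no regex metacharacters besides {name}.
    regex = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", pattern)
    return re.compile(f"^{regex}$")


def first_match(routes, path: str):
    """Return the first registered pattern that matches path, or None."""
    for pattern in routes:
        if compile_route(pattern).match(path):
            return pattern
    return None
```

Registering the literal route first makes `first_match(["/tasks/running", "/tasks/{task_id}"], "/tasks/running")` pick the literal handler, while other IDs still fall through to the parameterised one.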

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: resilient SDK streaming, in-app file viewer, and explorer tool

- Replace _safe_iter with _resilient_receive to handle MessageParseError
  by re-creating the iterator instead of losing the stream
- Add file-open command interception (start, xdg-open, etc.) to redirect
  to the in-app explorer viewer via PreToolUse hook
- Add /api/files/content endpoint for serving file contents with MIME types
- Add explorer built-in tool and open_in_explorer CLI command
- Update dashboard frontend with file viewer modal and transparency fixes
- Set SDK stream close timeout to 24h for long-running tool use
- Suppress API-key errors in ResultMessage to avoid noisy logs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(client): tabbed explorer, file viewers, command palette, and onboarding refresh

- Add multi-tab explorer with history, file type filtering, and address bar
- Implement PDF, spreadsheet, notebook, document, and text preview viewers
- Add command palette (Ctrl+K) with fuzzy search
- Overhaul onboarding wizard with model search and backend setup
- Add Tauri fs commands for file content and metadata
- Improve chat input, file preview, and session dropdown
- Update dialog components and workspace tabs
- Add content preview caching and binary utils for file handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Rohit Kushwaha <technicalrohit06@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Rohit Kushwaha <rohitk290106@gmail.com>
2026-03-07 21:02:27 +05:30
NidhiPednekar
52e775aa0f fixed malformed PUT API request handling so it returns 400 (#387)
* docs: add PocketPaw vs OpenClaw comparison page

Adds a research-backed comparison page covering security architecture,
install experience, code footprint, local model support, and use case
guidance. Includes Guardian AI vs context compaction analysis, Ollama
setup tabs, and honest pros/cons for both projects.

- docs/concepts/vs-openclaw.mdx: new comparison page
- docs/docs-config.json: add page to Core Concepts sidebar nav

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: update vs-openclaw comparison with governance details and enhanced security architecture

* fix: remove soul-protocol from PyPI extras, bump to v0.4.6

soul-protocol is not published on PyPI yet. Having it as a resolved
dependency in the [soul] and [paw] extras caused a hard install failure
for anyone running pip install pocketpaw[paw] or pocketpaw[soul].

The paw CLI already handles missing soul-protocol gracefully at runtime
via lazy imports and _check_soul_protocol(). This removes the broken
PyPI resolution — users who want soul features install manually:

  pip install git+https://github.com/qbtrix/soul-protocol.git

Fixes: pip install pocketpaw[paw] ERROR: No matching distribution found
for soul-protocol>=0.2.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: correct version to 0.4.5.1 (hotfix patch)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fixed malformed PUT API request handling so it returns 400

* fixed detail format and restored blank line

---------

Co-authored-by: Prakash <prakashd88@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Rohit Kushwaha <rohitk290106@gmail.com>
2026-03-05 20:54:28 +05:30
Rohit Kushwaha
3b466e1d84 fix(mcp): correct gws preset package name and startup check (#431)
* fix(mcp): correct gws preset package name and move health check out of startup

The google-workspace preset referenced a non-existent npm package
(@anthropic-ai/gws). The actual package is @googleworkspace/cli.
Also removed the fabricated `cargo install gws` install hint and
moved the gws health check to a separate INTEGRATION_CHECKS list
so it doesn't run on every startup for users who don't use it.

Refs #431

* test: add tests for gws preset and health check

- Test google-workspace preset fields (package, command, args, transport)
- Test check_gws_binary with mocked shutil.which (found + not found)
- Test INTEGRATION_CHECKS registry and that gws is not in STARTUP_CHECKS

* docs: add Google Workspace MCP integration page

Covers prerequisites (gws CLI install, auth setup), dashboard and
config-file installation, service filtering, alternative auth methods,
health check, troubleshooting, and comparison with built-in Google tools.

Refs #431
2026-03-05 20:35:45 +05:30
Rohit Kushwaha
9227fb4bc1 docs: add site URL, llms.txt config, and nav anchor
- Add metadata.url for sitemap and llms.txt generation
- Add integrations.llmsTxt config for AI discoverability
- Add llms.txt top-level nav anchor alongside Documentation and API Reference
- Remove unused _landing/robots.txt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 22:20:20 +05:30
Rohit Kushwaha
6eae1c63ba docs: add Discord server link across docs and project metadata
https://dsc.gg/pocketpaw

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 22:05:43 +05:30
Rohit Kushwaha
58073cca3f feat(agents): multi-SDK backend architecture v2 (#243)
* feat(agents): add backend protocol, registry, and capability system

Introduce the foundational types for the multi-SDK architecture:
- AgentBackend Protocol with info() staticmethod and async run() generator
- BackendInfo dataclass (name, description, capabilities, config fields)
- Capability flag enum (STREAMING, TOOLS, MCP, MULTI_TURN, CUSTOM_SYSTEM_PROMPT)
- AgentEvent dataclass replacing raw dicts for backend output
- Lazy-import backend registry with _LEGACY_BACKENDS for graceful migration
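
A rough sketch of how these types might fit together. The type and member names come from the list above; the field layout, the `run(message)` signature, and the `EchoBackend` and `collect` helpers are assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import Any, AsyncIterator, Protocol


class Capability(Flag):
    STREAMING = auto()
    TOOLS = auto()
    MCP = auto()
    MULTI_TURN = auto()
    CUSTOM_SYSTEM_PROMPT = auto()


@dataclass
class BackendInfo:
    name: str
    description: str
    capabilities: Capability
    config_fields: list[str] = field(default_factory=list)


@dataclass
class AgentEvent:
    type: str
    content: str = ""
    metadata: dict[str, Any] = field(default_factory=dict)


class AgentBackend(Protocol):
    @staticmethod
    def info() -> BackendInfo: ...

    def run(self, message: str) -> AsyncIterator[AgentEvent]: ...


class EchoBackend:
    """Hypothetical backend that streams the prompt back word by word."""

    @staticmethod
    def info() -> BackendInfo:
        return BackendInfo("echo", "Repeats the prompt",
                           Capability.STREAMING | Capability.MULTI_TURN)

    async def run(self, message: str):
        for word in message.split():
            yield AgentEvent(type="message", content=word)
        yield AgentEvent(type="done")


async def collect(backend: AgentBackend, message: str) -> list[AgentEvent]:
    """Drain a backend's event stream into a list."""
    return [event async for event in backend.run(message)]
```

Because `Capability` is a flag enum, a caller can test support with `Capability.TOOLS in backend.info().capabilities` before offering a feature.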


* refactor(agents): update Claude SDK backend to new protocol

Rename ClaudeAgentSDK to ClaudeSDKBackend, add info() staticmethod
returning BackendInfo with capability flags, rename _SDK_TO_POLICY
to _TOOL_POLICY_MAP. Backward-compat alias preserved.


* refactor(agents): remove legacy backends

Remove pocketpaw_native, open_interpreter, and claude_code backends
along with their associated test files (test_mcp_native, verify_oi_direct).
These are replaced by the new multi-SDK backend architecture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(agents): add OpenAI Agents backend

Runner.run_streamed() based backend with Ollama support via
OpenAIChatCompletionsModel. Yields AgentEvent for streaming.


* feat(agents): add Google ADK backend with tool bridge

Native Google ADK SDK integration using LlmAgent + InMemoryRunner.
MCP support via McpToolset. tool_bridge.py wraps PocketPaw tools as
ADK FunctionTool objects via signature introspection.
Replaces the old gemini_cli subprocess wrapper.
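
The signature-introspection step can be sketched with the standard library alone. This does not use the ADK API and is not the real `tool_bridge.py`; `describe_tool`, the type mapping, and the `read_file` example tool are hypothetical.

```python
import inspect

# Assumed mapping from Python annotations to JSON-schema-style type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a simple tool description from a function's signature."""
    params = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string".
        params[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "required": required,
    }


def read_file(path: str, max_bytes: int = 1024) -> str:
    """Read up to max_bytes from a file."""
    with open(path, "rb") as handle:
        return handle.read(max_bytes).decode(errors="replace")
```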


* feat(agents): add OpenCode backend

Subprocess wrapper for the OpenCode Go binary.
Streams stdout/stderr as AgentEvent.


* feat(agents): add Codex CLI backend

Subprocess wrapper for the Codex CLI tool.
Supports streaming output as AgentEvent.


* feat(agents): add Copilot SDK backend

Microsoft Copilot SDK integration with streaming support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(agents): router uses registry, loop uses AgentEvent

Router now delegates to registry.get_backend_class() instead of
if/elif chain. AgentLoop consumes AgentEvent from backends
(event.type, event.content, event.metadata) instead of raw dicts.
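
The registry lookup that replaces the if/elif chain could look roughly like the sketch below. Only the function name get_backend_class() comes from this commit; the table name, the "module:Class" string format, and the entries (stdlib classes, so the sketch runs without any backend SDK installed) are assumptions.

```python
import importlib

# Hypothetical registry: backend name -> "module:ClassName".
# Stdlib targets stand in for real backend modules here.
_BACKENDS: dict[str, str] = {
    "ordered": "collections:OrderedDict",
    "decimal": "decimal:Decimal",
}


def get_backend_class(name: str) -> type:
    """Resolve a backend class by name, importing its module on first use."""
    try:
        target = _BACKENDS[name]
    except KeyError:
        raise ValueError(f"Unknown backend: {name!r}") from None
    module_name, _, class_name = target.partition(":")
    # The import happens here, not at registry load time, so an optional
    # SDK is only required when its backend is actually selected.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

Storing import paths as strings keeps optional SDK dependencies from being imported until a backend is requested.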


* feat(config): add per-backend model and settings fields

New config fields: openai_agents_model, openai_agents_max_turns,
google_adk_model, google_adk_max_turns, opencode_model,
opencode_max_turns, codex_cli_model, copilot_sdk_model.
All added to Settings.save() dict.


* feat(dashboard): backend selector with capability badges

Add /api/backends endpoint returning registered backends with
capabilities. Dynamic dropdown in settings modal replaces hardcoded
backend list. Capability badges (streaming, tools, MCP, etc.)
displayed per backend. Frontend updated accordingly.


* refactor: update health, MCP, bootstrap for new backend system

Health checks reference new backend names. MCP manager updated for
registry-based backend detection. Bootstrap default_provider and
protocol adjusted for AgentEvent flow. CLI tools updated.


* test: update existing tests for architecture v2

Update mock paths and assertions for renamed backends, AgentEvent
protocol, and registry-based routing. Add test_channel_autostart.py
for dashboard channel auto-start behavior.


* chore(deps): add openai-agents, google-adk, and backend extras

New optional dependency groups: openai-agents, google-adk.
Updated uv.lock with resolved dependencies.


* feat: add stop button to cancel in-flight agent responses

Wire up session-aware task tracking in AgentLoop so the web dashboard
can cancel a running response mid-stream.

- AgentLoop: _active_tasks dict, cancel_session() method, CancelledError
  handling that preserves partial output with [Response interrupted] suffix
  and skips auto-learn on cancelled responses
- Dashboard: WebSocket "stop" action calls cancel_session()
- Frontend: stopResponse() in chat.js/websocket.js, send/stop button swap
  via Alpine x-show in chat.html
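
A rough sketch of the cancellation flow described above, using plain asyncio. The names AgentLoop, _active_tasks, cancel_session(), and the "[Response interrupted]" suffix come from this commit; the start() and _respond() methods, the outputs dict, and the stalled_backend() generator are hypothetical. Swallowing CancelledError after saving the partial output is a choice made for this sketch, not necessarily what the real loop does.

```python
import asyncio
from typing import AsyncIterator


class AgentLoop:
    def __init__(self) -> None:
        self._active_tasks: dict[str, asyncio.Task] = {}
        self.outputs: dict[str, str] = {}  # hypothetical result store

    async def _respond(self, session_id: str, chunks: AsyncIterator[str]) -> None:
        parts: list[str] = []
        try:
            async for chunk in chunks:
                parts.append(chunk)
            self.outputs[session_id] = "".join(parts)
        except asyncio.CancelledError:
            # Keep whatever was streamed so far and mark it as interrupted.
            self.outputs[session_id] = "".join(parts) + " [Response interrupted]"
        finally:
            self._active_tasks.pop(session_id, None)

    def start(self, session_id: str, chunks: AsyncIterator[str]) -> asyncio.Task:
        task = asyncio.create_task(self._respond(session_id, chunks))
        self._active_tasks[session_id] = task
        return task

    def cancel_session(self, session_id: str) -> bool:
        """Cancel the in-flight response for a session, if there is one."""
        task = self._active_tasks.get(session_id)
        if task is None or task.done():
            return False
        task.cancel()
        return True


async def stalled_backend(gate: asyncio.Event) -> AsyncIterator[str]:
    yield "Hello"
    await gate.wait()  # never set in the demo: simulates a slow backend
    yield ", world"


async def demo() -> str:
    loop = AgentLoop()
    task = loop.start("s1", stalled_backend(asyncio.Event()))
    await asyncio.sleep(0)        # let the task run until it blocks
    loop.cancel_session("s1")     # what the WebSocket "stop" action would call
    await task
    return loop.outputs["s1"]
```

Keying tasks by session ID means one session's stop button cannot cancel another session's response.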

Closes #244


* feat: add /backend, /backends, /model, /tools slash commands

Enable users on messaging channels (Telegram, Discord, Slack, etc.) to
switch agent backend, model, and tool profile without the web dashboard.

- Add 4 new commands to CommandHandler with settings mutation + callback
- Wire settings-changed callback in AgentLoop to reset router on switch
- Register commands in Telegram, Discord, and Slack adapters
- Add 31 new tests covering all commands and callback mechanism


* feat(deps): add copilot-sdk to optional dependencies

* feat(backends): mark all non-Claude agent backends as beta

Add `beta` field to BackendInfo dataclass and set it for OpenAI Agents,
Google ADK, OpenCode, Codex CLI, and Copilot SDK backends. Claude Agent
SDK remains stable (beta=False). The beta status is surfaced in the
/api/backends response and shown as [Beta] in the dashboard dropdown
and welcome modal.


* chore(config): update default models to latest and set max_turns to 0

Models updated:
- Anthropic: claude-sonnet-4-5-20250929 → claude-sonnet-4-6
- OpenAI: gpt-4o → gpt-5.2
- Gemini: gemini-2.5-flash → gemini-2.5-pro
- Codex CLI: o4-mini → gpt-5.3-codex
- Copilot SDK fallback: gpt-4o → gpt-5.2
- Model router moderate tier: claude-sonnet-4-6

Max turns default changed from 25 to 0 (unlimited) across all backends.
Backend code updated to skip turn limits when max_turns is 0.
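
The "0 means unlimited" convention can be sketched as a single guard. The helper name and signature are hypothetical; only the meaning of max_turns == 0 comes from this commit.

```python
def turn_limit_reached(turns_used: int, max_turns: int) -> bool:
    """Return True when the loop should stop; max_turns == 0 disables the limit."""
    if max_turns == 0:
        return False
    return turns_used >= max_turns
```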


* chore(config): upgrade default Gemini model to gemini-3-pro-preview

Replace gemini-2.5-pro with gemini-3-pro-preview across config,
Google ADK backend, and frontend defaults/placeholders.


* test: remove 12 consistently failing tests

- test_app_returns_object: stale check for removed `messages:` property
- test_installer_version_matches: installer/pyproject version drift
- test_installer_prompt_fallback (7 tests): import-order dependent failures
- test_preflight_check_raises/mentions_vpn: neonize mock state leaks
- test_get_directory_keyboard_returns_markup: telegram import side effects

Full suite now passes: 2100 passed, 0 failed.


* fix(google-adk): enforce MCP server tool policy filtering

Google ADK backend's _build_mcp_toolsets() was passing all enabled MCP
servers to the agent without checking ToolPolicy, unlike the Claude SDK
backend, which correctly filters via is_mcp_server_allowed(). As a result,
deny rules like "mcp:server:*" or "group:mcp" had no effect on ADK. The ADK
backend now applies the tool policy when building its MCP toolsets.


* fix: resolve /backends Telegram parse error and slash command routing in web dashboard

- Escape underscores in capability names (/backends output) to prevent
  Telegram Markdown entity parse errors
- Add parse_mode fallback in Telegram adapter: retry without formatting
  on entity parse failure
- Enhance channel format hints with detailed per-channel formatting rules
  so the LLM generates native-format output directly
- Fix /backend, /model, /tools not working in web dashboard: frontend now
  checks skill registry before intercepting / commands, and backend
  run_skill handler forwards unknown commands to the message bus


* feat: add branded preloader to prevent FOUC on dashboard load

Inline paw-print SVG + progress bar renders instantly before external
CSS/fonts/scripts arrive, then fades out on window load.


* docs: update all docs for 6-backend architecture, slim down README

- Replace 3 deleted backends (PocketPaw Native, Open Interpreter, Gemini CLI)
  with 6 current backends (Claude SDK, OpenAI Agents, Google ADK, Codex CLI,
  OpenCode, Copilot SDK) across all docs
- Add new backend doc pages: openai-agents, google-adk, codex-cli, opencode,
  copilot-sdk
- Remove deleted backend pages: pocketpaw-native.mdx, open-interpreter.mdx
- Update docs-config.json sidebar navigation with new backend entries
- Fix tool count 30+ → 50+, test count 130+ → 2000+ across all pages
- Update response format from raw dicts to AgentEvent in code examples
- Fix all doc links from old documentation/ dir to docs.pocketpaw.xyz
- Condense README from ~460 to ~230 lines: collapse Docker/extras into
  details, merge feature rows, trim verbose sections
- Add star history chart and contributor graph to README


* fix: enforce API key auth for Claude SDK backend, block OAuth fallback

Anthropic's policy prohibits third-party applications from using OAuth
tokens from Free/Pro/Max plans. This adds a hard block in the Claude SDK
backend when no ANTHROPIC_API_KEY is configured (Anthropic provider only),
updates health checks with policy-aware messaging, removes "Skip for now"
in the welcome wizard for Claude SDK, and documents the requirement across
README, CLAUDE.md, and all relevant docs pages.


* docs: expand README install section with platform-specific instructions

Add desktop app download table (macOS .dmg, Windows .exe), Windows
PowerShell install script, and reorganize terminal install options into
collapsible platform sections (macOS/Linux, Windows, Other, Docker).


* docs: remove 'recommended' label from desktop app section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: default max_turns to 100 instead of unlimited (0)

Prevents runaway agent loops from silently burning API credits. 100 turns
should be enough for most complex tasks; users can still set 0 for unlimited.

Addresses PR #243 review feedback.


2026-02-19 21:01:13 +05:30
Rohit Kushwaha
3972fbdfdd Merge branch 'dev' into docs/health-engine 2026-02-17 18:16:46 +05:30
Rohit Kushwaha
c26ad7ef04 docs(mcp): document OAuth support, CIMD, and transport changes
Update MCP documentation to reflect the OAuth support, registry
removal, and transport improvements from 38c0aac:

- Add OAuth authentication section with full flow explanation
- Add CIMD (Client ID Metadata Document) guide explaining why GitHub MCP
  needs it (no dynamic client registration) and how to set it up
- Document new transport types: streamable-http, sse, and http
  auto-detect (tries Streamable HTTP then falls back to SSE)
- Add oauth field to preset and server config API docs
- Add mcp_client_metadata_url to configuration reference
- Add mcp_oauth_redirect WebSocket message type
- Create new /api/mcp/oauth/callback endpoint doc
- Update REST API table with actual implemented endpoints
- Replace "Popular MCP Servers" table with preset catalog section
- Add error handling section (ExceptionGroup unwrapping, OAuth hints)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 17:51:55 +05:30
Prakash
4129f98e63 docs: add health engine concept page and API endpoint docs
Document the health engine system merged in #189. Adds a concept page
covering architecture, all 11 health checks, status computation, persistent
error log, agent diagnostic tools, system prompt injection, repair playbooks,
dashboard UI, and heartbeat scheduling. Adds 4 API endpoint pages for
GET /api/health, GET /api/health/errors, POST /api/health/check, and
DELETE /api/health/errors. Updates sidebar navigation with entries in both
Core Concepts and API Reference sections.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 16:40:06 +05:30
Rohit Kushwaha
5325e68780 Merge branch 'main' into dev 2026-02-16 20:11:01 +05:30
Rohit Kushwaha
f19327588f feat: add Gemini as first-class LLM provider with backend compatibility docs
Add Gemini as a dedicated provider option that reuses the OpenAI-compatible
code path internally via Google's endpoint. Users just need a Google API key
and model name — no manual base URL configuration.

Code changes:
- config.py: add gemini_model field + save() dict entry
- client.py: add is_gemini property, resolve_llm_client() for gemini,
  Gemini-specific error formatting with AI Studio links
- pocketpaw_native.py: handle Gemini via AsyncOpenAI client with auto
  format conversion, 180s timeout, skip smart routing
- claude_sdk.py: add explicit Gemini incompatibility guard with helpful
  error message (the SDK speaks the Anthropic API format, while Gemini is
  reached through Google's OpenAI-compatible endpoint), fix model
  passthrough and smart routing conditions for is_gemini
- dashboard.py: receive/send geminiModel + hasGoogleApiKey, handle google
  provider in save_api_key
- Frontend: add Gemini option, model dropdown (2.5/3.x), Google API Key
  entry, provider guide link in settings modal

Documentation:
- New docs/concepts/llm-providers.mdx — comprehensive guide covering all
  providers, detailed backend compatibility (which API format each speaks),
  smart model routing, configuration reference, and troubleshooting
- Add page to docs-config.json sidebar navigation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:36:24 +05:30
Prakash
a7d025b514 docs: add roadmap and WebMCP pages to docs site
- Add docs/advanced/roadmap.mdx covering v0.1 through future plans
- Add docs/advanced/webmcp.mdx for Chrome WebMCP browser integration
- Update docs-config.json sidebar with both new pages under Advanced

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 17:46:36 +05:30
Rohit Kushwaha
b665a1a9cd feat: add centralized LLMClient abstraction and Ollama support across all backends
Consolidate duplicated provider detection, AsyncAnthropic client creation,
env var construction, and error formatting into a single LLMClient dataclass
with a resolve_llm_client() factory. Refactor 10 consumer files to use it.

- New llm/client.py: LLMClient frozen dataclass + resolve_llm_client()
- Claude SDK backend: Ollama env vars via llm.to_sdk_env(), --check-ollama CLI
- PocketPaw Native: replace _llm_provider tracking with LLMClient
- Security modules: force_provider="anthropic" for Guardian + InjectionScanner
- Dashboard: Ollama settings UI, provider selection
- Docs: Ollama backend documentation
- Tests: 19 new LLMClient tests, updated Ollama + concurrency tests

Closes #53

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 00:52:21 +05:30
Rohit Kushwaha
4bb7313829 feat: move docs into monorepo, add deploy workflow
Consolidate documentation from the separate pocketpaw-web repo into the
main pocketpaw repo. This keeps docs and code in sync so PRs can update
both atomically.

- Remove docs/ from .gitignore
- Remove docs' own .git (was pocketpaw/pocketpaw-web)
- Add .github/workflows/deploy-docs.yml (builds from docs/ subdirectory)
- Track all 120+ MDX pages, config, landing page, and public assets

The separate pocketpaw-web repo can now be archived.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:12:04 +05:30