
This commit is contained in:
Prakash
2026-02-01 15:10:28 +05:30
commit 1b0d39c8de
55 changed files with 8525 additions and 0 deletions

1
.python-version Normal file
View File

@@ -0,0 +1 @@
3.12

21
LICENSE Normal file
View File

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 PocketClaw Team
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

147
README.md Normal file
View File

@@ -0,0 +1,147 @@
# 🦀 PocketClaw
> **The AI agent that runs on your laptop, not a datacenter.**
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![UV](https://img.shields.io/badge/uv-package%20manager-blueviolet)](https://docs.astral.sh/uv/)
PocketClaw is a self-hosted, cross-platform personal AI agent you control via **Telegram**. Unlike cloud-hosted AI assistants, PocketClaw runs on _your_ machine, respects _your_ privacy, and works even on that dusty laptop in your closet.
## ✨ Features
- 🔋 **Sleep Mode** — Near-zero CPU when idle, wakes on your message
- 🔒 **Local-First** — Runs on your machine, your data stays yours
- 🧠 **Dual Agent Backend** — Choose between Open Interpreter or Claude Code
- 🤖 **Multi-LLM Support** — Ollama (local), OpenAI, or Anthropic
- 📱 **Telegram-First** — Control from anywhere, no port forwarding needed
- 🖥️ **Cross-Platform** — macOS, Windows, Linux
## 🚀 Quick Start
### Prerequisites
Install [UV](https://docs.astral.sh/uv/) (the fast Python package manager):
```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
### Install & Run
```bash
# Clone the repo
git clone https://github.com/pocketclaw/pocketclaw.git
cd pocketclaw
# Run (UV handles everything automatically!)
uv run pocketclaw
```
That's it! UV will:
1. Create a virtual environment
2. Install all dependencies
3. Run PocketClaw
4. Open your browser for setup
### One-liner (after release)
```bash
uvx pocketclaw
```
## 🤖 Agent Backends
### Open Interpreter (Default)
Works with any LLM (Ollama, OpenAI, Claude). Full shell and Python execution.
```
User: "Find all PDFs in Downloads and organize them by date"
Agent: [Runs shell commands, moves files]
Agent: "Done! Moved 23 PDFs into dated folders."
```
### Claude Code
Uses Anthropic's computer use capability. Can see your screen and control GUI.
```
User: "Open Chrome and search for weather"
Agent: [Takes screenshot, clicks, types]
Agent: "Done! Showing weather results."
```
## ⚙️ Configuration
PocketClaw stores config in `~/.pocketclaw/config.json`:
```json
{
  "telegram_bot_token": "your-bot-token",
  "allowed_user_id": 123456789,
  "agent_backend": "open_interpreter",
  "llm_provider": "auto",
  "ollama_model": "llama3.2",
  "openai_api_key": "sk-...",
  "anthropic_api_key": "sk-ant-..."
}
```
Or use environment variables:
```bash
export POCKETCLAW_OPENAI_API_KEY="sk-..."
export POCKETCLAW_AGENT_BACKEND="claude_code"
```
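Both sources can be combined. The sketch below shows the assumed precedence (environment variables override `config.json`); `load_config` is a hypothetical stdlib-only helper for illustration — the actual implementation uses pydantic-settings:

```python
import json
import os
from pathlib import Path

def load_config(path: Path = Path.home() / ".pocketclaw" / "config.json") -> dict:
    """Load config.json, then let POCKETCLAW_* env vars override it (assumed precedence)."""
    config = json.loads(path.read_text()) if path.exists() else {}
    prefix = "POCKETCLAW_"
    for key, value in os.environ.items():
        if key.startswith(prefix):
            # e.g. POCKETCLAW_AGENT_BACKEND -> agent_backend
            config[key[len(prefix):].lower()] = value
    return config
```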
## 🛠️ Telegram Controls
| Button | Function |
| ------------- | ---------------------------------- |
| 🟢 Status | CPU, RAM, disk, battery, uptime |
| 📁 Fetch | Browse and download files |
| 📸 Screenshot | Capture current screen |
| 🧠 Agent Mode | Toggle autonomous agent |
| 🛑 Panic | Emergency stop all agent processes |
| ⚙️ Settings | Switch agent/LLM backends |
## 🔐 Security
- **Single User Lock** — Only one Telegram user can control the bot
- **File Jail** — File operations restricted to home directory
- **Panic Button** — Hard kill switch for runaway agents
- **Local LLM Option** — Keep everything on-device with Ollama
## 🧑‍💻 Development
```bash
# Clone
git clone https://github.com/pocketclaw/pocketclaw.git
cd pocketclaw
# Install with dev dependencies
uv sync --dev
# Run tests
uv run pytest
# Lint
uv run ruff check .
# Format
uv run ruff format .
```
## 🤝 Contributing
PRs welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
## 📄 License
MIT © PocketClaw Team

188
docs/STATUS.md Normal file
View File

@@ -0,0 +1,188 @@
# PocketClaw: Project Status & Roadmap
> Last updated: 2026-02-01
## 🎯 Current Status: **MVP Complete + Web Dashboard**
The core functionality is implemented and tested.
---
## ✅ Completed (v0.1.0)
### Core Infrastructure
- [x] UV package manager setup
- [x] Cross-platform project structure
- [x] Pydantic-based configuration
- [x] MIT License
- [x] Unit tests (19 passing)
### Telegram Bot
- [x] Long-polling gateway
- [x] Persistent keyboard UI
- [x] User authorization (single-user lock)
- [x] Settings menu with inline keyboard
### Web Dashboard (New!)
- [x] Full web UI for testing without Telegram
- [x] WebSocket real-time updates
- [x] API key input fields (Anthropic, OpenAI)
- [x] Live settings persistence
### Web Pairing
- [x] FastAPI server on localhost:8888
- [x] QR code generation
- [x] Beautiful setup UI
- [x] Auto-shutdown after pairing
### Tools
- [x] 🟢 Status (CPU, RAM, Disk, Battery, Uptime)
- [x] 📁 Fetch (file browser with inline keyboard)
- [x] 📸 Screenshot (with graceful fallback)
- [x] 🛑 Panic (hard kill switch)
### LLM Router
- [x] Auto-detection (Ollama → OpenAI → Claude)
- [x] Ollama client (local LLMs)
- [x] OpenAI client
- [x] Anthropic client
- [x] Conversation history
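The auto-detection order above can be sketched as follows (illustrative only — function and key names here are assumptions, not PocketClaw's actual API; `ollama_up` stands in for a probe of the local Ollama server):

```python
def detect_provider(env: dict, ollama_up: bool) -> str:
    """Pick an LLM provider in the documented order: Ollama, then OpenAI, then Anthropic."""
    if ollama_up:
        return "ollama"  # prefer local inference when an Ollama server responds
    if env.get("openai_api_key"):
        return "openai"
    if env.get("anthropic_api_key"):
        return "anthropic"
    raise RuntimeError("No LLM provider available")
```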
### Agent Backends
- [x] Agent router (switchable via settings)
- [x] Open Interpreter wrapper
- [x] Claude Code wrapper (with computer use tools)
---
## 🔄 In Progress
| Task | Status | Notes |
| ------------------ | ------------------- | ----------------------------------- |
| End-to-end testing | 🟡 Pending | Need Telegram bot token to test |
| QR deep link flow | 🟡 Needs refinement | Current flow requires manual /start |
---
## 📋 TODO (v0.2.0)
### High Priority
- [ ] **Fix QR deep link** — Auto-extract bot username for proper deep link
- [ ] **Test Open Interpreter integration** — Verify streaming works
- [ ] **Test Claude Code integration** — Test computer use tools
- [ ] **Error handling** — Add proper error messages for common failures
- [ ] **Logging** — Add structured logging with levels
### Medium Priority
- [ ] **Cost controls** — Warn user before expensive LLM operations
- [ ] **Rate limiting** — Prevent Telegram API spam
- [ ] **Multi-user support** — Allow household/team access
- [ ] **Conversation persistence** — Save chat history to disk
### Nice to Have
- [ ] **PyInstaller binaries** — Single executable for distribution
- [ ] **Auto-update** — Check for new versions
- [ ] **Plugin system** — User-defined tools
- [ ] **Webhook mode** — Alternative to long-polling
---
## 📋 TODO (v0.3.0 - Future)
- [ ] **Tailscale integration** — Secure remote access
- [ ] **Web dashboard** — Alternative to Telegram
- [ ] **Mobile app** — Native iOS/Android
- [ ] **Voice messages** — Process voice via Whisper
- [ ] **Scheduled tasks** — Cron-like automation
---
## 📊 File Structure
```
pocketclaw/
├── pyproject.toml          ✅ UV config
├── .python-version         ✅ Python 3.12
├── README.md               ✅ Documentation
├── LICENSE                 ✅ MIT
├── docs/
│   ├── STATUS.md           ✅ This file
│   ├── idea.md             📄 Original idea
│   ├── openclaw.md         📄 Competition analysis
│   └── tech-spec.md        📄 Original tech spec
└── src/pocketclaw/
    ├── __init__.py         ✅
    ├── __main__.py         ✅ Entry point
    ├── config.py           ✅ Settings
    ├── bot_gateway.py      ✅ Telegram handlers
    ├── web_server.py       ✅ QR pairing
    ├── tools/
    │   ├── __init__.py     ✅
    │   ├── status.py       ✅
    │   ├── fetch.py        ✅
    │   └── screenshot.py   ✅
    ├── llm/
    │   ├── __init__.py     ✅
    │   └── router.py       ✅
    └── agents/
        ├── __init__.py     ✅
        ├── router.py       ✅
        ├── open_interpreter.py ✅
        └── claude_code.py  ✅
```
---
## 🚀 How to Test
```bash
# 1. Create a Telegram bot
# Visit @BotFather, send /newbot, get token
# 2. Run PocketClaw
cd /Users/prakash/Documents/Qbtrix/pocketClaw
uv run pocketclaw
# 3. Setup
# - Browser opens to localhost:8888
# - Paste bot token
# - Add API keys (optional)
# - Scan QR / send /start to bot
# 4. Test
# - Tap 🟢 Status → see system stats
# - Tap 📁 Fetch → browse files
# - Tap 📸 Screenshot → get screen image
# - Tap 🧠 Agent Mode → enable agent
# - Type "List files in Downloads" → agent executes
```
---
## 📈 Metrics
| Metric | Value |
| --------------- | --------- |
| Total files | 17 |
| Lines of code | ~1,500 |
| Dependencies | 15 direct |
| Python version | 3.11+ |
| Package manager | UV |
---
## 🔗 Links
- Repository: [github.com/pocketclaw/pocketclaw](https://github.com/pocketclaw/pocketclaw)
- Issues: TBD
- Discord: TBD

142
docs/idea.md Normal file
View File

@@ -0,0 +1,142 @@
Product Name: PocketClaw
Tagline: The Sovereign Interface for your "Dusty Laptop."
Version: 1.0 (The "Strike" Edition)
1. Executive Summary
PocketClaw is the mobile command center for the "Sovereign AI" revolution. It turns any idle computer (Mac Mini, old Lenovo, dusty laptop) into a powerful, autonomous employee that you control via Telegram.
While others are "sitting on the sidelines," PocketClaw users are deploying agents that work, code, and market themselves while their owners sleep. It is zero-config, runs on local hardware, and includes a "Hype Engine" that autonomously generates viral content to spread the movement.
2. User Stories (The "Hype" Cycle)
The "Underground Builder"
"I have a Python script running my crypto trading bot on an old laptop. I want to text it 'Status?' and get a P&L report instantly, without SSH or complex dashboards."
The "Viral Launcher" (New)
"I just automated my entire data entry job. I want PocketClaw to screen-record itself doing the work, add a cool AI voiceover saying 'While you work 9-5, I work 24/7', and save it as a TikTok-ready video so I can flex on X (Twitter)."
The "Sovereign Individual"
"I don't trust the cloud. I want my files, my vision models, and my agent to run locally. PocketClaw lets me control my sovereign infrastructure from a secure, encrypted chat."
3. The "Zero-Config" Architecture
3.1 Connectivity Stack
Platform: Telegram Bot API (Free, Encrypted Cloud).
Protocol: Long-Polling (No open ports, works behind Starlink/Corporate Firewalls).
Auth: User-Generated Token (Zero liability for us).
3.2 The Application (Desktop Side)
Core: Python 3.11+.
Agent Brain: Wraps Open Interpreter or Claude Code.
Distribution: Single binary (.exe / .dmg) via PyInstaller.
4. Feature Specification
The Telegram interface uses a persistent "Command Deck" layout.
🟢 Status
📁 Fetch
🧠 AGENT
📢 HYPE
🛑 Kill
4.1 🧠 Agent Mode (The Sovereign Upgrade)
The Brain: Pipes Telegram text directly to open-interpreter or claude-code.
Capability: Full shell access. Can install packages, run Python, manage files.
Workflow:
User: "Analyze my Downloads folder and delete duplicates."
PocketClaw: [Executes Shell Commands]
PocketClaw: "Done. Deleted 14 files. Saved 200MB."
4.2 📢 Hype Mode (The Auto-Marketer)
Concept: The Agent markets itself.
Trigger: User types /hype "I just built a website in 3 minutes".
The Loop:
Record: PocketClaw starts a headless screen recorder (cv2 or ffmpeg).
Action: The Agent performs the task (scrolling code, opening windows).
Voiceover: Generates a TTS audio file: "Look at me go. I am PocketClaw. I built this while my human was eating dinner."
Edit: Merges Video + Audio using ffmpeg.
Delivery: Uploads the .mp4 directly to the Telegram chat.
Virality: The user just has to click "Forward" to X/TikTok.
4.3 Standard Tools (The Essentials)
🟢 Status: CPU/RAM/Thermal stats.
📁 Fetch: Browse file system -> Upload to Telegram.
👁️ Vision: Take Screenshot / Webcam photo.
🛑 Kill: Force kill the Agent process (Safety switch).
5. Technical Stack
Core Libraries
Bot Interface: python-telegram-bot
Agent Logic: open-interpreter (The brain).
System Control: psutil (Stats), pyautogui (Screenshots/Mouse).
The Hype Stack (Media Generation)
Screen Recording: pyscreenrec (Lightweight, cross-platform) or cv2 (Robust).
Voice (TTS): pyttsx3 (Offline, robotic/hacker vibe) or edge-tts (High quality, free Microsoft voices).
Video Processing: moviepy (For merging audio/video programmatically).
6. The "Self-Marketing" Launch Plan
PocketClaw will launch by using PocketClaw.
Phase 1: We release the binary.
Phase 2: We post a video of PocketClaw installing itself and then tweeting about it.
Phase 3: We encourage users to use /hype to show off their "Dusty Laptop" setups.
Campaign: "#DustyLaptopRevolution"
Incentive: Best auto-generated video gets a shoutout from the official Moltbot/OpenClaw accounts.
7. Development Roadmap (Weekend Sprint)
Saturday AM: Build the "Dumb" Bot (Status, Fetch, Screenshot).
Saturday PM: Integrate open-interpreter loop (Agent Mode).
Sunday AM: Build the "Hype Engine" (Screen recorder + TTS script).
Sunday PM: Record the launch video using the tool itself.

86
docs/issues/log.md Normal file
View File

@@ -0,0 +1,86 @@
# PocketClaw Issues Log
## Issue #1: Agent UI Silent During Execution
**Status:** ✅ Fixed
**Date:** 2026-02-01
### Problem
When using the web dashboard with Agent Mode ON:
- Open Interpreter output appeared in terminal (verbose)
- Web UI showed nothing until execution completed
- Long agent responses made UI appear "frozen"
### Root Cause
The `open_interpreter.py` agent wrapper collected ALL response chunks in a sync function, then yielded them only AFTER the entire execution completed.
### Fix
Implemented real-time streaming using `asyncio.Queue`:
- Chunks are pushed to queue from sync thread as they arrive
- Async generator yields chunks immediately to WebSocket
- Added timeout handling for long operations
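The pattern behind the fix can be sketched as follows (function names are illustrative, not the actual wrapper code): a worker thread pushes chunks into an `asyncio.Queue` via `call_soon_threadsafe`, and an async generator drains the queue as chunks arrive.

```python
import asyncio
import threading

_DONE = object()  # sentinel marking end of stream

async def stream_chunks(run_sync, loop=None):
    """Bridge a blocking chunk producer into an async generator.

    `run_sync(emit)` is assumed to call `emit(chunk)` for each chunk
    from a worker thread (sketch of the fix, not the exact code).
    """
    loop = loop or asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def worker():
        # Push chunks from the sync thread into the loop's queue as they arrive.
        run_sync(lambda chunk: loop.call_soon_threadsafe(queue.put_nowait, chunk))
        loop.call_soon_threadsafe(queue.put_nowait, _DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        # Timeout guards against a producer that hangs mid-task.
        chunk = await asyncio.wait_for(queue.get(), timeout=300)
        if chunk is _DONE:
            break
        yield chunk
```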
### Files Changed
- `src/pocketclaw/agents/open_interpreter.py`
---
## Issue #2: Agent Misidentifies Active VS Code Project
**Status:** 🟡 Open
**Date:** 2026-02-01
### Problem
When asked "what's running in my VS Code", Open Interpreter:
- Found `pocketclaw --web` processes
- Incorrectly assumed that was the active VS Code project
- Didn't check the actual VS Code window/workspace
### Root Cause
Open Interpreter used process list (`ps aux`) to infer what's running, but VS Code's active workspace isn't exposed via process info.
### Potential Fix
Could add a tool that reads VS Code's recently opened workspaces from:
```
~/Library/Application Support/Code/storage.json
```
Or use AppleScript to query the front VS Code window's title.
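A sketch of the storage.json approach (the `openedPathsList`/`entries` key layout is an assumption — VS Code has changed this schema between versions, so verify against your install before relying on it):

```python
import json
from pathlib import Path

def recent_workspaces(storage: dict) -> list[str]:
    """Extract recently opened folder URIs from parsed storage.json data.

    Key names here are assumptions; adjust to the schema your VS Code writes.
    """
    entries = storage.get("openedPathsList", {}).get("entries", [])
    return [e["folderUri"] for e in entries if "folderUri" in e]

def load_storage(path: Path = Path.home() / "Library/Application Support/Code/storage.json") -> dict:
    return json.loads(path.read_text()) if path.exists() else {}
```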
### Status
This is an Open Interpreter limitation, not a PocketClaw bug. The agent made a reasonable inference but got it wrong.
---
## Issue #3: Missing `python-multipart` dependency
**Status:** ✅ Fixed
**Date:** 2026-02-01
### Problem
Running `uv run pocketclaw` failed with:
```
Form data requires "python-multipart" to be installed.
```
### Fix
Added dependency:
```bash
uv add python-multipart
```

50
docs/openclaw.md Normal file
View File

@@ -0,0 +1,50 @@
### **Executive Report: The Rise of OpenClaw**
**Date:** February 1, 2026
**Subject:** Overview of OpenClaw (formerly Clawdbot/Moltbot) and Market Impact
#### **1. What is OpenClaw?**
**OpenClaw** is currently the most viral open-source AI project of early 2026. It is a **self-hosted, autonomous personal agent** designed to run on your local hardware (typically a Mac Mini or high-end PC) and serve as a "24/7 Jarvis."
Unlike standard AI tools (like ChatGPT or Claude) that wait for you to type in a box, OpenClaw is designed to live in your background processes and communicate with you via **messaging apps** (WhatsApp, Telegram, Signal, iMessage).
**Key Capabilities:**
- **"It Has Hands":** It doesn't just chat; it executes terminal commands, manages local files, and controls web browsers to complete tasks (e.g., "Go find my flight invoice and save it to the Finance folder").
- **Proactive Nature:** It is famous for messaging _you_ first. Users report waking up to "Morning Briefings" sent to their WhatsApp without asking.
- **Local Memory:** It maintains a persistent "brain" using local Markdown files, allowing it to remember context indefinitely across sessions.
#### **2. The "Viral" Timeline & Naming Drama**
The project's explosion in popularity was fueled by a mix of high utility and community drama regarding its name.
- **Phase 1: Clawdbot (Jan 2026):** Released by Peter Steinberger (founder of PSPDFKit), it immediately went viral as "Claude with hands." It gained ~10,000 GitHub stars in days.
- **Phase 2: Moltbot (Late Jan 2026):** Following a trademark dispute with Anthropic (makers of Claude), the project was hastily renamed "Moltbot" (keeping the crustacean theme, implying "molting" into something new).
- **Phase 3: OpenClaw (Current):** As of roughly a week ago, the community settled on **OpenClaw** to emphasize its open-source nature. It currently sits at over **68,000** GitHub stars, making it one of the fastest-growing repos in history.
#### **3. Why People Are "Going Crazy" For It**
The hype is driven by three specific cultural and technical factors:
- **The "Mac Mini" Frenzy:**
There is a trending phenomenon of developers buying dedicated Mac Minis solely to run OpenClaw 24/7. This has led to minor hardware shortages and a wave of "My AI Server Setup" posts on X (Twitter) and Reddit.
- **"Crustafarianism" & Memes:**
The community has adopted a semi-ironic "cult" persona (referring to themselves as "Crustafarians" or followers of the Blue Lobster), creating a massive amount of meme-driven engagement that amplifies the project's reach beyond just developers.
- **The "Zero-Employee" Dream:**
Influencers are heavily promoting OpenClaw as the tool that enables a "one-person unicorn," claiming the agent can replace junior developers, executive assistants, and social media managers.
#### **4. Critical Reception & Risks**
While popular, the project is polarizing:
- **Security Panic:** Cybersecurity experts have labeled it a "security nightmare" because users are giving an autonomous AI full read/write access to their file systems and terminal. There have been reports of users accidentally letting the bot delete critical files or expose private keys.
- **Complexity:** Despite the hype, it is not "plug and play." It requires Docker, terminal knowledge, and API key management, leading to a flooded GitHub Issues page from non-technical users trying to install it.
#### **Summary Verdict**
OpenClaw is the definitive "OS Agent" of 2026. It has successfully bridged the gap between a fun coding experiment and a genuine productivity lifestyle tool, largely due to its unique choice to live inside **WhatsApp/Telegram** rather than a command line.

222
docs/tech-spec.md Normal file
View File

@@ -0,0 +1,222 @@
Technical Design Document: PocketClaw
Component: Desktop Agent / Mobile Interface
Version: 1.1 (Web Pairing Added)
1. System Architecture
PocketClaw operates as a single-binary daemon running on the host machine (User's Laptop/Server). It acts as a bridge between the Telegram Bot API and the local operating system shell.
High-Level Data Flow
Input: User sends message via Telegram App OR interacts via Local Web Dashboard.
Transport: Telegram Cloud -> PocketClaw Daemon (via HTTPS Long-Polling).
Routing:
Command Router: Intercepts specific commands (/start, /kill).
Agent Engine: Pipes natural language to LLM/Interpreter.
Execution: Python executes system calls (subprocess, os, pyautogui).
Output: Logs/Files/Video sent back to Telegram Cloud -> User.
2. Interface Specifications (Telegram)
The interface relies on a persistent ReplyKeyboardMarkup. We have removed the explicit "Hype" tab in favor of natural language triggers.
2.1 Persistent Command Deck
🟢 Status
📁 Fetch
🧠 Agent Mode
🛑 Panic
2.2 Control Logic
🟢 Status: Triggers system_tools.get_stats() → Returns text report.
📁 Fetch: Triggers file_manager.list_dir(current_path) → Returns InlineKeyboard navigation.
🧠 Agent Mode: Toggles STATE = AGENT_ACTIVE. Hides buttons, enables free-text input.
🛑 Panic: Triggers process_manager.emergency_kill() → Hard kills agent subprocesses.
3. Module Specifications
The application codebase is divided into five core modules.
3.1 bot_gateway.py (The Listener)
Library: python-telegram-bot (Async).
Responsibility:
Handles Long-Polling (updater.start_polling()).
Auth Gate: Middleware that checks update.effective_user.id against ALLOWED_USER_ID. Drops unauthorized traffic.
Router: Directs traffic to system_tools or agent_runtime.
3.2 web_server.py (The Pairing Bridge)
Library: FastAPI + Uvicorn (Lightweight).
Port: 8888 (Default).
Responsibility:
Serves the Pairing Dashboard at http://localhost:8888.
Generates a dynamic QR Code using qrcode library.
Deep Link: The QR code encodes https://t.me/YourBotName?start=<SESSION_SECRET>.
Handshake: When the user scans the QR and taps Start on Telegram, bot_gateway validates the <SESSION_SECRET> and saves the User's Chat ID as the ALLOWED_USER_ID.
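The deep-link construction can be sketched as follows (names are illustrative; note Telegram limits the `?start=` payload to 64 URL-safe characters, hence the short token):

```python
import secrets

def make_deep_link(bot_username: str) -> tuple[str, str]:
    """Build the t.me deep link that the pairing QR encodes, plus its session secret."""
    session_secret = secrets.token_urlsafe(16)  # short enough for Telegram's 64-char limit
    link = f"https://t.me/{bot_username}?start={session_secret}"
    return link, session_secret

# Rendering the QR image itself would use the qrcode library, e.g.:
#   import qrcode
#   qrcode.make(link).save("pairing.png")
```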
3.3 agent_runtime.py (The Brain)
Library: open-interpreter (Python Interface).
Configuration:
auto_run = True (No human confirmation step).
offline = False (Allow API usage if configured).
Context Window: Maintains a sliding window of the last 10 interactions.
Tool Access: Gives the LLM explicit access to media_engine tools (see 3.5).
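The sliding context window maps naturally onto a bounded deque — a minimal sketch, with illustrative names rather than the actual runtime code:

```python
from collections import deque

class ConversationWindow:
    """Keep only the last N interactions as agent context (sketch of section 3.3)."""

    def __init__(self, max_turns: int = 10):
        self._turns = deque(maxlen=max_turns)  # oldest turns fall off automatically

    def add(self, role: str, content: str) -> None:
        self._turns.append({"role": role, "content": content})

    def messages(self) -> list[dict]:
        return list(self._turns)
```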
3.4 system_tools.py (The Hands)
Stats: Uses psutil to fetch CPU load, RAM usage, and uptime.
Files: Uses pathlib for directory traversal.
Constraint: ROOT_DIR defaults to the user's home directory; traversal above Home via os.pardir (..) is blocked (basic jail).
Vision: Uses pyautogui.screenshot() and cv2 (OpenCV) to capture webcam frames.
3.5 media_engine.py (The "Hype" Capability)
This module provides tools for the Agent to "market itself" upon request.
Tool: start_screen_recording(filename)
Spawns a background thread using cv2.VideoWriter grabbing screen frames.
Tool: stop_screen_recording()
Finalizes the .mp4 file.
Tool: generate_voiceover(text)
Uses edge-tts (Microsoft Edge Text-to-Speech) to generate high-quality .mp3 files locally.
Tool: compile_video(video_path, audio_path)
Uses ffmpeg-python (or moviepy) to merge the screen recording with the voiceover track.
4. Pairing & Setup Flow (The Web QR)
This flow ensures "Normies" never touch a config file.
Install: User runs PocketClaw.exe.
Auto-Launch: The app opens the default browser to http://localhost:8888.
Display: A clean web page shows:
"PocketClaw is Ready."
A large QR Code.
Text: "Scan with your phone camera to connect Telegram."
Action: User scans QR.
Redirect: Phone opens Telegram App -> Presses "Start".
Lock: The Desktop App detects the handshake, saves the User ID, and shuts down the Web Server (for security).
5. Agent Tool Definitions (JSON Schema)
The Agent (Open Interpreter) is initialized with these custom tools to enable the "Self-Marketing" features naturally.
[
  {
    "function": "record_task",
    "description": "Records the screen while executing a task, adds a voiceover, and returns the video file.",
    "parameters": {
      "type": "object",
      "properties": {
        "task_description": {
          "type": "string",
          "description": "What you are doing, for the voiceover script."
        },
        "duration": {
          "type": "integer",
          "description": "Expected duration in seconds."
        }
      }
    }
  }
]
Usage Example:
User: "Run the payroll script and make a hype video about it."
Agent: Calls record_task(task_description="Running payroll automation...").
System: Starts recording -> Runs script -> Generates TTS -> Merges -> Sends Video.
6. Security & Safety Mechanisms
6.1 The "Panic" Switch
Since the Agent has shell access, we need a hard kill switch.
Mechanism: The 🛑 Panic button operates on a separate thread.
Action:
Sets global STOP_SIGNAL = True.
Iterates through psutil.Process(pid).children() of the Agent runtime.
Sends SIGKILL to all child processes immediately.
Resets Agent Context.
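The kill sequence above can be sketched as a function (illustrative names, with the process list and kill action injectable for testing — with psutil, `agent_children` would be `psutil.Process(agent_pid).children(recursive=True)`):

```python
import signal

def emergency_kill(agent_children, kill=None, state=None):
    """Hard-stop the agent: set the stop flag, SIGKILL every child, reset context."""
    state = state if state is not None else {}
    state["STOP_SIGNAL"] = True
    # Default kill action; SIGKILL gives the child no chance to clean up.
    kill = kill or (lambda proc: proc.send_signal(signal.SIGKILL))
    for child in agent_children:
        kill(child)
    state["agent_context"] = []  # reset agent context
    return state
```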
6.2 File System Jail (Soft)
The file_manager module strictly validates paths.
if not os.path.abspath(path).startswith(home_dir): raise AccessDenied
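A slightly more robust sketch of that check (hypothetical helper — `Path.is_relative_to` avoids the classic `startswith` pitfall where `/home/userX` passes a `/home/user` prefix test):

```python
from pathlib import Path

class AccessDenied(Exception):
    pass

def jail_check(path: str, root: Path = Path.home()) -> Path:
    """Resolve a path and refuse anything outside the jail root."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_relative_to(root.resolve()):
        raise AccessDenied(f"{resolved} is outside {root}")
    return resolved
```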
7. Deployment Strategy
7.1 Environment Variables (.env)
TELEGRAM_BOT_TOKEN="1234:ABC..."
ALLOWED_USER_ID="999888777"
OPENAI_API_KEY="sk-..." # Or ANTHROPIC_API_KEY / LOCAL_LLM_URL
7.2 Build Process (PyInstaller)
To ensure the "Dusty Laptop" compatibility, we compile to a standalone executable.
build_spec.spec configuration:
Hidden Imports: pydantic, tiktoken, moviepy.audio.fx.all, uvicorn, fastapi.
Data Files: Include ffmpeg binary if not assuming system install.
Command: pyinstaller --onefile --noconsole --name PocketClaw main.py

86
pyproject.toml Normal file
View File

@@ -0,0 +1,86 @@
[project]
name = "pocketclaw"
version = "0.1.0"
description = "The AI agent that runs on your laptop, not a datacenter"
readme = "README.md"
license = "MIT"
requires-python = ">=3.11"
keywords = ["ai", "agent", "telegram", "assistant", "automation"]
authors = [
    { name = "PocketClaw Team" }
]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Environment :: Console",
    "Intended Audience :: End Users/Desktop",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Topic :: Home Automation",
]
dependencies = [
    # Telegram Bot
    "python-telegram-bot>=21.0",
    # Web Server (QR Pairing)
    "fastapi>=0.109.0",
    "uvicorn[standard]>=0.27.0",
    "qrcode[pil]>=7.4",
    # System Tools
    "psutil>=5.9.0",
    "pyautogui>=0.9.54",
    "pillow>=10.0.0",
    # LLM Clients
    "httpx>=0.26.0",
    "openai>=1.10.0",
    "anthropic>=0.18.0",
    # Agent Backends
    "open-interpreter>=0.2.0",
    # Config
    "pydantic>=2.5.0",
    "pydantic-settings>=2.1.0",
    "python-multipart>=0.0.22",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "ruff>=0.4.0",
    "mypy>=1.8.0",
]

[project.scripts]
pocketclaw = "pocketclaw.__main__:main"

[project.urls]
Homepage = "https://github.com/pocketclaw/pocketclaw"
Repository = "https://github.com/pocketclaw/pocketclaw"
Issues = "https://github.com/pocketclaw/pocketclaw/issues"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src/pocketclaw"]

[tool.ruff]
line-length = 100
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "UP"]

[tool.uv]
dev-dependencies = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "ruff>=0.4.0",
    "mypy>=1.8.0",
]

[tool.pytest.ini_options]
asyncio_mode = "auto"

3
src/pocketclaw/__init__.py Normal file
View File

@@ -0,0 +1,3 @@
"""PocketClaw - The AI agent that runs on your laptop, not a datacenter."""
__version__ = "0.1.0"

104
src/pocketclaw/__main__.py Normal file
View File

@@ -0,0 +1,104 @@
"""PocketClaw entry point."""
import argparse
import asyncio
import logging
import webbrowser
from pathlib import Path

from pocketclaw.config import get_settings, Settings

logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    level=logging.INFO
)
logger = logging.getLogger(__name__)


async def run_telegram_mode(settings: Settings) -> None:
    """Run in Telegram bot mode."""
    from pocketclaw.web_server import run_pairing_server
    from pocketclaw.bot_gateway import run_bot

    # Check if we need to run pairing flow
    if not settings.telegram_bot_token or not settings.allowed_user_id:
        logger.info("🔧 First-time setup: Starting pairing server...")
        print("\n" + "="*50)
        print("🦀 POCKETCLAW SETUP")
        print("="*50)
        print("\n1. Create a Telegram bot via @BotFather")
        print("2. Copy the bot token")
        print("3. Open http://localhost:8888 in your browser")
        print("4. Paste the token and scan the QR code\n")
        # Open browser automatically
        webbrowser.open("http://localhost:8888")
        # Run pairing server (blocks until pairing complete)
        await run_pairing_server(settings)
        # Reload settings after pairing
        settings = get_settings(force_reload=True)

    # Start the bot
    logger.info("🚀 Starting PocketClaw bot...")
    await run_bot(settings)


def run_dashboard_mode(settings: Settings, port: int) -> None:
    """Run in web dashboard mode."""
    from pocketclaw.dashboard import run_dashboard

    print("\n" + "="*50)
    print("🦀 POCKETCLAW WEB DASHBOARD")
    print("="*50)
    print(f"\n🌐 Open http://localhost:{port} in your browser\n")
    webbrowser.open(f"http://localhost:{port}")
    run_dashboard(host="127.0.0.1", port=port)


def main() -> None:
    """Main entry point."""
    parser = argparse.ArgumentParser(
        description="🦀 PocketClaw - The AI agent that runs on your laptop",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  pocketclaw                     Start in Telegram mode (default)
  pocketclaw --web               Start web dashboard for testing
  pocketclaw --web --port 9000   Web dashboard on custom port
"""
    )
    parser.add_argument(
        "--web", "-w",
        action="store_true",
        help="Run web dashboard instead of Telegram bot"
    )
    parser.add_argument(
        "--port", "-p",
        type=int,
        default=8888,
        help="Port for web server (default: 8888)"
    )
    parser.add_argument(
        "--version", "-v",
        action="version",
        version="%(prog)s 0.1.0"
    )
    args = parser.parse_args()
    settings = get_settings()
    try:
        if args.web:
            run_dashboard_mode(settings, args.port)
        else:
            asyncio.run(run_telegram_mode(settings))
    except KeyboardInterrupt:
        logger.info("👋 PocketClaw stopped.")


if __name__ == "__main__":
    main()

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

5
src/pocketclaw/agents/__init__.py Normal file
View File

@@ -0,0 +1,5 @@
"""Agents package for PocketClaw."""
from pocketclaw.agents.router import AgentRouter
__all__ = ["AgentRouter"]

203
src/pocketclaw/agents/claude_code.py Normal file
View File

@@ -0,0 +1,203 @@
"""Claude Code agent wrapper using Anthropic's computer use."""
import asyncio
import logging
from typing import AsyncIterator, Optional

from pocketclaw.config import Settings
from pocketclaw.tools.screenshot import take_screenshot

logger = logging.getLogger(__name__)


class ClaudeCodeAgent:
    """Wraps Claude's computer use capability for autonomous task execution."""

    def __init__(self, settings: Settings):
        self.settings = settings
        self._client = None
        self._stop_flag = False
        self._initialize()

    def _initialize(self) -> None:
        """Initialize the Anthropic client."""
        if not self.settings.anthropic_api_key:
            logger.warning("⚠️ Claude Code requires Anthropic API key")
            return
        try:
            from anthropic import AsyncAnthropic
            self._client = AsyncAnthropic(api_key=self.settings.anthropic_api_key)
            logger.info("✅ Claude Code agent initialized")
        except ImportError:
            logger.error("❌ Anthropic not installed. Run: pip install anthropic")
        except Exception as e:
            logger.error(f"❌ Failed to initialize Claude Code: {e}")

    async def run(self, message: str) -> AsyncIterator[dict]:
        """Run a message through Claude Code."""
        if not self._client:
            yield {
                "type": "message",
                "content": "❌ Claude Code requires Anthropic API key. Add it in ⚙️ Settings."
            }
            return
        self._stop_flag = False
        try:
            # Get current screenshot for context
            screenshot_bytes = take_screenshot()
            # Build messages with computer use tools
            messages = [{"role": "user", "content": message}]
            # Add screenshot if available
            if screenshot_bytes:
                import base64
                screenshot_b64 = base64.b64encode(screenshot_bytes).decode()
                messages = [{
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": f"Current screen state is attached. User request: {message}"
                        },
                        {
                            "type": "image",
                            "source": {
                                "type": "base64",
                                "media_type": "image/png",
                                "data": screenshot_b64
                            }
                        }
                    ]
                }]
            # Call Claude with computer use tools
            response = await self._client.messages.create(
                model=self.settings.anthropic_model,
                max_tokens=4096,
                system="""You are PocketClaw, an AI agent running on the user's local machine.
You can help the user with tasks by analyzing their screen and providing guidance.
When you need to execute commands, provide them as bash code blocks.
Be concise and helpful.""",
                messages=messages,
                tools=[
                    {
                        "name": "bash",
                        "description": "Run a bash command on the user's machine",
                        "input_schema": {
                            "type": "object",
                            "properties": {
                                "command": {
                                    "type": "string",
                                    "description": "The bash command to run"
                                }
                            },
                            "required": ["command"]
                        }
                    },
                    {
                        "name": "computer",
                        "description": "Control the computer (take screenshot, click, type)",
                        "input_schema": {
                            "type": "object",
                            "properties": {
                                "action": {
                                    "type": "string",
                                    "enum": ["screenshot", "click", "type", "key"],
                                    "description": "The action to perform"
                                },
                                "coordinate": {
                                    "type": "array",
                                    "items": {"type": "integer"},
                                    "description": "X, Y coordinates for click"
                                },
                                "text": {
                                    "type": "string",
                                    "description": "Text to type"
                                }
                            },
                            "required": ["action"]
                        }
                    }
                ]
            )
            # Process response
            for block in response.content:
                if self._stop_flag:
                    break
                if block.type == "text":
                    yield {"type": "message", "content": block.text}
                elif block.type == "tool_use":
                    tool_name = block.name
                    tool_input = block.input
                    if tool_name == "bash":
                        command = tool_input.get("command", "")
                        yield {"type": "code", "content": f"$ {command}"}
                        # Execute command
                        result = await self._execute_bash(command)
                        yield {"type": "message", "content": f"```\n{result}\n```"}
                    elif tool_name == "computer":
                        action = tool_input.get("action")
                        if action == "screenshot":
                            yield {"type": "message", "content": "📸 Taking screenshot..."}
                        elif action == "click":
                            coord = tool_input.get("coordinate", [0, 0])
                            yield {"type": "message", "content": f"🖱️ Clicking at ({coord[0]}, {coord[1]})"}
                            await self._click(coord[0], coord[1])
                        elif action == "type":
                            text = tool_input.get("text", "")
yield {"type": "message", "content": f"⌨️ Typing: {text[:50]}..."}
await self._type_text(text)
except Exception as e:
logger.error(f"Claude Code error: {e}")
yield {"type": "message", "content": f"❌ Agent error: {str(e)}"}
async def _execute_bash(self, command: str) -> str:
"""Execute a bash command and return output."""
try:
proc = await asyncio.create_subprocess_shell(
command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE
)
stdout, stderr = await proc.communicate()
output = stdout.decode() if stdout else ""
if stderr:
output += f"\nSTDERR: {stderr.decode()}"
return output[:2000] # Limit output size
except Exception as e:
return f"Error: {str(e)}"
async def _click(self, x: int, y: int) -> None:
"""Click at coordinates."""
try:
import pyautogui
pyautogui.click(x, y)
except Exception as e:
logger.error(f"Click failed: {e}")
async def _type_text(self, text: str) -> None:
"""Type text."""
try:
import pyautogui
pyautogui.typewrite(text, interval=0.02)
except Exception as e:
logger.error(f"Type failed: {e}")
async def stop(self) -> None:
"""Stop the agent execution."""
self._stop_flag = True
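A minimal sketch of how a caller consumes the agent's `run()` chunk stream. The agent here is a hypothetical stand-in (the real `ClaudeCodeAgent` needs an Anthropic API key); the chunk shapes match the `{"type": ..., "content": ...}` dicts yielded above.

```python
import asyncio
from typing import AsyncIterator

async def fake_agent_run(message: str) -> AsyncIterator[dict]:
    # Hypothetical stand-in for ClaudeCodeAgent.run(): yields typed chunks.
    yield {"type": "message", "content": f"Working on: {message}"}
    yield {"type": "code", "content": "$ echo done"}

async def collect(message: str) -> list[dict]:
    # Callers iterate and dispatch on the "type" field, as the Telegram
    # gateway does when relaying agent output to chat.
    return [chunk async for chunk in fake_agent_run(message)]

chunks = asyncio.run(collect("list files"))
```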


@@ -0,0 +1,172 @@
"""Open Interpreter agent wrapper."""
import asyncio
import logging
from typing import AsyncIterator, Optional
from pocketclaw.config import Settings
logger = logging.getLogger(__name__)
class OpenInterpreterAgent:
"""Wraps Open Interpreter for autonomous task execution."""
def __init__(self, settings: Settings):
self.settings = settings
self._interpreter = None
self._stop_flag = False
self._initialize()
def _initialize(self) -> None:
"""Initialize the Open Interpreter instance."""
try:
from interpreter import interpreter
# Configure interpreter
interpreter.auto_run = True # Don't ask for confirmation
interpreter.loop = True # Allow multi-step execution
# Set LLM based on settings
provider = self.settings.llm_provider
# Explicit provider selection
if provider == "anthropic" and self.settings.anthropic_api_key:
interpreter.llm.model = self.settings.anthropic_model
interpreter.llm.api_key = self.settings.anthropic_api_key
logger.info(f"🤖 Using Anthropic: {self.settings.anthropic_model}")
elif provider == "openai" and self.settings.openai_api_key:
interpreter.llm.model = self.settings.openai_model
interpreter.llm.api_key = self.settings.openai_api_key
logger.info(f"🤖 Using OpenAI: {self.settings.openai_model}")
elif provider == "ollama":
interpreter.llm.model = f"ollama/{self.settings.ollama_model}"
interpreter.llm.api_base = self.settings.ollama_host
logger.info(f"🤖 Using Ollama: {self.settings.ollama_model}")
# Auto mode: prioritize cloud APIs, fallback to Ollama
elif provider == "auto":
if self.settings.anthropic_api_key:
interpreter.llm.model = self.settings.anthropic_model
interpreter.llm.api_key = self.settings.anthropic_api_key
logger.info(f"🤖 Auto-selected Anthropic: {self.settings.anthropic_model}")
elif self.settings.openai_api_key:
interpreter.llm.model = self.settings.openai_model
interpreter.llm.api_key = self.settings.openai_api_key
logger.info(f"🤖 Auto-selected OpenAI: {self.settings.openai_model}")
else:
interpreter.llm.model = f"ollama/{self.settings.ollama_model}"
interpreter.llm.api_base = self.settings.ollama_host
logger.info(f"🤖 Auto-selected Ollama: {self.settings.ollama_model}")
# Safety settings
interpreter.safe_mode = "ask" # Will still ask before dangerous ops
self._interpreter = interpreter
logger.info("✅ Open Interpreter initialized")
except ImportError:
logger.error("❌ Open Interpreter not installed. Run: pip install open-interpreter")
self._interpreter = None
except Exception as e:
logger.error(f"❌ Failed to initialize Open Interpreter: {e}")
self._interpreter = None
async def run(self, message: str) -> AsyncIterator[dict]:
"""Run a message through Open Interpreter with real-time streaming."""
if not self._interpreter:
yield {
"type": "message",
"content": "❌ Open Interpreter not available. Install with: `pip install open-interpreter`"
}
return
self._stop_flag = False
# Use a queue to stream chunks from the sync thread to the async generator
chunk_queue: asyncio.Queue = asyncio.Queue()
def run_sync():
"""Run interpreter in a thread, push chunks to queue."""
current_message = []
try:
for chunk in self._interpreter.chat(message, stream=True):
if self._stop_flag:
break
if isinstance(chunk, dict):
chunk_type = chunk.get("type", "")
content = chunk.get("content", "")
if chunk_type == "code":
# Flush any pending message first
if current_message:
asyncio.run_coroutine_threadsafe(
chunk_queue.put({"type": "message", "content": "".join(current_message)}),
loop
)
current_message = []
# Send code block
asyncio.run_coroutine_threadsafe(
chunk_queue.put({"type": "code", "content": content}),
loop
)
elif chunk_type == "message" and content:
current_message.append(content)
# Stream partial messages every ~100 chars
if len("".join(current_message)) > 100:
asyncio.run_coroutine_threadsafe(
chunk_queue.put({"type": "message", "content": "".join(current_message)}),
loop
)
current_message = []
elif isinstance(chunk, str) and chunk:
current_message.append(chunk)
# Flush remaining message
if current_message:
asyncio.run_coroutine_threadsafe(
chunk_queue.put({"type": "message", "content": "".join(current_message)}),
loop
)
except Exception as e:
asyncio.run_coroutine_threadsafe(
chunk_queue.put({"type": "error", "content": f"Agent error: {str(e)}"}),
loop
)
finally:
# Signal completion
asyncio.run_coroutine_threadsafe(chunk_queue.put(None), loop)
try:
loop = asyncio.get_running_loop()  # get_event_loop() is deprecated inside coroutines
# Start the sync function in a thread
executor_future = loop.run_in_executor(None, run_sync)
# Yield chunks as they arrive
while True:
try:
chunk = await asyncio.wait_for(chunk_queue.get(), timeout=60.0)
if chunk is None: # End signal
break
yield chunk
except asyncio.TimeoutError:
yield {"type": "message", "content": "⏳ Still processing..."}
# Wait for executor to finish
await executor_future
except Exception as e:
logger.error(f"Open Interpreter error: {e}")
yield {"type": "error", "content": f"❌ Agent error: {str(e)}"}
async def stop(self) -> None:
"""Stop the agent execution."""
self._stop_flag = True
if self._interpreter:
try:
self._interpreter.reset()
except Exception:
pass
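The thread-to-async bridge used by `run()` above can be reduced to a self-contained sketch: a blocking iterator (standing in for `interpreter.chat(..., stream=True)`) runs in a worker thread and feeds an `asyncio.Queue` via `run_coroutine_threadsafe`, with `None` as the end-of-stream sentinel.

```python
import asyncio

async def bridge_sync_stream(sync_iterable):
    """Relay items from a blocking iterator to async code via a queue.

    Sketch of the pattern OpenInterpreterAgent.run uses: the blocking
    loop runs in a worker thread, pushes items onto an asyncio.Queue
    with run_coroutine_threadsafe, and None signals completion.
    """
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def producer():
        try:
            for item in sync_iterable:
                asyncio.run_coroutine_threadsafe(queue.put(item), loop).result()
        finally:
            # Always send the end-of-stream sentinel, even on error.
            asyncio.run_coroutine_threadsafe(queue.put(None), loop).result()

    future = loop.run_in_executor(None, producer)
    results = []
    while True:
        item = await queue.get()
        if item is None:
            break
        results.append(item)
    await future  # surface any exception from the worker thread
    return results

out = asyncio.run(bridge_sync_stream(["a", "b", "c"]))
```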


@@ -0,0 +1,47 @@
"""Agent Router - routes to Open Interpreter or Claude Code."""
import logging
from typing import AsyncIterator, Optional
from pocketclaw.config import Settings
from pocketclaw.agents.open_interpreter import OpenInterpreterAgent
from pocketclaw.agents.claude_code import ClaudeCodeAgent
logger = logging.getLogger(__name__)
class AgentRouter:
"""Routes agent requests to the selected backend."""
def __init__(self, settings: Settings):
self.settings = settings
self._agent: Optional[OpenInterpreterAgent | ClaudeCodeAgent] = None
self._initialize_agent()
def _initialize_agent(self) -> None:
"""Initialize the selected agent backend."""
backend = self.settings.agent_backend
if backend == "open_interpreter":
self._agent = OpenInterpreterAgent(self.settings)
logger.info("🧠 Initialized Open Interpreter agent")
elif backend == "claude_code":
self._agent = ClaudeCodeAgent(self.settings)
logger.info("🧠 Initialized Claude Code agent")
else:
logger.warning(f"Unknown agent backend: {backend}, defaulting to Open Interpreter")
self._agent = OpenInterpreterAgent(self.settings)
async def run(self, message: str) -> AsyncIterator[dict]:
"""Run the agent with the given message."""
if not self._agent:
yield {"type": "message", "content": "❌ No agent initialized"}
return
async for chunk in self._agent.run(message):
yield chunk
async def stop(self) -> None:
"""Stop the agent."""
if self._agent:
await self._agent.stop()
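The router's fallback behavior can be sketched as a plain dictionary dispatch. String stand-ins replace the real agent classes here (illustrative only); unknown backend names fall back to Open Interpreter, mirroring `_initialize_agent`.

```python
def pick_backend(name: str) -> str:
    # Map backend identifiers to (stand-in) agent class names.
    backends = {
        "open_interpreter": "OpenInterpreterAgent",
        "claude_code": "ClaudeCodeAgent",
    }
    # Unknown names default to Open Interpreter, as the router does.
    return backends.get(name, "OpenInterpreterAgent")

selected = pick_backend("claude_code")
```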


@@ -0,0 +1,333 @@
"""Telegram bot gateway - the main interface."""
import logging
from typing import Optional
from telegram import Update, ReplyKeyboardMarkup, InlineKeyboardMarkup, InlineKeyboardButton
from telegram.ext import (
Application,
CommandHandler,
MessageHandler,
CallbackQueryHandler,
ContextTypes,
filters,
)
import httpx
from pocketclaw.config import Settings
from pocketclaw.tools import status, fetch, screenshot
from pocketclaw.llm.router import LLMRouter
from pocketclaw.agents.router import AgentRouter
logger = logging.getLogger(__name__)
# Persistent keyboard
MAIN_KEYBOARD = ReplyKeyboardMarkup(
[
["🟢 Status", "📁 Fetch"],
["📸 Screenshot", "🛑 Panic"],
["🧠 Agent Mode", "⚙️ Settings"]
],
resize_keyboard=True,
is_persistent=True
)
class BotGateway:
"""Main Telegram bot gateway."""
def __init__(self, settings: Settings):
self.settings = settings
self.agent_active = False
self.agent_router: Optional[AgentRouter] = None
self.llm_router: Optional[LLMRouter] = None
def is_authorized(self, user_id: int) -> bool:
"""Check if user is authorized."""
return user_id == self.settings.allowed_user_id
async def start(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle /start command."""
user_id = update.effective_user.id
# If this is first connection, save the user ID
if not self.settings.allowed_user_id:
self.settings.allowed_user_id = user_id
self.settings.save()
# Notify web server that pairing is complete
try:
async with httpx.AsyncClient() as client:
await client.post(
f"http://{self.settings.web_host}:{self.settings.web_port}/complete",
params={"user_id": user_id}
)
except Exception:
pass # Web server might already be shut down
await update.message.reply_text(
"🦀 **PocketClaw Connected!**\n\n"
"Your AI agent is now running on your machine.\n\n"
"Use the buttons below to control it, or just type a message to chat.",
parse_mode="Markdown",
reply_markup=MAIN_KEYBOARD
)
elif self.is_authorized(user_id):
await update.message.reply_text(
"🦀 **Welcome back!**\n\nPocketClaw is ready.",
parse_mode="Markdown",
reply_markup=MAIN_KEYBOARD
)
else:
await update.message.reply_text("⛔ Unauthorized. This bot is locked to another user.")
async def handle_status(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle status request."""
if not self.is_authorized(update.effective_user.id):
return
stats = status.get_system_status()
await update.message.reply_text(stats, parse_mode="Markdown")
async def handle_fetch(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle fetch request - show file browser."""
if not self.is_authorized(update.effective_user.id):
return
keyboard = fetch.get_directory_keyboard(self.settings.file_jail_path)
await update.message.reply_text(
f"📁 **File Browser**\n`{self.settings.file_jail_path}`",
parse_mode="Markdown",
reply_markup=keyboard
)
async def handle_fetch_callback(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle file browser navigation."""
query = update.callback_query
await query.answer()
if not self.is_authorized(query.from_user.id):
return
data = query.data
if data.startswith("fetch:"):
path = data[6:]
result = await fetch.handle_path(path, self.settings.file_jail_path)
if result["type"] == "directory":
await query.edit_message_text(
f"📁 **File Browser**\n`{path}`",
parse_mode="Markdown",
reply_markup=result["keyboard"]
)
elif result["type"] == "file":
await query.message.reply_document(
document=open(path, "rb"),
filename=result["filename"]
)
async def handle_screenshot(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle screenshot request."""
if not self.is_authorized(update.effective_user.id):
return
await update.message.reply_text("📸 Taking screenshot...")
img_bytes = screenshot.take_screenshot()
if img_bytes:
await update.message.reply_photo(photo=img_bytes, caption="📸 Current screen")
else:
await update.message.reply_text("❌ Screenshot failed. Display might not be available.")
async def handle_panic(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle panic button - kill all agent processes."""
if not self.is_authorized(update.effective_user.id):
return
self.agent_active = False
if self.agent_router:
await self.agent_router.stop()
await update.message.reply_text(
"🛑 **PANIC ACTIVATED**\n\nAll agent processes stopped.",
parse_mode="Markdown",
reply_markup=MAIN_KEYBOARD
)
async def handle_agent_mode(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Toggle agent mode."""
if not self.is_authorized(update.effective_user.id):
return
self.agent_active = not self.agent_active
if self.agent_active:
if not self.agent_router:
self.agent_router = AgentRouter(self.settings)
backend = self.settings.agent_backend
await update.message.reply_text(
f"🧠 **Agent Mode: ON**\n\n"
f"Backend: `{backend}`\n\n"
f"Type your requests naturally. The agent has access to your shell and files.\n\n"
f"Tap 🛑 Panic to stop at any time.",
parse_mode="Markdown"
)
else:
await update.message.reply_text(
"🧠 **Agent Mode: OFF**\n\nBack to tool-only mode.",
parse_mode="Markdown",
reply_markup=MAIN_KEYBOARD
)
async def handle_settings(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Show settings menu."""
if not self.is_authorized(update.effective_user.id):
return
current_backend = self.settings.agent_backend
current_llm = self.settings.llm_provider
keyboard = InlineKeyboardMarkup([
    [InlineKeyboardButton(
        f"{'✅' if current_backend == 'open_interpreter' else '▫️'} Open Interpreter",
        callback_data="settings:backend:open_interpreter"
    )],
    [InlineKeyboardButton(
        f"{'✅' if current_backend == 'claude_code' else '▫️'} Claude Code",
        callback_data="settings:backend:claude_code"
    )],
    [InlineKeyboardButton("──── LLM Provider ────", callback_data="noop")],
    [InlineKeyboardButton(
        f"{'✅' if current_llm == 'auto' else '▫️'} Auto (Claude → OpenAI → Ollama)",
        callback_data="settings:llm:auto"
    )],
    [InlineKeyboardButton(
        f"{'✅' if current_llm == 'ollama' else '▫️'} Ollama (Local)",
        callback_data="settings:llm:ollama"
    )],
    [InlineKeyboardButton(
        f"{'✅' if current_llm == 'openai' else '▫️'} OpenAI",
        callback_data="settings:llm:openai"
    )],
    [InlineKeyboardButton(
        f"{'✅' if current_llm == 'anthropic' else '▫️'} Anthropic (Claude)",
        callback_data="settings:llm:anthropic"
    )]
])
await update.message.reply_text(
"⚙️ **Settings**\n\nChoose your agent backend and LLM provider:",
parse_mode="Markdown",
reply_markup=keyboard
)
async def handle_settings_callback(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle settings selection."""
query = update.callback_query
await query.answer()
if not self.is_authorized(query.from_user.id):
return
data = query.data
if data == "noop":
return
if data.startswith("settings:backend:"):
backend = data.split(":")[-1]
self.settings.agent_backend = backend
self.settings.save()
await query.edit_message_text(
f"✅ Agent backend set to: **{backend}**",
parse_mode="Markdown"
)
elif data.startswith("settings:llm:"):
provider = data.split(":")[-1]
self.settings.llm_provider = provider
self.settings.save()
await query.edit_message_text(
f"✅ LLM provider set to: **{provider}**",
parse_mode="Markdown"
)
async def handle_message(self, update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
"""Handle text messages."""
if not self.is_authorized(update.effective_user.id):
return
text = update.message.text
# Handle keyboard buttons
if text == "🟢 Status":
await self.handle_status(update, context)
elif text == "📁 Fetch":
await self.handle_fetch(update, context)
elif text == "📸 Screenshot":
await self.handle_screenshot(update, context)
elif text == "🛑 Panic":
await self.handle_panic(update, context)
elif text == "🧠 Agent Mode":
await self.handle_agent_mode(update, context)
elif text == "⚙️ Settings":
await self.handle_settings(update, context)
elif self.agent_active and self.agent_router:
# Send to agent
await update.message.reply_text("🧠 Thinking...")
async for chunk in self.agent_router.run(text):
if chunk.get("type") == "message":
await update.message.reply_text(chunk["content"])
elif chunk.get("type") == "code":
await update.message.reply_text(
f"```\n{chunk['content']}\n```",
parse_mode="Markdown"
)
else:
# Simple LLM chat when agent mode is off
if not self.llm_router:
self.llm_router = LLMRouter(self.settings)
response = await self.llm_router.chat(text)
await update.message.reply_text(response)
async def run_bot(settings: Settings) -> None:
"""Run the Telegram bot."""
gateway = BotGateway(settings)
app = Application.builder().token(settings.telegram_bot_token).build()
# Command handlers
app.add_handler(CommandHandler("start", gateway.start))
# Callback handlers
app.add_handler(CallbackQueryHandler(gateway.handle_fetch_callback, pattern="^fetch:"))
app.add_handler(CallbackQueryHandler(gateway.handle_settings_callback, pattern="^settings:"))
# Message handler
app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, gateway.handle_message))
# Start polling
await app.initialize()
await app.start()
await app.updater.start_polling(drop_pending_updates=True)
logger.info("🦀 PocketClaw is running! Send /start to your bot.")
# Keep running until stopped
try:
while True:
await asyncio.sleep(1)
except asyncio.CancelledError:
pass
finally:
await app.updater.stop()
await app.stop()
await app.shutdown()

src/pocketclaw/config.py

@@ -0,0 +1,91 @@
"""Configuration management for PocketClaw."""
import json
from pathlib import Path
from typing import Optional
from functools import lru_cache
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict
def get_config_dir() -> Path:
"""Get the config directory, creating if needed."""
config_dir = Path.home() / ".pocketclaw"
config_dir.mkdir(exist_ok=True)
return config_dir
def get_config_path() -> Path:
"""Get the config file path."""
return get_config_dir() / "config.json"
class Settings(BaseSettings):
"""PocketClaw settings with env and file support."""
model_config = SettingsConfigDict(
env_prefix="POCKETCLAW_",
env_file=".env",
extra="ignore"
)
# Telegram
telegram_bot_token: Optional[str] = Field(default=None, description="Telegram Bot Token from @BotFather")
allowed_user_id: Optional[int] = Field(default=None, description="Telegram User ID allowed to control the bot")
# Agent Backend
agent_backend: str = Field(default="open_interpreter", description="Agent backend: 'open_interpreter' or 'claude_code'")
# LLM Configuration
llm_provider: str = Field(default="auto", description="LLM provider: 'auto', 'ollama', 'openai', 'anthropic'")
ollama_host: str = Field(default="http://localhost:11434", description="Ollama API host")
ollama_model: str = Field(default="llama3.2", description="Ollama model to use")
openai_api_key: Optional[str] = Field(default=None, description="OpenAI API key")
openai_model: str = Field(default="gpt-4o", description="OpenAI model to use")
anthropic_api_key: Optional[str] = Field(default=None, description="Anthropic API key")
anthropic_model: str = Field(default="claude-sonnet-4-20250514", description="Anthropic model to use")
# Security
file_jail_path: Path = Field(default_factory=Path.home, description="Root path for file operations")
# Web Server
web_host: str = Field(default="127.0.0.1", description="Web server host")
web_port: int = Field(default=8888, description="Web server port")
def save(self) -> None:
"""Save settings to config file."""
config_path = get_config_path()
data = {
"telegram_bot_token": self.telegram_bot_token,
"allowed_user_id": self.allowed_user_id,
"agent_backend": self.agent_backend,
"llm_provider": self.llm_provider,
"ollama_host": self.ollama_host,
"ollama_model": self.ollama_model,
"openai_api_key": self.openai_api_key,
"openai_model": self.openai_model,
"anthropic_api_key": self.anthropic_api_key,
"anthropic_model": self.anthropic_model,
}
config_path.write_text(json.dumps(data, indent=2))
@classmethod
def load(cls) -> "Settings":
"""Load settings from config file, falling back to env/defaults."""
config_path = get_config_path()
if config_path.exists():
try:
data = json.loads(config_path.read_text())
return cls(**data)
except Exception:  # a corrupt config file falls back to env/defaults
pass
return cls()
_settings: Optional[Settings] = None
def get_settings(force_reload: bool = False) -> Settings:
    """Get a cached settings instance; pass force_reload=True to re-read from disk.

    A plain module-level cache is used instead of lru_cache: lru_cache keys on
    the argument, so get_settings(force_reload=True) would itself be memoized
    and every reload after the first would silently return stale settings.
    """
    global _settings
    if _settings is None or force_reload:
        _settings = Settings.load()
    return _settings
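The save/load round trip above reduces to a small JSON persistence pattern. This is a minimal stand-in for `Settings.save()`/`Settings.load()` (using a temp directory rather than `~/.pocketclaw/config.json`), including the tolerate-corrupt-file fallback:

```python
import json
import tempfile
from pathlib import Path

def save_config(path: Path, data: dict) -> None:
    # Persist selected fields as pretty-printed JSON.
    path.write_text(json.dumps(data, indent=2))

def load_config(path: Path) -> dict:
    # Read the config back; a missing or corrupt file yields defaults.
    if path.exists():
        try:
            return json.loads(path.read_text())
        except Exception:
            pass
    return {}

cfg_path = Path(tempfile.mkdtemp()) / "config.json"
save_config(cfg_path, {"llm_provider": "ollama", "ollama_model": "llama3.2"})
cfg = load_config(cfg_path)
```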

src/pocketclaw/dashboard.py

@@ -0,0 +1,246 @@
"""PocketClaw Web Dashboard - API Server
Lightweight FastAPI server that serves the frontend and handles WebSocket communication.
"""
import asyncio
import base64
import logging
from pathlib import Path
from typing import Optional
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles
import uvicorn
from pocketclaw.config import Settings
from pocketclaw.llm.router import LLMRouter
from pocketclaw.agents.router import AgentRouter
logger = logging.getLogger(__name__)
# Get frontend directory
FRONTEND_DIR = Path(__file__).parent / "frontend"
# Create FastAPI app
app = FastAPI(title="PocketClaw Dashboard")
# Allow CORS for WebSocket
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Mount static files
app.mount("/static", StaticFiles(directory=FRONTEND_DIR), name="static")
@app.get("/")
async def index():
"""Serve the main dashboard page."""
return FileResponse(FRONTEND_DIR / "index.html")
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
"""WebSocket endpoint for real-time communication."""
await websocket.accept()
# Send welcome notification (not chat message)
await websocket.send_json({
"type": "notification",
"content": "👋 Connected to PocketClaw!"
})
# Load settings from the saved config file (falls back to env/defaults)
settings = Settings.load()
# State
agent_active = False
llm_router: Optional[LLMRouter] = None
agent_router: Optional[AgentRouter] = None
try:
while True:
data = await websocket.receive_json()
action = data.get("action")
# Handle tool requests
if action == "tool":
tool = data.get("tool")
await handle_tool(websocket, tool, settings, data)
# Handle agent toggle
elif action == "toggle_agent":
agent_active = data.get("active", False)
if agent_active and not agent_router:
agent_router = AgentRouter(settings)
await websocket.send_json({
"type": "notification",
"content": f"🧠 Agent Mode: {'ON' if agent_active else 'OFF'}"
})
# Handle chat
elif action == "chat":
message = data.get("message", "")
if agent_active and agent_router:
# Stream agent responses
await websocket.send_json({"type": "stream_start"})
try:
async for chunk in agent_router.run(message):
await websocket.send_json(chunk)
finally:
await websocket.send_json({"type": "stream_end"})
else:
# Simple LLM response
if not llm_router:
llm_router = LLMRouter(settings)
await websocket.send_json({"type": "stream_start"})
try:
response = await llm_router.chat(message)
await websocket.send_json({
"type": "message",
"content": response
})
finally:
await websocket.send_json({"type": "stream_end"})
# Handle settings update
elif action == "settings":
settings.agent_backend = data.get("agent_backend", settings.agent_backend)
settings.llm_provider = data.get("llm_provider", settings.llm_provider)
settings.save()
# Reset routers to pick up new settings
llm_router = None
agent_router = None
await websocket.send_json({
"type": "message",
"content": "⚙️ Settings updated"
})
# Handle API key save
elif action == "save_api_key":
provider = data.get("provider")
key = data.get("key", "")
if provider == "anthropic" and key:
settings.anthropic_api_key = key
settings.llm_provider = "anthropic"
settings.save()
llm_router = None
agent_router = None
await websocket.send_json({
"type": "message",
"content": "✅ Anthropic API key saved!"
})
elif provider == "openai" and key:
settings.openai_api_key = key
settings.llm_provider = "openai"
settings.save()
llm_router = None
agent_router = None
await websocket.send_json({
"type": "message",
"content": "✅ OpenAI API key saved!"
})
else:
await websocket.send_json({
"type": "error",
"content": "Invalid API key or provider"
})
# Handle file navigation
elif action == "navigate":
path = data.get("path", "")
await handle_file_navigation(websocket, path, settings)
except WebSocketDisconnect:
logger.info("WebSocket client disconnected")
except Exception as e:
logger.error(f"WebSocket error: {e}")
async def handle_tool(websocket: WebSocket, tool: str, settings: Settings, data: dict):
"""Handle tool execution."""
if tool == "status":
from pocketclaw.tools.status import get_system_status
status = get_system_status() # sync function
await websocket.send_json({
"type": "status",
"content": status
})
elif tool == "screenshot":
from pocketclaw.tools.screenshot import take_screenshot
result = take_screenshot() # sync function
if isinstance(result, bytes):
await websocket.send_json({
"type": "screenshot",
"image": base64.b64encode(result).decode()
})
else:
await websocket.send_json({
"type": "error",
"content": result
})
elif tool == "fetch":
from pocketclaw.tools.fetch import list_directory
path = data.get("path") or str(Path.home())
result = list_directory(path, settings.file_jail_path) # sync function
await websocket.send_json({
"type": "message",
"content": result
})
elif tool == "panic":
await websocket.send_json({
"type": "message",
"content": "🛑 PANIC: All agent processes stopped!"
})
# TODO: Actually stop agent processes
else:
await websocket.send_json({
"type": "error",
"content": f"Unknown tool: {tool}"
})
async def handle_file_navigation(websocket: WebSocket, path: str, settings: Settings):
"""Handle file browser navigation."""
from pocketclaw.tools.fetch import list_directory
result = list_directory(path, settings.file_jail_path) # sync function
await websocket.send_json({
"type": "message",
"content": result
})
def run_dashboard(host: str = "127.0.0.1", port: int = 8888):
"""Run the dashboard server."""
print("\n" + "=" * 50)
print("🦀 POCKETCLAW WEB DASHBOARD")
print("=" * 50)
print(f"\n🌐 Open http://localhost:{port} in your browser\n")
uvicorn.run(app, host=host, port=port)
if __name__ == "__main__":
run_dashboard()
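The WebSocket protocol the dashboard speaks can be summarized as: the browser sends `{"action": ...}` frames, and the server replies with `{"type": ...}` frames. A hedged sketch of the client-side frame builders (helper names here are illustrative, not part of the real codebase):

```python
import json

def make_chat_frame(message: str) -> str:
    # Chat input: routed to the agent or the plain LLM depending on mode.
    return json.dumps({"action": "chat", "message": message})

def make_tool_frame(tool: str, **params) -> str:
    # Tool request: e.g. "status", "screenshot", "fetch" (with a "path").
    return json.dumps({"action": "tool", "tool": tool, **params})

frame = json.loads(make_tool_frame("fetch", path="/tmp"))
```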


@@ -0,0 +1,430 @@
/* =====================================================
PocketClaw Dashboard - Modern Apple Design
Glassmorphism. SF Pro. Refined.
===================================================== */
:root {
/* Apple-inspired Dark Palette */
--bg-color: #000000;
--glass-bg: rgba(28, 28, 30, 0.65);
--glass-border: rgba(255, 255, 255, 0.12);
--card-bg: rgba(44, 44, 46, 0.6);
/* Text */
--text-primary: #FFFFFF;
--text-secondary: rgba(235, 235, 245, 0.6);
--text-tertiary: rgba(235, 235, 245, 0.3);
/* Accents */
--accent-color: #0A84FF; /* iOS Blue */
--accent-hover: #0077ED;
--success-color: #30D158;
--danger-color: #FF453A;
--warning-color: #FF9F0A;
/* Typography */
--font-sans: -apple-system, BlinkMacSystemFont, "SF Pro Text", "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
--font-mono: "SF Mono", "JetBrains Mono", Menlo, monospace;
/* Spacing & Radius */
--radius-lg: 18px;
--radius-md: 12px;
--radius-sm: 8px;
--radius-pill: 999px;
/* Shadows */
--shadow-sm: 0 1px 2px rgba(0, 0, 0, 0.1);
--shadow-lg: 0 20px 40px rgba(0, 0, 0, 0.4);
}
/* Reset & Base */
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: var(--font-sans);
background-color: var(--bg-color);
background-image: radial-gradient(circle at 50% 0%, #1a1a1a 0%, #000000 70%); /* Subtle depth */
color: var(--text-primary);
height: 100vh;
overflow: hidden;
display: flex;
-webkit-font-smoothing: antialiased;
}
/* =====================================================
Sidebar (Glass)
===================================================== */
.sidebar {
width: 280px;
background-color: var(--glass-bg);
backdrop-filter: blur(25px) saturate(180%);
-webkit-backdrop-filter: blur(25px) saturate(180%);
border-right: 1px solid var(--glass-border);
display: flex;
flex-direction: column;
padding: 24px 20px;
gap: 32px;
z-index: 10;
}
.logo {
font-size: 19px;
font-weight: 600;
letter-spacing: -0.01em;
display: flex;
align-items: center;
gap: 12px;
color: var(--text-primary);
padding-left: 8px;
}
.logo span { font-size: 24px; filter: drop-shadow(0 0 10px rgba(10, 132, 255, 0.3)); }
/* Navigation */
.nav-section { display: flex; flex-direction: column; gap: 6px; }
.nav-title {
font-size: 11px;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--text-secondary);
font-weight: 600;
margin-bottom: 10px;
padding-left: 12px;
}
.nav-item {
display: flex;
align-items: center;
gap: 14px;
padding: 10px 14px;
border-radius: var(--radius-md);
border: none;
background: transparent;
color: var(--text-primary);
font-size: 14px;
font-weight: 500;
cursor: pointer;
transition: all 0.2s cubic-bezier(0.25, 0.1, 0.25, 1);
width: 100%;
text-align: left;
}
.nav-item:hover { background: rgba(255, 255, 255, 0.08); }
.nav-item:active { transform: scale(0.98); background: rgba(255, 255, 255, 0.12); }
.nav-item.danger { color: var(--danger-color); }
.nav-item span { font-size: 18px; width: 20px; text-align: center; }
/* Status Widget */
.status-widget {
margin-top: auto;
background: rgba(0, 0, 0, 0.3);
border-radius: var(--radius-lg);
padding: 18px;
border: 1px solid var(--glass-border);
backdrop-filter: blur(10px);
}
.status-grid {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 16px;
margin-top: 4px;
}
.status-item { display: flex; flex-direction: column; gap: 4px; }
.status-label { font-size: 11px; color: var(--text-secondary); font-weight: 500; }
.status-title { display: none; } /* Cleaner look */
.status-value {
font-size: 15px;
font-weight: 500;
font-family: var(--font-mono);
letter-spacing: -0.02em;
}
/* =====================================================
Main Content
===================================================== */
.main {
flex: 1;
display: flex;
flex-direction: column;
position: relative;
background: radial-gradient(circle at 50% 30%, #1c1c1e 0%, #000000 100%);
}
/* Top Bar (Floating Glass) */
.topbar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 16px 32px;
/* Glass header feels */
position: absolute;
top: 0; left: 0; right: 0;
z-index: 5;
background: linear-gradient(to bottom, rgba(0,0,0,0.4) 0%, transparent 100%);
}
/* View Toggle (Segmented Control) */
.view-toggle {
display: flex;
background: rgba(118, 118, 128, 0.24);
border-radius: 9px;
padding: 2px;
}
.view-btn {
padding: 6px 24px;
border-radius: 7px;
border: none;
background: transparent;
color: var(--text-secondary);
font-size: 13px;
font-weight: 500;
cursor: pointer;
transition: all 0.2s;
}
.view-btn.active {
background: rgba(99, 99, 102, 0.8); /* iOS Segmented Control Active */
color: white;
box-shadow: 0 3px 8px rgba(0,0,0,0.12), 0 3px 1px rgba(0,0,0,0.04);
}
/* Switch (iOS Style) */
.agent-toggle { display: flex; align-items: center; gap: 12px; }
.agent-label { font-size: 13px; font-weight: 500; color: var(--text-secondary); }
.switch { position: relative; width: 51px; height: 31px; }
.switch input { opacity: 0; width: 0; height: 0; }
.slider {
position: absolute;
inset: 0;
background-color: rgba(120, 120, 128, 0.32);
border-radius: 31px;
transition: .3s cubic-bezier(0.25, 0.1, 0.25, 1);
}
.slider:before {
content: "";
position: absolute;
height: 27px;
width: 27px;
left: 2px;
bottom: 2px;
background: white;
border-radius: 50%;
box-shadow: 0 3px 8px rgba(0,0,0,0.15), 0 3px 1px rgba(0,0,0,0.06);
transition: .3s cubic-bezier(0.25, 0.1, 0.25, 1);
}
input:checked + .slider { background-color: var(--success-color); }
input:checked + .slider:before { transform: translateX(20px); }
/* =====================================================
Chat View
===================================================== */
.chat-container {
flex: 1;
display: flex;
flex-direction: column;
padding-top: 80px; /* Offset for absolute header */
}
.messages {
flex: 1;
overflow-y: auto;
padding: 20px 32px;
display: flex;
flex-direction: column;
gap: 24px;
}
.message {
max-width: 75%;
display: flex;
flex-direction: column;
gap: 6px;
animation: messageSlide 0.3s ease-out;
}
@keyframes messageSlide { from { opacity: 0; transform: translateY(10px); } to { opacity: 1; transform: translateY(0); } }
.message.user { align-self: flex-end; align-items: flex-end; }
.message.assistant { align-self: flex-start; }
.message-header { display: flex; gap: 8px; align-items: center; margin-top: 6px; padding: 0 4px; opacity: 0.8; }
.message.user .message-header { justify-content: flex-end; flex-direction: row-reverse; }
.message.user .message-header .message-author { display: none; } /* Hide the 'You' label */
.message-author { font-size: 12px; font-weight: 600; color: var(--text-secondary); }
.message-time { font-size: 11px; color: var(--text-tertiary); }
.message-content {
padding: 14px 18px;
border-radius: var(--radius-lg);
font-size: 15px;
line-height: 1.5;
box-shadow: var(--shadow-sm);
position: relative;
}
.message.user .message-content {
background: var(--accent-color);
color: white;
border-bottom-right-radius: 4px; /* Message tail effect */
}
.message.assistant .message-content {
background: var(--card-bg);
border: 1px solid var(--glass-border);
border-bottom-left-radius: 4px;
}
/* Typing Indicator (iOS iMessage style) */
.typing-indicator { font-weight: bold; animation: pulse 1s infinite; }
@keyframes pulse { 0% { opacity: 0.3; } 50% { opacity: 1; } 100% { opacity: 0.3; } }
/* Code Blocks (Glass Terminal) */
.message-content pre {
background: rgba(0, 0, 0, 0.5);
padding: 16px;
border-radius: var(--radius-md);
margin: 12px 0;
font-family: var(--font-mono);
font-size: 13px;
border: 1px solid var(--glass-border);
white-space: pre-wrap;
}
.message-content code {
font-family: var(--font-mono);
background: rgba(0, 0, 0, 0.3);
padding: 2px 6px;
border-radius: 4px;
font-size: 13px;
}
/* Input Area (Floating Island) */
.input-area {
padding: 24px 32px 32px;
background: transparent;
}
.input-area input {
width: 100%;
padding: 16px 20px;
padding-right: 60px; /* Space for button */
border-radius: 24px;
background: rgba(44, 44, 46, 0.8);
border: 1px solid var(--glass-border);
color: white;
font-size: 15px;
backdrop-filter: blur(20px);
box-shadow: 0 4px 20px rgba(0,0,0,0.2);
transition: all 0.2s;
}
.input-area input:focus {
outline: none;
background: rgba(54, 54, 56, 0.9);
border-color: rgba(255,255,255,0.2);
box-shadow: 0 4px 24px rgba(0,0,0,0.3);
}
.input-area { position: relative; } /* Anchors the send button */
.send-btn {
position: absolute;
right: 24px;
top: 50%;
transform: translateY(-50%) scale(1);
width: 32px;
height: 32px;
border-radius: 50%;
background: var(--accent-color);
border: none;
color: white;
display: flex;
align-items: center;
justify-content: center;
cursor: pointer;
transition: all 0.2s cubic-bezier(0.25, 0.1, 0.25, 1);
box-shadow: 0 2px 8px rgba(10, 132, 255, 0.4);
}
.send-btn:hover { background: var(--accent-hover); transform: translateY(-50%) scale(1.08); }
.send-btn:active { transform: translateY(-50%) scale(0.95); }
.send-btn:disabled { background: #444; cursor: default; box-shadow: none; }
/* =====================================================
Terminal View
===================================================== */
.terminal-container {
flex: 1;
padding: 100px 32px 32px;
background: #000;
font-family: var(--font-mono);
}
.terminal-output { font-size: 13px; color: #BBB; line-height: 1.5; }
.term-line { margin-bottom: 4px; }
.term-time { color: #555; margin-right: 12px; font-size: 11px; }
/* =====================================================
Modals
===================================================== */
.modal-overlay {
background: rgba(0, 0, 0, 0.4);
backdrop-filter: blur(15px);
}
.modal {
background: rgba(30, 30, 32, 0.85);
border: 1px solid var(--glass-border);
border-radius: 20px;
box-shadow: var(--shadow-lg);
backdrop-filter: blur(20px);
}
.modal-header { padding: 20px 24px; border-bottom: 1px solid var(--glass-border); }
.modal-header h2 { font-size: 17px; font-weight: 600; }
.close-btn { font-size: 28px; color: var(--text-secondary); transition: color 0.2s; }
.close-btn:hover { color: white; }
.modal-body { padding: 24px; }
.form-group label { margin-bottom: 8px; font-weight: 500; font-size: 13px; color: var(--text-secondary); }
.form-group select,
.form-group input {
background: rgba(0,0,0,0.2);
border: 1px solid var(--glass-border);
border-radius: 10px;
padding: 12px;
color: white;
font-size: 14px;
}
.btn-sm { border-radius: 8px; font-weight: 600; }
/* =====================================================
Toast Notifications
===================================================== */
.toast-container { top: 32px; right: 32px; gap: 16px; }
.toast {
background: rgba(35, 35, 35, 0.85);
border: 1px solid var(--glass-border);
border-radius: 14px; /* iOS notification style */
padding: 14px 18px;
backdrop-filter: blur(20px);
box-shadow: 0 8px 32px rgba(0,0,0,0.3);
font-size: 14px;
font-weight: 500;
}
.toast.success { border-left: none; color: var(--success-color); } /* Icon handles color */
.toast.error { border-left: none; color: var(--danger-color); }
.toast.info { border-left: none; color: var(--text-primary); }
.toast-icon { margin-right: 12px; font-size: 16px; }
/* Scrollbar */
::-webkit-scrollbar { width: 10px; }
::-webkit-scrollbar-thumb { background: rgba(255,255,255,0.15); border-radius: 99px; border: 3px solid transparent; background-clip: content-box; }
::-webkit-scrollbar-thumb:hover { background-color: rgba(255,255,255,0.25); }
/* Utilities */
[x-cloak] { display: none !important; }

View File

@@ -0,0 +1,259 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>PocketClaw</title>
<link
rel="icon"
href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg'><text y='32' font-size='32'>🦀</text></svg>"
/>
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link
href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600&family=JetBrains+Mono:wght@400;500&display=swap"
rel="stylesheet"
/>
<link rel="stylesheet" href="/static/css/styles.css" />
<script
defer
src="https://cdn.jsdelivr.net/npm/alpinejs@3.14.3/dist/cdn.min.js"
></script>
</head>
<body x-data="app()" x-init="init()">
<!-- Sidebar -->
<aside class="sidebar">
<div class="logo"><span>🦀</span> PocketClaw</div>
<!-- Tools Section -->
<nav class="nav-section">
<h3 class="nav-title">Tools</h3>
<button class="nav-item" @click="runTool('fetch')">
<span>📁</span> Files
</button>
<button class="nav-item" @click="runTool('screenshot')">
<span>📸</span> Screenshot
</button>
<button class="nav-item danger" @click="runTool('panic')">
<span>🛑</span> Panic Button
</button>
</nav>
<!-- Config Section -->
<nav class="nav-section">
<h3 class="nav-title">Config</h3>
<button class="nav-item" @click="showSettings = true">
<span>⚙️</span> Settings
</button>
</nav>
<!-- System Status Widget -->
<div class="status-widget">
<h4 class="status-title">System Status</h4>
<div class="status-grid">
<div class="status-item">
<span class="status-label">CPU</span>
<span class="status-value" x-text="typeof status.cpu === 'number' ? status.cpu + '%' : status.cpu"></span>
</div>
<div class="status-item">
<span class="status-label">RAM</span>
<span class="status-value" x-text="typeof status.ram === 'number' ? status.ram + '%' : status.ram"></span>
</div>
<div class="status-item">
<span class="status-label">Disk</span>
<span class="status-value" x-text="typeof status.disk === 'number' ? status.disk + '%' : status.disk"></span>
</div>
<div class="status-item">
<span class="status-label">Bat</span>
<span class="status-value" x-text="typeof status.battery === 'number' ? status.battery + '%' : status.battery"></span>
</div>
</div>
</div>
</aside>
<!-- Main Content -->
<main class="main">
<!-- Top Bar -->
<header class="topbar">
<div class="view-toggle">
<button
class="view-btn"
:class="{ active: view === 'chat' }"
@click="view = 'chat'"
>
Chat
</button>
<button
class="view-btn"
:class="{ active: view === 'terminal' }"
@click="view = 'terminal'"
>
Terminal
</button>
</div>
<div class="agent-toggle">
<span class="agent-label">Agent Mode</span>
<label class="switch">
<input
type="checkbox"
x-model="agentActive"
@change="toggleAgent()"
/>
<span class="slider"></span>
</label>
</div>
</header>
<!-- Chat View -->
<div class="chat-container" x-show="view === 'chat'">
<div class="messages" x-ref="messages">
<template x-for="(msg, index) in messages" :key="index">
<div class="message" :class="msg.role">
<div
class="message-content"
x-html="formatMessage(msg.content)"
></div>
<div class="message-header">
<span
class="message-author"
x-text="msg.role === 'user' ? 'You' : 'PocketClaw'"
></span>
<span class="message-time" x-text="msg.time"></span>
</div>
</div>
</template>
<!-- Streaming indicator -->
<div class="message assistant" x-show="isStreaming">
<div class="message-content">
<span x-text="streamingContent"></span>
<span class="typing-indicator"></span>
</div>
<div class="message-header">
<span class="message-author">PocketClaw</span>
<span class="message-time" x-text="currentTime()"></span>
</div>
</div>
</div>
<form class="input-area" @submit.prevent="sendMessage()">
<input
type="text"
class="chat-input"
placeholder="Type a message..."
x-model="inputText"
:disabled="isStreaming"
/>
<button
type="submit"
class="send-btn"
:disabled="!inputText.trim() || isStreaming"
>
<span>↑</span>
</button>
</form>
</div>
<!-- Terminal View -->
<div class="terminal-container" x-show="view === 'terminal'">
<div class="terminal-output" x-ref="terminal">
<template x-for="(log, index) in logs" :key="index">
<div class="term-line" :class="'term-' + log.level">
<span class="term-time" x-text="log.time"></span>
<span x-text="log.message"></span>
</div>
</template>
</div>
</div>
</main>
<!-- Settings Modal -->
<div
class="modal-overlay"
x-show="showSettings"
x-transition.opacity
@click.self="showSettings = false"
>
<div class="modal" @click.stop>
<div class="modal-header">
<h2>Settings</h2>
<button class="close-btn" @click="showSettings = false">×</button>
</div>
<div class="modal-body">
<div class="form-group">
<label>Agent Backend</label>
<select x-model="settings.agentBackend" @change="saveSettings()">
<option value="open_interpreter">Open Interpreter</option>
<option value="claude_code">Claude Code</option>
</select>
</div>
<div class="form-group">
<label>LLM Provider</label>
<select x-model="settings.llmProvider" @change="saveSettings()">
<option value="auto">Auto</option>
<option value="anthropic">Anthropic</option>
<option value="openai">OpenAI</option>
<option value="ollama">Ollama</option>
</select>
</div>
<hr />
<h3>API Keys</h3>
<div class="form-group">
<label>Anthropic API Key</label>
<div class="input-group">
<input
type="password"
x-model="apiKeys.anthropic"
placeholder="sk-ant-..."
/>
<button class="btn-sm" @click="saveApiKey('anthropic')">
Save
</button>
</div>
</div>
<div class="form-group">
<label>OpenAI API Key</label>
<div class="input-group">
<input
type="password"
x-model="apiKeys.openai"
placeholder="sk-..."
/>
<button class="btn-sm" @click="saveApiKey('openai')">Save</button>
</div>
</div>
</div>
</div>
</div>
<!-- Screenshot Modal -->
<div
class="modal-overlay"
x-show="showScreenshot"
x-transition.opacity
@click.self="showScreenshot = false"
>
<div class="modal modal-lg" @click.stop>
<div class="modal-header">
<h2>Screenshot</h2>
<button class="close-btn" @click="showScreenshot = false">×</button>
</div>
<div class="modal-body">
<img
:src="screenshotSrc"
alt="Screenshot"
class="screenshot-preview"
/>
</div>
</div>
</div>
<!-- Toast Container -->
<div class="toast-container" x-ref="toasts"></div>
<!-- Scripts -->
<script src="/static/js/websocket.js"></script>
<script src="/static/js/tools.js"></script>
<script src="/static/js/app.js"></script>
</body>
</html>

View File

@@ -0,0 +1,331 @@
/**
* PocketClaw Main Application
* Alpine.js component for the dashboard
*/
function app() {
return {
// View state
view: 'chat',
showSettings: false,
showScreenshot: false,
screenshotSrc: '',
// Agent state
agentActive: false,
isStreaming: false,
streamingContent: '',
streamingMessageId: null,
hasShownWelcome: false,
// Messages
messages: [],
logs: [],
inputText: '',
// System status
status: {
cpu: '—',
ram: '—',
disk: '—',
battery: '—'
},
// Settings
settings: {
agentBackend: 'open_interpreter',
llmProvider: 'auto'
},
// API Keys (not persisted client-side)
apiKeys: {
anthropic: '',
openai: ''
},
/**
* Initialize the app
*/
init() {
this.log('PocketClaw Dashboard initialized', 'info');
// Register event handlers first
this.setupSocketHandlers();
// Connect WebSocket (singleton - will only connect once)
socket.connect();
// Start status polling (low frequency)
this.startStatusPolling();
},
/**
* Set up WebSocket event handlers
*/
setupSocketHandlers() {
// Clear existing handlers to prevent duplicates
socket.clearHandlers();
const onConnected = () => {
this.log('Connected to PocketClaw Engine', 'success');
// Fetch initial status
socket.runTool('status');
};
socket.on('connected', onConnected);
// If already connected, trigger manually
if (socket.isConnected) {
onConnected();
}
socket.on('disconnected', () => {
this.log('Disconnected from server', 'error');
});
socket.on('message', (data) => this.handleMessage(data));
socket.on('notification', (data) => this.handleNotification(data));
socket.on('status', (data) => this.handleStatus(data));
socket.on('screenshot', (data) => this.handleScreenshot(data));
socket.on('code', (data) => this.handleCode(data));
socket.on('error', (data) => this.handleError(data));
socket.on('stream_start', () => this.startStreaming());
socket.on('stream_end', () => this.endStreaming());
},
/**
* Handle notification
*/
handleNotification(data) {
const content = data.content || '';
// Skip duplicate connection messages
if (content.includes('Connected to PocketClaw') && this.hasShownWelcome) {
return;
}
if (content.includes('Connected to PocketClaw')) {
this.hasShownWelcome = true;
}
this.showToast(content, 'info');
this.log(content, 'info');
},
/**
* Handle incoming message
*/
handleMessage(data) {
const content = data.content || '';
// Check if it's a status update (don't show in chat)
if (content.includes('System Status') || content.includes('🧠 CPU:')) {
this.status = Tools.parseStatus(content);
return;
}
// Handle streaming vs complete messages
if (this.isStreaming) {
this.streamingContent += content;
} else {
this.addMessage('assistant', content);
}
this.log(content.substring(0, 80) + (content.length > 80 ? '...' : ''), 'info');
},
/**
* Handle status updates
*/
handleStatus(data) {
if (data.content) {
this.status = Tools.parseStatus(data.content);
}
},
/**
* Handle screenshot
*/
handleScreenshot(data) {
if (data.image) {
this.screenshotSrc = `data:image/png;base64,${data.image}`;
this.showScreenshot = true;
}
},
/**
* Handle code blocks
*/
handleCode(data) {
const content = data.content || '';
if (this.isStreaming) {
this.streamingContent += '\n```\n' + content + '\n```\n';
} else {
this.addMessage('assistant', '```\n' + content + '\n```');
}
},
/**
* Handle errors
*/
handleError(data) {
const content = data.content || 'Unknown error';
this.addMessage('assistant', '❌ ' + content);
this.log(content, 'error');
this.showToast(content, 'error');
this.endStreaming();
},
/**
* Start streaming mode
*/
startStreaming() {
this.isStreaming = true;
this.streamingContent = '';
},
/**
* End streaming mode
*/
endStreaming() {
if (this.isStreaming && this.streamingContent) {
this.addMessage('assistant', this.streamingContent);
}
this.isStreaming = false;
this.streamingContent = '';
},
/**
* Add a message to the chat
*/
addMessage(role, content) {
this.messages.push({
role,
content,
time: Tools.formatTime()
});
// Auto scroll to bottom
this.$nextTick(() => {
if (this.$refs.messages) {
this.$refs.messages.scrollTop = this.$refs.messages.scrollHeight;
}
});
},
/**
* Send a chat message
*/
sendMessage() {
const text = this.inputText.trim();
if (!text) return;
// Add user message
this.addMessage('user', text);
this.inputText = '';
// Start streaming indicator
this.startStreaming();
// Send to server
socket.chat(text);
this.log(`You: ${text}`, 'info');
},
/**
* Run a tool
*/
runTool(tool) {
this.log(`Running tool: ${tool}`, 'info');
socket.runTool(tool);
},
/**
* Toggle agent mode
*/
toggleAgent() {
socket.toggleAgent(this.agentActive);
this.log(`Switched Agent Mode: ${this.agentActive ? 'ON' : 'OFF'}`, 'info');
},
/**
* Save settings
*/
saveSettings() {
socket.saveSettings(this.settings.agentBackend, this.settings.llmProvider);
this.log('Settings updated', 'info');
this.showToast('Settings saved', 'success');
},
/**
* Save API key
*/
saveApiKey(provider) {
const key = this.apiKeys[provider];
if (!key) {
this.showToast('Please enter an API key', 'error');
return;
}
socket.saveApiKey(provider, key);
this.apiKeys[provider] = ''; // Clear input
this.log(`Saved ${provider} API key`, 'success');
this.showToast(`${provider.charAt(0).toUpperCase() + provider.slice(1)} API key saved!`, 'success');
},
/**
* Start polling for system status (every 10 seconds, only when connected)
*/
startStatusPolling() {
setInterval(() => {
if (socket.isConnected) {
socket.runTool('status');
}
}, 10000); // Poll every 10 seconds
},
/**
* Add log entry
*/
log(message, level = 'info') {
this.logs.push({
time: Tools.formatTime(),
message,
level
});
// Keep only last 100 logs
if (this.logs.length > 100) {
this.logs.shift();
}
// Auto scroll terminal
this.$nextTick(() => {
if (this.$refs.terminal) {
this.$refs.terminal.scrollTop = this.$refs.terminal.scrollHeight;
}
});
},
/**
* Format message content
*/
formatMessage(content) {
return Tools.formatMessage(content);
},
/**
* Get current time string
*/
currentTime() {
return Tools.formatTime();
},
/**
* Show toast notification
*/
showToast(message, type = 'info') {
Tools.showToast(message, type, this.$refs.toasts);
}
};
}
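The `stream_start` / `message` / `stream_end` handlers above follow an accumulate-then-commit pattern: chunks build up in `streamingContent`, and only `endStreaming()` pushes the finished text into `messages`. A minimal Python sketch of the same state machine (class and method names are illustrative, not from the codebase):

```python
class StreamBuffer:
    """Accumulate chunks between stream_start and stream_end, then commit once."""

    def __init__(self):
        self.active = False
        self.buf = ""
        self.committed = []

    def start(self):
        # Mirrors startStreaming(): open a fresh buffer
        self.active, self.buf = True, ""

    def chunk(self, text):
        if self.active:
            self.buf += text          # streaming: append to the open buffer
        else:
            self.committed.append(text)  # no stream open: treat as a complete message

    def end(self):
        # Mirrors endStreaming(): commit whatever accumulated, then reset
        if self.active and self.buf:
            self.committed.append(self.buf)
        self.active, self.buf = False, ""
```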

View File

@@ -0,0 +1,131 @@
/**
* PocketClaw Tools Module
* Handles tool-specific UI interactions and formatting
*/
const Tools = {
/**
* Format message content (markdown-like)
*/
formatMessage(content) {
if (!content) return '';
// Escape HTML first
let formatted = content
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
// Code blocks
formatted = formatted.replace(
/```(\w*)\n?([\s\S]*?)```/g,
'<pre><code>$2</code></pre>'
);
// Inline code
formatted = formatted.replace(
/`([^`]+)`/g,
'<code>$1</code>'
);
// Bold
formatted = formatted.replace(
/\*\*(.+?)\*\*/g,
'<strong>$1</strong>'
);
// Line breaks
formatted = formatted.replace(/\n/g, '<br>');
return formatted;
},
/**
* Format current time
*/
formatTime(date = new Date()) {
return date.toLocaleTimeString('en-US', {
hour: '2-digit',
minute: '2-digit',
hour12: true
});
},
/**
* Show toast notification
*/
showToast(message, type = 'info', container) {
const toast = document.createElement('div');
toast.className = `toast ${type}`;
const icons = {
success: '✅',
error: '❌',
info: '',
warning: '⚠️'
};
toast.innerHTML = `
<span class="toast-icon">${icons[type] || icons.info}</span>
<span class="toast-msg">${message}</span>
`;
container.appendChild(toast);
// Auto remove after 3s
setTimeout(() => {
toast.style.opacity = '0';
toast.style.transform = 'translateX(100%)';
setTimeout(() => toast.remove(), 300);
}, 3000);
},
/**
* Parse system status from response
*/
parseStatus(content) {
const status = {
cpu: '—',
ram: '—',
disk: '—',
battery: '—'
};
if (!content) return status;
// Parse CPU: "🧠 CPU: 50.0% (8 cores)"
const cpuMatch = content.match(/CPU:\s*([\d.]+)%/);
if (cpuMatch) status.cpu = Math.round(parseFloat(cpuMatch[1]));
// Parse RAM: "💾 RAM: 10.0 / 16.0 GB (60%)"
const ramMatch = content.match(/RAM:.*?\(([\d.]+)%\)/);
if (ramMatch) status.ram = Math.round(parseFloat(ramMatch[1]));
// Parse Disk: "💿 Disk: 200 / 500 GB (40%)"
const diskMatch = content.match(/Disk:.*?\(([\d.]+)%\)/);
if (diskMatch) status.disk = Math.round(parseFloat(diskMatch[1]));
// Parse Battery: "🔋 Battery: 80%"
const batteryMatch = content.match(/Battery:\s*([\d.]+)%/);
if (batteryMatch) status.battery = Math.round(parseFloat(batteryMatch[1]));
return status;
},
/**
* Check if content is a file browser response
*/
isFileBrowser(content) {
return content && content.includes('📁') && content.includes('📂');
},
/**
* Check if content is a screenshot
*/
isScreenshot(data) {
return data.type === 'screenshot' && data.image;
}
};
// Export
window.Tools = Tools;
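`Tools.parseStatus` pulls the four gauges out of the formatted status text with tolerant regexes. The same extraction in Python, with the patterns copied from the code above (the sample input in the usage below is made up):

```python
import re

def parse_status(text: str) -> dict:
    # Same patterns as Tools.parseStatus: capture the percentage, ignore the rest
    fields = {
        "cpu": r"CPU:\s*([\d.]+)%",
        "ram": r"RAM:.*?\(([\d.]+)%\)",
        "disk": r"Disk:.*?\(([\d.]+)%\)",
        "battery": r"Battery:\s*([\d.]+)%",
    }
    out = {}
    for name, pattern in fields.items():
        m = re.search(pattern, text)
        out[name] = round(float(m.group(1))) if m else "—"  # em dash = unknown
    return out
```

Missing fields (e.g. no battery on a desktop) fall back to the same "—" placeholder the dashboard initializes with.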

View File

@@ -0,0 +1,177 @@
/**
* PocketClaw WebSocket Module
* Singleton WebSocket connection with proper state management
*/
class PocketClawSocket {
constructor() {
this.ws = null;
this.handlers = new Map();
this.reconnectAttempts = 0;
this.maxReconnectAttempts = 5;
this.isConnecting = false;
this.isConnected = false;
}
/**
* Connect to WebSocket server (only if not already connected)
*/
connect() {
// Prevent multiple connections
if (this.isConnected || this.isConnecting) {
console.log('[WS] Already connected or connecting');
return;
}
this.isConnecting = true;
const proto = window.location.protocol === 'https:' ? 'wss:' : 'ws:';
const url = `${proto}//${window.location.host}/ws`;
console.log('[WS] Connecting to', url);
this.ws = new WebSocket(url);
this.ws.onopen = () => {
console.log('[WS] Connected');
this.isConnecting = false;
this.isConnected = true;
this.reconnectAttempts = 0;
this.emit('connected');
};
this.ws.onmessage = (event) => {
try {
const data = JSON.parse(event.data);
this.handleMessage(data);
} catch (e) {
console.error('[WS] Parse error:', e);
}
};
this.ws.onclose = () => {
console.log('[WS] Disconnected');
this.isConnecting = false;
this.isConnected = false;
this.emit('disconnected');
this.attemptReconnect();
};
this.ws.onerror = (error) => {
console.error('[WS] Error:', error);
this.isConnecting = false;
this.emit('error', error);
};
}
/**
* Attempt to reconnect with exponential backoff
*/
attemptReconnect() {
if (this.reconnectAttempts >= this.maxReconnectAttempts) {
console.log('[WS] Max reconnect attempts reached');
this.emit('maxReconnectReached');
return;
}
const delay = Math.min(1000 * Math.pow(2, this.reconnectAttempts), 10000);
this.reconnectAttempts++;
console.log(`[WS] Reconnecting in ${delay}ms (attempt ${this.reconnectAttempts})`);
setTimeout(() => this.connect(), delay);
}
/**
* Handle incoming messages - route to type-specific handlers
*/
handleMessage(data) {
const type = data.type;
// Emit to type-specific handlers first
if (type && this.handlers.has(type)) {
this.handlers.get(type).forEach(handler => handler(data));
}
}
/**
* Register event handler
*/
on(event, handler) {
if (!this.handlers.has(event)) {
this.handlers.set(event, []);
}
this.handlers.get(event).push(handler);
}
/**
* Remove event handler
*/
off(event, handler) {
if (this.handlers.has(event)) {
const handlers = this.handlers.get(event);
const index = handlers.indexOf(handler);
if (index > -1) {
handlers.splice(index, 1);
}
}
}
/**
* Clear all handlers for an event or all events
*/
clearHandlers(event = null) {
if (event) {
this.handlers.delete(event);
} else {
this.handlers.clear();
}
}
/**
* Emit event to handlers
*/
emit(event, data = null) {
if (this.handlers.has(event)) {
this.handlers.get(event).forEach(handler => handler(data));
}
}
/**
* Send message to server
*/
send(action, data = {}) {
if (this.ws && this.ws.readyState === WebSocket.OPEN) {
this.ws.send(JSON.stringify({ action, ...data }));
return true;
} else {
console.warn('[WS] Not connected, cannot send:', action);
return false;
}
}
/**
* Convenience methods for common actions
*/
runTool(tool, options = {}) {
this.send('tool', { tool, ...options });
}
toggleAgent(active) {
this.send('toggle_agent', { active });
}
chat(message) {
this.send('chat', { message });
}
saveSettings(agentBackend, llmProvider) {
this.send('settings', {
agent_backend: agentBackend,
llm_provider: llmProvider
});
}
saveApiKey(provider, key) {
this.send('save_api_key', { provider, key });
}
}
// Export singleton - only one instance ever
window.socket = window.socket || new PocketClawSocket();
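`attemptReconnect` doubles the delay on each attempt and caps it at 10 s, so five attempts never take more than ~25 s total. The resulting schedule, sketched in Python:

```python
def backoff_delays(max_attempts: int = 5, base_ms: int = 1000, cap_ms: int = 10_000) -> list:
    # Mirrors attemptReconnect(): min(1000 * 2**attempt, 10000) for each attempt
    return [min(base_ms * 2 ** n, cap_ms) for n in range(max_attempts)]
```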

View File

@@ -0,0 +1,5 @@
"""LLM package for PocketClaw."""
from pocketclaw.llm.router import LLMRouter
__all__ = ["LLMRouter"]

Binary file not shown.

View File

@@ -0,0 +1,141 @@
"""LLM Router - routes requests to available LLM backends."""
import logging
from typing import Optional, List
import httpx
from pocketclaw.config import Settings
logger = logging.getLogger(__name__)
class LLMRouter:
"""Routes LLM requests to available backends."""
def __init__(self, settings: Settings):
self.settings = settings
self.conversation_history: List[dict] = []
self._available_backend: Optional[str] = None
async def _check_ollama(self) -> bool:
"""Check if Ollama is available."""
try:
async with httpx.AsyncClient(timeout=2.0) as client:
response = await client.get(f"{self.settings.ollama_host}/api/tags")
return response.status_code == 200
except Exception:
return False
async def _detect_backend(self) -> Optional[str]:
"""Detect available LLM backend based on settings."""
provider = self.settings.llm_provider
if provider == "ollama":
if await self._check_ollama():
return "ollama"
return None
if provider == "openai":
if self.settings.openai_api_key:
return "openai"
return None
if provider == "anthropic":
if self.settings.anthropic_api_key:
return "anthropic"
return None
# Auto mode - try in order: Ollama → OpenAI → Anthropic
if provider == "auto":
if await self._check_ollama():
return "ollama"
if self.settings.openai_api_key:
return "openai"
if self.settings.anthropic_api_key:
return "anthropic"
return None
async def chat(self, message: str) -> str:
"""Send a chat message and get a response."""
if not self._available_backend:
self._available_backend = await self._detect_backend()
if not self._available_backend:
return (
"❌ No LLM backend available.\n\n"
"Options:\n"
"• Install [Ollama](https://ollama.ai) and run `ollama run llama3.2`\n"
"• Add OpenAI API key in ⚙️ Settings\n"
"• Add Anthropic API key in ⚙️ Settings"
)
self.conversation_history.append({"role": "user", "content": message})
try:
if self._available_backend == "ollama":
response = await self._chat_ollama(message)
elif self._available_backend == "openai":
response = await self._chat_openai(message)
elif self._available_backend == "anthropic":
response = await self._chat_anthropic(message)
else:
response = "Unknown backend"
self.conversation_history.append({"role": "assistant", "content": response})
return response
except Exception as e:
logger.error(f"LLM error: {e}")
return f"❌ LLM Error: {str(e)}"
async def _chat_ollama(self, message: str) -> str:
"""Chat via Ollama."""
async with httpx.AsyncClient(timeout=120.0) as client:
response = await client.post(
f"{self.settings.ollama_host}/api/chat",
json={
"model": self.settings.ollama_model,
"messages": self.conversation_history,
"stream": False
}
)
response.raise_for_status()
data = response.json()
return data.get("message", {}).get("content", "No response")
async def _chat_openai(self, message: str) -> str:
"""Chat via OpenAI."""
from openai import AsyncOpenAI
client = AsyncOpenAI(api_key=self.settings.openai_api_key)
response = await client.chat.completions.create(
model=self.settings.openai_model,
messages=[
{"role": "system", "content": "You are PocketClaw, a helpful AI assistant running locally on the user's machine."},
*self.conversation_history
]
)
return response.choices[0].message.content
async def _chat_anthropic(self, message: str) -> str:
"""Chat via Anthropic."""
from anthropic import AsyncAnthropic
client = AsyncAnthropic(api_key=self.settings.anthropic_api_key)
response = await client.messages.create(
model=self.settings.anthropic_model,
max_tokens=4096,
system="You are PocketClaw, a helpful AI assistant running locally on the user's machine.",
messages=self.conversation_history
)
return response.content[0].text
def clear_history(self) -> None:
"""Clear conversation history."""
self.conversation_history = []
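In auto mode, `_detect_backend` prefers the free local model and only then falls back to hosted APIs. The resolution order as a pure function (a sketch of the policy, not the actual async code path):

```python
from typing import Optional

def pick_backend(ollama_up: bool,
                 openai_key: Optional[str],
                 anthropic_key: Optional[str]) -> Optional[str]:
    # Auto order: local first (no API cost), then hosted providers
    if ollama_up:
        return "ollama"
    if openai_key:
        return "openai"
    if anthropic_key:
        return "anthropic"
    return None  # caller shows the "no LLM backend available" help text
```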

View File

@@ -0,0 +1,5 @@
"""Tools package for PocketClaw."""
from pocketclaw.tools import status, fetch, screenshot
__all__ = ["status", "fetch", "screenshot"]

Binary file not shown.

View File

@@ -0,0 +1,142 @@
"""File browser tool."""
import os
from pathlib import Path
from typing import Optional
from telegram import InlineKeyboardMarkup, InlineKeyboardButton
def is_safe_path(path: Path, jail: Path) -> bool:
"""Check if path is contained within the jail directory."""
try:
# Compare resolved path components; a plain str.startswith() prefix
# test would wrongly accept e.g. /home/username against a /home/user jail.
return path.resolve().is_relative_to(jail.resolve())
except Exception:
return False
def get_directory_keyboard(path: Path, jail: Optional[Path] = None) -> InlineKeyboardMarkup:
"""Generate inline keyboard for directory contents."""
if jail is None:
jail = Path.home()
path = Path(path).resolve()
if not is_safe_path(path, jail):
path = jail
buttons = []
# Parent directory button (if not at jail root)
if path != jail:
parent = path.parent
buttons.append([InlineKeyboardButton("📁 ..", callback_data=f"fetch:{parent}")])
try:
items = sorted(path.iterdir(), key=lambda x: (not x.is_dir(), x.name.lower()))
for item in items[:20]: # Limit to 20 items
if item.name.startswith("."):
continue # Skip hidden files
if item.is_dir():
buttons.append([InlineKeyboardButton(
f"📁 {item.name}/",
callback_data=f"fetch:{item}"
)])
else:
# Show file size
try:
size = item.stat().st_size
if size < 1024:
size_str = f"{size} B"
elif size < 1024 * 1024:
size_str = f"{size/1024:.1f} KB"
else:
size_str = f"{size/(1024*1024):.1f} MB"
except Exception:
size_str = "?"
buttons.append([InlineKeyboardButton(
f"📄 {item.name} ({size_str})",
callback_data=f"fetch:{item}"
)])
except PermissionError:
buttons.append([InlineKeyboardButton("⛔ Permission denied", callback_data="noop")])
return InlineKeyboardMarkup(buttons)
async def handle_path(path_str: str, jail: Path) -> dict:
"""Handle a path selection - return directory listing or file."""
path = Path(path_str).resolve()
if not is_safe_path(path, jail):
return {
"type": "error",
"message": "Access denied: path outside allowed directory"
}
if path.is_dir():
return {
"type": "directory",
"keyboard": get_directory_keyboard(path, jail)
}
elif path.is_file():
return {
"type": "file",
"path": path,
"filename": path.name
}
else:
return {
"type": "error",
"message": "Path does not exist"
}
def list_directory(path_str: str, jail_str: Optional[str] = None) -> str:
"""List directory contents as formatted string for web dashboard."""
path = Path(path_str).resolve()
jail = Path(jail_str).resolve() if jail_str else Path.home()
if not is_safe_path(path, jail):
return "⛔ Access denied: path outside allowed directory"
if not path.is_dir():
return f"📄 {path.name} - File selected"
lines = [f"📂 **{path}**\n"]
try:
items = sorted(path.iterdir(), key=lambda x: (not x.is_dir(), x.name.lower()))
for item in items[:30]: # Limit to 30 items
if item.name.startswith("."):
continue
if item.is_dir():
lines.append(f"📁 {item.name}/")
else:
try:
size = item.stat().st_size
if size < 1024:
size_str = f"{size} B"
elif size < 1024 * 1024:
size_str = f"{size/1024:.1f} KB"
else:
size_str = f"{size/(1024*1024):.1f} MB"
except Exception:
size_str = "?"
lines.append(f"📄 {item.name} ({size_str})")
if len(items) > 30:
lines.append(f"\n... and {len(items) - 30} more items")
except PermissionError:
lines.append("⛔ Permission denied")
return "\n".join(lines)
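A standalone, stdlib-only sketch of the jail check used above (independent of the repo): `Path.is_relative_to` resolves `..` segments first and rejects sibling directories that merely share a string prefix with the jail, a case that raw `startswith` on path strings would wrongly accept.

```python
import tempfile
from pathlib import Path

def is_safe_path(path: Path, jail: Path) -> bool:
    # Resolve both sides (collapsing ".." and symlinks), then ask
    # whether path sits underneath jail
    try:
        return path.resolve().is_relative_to(jail.resolve())
    except Exception:
        return False

root = Path(tempfile.mkdtemp())
jail = root / "jail"
(jail / "notes").mkdir(parents=True)
sibling = root / "jail2"   # shares the "jail" string prefix
sibling.mkdir()

print(is_safe_path(jail / "notes", jail))         # inside the jail: True
print(is_safe_path(jail / ".." / "jail2", jail))  # ".." traversal out: False
print(is_safe_path(sibling, jail))                # prefix look-alike: False
```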


@@ -0,0 +1,30 @@
"""Screenshot tool."""
import io
from typing import Optional
try:
import pyautogui
PYAUTOGUI_AVAILABLE = True
except Exception:
PYAUTOGUI_AVAILABLE = False
def take_screenshot() -> Optional[bytes]:
"""Take a screenshot and return as bytes."""
if not PYAUTOGUI_AVAILABLE:
return None
try:
# Take screenshot
screenshot = pyautogui.screenshot()
# Convert to bytes
buffer = io.BytesIO()
screenshot.save(buffer, format="PNG")
buffer.seek(0)
return buffer.getvalue()
    except Exception:
        # Common on headless servers or when a display is not available
        return None


@@ -0,0 +1,53 @@
"""System status tool."""
import platform
from datetime import datetime, timedelta
import psutil
def get_system_status() -> str:
"""Get formatted system status."""
# CPU
cpu_percent = psutil.cpu_percent(interval=0.5)
cpu_count = psutil.cpu_count()
# Memory
mem = psutil.virtual_memory()
mem_used_gb = mem.used / (1024 ** 3)
mem_total_gb = mem.total / (1024 ** 3)
# Disk
disk = psutil.disk_usage("/")
disk_used_gb = disk.used / (1024 ** 3)
disk_total_gb = disk.total / (1024 ** 3)
# Uptime
boot_time = datetime.fromtimestamp(psutil.boot_time())
uptime = datetime.now() - boot_time
uptime_str = str(timedelta(seconds=int(uptime.total_seconds())))
# Battery (if available)
battery_str = ""
try:
battery = psutil.sensors_battery()
if battery:
battery_str = f"\n🔋 Battery: {battery.percent:.0f}%"
            if battery.power_plugged:
                battery_str += " 🔌 plugged in"
except Exception:
pass
# Platform info
system = platform.system()
machine = platform.machine()
return f"""🟢 **System Status**
💻 **{system} ({machine})**
🧠 CPU: {cpu_percent:.1f}% ({cpu_count} cores)
💾 RAM: {mem_used_gb:.1f} / {mem_total_gb:.1f} GB ({mem.percent:.0f}%)
💿 Disk: {disk_used_gb:.0f} / {disk_total_gb:.0f} GB ({disk.percent:.0f}%){battery_str}
⏱️ Uptime: {uptime_str}
"""
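The byte-size formatting above, and the two identical B/KB/MB blocks in the file browser, are hand-rolled in three places; a shared helper (hypothetical, not part of this diff) could collapse them into one function:

```python
def format_size(num_bytes: int) -> str:
    # One helper for the repeated size-formatting branches:
    # bytes, KB, and MB with one decimal place
    if num_bytes < 1024:
        return f"{num_bytes} B"
    if num_bytes < 1024 * 1024:
        return f"{num_bytes / 1024:.1f} KB"
    return f"{num_bytes / (1024 * 1024):.1f} MB"

print(format_size(512))              # 512 B
print(format_size(2048))             # 2.0 KB
print(format_size(3 * 1024 * 1024))  # 3.0 MB
```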


@@ -0,0 +1,302 @@
"""Web server for QR code pairing flow."""
import asyncio
import secrets
from typing import Optional
from fastapi import FastAPI, Request, Form
from fastapi.responses import HTMLResponse
import uvicorn
import qrcode
from io import BytesIO
import base64
from pocketclaw.config import Settings
# Global state for pairing
_pairing_complete = asyncio.Event()
_session_secret: Optional[str] = None
_settings: Optional[Settings] = None
def generate_qr_svg(deep_link: str) -> str:
"""Generate QR code as SVG string."""
qr = qrcode.QRCode(version=1, box_size=10, border=2)
qr.add_data(deep_link)
qr.make(fit=True)
# Generate as PNG and convert to base64
img = qr.make_image(fill_color="black", back_color="white")
buffer = BytesIO()
img.save(buffer, format="PNG")
img_base64 = base64.b64encode(buffer.getvalue()).decode()
return f"data:image/png;base64,{img_base64}"
def create_app(settings: Settings) -> FastAPI:
"""Create the FastAPI app for pairing."""
global _session_secret, _settings
_settings = settings
_session_secret = secrets.token_urlsafe(32)
app = FastAPI(title="PocketClaw Setup")
@app.get("/", response_class=HTMLResponse)
async def setup_page():
"""Render the setup page."""
return f"""
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>PocketClaw Setup</title>
<style>
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
min-height: 100vh;
display: flex;
align-items: center;
justify-content: center;
color: #fff;
}}
.container {{
background: rgba(255,255,255,0.05);
backdrop-filter: blur(10px);
border-radius: 24px;
padding: 48px;
max-width: 480px;
width: 90%;
text-align: center;
border: 1px solid rgba(255,255,255,0.1);
}}
.logo {{ font-size: 64px; margin-bottom: 16px; }}
h1 {{ font-size: 28px; margin-bottom: 8px; }}
.tagline {{ color: #888; margin-bottom: 32px; }}
.step {{
background: rgba(255,255,255,0.05);
border-radius: 12px;
padding: 20px;
margin-bottom: 16px;
text-align: left;
}}
.step-number {{
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
width: 28px; height: 28px;
border-radius: 50%;
display: inline-flex;
align-items: center;
justify-content: center;
font-weight: bold;
font-size: 14px;
margin-right: 12px;
}}
input {{
width: 100%;
padding: 14px 16px;
border: 1px solid rgba(255,255,255,0.2);
border-radius: 8px;
background: rgba(0,0,0,0.3);
color: #fff;
font-size: 14px;
margin-top: 12px;
}}
input:focus {{ outline: none; border-color: #667eea; }}
button {{
width: 100%;
padding: 16px;
border: none;
border-radius: 12px;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: #fff;
font-size: 16px;
font-weight: 600;
cursor: pointer;
margin-top: 24px;
transition: transform 0.2s, box-shadow 0.2s;
}}
button:hover {{ transform: translateY(-2px); box-shadow: 0 8px 24px rgba(102,126,234,0.4); }}
.qr-section {{ display: none; margin-top: 32px; }}
.qr-section.active {{ display: block; }}
.qr-code {{
background: #fff;
padding: 16px;
border-radius: 16px;
display: inline-block;
margin: 16px 0;
}}
.qr-code img {{ width: 200px; height: 200px; }}
.success {{
background: rgba(34, 197, 94, 0.2);
border: 1px solid rgba(34, 197, 94, 0.5);
padding: 16px;
border-radius: 12px;
margin-top: 16px;
}}
.api-keys {{
margin-top: 16px;
text-align: left;
}}
.api-keys label {{
display: block;
font-size: 12px;
color: #888;
margin-top: 12px;
margin-bottom: 4px;
}}
</style>
</head>
<body>
<div class="container">
<div class="logo">🦀</div>
<h1>PocketClaw Setup</h1>
<p class="tagline">Your AI agent, on your machine</p>
<form id="setup-form" method="POST" action="/setup">
<div class="step">
<span class="step-number">1</span>
<strong>Create a Telegram Bot</strong>
<p style="color: #888; font-size: 14px; margin-top: 8px;">
Open <a href="https://t.me/BotFather" target="_blank" style="color: #667eea;">@BotFather</a>
on Telegram and send <code>/newbot</code>
</p>
<input type="text" name="bot_token" placeholder="Paste your bot token here..." required>
</div>
<div class="step api-keys">
<span class="step-number">2</span>
<strong>LLM API Keys (Optional)</strong>
<p style="color: #888; font-size: 14px; margin-top: 8px;">
Add API keys for cloud LLMs. Leave blank to use local Ollama only.
</p>
<label>OpenAI API Key</label>
<input type="password" name="openai_key" placeholder="sk-...">
<label>Anthropic API Key</label>
<input type="password" name="anthropic_key" placeholder="sk-ant-...">
</div>
<button type="submit">Generate QR Code →</button>
</form>
<div id="qr-section" class="qr-section">
<div class="step">
<span class="step-number">3</span>
<strong>Scan with Telegram</strong>
<p style="color: #888; font-size: 14px; margin-top: 8px;">
Open your phone camera and scan this QR code
</p>
</div>
<div class="qr-code">
<img id="qr-image" src="" alt="QR Code">
</div>
<p style="color: #888; font-size: 14px;">Waiting for connection...</p>
</div>
<div id="success-section" class="success" style="display: none;">
✅ <strong>Connected!</strong> PocketClaw is now running.
</div>
</div>
<script>
document.getElementById('setup-form').addEventListener('submit', async (e) => {{
e.preventDefault();
const formData = new FormData(e.target);
const response = await fetch('/setup', {{
method: 'POST',
body: formData
}});
const data = await response.json();
if (data.qr_url) {{
document.getElementById('qr-image').src = data.qr_url;
document.getElementById('qr-section').classList.add('active');
e.target.style.display = 'none';
pollStatus();
}}
}});
async function pollStatus() {{
while (true) {{
const response = await fetch('/status');
const data = await response.json();
if (data.paired) {{
document.getElementById('qr-section').style.display = 'none';
document.getElementById('success-section').style.display = 'block';
setTimeout(() => window.close(), 3000);
break;
}}
await new Promise(r => setTimeout(r, 1000));
}}
}}
</script>
</body>
</html>
"""
@app.post("/setup")
async def setup(
bot_token: str = Form(...),
openai_key: Optional[str] = Form(None),
anthropic_key: Optional[str] = Form(None)
):
"""Handle setup form submission."""
global _settings
# Save the bot token
_settings.telegram_bot_token = bot_token
if openai_key:
_settings.openai_api_key = openai_key
if anthropic_key:
_settings.anthropic_api_key = anthropic_key
        # A deep link embedding the session secret would need the bot's
        # username, which we would have to fetch from the Bot API first.
        # For now, use a simpler flow: the QR code just opens Telegram
        # and the user sends /start to the bot manually.
        qr_data = generate_qr_svg("https://t.me/share/url?url=Send%20/start%20to%20your%20bot")
return {"qr_url": qr_data, "session_secret": _session_secret}
@app.get("/status")
async def status():
"""Check pairing status."""
return {"paired": _pairing_complete.is_set()}
@app.post("/complete")
async def complete(user_id: int):
"""Called internally when pairing is complete."""
global _settings
_settings.allowed_user_id = user_id
_settings.save()
_pairing_complete.set()
return {"ok": True}
return app
async def run_pairing_server(settings: Settings) -> None:
"""Run the pairing server until pairing is complete."""
app = create_app(settings)
config = uvicorn.Config(
app,
host=settings.web_host,
port=settings.web_port,
log_level="warning"
)
server = uvicorn.Server(config)
# Run server in background
server_task = asyncio.create_task(server.serve())
# Wait for pairing to complete
await _pairing_complete.wait()
# Shutdown server
server.should_exit = True
await server_task

tests/__init__.py Normal file

@@ -0,0 +1 @@
# Tests for PocketClaw

Binary file not shown.

tests/conftest.py Normal file

@@ -0,0 +1,3 @@
"""Pytest configuration."""
import pytest

tests/test_tools.py Normal file

@@ -0,0 +1,255 @@
"""Unit tests for PocketClaw tools."""
import pytest
from pathlib import Path
from unittest.mock import patch, MagicMock
class TestStatusTool:
"""Tests for status tool."""
def test_get_system_status_returns_string(self):
"""Status should return a formatted string."""
from pocketclaw.tools import status
result = status.get_system_status()
assert isinstance(result, str)
assert "System Status" in result
assert "CPU" in result
assert "RAM" in result
assert "Disk" in result
def test_get_system_status_contains_percentages(self):
"""Status should contain percentage values."""
from pocketclaw.tools import status
result = status.get_system_status()
# Should have percentage signs
assert "%" in result
class TestFetchTool:
"""Tests for fetch tool."""
def test_is_safe_path_within_jail(self, tmp_path):
"""Paths within jail should be safe."""
from pocketclaw.tools.fetch import is_safe_path
jail = tmp_path
safe_path = tmp_path / "subdir"
safe_path.mkdir()
assert is_safe_path(safe_path, jail) is True
def test_is_safe_path_outside_jail(self, tmp_path):
"""Paths outside jail should be unsafe."""
from pocketclaw.tools.fetch import is_safe_path
jail = tmp_path / "jail"
jail.mkdir()
outside_path = tmp_path / "outside"
outside_path.mkdir()
assert is_safe_path(outside_path, jail) is False
def test_is_safe_path_parent_traversal(self, tmp_path):
"""Parent traversal should be blocked."""
from pocketclaw.tools.fetch import is_safe_path
jail = tmp_path / "jail"
jail.mkdir()
traversal_path = jail / ".." / "outside"
assert is_safe_path(traversal_path, jail) is False
def test_get_directory_keyboard_returns_markup(self, tmp_path):
"""Should return InlineKeyboardMarkup."""
from pocketclaw.tools.fetch import get_directory_keyboard
from telegram import InlineKeyboardMarkup
# Create some test files
(tmp_path / "file1.txt").write_text("test")
(tmp_path / "subdir").mkdir()
result = get_directory_keyboard(tmp_path, tmp_path)
assert isinstance(result, InlineKeyboardMarkup)
@pytest.mark.asyncio
async def test_handle_path_directory(self, tmp_path):
"""Should handle directory paths."""
from pocketclaw.tools.fetch import handle_path
result = await handle_path(str(tmp_path), tmp_path)
assert result["type"] == "directory"
assert "keyboard" in result
@pytest.mark.asyncio
async def test_handle_path_file(self, tmp_path):
"""Should handle file paths."""
from pocketclaw.tools.fetch import handle_path
test_file = tmp_path / "test.txt"
test_file.write_text("content")
result = await handle_path(str(test_file), tmp_path)
assert result["type"] == "file"
assert result["filename"] == "test.txt"
@pytest.mark.asyncio
async def test_handle_path_outside_jail(self, tmp_path):
"""Should reject paths outside jail."""
from pocketclaw.tools.fetch import handle_path
jail = tmp_path / "jail"
jail.mkdir()
outside = tmp_path / "outside"
outside.mkdir()
result = await handle_path(str(outside), jail)
assert result["type"] == "error"
class TestScreenshotTool:
"""Tests for screenshot tool."""
def test_take_screenshot_returns_bytes_or_none(self):
"""Screenshot should return bytes or None."""
from pocketclaw.tools import screenshot
result = screenshot.take_screenshot()
# Should be bytes or None (depending on display availability)
assert result is None or isinstance(result, bytes)
    def test_take_screenshot_without_pyautogui(self):
        """Should return None when pyautogui is unavailable."""
        from pocketclaw.tools import screenshot
        # Patch the module-level flag so the guard clause returns None
        with patch.object(screenshot, 'PYAUTOGUI_AVAILABLE', False):
            result = screenshot.take_screenshot()
            assert result is None
class TestConfig:
"""Tests for configuration."""
def test_settings_defaults(self):
"""Settings should have sensible defaults."""
from pocketclaw.config import Settings
settings = Settings()
assert settings.agent_backend == "open_interpreter"
assert settings.llm_provider == "auto"
assert settings.web_port == 8888
assert settings.ollama_model == "llama3.2"
def test_settings_save_and_load(self, tmp_path, monkeypatch):
"""Settings should persist to disk."""
from pocketclaw.config import Settings, get_config_path
# Mock config path to use temp directory
config_file = tmp_path / "config.json"
monkeypatch.setattr('pocketclaw.config.get_config_path', lambda: config_file)
# Create and save settings
settings = Settings(telegram_bot_token="test-token", allowed_user_id=12345)
settings.save()
# Verify file exists
assert config_file.exists()
# Load and verify
loaded = Settings.load()
assert loaded.telegram_bot_token == "test-token"
assert loaded.allowed_user_id == 12345
def test_get_config_dir_creates_directory(self, tmp_path, monkeypatch):
"""Config dir should be created if not exists."""
from pocketclaw.config import get_config_dir
# Mock home to use temp
new_home = tmp_path / "home"
new_home.mkdir()
monkeypatch.setattr(Path, 'home', lambda: new_home)
result = get_config_dir()
assert result.exists()
assert result.name == ".pocketclaw"
class TestLLMRouter:
"""Tests for LLM router."""
def test_router_initialization(self):
"""Router should initialize without errors."""
from pocketclaw.config import Settings
from pocketclaw.llm.router import LLMRouter
settings = Settings()
router = LLMRouter(settings)
assert router.conversation_history == []
def test_router_clear_history(self):
"""Should clear conversation history."""
from pocketclaw.config import Settings
from pocketclaw.llm.router import LLMRouter
settings = Settings()
router = LLMRouter(settings)
router.conversation_history = [{"role": "user", "content": "test"}]
router.clear_history()
assert router.conversation_history == []
@pytest.mark.asyncio
async def test_router_no_backend_returns_error(self):
"""Should return error when no backend available."""
from pocketclaw.config import Settings
from pocketclaw.llm.router import LLMRouter
settings = Settings(
llm_provider="openai",
openai_api_key=None # No key
)
router = LLMRouter(settings)
result = await router.chat("Hello")
assert "No LLM backend available" in result
class TestAgentRouter:
"""Tests for agent router."""
def test_router_defaults_to_open_interpreter(self):
"""Should default to Open Interpreter."""
from pocketclaw.config import Settings
from pocketclaw.agents.router import AgentRouter
settings = Settings(agent_backend="open_interpreter")
router = AgentRouter(settings)
assert router._agent is not None
def test_router_switches_to_claude_code(self):
"""Should switch to Claude Code when configured."""
from pocketclaw.config import Settings
from pocketclaw.agents.router import AgentRouter
settings = Settings(agent_backend="claude_code", anthropic_api_key="test")
router = AgentRouter(settings)
# Should have initialized (even if API key is fake)
assert router._agent is not None

uv.lock generated Normal file

File diff suppressed because it is too large