New tasks added

larchanka
2026-02-17 14:22:53 +01:00
committed by Mikhail Larchanka
parent b4f8c3141a
commit 94bd727a43
20 changed files with 800 additions and 210 deletions

@@ -1,138 +1,224 @@
# Reminder System Implementation Plan
# Enhanced HTTP Get Tool with Playwright Implementation Plan
## Overview
Implement a reminder system that allows users to request one-time or recurring reminders via Telegram. When a reminder fires, the bot will send a message back to the user. The system will leverage the existing `CronManager` service and integrate with the Planner and Orchestrator.
Enhance the existing `http_get` tool to support Single Page Applications (SPAs) by integrating Playwright with Chromium. The current implementation uses `fetch`, which cannot execute JavaScript or handle dynamic content. The enhanced version will:
1. Use Playwright to render JavaScript-heavy websites
2. Convert HTML responses to Markdown for better LLM consumption
3. Bypass bot detection mechanisms using realistic browser fingerprints and behaviors
4. Maintain backward compatibility with the existing `fetch`-based approach for simple requests
## User Review Required
> [!IMPORTANT]
> **Natural Language Parsing Approach**
> **Fallback Strategy**
>
> The current plan uses the LLM (via `model-router`) to parse natural language time expressions (e.g., "remind me in 5 minutes", "every Monday at 9am") into cron expressions. This is flexible but adds latency and LLM dependency.
> The plan uses `fetch` as the primary method and falls back to Playwright only when needed (e.g., on 403 errors, detected SPAs, or explicit user request). This minimizes resource usage and latency.
>
> **Alternative**: Use a dedicated library like `chrono-node` for more deterministic parsing. Please confirm which approach you prefer.
> **Alternative**: Always use Playwright for all requests. This would be slower but more consistent. Please confirm which approach you prefer.
> [!IMPORTANT]
> **Reminder Message Format**
> **HTML to Markdown Conversion Library**
>
> When a reminder fires, the bot will send: `🔔 Reminder: <original message>`. Please confirm if you want a different format or additional options (e.g., snooze, dismiss).
> The plan proposes using `turndown` (popular, well-maintained) for HTML-to-Markdown conversion.
>
> **Alternatives**:
> - `html-to-md` (lighter but less feature-rich)
> - `node-html-markdown` (newer, good TypeScript support)
>
> Please confirm if `turndown` is acceptable or if you prefer a different library.
> [!IMPORTANT]
> **Bot Detection Bypass Techniques**
>
> The plan includes:
> - Realistic user agents
> - Randomized viewport sizes
> - Stealth plugin for Playwright
> - Realistic mouse movements and delays
>
> These techniques may not work for all sites with advanced bot detection (e.g., Cloudflare Turnstile, reCAPTCHA). Please confirm if this level of bypass is sufficient or if you need more advanced techniques.
## Proposed Changes
### Component 1: Planner Prompt Enhancement
### Component 1: Dependencies
Update the planner prompt to recognize reminder requests and include a new capability type for scheduling reminders.
Add required npm packages for Playwright and HTML-to-Markdown conversion.
#### [MODIFY] [package.json](file:///Users/mikhaillarchanka/Projects/AI-Agent/package.json)
- Add `playwright` to dependencies for browser automation
- Add `playwright-extra` and `puppeteer-extra-plugin-stealth` for bot detection bypass
- Add `turndown` for HTML-to-Markdown conversion
- Add corresponding TypeScript type definitions to devDependencies
---
### Component 2: HTML to Markdown Converter
Create a utility service to convert HTML content to clean, readable Markdown.
#### [NEW] [html-to-markdown.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/utils/html-to-markdown.ts)
- Implement `htmlToMarkdown(html: string, options?: ConversionOptions): string` function
- Configure Turndown to preserve important elements (links, images, code blocks, tables)
- Strip unnecessary elements (scripts, styles, navigation, footers)
- Handle edge cases (malformed HTML, empty content)
- Export configuration options for customization
---
### Component 3: Playwright Browser Service
Create a service to manage Playwright browser instances and page interactions.
#### [NEW] [browser-service.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/browser-service.ts)
- Implement singleton pattern for browser instance management (reuse browser across requests)
- Configure Chromium with stealth plugin to bypass bot detection
- Implement realistic user agent rotation
- Implement randomized viewport sizes
- Add methods:
- `fetchWithBrowser(url: string, options?: BrowserFetchOptions): Promise<{ status: number, html: string, finalUrl: string }>`
- `close(): Promise<void>` for cleanup
- Handle timeouts and errors gracefully
- Add realistic delays and mouse movements to mimic human behavior
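
The browser-reuse requirement above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned `BrowserService`: `Closable`, `LazySingleton`, and the injected `launch` function are hypothetical names, and in the real service `launch` would presumably wrap the Playwright Chromium launch.

```typescript
// Hypothetical minimal shape of a closable resource (a stand-in for a browser).
interface Closable {
  close(): Promise<void>;
}

// Lazily creates one shared instance and reuses it across callers.
class LazySingleton<T extends Closable> {
  private instance: Promise<T> | null = null;
  private readonly launch: () => Promise<T>;

  constructor(launch: () => Promise<T>) {
    this.launch = launch;
  }

  // Returns the shared instance, launching it on first use only.
  get(): Promise<T> {
    if (this.instance === null) {
      // Cache the promise (not the resolved value) so callers that arrive
      // before the launch completes still share a single launch.
      this.instance = this.launch().catch((err) => {
        this.instance = null; // allow a retry after a failed launch
        throw err;
      });
    }
    return this.instance;
  }

  // Closes the shared instance if one was created, then resets the cache.
  async close(): Promise<void> {
    const current = this.instance;
    this.instance = null;
    if (current !== null) {
      const resource = await current.catch(() => null);
      if (resource !== null) {
        await resource.close();
      }
    }
  }
}
```

Caching the promise rather than the resolved browser is one way to avoid launching twice when two requests arrive at nearly the same time; whether the real service needs that guarantee is a design decision left to the implementation.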
---
### Component 4: Enhanced HTTP Get Tool
Update the existing `http_get` tool to intelligently choose between `fetch` and Playwright.
#### [MODIFY] [tool-host.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/tool-host.ts)
- Update `httpGetTool` to accept new optional parameters:
- `useBrowser?: boolean` - Force Playwright usage
- `convertToMarkdown?: boolean` - Convert HTML to Markdown (default: true for HTML responses)
- Implement smart fallback logic:
1. Try `fetch` first (fast path)
2. If the response status is 401 or 403, or the caller set `useBrowser`, use Playwright
3. Detect HTML content type and convert to Markdown if requested
- Return enhanced response:
```typescript
// Shape of the enhanced response (the type name is illustrative)
type HttpGetResponse = {
  status: number;
  body: string; // HTML or Markdown
  contentType: string;
  finalUrl: string; // After redirects
  method: 'fetch' | 'browser'; // Which method was used
};
```
- Handle errors from both methods and provide clear error messages
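
The fallback steps above might look roughly like the sketch below. It is an illustration only: `RawResult`, `Transport`, `shouldFallBack`, `isHtml`, and `httpGetWithFallback` are hypothetical names, both transports and the Markdown converter are injected so the decision logic stays independent of `fetch` and Playwright, and it assumes `useBrowser: true` skips the `fetch` attempt entirely.

```typescript
// Result of either transport (hypothetical helper type for this sketch).
interface RawResult {
  status: number;
  body: string;
  contentType: string;
  finalUrl: string;
}

interface HttpGetOptions {
  useBrowser?: boolean;
  convertToMarkdown?: boolean;
}

type Transport = (url: string) => Promise<RawResult>;

// Statuses that trigger the browser fallback in this plan.
const FALLBACK_STATUSES = new Set<number>([401, 403]);

function shouldFallBack(status: number): boolean {
  return FALLBACK_STATUSES.has(status);
}

// True when the Content-Type header describes an HTML document.
function isHtml(contentType: string): boolean {
  const [mime = ""] = contentType.split(";");
  const normalized = mime.trim().toLowerCase();
  return normalized === "text/html" || normalized === "application/xhtml+xml";
}

async function httpGetWithFallback(
  url: string,
  options: HttpGetOptions,
  viaFetch: Transport,
  viaBrowser: Transport,
  toMarkdown: (html: string) => string,
): Promise<RawResult & { method: "fetch" | "browser" }> {
  let method: "fetch" | "browser" = "fetch";
  let result: RawResult;
  if (options.useBrowser) {
    // Assumption: an explicit request bypasses the fast path.
    method = "browser";
    result = await viaBrowser(url);
  } else {
    result = await viaFetch(url); // fast path
    if (shouldFallBack(result.status)) {
      method = "browser";
      result = await viaBrowser(url);
    }
  }
  // Conversion defaults to on, but only applies to HTML responses.
  const convert = options.convertToMarkdown ?? true;
  const body = convert && isHtml(result.contentType) ? toMarkdown(result.body) : result.body;
  return { ...result, body, method };
}
```

Injecting the transports keeps the branching testable with plain stubs, which may also simplify the 403 fallback test listed in the verification plan.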
---
### Component 5: Bot Detection Bypass Configuration
Create configuration for user agents, viewport sizes, and stealth settings.
#### [NEW] [browser-config.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/browser-config.ts)
- Export array of realistic user agents (Chrome, Firefox, Safari on various OS)
- Export array of common viewport sizes (desktop and mobile)
- Export stealth plugin configuration
- Export function to randomly select user agent and viewport
- Document bot detection bypass techniques used
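
As a rough sketch of the random selection described above: the user-agent strings below are illustrative samples rather than a vetted list, `pickRandom` is a hypothetical helper, and the random source is injectable purely so tests can be deterministic.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Illustrative values only; a real list would be longer and kept up to date.
const USER_AGENTS: readonly string[] = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
];

// Common desktop sizes.
const VIEWPORTS: readonly Viewport[] = [
  { width: 1920, height: 1080 },
  { width: 1366, height: 768 },
  { width: 1536, height: 864 },
];

// Picks one element uniformly at random from a non-empty list.
function pickRandom<T>(items: readonly T[], random: () => number = Math.random): T {
  if (items.length === 0) {
    throw new Error("cannot pick from an empty list");
  }
  // Clamp the index in case the random source ever returns exactly 1.
  const index = Math.min(items.length - 1, Math.floor(random() * items.length));
  return items[index] as T;
}

function getRandomUserAgent(random?: () => number): string {
  return pickRandom(USER_AGENTS, random);
}

function getRandomViewport(random?: () => number): Viewport {
  return pickRandom(VIEWPORTS, random);
}
```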
---
### Component 6: Planner Prompt Update
Update the planner prompt to document the enhanced `http_get` capabilities.
#### [MODIFY] [planner.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/agents/prompts/planner.ts)
- Add `cron-manager` service documentation to the available services section
- Add new capability type: `schedule_reminder`
- Include example showing how to parse a reminder request into a plan with time parsing + cron scheduling nodes
- Update `http_get` tool documentation to include new parameters
- Add examples showing when to use `useBrowser: true`
- Add examples showing Markdown conversion for web scraping tasks
- Document that HTML responses are automatically converted to Markdown
---
### Component 2: Cron Event Handling
### Component 7: Configuration
The `CronManager` currently only emits `event.cron.completed` events to the logger. We need to handle these events in the Orchestrator and route them to Telegram.
Add browser service configuration to the main config.
#### [MODIFY] [orchestrator.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/core/orchestrator.ts)
#### [MODIFY] [config.json](file:///Users/mikhaillarchanka/Projects/AI-Agent/config.json)
- Add handler for `event.cron.completed` events in `handleCoreMessage`
- Extract reminder metadata (chatId, message) from the event payload
- Send reminder message to Telegram using `sendToTelegram`
- Add `browserService` section with:
- `headless: true` - Run browser in headless mode
- `timeout: 30000` - Page load timeout in ms
- `enableStealth: true` - Enable bot detection bypass
- `reuseContext: true` - Reuse browser context for performance
#### [MODIFY] [cron-manager.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/cron-manager.ts)
#### [MODIFY] [config.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/shared/config.ts)
- Update `runJob` to emit `event.cron.completed` with structured payload including `chatId` and `reminderMessage`
- Ensure the payload stored in the database includes all necessary metadata for reminder delivery
- Add TypeScript types for `browserService` configuration
- Add validation for browser service config
---
### Component 3: Executor Integration
The Executor needs to handle the new `schedule_reminder` capability type and communicate with the `CronManager`.
#### [MODIFY] [executor-agent.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/agents/executor-agent.ts)
- Add handler for `schedule_reminder` node type
- Send `cron.schedule.add` message to `cron-manager` with parsed cron expression and reminder metadata
- Handle response and store schedule ID in node output
---
### Component 4: Time Expression Parsing
Create a new service or utility to parse natural language time expressions into cron expressions.
#### [NEW] [time-parser.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/time-parser.ts)
- Implement `parseTimeExpression(input: string): { cronExpr: string, isRecurring: boolean }` function
- Use LLM (via `model-router`) to convert natural language to cron expressions
- Include validation and error handling for invalid expressions
- Support both one-time (using specific date/time) and recurring (using cron patterns) reminders
---
### Component 5: Database Schema
The existing `cron_schedules` table should be sufficient, but we may want to add reminder-specific metadata.
#### [MODIFY] [cron-manager.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/services/cron-manager.ts) (Schema)
- Consider adding optional columns: `reminder_chat_id`, `reminder_message`, `reminder_user_id` for better querying
- **Alternative**: Store all reminder metadata in the existing `payload` JSON column (simpler, no migration needed)
**Recommendation**: Use the existing `payload` column to avoid schema changes.
---
### Component 6: Reminder Management Commands
Add Telegram commands to list and cancel reminders.
#### [MODIFY] [telegram-adapter.ts](file:///Users/mikhaillarchanka/Projects/AI-Agent/src/adapters/telegram-adapter.ts)
- Add `/reminders` command to list active reminders for the user
- Add `/cancel_reminder <id>` command to remove a specific reminder
- Send requests to `cron-manager` via the orchestrator
## Verification Plan
### Automated Tests
1. **Unit test for time parser**
1. **Unit test for HTML to Markdown converter**
```bash
npm test src/services/__tests__/time-parser.test.ts
npm test src/utils/__tests__/html-to-markdown.test.ts
```
- Test parsing of common expressions: "in 5 minutes", "tomorrow at 3pm", "every Monday at 9am"
- Test error handling for invalid expressions
- Test conversion of various HTML elements (headings, lists, tables, code blocks)
- Test stripping of unwanted elements (scripts, styles)
- Test handling of malformed HTML
2. **Integration test for cron scheduling**
2. **Integration test for browser service**
```bash
npm test src/services/__tests__/cron-manager.test.ts
npm test src/services/__tests__/browser-service.test.ts
```
- Test adding a schedule via IPC
- Test that `event.cron.completed` is emitted with correct payload
- Test listing and removing schedules
- Test fetching a simple static page
- Test fetching a JavaScript-heavy SPA (e.g., React app)
- Test timeout handling
- Test browser instance reuse
- Test cleanup on shutdown
3. **Integration test for enhanced http_get tool**
```bash
npm test src/services/__tests__/tool-host.test.ts
```
- Test `fetch` fallback to Playwright on 403
- Test explicit `useBrowser: true` parameter
- Test HTML to Markdown conversion
- Test response format with both methods
### Manual Verification
1. **One-time reminder test**
1. **Test with SPA website**
- Start the orchestrator: `npm run dev:orchestrator`
- Send message via Telegram: "Remind me in 2 minutes to check the oven"
- Verify that the bot responds with confirmation
- Wait 2 minutes and verify reminder message is received
- Send message via Telegram: "Fetch https://react-example-app.com and summarize the content"
- Verify that the tool uses Playwright (check logs)
- Verify that content is properly extracted and converted to Markdown
2. **Recurring reminder test**
- Send message via Telegram: "Remind me every day at 9am to take vitamins"
- Verify confirmation message
- Check database to confirm cron expression is correct: `sqlite3 data/cron.sqlite "SELECT * FROM cron_schedules;"`
- Verify reminder fires at the scheduled time (may need to adjust time for testing)
2. **Test bot detection bypass**
- Test with a site known to block bots (e.g., some news sites)
- Send message: "Fetch https://site-with-bot-detection.com with browser"
- Verify that the request succeeds (status 200)
- Verify that content is properly extracted
3. **List and cancel reminders**
- Send `/reminders` command
- Verify list of active reminders is displayed
- Send `/cancel_reminder <id>` with an ID from the list
- Verify reminder is removed and confirmation is sent
- Send `/reminders` again to confirm it's gone
3. **Test fallback mechanism**
- Test with a simple static site (should use `fetch`)
- Test with a site that returns 403 (should fall back to Playwright)
- Check logs to confirm which method was used
- Verify response times (fetch should be faster)
4. **Test Markdown conversion**
- Fetch a Wikipedia article
- Verify that the Markdown output is clean and readable
- Verify that links, headings, and lists are properly formatted
- Verify that navigation, scripts, and styles are stripped
5. **Test error handling**
- Test with invalid URL
- Test with timeout (very slow site)
- Test with network error
- Verify that error messages are clear and helpful

@@ -1,134 +1,180 @@
# Reminder System Tasks
# Enhanced HTTP Get Tool Tasks
## Phase 1: Core Infrastructure
## Phase 1: Dependencies and Configuration
### Task 1.1: Create Time Parser Service
**File**: `src/services/time-parser.ts`
### Task 1.1: Add Required Dependencies
**File**: `package.json`
**Dependencies**: None
**Description**: Create a service that converts natural language time expressions into cron expressions using the LLM.
**Description**: Add Playwright, stealth plugin, and HTML-to-Markdown conversion libraries.
**Acceptance Criteria**:
- Exports `parseTimeExpression(input: string): Promise<{ cronExpr: string, isRecurring: boolean, description: string }>`
- Uses `model-router` to parse natural language
- Returns valid cron expressions compatible with `node-cron`
- Handles both one-time and recurring reminders
- Throws descriptive errors for invalid inputs
- Add `playwright` to dependencies
- Add `playwright-extra` to dependencies
- Add `puppeteer-extra-plugin-stealth` to dependencies
- Add `turndown` to dependencies
- Add `@types/turndown` to devDependencies
- Run `npm install` successfully
- All dependencies are compatible with Node.js 20+
### Task 1.2: Add Time Parser Tests
**File**: `src/services/__tests__/time-parser.test.ts`
### Task 1.2: Install Playwright Browsers
**Dependencies**: Task 1.1
**Description**: Create unit tests for the time parser service.
**Description**: Install Chromium browser for Playwright.
**Acceptance Criteria**:
- Tests parsing "in X minutes/hours/days"
- Tests parsing "tomorrow/next week at HH:MM"
- Tests parsing "every day/week/Monday at HH:MM"
- Tests error handling for invalid expressions
- All tests pass with `npm test`
- Run `npx playwright install chromium`
- Verify Chromium is installed successfully
- Document browser installation in README or setup docs
### Task 1.3: Update Cron Manager Event Payload
**File**: `src/services/cron-manager.ts`
### Task 1.3: Add Browser Service Configuration
**File**: `config.json`
**Dependencies**: None
**Description**: Modify `CronManager.runJob()` to emit structured reminder data in `event.cron.completed`.
**Description**: Add configuration section for browser service.
**Acceptance Criteria**:
- `event.cron.completed` payload includes `chatId`, `reminderMessage`, `userId`
- Payload is extracted from the stored `payload` column in the database
- Existing functionality is not broken
- Add `browserService` object with `headless`, `timeout`, `enableStealth`, `reuseContext` properties
- Set sensible defaults (headless: true, timeout: 30000, enableStealth: true, reuseContext: true)
- Configuration is valid JSON
### Task 1.4: Add Cron Manager Integration Tests
**File**: `src/services/__tests__/cron-manager.test.ts`
### Task 1.4: Update Config TypeScript Types
**File**: `src/shared/config.ts`
**Dependencies**: Task 1.3
**Description**: Create integration tests for cron manager reminder functionality.
**Description**: Add TypeScript types and validation for browser service configuration.
**Acceptance Criteria**:
- Tests adding a reminder schedule via IPC
- Tests that `event.cron.completed` is emitted with correct reminder payload
- Tests listing schedules
- Tests removing schedules
- All tests pass with `npm test`
- Add `BrowserServiceConfig` interface with typed properties
- Add `browserService` to main config type
- Add validation for browser service config in `getConfig()`
- TypeScript compilation succeeds with no errors
---
## Phase 2: Orchestrator Integration
## Phase 2: Core Utilities
### Task 2.1: Handle Cron Events in Orchestrator
**File**: `src/core/orchestrator.ts`
**Dependencies**: Task 1.3
**Description**: Add handler for `event.cron.completed` to route reminders to Telegram.
### Task 2.1: Create HTML to Markdown Converter
**File**: `src/utils/html-to-markdown.ts`
**Dependencies**: Task 1.1
**Description**: Create utility to convert HTML to clean Markdown using Turndown.
**Acceptance Criteria**:
- `handleCoreMessage` handles `event.cron.completed` events
- Extracts `chatId` and `reminderMessage` from payload
- Calls `sendToTelegram` with formatted reminder message
- Logs errors if chatId or message is missing
- Exports `htmlToMarkdown(html: string, options?: ConversionOptions): string`
- Configures Turndown to preserve links, images, code blocks, tables, headings, lists
- Strips scripts, styles, navigation, footers, and other non-content elements
- Handles malformed HTML gracefully
- Returns empty string for empty/invalid input
- Includes JSDoc documentation
### Task 2.2: Add HTML to Markdown Tests
**File**: `src/utils/__tests__/html-to-markdown.test.ts`
**Dependencies**: Task 2.1
**Description**: Create unit tests for HTML to Markdown converter.
**Acceptance Criteria**:
- Tests conversion of headings (h1-h6)
- Tests conversion of lists (ul, ol)
- Tests conversion of links and images
- Tests conversion of code blocks and inline code
- Tests conversion of tables
- Tests stripping of scripts and styles
- Tests handling of malformed HTML
- Tests handling of empty input
- All tests pass with `npm test`
### Task 2.3: Create Browser Configuration
**File**: `src/services/browser-config.ts`
**Dependencies**: None
**Description**: Create configuration for user agents, viewports, and stealth settings.
**Acceptance Criteria**:
- Exports array of 10+ realistic user agents (Chrome, Firefox, Safari on Windows, macOS, Linux)
- Exports array of common viewport sizes (1920x1080, 1366x768, 1536x864, etc.)
- Exports function `getRandomUserAgent(): string`
- Exports function `getRandomViewport(): { width: number, height: number }`
- Exports stealth plugin configuration object
- Includes JSDoc documentation explaining bot detection bypass techniques
---
## Phase 3: Planner Enhancement
## Phase 3: Browser Service
### Task 3.1: Update Planner Prompt with Reminder Capability
**File**: `src/agents/prompts/planner.ts`
**Dependencies**: None
**Description**: Add `cron-manager` service and `schedule_reminder` capability to planner prompt.
### Task 3.1: Create Browser Service Core
**File**: `src/services/browser-service.ts`
**Dependencies**: Task 1.1, Task 1.4, Task 2.3
**Description**: Create service to manage Playwright browser instances.
**Acceptance Criteria**:
- Documents `cron-manager` service in "Available Services and Capabilities" section
- Adds `schedule_reminder` capability type with input schema
- Includes example showing reminder request → plan with time parsing + scheduling nodes
- Prompt instructs planner to recognize reminder keywords ("remind me", "reminder", etc.)
- Implements singleton pattern for browser instance
- Exports `BrowserService` class extending `BaseProcess`
- Implements `fetchWithBrowser(url: string, options?: BrowserFetchOptions): Promise<BrowserFetchResult>`
- Implements `close(): Promise<void>` for cleanup
- Configures Chromium with stealth plugin
- Uses random user agent and viewport for each request
- Handles browser launch errors gracefully
- Includes timeout handling (from config)
- Reuses browser context when `reuseContext` is enabled
### Task 3.2: Add Planner Example for Reminders
**File**: `src/agents/prompts/planner.ts`
### Task 3.2: Add Realistic Behavior to Browser Service
**File**: `src/services/browser-service.ts`
**Dependencies**: Task 3.1
**Description**: Add few-shot example showing how to plan a reminder request.
**Description**: Add human-like behaviors to bypass bot detection.
**Acceptance Criteria**:
- Example shows user request: "Remind me tomorrow at 3pm to call John"
- Plan includes two nodes: parse time expression, schedule reminder
- Dependencies are correctly specified
- Example follows the existing format
- Adds random delay (100-500ms) before page interaction
- Implements realistic mouse movement to random coordinates
- Waits for network idle before extracting content
- Adds random scroll behavior for long pages
- Configures browser to disable automation flags
- Sets realistic browser headers (Accept-Language, Accept-Encoding, etc.)
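
The random delay in the criteria above could be sketched as follows. `randomIntInclusive` and `humanDelay` are hypothetical names, and the injectable `random` parameter is an assumption made for testability.

```typescript
// Returns an integer in [min, max], inclusive.
function randomIntInclusive(
  min: number,
  max: number,
  random: () => number = Math.random,
): number {
  const low = Math.ceil(min);
  const high = Math.floor(max);
  if (high < low) {
    throw new RangeError("max must be greater than or equal to min");
  }
  // Clamp in case the random source ever returns exactly 1.
  return Math.min(high, low + Math.floor(random() * (high - low + 1)));
}

// Resolves after a random pause in the 100-500 ms range.
function humanDelay(random: () => number = Math.random): Promise<void> {
  const ms = randomIntInclusive(100, 500, random);
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}
```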
### Task 3.3: Add Browser Service Tests
**File**: `src/services/__tests__/browser-service.test.ts`
**Dependencies**: Task 3.2
**Description**: Create integration tests for browser service.
**Acceptance Criteria**:
- Tests fetching a simple static HTML page
- Tests fetching a page with JavaScript (mocked SPA)
- Tests timeout handling with slow-loading page
- Tests browser instance reuse
- Tests cleanup on shutdown
- Tests error handling for invalid URLs
- Tests stealth plugin is applied
- All tests pass with `npm test`
---
## Phase 4: Executor Enhancement
## Phase 4: Enhanced HTTP Get Tool
### Task 4.1: Add Schedule Reminder Handler to Executor
**File**: `src/agents/executor-agent.ts`
**Dependencies**: Task 1.1, Task 3.1
**Description**: Add handler for `schedule_reminder` node type in the executor.
### Task 4.1: Update HTTP Get Tool with Smart Fallback
**File**: `src/services/tool-host.ts`
**Dependencies**: Task 2.1, Task 3.2
**Description**: Enhance `httpGetTool` to support Playwright with smart fallback logic.
**Acceptance Criteria**:
- Executor recognizes `schedule_reminder` node type
- Calls `time-parser` to convert natural language to cron expression
- Sends `cron.schedule.add` message to `cron-manager` with reminder metadata
- Stores schedule ID in node output
- Handles errors from time parser and cron manager
- Accepts new optional parameters: `useBrowser?: boolean`, `convertToMarkdown?: boolean`
- Implements fallback logic: try `fetch` first, use Playwright on 403/401 or if `useBrowser` is true
- Detects HTML content type from response headers
- Converts HTML to Markdown when `convertToMarkdown` is true (default for HTML)
- Returns enhanced response with `status`, `body`, `contentType`, `finalUrl`, `method` fields
- Handles errors from both `fetch` and Playwright
- Logs which method was used (fetch vs browser)
### Task 4.2: Add HTTP Get Tool Tests
**File**: `src/services/__tests__/tool-host.test.ts`
**Dependencies**: Task 4.1
**Description**: Create integration tests for enhanced HTTP get tool.
**Acceptance Criteria**:
- Tests successful fetch with simple URL
- Tests fallback to Playwright on 403 response
- Tests explicit `useBrowser: true` parameter
- Tests HTML to Markdown conversion
- Tests response format includes all required fields
- Tests error handling for invalid URLs
- Tests error handling for network failures
- All tests pass with `npm test`
---
## Phase 5: Telegram Commands
## Phase 5: Planner Integration
### Task 5.1: Add List Reminders Command
**File**: `src/adapters/telegram-adapter.ts`
**Dependencies**: Task 1.3
**Description**: Add `/reminders` command to list active reminders for the user.
### Task 5.1: Update Planner Prompt
**File**: `src/agents/prompts/planner.ts`
**Dependencies**: Task 4.1
**Description**: Update planner prompt to document enhanced `http_get` capabilities.
**Acceptance Criteria**:
- `/reminders` command sends `cron.schedule.list` to `cron-manager` via orchestrator
- Filters results to show only user's reminders (by chatId)
- Formats and displays reminder list with ID, time, and message
- Shows "No active reminders" if list is empty
### Task 5.2: Add Cancel Reminder Command
**File**: `src/adapters/telegram-adapter.ts`
**Dependencies**: Task 5.1
**Description**: Add `/cancel_reminder` command to remove a specific reminder.
**Acceptance Criteria**:
- `/cancel_reminder <id>` command sends `cron.schedule.remove` to `cron-manager`
- Validates that the reminder belongs to the requesting user
- Sends confirmation message on success
- Sends error message if ID is invalid or reminder not found
### Task 5.3: Update Help Command
**File**: `src/adapters/telegram-adapter.ts`
**Dependencies**: Task 5.1, Task 5.2
**Description**: Update `/help` command to document reminder functionality.
**Acceptance Criteria**:
- Help text includes reminder examples
- Documents `/reminders` and `/cancel_reminder` commands
- Provides example reminder requests
- Updates `http_get` tool documentation to include `useBrowser` and `convertToMarkdown` parameters
- Adds example showing when to use `useBrowser: true` (e.g., for SPAs, sites with bot detection)
- Adds example showing Markdown conversion for web scraping
- Documents that HTML responses are automatically converted to Markdown
- Explains that `fetch` is tried first for performance, with automatic fallback to browser
---
@@ -136,21 +182,48 @@
### Task 6.1: Manual End-to-End Testing
**Dependencies**: All previous tasks
**Description**: Perform manual testing of the complete reminder flow.
**Description**: Perform manual testing of the complete enhanced HTTP get flow.
**Acceptance Criteria**:
- One-time reminder works: "Remind me in 2 minutes to check the oven"
- Recurring reminder works: "Remind me every day at 9am to take vitamins"
- `/reminders` command lists active reminders
- `/cancel_reminder` command removes reminders
- Reminders are delivered to the correct chat
- Error messages are clear and helpful
- Test fetching static HTML page (should use `fetch`)
- Test fetching SPA website (should fall back to Playwright)
- Test fetching site with bot detection (should use Playwright with stealth)
- Test explicit `useBrowser: true` parameter
- Test Markdown conversion quality on real websites
- Test error handling with invalid URLs
- Test timeout handling with slow sites
- Verify logs show which method was used
- Verify response times (fetch should be faster than Playwright)
### Task 6.2: Update README
### Task 6.2: Performance Benchmarking
**Dependencies**: Task 6.1
**Description**: Benchmark performance of fetch vs Playwright.
**Acceptance Criteria**:
- Measure average response time for `fetch` (should be <1s for most sites)
- Measure average response time for Playwright (should be <5s for most sites)
- Measure browser startup time (first request vs subsequent requests)
- Document performance characteristics in code comments or README
- Verify browser context reuse improves performance
### Task 6.3: Update README Documentation
**File**: `README.md`
**Dependencies**: Task 6.1
**Description**: Document the reminder feature in the README.
**Description**: Document the enhanced HTTP get tool in the README.
**Acceptance Criteria**:
- Adds "Reminder System" section to Features
- Documents supported time expressions
- Documents `/reminders` and `/cancel_reminder` commands
- Includes examples of reminder requests
- Adds section explaining enhanced `http_get` capabilities
- Documents when Playwright is used vs `fetch`
- Documents bot detection bypass techniques
- Documents HTML to Markdown conversion
- Includes examples of using `useBrowser` parameter
- Documents browser installation requirement (`npx playwright install chromium`)
### Task 6.4: Add Troubleshooting Guide
**File**: `README.md` or `docs/TROUBLESHOOTING.md`
**Dependencies**: Task 6.3
**Description**: Create troubleshooting guide for common browser service issues.
**Acceptance Criteria**:
- Documents how to debug Playwright issues (headless: false, slowMo, screenshots)
- Documents common bot detection bypass failures and solutions
- Documents how to handle sites with CAPTCHA
- Documents browser installation issues
- Documents timeout configuration
- Documents how to view browser logs

@@ -0,0 +1,22 @@
# P12-01: Add Required Dependencies
**File**: `package.json`
**Dependencies**: None
**Phase**: 1 - Dependencies and Configuration
## Description
Add Playwright, stealth plugin, and HTML-to-Markdown conversion libraries to the project.
## Acceptance Criteria
- Add `playwright` to dependencies
- Add `playwright-extra` to dependencies
- Add `puppeteer-extra-plugin-stealth` to dependencies
- Add `turndown` to dependencies
- Add `@types/turndown` to devDependencies
- Run `npm install` successfully
- All dependencies are compatible with Node.js 20+
## Implementation Notes
- Use latest stable versions of all packages
- Verify compatibility between `playwright-extra` and `playwright` versions
- Check that `puppeteer-extra-plugin-stealth` works with Playwright (not just Puppeteer)

@@ -0,0 +1,18 @@
# P12-02: Install Playwright Browsers
**File**: N/A (Command execution)
**Dependencies**: P12-01
**Phase**: 1 - Dependencies and Configuration
## Description
Install Chromium browser for Playwright to enable browser automation.
## Acceptance Criteria
- Run `npx playwright install chromium` successfully
- Verify Chromium is installed and functional
- Document browser installation in README or setup docs
## Implementation Notes
- The browser binaries are large (~200MB); ensure sufficient disk space
- Consider adding this to the setup/installation instructions
- On Linux, the install command may need the `--with-deps` flag to pull in system dependencies

@@ -0,0 +1,20 @@
# P12-03: Add Browser Service Configuration
**File**: `config.json`
**Dependencies**: None
**Phase**: 1 - Dependencies and Configuration
## Description
Add configuration section for browser service with sensible defaults.
## Acceptance Criteria
- Add `browserService` object to config.json
- Include properties: `headless`, `timeout`, `enableStealth`, `reuseContext`
- Set defaults: headless=true, timeout=30000, enableStealth=true, reuseContext=true
- Configuration is valid JSON
- No syntax errors
## Implementation Notes
- Keep timeout reasonable (30s) to avoid hanging requests
- Headless mode should be default for production
- Allow override via environment variables if needed

@@ -0,0 +1,20 @@
# P12-04: Update Config TypeScript Types
**File**: `src/shared/config.ts`
**Dependencies**: P12-03
**Phase**: 1 - Dependencies and Configuration
## Description
Add TypeScript types and validation for browser service configuration.
## Acceptance Criteria
- Add `BrowserServiceConfig` interface with typed properties
- Add `browserService` to main config type
- Add validation for browser service config in `getConfig()`
- TypeScript compilation succeeds with no errors
- All config properties are properly typed
## Implementation Notes
- Use Zod schema for runtime validation if the project uses it
- Ensure timeout is a positive number
- Ensure boolean flags are properly typed
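
A minimal sketch of how the interface and validation described above might look, written without Zod so it stands alone. The property names and defaults come from P12-03; the names `parseBrowserServiceConfig` and `DEFAULT_BROWSER_SERVICE_CONFIG` are assumptions for illustration, and the project's actual `getConfig()` may be structured differently.

```typescript
// Shape of the `browserService` section (property names from P12-03).
interface BrowserServiceConfig {
  headless: boolean;
  timeout: number; // milliseconds
  enableStealth: boolean;
  reuseContext: boolean;
}

// Defaults listed in P12-03.
const DEFAULT_BROWSER_SERVICE_CONFIG: BrowserServiceConfig = {
  headless: true,
  timeout: 30000,
  enableStealth: true,
  reuseContext: true,
};

// Hypothetical helper: merges raw config over the defaults, then validates it.
function parseBrowserServiceConfig(raw: unknown): BrowserServiceConfig {
  const input: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const merged: Record<string, unknown> = { ...DEFAULT_BROWSER_SERVICE_CONFIG, ...input };

  for (const key of ["headless", "enableStealth", "reuseContext"]) {
    if (typeof merged[key] !== "boolean") {
      throw new Error(`browserService.${key} must be a boolean`);
    }
  }
  const timeout = merged["timeout"];
  if (typeof timeout !== "number" || !Number.isFinite(timeout) || timeout <= 0) {
    throw new Error("browserService.timeout must be a positive number");
  }

  return {
    headless: merged["headless"] as boolean,
    timeout,
    enableStealth: merged["enableStealth"] as boolean,
    reuseContext: merged["reuseContext"] as boolean,
  };
}
```

Merging over the defaults first keeps a partial `browserService` section valid, while a wrong type or a non-positive timeout fails at startup instead of at the first browser request.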

@@ -0,0 +1,23 @@
# P12-05: Create HTML to Markdown Converter
**File**: `src/utils/html-to-markdown.ts`
**Dependencies**: P12-01
**Phase**: 2 - Core Utilities
## Description
Create utility to convert HTML to clean Markdown using Turndown library.
## Acceptance Criteria
- Exports `htmlToMarkdown(html: string, options?: ConversionOptions): string`
- Configures Turndown to preserve links, images, code blocks, tables, headings, lists
- Strips scripts, styles, navigation, footers, and other non-content elements
- Handles malformed HTML gracefully
- Returns empty string for empty/invalid input
- Includes JSDoc documentation
## Implementation Notes
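- Use Turndown's `addRule()` to customize element handling; table conversion is not built into core Turndown, so it needs a custom rule or the `turndown-plugin-gfm` plugin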
- Use `remove()` to strip unwanted elements (script, style, nav, footer, header)
- Configure `headingStyle: 'atx'` for consistent heading format
- Configure `codeBlockStyle: 'fenced'` for better code block rendering
- Test with real-world HTML to ensure quality output

@@ -0,0 +1,25 @@
# P12-06: Add HTML to Markdown Tests
**File**: `src/utils/__tests__/html-to-markdown.test.ts`
**Dependencies**: P12-05
**Phase**: 2 - Core Utilities
## Description
Create unit tests for HTML to Markdown converter to ensure quality conversions.
## Acceptance Criteria
- Tests conversion of headings (h1-h6)
- Tests conversion of lists (ul, ol)
- Tests conversion of links and images
- Tests conversion of code blocks and inline code
- Tests conversion of tables
- Tests stripping of scripts and styles
- Tests handling of malformed HTML
- Tests handling of empty input
- All tests pass with `npm test`
## Implementation Notes
- Use Vitest for testing (already in project)
- Create fixtures with sample HTML for each test case
- Verify Markdown output matches expected format
- Test edge cases (nested elements, special characters, etc.)

@@ -0,0 +1,22 @@
# P12-07: Create Browser Configuration
**File**: `src/services/browser-config.ts`
**Dependencies**: None
**Phase**: 2 - Core Utilities
## Description
Create configuration for user agents, viewports, and stealth settings to bypass bot detection.
## Acceptance Criteria
- Exports array of 10+ realistic user agents (Chrome, Firefox, Safari on Windows, macOS, Linux)
- Exports array of common viewport sizes (1920x1080, 1366x768, 1536x864, etc.)
- Exports function `getRandomUserAgent(): string`
- Exports function `getRandomViewport(): { width: number, height: number }`
- Exports stealth plugin configuration object
- Includes JSDoc documentation explaining bot detection bypass techniques
## Implementation Notes
- Use recent browser versions in user agents (2024-2026)
- Include both desktop and mobile user agents
- Viewport sizes should match common screen resolutions
- Document which stealth features are enabled and why
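
The two random pickers could be sketched roughly as follows. The user agent strings are illustrative examples only (a real list would hold the 10+ current entries required above), and the `pickRandom` helper and the injectable `random` parameter are assumptions added here to make the functions testable.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// Example entries only; the real module would list 10+ current user agents.
const USER_AGENTS: readonly string[] = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
];

// Viewport sizes named in the acceptance criteria.
const VIEWPORTS: readonly Viewport[] = [
  { width: 1920, height: 1080 },
  { width: 1366, height: 768 },
  { width: 1536, height: 864 },
];

// Hypothetical helper: picks one element, with an injectable random source.
function pickRandom<T>(items: readonly T[], random: () => number = Math.random): T {
  if (items.length === 0) {
    throw new Error("cannot pick from an empty list");
  }
  const index = Math.min(items.length - 1, Math.floor(random() * items.length));
  return items[index] as T;
}

function getRandomUserAgent(random: () => number = Math.random): string {
  return pickRandom(USER_AGENTS, random);
}

function getRandomViewport(random: () => number = Math.random): Viewport {
  // Return a copy so callers cannot mutate the shared list.
  return { ...pickRandom(VIEWPORTS, random) };
}
```

In a real module these would be exported; the `export` keywords are left out here to keep the sketch self-contained.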

@@ -0,0 +1,27 @@
# P12-08: Create Browser Service Core
**File**: `src/services/browser-service.ts`
**Dependencies**: P12-01, P12-04, P12-07
**Phase**: 3 - Browser Service
## Description
Create service to manage Playwright browser instances with singleton pattern.
## Acceptance Criteria
- Implements singleton pattern for browser instance
- Exports `BrowserService` class extending `BaseProcess`
- Implements `fetchWithBrowser(url: string, options?: BrowserFetchOptions): Promise<BrowserFetchResult>`
- Implements `close(): Promise<void>` for cleanup
- Configures Chromium with stealth plugin
- Uses random user agent and viewport for each request
- Handles browser launch errors gracefully
- Includes timeout handling (from config)
- Reuses browser context when `reuseContext` is enabled
## Implementation Notes
- Use `playwright-extra` with stealth plugin
- Launch browser only once and reuse across requests
- Create new context for each request (or reuse if configured)
- Handle browser crashes and auto-restart
- Wait for `networkidle` or `load` event before extracting content
- Return HTML content, status code, and final URL (after redirects)
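
The "launch once and reuse" note can be sketched independently of Playwright. This is a simplified illustration, not the project's `BrowserService`: the `LazyBrowser` class and `ClosableBrowser` interface are hypothetical, and the `launch` callback stands in for whatever actually starts Chromium.

```typescript
// Minimal surface this sketch needs from a browser object.
interface ClosableBrowser {
  close(): Promise<void>;
}

// Hypothetical helper: launches on first use and shares the instance afterwards.
class LazyBrowser<B extends ClosableBrowser> {
  private readonly launch: () => Promise<B>;
  private pending: Promise<B> | null = null;

  constructor(launch: () => Promise<B>) {
    this.launch = launch;
  }

  // Returns the shared browser; concurrent callers get the same promise.
  get(): Promise<B> {
    const existing = this.pending;
    if (existing !== null) {
      return existing;
    }
    const attempt: Promise<B> = this.launch().catch((err: unknown) => {
      // Forget a failed launch so the next call can retry.
      if (this.pending === attempt) {
        this.pending = null;
      }
      throw err;
    });
    this.pending = attempt;
    return attempt;
  }

  // Closes the browser if one was launched; a later get() relaunches.
  async close(): Promise<void> {
    const current = this.pending;
    this.pending = null;
    if (current !== null) {
      const browser = await current.catch(() => null);
      if (browser !== null) {
        await browser.close();
      }
    }
  }
}
```

Caching the promise rather than the resolved browser means two requests arriving during startup share a single launch instead of starting two browsers.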

@@ -0,0 +1,24 @@
# P12-09: Add Realistic Behavior to Browser Service
**File**: `src/services/browser-service.ts`
**Dependencies**: P12-08
**Phase**: 3 - Browser Service
## Description
Add human-like behaviors to bypass advanced bot detection mechanisms.
## Acceptance Criteria
- Adds random delay (100-500ms) before page interaction
- Implements realistic mouse movement to random coordinates
- Waits for network idle before extracting content
- Adds random scroll behavior for long pages
- Configures browser to disable automation flags
- Sets realistic browser headers (Accept-Language, Accept-Encoding, etc.)
## Implementation Notes
- Use `page.mouse.move()` with random coordinates
- Use `page.evaluate()` to scroll to random positions
- Override `navigator.webdriver` to report `false` via an init script (`page.addInitScript()`), so the override is in place before any page script runs
- Add random delays between actions using `page.waitForTimeout()`
- Set Accept-Language to common values (en-US, en-GB, etc.)
- Remove the `--enable-automation` flag from the browser launch arguments (for example via the `ignoreDefaultArgs` launch option)
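
A rough sketch of the random delay and mouse movement, typed against a small `PageLike` interface so it runs without a browser. `PageLike`, `randomInt`, and `actHuman` are hypothetical names introduced here; the interface is intended to cover only the `waitForTimeout` and `mouse.move` calls named in the notes above.

```typescript
// Subset of a page object used here: just the two calls from the notes above.
interface PageLike {
  waitForTimeout(ms: number): Promise<void>;
  mouse: { move(x: number, y: number): Promise<void> };
}

// Random integer in [min, max], inclusive, with an injectable random source.
function randomInt(min: number, max: number, random: () => number = Math.random): number {
  return min + Math.floor(random() * (max - min + 1));
}

// Hypothetical helper: a short pause, then one mouse move inside the viewport.
async function actHuman(
  page: PageLike,
  viewport: { width: number; height: number },
): Promise<void> {
  // 100-500 ms delay, the range given in the acceptance criteria.
  await page.waitForTimeout(randomInt(100, 500));
  await page.mouse.move(
    randomInt(0, viewport.width - 1),
    randomInt(0, viewport.height - 1),
  );
}
```

Scrolling and header configuration are left out to keep the sketch short; they would follow the same pattern of small randomized steps.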

@@ -0,0 +1,25 @@
# P12-10: Add Browser Service Tests
**File**: `src/services/__tests__/browser-service.test.ts`
**Dependencies**: P12-09
**Phase**: 3 - Browser Service
## Description
Create integration tests for browser service to ensure reliability.
## Acceptance Criteria
- Tests fetching a simple static HTML page
- Tests fetching a page with JavaScript (mocked SPA)
- Tests timeout handling with slow-loading page
- Tests browser instance reuse
- Tests cleanup on shutdown
- Tests error handling for invalid URLs
- Tests stealth plugin is applied
- All tests pass with `npm test`
## Implementation Notes
- Use local test server or mock responses for predictable tests
- Test with real websites sparingly (can be flaky)
- Verify browser is properly closed after tests
- Test both headless and headful modes if needed
- Mock slow responses using test server delays

@@ -0,0 +1,25 @@
# P12-11: Update HTTP Get Tool with Smart Fallback
**File**: `src/services/tool-host.ts`
**Dependencies**: P12-05, P12-09
**Phase**: 4 - Enhanced HTTP Get Tool
## Description
Enhance `httpGetTool` to support Playwright with smart fallback logic from fetch to browser.
## Acceptance Criteria
- Accepts new optional parameters: `useBrowser?: boolean`, `convertToMarkdown?: boolean`
- Implements fallback logic: try `fetch` first, use Playwright on 403/401 or if `useBrowser` is true
- Detects HTML content type from response headers
- Converts HTML to Markdown when `convertToMarkdown` is true (default for HTML)
- Returns enhanced response with `status`, `body`, `contentType`, `finalUrl`, `method` fields
- Handles errors from both `fetch` and Playwright
- Logs which method was used (fetch vs browser)
## Implementation Notes
- Check response status code: if 403 or 401, retry with browser
- Check Content-Type header: if contains "text/html", convert to Markdown
- Default `convertToMarkdown` to true for HTML responses
- Add error handling for browser service failures
- Log performance metrics (response time) for monitoring
- Ensure backward compatibility with existing `http_get` usage
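
The fallback rules above reduce to three small decisions, sketched here as pure functions so they can be tested without network access. The function names are assumptions for illustration; the parameter names `useBrowser` and `convertToMarkdown` and the 401/403 rule come from the acceptance criteria.

```typescript
type FetchMethod = "fetch" | "browser";

interface HttpGetOptions {
  useBrowser?: boolean;
  convertToMarkdown?: boolean;
}

// Which transport to try first: the browser only when explicitly requested.
function initialMethod(options: HttpGetOptions = {}): FetchMethod {
  return options.useBrowser === true ? "browser" : "fetch";
}

// Whether a fetch response should be retried with the browser (401 or 403).
function shouldFallBackToBrowser(status: number): boolean {
  return status === 401 || status === 403;
}

// Whether the body should be converted: HTML content, unless the caller opted out.
function shouldConvertToMarkdown(
  contentType: string | null,
  options: HttpGetOptions = {},
): boolean {
  const isHtml = (contentType ?? "").toLowerCase().includes("text/html");
  return isHtml && options.convertToMarkdown !== false;
}
```

Treating an omitted `convertToMarkdown` as "convert" matches the stated default for HTML responses, while leaving non-HTML bodies such as JSON untouched keeps existing `http_get` callers working.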

@@ -0,0 +1,26 @@
# P12-12: Add HTTP Get Tool Tests
**File**: `src/services/__tests__/tool-host.test.ts`
**Dependencies**: P12-11
**Phase**: 4 - Enhanced HTTP Get Tool
## Description
Create integration tests for enhanced HTTP get tool to verify all functionality.
## Acceptance Criteria
- Tests successful fetch with simple URL
- Tests fallback to Playwright on 403 response
- Tests explicit `useBrowser: true` parameter
- Tests HTML to Markdown conversion
- Tests response format includes all required fields
- Tests error handling for invalid URLs
- Tests error handling for network failures
- All tests pass with `npm test`
## Implementation Notes
- Mock HTTP responses for predictable test results
- Test both successful and error cases
- Verify Markdown conversion quality
- Test that `method` field correctly indicates 'fetch' or 'browser'
- Verify `finalUrl` is populated correctly after redirects
- Test with both HTML and non-HTML content types

@@ -0,0 +1,22 @@
# P12-13: Update Planner Prompt
**File**: `src/agents/prompts/planner.ts`
**Dependencies**: P12-11
**Phase**: 5 - Planner Integration
## Description
Update planner prompt to document enhanced `http_get` capabilities and parameters.
## Acceptance Criteria
- Updates `http_get` tool documentation to include `useBrowser` and `convertToMarkdown` parameters
- Adds example showing when to use `useBrowser: true` (e.g., for SPAs, sites with bot detection)
- Adds example showing Markdown conversion for web scraping
- Documents that HTML responses are automatically converted to Markdown
- Explains that `fetch` is tried first for performance, with automatic fallback to browser
## Implementation Notes
- Add to the "Available tools" section in planner prompt
- Include parameter descriptions and types
- Provide clear examples of when to use browser mode
- Mention that browser mode is slower but more reliable for SPAs
- Update any existing examples that use `http_get`

@@ -0,0 +1,24 @@
# P12-16: Update README Documentation
**File**: `README.md`
**Dependencies**: P12-14
**Phase**: 6 - Testing & Documentation
## Description
Document the enhanced HTTP get tool capabilities in the README.
## Acceptance Criteria
- Adds section explaining enhanced `http_get` capabilities
- Documents when Playwright is used vs `fetch`
- Documents bot detection bypass techniques
- Documents HTML to Markdown conversion
- Includes examples of using `useBrowser` parameter
- Documents browser installation requirement (`npx playwright install chromium`)
## Implementation Notes
- Add to "Features" section
- Include code examples showing parameter usage
- Explain the smart fallback mechanism
- Document performance trade-offs
- Add setup instructions for Playwright
- Link to troubleshooting guide if created

@@ -0,0 +1,24 @@
# P12-17: Add Troubleshooting Guide
**File**: `README.md` or `docs/TROUBLESHOOTING.md`
**Dependencies**: P12-16
**Phase**: 6 - Testing & Documentation
## Description
Create troubleshooting guide for common browser service issues.
## Acceptance Criteria
- Documents how to debug Playwright issues (headless: false, slowMo, screenshots)
- Documents common bot detection bypass failures and solutions
- Documents how to handle sites with CAPTCHA
- Documents browser installation issues
- Documents timeout configuration
- Documents how to view browser logs
## Implementation Notes
- Include common error messages and solutions
- Provide debugging tips (enable headful mode, take screenshots)
- Explain limitations (CAPTCHA cannot be bypassed automatically)
- Document how to increase timeout for slow sites
- Include links to Playwright documentation
- Add FAQ section if helpful

@@ -1,16 +1,17 @@
# My Project Board
## To Do
## In Progress
## Done
### P12-14 Manual End-to-End Testing
- Source: P12-14_E2E_TESTING.md
### P12-15 Performance Benchmarking
- Source: P12-15_PERFORMANCE_BENCHMARK.md
## Done
### P12-16 Update README Documentation
- Source: P12-16_UPDATE_README.md

@@ -541,19 +541,49 @@ export class ExecutorAgent extends BaseProcess {
}
if (!cronExpr) {
throw new Error("schedule_reminder requires cronExpr in input or dependency output");
const dependsOn = (nodeInput.dependsOn as string[] | undefined) ?? [];
const depOutputs = dependsOn.map(depId => {
const output = context[depId];
return `${depId}: ${typeof output === "string" ? output.substring(0, 100) : JSON.stringify(output).substring(0, 100)}`;
}).join("; ");
throw new Error(`schedule_reminder requires cronExpr in input or dependency output. Dependencies: ${depOutputs || "none"}`);
}
// Extract reminder metadata
const chatId = (nodeInput.chatId as number | string | undefined) ??
(context.chatId as number | string | undefined);
const reminderMessage = (nodeInput.reminderMessage as string | undefined) ??
let reminderMessage = (nodeInput.reminderMessage as string | undefined) ??
(context.reminderMessage as string | undefined);
const userId = (nodeInput.userId as number | string | undefined) ??
(context.userId as number | string | undefined);
if (!chatId || !reminderMessage) {
throw new Error("schedule_reminder requires chatId and reminderMessage");
// Fallback: try to extract reminderMessage from goal if not provided
if (!reminderMessage && context._goal && typeof context._goal === "string") {
const goal = context._goal as string;
// Try to extract reminder message from common patterns
// Patterns: "remind me to X", "remind me X", "notify me about X", etc.
const patterns = [
/remind\s+me\s+(?:to\s+)?(.+?)(?:\s+(?:every|at|in|tomorrow|today|on))?$/i,
/notify\s+me\s+(?:at\s+[^a]+)?(?:about\s+)?(.+?)$/i,
/schedule\s+(?:a\s+)?reminder\s+(?:for\s+)?(.+?)(?:\s+(?:every|at|in|tomorrow|today|on))?$/i,
];
for (const pattern of patterns) {
const match = goal.match(pattern);
if (match && match[1]) {
reminderMessage = match[1].trim();
// Remove time expressions that might have been captured
reminderMessage = reminderMessage.replace(/\s+(?:every|at|in|tomorrow|today|on)\s+.+$/i, "").trim();
if (reminderMessage) break;
}
}
}
if (!chatId) {
throw new Error(`schedule_reminder requires chatId (should be provided automatically by executor). Context keys: ${Object.keys(context).join(", ")}`);
}
if (!reminderMessage) {
const goal = context._goal ? ` Goal: "${context._goal}"` : "";
throw new Error(`schedule_reminder requires reminderMessage in node input or extractable from goal.${goal} Node input keys: ${Object.keys(nodeInput).join(", ")}`);
}
// Send cron.schedule.add to cron-manager

@@ -64,7 +64,30 @@ export class GeneratorService extends BaseProcess {
prompt = buildSummarizerPrompt(chatHistory);
systemPrompt = SUMMARIZER_SYSTEM_PROMPT;
} else if (typeof p.input?.prompt === "string") {
prompt = p.input.prompt;
// When there's an explicit prompt, still include dependency outputs if available
const depOutputs = Object.entries(context)
.filter(([k]) => !k.startsWith("_"))
.map(([, v]) => {
// Extract body from http_get responses
if (v && typeof v === "object" && "body" in v && typeof v.body === "string") {
return v.body;
}
// Extract content from read_file responses
if (v && typeof v === "object" && "content" in v && typeof v.content === "string") {
return v.content;
}
// For strings, return as-is
if (typeof v === "string") {
return v;
}
// For other objects, stringify
return JSON.stringify(v);
});
if (depOutputs.length > 0) {
prompt = `${p.input.prompt}\n\nContent:\n${depOutputs.join("\n\n")}`;
} else {
prompt = p.input.prompt;
}
} else if (goal && (context["_criticFeedback"] != null || context["_previousDraft"] != null)) {
const feedback = context["_criticFeedback"] as string | undefined;
const previous = context["_previousDraft"] as string | undefined;
@@ -72,10 +95,40 @@ export class GeneratorService extends BaseProcess {
} else if (goal) {
const depOutputs = Object.entries(context)
.filter(([k]) => !k.startsWith("_"))
.map(([, v]) => (typeof v === "string" ? v : JSON.stringify(v)));
.map(([, v]) => {
// Extract body from http_get responses
if (v && typeof v === "object" && "body" in v && typeof v.body === "string") {
return v.body;
}
// Extract content from read_file responses
if (v && typeof v === "object" && "content" in v && typeof v.content === "string") {
return v.content;
}
// For strings, return as-is
if (typeof v === "string") {
return v;
}
// For other objects, stringify
return JSON.stringify(v);
});
prompt = `User goal: ${goal}\n\nContext from previous steps:\n${depOutputs.join("\n\n")}\n\nProduce a direct response to the goal. Output only the response text.`;
} else {
const depOutputs = Object.values(context).map((v) => (typeof v === "string" ? v : JSON.stringify(v)));
const depOutputs = Object.values(context).map((v) => {
// Extract body from http_get responses
if (v && typeof v === "object" && "body" in v && typeof v.body === "string") {
return v.body;
}
// Extract content from read_file responses
if (v && typeof v === "object" && "content" in v && typeof v.content === "string") {
return v.content;
}
// For strings, return as-is
if (typeof v === "string") {
return v;
}
// For other objects, stringify
return JSON.stringify(v);
});
prompt = depOutputs.join("\n\n") || "Generate a brief response.";
}
const genResult = systemPrompt