# React Agent Loop Design ## Overview This document outlines the design and implementation strategy for integrating a React (Reasoning + Acting) agent execution pattern into the existing BrowserAgent architecture. The React pattern, based on the paper "ReAct: Synergizing Reasoning and Acting in Language Models", provides a structured approach for agents to alternate between reasoning about tasks and taking actions. ## Background ### Current Execution Strategies The BrowserAgent currently implements two execution strategies: 1. **Simple Task Strategy** (`_executeSimpleTaskStrategy`) - Direct execution without planning - Maximum 10 attempts - Suitable for straightforward, single-step tasks 2. **Multi-Step Strategy** (`_executeMultiStepStrategy`) - Plan → Execute → Validate → Re-plan cycle - Uses PlannerTool for structured planning - TODO list management for progress tracking - Suitable for complex, multi-step tasks ### React Pattern Advantages The React pattern offers: - **Explicit Reasoning**: Each action is preceded by thought/reasoning - **Better Interpretability**: Clear chain of thought visible to users - **Improved Error Recovery**: Reasoning helps identify when to change approach - **Natural Language Flow**: More conversational and understandable execution ## Design Options ### Option 1: LangChain React Agent Integration Leverage LangChain's `createReactAgent` and `AgentExecutor` for standard React behavior. ```typescript private async _executeReactAgentStrategy(task: string): Promise { // Pull the React prompt template const reactPrompt = await pull("hwchase17/react"); // Get LLM from execution context const llm = await this.executionContext.getLLM(); // Create React agent with existing tools const agent = await createReactAgent({ llm: llm, tools: this.toolManager.getAll(), prompt: reactPrompt }); // Create AgentExecutor with streaming const agentExecutor = new AgentExecutor({ agent, tools: this.toolManager.getAll(), maxIterations: 15, returnIntermediateSteps: true }); // Stream execution with PubSub integration const eventStream = agentExecutor.streamEvents( { input: task }, { version: "v2", signal: this.executionContext.abortController.signal } ); for await (const event of eventStream) { this.checkIfAborted(); // Process and publish events via PubSub } } ``` **Pros:** - Minimal code changes - Proven implementation - Compatible with existing tools **Cons:** - Less control over execution flow - May need tool wrappers ### Option 2: Custom React Implementation (Recommended) Build a custom React strategy that integrates with existing planning and validation infrastructure. ## Detailed Implementation: Enhanced Custom React Strategy ### Core Architecture ```typescript private async _executeCustomReactStrategy(task: string): Promise { const MAX_REACT_ITERATIONS = 30; const PLAN_EVERY_N_STEPS = 5; const VALIDATE_EVERY_N_STEPS = 3; // State tracking let currentPlan: Plan | null = null; let iterationCount = 0; let consecutiveFailures = 0; let lastValidation: ValidationResult | null = null; // Enhanced React system prompt const reactSystemPrompt = this._generateReactSystemPrompt(); this.messageManager.addSystem(reactSystemPrompt); // Phase 1: Initial classification and planning const classification = await this._classifyTask(task); if (!classification.is_simple_task) { currentPlan = await this._createMultiStepPlan(task); await this._updateTodosFromPlan(currentPlan); } // Phase 2: React execution loop for (let i = 0; i < MAX_REACT_ITERATIONS; i++) { this.checkIfAborted(); iterationCount++; // Periodic validation if (i > 0 && i % VALIDATE_EVERY_N_STEPS === 0) { lastValidation = await this._validateProgressInReactLoop(task, currentPlan); if (lastValidation.isComplete) return; } // Periodic re-planning if (currentPlan && i > 0 && i % PLAN_EVERY_N_STEPS === 0) { if (await this._shouldReplan(task, currentPlan, iterationCount)) { currentPlan = await this._createMultiStepPlan(task); } } // Execute React step const stepResult = await this._executeReactStep(task, currentPlan, i); if (stepResult.doneToolCalled) return; if (stepResult.requirePlanningCalled) { currentPlan = await this._createMultiStepPlan(task); } // Loop detection if (this._detectLoop()) { currentPlan = await this._replanWithContext(task, lastValidation); } } } ``` ### Key Components #### 1. React System Prompt ```typescript private _generateReactSystemPrompt(): string { return `You are an advanced ReAct agent with planning and validation capabilities. EXECUTION PATTERN: 1. Thought: Analyze the current situation and what needs to be done 2. Decision: Determine if you need to plan, execute a tool, or validate progress 3. Action: Execute the chosen tool 4. Observation: Process the result and update your understanding SPECIAL TOOLS: - planner_tool: Use when you need a structured multi-step plan - validator_tool: Use to check if the task is complete - todo_manager_tool: Use to track your progress visually - require_planning_tool: Use when the current approach isn't working IMPORTANT: - For complex tasks, start by creating a plan - Regularly validate your progress - If stuck, request re-planning - When complete, use done_tool with a summary`; } ``` #### 2. React Step Execution ```typescript private async _executeReactStep( task: string, currentPlan: Plan | null, stepNumber: number ): Promise { // Build context-aware prompt let prompt = `Step ${stepNumber + 1} of ReAct execution for task: ${task}\n\n`; if (currentPlan) { const todoState = await this._getCurrentTodoState(); prompt += `Current Progress:\n${todoState}\n\n`; } prompt += `Based on the conversation and current state: 1. THOUGHT: What should I do next and why? 2. ACTION: Which tool should I use?`; // Execute with streaming this.messageManager.addHuman(prompt); const response = await this._invokeLLMWithStreaming(); // Extract and display thought if (response.content) { const thought = this._extractThought(response.content as string); if (thought) { this.pubsub.publishMessage( PubSub.createMessage(`💭 ${thought}`, 'ultrathinking') ); } } // Process tool calls if (response.tool_calls?.length > 0) { const result = await this._processToolCalls(response.tool_calls); await this._updateTodoProgress(response.tool_calls[0].name); return result; } return { doneToolCalled: false, requirePlanningCalled: false, requiresHumanInput: false }; } ``` #### 3. Progress Validation ```typescript private async _validateProgressInReactLoop( task: string, currentPlan: Plan | null ): Promise { const validatorTool = this.toolManager.get('validator_tool'); if (!validatorTool) return { isComplete: false, reasoning: "", suggestions: [] }; const result = await validatorTool.func({ task, include_suggestions: true }); const parsed = JSON.parse(result); if (parsed.ok && parsed.output) { const validation = parsed.output; // Publish insights this.pubsub.publishMessage( PubSub.createMessage(`📊 Validation: ${validation.reasoning}`, 'thinking') ); if (validation.suggestions?.length > 0) { this.pubsub.publishMessage( PubSub.createMessage(`💡 Suggestions: ${validation.suggestions.join(', ')}`, 'thinking') ); } return validation; } return { isComplete: false, reasoning: "Validation failed", suggestions: [] }; } ``` #### 4. Intelligent Re-planning ```typescript private async _shouldReplan( task: string, currentPlan: Plan, iterationCount: number ): Promise { const todoState = await this._getTodoCompletionMetrics(); const { completedCount, totalCount } = todoState; if (totalCount === 0) return false; const completionRate = completedCount / totalCount; const stepsPerTodo = iterationCount / Math.max(completedCount, 1); // Replan if progress is too slow or completion rate is low return (stepsPerTodo > 5 && completionRate < 0.5) || (iterationCount > 10 && completionRate < 0.3); } private async _replanWithContext( task: string, validation: ValidationResult | null ): Promise { this.pubsub.publishMessage( PubSub.createMessage("🔄 Creating new plan based on feedback", 'ultrathinking') ); if (validation) { const context = `Previous validation: ${validation.reasoning} Suggestions: ${validation.suggestions.join(', ')}`; this.messageManager.addAI(context); } return await this._createMultiStepPlan(task); } ``` #### 5. TODO Synchronization ```typescript private async _updateTodoProgress(toolName: string): Promise { const todoTool = this.toolManager.get('todo_manager_tool'); if (!todoTool) return; const result = await todoTool.func({ action: 'get' }); const currentTodos = JSON.parse(result).output || ''; // Smart TODO completion based on tool execution const lines = currentTodos.split('\n'); let updated = false; for (let i = 0; i < lines.length; i++) { if (lines[i].includes('- [ ]')) { // Match tool name to TODO item const todoText = lines[i].toLowerCase(); const toolKeyword = toolName.replace(/_tool$/, '').replace(/_/g, ' '); if (todoText.includes(toolKeyword)) { lines[i] = lines[i].replace('- [ ]', '- [x]'); updated = true; break; } } } if (updated) { await todoTool.func({ action: 'set', todos: lines.join('\n') }); } } ``` ## Integration Strategy ### 1. Triggering React Mode Add to the `execute()` method: ```typescript // Option A: Via classification if (classification.use_react_pattern) { await this._executeCustomReactStrategy(task); return; } // Option B: Via metadata if (metadata?.executionMode === 'react') { await this._executeCustomReactStrategy(task); return; } // Option C: Via task complexity if (classification.complexity === 'high' || task.includes('reason')) { await this._executeCustomReactStrategy(task); return; } ``` ### 2. Classification Tool Enhancement Update ClassificationTool to detect when React pattern would be beneficial: ```typescript // In ClassificationTool const needsReactPattern = (task: string): boolean => { const reactIndicators = [ 'analyze and', 'think about', 'reason through', 'step by step', 'explain your thinking', 'debug', 'troubleshoot' ]; return reactIndicators.some(indicator => task.toLowerCase().includes(indicator) ); }; ``` ## Benefits of This Approach ### 1. Best of Both Worlds - Combines React's explicit reasoning with existing planning infrastructure - Maintains TODO visibility for user feedback - Preserves validation and re-planning capabilities ### 2. Adaptive Execution - Automatically switches strategies based on progress - Detects stuck states and forces new approaches - Adjusts planning frequency based on performance ### 3. Full Observability - Real-time thought streaming via PubSub - TODO progress tracking - Validation insights displayed to user ### 4. Robust Error Recovery - Multiple mechanisms to detect and recover from failures - Context-aware re-planning based on validation - Loop detection with automatic strategy switching ### 5. Seamless Integration - Uses existing ToolManager and MessageManager - Compatible with current streaming infrastructure - Maintains abort/cancellation support ## Performance Considerations ### Token Management - React pattern generates more tokens due to explicit reasoning - Mitigated by MessageManager's automatic trimming at 60% capacity - Consider shorter React prompts for token efficiency ### Execution Speed - Additional reasoning steps may increase execution time - Balanced by better decision-making and fewer errors - Validation intervals can be adjusted for performance ### Memory Usage - Tracking additional state (plans, validations, metrics) - Minimal overhead compared to benefits - State cleared between executions ## Future Enhancements ### 1. Meta-Reasoning Add a meta-reasoning layer that evaluates the React process itself: ```typescript private async _metaReasoning(history: Message[]): Promise<{ switchStrategy: boolean; adjustParameters: Record; }> { // Analyze execution patterns and suggest improvements } ``` ### 2. Learning from Execution Track successful patterns and adjust React behavior: ```typescript private async _learnFromExecution(task: string, success: boolean): Promise { // Store patterns that led to success/failure // Adjust future React prompts based on learning } ``` ### 3. Hybrid Strategies Combine React with other patterns dynamically: ```typescript private async _executeHybridStrategy(task: string): Promise { // Start with React for reasoning // Switch to direct execution for simple subtasks // Use planning for complex sequences } ``` ## Conclusion The Enhanced Custom React Strategy provides a powerful addition to BrowserAgent's execution capabilities. By combining React's reasoning pattern with existing planning and validation tools, we achieve: - **Better interpretability** through explicit reasoning - **Improved reliability** via validation and re-planning - **Enhanced user experience** with real-time thought streaming - **Robust error recovery** through multiple fallback mechanisms This design maintains backward compatibility while offering a sophisticated execution pattern for complex tasks that benefit from explicit reasoning and iterative refinement.