Files
BrowserOS/docs/4-react-agent-loop.md
Felarof e54f7049eb Tracking for MCP tools
react agent loop
2025-08-21 13:09:55 -07:00

14 KiB

React Agent Loop Design

Overview

This document outlines the design and implementation strategy for integrating a React (Reasoning + Acting) agent execution pattern into the existing BrowserAgent architecture. The React pattern, based on the paper "ReAct: Synergizing Reasoning and Acting in Language Models", provides a structured approach for agents to alternate between reasoning about tasks and taking actions.

Background

Current Execution Strategies

The BrowserAgent currently implements two execution strategies:

  1. Simple Task Strategy (_executeSimpleTaskStrategy)

    • Direct execution without planning
    • Maximum 10 attempts
    • Suitable for straightforward, single-step tasks
  2. Multi-Step Strategy (_executeMultiStepStrategy)

    • Plan → Execute → Validate → Re-plan cycle
    • Uses PlannerTool for structured planning
    • TODO list management for progress tracking
    • Suitable for complex, multi-step tasks

React Pattern Advantages

The React pattern offers:

  • Explicit Reasoning: Each action is preceded by thought/reasoning
  • Better Interpretability: Clear chain of thought visible to users
  • Improved Error Recovery: Reasoning helps identify when to change approach
  • Natural Language Flow: More conversational and understandable execution

Design Options

Option 1: LangChain React Agent Integration

Leverage LangChain's createReactAgent and AgentExecutor for standard React behavior.

private async _executeReactAgentStrategy(task: string): Promise<void> {
  // Pull the React prompt template
  const reactPrompt = await pull<PromptTemplate>("hwchase17/react");
  
  // Get LLM from execution context
  const llm = await this.executionContext.getLLM();
  
  // Create React agent with existing tools
  const agent = await createReactAgent({
    llm: llm,
    tools: this.toolManager.getAll(),
    prompt: reactPrompt
  });
  
  // Create AgentExecutor with streaming
  const agentExecutor = new AgentExecutor({
    agent,
    tools: this.toolManager.getAll(),
    maxIterations: 15,
    returnIntermediateSteps: true
  });
  
  // Stream execution with PubSub integration
  const eventStream = agentExecutor.streamEvents(
    { input: task },
    { version: "v2", signal: this.executionContext.abortController.signal }
  );
  
  for await (const event of eventStream) {
    this.checkIfAborted();
    // Process and publish events via PubSub
  }
}

Pros:

  • Minimal code changes
  • Proven implementation
  • Compatible with existing tools

Cons:

  • Less control over execution flow
  • May need tool wrappers

Build a custom React strategy that integrates with existing planning and validation infrastructure.

Detailed Implementation: Enhanced Custom React Strategy

Core Architecture

private async _executeCustomReactStrategy(task: string): Promise<void> {
  const MAX_REACT_ITERATIONS = 30;
  const PLAN_EVERY_N_STEPS = 5;
  const VALIDATE_EVERY_N_STEPS = 3;
  
  // State tracking
  let currentPlan: Plan | null = null;
  let iterationCount = 0;
  let consecutiveFailures = 0;
  let lastValidation: ValidationResult | null = null;
  
  // Enhanced React system prompt
  const reactSystemPrompt = this._generateReactSystemPrompt();
  this.messageManager.addSystem(reactSystemPrompt);
  
  // Phase 1: Initial classification and planning
  const classification = await this._classifyTask(task);
  
  if (!classification.is_simple_task) {
    currentPlan = await this._createMultiStepPlan(task);
    await this._updateTodosFromPlan(currentPlan);
  }
  
  // Phase 2: React execution loop
  for (let i = 0; i < MAX_REACT_ITERATIONS; i++) {
    this.checkIfAborted();
    iterationCount++;
    
    // Periodic validation
    if (i > 0 && i % VALIDATE_EVERY_N_STEPS === 0) {
      lastValidation = await this._validateProgressInReactLoop(task, currentPlan);
      if (lastValidation.isComplete) return;
    }
    
    // Periodic re-planning
    if (currentPlan && i > 0 && i % PLAN_EVERY_N_STEPS === 0) {
      if (await this._shouldReplan(task, currentPlan, iterationCount)) {
        currentPlan = await this._createMultiStepPlan(task);
      }
    }
    
    // Execute React step
    const stepResult = await this._executeReactStep(task, currentPlan, i);
    
    if (stepResult.doneToolCalled) return;
    if (stepResult.requirePlanningCalled) {
      currentPlan = await this._createMultiStepPlan(task);
    }
    
    // Loop detection
    if (this._detectLoop()) {
      currentPlan = await this._replanWithContext(task, lastValidation);
    }
  }
}

Key Components

1. React System Prompt

private _generateReactSystemPrompt(): string {
  return `You are an advanced ReAct agent with planning and validation capabilities.

EXECUTION PATTERN:
1. Thought: Analyze the current situation and what needs to be done
2. Decision: Determine if you need to plan, execute a tool, or validate progress
3. Action: Execute the chosen tool
4. Observation: Process the result and update your understanding

SPECIAL TOOLS:
- planner_tool: Use when you need a structured multi-step plan
- validator_tool: Use to check if the task is complete
- todo_manager_tool: Use to track your progress visually
- require_planning_tool: Use when the current approach isn't working

IMPORTANT:
- For complex tasks, start by creating a plan
- Regularly validate your progress
- If stuck, request re-planning
- When complete, use done_tool with a summary`;
}

2. React Step Execution

private async _executeReactStep(
  task: string,
  currentPlan: Plan | null,
  stepNumber: number
): Promise<SingleTurnResult> {
  // Build context-aware prompt
  let prompt = `Step ${stepNumber + 1} of ReAct execution for task: ${task}\n\n`;
  
  if (currentPlan) {
    const todoState = await this._getCurrentTodoState();
    prompt += `Current Progress:\n${todoState}\n\n`;
  }
  
  prompt += `Based on the conversation and current state:
1. THOUGHT: What should I do next and why?
2. ACTION: Which tool should I use?`;
  
  // Execute with streaming
  this.messageManager.addHuman(prompt);
  const response = await this._invokeLLMWithStreaming();
  
  // Extract and display thought
  if (response.content) {
    const thought = this._extractThought(response.content as string);
    if (thought) {
      this.pubsub.publishMessage(
        PubSub.createMessage(`💭 ${thought}`, 'ultrathinking')
      );
    }
  }
  
  // Process tool calls
  if (response.tool_calls?.length > 0) {
    const result = await this._processToolCalls(response.tool_calls);
    await this._updateTodoProgress(response.tool_calls[0].name);
    return result;
  }
  
  return { doneToolCalled: false, requirePlanningCalled: false, requiresHumanInput: false };
}

3. Progress Validation

private async _validateProgressInReactLoop(
  task: string,
  currentPlan: Plan | null
): Promise<ValidationResult> {
  const validatorTool = this.toolManager.get('validator_tool');
  if (!validatorTool) return { isComplete: false, reasoning: "", suggestions: [] };
  
  const result = await validatorTool.func({ task, include_suggestions: true });
  const parsed = JSON.parse(result);
  
  if (parsed.ok && parsed.output) {
    const validation = parsed.output;
    
    // Publish insights
    this.pubsub.publishMessage(
      PubSub.createMessage(`📊 Validation: ${validation.reasoning}`, 'thinking')
    );
    
    if (validation.suggestions?.length > 0) {
      this.pubsub.publishMessage(
        PubSub.createMessage(`💡 Suggestions: ${validation.suggestions.join(', ')}`, 'thinking')
      );
    }
    
    return validation;
  }
  
  return { isComplete: false, reasoning: "Validation failed", suggestions: [] };
}

4. Intelligent Re-planning

private async _shouldReplan(
  task: string,
  currentPlan: Plan,
  iterationCount: number
): Promise<boolean> {
  const todoState = await this._getTodoCompletionMetrics();
  const { completedCount, totalCount } = todoState;
  
  if (totalCount === 0) return false;
  
  const completionRate = completedCount / totalCount;
  const stepsPerTodo = iterationCount / Math.max(completedCount, 1);
  
  // Replan if progress is too slow or completion rate is low
  return (stepsPerTodo > 5 && completionRate < 0.5) || 
         (iterationCount > 10 && completionRate < 0.3);
}

private async _replanWithContext(
  task: string,
  validation: ValidationResult | null
): Promise<Plan> {
  this.pubsub.publishMessage(
    PubSub.createMessage("🔄 Creating new plan based on feedback", 'ultrathinking')
  );
  
  if (validation) {
    const context = `Previous validation: ${validation.reasoning}
Suggestions: ${validation.suggestions.join(', ')}`;
    this.messageManager.addAI(context);
  }
  
  return await this._createMultiStepPlan(task);
}

5. TODO Synchronization

private async _updateTodoProgress(toolName: string): Promise<void> {
  const todoTool = this.toolManager.get('todo_manager_tool');
  if (!todoTool) return;
  
  const result = await todoTool.func({ action: 'get' });
  const currentTodos = JSON.parse(result).output || '';
  
  // Smart TODO completion based on tool execution
  const lines = currentTodos.split('\n');
  let updated = false;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].includes('- [ ]')) {
      // Match tool name to TODO item
      const todoText = lines[i].toLowerCase();
      const toolKeyword = toolName.replace(/_tool$/, '').replace(/_/g, ' ');
      
      if (todoText.includes(toolKeyword)) {
        lines[i] = lines[i].replace('- [ ]', '- [x]');
        updated = true;
        break;
      }
    }
  }
  
  if (updated) {
    await todoTool.func({ action: 'set', todos: lines.join('\n') });
  }
}

Integration Strategy

1. Triggering React Mode

Add to the execute() method:

// Option A: Via classification
if (classification.use_react_pattern) {
  await this._executeCustomReactStrategy(task);
  return;
}

// Option B: Via metadata
if (metadata?.executionMode === 'react') {
  await this._executeCustomReactStrategy(task);
  return;
}

// Option C: Via task complexity
if (classification.complexity === 'high' || task.includes('reason')) {
  await this._executeCustomReactStrategy(task);
  return;
}

2. Classification Tool Enhancement

Update ClassificationTool to detect when React pattern would be beneficial:

// In ClassificationTool
const needsReactPattern = (task: string): boolean => {
  const reactIndicators = [
    'analyze and',
    'think about',
    'reason through',
    'step by step',
    'explain your thinking',
    'debug',
    'troubleshoot'
  ];
  
  return reactIndicators.some(indicator => 
    task.toLowerCase().includes(indicator)
  );
};

Benefits of This Approach

1. Best of Both Worlds

  • Combines React's explicit reasoning with existing planning infrastructure
  • Maintains TODO visibility for user feedback
  • Preserves validation and re-planning capabilities

2. Adaptive Execution

  • Automatically switches strategies based on progress
  • Detects stuck states and forces new approaches
  • Adjusts planning frequency based on performance

3. Full Observability

  • Real-time thought streaming via PubSub
  • TODO progress tracking
  • Validation insights displayed to user

4. Robust Error Recovery

  • Multiple mechanisms to detect and recover from failures
  • Context-aware re-planning based on validation
  • Loop detection with automatic strategy switching

5. Seamless Integration

  • Uses existing ToolManager and MessageManager
  • Compatible with current streaming infrastructure
  • Maintains abort/cancellation support

Performance Considerations

Token Management

  • React pattern generates more tokens due to explicit reasoning
  • Mitigated by MessageManager's automatic trimming at 60% capacity
  • Consider shorter React prompts for token efficiency

Execution Speed

  • Additional reasoning steps may increase execution time
  • Balanced by better decision-making and fewer errors
  • Validation intervals can be adjusted for performance

Memory Usage

  • Tracking additional state (plans, validations, metrics)
  • Minimal overhead compared to benefits
  • State cleared between executions

Future Enhancements

1. Meta-Reasoning

Add a meta-reasoning layer that evaluates the React process itself:

private async _metaReasoning(history: Message[]): Promise<{
  switchStrategy: boolean;
  adjustParameters: Record<string, any>;
}> {
  // Analyze execution patterns and suggest improvements
}

2. Learning from Execution

Track successful patterns and adjust React behavior:

private async _learnFromExecution(task: string, success: boolean): Promise<void> {
  // Store patterns that led to success/failure
  // Adjust future React prompts based on learning
}

3. Hybrid Strategies

Combine React with other patterns dynamically:

private async _executeHybridStrategy(task: string): Promise<void> {
  // Start with React for reasoning
  // Switch to direct execution for simple subtasks
  // Use planning for complex sequences
}

Conclusion

The Enhanced Custom React Strategy provides a powerful addition to BrowserAgent's execution capabilities. By combining React's reasoning pattern with existing planning and validation tools, we achieve:

  • Better interpretability through explicit reasoning
  • Improved reliability via validation and re-planning
  • Enhanced user experience with real-time thought streaming
  • Robust error recovery through multiple fallback mechanisms

This design maintains backward compatibility while offering a sophisticated execution pattern for complex tasks that benefit from explicit reasoning and iterative refinement.