mirror of https://github.com/browseros-ai/BrowserOS.git synced 2026-05-18 11:06:19 +00:00

Files

Felarof e54f7049eb Tracking for MCP tools

react agent loop

2025-08-21 13:09:55 -07:00

14 KiB

Raw Blame History

React Agent Loop Design

Overview

This document outlines the design and implementation strategy for integrating a React (Reasoning + Acting) agent execution pattern into the existing BrowserAgent architecture. The React pattern, based on the paper "ReAct: Synergizing Reasoning and Acting in Language Models", provides a structured approach for agents to alternate between reasoning about tasks and taking actions.

Background

Current Execution Strategies

The BrowserAgent currently implements two execution strategies:

Simple Task Strategy (_executeSimpleTaskStrategy)
- Direct execution without planning
- Maximum 10 attempts
- Suitable for straightforward, single-step tasks
Multi-Step Strategy (_executeMultiStepStrategy)
- Plan → Execute → Validate → Re-plan cycle
- Uses PlannerTool for structured planning
- TODO list management for progress tracking
- Suitable for complex, multi-step tasks

React Pattern Advantages

The React pattern offers:

Explicit Reasoning: Each action is preceded by thought/reasoning
Better Interpretability: Clear chain of thought visible to users
Improved Error Recovery: Reasoning helps identify when to change approach
Natural Language Flow: More conversational and understandable execution

Design Options

Option 1: LangChain React Agent Integration

Leverage LangChain's createReactAgent and AgentExecutor for standard React behavior.

private async _executeReactAgentStrategy(task: string): Promise<void> {
  // Pull the React prompt template
  const reactPrompt = await pull<PromptTemplate>("hwchase17/react");
  
  // Get LLM from execution context
  const llm = await this.executionContext.getLLM();
  
  // Create React agent with existing tools
  const agent = await createReactAgent({
    llm: llm,
    tools: this.toolManager.getAll(),
    prompt: reactPrompt
  });
  
  // Create AgentExecutor with streaming
  const agentExecutor = new AgentExecutor({
    agent,
    tools: this.toolManager.getAll(),
    maxIterations: 15,
    returnIntermediateSteps: true
  });
  
  // Stream execution with PubSub integration
  const eventStream = agentExecutor.streamEvents(
    { input: task },
    { version: "v2", signal: this.executionContext.abortController.signal }
  );
  
  for await (const event of eventStream) {
    this.checkIfAborted();
    // Process and publish events via PubSub
  }
}

Pros:

Minimal code changes
Proven implementation
Compatible with existing tools

Cons:

Less control over execution flow
May need tool wrappers

Option 2: Custom React Implementation (Recommended)

Build a custom React strategy that integrates with existing planning and validation infrastructure.

Detailed Implementation: Enhanced Custom React Strategy

Core Architecture

private async _executeCustomReactStrategy(task: string): Promise<void> {
  const MAX_REACT_ITERATIONS = 30;
  const PLAN_EVERY_N_STEPS = 5;
  const VALIDATE_EVERY_N_STEPS = 3;
  
  // State tracking
  let currentPlan: Plan | null = null;
  let iterationCount = 0;
  let consecutiveFailures = 0;
  let lastValidation: ValidationResult | null = null;
  
  // Enhanced React system prompt
  const reactSystemPrompt = this._generateReactSystemPrompt();
  this.messageManager.addSystem(reactSystemPrompt);
  
  // Phase 1: Initial classification and planning
  const classification = await this._classifyTask(task);
  
  if (!classification.is_simple_task) {
    currentPlan = await this._createMultiStepPlan(task);
    await this._updateTodosFromPlan(currentPlan);
  }
  
  // Phase 2: React execution loop
  for (let i = 0; i < MAX_REACT_ITERATIONS; i++) {
    this.checkIfAborted();
    iterationCount++;
    
    // Periodic validation
    if (i > 0 && i % VALIDATE_EVERY_N_STEPS === 0) {
      lastValidation = await this._validateProgressInReactLoop(task, currentPlan);
      if (lastValidation.isComplete) return;
    }
    
    // Periodic re-planning
    if (currentPlan && i > 0 && i % PLAN_EVERY_N_STEPS === 0) {
      if (await this._shouldReplan(task, currentPlan, iterationCount)) {
        currentPlan = await this._createMultiStepPlan(task);
      }
    }
    
    // Execute React step
    const stepResult = await this._executeReactStep(task, currentPlan, i);
    
    if (stepResult.doneToolCalled) return;
    if (stepResult.requirePlanningCalled) {
      currentPlan = await this._createMultiStepPlan(task);
    }
    
    // Loop detection
    if (this._detectLoop()) {
      currentPlan = await this._replanWithContext(task, lastValidation);
    }
  }
}

Key Components

1. React System Prompt

private _generateReactSystemPrompt(): string {
  return `You are an advanced ReAct agent with planning and validation capabilities.

EXECUTION PATTERN:
1. Thought: Analyze the current situation and what needs to be done
2. Decision: Determine if you need to plan, execute a tool, or validate progress
3. Action: Execute the chosen tool
4. Observation: Process the result and update your understanding

SPECIAL TOOLS:
- planner_tool: Use when you need a structured multi-step plan
- validator_tool: Use to check if the task is complete
- todo_manager_tool: Use to track your progress visually
- require_planning_tool: Use when the current approach isn't working

IMPORTANT:
- For complex tasks, start by creating a plan
- Regularly validate your progress
- If stuck, request re-planning
- When complete, use done_tool with a summary`;
}

2. React Step Execution

private async _executeReactStep(
  task: string,
  currentPlan: Plan | null,
  stepNumber: number
): Promise<SingleTurnResult> {
  // Build context-aware prompt
  let prompt = `Step ${stepNumber + 1} of ReAct execution for task: ${task}\n\n`;
  
  if (currentPlan) {
    const todoState = await this._getCurrentTodoState();
    prompt += `Current Progress:\n${todoState}\n\n`;
  }
  
  prompt += `Based on the conversation and current state:
1. THOUGHT: What should I do next and why?
2. ACTION: Which tool should I use?`;
  
  // Execute with streaming
  this.messageManager.addHuman(prompt);
  const response = await this._invokeLLMWithStreaming();
  
  // Extract and display thought
  if (response.content) {
    const thought = this._extractThought(response.content as string);
    if (thought) {
      this.pubsub.publishMessage(
        PubSub.createMessage(`💭 ${thought}`, 'ultrathinking')
      );
    }
  }
  
  // Process tool calls
  if (response.tool_calls?.length > 0) {
    const result = await this._processToolCalls(response.tool_calls);
    await this._updateTodoProgress(response.tool_calls[0].name);
    return result;
  }
  
  return { doneToolCalled: false, requirePlanningCalled: false, requiresHumanInput: false };
}

3. Progress Validation

private async _validateProgressInReactLoop(
  task: string,
  currentPlan: Plan | null
): Promise<ValidationResult> {
  const validatorTool = this.toolManager.get('validator_tool');
  if (!validatorTool) return { isComplete: false, reasoning: "", suggestions: [] };
  
  const result = await validatorTool.func({ task, include_suggestions: true });
  const parsed = JSON.parse(result);
  
  if (parsed.ok && parsed.output) {
    const validation = parsed.output;
    
    // Publish insights
    this.pubsub.publishMessage(
      PubSub.createMessage(`📊 Validation: ${validation.reasoning}`, 'thinking')
    );
    
    if (validation.suggestions?.length > 0) {
      this.pubsub.publishMessage(
        PubSub.createMessage(`💡 Suggestions: ${validation.suggestions.join(', ')}`, 'thinking')
      );
    }
    
    return validation;
  }
  
  return { isComplete: false, reasoning: "Validation failed", suggestions: [] };
}

4. Intelligent Re-planning

private async _shouldReplan(
  task: string,
  currentPlan: Plan,
  iterationCount: number
): Promise<boolean> {
  const todoState = await this._getTodoCompletionMetrics();
  const { completedCount, totalCount } = todoState;
  
  if (totalCount === 0) return false;
  
  const completionRate = completedCount / totalCount;
  const stepsPerTodo = iterationCount / Math.max(completedCount, 1);
  
  // Replan if progress is too slow or completion rate is low
  return (stepsPerTodo > 5 && completionRate < 0.5) || 
         (iterationCount > 10 && completionRate < 0.3);
}

private async _replanWithContext(
  task: string,
  validation: ValidationResult | null
): Promise<Plan> {
  this.pubsub.publishMessage(
    PubSub.createMessage("🔄 Creating new plan based on feedback", 'ultrathinking')
  );
  
  if (validation) {
    const context = `Previous validation: ${validation.reasoning}
Suggestions: ${validation.suggestions.join(', ')}`;
    this.messageManager.addAI(context);
  }
  
  return await this._createMultiStepPlan(task);
}

5. TODO Synchronization

private async _updateTodoProgress(toolName: string): Promise<void> {
  const todoTool = this.toolManager.get('todo_manager_tool');
  if (!todoTool) return;
  
  const result = await todoTool.func({ action: 'get' });
  const currentTodos = JSON.parse(result).output || '';
  
  // Smart TODO completion based on tool execution
  const lines = currentTodos.split('\n');
  let updated = false;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].includes('- [ ]')) {
      // Match tool name to TODO item
      const todoText = lines[i].toLowerCase();
      const toolKeyword = toolName.replace(/_tool$/, '').replace(/_/g, ' ');
      
      if (todoText.includes(toolKeyword)) {
        lines[i] = lines[i].replace('- [ ]', '- [x]');
        updated = true;
        break;
      }
    }
  }
  
  if (updated) {
    await todoTool.func({ action: 'set', todos: lines.join('\n') });
  }
}

Integration Strategy

1. Triggering React Mode

Add to the execute() method:

// Option A: Via classification
if (classification.use_react_pattern) {
  await this._executeCustomReactStrategy(task);
  return;
}

// Option B: Via metadata
if (metadata?.executionMode === 'react') {
  await this._executeCustomReactStrategy(task);
  return;
}

// Option C: Via task complexity
if (classification.complexity === 'high' || task.includes('reason')) {
  await this._executeCustomReactStrategy(task);
  return;
}

2. Classification Tool Enhancement

Update ClassificationTool to detect when React pattern would be beneficial:

// In ClassificationTool
const needsReactPattern = (task: string): boolean => {
  const reactIndicators = [
    'analyze and',
    'think about',
    'reason through',
    'step by step',
    'explain your thinking',
    'debug',
    'troubleshoot'
  ];
  
  return reactIndicators.some(indicator => 
    task.toLowerCase().includes(indicator)
  );
};

Benefits of This Approach

1. Best of Both Worlds

Combines React's explicit reasoning with existing planning infrastructure
Maintains TODO visibility for user feedback
Preserves validation and re-planning capabilities

2. Adaptive Execution

Automatically switches strategies based on progress
Detects stuck states and forces new approaches
Adjusts planning frequency based on performance

3. Full Observability

Real-time thought streaming via PubSub
TODO progress tracking
Validation insights displayed to user

4. Robust Error Recovery

Multiple mechanisms to detect and recover from failures
Context-aware re-planning based on validation
Loop detection with automatic strategy switching

5. Seamless Integration

Uses existing ToolManager and MessageManager
Compatible with current streaming infrastructure
Maintains abort/cancellation support

Performance Considerations

Token Management

React pattern generates more tokens due to explicit reasoning
Mitigated by MessageManager's automatic trimming at 60% capacity
Consider shorter React prompts for token efficiency

Execution Speed

Additional reasoning steps may increase execution time
Balanced by better decision-making and fewer errors
Validation intervals can be adjusted for performance

Memory Usage

Tracking additional state (plans, validations, metrics)
Minimal overhead compared to benefits
State cleared between executions

Future Enhancements

1. Meta-Reasoning

Add a meta-reasoning layer that evaluates the React process itself:

private async _metaReasoning(history: Message[]): Promise<{
  switchStrategy: boolean;
  adjustParameters: Record<string, any>;
}> {
  // Analyze execution patterns and suggest improvements
}

2. Learning from Execution

Track successful patterns and adjust React behavior:

private async _learnFromExecution(task: string, success: boolean): Promise<void> {
  // Store patterns that led to success/failure
  // Adjust future React prompts based on learning
}

3. Hybrid Strategies

Combine React with other patterns dynamically:

private async _executeHybridStrategy(task: string): Promise<void> {
  // Start with React for reasoning
  // Switch to direct execution for simple subtasks
  // Use planning for complex sequences
}

Conclusion

The Enhanced Custom React Strategy provides a powerful addition to BrowserAgent's execution capabilities. By combining React's reasoning pattern with existing planning and validation tools, we achieve:

Better interpretability through explicit reasoning
Improved reliability via validation and re-planning
Enhanced user experience with real-time thought streaming
Robust error recovery through multiple fallback mechanisms

This design maintains backward compatibility while offering a sophisticated execution pattern for complex tasks that benefit from explicit reasoning and iterative refinement.

14 KiB Raw Blame History