* PocAgent + refactor (#77)
* Screenshot tool fixed
* ReactAgent loop
ReactAgent loop
v0.2
* Trim to max tokens implemented correctly
trim max tokens
* JSON Parse fix
Fixed json.parse
* Minor fix -- add system message always at position 0
* minor fix
* Added support for passing screenshot size to captureScreenshot
backup
* Make react agent use screenshot tool
* Refactor backend and execution (#75)
* wip: new execution class and manager
* wip: new pubsub channels
* wip: new background handlers
* new execution logic
* removed execution status
* handle workflow status for processing in sidepanel
* mcp server fix
* sending pause message
* better portName parsing, sidepanel sends tabId, storing tabId too
* 49.0.0.26 release
* docs: OmkarBansod02 signed the CLA in browseros-ai/BrowserOS-agent#$pullRequestNo
* Refactor backend and execution (#75)
* wip: new execution class and manager
* wip: new pubsub channels
* wip: new background handlers
* new execution logic
* removed execution status
* handle workflow status for processing in sidepanel
* mcp server fix
* sending pause message
* better portName parsing, sidepanel sends tabId, storing tabId too
* Moved react agent into POCAgent
* Revert changes of ReactStrategy from BrowserAgent
* Minor fix
* fix: execution class abort issue
* clean-up: removed unused MessageTypes
* clean-up: execution-manager simplified
* better abort handling
---------
Co-authored-by: Felarof <nithin.sonti@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Screenshot failure handling
* fixes to custom agent -> sidepanel
* fix: portName stable
* fix: pause/reset + newtab connector
* New agent: wip-1
* fix: JSONify message content in MessageManagerReadOnly
* disable cacheLLM and fixing image type detection
* new agent: wip-2
* rename: <BrowserState> as <browser-state> and <SystemReminder> as <system-reminder>
* new-agent: wip-3
---------
Co-authored-by: Felarof <nithin.sonti@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* implement validator eval
* implement online eval foundation
* further implementing online evals
* enhance evaluation data logging
* implement LLM scoring, remove redundant EventEnricher
* cleanup
* fix build errs from merging, extend LLM scorer context
* settled evaluation framework
* update evals documentation
* fix evals screenshots
* fix typos
* Evals config moved to env variables and tested
* test
* Update manifest to 49.1
* Removed duplicate + button
* Just use previous way of registering tools as that is not required for evals
* Add claude commands for research, plan and implement
* evals2 research and plan
implementation plan
new implementation plan
* Evals2 implementation
test test
* Removed old eval hooks
Remove old evals hooks
* evals 2 added to env
* Eval2 enhancement plan
backup
* Make Braintrust project configurable
Make Braintrust project configurable
* Enhanced scorer -- using Gemini 2.5 pro for evaluation
backup v0.1
enhancement v0.2
v0.2
backup v0.3
backup v0.4
* Deleted old evals directory
* Clean up old evals code
* Bunch of fixes and improvements
backup
fixes 0.1
more fixes
fixes
more elaborate prompts
braintrust logger fix
* Renamed files
backup
* Better ChatAgent
ChatAgent context uses BrowserState messages
* Changed system-context to BrowserState and system-reminder to SystemReminder
* Multiple tab selection works with ChatAgent
multiple tabs fix
* Minor fix
* Removing inner loop, let agent handle the todo
* Renamed to todo_manager_tool
* replace_all todos instead of appending
* POCAgent + other fixes (#18)
* Removing inner loop, let agent handle the todo
* Renamed to todo_manager_tool
* Setting correct max context length in message manager for different providers
Setting correct max context length in message manager for different providers
backup
max tokens
* BrowserAgent inner loop implementation
Setting model name as gpt4 for nxtscape so that token counting works
Don't add browser refresh state twice
inner loop
Add browserstate message as browser state to remove any duplicates
* Fixed prompts -- saving BrowserState as <system-context> and others are being saved as <system-reminder>
Don't emit browser state message
Changed <system-reminder> to <system-context>
backup
updated the prompt
small fix
backup
* Changed default model to opus
* Add simplified BrowserState for RefreshStateTool
* Minor fix
* Make message manager trim messages during getMessages as well
* minor fix
* Implemented simple retry logic for planner tool and validator tool
* Update tools with retry logic
* changed back to gemini 2.5
---------
Co-authored-by: Nikhil <nikhilsv92@gmail.com>
* add show version in help
* minor update to find_element prompt
* screenshot tool
* adding screenshot to browser agent
* PlannerTool prompt updated
* Minor changes
* Passing Task as context to FindElementTool
* Update todo-tool-design.md
Create todo-tool-design.md
Update todo-tool-design.md
* TodoList Manager
TodoList Manager
TodoList Manager
todo list manager v0.1
* Markdown fix
* Small update to BrowserAgent to populate TODOs from plan
* Added another test
* Small refinement
* Add BrowserState as system reminder
* extract tool added
* extraction integration test
* registering extract tool
* fix extract tool build error
* Updated some event info
* Get selected tabs tool
Get selected tabs tool
* Todo List being displayed
* Format Tool
* Remove newline after format tool
* fix default nxtscape model as per LiteLLM
* Collapseable event execution
Collapseable event execution
* Revert "Collapseable event execution"
This reverts commit 9e3833931162eff06778e46480e6691eb508ff44.
* move to open router
* Revert "move to open router"
This reverts commit c25f5c68f4f7b5dae54dcbb3a6e97c3faf2efa5d.
* debug test file
* Event emitter
This reverts commit 39c78ec47633616b5a52eea5d323811f60ab8eba.
* Rename
* Created a new markdown rendering engine that displays tables correctly; removed react-markdown
* fix: openAI requires toolMessage after tool call
* fix: Claude can't have system messages anywhere but the top. Make systemReminder a humanMessage, similar to Claude Code
* fix: adding gemini support
* Added new ToolResult emitter -- and renamed others to mean what they are
backup
backup
Small fix
* ListTabs fixed
* Tool icon changes
* remove spinner from side panel
* new limits for sub-loop and main loop
* new ResultTool to summarise and output result
* integrate new result tool
* removed emit.complete as emitTaskResult is enough
---------
Co-authored-by: Felarof <nithin.sonti@gmail.com>
* clean-up bunch of files for re-write
* more clean-up and adding basic agent
* Minor fix: moved types into respective files.
* Deleted bunch of old files
backup
Update gitignore
Deleted a bunch of files
Remove message manager
Deleted old docs
Update rules
rename Profiler to profiler
* Temporarily adding old code
* Adding two small things back
* backup
* Implemented LangChainProvider and updated cursor rules
backup
LangChainProvider
cursor rules
* Implement tests for LangChainProvider -- unit test and integration test
integration test passes
integration test backup
* Tool Design
Tools Design
tools design
* NavigationTool ready
NavigationTool ready
NavigationTool ready
NavigationTool ready
backup
* MessageManager
MessageManager
backup
* Fixed integration test
* Agent design new
Updated agent design and added bunch of /NTN commands
agent new design
* Delete old agent design
* MessageManagerReadOnly class
* PlannerTool ready
PlannerTool almost ready
* ToolManager and DoneTool
* Integration of BrowserAgent
* BrowserAgent implementation v0.1
* BrowserAgent small fix v0.2
* Tool calling design
tool call design
tool design claude
* Update agent tool design with // NTN
* add zod-to-json npm install
* BrowserAgent v0.3
* BrowserAgent v0.4
* BrowserAgent v0.5
* fixes
* Build error fixes in my NEWLY added code
build errors fix
* Build error fixes in old code (integration work)
backup
* Comment StreamEventProcessor for now, it is not used
* Small build error fix
* Small rename
* Added integration test to check structuredLLM and changed to 4o-mini
change default to nxtscape
integration test
* Small docstring
* Simplified BrowserAgent code and added integration test
Simplified BrowserAgent code
BrowserAgent integration test
* Update CLAUDE.md with project memory and instructions on how to write code
Update CLAUDE.md with project memory and instructions on how to write code
Project Memory
* Just a move: moved ToolManager outside. Build works.
* TabOperations tool
TabOperations Tool and fixing some test
tab operations
* Update CLAUDE.md
* Added ClassificationTool
classification tool
classification prompt
* Refactored and simplified PlannerTool unit test and integration test
* Updated Planner tool
* Update CLAUDE.md
* BrowserAgent modified to do classification
BrowserAgent with classification
* minor fix to ToolManager
* Instead of ToolCall and ToolResult -- just updating message manager once
* minor fix to BrowserAgent integration test
* Changed done to "done_tool"
* Updated CLAUDE.md to reflect understanding of claude
* Uncommented stream event processor
* Renamed EventBus to StreamEventBus
* Commented StreamEventProcessor
* Event Processor
* Integrated EventProcessor with BrowserAgent
Added EventProcessor to BrowserAgent
* Renamed StreamEventBus to EventBus
* Made EventBus required parameter in ExecutionContext
* PlanGenerator rewrite
PlanGenerator rewrite
backup
* For simple task, explicitly tell it to call done tool
* Max attempts for simple task
* backup
* Revert "backup"
This reverts commit 7d79a3d4d5774bfef79ec9827878b74edad3593f.
* Consolidating where EventBus and EventProcessor are created and initialized
backup
* Update CLAUDE.md
Update CLAUDE.md
* Improving agent loop code
Cleaned up processToolCall
classification task
* Create test-writer subAgent
test-agent-prompt
test agent prompt
test-agent-prompt
Update test-writer.md
* BrowserAgent test
BrowserAgent test
BrowserAgent test
* BrowserAgent refactor
backup
backup
* Minor fixes
* Minor fix
* minor change -- NEW AGENT LOOP IS WORKING WELL
* Update cursor rules
* Small change
* Improved BrowserAgent integration test
Improved BrowserAgent integration test
* Small change
* Update CLAUDE.md
* Different tools
* FindElementTool is ready
Find element update
backup
find element backup
* Updated test strings to say "tests..."
* ScrollTool is ready
* RefreshStateTool is updated as well
* MessageManager updated
* SearchTool is ready
backup
* Interaction Element is also ready
* Add debugMessage emitter
* ValidatorTool ready and tests are passing
Validation Tool
validator tool
backup
backup
* GroupTabs tool ready
* Registered all the tools
* Planning changed to 5 steps
* BrowserAgent integration test fix
* Minor string changes
* backup
* Removed too many confusing events in EventProcessor -- there is only event.info right now
* Abort control implemented
backup
Abort
* Formatter for toolResult
Formatter for toolResult
backup
* Always render using Markdown
* Minor fix
---------
Co-authored-by: Nikhil Sonti <nikhilsv92@gmail.com>