Commit Graph

13 Commits

Author SHA1 Message Date
Omkar Bansod
400d20fdf8 Feature/Feedback-System (#85)
* feat(feedback): introduce a feedback feature with Firebase service

* fix firebase config issue to test the UI

* fix response button alignment

* feat/add firebase config

* update firebase output

* fix(chore)

* add userQuery

* add Enter button to submit feedback

---------

Co-authored-by: Felarof <nithin.sonti@gmail.com>
2025-09-11 13:33:56 -07:00
Felarof
1abbee638a Braintrust basic evals (#87)
* implement validator eval

* implement online eval foundation

* further implementing online evals

* enhance evaluation data logging

* implement LLM scoring, remove redundant EventEnricher

* cleanup

* fix build errs from merging, extend LLM scorer context

* settled evaluation framework

* update evals documentation

* fix evals screenshots

* fix typos

* Evals config moved to env variables and tested

* test

* Update manifest to 49.1

* Removed duplicate + button

* Just use previous way of registering tools as that is not required for evals

* Add claude commands for research, plan and implement

* evals2 research and plan

implementation plan

new implementation plan

* Evals2 implementation

test test

* Removed old eval hooks

Remove old evals hooks

* evals 2 added to env

* Eval2 enhancement plan

backup

* Make Braintrust project configurable

Make Braintrust project configurable

* Enhanced scorer -- using Gemini 2.5 pro for evaluation

backup v0.1

enhancement v0.2

v0.2

backup v0.3

backup v0.4

* Deleted old evals directory

* Clean up old evals code

* Bunch of fixes and improvements

backup

fixes 0.1

more fixes

fixes

more elaborate prompts

braintrust logger fix

* Renamed files

backup
2025-09-05 18:04:07 -07:00
Omkar Bansod
5e48e7c56e fix: add rimraf for cross-platform builds 2025-08-18 07:58:50 -07:00
gbsierra
5cb4bf6b2b implement validator eval (#45) 2025-08-15 13:57:20 -07:00
gbsierra
404703ea86 initial evals commit 2025-08-14 13:48:49 -07:00
Felarof
263b416504 Revert "Implement PDF extraction and various general improvements (#39)" (#41)
This reverts commit 8ff2b76408.
2025-08-12 14:18:58 -07:00
gbsierra
fe45200d47 Implement PDF extraction and various general improvements (#39)
* ui fixes

* Implement PDF extraction and various general improvements

* general ui updates

* cleanup
2025-08-12 12:13:12 -07:00
Nikhil Sonti
9872017aa9 remove build:chrome as not needed 2025-08-01 15:21:28 -07:00
Nikhil
c23c3c4677 New agent architecture (#10)
* add show version in help

* minor update to find_element prompt

* screenshot tool

* adding screenshot to browser agent

* PlannerTool prompt updated

* Minor changes

* Passing Task as context to FindElementTool

* Update todo-tool-design.md

Create todo-tool-design.md

Update todo-tool-design.md

* TodoList Manager

TodoList Manager

TodoList Manager

todo list manager v0.1

* Markdown fix

* Small update to BrowserAgent to populate TODOs from plan

* Added another test

* Small refinement

* Add BrowserState as system reminder

* extract tool added

* extraction integration test

* registering extract tool

* fix extract tool build error

* Updated some event info

* Get selected tabs tool

Get selected tabs tool

* Todo List being displayed

* Format Tool

* Remove newline after format tool

* fix default nxtscape model as per LiteLLM

* Collapsible event execution

Collapsible event execution

* Revert "Collapsible event execution"

This reverts commit 9e3833931162eff06778e46480e6691eb508ff44.

* move to open router

* Revert "move to open router"

This reverts commit c25f5c68f4f7b5dae54dcbb3a6e97c3faf2efa5d.

* debug test file

* Event emitter

This reverts commit 39c78ec47633616b5a52eea5d323811f60ab8eba.

* Rename

* Created a new markdown rendering engine that displays tables correctly; removed react-markdown

* fix: openAI requires toolMessage after tool call

* fix: Claude can't have system messages anywhere except at the top. Make systemReminder a humanMessage, similar to Claude Code

* fix: adding gemini support

* Added new ToolResult emitter -- and renamed others to mean what they are

backup

backup

Small fix

* ListTabs fixed

* Tool icon changes

* remove spinner from side panel

* new limits for sub-loop and main loop

* new ResultTool to summarise and output result

* integrate new result tool

* removed emit.complete as emitTaskResult is enough

---------

Co-authored-by: Felarof <nithin.sonti@gmail.com>
2025-07-30 16:56:26 -07:00
Felarof
8245dfe0ff Rewrite Agent Loop (#7)
* clean-up bunch of files for re-write

* more clean-up and adding basic agent

* Minor fix moved types into respective files.

* Deleted bunch of old files

backup

Update gitignore

Deleted a bunch of files

Remove message manager

Deleted old docs

Update rules

rename Profiler to profiler

* Temporarily adding old code

* Adding two small things back

* backup

* Implemented LangChainProvider and updated cursor rules

backup

LangChainProvider

cursor rules

* Implement tests for LangChainProvider -- unit test and integration test

integration test passes

integration test backup

* Tool Design

Tools Design

tools design

* NavigationTool ready

NavigationTool ready

NavigationTool ready

NavigationTool ready

backup

* MessageManager

MessageManager

backup

* Fixed integration test

* Agent design new

Updated agent design and added bunch of /NTN commands

agent new design

* Delete old agent design

* MessageManagerReadOnly class

* PlannerTool ready

PlannerTool almost ready

* ToolManager and DoneTool

* Integration of BrowserAgent

* BrowserAgent implementation v0.1

* BrowserAgent small fix v0.2

* Tool calling design

tool call design

tool design claude

* Update agent tool design with // NTN

* add zod-to-json npm install

* BrowserAgent v0.3

* BrowserAgent v0.4

* BrowserAgent v0.5

* fixes

* Build error fixes in my NEWLY added code

build errors fix

* Build error fixes in old code (integration work)

backup

* Comment StreamEventProcessor for now, it is not used

* Small build error fix

* Small rename

* Added integration test to check structuredLLM and changed to 4o-mini

change default to nxtscape

integration test

* Small docstring

* Simplified BrowserAgent code and added integration test

Simplified BrowserAgent code

BrowserAgent integration test

* Update CLAUDE.md with project memory and instructions on how to write code

Update CLAUDE.md with project memory and instructions on how to write code

Project Memory

* Just a move. Moved ToolManager outside. Build works.

* TabOperations tool

TabOperations Tool and fixing some test

tab operations

* Update CLAUDE.md

* Added ClassificationTool

classification tool

classification prompt

* Refactored and simplified PlannerTool unit test and integration test

* Updated Planner tool

* Update CLAUDE.md

* BrowserAgent modified to do classification

BrowserAgent with classification

* minor fix to ToolManager

* Instead of ToolCall and ToolResult -- just updating message manager once

* minor fix to BrowserAgent integration test

* Changed done to "done_tool"

* Updated CLAUDE.md to reflect understanding of claude

* Uncommented stream event processor

* Renamed EventBus to StreamEventBus

* Commented StreamEventProcessor

* Event Processor

* Integrated EventProcessor with BrowserAgent

Added EventProcessor to BrowserAgent

* Renamed StreamEventBus to EventBus

* Made EventBus required parameter in ExecutionContext

* PlanGenerator rewrite

PlanGenerator rewrite

backup

* For simple task, explicitly tell it to call done tool

* Max attempts for simple task

* backup

* Revert "backup"

This reverts commit 7d79a3d4d5774bfef79ec9827878b74edad3593f.

* Consolidating where EventBus and EventProcessor are created and initialized

backup

* Update CLAUDE.md

Update CLAUDE.md

* Improving agent loop code

Cleaned up processToolCall

classification task

* Create test-writer subAgent

test-agent-prompt

test agent prompt

test-agent-prompt

Update test-writer.md

* BrowserAgent test

BrowserAgent test

BrowserAgent test

* BrowserAgent refactor

backup

backup

* Minor fixes

* Minor fix

* minor change -- NEW AGENT LOOP IS WORKING WELL

* Update cursor rules

* Small change

* Improved BrowserAgent integration test

Improved BrowserAgent integration test

* Small change

* Update CLAUDE.md

* Different tools

* FindElementTool is ready

Find element update

backup

find element backup

* Updated test strings to say "tests..."

* ScrollTool is ready

* RefreshStateTool is updated as well

* MessageManager updated

* SearchTool is ready

backup

* Interaction Element is also ready

* Add debugMessage emitter

* ValidatorTool ready and tests are passing

Validation Tool

validator tool

backup

backup

* GroupTabs tool ready

* Registered all the tools

* Planning changed to 5 steps

* BrowserAgent integration test fix

* Minor string changes

* backup

* Removed too many confusing events in EventProcessor -- there is only event.info right now

* Abort control implemented

backup

Abort

* Formatter for toolResult

Formatter for toolResult

backup

* Always render using Markdown

* Minor fix

---------

Co-authored-by: Nikhil Sonti <nikhilsv92@gmail.com>
2025-07-29 08:14:45 -07:00
Nikhil
48c19c775f clean-up few things
* remove unused files like readability

* posthog key to ENV

* update sample env

* dropped v2 prefix

* fix build

* fix package.json files
2025-07-19 13:46:58 -07:00
Felarof
4f70b95e44 Removed jest, added vite (#2)
* Update yarn.lock

* Remove BAML rules

* Removed jest, added vite support
2025-07-18 09:19:16 -07:00
Felarof
3c83077bbe Separate out side panel agent into a new repo 2025-07-18 08:36:51 -07:00