* implement validator eval
* implement online eval foundation
* further implementing online evals
* enhance evaluation data logging
* implement LLM scoring, remove redundant EventEnricher
* cleanup
* fix build errs from merging, extend LLM scorer context
* settled evaluation framework
* update evals documentation
* fix evals screenshots
* fix typos
* Evals config moved to env variables and tested
* test
* Update manifest to 49.1
* Removed duplicate + button
* Just use previous way of registering tools as that is not required for evals
* Add claude commands for research, plan and implement
* evals2 research and plan
implementation plan
new implementation plan
* Evals2 implementation
test test
* Removed old eval hooks
Remove old evals hooks
* evals 2 added to env
* Eval2 enhancement plan
backup
* Make Braintrust project configurable
Make Braintrust project configurable
* Enhanced scorer -- using Gemini 2.5 Pro for evaluation
backup v0.1
enhancement v0.2
v0.2
backup v0.3
backup v0.4
* Deleted old evals directory
* Clean up old evals code
* Bunch of fixes and improvements
backup
fixes 0.1
more fixes
fixes
more elaborate prompts
braintrust logger fix
* Renamed files
backup
* wip: new execution class and manager
* wip: new pubsub channels
* wip: new background handlers
* new execution logic
* removed execution status
* handle workflow status for processing in sidepanel
* mcp server fix
* sending pause message
* better portName parsing, sidepanel sends tabId, storing tabId too