feat(eval): switch to Linux GitHub-hosted runner (#519)

* feat(eval): switch to ubuntu-latest runner, add OE-Clado config

- Switch workflow from self-hosted Mac Studio to ubuntu-latest
- Install BrowserOS Linux .deb in CI (no self-hosted runner needed)
- Add browseros-oe-clado-weekly.json config for orchestrator-executor
- Fix report chart to show date+time (not just date)
- Make BROWSEROS_BINARY configurable via env var

* feat(eval): add NopeCHA captcha solver extension to eval runs

- Auto-load NopeCHA extension in eval Chrome instances
- Works in incognito + headless mode
- CI workflow downloads NopeCHA before eval
- extensions/ directory gitignored (downloaded at runtime)

* feat(eval): per-config concurrency — different configs run in parallel

* feat(eval): remove concurrency limit — all runs execute in parallel
This commit is contained in:
shivammittal274
2026-03-21 23:04:45 +05:30
committed by GitHub
parent ba7892322b
commit f157436e7d
5 changed files with 64 additions and 10 deletions

View File

@@ -11,22 +11,24 @@ on:
required: false
default: 'configs/browseros-agent-weekly.json'
concurrency:
group: eval-runner
cancel-in-progress: false
permissions:
contents: read
jobs:
eval:
runs-on: self-hosted
runs-on: ubuntu-latest
timeout-minutes: 360
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install BrowserOS
run: |
wget -q https://github.com/browseros-ai/BrowserOS/releases/download/v0.44.0.1/BrowserOS_v0.44.0.1_amd64.deb
sudo dpkg -i BrowserOS_v0.44.0.1_amd64.deb
browseros --version || echo "BrowserOS installed at $(which browseros)"
- name: Install Bun
uses: oven-sh/setup-bun@v2
with:
@@ -34,7 +36,14 @@ jobs:
- name: Install dependencies
working-directory: packages/browseros-agent
run: bun install
run: bun install --ignore-scripts && bun run build:agent-sdk
- name: Install captcha solver extension
working-directory: packages/browseros-agent/apps/eval
run: |
mkdir -p extensions
curl -sL -o /tmp/nopecha.zip https://github.com/NopeCHALLC/nopecha-extension/releases/latest/download/chromium_automation.zip
unzip -qo /tmp/nopecha.zip -d extensions/nopecha
- name: Run eval
working-directory: packages/browseros-agent/apps/eval
@@ -43,7 +52,7 @@ jobs:
OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
BROWSEROS_BINARY: ${{ secrets.BROWSEROS_BINARY }}
BROWSEROS_BINARY: /usr/bin/browseros
EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/browseros-agent-weekly.json' }}
run: |
echo "Running eval with config: $EVAL_CONFIG"