feat(eval): NopeCHA CAPTCHA solver integration (#537)

* feat(eval): show mean score instead of pass/fail in report and viewer

* feat(eval): integrate NopeCHA CAPTCHA solver into eval pipeline

Add CAPTCHA detection and waiting so screenshots capture post-solve state.
Run headed with xvfb on CI since headless breaks extension content scripts.

- Add CaptchaWaiter module (detect reCAPTCHA/hCaptcha/Turnstile, poll until solved)
- Add optional `captcha` config block to EvalConfigSchema
- Wait for CAPTCHA solve before screenshot in single-agent and orchestrator-executor
- Patch NopeCHA manifest with API key before launching workers
- Fix CAPTCHA_EXT_DIR path (was pointing one level too high)
- Remove --incognito (extensions don't run in incognito; fresh user-data-dir isolates)
- CI: install xvfb, run headed via xvfb-run, pass NOPECHA_API_KEY secret
This commit is contained in:
shivammittal274
2026-03-24 00:14:16 +05:30
committed by GitHub
parent 1270b5b55c
commit 0babc05077
11 changed files with 503 additions and 5 deletions

View File

@@ -43,6 +43,9 @@ jobs:
working-directory: packages/browseros-agent
run: bun install --ignore-scripts && bun run build:agent-sdk
- name: Install xvfb
run: sudo apt-get update && sudo apt-get install -y xvfb
- name: Install captcha solver extension
working-directory: packages/browseros-agent/apps/eval
run: |
@@ -55,11 +58,12 @@ jobs:
env:
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
NOPECHA_API_KEY: ${{ secrets.NOPECHA_API_KEY }}
BROWSEROS_BINARY: /usr/bin/browseros
EVAL_CONFIG: ${{ github.event.inputs.config || 'configs/browseros-agent-weekly.json' }}
run: |
echo "Running eval with config: $EVAL_CONFIG"
bun run src/index.ts -c "$EVAL_CONFIG"
xvfb-run --auto-servernum --server-args="-screen 0 1440x900x24" bun run src/index.ts -c "$EVAL_CONFIG"
- name: Upload runs to R2
if: success()