From 0513b285ef95d54aad2e598c88d40080f4dd1e98 Mon Sep 17 00:00:00 2001 From: Peter Steinberger Date: Wed, 13 May 2026 15:46:50 +0100 Subject: [PATCH] docs: update crabbox skill guidance --- .agents/skills/crabbox/SKILL.md | 50 +++++++++++++++++++++++++++------ 1 file changed, 41 insertions(+), 9 deletions(-) diff --git a/.agents/skills/crabbox/SKILL.md b/.agents/skills/crabbox/SKILL.md index 5b2106547e8..70577758c41 100644 --- a/.agents/skills/crabbox/SKILL.md +++ b/.agents/skills/crabbox/SKILL.md @@ -31,6 +31,10 @@ pnpm crabbox:run -- --help | sed -n '1,120p' - Check `.crabbox.yaml` for repo defaults, but override provider explicitly. Even if config still says AWS, maintainer validation should normally pass `--provider blacksmith-testbox`. +- If a warm direct-provider lease smells stale, retry with `--full-resync` + (alias `--fresh-sync`) before replacing the lease. This resets the remote + workdir, skips the fingerprint fast path, reseeds Git when possible, and + uploads the checkout from scratch. - For live/provider bugs, use the configured secret workflow before downgrading to mocks. Copy only the exact needed key into the remote process environment for that one command. Do not print it, do not sync it as a repo file, and do @@ -163,7 +167,13 @@ Use these on debugging runs before inventing ad hoc logging: executing it, then forwards only allowlisted names. `--allow-env` is repeatable and comma-separated. Profile values override ambient allowlisted env values for that run. Direct POSIX, WSL2, and native Windows runs are - supported; delegated providers are not. + supported; delegated providers are not. Crabbox probes the uploaded profile + remotely and prints redacted presence/length metadata before the command. +- `--env-helper `: with `--env-from-profile` on POSIX SSH targets, + persists `.crabbox/env/` and `.crabbox/env/.env` so follow-up + commands on the same lease can run through `./.crabbox/env/ `. + Use only on leases you control; the profile stays until cleanup, lease reset, + or `--full-resync`. - `--script ` / `--script-stdin`: upload a local script into `.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute directly on POSIX; scripts without a shebang run through `bash`. Native @@ -173,12 +183,19 @@ Use these on debugging runs before inventing ad hoc logging: fresh remote checkout of the GitHub PR. Bare numbers use the current repo's GitHub origin. Add `--apply-local-patch` only when the current local `git diff --binary HEAD` should be applied on top of that PR checkout. +- `--full-resync` / `--fresh-sync`: reset a stale direct-provider workdir + before syncing. Use after sync fingerprints look wrong, SSH times out before + sync, or rsync watchdog output suggests it. It is redundant with + `--fresh-pr`, incompatible with `--no-sync`, and unsupported by delegated + providers. - `--capture-stdout ` / `--capture-stderr `: write remote streams to local files and keep binary/noisy output out of retained logs. Parent directories must already exist. These are direct-provider only. - `--capture-on-fail`: on non-zero direct-provider exits, downloads `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`, `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed. +- `--keep-on-failure`: leave a failed one-shot lease alive for live debugging + until idle/TTL expiry. Useful on direct providers and delegated one-shots. - `--timing-json`: final machine-readable timing. Add `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell commands; direct providers and Blacksmith Testbox both report them as @@ -200,9 +217,10 @@ pnpm crabbox:run -- --provider aws \ ``` Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or -`--sync-only` to delegated providers. Also do not pass `--script*` or -`--fresh-pr` there. Crabbox rejects these because the provider owns sync or -command transport. +`--sync-only` to delegated providers. Also do not pass `--script*`, +`--fresh-pr`, `--full-resync`, or `--env-helper` there. Crabbox rejects these +because the provider owns sync or command transport. `--keep-on-failure` is OK +for delegated one-shots when you need to inspect a failed lease. ## Efficient Bug E2E Verification @@ -250,6 +268,8 @@ Keep it efficient: - Use `--fresh-pr ` when validating an upstream PR in isolation from the local dirty tree. Add `--apply-local-patch` only when testing a local fixup on top of that PR. +- Use `--full-resync` before replacing a warmed direct-provider lease when the + remote workdir or sync fingerprint appears stale. - Use one-shot Crabbox for a single proof; use a reusable Testbox only when several commands must share built images, installed packages, or live state. - Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a @@ -345,10 +365,17 @@ Useful WebVNC commands: ```sh ../crabbox/bin/crabbox webvnc --provider hetzner --id --open -../crabbox/bin/crabbox webvnc --provider hetzner --id --daemon --open -../crabbox/bin/crabbox webvnc --provider hetzner --id --status -../crabbox/bin/crabbox webvnc --provider hetzner --id --stop -../crabbox/bin/crabbox screenshot --provider hetzner --id --output desktop.png +../crabbox/bin/crabbox webvnc daemon start --provider hetzner --id --open +../crabbox/bin/crabbox webvnc daemon status --provider hetzner --id +../crabbox/bin/crabbox webvnc daemon stop --provider hetzner --id +../crabbox/bin/crabbox webvnc status --provider hetzner --id +../crabbox/bin/crabbox webvnc reset --provider hetzner --id --open +../crabbox/bin/crabbox desktop doctor --provider hetzner --id +../crabbox/bin/crabbox desktop click --provider hetzner --id --x 640 --y 420 +../crabbox/bin/crabbox desktop paste --provider hetzner --id --text "user@example.com" +../crabbox/bin/crabbox desktop key --provider hetzner --id ctrl+l +../crabbox/bin/crabbox artifacts collect --id --all --output artifacts/ +../crabbox/bin/crabbox artifacts publish --dir artifacts/ --pr ``` `desktop launch --webvnc --open` is usually the nicest one-shot: it starts the @@ -383,7 +410,9 @@ Common Crabbox-only failures: - Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the printed Actions URL. Large sync warnings now include top source directories by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect - those before reaching for `--force-sync-large`. + those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH + timeouts now print `next_action=` hints; follow them, usually `--full-resync` + first and a fresh lease second. - Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you created. - Testbox queued/capacity pressure: do not convert a broad changed gate or full @@ -525,7 +554,10 @@ crabbox run --id --shell -- 'DISPLAY=:99 xdotool search --onlyvisible -- crabbox status --id --wait crabbox inspect --id --json crabbox sync-plan +crabbox history --limit 20 crabbox history --lease +crabbox attach +crabbox events --json crabbox logs crabbox results crabbox cache stats --id