docs: update crabbox skill guidance

This commit is contained in:
Peter Steinberger
2026-05-13 15:46:50 +01:00
parent 418d7afb33
commit 0513b285ef

View File

@@ -31,6 +31,10 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
- Check `.crabbox.yaml` for repo defaults, but override provider explicitly. - Check `.crabbox.yaml` for repo defaults, but override provider explicitly.
Even if config still says AWS, maintainer validation should normally pass Even if config still says AWS, maintainer validation should normally pass
`--provider blacksmith-testbox`. `--provider blacksmith-testbox`.
- If a warm direct-provider lease smells stale, retry with `--full-resync`
(alias `--fresh-sync`) before replacing the lease. This resets the remote
workdir, skips the fingerprint fast path, reseeds Git when possible, and
uploads the checkout from scratch.
- For live/provider bugs, use the configured secret workflow before downgrading - For live/provider bugs, use the configured secret workflow before downgrading
to mocks. Copy only the exact needed key into the remote process environment to mocks. Copy only the exact needed key into the remote process environment
for that one command. Do not print it, do not sync it as a repo file, and do for that one command. Do not print it, do not sync it as a repo file, and do
@@ -163,7 +167,13 @@ Use these on debugging runs before inventing ad hoc logging:
executing it, then forwards only allowlisted names. `--allow-env` is executing it, then forwards only allowlisted names. `--allow-env` is
repeatable and comma-separated. Profile values override ambient allowlisted repeatable and comma-separated. Profile values override ambient allowlisted
env values for that run. Direct POSIX, WSL2, and native Windows runs are env values for that run. Direct POSIX, WSL2, and native Windows runs are
supported; delegated providers are not. supported; delegated providers are not. Crabbox probes the uploaded profile
remotely and prints redacted presence/length metadata before the command.
- `--env-helper <name>`: with `--env-from-profile` on POSIX SSH targets,
persists `.crabbox/env/<name>` and `.crabbox/env/<name>.env` so follow-up
commands on the same lease can run through `./.crabbox/env/<name> <command>`.
Use only on leases you control; the profile stays until cleanup, lease reset,
or `--full-resync`.
- `--script <file>` / `--script-stdin`: upload a local script into - `--script <file>` / `--script-stdin`: upload a local script into
`.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute `.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute
directly on POSIX; scripts without a shebang run through `bash`. Native directly on POSIX; scripts without a shebang run through `bash`. Native
@@ -173,12 +183,19 @@ Use these on debugging runs before inventing ad hoc logging:
fresh remote checkout of the GitHub PR. Bare numbers use the current repo's fresh remote checkout of the GitHub PR. Bare numbers use the current repo's
GitHub origin. Add `--apply-local-patch` only when the current local GitHub origin. Add `--apply-local-patch` only when the current local
`git diff --binary HEAD` should be applied on top of that PR checkout. `git diff --binary HEAD` should be applied on top of that PR checkout.
- `--full-resync` / `--fresh-sync`: reset a stale direct-provider workdir
before syncing. Use after sync fingerprints look wrong, SSH times out before
sync, or rsync watchdog output suggests it. It is redundant with
`--fresh-pr`, incompatible with `--no-sync`, and unsupported by delegated
providers.
- `--capture-stdout <path>` / `--capture-stderr <path>`: write remote streams to - `--capture-stdout <path>` / `--capture-stderr <path>`: write remote streams to
local files and keep binary/noisy output out of retained logs. Parent local files and keep binary/noisy output out of retained logs. Parent
directories must already exist. These are direct-provider only. directories must already exist. These are direct-provider only.
- `--capture-on-fail`: on non-zero direct-provider exits, downloads - `--capture-on-fail`: on non-zero direct-provider exits, downloads
`.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`, `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`,
`coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed. `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed.
- `--keep-on-failure`: leave a failed one-shot lease alive for live debugging
until idle/TTL expiry. Useful on direct providers and delegated one-shots.
- `--timing-json`: final machine-readable timing. Add - `--timing-json`: final machine-readable timing. Add
`echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell
commands; direct providers and Blacksmith Testbox both report them as commands; direct providers and Blacksmith Testbox both report them as
@@ -200,9 +217,10 @@ pnpm crabbox:run -- --provider aws \
``` ```
Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or
`--sync-only` to delegated providers. Also do not pass `--script*` or `--sync-only` to delegated providers. Also do not pass `--script*`,
`--fresh-pr` there. Crabbox rejects these because the provider owns sync or `--fresh-pr`, `--full-resync`, or `--env-helper` there. Crabbox rejects these
command transport. because the provider owns sync or command transport. `--keep-on-failure` is OK
for delegated one-shots when you need to inspect a failed lease.
## Efficient Bug E2E Verification ## Efficient Bug E2E Verification
@@ -250,6 +268,8 @@ Keep it efficient:
- Use `--fresh-pr <pr>` when validating an upstream PR in isolation from the - Use `--fresh-pr <pr>` when validating an upstream PR in isolation from the
local dirty tree. Add `--apply-local-patch` only when testing a local fixup on local dirty tree. Add `--apply-local-patch` only when testing a local fixup on
top of that PR. top of that PR.
- Use `--full-resync` before replacing a warmed direct-provider lease when the
remote workdir or sync fingerprint appears stale.
- Use one-shot Crabbox for a single proof; use a reusable Testbox only when - Use one-shot Crabbox for a single proof; use a reusable Testbox only when
several commands must share built images, installed packages, or live state. several commands must share built images, installed packages, or live state.
- Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a - Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a
@@ -345,10 +365,17 @@ Useful WebVNC commands:
```sh ```sh
../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --open ../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --open
../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --daemon --open ../crabbox/bin/crabbox webvnc daemon start --provider hetzner --id <cbx_id-or-slug> --open
../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --status ../crabbox/bin/crabbox webvnc daemon status --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --stop ../crabbox/bin/crabbox webvnc daemon stop --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox screenshot --provider hetzner --id <cbx_id-or-slug> --output desktop.png ../crabbox/bin/crabbox webvnc status --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox webvnc reset --provider hetzner --id <cbx_id-or-slug> --open
../crabbox/bin/crabbox desktop doctor --provider hetzner --id <cbx_id-or-slug>
../crabbox/bin/crabbox desktop click --provider hetzner --id <cbx_id-or-slug> --x 640 --y 420
../crabbox/bin/crabbox desktop paste --provider hetzner --id <cbx_id-or-slug> --text "user@example.com"
../crabbox/bin/crabbox desktop key --provider hetzner --id <cbx_id-or-slug> ctrl+l
../crabbox/bin/crabbox artifacts collect --id <cbx_id-or-slug> --all --output artifacts/<slug>
../crabbox/bin/crabbox artifacts publish --dir artifacts/<slug> --pr <number>
``` ```
`desktop launch --webvnc --open` is usually the nicest one-shot: it starts the `desktop launch --webvnc --open` is usually the nicest one-shot: it starts the
@@ -383,7 +410,9 @@ Common Crabbox-only failures:
- Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the - Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
printed Actions URL. Large sync warnings now include top source directories printed Actions URL. Large sync warnings now include top source directories
by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
those before reaching for `--force-sync-large`. those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH
timeouts now print `next_action=` hints; follow them, usually `--full-resync`
first and a fresh lease second.
- Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you - Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you
created. created.
- Testbox queued/capacity pressure: do not convert a broad changed gate or full - Testbox queued/capacity pressure: do not convert a broad changed gate or full
@@ -525,7 +554,10 @@ crabbox run --id <lease> --shell -- 'DISPLAY=:99 xdotool search --onlyvisible --
crabbox status --id <id-or-slug> --wait crabbox status --id <id-or-slug> --wait
crabbox inspect --id <id-or-slug> --json crabbox inspect --id <id-or-slug> --json
crabbox sync-plan crabbox sync-plan
crabbox history --limit 20
crabbox history --lease <id-or-slug> crabbox history --lease <id-or-slug>
crabbox attach <run_id>
crabbox events <run_id> --json
crabbox logs <run_id> crabbox logs <run_id>
crabbox results <run_id> crabbox results <run_id>
crabbox cache stats --id <id-or-slug> crabbox cache stats --id <id-or-slug>