Problem
When TEST_SUITE=full, the beforeAll in brev-e2e.test.js spends ~6 min (358s) running nemoclaw onboard to create a sandbox. The full test suite (test-full-e2e.sh) then immediately destroys that sandbox in Phase 0 and runs install.sh --non-interactive, which creates its own sandbox from scratch.
This wastes ~6 min of compute on every full E2E run.
Evidence
From run 23957694836:
[408s] Starting nemoclaw onboard in background...
[766s] Sandbox e2e-test is Ready! ← 6 min (358s) spent building sandbox
[799s] beforeAll complete
=== Phase 0: Pre-cleanup ===
Deleting sandbox 'e2e-test'... ← immediately destroyed
✓ Sandbox 'e2e-test' destroyed
=== Phase 2: Install nemoclaw ===
...install.sh --non-interactive... ← builds a new one from scratch
Total run: 1121s (18.7 min). The beforeAll onboard accounts for 358s (32%) of that time.
Proposed Fix
Skip the beforeAll onboard when TEST_SUITE=full. Other suites (credential-sanitization, telegram-injection, all) still need it and are unaffected.
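A minimal sketch of the gate, assuming the suite name is read from process.env.TEST_SUITE. The helper name shouldOnboardInBeforeAll and the runOnboard() placeholder are hypothetical and are not taken from brev-e2e.test.js:

```javascript
// Hypothetical helper: decides whether beforeAll should run `nemoclaw onboard`.
// Assumption: the suite name arrives via the TEST_SUITE environment variable.
function shouldOnboardInBeforeAll(suite) {
  // The full suite deletes any existing sandbox in Phase 0 and reinstalls,
  // so a sandbox built in beforeAll would be thrown away.
  return suite !== "full";
}

// Sketch of how the test file might use it (runOnboard is a placeholder):
// beforeAll(async () => {
//   if (shouldOnboardInBeforeAll(process.env.TEST_SUITE)) {
//     await runOnboard();
//   }
// });
```

With this shape, an unset TEST_SUITE still onboards, so only the full suite changes behavior.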
Expected impact
| Metric | Before | After |
| --- | --- | --- |
| TEST_SUITE=full duration | ~18.7 min | ~13-14 min |
| Other suites | unchanged | unchanged |
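The arithmetic behind these numbers can be checked with a short sketch using only the two figures reported for run 23957694836. Straight subtraction gives a best case of about 12.7 min, so the ~13-14 min estimate presumably leaves some margin for run-to-run variance (that reading is an assumption):

```javascript
// Figures as reported for run 23957694836.
const totalSeconds = 1121;   // whole run
const onboardSeconds = 358;  // beforeAll onboard

// Share of the run spent in the beforeAll onboard, rounded to a whole percent.
const onboardShare = Math.round((onboardSeconds / totalSeconds) * 100);

// Best-case duration if the onboard is skipped entirely.
const projectedMinutes = (totalSeconds - onboardSeconds) / 60;

console.log(`onboard share: ${onboardShare}%`);                        // onboard share: 32%
console.log(`projected duration: ${projectedMinutes.toFixed(1)} min`); // projected duration: 12.7 min
```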
Side effect (improvement)
The full test currently runs in a "resume after interrupted onboard" state (logs show [resume] Skipping preflight (cached)). After the fix, it tests a true clean install, which is more realistic and more valuable.