v1.33.2.0 fix: parser kind field, PhaseKind cleanup, critical-exit lock leaks#1444
Open
anbangr wants to merge 214 commits into
Conversation
# Conflicts: # README.md
- Adds implement/SKILL.md.tmpl to execute plans in phases - Updates GSTACK_PLAYBOOK.md to include the new workflow
…s review and ship, add implement reexamine mode
… and sonnet for review/qa
…erative fix, and deployment
… review instead of /review
Remove the `--skip-sweep` flag and the unshipped feat/* sweep bullet from the Startup Gates section and flags table. Aligns with the code removal in 3e2b8b2. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Adds mock configure.cm file to prevent jq from failing in Step M3.5 mock
…11-074503-fabe4c3f-4-e2e-test-touchfile-registration'
1. plan-selection (6 tests): `defaultActiveRunRegistryDir()` hardcoded `~/.gstack/build-state/active-runs` and ignored `GSTACK_BUILD_STATE_DIR`, causing 11 real active-run records to leak into unit tests and inflate candidate counts (turning expected "selected" into "ambiguous"). Fix: honour the env var consistently, the same way `state.ts` already does. 2. integration (3 tests): plan review subprocess called `codex` with `OPENAI_API_KEY` from the inherited `process.env`, triggering a real ~30s API call against the LLM. These tests exercise feature lifecycle, not plan review. Fix: add `--no-plan-review` to each CLI invocation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…estSpec detection Four improvements identified during code review of 3e2b8b2: - Move `extractCoverageTarget` from cli.ts to sub-agents.ts (alongside parseCoveragePercent); re-export via import in cli.ts. Eliminates the circular-import risk when phase-runner.ts calls coverage functions. - Fix decimal truncation in extractCoverageTarget: `(\d+)` only matched integers, silently returning 80 for targets like ≥90.5%. Changed to `([\d.]+)` + parseFloat. - Fix `hasTestSpec` detection in buildGeminiTestSpecPrompt: was `phase.body.includes("#### Test Spec")` (fragile string match, false negative when body text differs). Now `phase.testSpecCheckboxLine !== -1` (parser already computes this — zero extra overhead). - Wire coverage gate in RUN_TESTS handler: after GREEN tests pass and the phase has a test spec (`testSpecCheckboxLine !== -1`), call parseCoveragePercent(result.stdout, testCmd) and compare against extractCoverageTarget(phase.body). Below target → set coverageResult and route to test_fix_running. Unknown framework → log advisory, proceed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Complete the coverage gate: `injectCoverageFlags(testCmd)` appends the appropriate flag for the detected framework before the GREEN test run, so `parseCoveragePercent` reliably finds coverage data in stdout even when projects don't pre-configure coverage in their test script. Framework → flag mapping: jest → --coverage --coverageReporters text vitest → --coverage bun test → --coverage pytest → --cov --cov-report term-missing go test → -cover unknown → unchanged (advisory log, gate skips) Injection is idempotent (no-op if flag already present) and only fires when the phase has a test spec (testSpecCheckboxLine !== -1) — VERIFY_RED and legacy phases use the bare test command unchanged. 11 unit tests added covering each framework, idempotency, and unknowns. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
`phase.kind !== "code" ? "" : ""` always evaluated to "" regardless of the condition, and was silently filtered by .filter(Boolean). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d, document sweep removal
…p (Bug D1) Two failing tests document the bug: 1. After CRITICAL verdict, state.planReview must be persisted with status "critical_exit_pending" — currently cli.ts does not persist anything before process.exit(3), so planReview stays undefined on disk. 2. On resume with the sentinel set, the plan-review gate must still fire — the current guard (!state.planReview) is false when planReview is truthy, so the gate is skipped after the sentinel is introduced. Two GREEN tests confirm baseline behavior: APPROVE verdict suppresses the gate; undefined planReview (first run) fires the gate. Tests MUST fail until Feature 4 implementation lands. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Before this fix, a CRITICAL plan-review verdict caused process.exit(3)
without saving any sentinel to state. On resume, !state.planReview was
true → review ran again → CRITICAL again → infinite loop.
Fix:
1. Save state.planReview = { ...verdict, status: "critical_exit_pending" }
before releaseLock + process.exit(3) so the sentinel survives on disk.
2. Widen the plan-review gate guard from !state.planReview to
!state.planReview || state.planReview.status === "critical_exit_pending"
so the gate re-fires on resume when the sentinel is present.
Tests: two new tests in phase-runner.test.ts cover both the sentinel
persistence and the widened gate; 90/90 passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g D2) Introduces ExitError (errors.ts) — thrown instead of process.exit(N) inside try/finally blocks so the finally clause runs cleanup before the process terminates. Changes: - errors.ts: new ExitError class (instanceof Error, numeric code field) - cli.ts: import ExitError; replace critical_exit process.exit(3) with throw new ExitError(3); update main().catch to call process.exit(err.code) when err instanceof ExitError - phase-runner.test.ts: 5 new tests (ExitError shape, propagation through finally, default and custom messages); 95/95 passing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ature 6) applyResult() now populates phaseState.coverageResult when: - action is RUN_TESTS - tests are GREEN (status = "tests_green") - extra.phaseBody is provided - parseCoveragePercent() returns a non-null value for the stdout Coverage below target emits an advisory warning but keeps status "tests_green" — not blocking. The target defaults to 80 when no "**Coverage target: ≥N%**" line appears in the phase body. 6 new tests in phase-runner.test.ts; 101/101 passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ics + test assertions - Add errors.ts to MODULE_TEST_OWNERS in coverage-matrix.test.ts - Fix analytics logActivity to emit "success" for exit code 13 (FINALIZATION_REQUIRED), which is a success state (pending ship), not a failure - Fix integration test assertions: --skip-ship correctly exits 13, not 0, when features reach origin_verified (pre-existing test/impl mismatch)
…d [Phase 1.1] RED phase TDD: 11 tests fail because the parser does not yet stamp kind: "code" on emitted phases, and existing Phase literal construction sites have no kind field (undefined fails the VALID_KINDS.includes runtime assertion). 11 tests pass immediately: direct Phase construction with explicit kind values, and PhaseKind union membership checks (both already exist in types.ts). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add required kind: PhaseKind field to the parser factory init and to every Phase literal construction site in tests/fixtures. This ensures backward-compatible default of kind: "code" for all existing phases while the type system enforces correctness going forward. - parser.ts: stamp kind: "code" on every emitted Phase - state.test.ts, cli.test.ts, phase-runner.test.ts, feature-review.test.ts, cli-guardrails.test.ts, phase-kind.test.ts: add kind: "code" to all helpers and inline literals
…tations - Fix PHASE_HEADING regex to allow optional [kind] bracket between number and colon - Add BODY_KIND_PATTERN for <!-- kind: X --> HTML comment fallback - Add IMPL_LABELS_BY_KIND and REVIEW_LABELS_BY_KIND maps for all 5 PhaseKind values - Parser now stamps kind from heading bracket (primary), body comment (fallback), or defaults to "code" - Inline kind-comment detection ensures kind is set before checkbox processing - Add implCheckboxRe/reviewCheckboxRe for kind-specific checkbox matching - Add 16 new parser tests covering all bracket annotations, HTML fallback, checkbox recognition Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add IMPL_MARKER_BY_KIND and REVIEW_MARKER_BY_KIND lookup tables - Update flipPhaseCheckboxes signature to accept optional kind?: PhaseKind - Derives implMarker/reviewMarker from kind ?? "code" (backward compat) - Update reconcilePhaseCheckboxes to pass phase.kind - Update both cli.ts call sites (lines ~3870, ~4282) to pass kind: phase.kind - Add 9 kind-aware mutator tests covering all 5 kinds + error cases + backward compat Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…EW gates, ship gate
- Fix critical-exit path skipping active-run registry cleanup and telemetry (cli.ts) - Remove dead phase kind branches from build instructions (cli.ts) - Hardcode phase kind to 'code' in parser to clarify intent (parser.ts) - Collapse PhaseKind union to a single literal to match parser reality (types.ts) - Add test coverage for critical_exit releasing the lock and exiting 3 (integration.test.ts) - Remove exitCode 13 override for --skip-ship to fix failing integration tests (cli.ts)
- Fix integration test mock codex script sed pattern to correctly extract output file path (was capturing trailing sentence period) - Remove stale contradictory comment block about exit code 13 in --skip-ship path - Remove dead exitCode === 13 branch in active-run registry update - Remove dead conditional for non-code phase rubric in buildCodexReviewBody All build/orchestrator tests pass (330 tests).
…-storm-20260511-122548-fabe4c3f-3-commit-housekeeping-parser-fix-readme-regenera # Conflicts: # CHANGELOG.md # test/gen-skill-docs.test.ts # test/helpers/touchfiles.ts
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR ships parser fixes, PhaseKind cleanup, and critical-exit reliability improvements for the build orchestrator.
Parser & types
parser.ts— emitkind: p.kind ?? \"code\"on parsed phases (restores backward compat)types.ts— narrowPhaseKindto"code"; makekindoptional for strict-mode fixturesCLI cleanup
switch (phase.kind)branches for writing/experiment/research/manual inbuildKindInstructionsbuildCodexReviewBody--skip-ship; origin-verified features now pause naturallyReliability
critical_exit_pendingsentinel to state instead of rawprocess.exit(3), letting the finally block release the lockTests
integration.test.ts— add--skip-shipresume + origin-verification tests andcritical_exitlock-release testparser.test.ts— add kind-annotation parsing tests and ReferenceError-guard testsBuild skill
build/SKILL.md.tmpl— bump frontmatter version to 1.22.0Repo hygiene
.gitignore— add.llm-tmp/Test Coverage
All new code paths have test coverage.
Tests: 808 pass, 0 fail (build/orchestrator/tests/)
Pre-Landing Review
kindproperty merge artifact inparser.ts--skip-shipexit code (0, not 13)describeblock inparser.test.tsfrom mergeTODOS
No TODO items completed in this PR.
Test plan
🤖 Generated with Claude Code
Need help on this PR? Tag
@codesmithwith what you need.