Skip to content

fix(verify-gate): keep TEST-PROOF-* on sorry-grep — lake build exceeds gate budget#230

Merged
avrabe merged 2 commits into
mainfrom
chore/v0.10.x-re-revert-lake-build
May 19, 2026
Merged

fix(verify-gate): keep TEST-PROOF-* on sorry-grep — lake build exceeds gate budget#230
avrabe merged 2 commits into
mainfrom
chore/v0.10.x-re-revert-lake-build

Conversation

@avrabe
Copy link
Copy Markdown
Contributor

@avrabe avrabe commented May 18, 2026

Summary

Reverts #227's lake build step back to sorry-grep + file-exists. The verify-gate doesn't cache Mathlib build artifacts, so cold-compile takes >60min — past even #229's bumped timeout.

Why this changed mid-cycle

Right structural decision

The verify-gate's job is drift detection, not exhaustive verification:

  • ! grep -E '^[[:space:]]*sorry[[:space:]]*(--.*)?$' Latency.lean catches "agent accidentally left a sorry" — the most likely failure mode for a proof artifact.
  • Full type-checking is the dedicated "Lean proof typecheck (lake build)" workflow's job — and that workflow already caches Mathlib.

Two checks on the same source, two budgets, one shared trust boundary. The verify-gate is the fast layer; the dedicated workflow is the slow layer.

Future work

A actions/cache@v4 step on proofs/.lake/build (with key based on proofs/lakefile.toml + proofs/lean-toolchain hashes) inside the verify-gate workflow would let us restore lake build cleanly. Smithy team has the bigger picture on whether caching is added at workflow level or at sccache-style runner level.

Test plan

  • rivet validate clean
  • Verify-gate finishes in <10min on this PR
  • CI green

Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com

…s gate budget

PR #227 reverted TEST-PROOF-* steps to `cd proofs && lake build` on the
assumption smithy's ELAN_HOME deploy made it possible. It does work —
`lake` runs cleanly — but the lake build against Mathlib on a cold
elan cache routinely exceeds 30 minutes. PR #229 bumped the gate budget
to 60min and the FIRST PR after #229 ALSO hit the new ceiling.

The dedicated "Lean proof typecheck (lake build)" CI workflow has
`actions/cache@v4` on `proofs/.lake/build`, so its run completes
in seconds on warm cache. The rivet verification gate doesn't cache
that path, so it pays the full Mathlib cold-compile cost every run.

Right structural fix: the verify-gate's purpose is "surface drift
between YAML claims and reality". A sorry-free grep on the proof file
IS a meaningful drift check: it catches "agent accidentally left a
sorry". Full type-checking is the dedicated workflow's job — defense
in depth, separation of concerns.

Reverts to the sorry-grep + file-exists pattern from before #227.
Future work: add lake cache to verify-gate so this restriction lifts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@avrabe avrabe enabled auto-merge (squash) May 18, 2026 18:37
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 18, 2026

Rivet verification gate

19/19 passed

count
Passed 19
Failed 0
Skipped (no steps) 0

Filter: (and (= type "feature") (or (has-tag "v093") (has-tag "v0100")))

Failed artifacts

(none)

Updated automatically by tools/post_verification_comment.py. Source of truth: artifacts/verification.yaml.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@avrabe avrabe merged commit d902bf2 into main May 19, 2026
17 of 18 checks passed
@avrabe avrabe deleted the chore/v0.10.x-re-revert-lake-build branch May 19, 2026 07:29
avrabe added a commit that referenced this pull request May 20, 2026
v0.10.0 ships:

**Mermaid emission (M1 + M2 + M3)**
- spar-mermaid foundation crate with `emit_flowchart` (#220)
- `spar emit --format mermaid` CLI subcommand (#222)
- `emit_class_diagram` + `emit_requirement_diagram` + matching CLI
  flags `--format mermaid-class` / `mermaid-req` (#228)

**Soundness deepening**
- Lean 4 sorry-free proofs of end-to-end latency monotonicity and
  ARINC 653 partition isolation, alongside the pre-existing RTA / EDF /
  Network Calculus proofs (#223)
- Kani BMC harnesses on generated-code AADL contract preservation
  (thread Period, port Direction, bus access right) — spar's
  Logika-equivalent strategy for verified codegen (#224)

**Safety analysis**
- EMV2 error-propagation traversal across the AADL connection graph
  (closes the #1 gap vs OSATE/HAMR in safety-case reviews) (#225)

**Verification infrastructure**
- Rivet-driven verification gate that executes every artifact's
  `fields.steps[].run` commands and posts a sticky PR comment with
  pass/fail counts and failed artifact IDs (#221)
- Workflow tuning: gate timeout 30→60 min for future Mathlib-heavy
  workloads; TEST-PROOF-* stay on sorry-grep until lake cache lands
  (#227, #229, #230)

**Chore**
- Pruned stale dev artifacts (.playwright-mcp logs + dashboard-render
  PNGs) and tightened gitignore (#226)

Bumps Cargo.toml + vscode-spar/package.json from 0.9.3 → 0.10.0.
The release workflow's `check-versions` job enforces tag/Cargo/vsix
agreement, so these must move together with the v0.10.0 tag push.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant