fix #307: record_test_baseline timeout is fail-safe, not fail-open by azalio · Pull Request #315 · azalio/map-framework

azalio · 2026-06-30T06:12:43Z

Summary

Fixes #307.

record_test_baseline defaulted to a 120 s subprocess timeout. On any repo whose full suite takes longer, the process was killed and the function returned {"status":"baseline_failures","baseline_failures":[]} — an empty failures list that was indistinguishable from a genuinely clean run. Any pre-existing failure was silently treated as "not pre-existing", making the regression-vs-pre-existing gate unreliable.

Changes

record_test_baseline

status is now "timed_out" (instead of "baseline_failures") when the subprocess is killed by the timeout, giving callers a distinct value to branch on.
New baseline_complete: bool field — false when the run timed out, true otherwise. Downstream code should check this before trusting an empty baseline_failures list.
Default timeout_seconds raised from 120 → 600 (10 min) so most test suites complete in time; the --timeout CLI flag still accepts an explicit override.

list_baseline_failures

Propagates baseline_complete and timed_out from the stored JSON.
Adds a "warning" key (human-readable explanation) when the stored baseline is incomplete, so callers that log the full report surface the problem automatically.

Tests (tests/test_map_step_runner.py)

test_baseline_timeout_is_unknown_not_clean — monkeypatches subprocess.run to raise TimeoutExpired, verifies status=="timed_out", baseline_complete==False, and that list_baseline_failures propagates the warning.
test_baseline_complete_true_on_normal_run — verifies baseline_complete==True and no warning key on a successful run.

All 3 225 existing tests pass; ruff, pyright, and make check-render are clean.

Generated by Claude Code

Summary by CodeRabbit

Bug Fixes
- Increased the baseline test timeout to give long-running suites more time to finish.
- Improved baseline reporting to clearly distinguish between a timeout and actual test failures.
- Added clearer warnings when baseline results are incomplete, so empty failure lists are not mistaken for a clean run.
- Baseline results now include completion and timeout status for better visibility.

When record_test_baseline times out the subprocess never finishes, so baseline_failures was always [] — indistinguishable from a genuinely clean suite. Any pre-existing failure was silently treated as "not pre-existing", defeating the regression-vs-pre-existing distinction. Changes: - status is now "timed_out" (not "baseline_failures") when the subprocess times out, so the two cases have distinct values - new baseline_complete: bool field — false on timeout, true otherwise; downstream code can check this before trusting an empty baseline - list_baseline_failures propagates baseline_complete, timed_out, and emits a "warning" key when the stored baseline is incomplete - default timeout_seconds raised from 120 → 600 (10 min) to give most suites room to finish; --timeout flag still accepts an explicit value - two regression tests added: one for the timeout path, one asserting baseline_complete is true on a normal run Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_015ncF4ESXM8YoxbH4U9VyYY

coderabbitai · 2026-06-30T06:13:01Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 64c63171-1e3d-48c6-a4d1-82a67b510b15

📥 Commits

Reviewing files that changed from the base of the PR and between 0502a1d and c373fd3.

📒 Files selected for processing (4)

.map/scripts/map_step_runner.py
src/mapify_cli/templates/map/scripts/map_step_runner.py
src/mapify_cli/templates_src/map/scripts/map_step_runner.py.jinja
tests/test_map_step_runner.py

📝 Walkthrough

Walkthrough

record_test_baseline default timeout increases from 120s to 600s. On timeout, status is now timed_out instead of baseline_failures, and the saved payload gains baseline_complete and timed_out fields. list_baseline_failures reads those fields and emits a warning when the baseline timed out. Changes are mirrored across the jinja source, rendered template, live script, and covered by two new regression tests.

Baseline Timeout Handling

Layer / File(s)	Summary
Jinja source: timeout logic in record/list functions `src/mapify_cli/templates_src/map/scripts/map_step_runner.py.jinja`	Default `timeout_seconds` raised to 600; `record_test_baseline` now sets `status="timed_out"` on `TimeoutExpired` and writes `baseline_complete`/`timed_out` into the payload; `list_baseline_failures` reads those fields and adds a `warning` when `timed_out` is true; CLI default also raised to 600.
Rendered template mirror `src/mapify_cli/templates/map/scripts/map_step_runner.py`	Identical logic changes applied to the rendered copy.
Live .map script mirror `.map/scripts/map_step_runner.py`	Identical logic changes applied to the live deployed copy.
Regression tests `tests/test_map_step_runner.py`	Two tests added: timeout path asserts `status="timed_out"`, `baseline_complete=False`, `timed_out=True`, empty `baseline_failures`, and warning from `list_baseline_failures`; normal path asserts `baseline_complete=True`, `timed_out=False`, no warning.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 The clock used to tick at two minutes flat,
Now ten whole minutes — imagine that!
timed_out is noted, baseline_complete set,
No silent clean slate from a suite that's not met.
Warnings hop loudly when things don't conclude ~

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/compassionate-cerf-6l674i

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

azalio merged commit a650894 into main Jun 30, 2026
1 of 2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix #307: record_test_baseline timeout is fail-safe, not fail-open#315

fix #307: record_test_baseline timeout is fail-safe, not fail-open#315
azalio merged 1 commit into
mainfrom
claude/compassionate-cerf-6l674i

azalio commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading

Review failed

Walkthrough

Estimated code review effort

Poem

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

azalio commented Jun 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Summary by CodeRabbit

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Estimated code review effort

Poem

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

azalio commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading