Harden live-run reliability from AP-939 test#244
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis pull request refactors Cube's writer lifecycle model to centralise state management, defer phase-terminal operations, and enforce consistent ownership semantics across workflows. A new Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/cube/cli.py`:
- Around line 859-866: The ensure_state call unconditionally overwrites stored
run fields (mode/writer_key) on resume; change the logic so before calling
ensure_state you read the current persisted state for task_id (the stored entry
for path "INTAKE") and only pass mode and writer_key through when the user
explicitly requested a change (e.g., single_mode flag set or writer_key
provided) or when the stored values are missing/blank; otherwise preserve the
stored state values and call ensure_state with the preserved mode/writer_key to
avoid clobbering on resume.
In `@python/cube/core/writer_contract.py`:
- Around line 32-35: The current replacement using
text.split(WRITER_LIFECYCLE_START, 1) and rest.split(WRITER_LIFECYCLE_END, 1)
can raise if the lifecycle block is malformed; update the logic in
writer_contract.py to robustly locate the sentinel pair (use text.find/ rfind or
guard the splits) and only perform the in-place replace when both start and end
are found and in correct order (preserve the existing names before, rest, _old,
after, updated, section); if locating/parsing fails, fall back to prepending the
lifecycle section (i.e., set updated = section + text) so corrupted partial
blocks do not break writer startup.
In `@python/cube/ui/routes/tasks.py`:
- Around line 1123-1126: _writer_log_exists currently only matches
writer-...-*.json which misses other valid artefact extensions (.jsonl, .log,
.txt) and causes false "not-started" states; update the function to detect any
file with the writer-{writer_name}-{task_id}-* prefix (e.g.,
LOGS_DIR.glob(f"writer-{writer_name}-{task_id}-*")) or explicitly include the
extensions ('.json','.jsonl','.log','.txt'), and apply the same fix to the
analogous check found around lines referenced (the block at 1078-1085) so all
writer log detection is extension-agnostic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e213a3cf-a91d-4d71-be48-1fbbae8ef981
📒 Files selected for processing (23)
planning/ap-939-live-run-test-fixes.mdpython/cube/automation/judge_panel.pypython/cube/automation/prompts.pypython/cube/automation/verify.pypython/cube/cli.pypython/cube/commands/orchestrate/executor.pypython/cube/commands/orchestrate/handlers.pypython/cube/commands/orchestrate/phases.pypython/cube/commands/orchestrate/prompts.pypython/cube/commands/peer_review.pypython/cube/core/git.pypython/cube/core/parsers/codex.pypython/cube/core/state.pypython/cube/core/writer_contract.pypython/cube/ui/routes/tasks.pytests/cli/test_adapters.pytests/cli/test_orchestrate_handlers.pytests/cli/test_single_writer.pytests/core/test_display_events.pytests/core/test_git_refresh.pytests/core/test_state.pytests/core/test_writer_contract.pytests/ui/test_run_snapshots.py
📜 Review details
🧰 Additional context used
🪛 LanguageTool
planning/ap-939-live-run-test-fixes.md
[style] ~24-~24: To elevate your writing, try using a synonym here.
Context: ...nch showed the real state. - Impact: hard to tell whether the run is active, park...
(HARD_TO)
[formatting] ~65-~65: Insert a comma before quoting reported speech: “says, "”…
Context: ...139 and 226. Cube's parent run log then says "Changes committed and pushed", so the pa...
(SAID_COMMA_SPEECH)
[uncategorized] ~66-~66: Possible missing comma found.
Context: ...s unclear. It also encourages raw git commands despite the repo instruction to use `rt...
(AI_HYDRA_LEO_MISSING_COMMA)
[grammar] ~76-~76: “Run” is a singular noun. It appears that the verb form is incorrect.
Context: ...step as pending. - Evidence: the run log shows unanimous peer-review approval fo...
(PCT_SINGULAR_NOUN_PLURAL_VERB_AGREEMENT)
🪛 Ruff (0.15.14)
python/cube/core/git.py
[error] 28-28: Starting a process with a partial executable path
(S607)
[error] 36-36: Starting a process with a partial executable path
(S607)
🔇 Additional comments (14)
planning/ap-939-live-run-test-fixes.md (1)
1-79: LGTM!tests/core/test_writer_contract.py (1)
9-33: LGTM!python/cube/core/state.py (1)
115-154: LGTM!Also applies to: 182-182
tests/core/test_state.py (1)
8-8: LGTM!Also applies to: 13-13, 84-110
python/cube/cli.py (1)
578-585: LGTM!Also applies to: 743-748
python/cube/commands/orchestrate/executor.py (1)
5-5: LGTM!Also applies to: 60-67
python/cube/commands/orchestrate/handlers.py (1)
626-627: LGTM!python/cube/commands/orchestrate/prompts.py (1)
12-12: LGTM!Also applies to: 23-23, 42-46, 61-61, 97-97, 199-202, 212-212, 237-237
tests/core/test_git_refresh.py (1)
1-45: LGTM!tests/cli/test_orchestrate_handlers.py (1)
1-84: LGTM!tests/cli/test_single_writer.py (1)
9-10: LGTM!Also applies to: 89-90, 93-95, 125-130, 133-177
tests/ui/test_run_snapshots.py (1)
139-156: LGTM!Also applies to: 158-177, 229-229, 262-262
tests/cli/test_adapters.py (1)
245-257: LGTM!tests/core/test_display_events.py (1)
90-97: LGTM!Also applies to: 103-103
|
Actionable comments posted: 0 |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
python/cube/cli.py (1)
862-885:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse the persisted mode/writer for runtime execution, not the pre-override values.
You correctly persist
persisted_modeandpersisted_writer_key, but execution still usessingle_modeandwriter_key. On resume, this can run a different mode than the stored workflow state.Suggested fix
@@ if existing_state is None or explicit_mode_override: persisted_mode = computed_mode persisted_writer_key = writer_key @@ else: persisted_writer_key = None + + effective_single_mode = persisted_mode == "single" + effective_writer_key = persisted_writer_key @@ asyncio.run( orchestrate_auto_command( task_file, resolved_resume_from, task_id=task_id, resume_alias=resume_alias, - single_mode=single_mode, - writer_key=writer_key, + single_mode=effective_single_mode, + writer_key=effective_writer_key, fresh_writer=fresh_writer, fresh_judges=fresh, resume_prompt=prompt, jira_metadata=jira_metadata, ) )Also applies to: 908-916
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/cube/cli.py` around lines 862 - 885, The runtime execution is still using the pre-override variables (single_mode and writer_key) instead of the persisted values, which can cause resumed runs to execute in a different mode than stored; update the places that determine runtime behavior (after load_state and update_phase, and the analogous block around lines 908-916) to use persisted_mode and persisted_writer_key for any downstream decision-making or execution (e.g., task execution, mode checks, writer selection) so the running process honors the persisted state from load_state and update_phase.
🧹 Nitpick comments (1)
python/cube/automation/verify.py (1)
100-100: ⚡ Quick winConsider adding missing git commands for consistency with the main writer prompt.
The prohibition list here omits
git fetch,git merge, andgit pull, which are explicitly prohibited in the main writer prompt (prompts.py lines 44-45). While writers have already seen the full constraints and the most critical commands (add/commit/push) are present, adding these for completeness would reinforce the lifecycle contract consistently across all writer-facing prompts.♻️ Proposed addition for consistency
-**Do not run verify, tests, lint, typecheck, `git add`, `git commit`, or `git push`.** Cube re-runs verify and commits/pushes outside the writer sandbox once you exit. +**Do not run verify, tests, lint, typecheck, `git fetch`, `git merge`, `git pull`, `git add`, `git commit`, or `git push`.** Cube re-runs verify and commits/pushes outside the writer sandbox once you exit.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/cube/automation/verify.py` at line 100, Update the prohibition message in verify.py (the "Do not run verify, tests, lint, typecheck, `git add`, `git commit`, or `git push`." text) to also include `git fetch`, `git merge`, and `git pull` so it matches the main writer prompt constraints; locate the prohibition string in python/cube/automation/verify.py (around the displayed "Do not run verify..." message) and append those three commands to the list for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@python/cube/cli.py`:
- Around line 862-885: The runtime execution is still using the pre-override
variables (single_mode and writer_key) instead of the persisted values, which
can cause resumed runs to execute in a different mode than stored; update the
places that determine runtime behavior (after load_state and update_phase, and
the analogous block around lines 908-916) to use persisted_mode and
persisted_writer_key for any downstream decision-making or execution (e.g., task
execution, mode checks, writer selection) so the running process honors the
persisted state from load_state and update_phase.
---
Nitpick comments:
In `@python/cube/automation/verify.py`:
- Line 100: Update the prohibition message in verify.py (the "Do not run verify,
tests, lint, typecheck, `git add`, `git commit`, or `git push`." text) to also
include `git fetch`, `git merge`, and `git pull` so it matches the main writer
prompt constraints; locate the prohibition string in
python/cube/automation/verify.py (around the displayed "Do not run verify..."
message) and append those three commands to the list for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 98a24c15-bb73-45e0-a605-8cf8c3d2a86f
📒 Files selected for processing (9)
python/cube/automation/verify.pypython/cube/cli.pypython/cube/commands/orchestrate/executor.pypython/cube/commands/orchestrate/phases.pypython/cube/commands/orchestrate/prompts.pypython/cube/core/state.pypython/cube/ui/routes/tasks.pytests/core/test_state.pytests/ui/test_run_snapshots.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/ui/test_run_snapshots.py
- python/cube/commands/orchestrate/phases.py
📜 Review details
🔇 Additional comments (7)
python/cube/core/state.py (1)
126-151: LGTM!tests/core/test_state.py (1)
82-87: LGTM!Also applies to: 98-104
python/cube/commands/orchestrate/executor.py (1)
5-5: LGTM!Also applies to: 60-68
python/cube/commands/orchestrate/prompts.py (3)
40-47: LGTM!
63-63: LGTM!
200-210: LGTM!python/cube/ui/routes/tasks.py (1)
1084-1084: LGTM!Also applies to: 1122-1131
Summary
origin/mainbefore worktree reset/create, and fail clearly if no base ref is availableplanning/ap-939-live-run-test-fixes.mdRoot Cause
AP-939 exposed Cube-owned workflow responsibilities leaking into agents or UI state. Writers were instructed to sync, stage, commit, and push inside a sandbox that cannot write shared Git worktree metadata. The dashboard also depended on completed phase writes, so active/parked runs could be invisible or stale. Finally, the all-approved peer-review path created a PR in Phase 8 and exited before Phase 10 could mark PR/conformance complete.
Validation
rtk pytest -q tests/core/test_state.py tests/core/test_writer_contract.py tests/core/test_display_events.py tests/cli/test_adapters.py tests/cli/test_single_writer.py tests/ui/test_run_snapshots.py tests/cli/test_orchestrate_handlers.py tests/core/test_git_refresh.pyrtk ruff check python/cube testsrtk mypy python/cubertk pytest -q testsrtk git diff --check