Harden live-run reliability from AP-939 test #243
Walkthrough
This pull request introduces a comprehensive multi-layered system overhaul spanning backend display events, workflow state durability, and complete web UI redesign. The change adds provider-neutral display-event parsing infrastructure (Codex, fallback adapters) that converts raw agent output into timeline events; implements durable steering advisory recording with delivery metadata and writer-targeted injection; standardises writer behaviour via a lifecycle contract embedded across synthesis, fixes, and feedback phases; improves state initialisation with ...
Superseded by clean branch PR #244. The first branch was based on the already-merged UI PR history, so this keeps the reliability diff reviewable.
Summary
- `origin/main` before worktree reset/create, and fail clearly if no base ref is available
- `planning/ap-939-live-run-test-fixes.md`
Root Cause
AP-939 exposed Cube-owned workflow responsibilities leaking into agents or UI state. Writers were instructed to sync, stage, commit, and push inside a sandbox that cannot write shared Git worktree metadata. The dashboard also depended on completed phase writes, so active/parked runs could be invisible or stale. Finally, the all-approved peer-review path created a PR in Phase 8 and exited before Phase 10 could mark PR/conformance complete.
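The two workflow problems described above (state visible only after a phase completes, and an early exit on the all-approved path) can be illustrated with a minimal sketch. Everything here is hypothetical: `RunState`, `run_phases`, and the handler names are illustrative stand-ins, not Cube's actual API, and the state is kept in memory rather than persisted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RunState:
    """Hypothetical stand-in for the persisted workflow state."""
    current_phase: Optional[int] = None
    started: List[int] = field(default_factory=list)
    completed: List[int] = field(default_factory=list)
    all_approved: bool = False


Handler = Callable[[RunState], None]


def run_phases(state: RunState, handlers: Dict[int, Handler]) -> RunState:
    """Run handlers in phase order, marking each phase started first."""
    for phase in sorted(handlers):
        # Record the phase before its handler runs, so a reader of the
        # state (for example a dashboard) can see an in-progress run.
        state.current_phase = phase
        state.started.append(phase)
        handlers[phase](state)
        state.completed.append(phase)
        # Deliberately no early return when state.all_approved is set:
        # later phases still run so PR/conformance state can complete.
    return state


def peer_review(state: RunState) -> None:
    # Illustrative phase 8: every reviewer approved.
    state.all_approved = True


def noop(state: RunState) -> None:
    # Illustrative placeholder for phases 9 and 10.
    pass


final = run_phases(RunState(), {8: peer_review, 9: noop, 10: noop})
print(final.completed)  # prints [8, 9, 10]
```

In a real workflow the "started" write would go to durable storage rather than an in-memory list; the point of the sketch is only the ordering of the write relative to the handler and the absence of an early exit.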
Validation
- `rtk pytest -q tests/core/test_state.py tests/core/test_writer_contract.py tests/core/test_display_events.py tests/cli/test_adapters.py tests/cli/test_single_writer.py tests/ui/test_run_snapshots.py tests/cli/test_orchestrate_handlers.py tests/core/test_git_refresh.py`
- `rtk ruff check python/cube tests`
- `rtk mypy python/cube`
- `rtk pytest -q tests`
- `rtk git diff --check`
Harden Live-Run Reliability from AP-939 Test
State Visibility & Persistence
- `ensure_state()` early in the workflow, persisting initial state before orchestration begins
- `start_phase()` before phase handlers execute, preventing dashboard lag
Workflow Phase Routing
- `all_approved=True` and allow phases 9/10 to run, ensuring PR/conformance state completes properly
Writer Lifecycle Contract
- `WRITER_LIFECYCLE_CONTRACT` module defines shared Cube-owned responsibilities: Cube handles base refresh, verify, staging, commits, and pushes; agents only edit and inspect Git
- `ensure_writer_lifecycle_contract()`, replacing conflicting hardcoded instructions
- `target_agent_id`, `target_agent_label`, `delivery`) when steers are injected
Git Operations
- `refresh_origin_main()` helper refreshes `refs/remotes/origin/main` before worktree operations, with graceful fallback to existing ref or clear failure when unavailable
Steering Injection Durability
- `steering-injections.jsonl` with record metadata (target agent, delivery type, timing) via `record_injected_steers()`
Dashboard & UI Redesign
- `DisplayEventAdapter` abstracts Codex/fallback parsers, generating normalised `AgentDisplayEvent` objects
- `/tasks/snapshots` and `/tasks/{id}/snapshot` endpoints return comprehensive run state snapshots (workflow steps, verify status, review cycles, PR/conformance, communications)
- `AgentTimeline` component) renders merged writer/steerer messages with grouped tool events and optional debug output
- `ds` component library (Container, Navbar, Badge, StatusPill, etc.)
Diagnostics & Noise Filtering
Planning Documentation
- `planning/ap-939-live-run-test-fixes.md` (14 observed operational/UI/workflow issues with reproduction, impact, and candidate fixes)
Test Coverage
Extended validation across state management, Git refresh, writer contracts, display events, steering injection, handler orchestration, UI snapshots, and E2E flows.
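As an illustration of the kind of behaviour the Git refresh tests exercise, the sketch below shows a refresh-then-fallback helper. It is not the PR's `refresh_origin_main()` implementation: the function name `refresh_base_ref`, its return values, and the injectable `run` parameter are assumptions made so the logic can be tested without a real repository. The only Git commands used are `git fetch` with an explicit refspec and `git rev-parse --verify --quiet`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def _run_git(args: Sequence[str]) -> int:
    """Default runner: invoke git in the current directory."""
    return subprocess.run(["git", *args], capture_output=True).returncode


def refresh_base_ref(
    run: Runner = _run_git,
    ref: str = "refs/remotes/origin/main",
) -> str:
    """Hypothetical helper: refresh `ref`, or fall back to what exists.

    Returns "refreshed" if the fetch succeeded, "existing" if the fetch
    failed but the ref is already present locally, and raises
    RuntimeError if no base ref is available at all.
    """
    # Try to update the remote-tracking ref from origin.
    if run(["fetch", "origin", f"+refs/heads/main:{ref}"]) == 0:
        return "refreshed"
    # Fetch failed (for example, offline): accept a ref we already have.
    if run(["rev-parse", "--verify", "--quiet", ref]) == 0:
        return "existing"
    # Nothing usable: fail clearly instead of building on a missing base.
    raise RuntimeError(f"no base ref available: {ref}")
```

Passing a fake `run` callable lets a test cover all three outcomes (successful fetch, fallback to the existing ref, and clear failure) without network access.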