Summary
Drift findings are frozen at the initial drift-review phase and never updated as task groups resolve them. By the time later groups run, the coder is told about problems that no longer exist — wasting prompt space and potentially causing confusion.
Evidence
Session 20260624_165149_e74eca, spec 07 (nightshift_standalone_cli):
The drift review (group 0) produced 12 findings describing the pre-implementation state. These same 12 findings were injected unchanged into every coder prompt from group 2 through group 7:
07_nightshift_standalone_cli:2 prompt — 12 drift findings (3,356 chars)
07_nightshift_standalone_cli:3 prompt — 12 drift findings (3,356 chars, identical)
07_nightshift_standalone_cli:4 prompt — 12 drift findings (3,356 chars, identical)
07_nightshift_standalone_cli:5 prompt — 12 drift findings (3,356 chars, identical)
07_nightshift_standalone_cli:6 prompt — 12 drift findings (3,356 chars, identical)
07_nightshift_standalone_cli:7 prompt — 12 drift findings (3,356 chars, identical)
Examples of stale findings injected into the group 6 coder prompt (docs update):
"The entire packages/nightshift/ directory does not exist." — resolved by group 3 which created the scaffold
"packages/af/af/nightshift.py still exists in the repository." — resolved by group 2 which deleted it
"packages/af/af/app.py imports and registers night-shift." — resolved by group 2 which removed the registration
"Night-shift tests have not been migrated." — resolved by group 5 which migrated them
"Root pyproject.toml is missing nightshift workspace entries." — resolved by group 3 which added them
By group 6, only 2–3 of the 12 drift findings were still relevant (docs references to af night-shift). The other 9–10 described state that earlier groups had already fixed.
Impact
- Prompt waste: 3,356 chars of drift findings per session, ~70% of which is stale by later groups. Across the 6 coder prompts (groups 2 through 7), that's ~14k chars of stale context (6 × 3,356 × 0.7 ≈ 14,100).
- Potential confusion: A coder seeing "packages/nightshift/ does not exist" in its prompt while looking at a directory that clearly exists could second-guess its own codebase exploration, or worse, try to "fix" something that's already done.
- Missed opportunity: The prompt space consumed by stale drift findings could carry genuinely useful information instead.
How it currently works
- The drift reviewer runs once (group 0) and produces N findings
- Findings are stored in the review_findings table via review_store.insert_drift_findings()
- fox_provider._query_drift_findings() retrieves all active (non-superseded) drift findings for the spec
- Every subsequent coder session gets the full set injected into its system prompt
- No mechanism exists to mark drift findings as resolved
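The flow above can be reproduced in miniature. This is a sketch only: it assumes a SQLite-backed store and a simplified review_findings schema (spec_name, category, and description are hypothetical column names; only superseded_by is confirmed above), and the two functions are stand-ins for review_store.insert_drift_findings() and fox_provider._query_drift_findings(), not their real signatures.

```python
import sqlite3

SPEC = "07_nightshift_standalone_cli"


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical, simplified schema; the real table has more columns.
    conn.execute(
        """
        CREATE TABLE review_findings (
            id INTEGER PRIMARY KEY,
            spec_name TEXT NOT NULL,
            category TEXT NOT NULL,
            description TEXT NOT NULL,
            superseded_by TEXT  -- NULL while the finding is active
        )
        """
    )


def insert_drift_findings(conn, spec_name, descriptions):
    # Stand-in for review_store.insert_drift_findings().
    conn.executemany(
        "INSERT INTO review_findings (spec_name, category, description) "
        "VALUES (?, 'drift', ?)",
        [(spec_name, d) for d in descriptions],
    )


def query_drift_findings(conn, spec_name):
    # Stand-in for fox_provider._query_drift_findings(): active findings only.
    rows = conn.execute(
        "SELECT description FROM review_findings "
        "WHERE spec_name = ? AND category = 'drift' AND superseded_by IS NULL "
        "ORDER BY id",
        (spec_name,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
insert_drift_findings(
    conn,
    SPEC,
    [
        "The entire packages/nightshift/ directory does not exist.",
        "packages/af/af/nightshift.py still exists in the repository.",
    ],
)

# Nothing ever writes superseded_by, so every group sees the same set.
prompts = {group: query_drift_findings(conn, SPEC) for group in range(2, 8)}
```

Because no code path sets superseded_by for drift findings, the query result for group 7 is identical to the one for group 2, which matches the six identical prompt sections listed under Evidence.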
Suggested Fix
After each task group completes and merges successfully, run a lightweight check to supersede drift findings that the completed group's changes have resolved. Three approaches:
Option A: Automatic supersession based on task-to-drift mapping (recommended)
When the coder session completes and its changes are merged:
- Check which files were touched by the merge commit
- For each drift finding that references a file or path that was touched, mark it as superseded in review_findings (set superseded_by to the completing session's node_id)
- The superseded_by IS NULL filter in _query_drift_findings() will automatically exclude resolved findings from subsequent prompts
This is similar to how review findings are already superseded (review_store.supersede_injected_findings()). The drift findings just need the same treatment.
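A minimal sketch of what Option A could look like, under several assumptions: a SQLite-backed store with a simplified schema (only superseded_by is confirmed above), a hypothetical helper name supersede_resolved_drift_findings(), and a naive regex that treats any slash-containing token in the finding text as a path reference. The list of touched files is assumed to come from the merge commit (for example via git diff --name-only).

```python
import re
import sqlite3

# Naive heuristic: any token containing a slash counts as a path reference.
PATH_TOKEN = re.compile(r"[\w.\-]+(?:/[\w.\-]*)+")


def referenced_paths(description):
    # Strip sentence punctuation that the regex may have swallowed.
    return [m.rstrip(".,") for m in PATH_TOKEN.findall(description)]


def is_touched(path, touched_files):
    if path.endswith("/"):  # directory reference: match any file beneath it
        return any(f.startswith(path) for f in touched_files)
    return path in touched_files


def supersede_resolved_drift_findings(conn, spec_name, touched_files, node_id):
    """Mark active drift findings superseded if they mention a touched path."""
    rows = conn.execute(
        "SELECT id, description FROM review_findings "
        "WHERE spec_name = ? AND category = 'drift' AND superseded_by IS NULL",
        (spec_name,),
    ).fetchall()
    resolved = [
        fid
        for fid, desc in rows
        if any(is_touched(p, touched_files) for p in referenced_paths(desc))
    ]
    conn.executemany(
        "UPDATE review_findings SET superseded_by = ? WHERE id = ?",
        [(node_id, fid) for fid in resolved],
    )
    return resolved


# Demo with a hypothetical, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review_findings (id INTEGER PRIMARY KEY, spec_name TEXT, "
    "category TEXT, description TEXT, superseded_by TEXT)"
)
spec = "07_nightshift_standalone_cli"
conn.executemany(
    "INSERT INTO review_findings (spec_name, category, description) "
    "VALUES (?, 'drift', ?)",
    [
        (spec, "packages/af/af/nightshift.py still exists in the repository."),
        (spec, "The entire packages/nightshift/ directory does not exist."),
        (spec, "Night-shift tests have not been migrated."),
    ],
)

# Files touched by the group 2 merge (illustrative values).
touched = {"packages/af/af/nightshift.py", "packages/af/af/app.py"}
resolved_ids = supersede_resolved_drift_findings(conn, spec, touched, f"{spec}:2")
remaining = conn.execute(
    "SELECT COUNT(*) FROM review_findings WHERE superseded_by IS NULL"
).fetchone()[0]
```

Two limitations of this heuristic are worth noting. Findings that name no path (such as "Night-shift tests have not been migrated.") are never superseded, and touching a file does not prove the finding was actually resolved, so a real implementation would likely need a stricter mapping between task groups and findings.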
Option B: Re-run drift check after each group
After each task group merges, re-run a lightweight drift check that evaluates the current codebase state against the spec. Replace the stored drift findings with the fresh set. This is more accurate but more expensive (requires an LLM call per group completion).
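A rough sketch of the replace step in Option B, assuming a SQLite-backed store and a simplified schema. The run_drift_check callable is a placeholder for the LLM-backed drift reviewer (its real interface is not described here), and refresh_drift_findings() is a hypothetical name.

```python
import sqlite3
from typing import Callable


def refresh_drift_findings(
    conn: sqlite3.Connection,
    spec_name: str,
    run_drift_check: Callable[[str], list[str]],
    node_id: str,
) -> list[str]:
    """Supersede all active drift findings and store a freshly generated set."""
    # Run the expensive check first so the transaction below stays short.
    fresh = run_drift_check(spec_name)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE review_findings SET superseded_by = ? "
            "WHERE spec_name = ? AND category = 'drift' "
            "AND superseded_by IS NULL",
            (node_id, spec_name),
        )
        conn.executemany(
            "INSERT INTO review_findings (spec_name, category, description) "
            "VALUES (?, 'drift', ?)",
            [(spec_name, d) for d in fresh],
        )
    return fresh


# Demo with a hypothetical schema and a stubbed drift check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review_findings (id INTEGER PRIMARY KEY, spec_name TEXT, "
    "category TEXT, description TEXT, superseded_by TEXT)"
)
spec = "07_nightshift_standalone_cli"
conn.executemany(
    "INSERT INTO review_findings (spec_name, category, description) "
    "VALUES (?, 'drift', ?)",
    [
        (spec, "packages/af/af/nightshift.py still exists in the repository."),
        (spec, "Docs still reference af night-shift."),
    ],
)


def stub_check(spec_name: str) -> list[str]:
    # Stand-in for the LLM call: only the docs finding is still valid.
    return ["Docs still reference af night-shift."]


refresh_drift_findings(conn, spec, stub_check, f"{spec}:2")
active = [
    row[0]
    for row in conn.execute(
        "SELECT description FROM review_findings WHERE superseded_by IS NULL"
    )
]
```

Superseding the old rows rather than deleting them keeps the history of what each group was told, at the cost of a growing table.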
Option C: Time-based decay
Mark drift findings with the group they were generated for. Only inject drift findings into coder prompts if they were generated within the last N groups. Simple but imprecise — a drift finding could still be relevant after N groups if no one addressed it.
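Option C amounts to a filter at injection time. The sketch below uses a hypothetical DriftFinding record with a generated_for_group field (no such field is confirmed in the current schema) and shows the imprecision described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftFinding:
    description: str
    generated_for_group: int  # hypothetical field, not in the current schema


def findings_to_inject(findings, current_group, max_age):
    """Keep findings generated within the last max_age groups."""
    return [
        f for f in findings
        if current_group - f.generated_for_group <= max_age
    ]


findings = [
    DriftFinding("packages/af/af/nightshift.py still exists in the repository.", 0),
    DriftFinding("Docs still reference af night-shift.", 0),
]

early = findings_to_inject(findings, current_group=3, max_age=3)
late = findings_to_inject(findings, current_group=6, max_age=3)
```

With max_age=3, group 3 still receives both findings, including the one group 2 already resolved, while group 6 (the docs update) receives none, even though the docs finding is still relevant there.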
Files
packages/agentfox/agentfox/knowledge/review_store.py — insert_drift_findings(), supersession logic
packages/agentfox/agentfox/knowledge/fox_provider.py — _query_drift_findings() retrieval
packages/agentfox/agentfox/engine/result_handler.py — post-merge handling where supersession should be triggered
packages/agentfox/agentfox/knowledge/migrations.py — review_findings table schema (already has superseded_by column)