Remove context.md and scope prompt.txt to checkpoint-only prompts #572
gtrrz-victor wants to merge 1 commit into main from
context.md was dead code: written during checkpointing but never consumed by any reader. prompt.txt stored all session prompts, but every consumer only used the first or last one. This simplifies both by removing context.md entirely and scoping prompt.txt to only the current checkpoint's prompts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 1104437d743e
PR Summary: Medium Risk. Written by Cursor Bugbot for commit bd9d181.
Pull request overview
This PR removes the dead-code context.md generation/storage/reading across the entire codebase and scopes prompt.txt to contain only the prompts from the current checkpoint portion of a session transcript (rather than all session prompts). It is a cleanup PR that nets -542 lines.
Changes:
- Remove context.md entirely: Deletes generateContextFromPrompts(), createContextFile(), GetSessionContext(), getCheckpointsForSession(), the Context fields from WriteCommittedOptions/UpdateCommittedOptions/SessionContent/SessionFilePaths, related git-blob creation/reading in committed.go, and the ContextFileName constant.
- Scope prompt.txt to checkpoint-only prompts: Introduces extractCheckpointPrompts() that slices the full transcript to the checkpoint portion (via transcript.SliceFromLine for JSONL agents or SliceFromMessage for Gemini/OpenCode) before extracting user prompts; updates extractSessionData() and extractSessionDataFromLiveTranscript() to use it.
- Update tests and docs: Removes tests for deleted functions, updates integration tests to assert checkpoint-scoped (not full-session) prompt content, and updates architecture docs and CLAUDE.md to reflect the new structure.
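The slice-then-extract step behind the prompt.txt scoping change can be sketched for the JSONL case roughly as follows. This is a minimal illustration, not the PR's code: sliceFromLine, userPrompts, checkpointPrompts, and the entry schema (role/text fields) are hypothetical stand-ins for transcript.SliceFromLine and the real prompt extractor, whose signatures are not shown on this page.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
)

// entry is an assumed, simplified transcript line; real agent transcripts carry more fields.
type entry struct {
	Role string `json:"role"`
	Text string `json:"text"`
}

// sliceFromLine (hypothetical) returns the transcript from 0-based line index
// start onward, or nil when start is at or beyond the number of lines.
func sliceFromLine(full []byte, start int) []byte {
	if start <= 0 {
		return full
	}
	offset := 0
	for i := 0; i < start; i++ {
		nl := bytes.IndexByte(full[offset:], '\n')
		if nl < 0 {
			return nil // fewer than start lines
		}
		offset += nl + 1
	}
	if offset >= len(full) {
		return nil // nothing left after skipping
	}
	return full[offset:]
}

// userPrompts (hypothetical) collects the text of every user entry,
// skipping lines that do not parse as JSON. bufio.Scanner's default
// line-size limit applies; very long lines would need a larger buffer.
func userPrompts(jsonl []byte) []string {
	var prompts []string
	sc := bufio.NewScanner(bytes.NewReader(jsonl))
	for sc.Scan() {
		var e entry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue
		}
		if e.Role == "user" && e.Text != "" {
			prompts = append(prompts, e.Text)
		}
	}
	return prompts
}

// checkpointPrompts scopes the transcript to the checkpoint portion first,
// then extracts prompts from that slice only. No fallback is applied here.
func checkpointPrompts(full []byte, start int) []string {
	return userPrompts(sliceFromLine(full, start))
}
```

Calling checkpointPrompts with the line offset at which the current checkpoint began yields only the prompts entered since then; what to do when the offset is out of range is a separate policy decision.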
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| cmd/entire/cli/strategy/manual_commit_condensation.go | Adds extractCheckpointPrompts(); removes generateContextFromPrompts() and getCheckpointsForSession(); updates prompt extraction calls |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Removes context generation and redaction in finalizeAllTurnCheckpoints |
| cmd/entire/cli/strategy/manual_commit_logs.go | Removes dead GetSessionContext() function and its imports |
| cmd/entire/cli/strategy/manual_commit_types.go | Removes Context field from ExtractedSessionData |
| cmd/entire/cli/strategy/common.go | Removes context.md fallback in getSessionDescriptionFromTree and the # markdown prefix stripping |
| cmd/entire/cli/checkpoint/checkpoint.go | Removes Context fields from WriteCommittedOptions, UpdateCommittedOptions, SessionContent, SessionFilePaths |
| cmd/entire/cli/checkpoint/committed.go | Removes context blob creation/reading in writeSessionToSubdirectory, UpdateCommitted, ReadSessionContent |
| cmd/entire/cli/paths/paths.go | Removes ContextFileName constant |
| cmd/entire/cli/lifecycle.go | Removes createContextFile() function and its call |
| cmd/entire/cli/checkpoint/checkpoint_test.go | Removes Context-related test assertions |
| cmd/entire/cli/checkpoint/committed_update_test.go | Removes Context-related test assertions |
| cmd/entire/cli/strategy/manual_commit_condensation_test.go | Removes generateContextFromPrompts tests; updates Gemini multi-checkpoint test |
| cmd/entire/cli/strategy/manual_commit_test.go | Updates assertions to verify checkpoint-scoped prompts instead of full-session prompts |
| cmd/entire/cli/integration_test/manual_commit_workflow_test.go | Updates integration tests to assert checkpoint-scoped prompt content |
| cmd/entire/cli/lifecycle_test.go | Removes createContextFile tests |
| cmd/entire/cli/benchutil/parse_tree_bench_test.go | Updates comment and capacity for 4-file structure |
| CLAUDE.md | Updates architecture doc to remove context.md references |
| docs/architecture/sessions-and-checkpoints.md | Updates checkpoint structure diagrams |
| docs/architecture/claude-hooks-integration.md | Updates hook behavior description |

      // buildShardedMetadataTree builds a realistic metadata branch tree with N checkpoints
    - // distributed across shards. Each checkpoint has 5 files (metadata.json, full.jsonl,
    - // prompt.txt, context.md, content_hash.txt) matching the real storage format.
    + // distributed across shards. Each checkpoint has 4 files (metadata.json, full.jsonl,
    + // prompt.txt, content_hash.txt) matching the real storage format.
      //
      // Returns the root tree hash.
      func buildShardedMetadataTree(b *testing.B, repo *gogit.Repository, checkpointCount int) plumbing.Hash {
          b.Helper()

    -     entries := make(map[string]object.TreeEntry, checkpointCount*5)
    +     entries := make(map[string]object.TreeEntry, checkpointCount*4)

The comment says "Each checkpoint has 4 files (metadata.json, full.jsonl, prompt.txt, content_hash.txt)" and the map is initialized with checkpointCount*4, but the actual file list inside the loop has 5 entries: metadata.json, 0/metadata.json, 0/full.jsonl, 0/prompt.txt, 0/content_hash.txt. The 0/metadata.json per-session file is not mentioned in the comment, and the initial map capacity is one too small per checkpoint. This likely existed before this PR (the old version had context.md making 6 entries with a capacity of 5), but the comment update introduced a new inaccuracy. The map capacity should be checkpointCount*5 and the comment should list all 5 files.
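One way to keep the comment, the file list, and the capacity hint from drifting apart is to derive the per-checkpoint count from the list itself. The sketch below is only an illustration under assumptions: sessionFiles, entriesPerCheckpoint, checkpointPaths, and newEntryMap are hypothetical names, the directory layout is simplified, and a plain struct{} value stands in for object.TreeEntry.

```go
package main

// sessionFiles lists the per-session files named in the review comment above.
var sessionFiles = []string{"metadata.json", "full.jsonl", "prompt.txt", "content_hash.txt"}

// entriesPerCheckpoint counts the checkpoint-level metadata.json plus the
// files of a single session stored under the "0" subdirectory.
func entriesPerCheckpoint() int {
	return 1 + len(sessionFiles)
}

// checkpointPaths returns every tree path written for one checkpoint directory.
// The "dir/0/name" layout is an assumption made for this sketch.
func checkpointPaths(dir string) []string {
	paths := make([]string, 0, entriesPerCheckpoint())
	paths = append(paths, dir+"/metadata.json")
	for _, name := range sessionFiles {
		paths = append(paths, dir+"/0/"+name)
	}
	return paths
}

// newEntryMap pre-sizes the entry map from the derived count, so removing a
// file from sessionFiles updates the capacity hint automatically.
func newEntryMap(checkpointCount int) map[string]struct{} {
	return make(map[string]struct{}, checkpointCount*entriesPerCheckpoint())
}
```

The capacity passed to make is only a hint, so an undersized value affects allocation behavior in the benchmark rather than correctness.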

    switch agentType {
    case agent.AgentTypeGemini:
        var err error
        scoped, err = geminicli.SliceFromMessage(fullTranscript, checkpointStart)
        if err != nil {
            scoped = fullTranscript // fallback to full on error
        }
    case agent.AgentTypeOpenCode:
        var err error
        scoped, err = opencode.SliceFromMessage(fullTranscript, checkpointStart)
        if err != nil {
            scoped = fullTranscript // fallback to full on error
        }
    default:
        // Claude Code, Cursor, Unknown — JSONL line offset
        scoped = transcript.SliceFromLine(fullTranscript, checkpointStart)
        if scoped == nil {
            scoped = fullTranscript // fallback to full if offset beyond content
        }
    }

In extractCheckpointPrompts, the fallback behavior for an out-of-bounds checkpointStart is inconsistent across agent types:
- For Gemini and OpenCode: when checkpointStart >= message count, SliceFromMessage returns (nil, nil). Since err == nil, no fallback is triggered and scoped stays nil. extractUserPrompts is then called with an empty string, returning no prompts.
- For JSONL agents (Claude Code, Cursor, Unknown): when checkpointStart exceeds line count, SliceFromLine returns nil, which triggers the scoped = fullTranscript fallback, returning all session prompts instead.

This means a Gemini/OpenCode checkpoint with a checkpointStart beyond the transcript will have empty prompts, while a Claude Code checkpoint in the same situation will have all session prompts. The behavior should be consistent. Either all agent types should fall back to full transcript or none should.
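If the "always fall back" option were chosen, a single helper could apply the same policy to every branch. The following is a sketch under that assumption only; scopeOrFull and errMalformed are hypothetical names, and choosing "never fall back" instead would be equally consistent.

```go
package main

import "errors"

// errMalformed is a placeholder error used only to exercise the helper.
var errMalformed = errors.New("malformed transcript")

// scopeOrFull returns sliced when slicing succeeded and produced content;
// otherwise it returns full. An error and a nil/empty slice are treated
// identically, so every agent type gets the same out-of-range behavior.
func scopeOrFull(sliced []byte, err error, full []byte) []byte {
	if err != nil || len(sliced) == 0 {
		return full
	}
	return sliced
}
```

Each case of the switch would then pass its slicer's result through the helper (with a nil error for slicers that return none), instead of repeating a branch-specific check.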

    // Extract prompts from the full transcript
    prompts := extractUserPrompts(state.AgentType, string(fullTranscript))

In finalizeAllTurnCheckpoints (line 2017), prompts are still extracted from the full transcript using extractUserPrompts(state.AgentType, string(fullTranscript)) without any checkpoint scoping. This means that when mid-turn checkpoints (stored in TurnCheckpointIDs) are finalized at stop time, their prompt.txt is overwritten with all-session prompts, contradicting the PR's stated goal of scoping prompt.txt to checkpoint-only prompts.
Unlike the condensation path (CondenseSession → extractSessionData → extractCheckpointPrompts), this finalization path doesn't use extractCheckpointPrompts with a CheckpointTranscriptStart offset. The challenge here is that CheckpointTranscriptStart only tracks the current checkpoint's start position, not the per-checkpoint start positions for the multiple checkpoints in TurnCheckpointIDs.
If checkpoint-scoped prompts are desired for turn-finalized checkpoints too, the per-checkpoint CheckpointTranscriptStart would need to be stored alongside each checkpoint ID in TurnCheckpointIDs.
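The idea of storing a start position next to each checkpoint ID could look something like the sketch below. Everything here is hypothetical: turnCheckpoint and scopedLines are illustrative names, the transcript is modeled as a slice of lines, and each checkpoint's scope is taken to run from its own start to the end of the transcript, which may or may not be the upper bound the author wants.

```go
package main

// turnCheckpoint is a hypothetical record pairing a checkpoint ID with the
// transcript offset at which that checkpoint began.
type turnCheckpoint struct {
	ID              string
	TranscriptStart int // line or message index, depending on agent type
}

// scopedLines returns, for each recorded checkpoint, the transcript lines
// from its own start offset onward. Offsets are clamped to the transcript
// bounds, so an out-of-range start yields an empty slice.
func scopedLines(lines []string, cps []turnCheckpoint) map[string][]string {
	out := make(map[string][]string, len(cps))
	for _, cp := range cps {
		start := cp.TranscriptStart
		if start < 0 {
			start = 0
		}
		if start > len(lines) {
			start = len(lines)
		}
		out[cp.ID] = lines[start:]
	}
	return out
}
```

With such a record, finalization could scope each turn checkpoint individually rather than applying one shared offset to all of them.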

    - // Extract prompts from the full transcript
    - prompts := extractUserPrompts(state.AgentType, string(fullTranscript))
    + // Extract checkpoint-scoped prompts from the full transcript, matching condensation behavior
    + prompts := extractCheckpointPrompts(state.AgentType, string(fullTranscript), state.CheckpointTranscriptStart)


      // Verify second checkpoint has the FULL transcript (all three prompts)
      // Session files are now in numbered subdirectories (e.g., 0/prompt.txt)
      promptPath2 := SessionFilePath(checkpoint2ID, "prompt.txt")
      prompt2Content, found := env.ReadFileFromBranch(paths.MetadataBranchName, promptPath2)
      if !found {
          t.Errorf("prompt.txt should exist at %s", promptPath2)
      } else {
          t.Logf("Second prompt.txt content:\n%s", prompt2Content)

    -     // Should contain all prompts (full transcript preserved)
    +     // Should contain only the checkpoint-scoped prompt (third prompt only)
          if !strings.Contains(prompt2Content, "create function C") {
              t.Error("Second prompt.txt should contain 'create function C'")
          }
    -     if !strings.Contains(prompt2Content, "Create function A") {
    -         t.Error("Second prompt.txt should contain 'Create function A' (full transcript)")
    -     }
    -     if !strings.Contains(prompt2Content, "create function B") {
    -         t.Error("Second prompt.txt should contain 'create function B' (full transcript)")
    -     }
      }

    - contextPath2 := SessionFilePath(checkpoint2ID, "context.md")
    - context2Content, found := env.ReadFileFromBranch(paths.MetadataBranchName, contextPath2)
    - if !found {
    -     t.Errorf("context.md should exist at %s", contextPath2)
    - } else {
    -     t.Logf("Second context.md content:\n%s", context2Content)
    -
    -     // Should contain full transcript context
    -     if !strings.Contains(context2Content, "Create function A") {
    -         t.Error("Second context.md should contain 'Create function A' (full transcript)")
    -     }
    - }

The comment on line 837 ("Verify second checkpoint has the FULL transcript (all three prompts)") is stale and contradicts the new checkpoint-scoping behavior introduced by this PR. The second checkpoint should now contain only the third prompt (checkpoint-scoped), not all three prompts.
Additionally, the test only checks that the scoped prompt IS present (positive assertion) but does not verify that the earlier prompts are NOT present in the second checkpoint's prompt.txt. The TestShadow_FullTranscriptCondensationWithIntermediateCommits test includes negative assertions (checking that "function A" and "function B" do NOT appear in the second checkpoint), and this test should do the same to properly validate the checkpoint-scoping behavior.
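The combination of positive and negative assertions requested here can be expressed as one small check. The helper below is a hypothetical sketch (scopingViolations is not a function in the repository); it returns descriptions of the problems found so that a test can report each one with t.Error.

```go
package main

import "strings"

// scopingViolations reports every expected prompt that is missing from
// content and every earlier-checkpoint prompt that is still present.
// An empty result means the content is correctly checkpoint-scoped.
func scopingViolations(content string, mustContain, mustNotContain []string) []string {
	var problems []string
	for _, s := range mustContain {
		if !strings.Contains(content, s) {
			problems = append(problems, "missing expected prompt: "+s)
		}
	}
	for _, s := range mustNotContain {
		if strings.Contains(content, s) {
			problems = append(problems, "found prompt from an earlier checkpoint: "+s)
		}
	}
	return problems
}
```

In the test above, prompt2Content would be checked with "create function C" as the required string and "Create function A" and "create function B" as the forbidden ones.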
Cursor Bugbot has reviewed your changes and found 2 potential issues.

      }

    - // Extract prompts and context from the full transcript
    + // Extract prompts from the full transcript

Finalization overwrites checkpoint-scoped prompts with all session prompts
High Severity
The finalizeAllTurnCheckpoints method still calls extractUserPrompts (which returns ALL session prompts) instead of the new extractCheckpointPrompts. At stop time, this overwrites the checkpoint-scoped prompts written during condensation with all-session prompts for every turn checkpoint, negating the core intent of this PR. The condensation paths (extractSessionData and extractSessionDataFromLiveTranscript) correctly use extractCheckpointPrompts, but the finalization path was not updated to match.

        if scoped == nil {
            scoped = fullTranscript // fallback to full if offset beyond content
        }
    }

Missing nil fallback for Gemini/OpenCode in extractCheckpointPrompts
Low Severity
Both geminicli.SliceFromMessage and opencode.SliceFromMessage return (nil, nil) when checkpointStart exceeds the message count, but extractCheckpointPrompts only handles the err != nil case for these branches. The scoped variable stays nil, causing extractUserPrompts to receive an empty string and return nil prompts. By contrast, the JSONL default branch explicitly checks if scoped == nil and falls back to fullTranscript. This inconsistency means Gemini/OpenCode silently lose all prompts when the offset is stale, while JSONL agents preserve them.


Closes #571
Summary
- Remove context.md entirely: it was dead code (written during checkpointing but never consumed by any reader). Removes generation, writing, reading, and all test/doc references across 19 files.
- Scope prompt.txt to checkpoint-only prompts: previously it stored all session prompts joined by ---, but every consumer only used the first or last. It now stores only prompts from the current checkpoint portion of the transcript.

What changed

context.md removal:
- Remove generateContextFromPrompts() function and all callers
- Remove Context fields from WriteCommittedOptions, UpdateCommittedOptions, SessionContent, SessionFilePaths
- Remove context blob creation/reading in committed.go
- Remove createContextFile() from lifecycle.go
- Remove GetSessionContext() and getCheckpointsForSession() (dead code)
- Remove context.md fallback in getSessionDescriptionFromTree()
- Remove ContextFileName constant from paths.go

prompt.txt scoping:
- Add extractCheckpointPrompts() that scopes transcript by agent type (JSONL/Gemini/OpenCode) before extracting prompts
- Update extractSessionData() and extractSessionDataFromLiveTranscript() to use it
- lifecycle.go was already checkpoint-scoped, so no change needed

Backward compatibility:
- context.md files on entire/checkpoints/v1 are simply ignored (no reader)
- prompt.txt with all-session prompts still works with ExtractFirstPrompt()/extractLastPrompt()

Test plan
- go build ./... compiles cleanly
- mise run fmt: no formatting changes
- mise run test:ci: all unit + integration tests pass
- entire explain on a repo with existing checkpoints (backward compat)
- context.md on committed branch
- entire rewind to verify session labels display correctly

🤖 Generated with Claude Code
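The backward-compatibility claim for prompt.txt rests on consumers reading only the first or last prompt. A rough sketch of why both old and new files satisfy such readers is shown below; splitPrompts, firstPrompt, and lastPrompt are hypothetical stand-ins for ExtractFirstPrompt()/extractLastPrompt(), and treating a line containing only --- as the separator is an assumption about the file format.

```go
package main

import "strings"

// splitPrompts splits prompt.txt content on lines that contain only "---"
// and drops empty segments.
func splitPrompts(content string) []string {
	var prompts []string
	var cur []string
	flush := func() {
		if p := strings.TrimSpace(strings.Join(cur, "\n")); p != "" {
			prompts = append(prompts, p)
		}
		cur = nil
	}
	for _, line := range strings.Split(content, "\n") {
		if strings.TrimSpace(line) == "---" {
			flush()
			continue
		}
		cur = append(cur, line)
	}
	flush()
	return prompts
}

// firstPrompt returns the first prompt, or "" when there is none.
func firstPrompt(content string) string {
	ps := splitPrompts(content)
	if len(ps) == 0 {
		return ""
	}
	return ps[0]
}

// lastPrompt returns the last prompt, or "" when there is none.
func lastPrompt(content string) string {
	ps := splitPrompts(content)
	if len(ps) == 0 {
		return ""
	}
	return ps[len(ps)-1]
}
```

Readers of this shape parse an old all-session file and a new checkpoint-scoped file the same way; only the set of prompts they choose from differs.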