Remove context.md and scope prompt.txt to checkpoint-only prompts by gtrrz-victor · Pull Request #572 · entireio/cli

gtrrz-victor · 2026-03-02T04:23:42Z

Closes #571

Summary

Remove context.md entirely — it was dead code (written during checkpointing but never consumed by any reader). Removes generation, writing, reading, and all test/doc references across 19 files.
Scope prompt.txt to checkpoint-only prompts — previously stored all session prompts joined by ---, but every consumer only used the first or last. Now stores only prompts from the current checkpoint portion of the transcript.
Net deletion of 542 lines of unnecessary code and complexity.

What changed

context.md removal:

generateContextFromPrompts() function and all callers
Context fields from WriteCommittedOptions, UpdateCommittedOptions, SessionContent, SessionFilePaths
context.md blob creation/reading in committed.go
createContextFile() from lifecycle.go
GetSessionContext() and getCheckpointsForSession() (dead code)
context.md fallback in getSessionDescriptionFromTree()
ContextFileName constant from paths.go

prompt.txt scoping:

Added extractCheckpointPrompts() that scopes transcript by agent type (JSONL/Gemini/OpenCode) before extracting prompts
Updated extractSessionData() and extractSessionDataFromLiveTranscript() to use it
Filesystem path in lifecycle.go was already checkpoint-scoped — no change needed

Backward compatibility:

Old context.md files on entire/checkpoints/v1 are simply ignored (no reader)
Old prompt.txt with all-session prompts still works with ExtractFirstPrompt()/extractLastPrompt()
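
To illustrate the backward-compatibility claim above: a consumer that only takes the first or last entry gets a sensible answer whether prompt.txt holds one checkpoint's prompts or a whole session's. The sketch below is not the project's code. `splitPrompts`, `firstPrompt`, `lastPrompt`, and the exact separator string are assumptions made for the example (the PR only says prompts were joined by ---).

```go
package main

import (
	"fmt"
	"strings"
)

// promptSeparator is an assumed delimiter; the PR only states that prompts
// were "joined by ---", not the exact surrounding whitespace.
const promptSeparator = "\n\n---\n\n"

// splitPrompts breaks prompt.txt content into individual, non-empty prompts.
func splitPrompts(content string) []string {
	var prompts []string
	for _, p := range strings.Split(content, promptSeparator) {
		if p = strings.TrimSpace(p); p != "" {
			prompts = append(prompts, p)
		}
	}
	return prompts
}

// firstPrompt is a hypothetical stand-in for ExtractFirstPrompt().
func firstPrompt(content string) string {
	prompts := splitPrompts(content)
	if len(prompts) == 0 {
		return ""
	}
	return prompts[0]
}

// lastPrompt is a hypothetical stand-in for extractLastPrompt().
func lastPrompt(content string) string {
	prompts := splitPrompts(content)
	if len(prompts) == 0 {
		return ""
	}
	return prompts[len(prompts)-1]
}

func main() {
	// Old format: every session prompt. New format: current checkpoint only.
	oldFormat := "Create function A" + promptSeparator + "create function B" + promptSeparator + "create function C"
	newFormat := "create function C"

	fmt.Println(firstPrompt(oldFormat), "|", lastPrompt(oldFormat))
	fmt.Println(firstPrompt(newFormat), "|", lastPrompt(newFormat))
}
```

With a single-prompt file, first and last coincide, so callers written against the old format need no changes. What changes is which prompt counts as "first": under the new scoping it is the first prompt of the checkpoint, not of the session.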

Test plan

go build ./... compiles cleanly
mise run fmt — no formatting changes
mise run test:ci — all unit + integration tests pass
Manual: entire explain on a repo with existing checkpoints (backward compat)
Manual: Create new session, commit, verify no context.md on committed branch
Manual: entire rewind to verify session labels display correctly

🤖 Generated with Claude Code

context.md was dead code: written during checkpointing but never consumed by any reader. prompt.txt stored all session prompts, but every consumer only used the first or last one. This simplifies both by removing context.md entirely and scoping prompt.txt to only the current checkpoint's prompts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 1104437d743e

cursor · 2026-03-02T04:23:48Z

PR Summary

Medium Risk
Changes the persisted checkpoint metadata shape and prompt extraction logic across multiple agents, which could affect checkpoint/rewind behavior and any consumers expecting full-session prompts. Backward compatibility relies on ignoring old context.md and tests covering the new prompt scoping.

Overview
Removes context.md from checkpoint metadata end-to-end (constants, on-disk generation, git blob write/read paths, and related fields in WriteCommittedOptions/UpdateCommittedOptions, SessionContent, and SessionFilePaths), along with associated redaction and tests.

Changes prompt.txt semantics to be checkpoint-scoped rather than full-session: condensation now slices the transcript from CheckpointTranscriptStart (agent-aware for JSONL vs Gemini/OpenCode) via new extractCheckpointPrompts, and integration/unit tests + docs are updated to assert the new behavior.

Written by Cursor Bugbot for commit bd9d181.

Copilot

Pull request overview

This PR removes the dead-code context.md generation/storage/reading across the entire codebase and scopes prompt.txt to contain only the prompts from the current checkpoint portion of a session transcript (rather than all session prompts). It is a cleanup PR that nets -542 lines.

Changes:

Remove context.md entirely: Deletes generateContextFromPrompts(), createContextFile(), GetSessionContext(), getCheckpointsForSession(), the Context fields from WriteCommittedOptions/UpdateCommittedOptions/SessionContent/SessionFilePaths, related git-blob creation/reading in committed.go, and the ContextFileName constant.
Scope prompt.txt to checkpoint-only prompts: Introduces extractCheckpointPrompts() that slices the full transcript to the checkpoint portion (via transcript.SliceFromLine for JSONL agents or SliceFromMessage for Gemini/OpenCode) before extracting user prompts; updates extractSessionData() and extractSessionDataFromLiveTranscript() to use it.
Update tests and docs: Removes tests for deleted functions, updates integration tests to assert checkpoint-scoped (not full-session) prompt content, and updates architecture docs and CLAUDE.md to reflect the new structure.
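
The line-offset slicing used for JSONL agents can be sketched as follows. This is a minimal illustration, not the project's transcript.SliceFromLine: `sliceFromLine` is a hypothetical stand-in, and the assumption that the offset is a zero-based count of lines to skip is mine. It does mirror one behavior the review comments describe, namely returning nil when the offset lies beyond the content.

```go
package main

import (
	"bytes"
	"fmt"
)

// sliceFromLine returns the content starting at the given zero-based line
// offset, or nil when the offset is at or beyond the end of the content.
// Hypothetical stand-in for transcript.SliceFromLine; the real one may differ.
func sliceFromLine(content []byte, startLine int) []byte {
	if startLine <= 0 {
		return content
	}
	offset := 0
	for i := 0; i < startLine; i++ {
		nl := bytes.IndexByte(content[offset:], '\n')
		if nl < 0 {
			return nil // fewer lines than startLine
		}
		offset += nl + 1
	}
	if offset >= len(content) {
		return nil // offset lands exactly at the end: nothing left to scope
	}
	return content[offset:]
}

func main() {
	// Three JSONL lines; the checkpoint is assumed to start at line 2.
	transcript := []byte("{\"prompt\":\"A\"}\n{\"prompt\":\"B\"}\n{\"prompt\":\"C\"}\n")

	fmt.Printf("%q\n", sliceFromLine(transcript, 2))
	fmt.Println(sliceFromLine(transcript, 10) == nil)
}
```

Gemini and OpenCode transcripts are not line-oriented, which is why the PR slices them by message index instead.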

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `cmd/entire/cli/strategy/manual_commit_condensation.go` | Adds `extractCheckpointPrompts()`; removes `generateContextFromPrompts()` and `getCheckpointsForSession()`; updates prompt extraction calls |
| `cmd/entire/cli/strategy/manual_commit_hooks.go` | Removes context generation and redaction in `finalizeAllTurnCheckpoints` |
| `cmd/entire/cli/strategy/manual_commit_logs.go` | Removes dead `GetSessionContext()` function and its imports |
| `cmd/entire/cli/strategy/manual_commit_types.go` | Removes `Context` field from `ExtractedSessionData` |
| `cmd/entire/cli/strategy/common.go` | Removes `context.md` fallback in `getSessionDescriptionFromTree` and the `#` markdown prefix stripping |
| `cmd/entire/cli/checkpoint/checkpoint.go` | Removes `Context` fields from `WriteCommittedOptions`, `UpdateCommittedOptions`, `SessionContent`, `SessionFilePaths` |
| `cmd/entire/cli/checkpoint/committed.go` | Removes context blob creation/reading in `writeSessionToSubdirectory`, `UpdateCommitted`, `ReadSessionContent` |
| `cmd/entire/cli/paths/paths.go` | Removes `ContextFileName` constant |
| `cmd/entire/cli/lifecycle.go` | Removes `createContextFile()` function and its call |
| `cmd/entire/cli/checkpoint/checkpoint_test.go` | Removes `Context`-related test assertions |
| `cmd/entire/cli/checkpoint/committed_update_test.go` | Removes `Context`-related test assertions |
| `cmd/entire/cli/strategy/manual_commit_condensation_test.go` | Removes `generateContextFromPrompts` tests; updates Gemini multi-checkpoint test |
| `cmd/entire/cli/strategy/manual_commit_test.go` | Updates assertions to verify checkpoint-scoped prompts instead of full-session prompts |
| `cmd/entire/cli/integration_test/manual_commit_workflow_test.go` | Updates integration tests to assert checkpoint-scoped prompt content |
| `cmd/entire/cli/lifecycle_test.go` | Removes `createContextFile` tests |
| `cmd/entire/cli/benchutil/parse_tree_bench_test.go` | Updates comment and capacity for 4-file structure |
| `CLAUDE.md` | Updates architecture doc to remove context.md references |
| `docs/architecture/sessions-and-checkpoints.md` | Updates checkpoint structure diagrams |
| `docs/architecture/claude-hooks-integration.md` | Updates hook behavior description |

Copilot · 2026-03-02T04:32:23Z

cmd/entire/cli/benchutil/parse_tree_bench_test.go

 // buildShardedMetadataTree builds a realistic metadata branch tree with N checkpoints
-// distributed across shards. Each checkpoint has 5 files (metadata.json, full.jsonl,
-// prompt.txt, context.md, content_hash.txt) matching the real storage format.
+// distributed across shards. Each checkpoint has 4 files (metadata.json, full.jsonl,
+// prompt.txt, content_hash.txt) matching the real storage format.
 //
 // Returns the root tree hash.
 func buildShardedMetadataTree(b *testing.B, repo *gogit.Repository, checkpointCount int) plumbing.Hash {
 	b.Helper()

-	entries := make(map[string]object.TreeEntry, checkpointCount*5)
+	entries := make(map[string]object.TreeEntry, checkpointCount*4)


The comment says "Each checkpoint has 4 files (metadata.json, full.jsonl, prompt.txt, content_hash.txt)" and the map is initialized with checkpointCount*4, but the actual file list inside the loop has 5 entries: metadata.json, 0/metadata.json, 0/full.jsonl, 0/prompt.txt, 0/content_hash.txt. The 0/metadata.json per-session file is not mentioned in the comment, and the initial map capacity is one too small per checkpoint. This likely existed before this PR (the old version had context.md making 6 entries with a capacity of 5), but the comment update introduced a new inaccuracy. The map capacity should be checkpointCount*5 and the comment should list all 5 files.
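
One way to keep the comment, the capacity, and the loop from drifting apart is to derive the capacity from the file list itself. The sketch below is illustrative only: `checkpointFiles` and `buildEntries` are hypothetical names, the `%04d/` path prefix is invented for the example, and a plain string map stands in for the real tree entries. Only the five file paths come from the review comment above.

```go
package main

import "fmt"

// checkpointFiles lists the five per-checkpoint paths named in the review
// comment, relative to a checkpoint directory.
var checkpointFiles = []string{
	"metadata.json",
	"0/metadata.json",
	"0/full.jsonl",
	"0/prompt.txt",
	"0/content_hash.txt",
}

// buildEntries is a hypothetical, simplified stand-in for the benchmark
// helper: it maps each file path to a placeholder instead of a git tree entry.
func buildEntries(checkpointCount int) map[string]string {
	// Capacity follows the file list, so adding or removing a file cannot
	// leave a stale multiplier behind.
	entries := make(map[string]string, checkpointCount*len(checkpointFiles))
	for i := 0; i < checkpointCount; i++ {
		for _, name := range checkpointFiles {
			// The directory naming here is an assumption for illustration.
			path := fmt.Sprintf("%04d/%s", i, name)
			entries[path] = "placeholder"
		}
	}
	return entries
}

func main() {
	fmt.Println(len(buildEntries(3))) // 3 checkpoints x 5 files = 15 entries
}
```

An undersized capacity hint is not a correctness bug in Go, since maps grow as needed, but in a benchmark helper it can add avoidable allocations.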

Copilot · 2026-03-02T04:32:24Z

cmd/entire/cli/strategy/manual_commit_condensation.go

+	switch agentType {
+	case agent.AgentTypeGemini:
+		var err error
+		scoped, err = geminicli.SliceFromMessage(fullTranscript, checkpointStart)
+		if err != nil {
+			scoped = fullTranscript // fallback to full on error
+		}
+	case agent.AgentTypeOpenCode:
+		var err error
+		scoped, err = opencode.SliceFromMessage(fullTranscript, checkpointStart)
+		if err != nil {
+			scoped = fullTranscript // fallback to full on error
+		}
+	default:
+		// Claude Code, Cursor, Unknown — JSONL line offset
+		scoped = transcript.SliceFromLine(fullTranscript, checkpointStart)
+		if scoped == nil {
+			scoped = fullTranscript // fallback to full if offset beyond content
+		}
+	}


In extractCheckpointPrompts, the fallback behavior for an out-of-bounds checkpointStart is inconsistent across agent types:

For Gemini and OpenCode: when checkpointStart >= message count, SliceFromMessage returns (nil, nil). Since err == nil, no fallback is triggered and scoped stays nil. extractUserPrompts is then called with an empty string, returning no prompts.

For JSONL agents (Claude Code, Cursor, Unknown): when checkpointStart exceeds line count, SliceFromLine returns nil, which triggers the scoped = fullTranscript fallback, returning all session prompts instead.

This means a Gemini/OpenCode checkpoint with a checkpointStart beyond the transcript will have empty prompts, while a Claude Code checkpoint in the same situation will have all session prompts. The behavior should be consistent. Either all agent types should fall back to full transcript or none should.
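
A single helper that applies one policy to every slicer would remove the inconsistency described above. The sketch below picks the "always fall back to the full transcript" option purely for illustration; the comment leaves the choice open. `sliceFunc` and `scopeOrFallback` are hypothetical names, and the fake slicers in `main` only model the return shapes reported for SliceFromMessage and SliceFromLine.

```go
package main

import (
	"errors"
	"fmt"
)

// sliceFunc models a transcript slicer: it may return an error, or
// (nil, nil) when the offset is past the end of the transcript.
type sliceFunc func(full []byte, start int) ([]byte, error)

// scopeOrFallback applies one policy for every agent type: an error or an
// empty result both fall back to the full transcript.
func scopeOrFallback(slice sliceFunc, full []byte, start int) []byte {
	scoped, err := slice(full, start)
	if err != nil || len(scoped) == 0 {
		return full
	}
	return scoped
}

func main() {
	full := []byte("full transcript")

	// Fake slicers covering the three outcomes discussed in the review.
	pastEnd := func([]byte, int) ([]byte, error) { return nil, nil }
	failing := func([]byte, int) ([]byte, error) { return nil, errors.New("parse error") }
	working := func(b []byte, start int) ([]byte, error) { return b[start:], nil }

	fmt.Printf("%q\n", scopeOrFallback(pastEnd, full, 99))
	fmt.Printf("%q\n", scopeOrFallback(failing, full, 0))
	fmt.Printf("%q\n", scopeOrFallback(working, full, 5))
}
```

The opposite policy (never fall back, return no prompts) would be equally consistent; it trades leaking earlier prompts into a checkpoint for silently dropping them.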

Copilot · 2026-03-02T04:32:24Z

cmd/entire/cli/strategy/manual_commit_hooks.go

+	// Extract prompts from the full transcript
 	prompts := extractUserPrompts(state.AgentType, string(fullTranscript))


In finalizeAllTurnCheckpoints (line 2017), prompts are still extracted from the full transcript using extractUserPrompts(state.AgentType, string(fullTranscript)) without any checkpoint scoping. This means that when mid-turn checkpoints (stored in TurnCheckpointIDs) are finalized at stop time, their prompt.txt is overwritten with all-session prompts, contradicting the PR's stated goal of scoping prompt.txt to checkpoint-only prompts.

Unlike the condensation path (CondenseSession → extractSessionData → extractCheckpointPrompts), this finalization path doesn't use extractCheckpointPrompts with a CheckpointTranscriptStart offset. The challenge here is that CheckpointTranscriptStart only tracks the current checkpoint's start position, not the per-checkpoint start positions for the multiple checkpoints in TurnCheckpointIDs.

If checkpoint-scoped prompts are desired for turn-finalized checkpoints too, the per-checkpoint CheckpointTranscriptStart would need to be stored alongside each checkpoint ID in TurnCheckpointIDs.

Suggested change

-	// Extract prompts from the full transcript
-	prompts := extractUserPrompts(state.AgentType, string(fullTranscript))
+	// Extract checkpoint-scoped prompts from the full transcript, matching condensation behavior
+	prompts := extractCheckpointPrompts(state.AgentType, string(fullTranscript), state.CheckpointTranscriptStart)
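
The per-checkpoint tracking mentioned in the comment could look roughly like the sketch below. Everything here is hypothetical: `turnCheckpoint`, `scopePrompts`, and the simplification of a transcript to an ordered list of prompts are assumptions for illustration, as is bounding each checkpoint's range by the next checkpoint's start. The project's actual state type and offsets may differ.

```go
package main

import "fmt"

// turnCheckpoint is a hypothetical record pairing a checkpoint ID with the
// transcript offset at which that checkpoint's portion begins.
type turnCheckpoint struct {
	ID              string
	TranscriptStart int
}

// clamp limits v to the inclusive range [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// scopePrompts assigns each checkpoint the prompts from its own start up to
// the next checkpoint's start (or the end of the transcript for the last one).
// It assumes cps is ordered by TranscriptStart.
func scopePrompts(prompts []string, cps []turnCheckpoint) map[string][]string {
	scoped := make(map[string][]string, len(cps))
	for i, cp := range cps {
		start := clamp(cp.TranscriptStart, 0, len(prompts))
		end := len(prompts)
		if i+1 < len(cps) {
			end = clamp(cps[i+1].TranscriptStart, start, len(prompts))
		}
		scoped[cp.ID] = prompts[start:end]
	}
	return scoped
}

func main() {
	prompts := []string{"Create function A", "create function B", "create function C"}
	cps := []turnCheckpoint{{ID: "cp1", TranscriptStart: 0}, {ID: "cp2", TranscriptStart: 2}}

	for id, ps := range scopePrompts(prompts, cps) {
		fmt.Println(id, ps)
	}
}
```

Passing only the single `state.CheckpointTranscriptStart`, as in the suggested change, would scope every turn checkpoint from the same offset, which is the limitation the comment itself points out.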

Copilot · 2026-03-02T04:32:24Z

cmd/entire/cli/integration_test/manual_commit_workflow_test.go

 	// Verify second checkpoint has the FULL transcript (all three prompts)
 	// Session files are now in numbered subdirectories (e.g., 0/prompt.txt)
 	promptPath2 := SessionFilePath(checkpoint2ID, "prompt.txt")
 	prompt2Content, found := env.ReadFileFromBranch(paths.MetadataBranchName, promptPath2)
 	if !found {
 		t.Errorf("prompt.txt should exist at %s", promptPath2)
 	} else {
 		t.Logf("Second prompt.txt content:\n%s", prompt2Content)

-		// Should contain all prompts (full transcript preserved)
+		// Should contain only the checkpoint-scoped prompt (third prompt only)
 		if !strings.Contains(prompt2Content, "create function C") {
 			t.Error("Second prompt.txt should contain 'create function C'")
 		}
-		if !strings.Contains(prompt2Content, "Create function A") {
-			t.Error("Second prompt.txt should contain 'Create function A' (full transcript)")
-		}
-		if !strings.Contains(prompt2Content, "create function B") {
-			t.Error("Second prompt.txt should contain 'create function B' (full transcript)")
-		}
-	}
-
-	contextPath2 := SessionFilePath(checkpoint2ID, "context.md")
-	context2Content, found := env.ReadFileFromBranch(paths.MetadataBranchName, contextPath2)
-	if !found {
-		t.Errorf("context.md should exist at %s", contextPath2)
-	} else {
-		t.Logf("Second context.md content:\n%s", context2Content)
-
-		// Should contain full transcript context
-		if !strings.Contains(context2Content, "Create function A") {
-			t.Error("Second context.md should contain 'Create function A' (full transcript)")
-		}
 	}


The comment on line 837 ("Verify second checkpoint has the FULL transcript (all three prompts)") is stale and contradicts the new checkpoint-scoping behavior introduced by this PR. The second checkpoint should now contain only the third prompt (checkpoint-scoped), not all three prompts.

Additionally, the test only checks that the scoped prompt IS present (positive assertion) but does not verify that the earlier prompts are NOT present in the second checkpoint's prompt.txt. The TestShadow_FullTranscriptCondensationWithIntermediateCommits test includes negative assertions (checking that "function A" and "function B" do NOT appear in the second checkpoint), and this test should do the same to properly validate the checkpoint-scoping behavior.
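
The missing negative assertions can be expressed with a small helper that checks both directions at once. This is a sketch rather than the repository's test code: `checkPromptScope` is a hypothetical name, and in a real test each returned problem would be reported through `t.Error`. The prompt strings are the ones used in the diff above.

```go
package main

import (
	"fmt"
	"strings"
)

// checkPromptScope reports problems with a prompt.txt body: every string in
// want must be present and every string in forbid must be absent.
func checkPromptScope(content string, want, forbid []string) []string {
	var problems []string
	for _, w := range want {
		if !strings.Contains(content, w) {
			problems = append(problems, fmt.Sprintf("missing %q", w))
		}
	}
	for _, f := range forbid {
		if strings.Contains(content, f) {
			problems = append(problems, fmt.Sprintf("unexpected %q", f))
		}
	}
	return problems
}

func main() {
	want := []string{"create function C"}
	forbid := []string{"Create function A", "create function B"}

	// A checkpoint-scoped file passes; a full-session file is flagged.
	scoped := "create function C"
	fullSession := "Create function A\n\n---\n\ncreate function B\n\n---\n\ncreate function C"

	fmt.Println(checkPromptScope(scoped, want, forbid))
	fmt.Println(checkPromptScope(fullSession, want, forbid))
}
```

Without the `forbid` half, a regression that reintroduced full-session prompts would still pass the test, because the third prompt is present in both cases.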

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-03-02T04:33:19Z

cmd/entire/cli/strategy/manual_commit_hooks.go

 	}

-	// Extract prompts and context from the full transcript
+	// Extract prompts from the full transcript


Finalization overwrites checkpoint-scoped prompts with all session prompts

High Severity

The finalizeAllTurnCheckpoints method still calls extractUserPrompts (which returns ALL session prompts) instead of the new extractCheckpointPrompts. At stop time, this overwrites the checkpoint-scoped prompts written during condensation with all-session prompts for every turn checkpoint, negating the core intent of this PR. The condensation paths (extractSessionData and extractSessionDataFromLiveTranscript) correctly use extractCheckpointPrompts, but the finalization path was not updated to match.

Additional Locations (1)

cmd/entire/cli/strategy/manual_commit_condensation.go#L462-L465

cursor · 2026-03-02T04:33:19Z

cmd/entire/cli/strategy/manual_commit_condensation.go

+		if scoped == nil {
+			scoped = fullTranscript // fallback to full if offset beyond content
+		}
+	}


Missing nil fallback for Gemini/OpenCode in extractCheckpointPrompts

Low Severity

Both geminicli.SliceFromMessage and opencode.SliceFromMessage return (nil, nil) when checkpointStart exceeds the message count, but extractCheckpointPrompts only handles the err != nil case for these branches. The scoped variable stays nil, causing extractUserPrompts to receive an empty string and return nil prompts. By contrast, the JSONL default branch explicitly checks if scoped == nil and falls back to fullTranscript. This inconsistency means Gemini/OpenCode silently lose all prompts when the offset is stale, while JSONL agents preserve them.

Copilot AI review requested due to automatic review settings March 2, 2026 04:23

Copilot started reviewing on behalf of gtrrz-victor March 2, 2026 04:24

gtrrz-victor wants to merge 1 commit into main from 571-remove-contextmd-and-simplify-prompttxt-to-checkpoint-prompts-only
