Current AI agent implementations organize conversation history as a linear sequence of messages. This creates several fundamental issues:
Full linear history fills the context window, requiring lossy compaction. Every debugging tangent, exploration, and dead-end remains in context even after the relevant work is complete.
Engineers think of work as a call stack - push a subtask, complete it, pop back to the parent. Linear transcripts do not reflect this natural structure of problem-solving.
When working on Task B, the full history of sibling Task A is unnecessarily included. A 50-message debugging session from Task A wastes tokens when you have moved on to Task B.
Task relationships (parent/child/sibling) are implicit rather than explicit. There is no way to reference "what we decided in the auth subtask" without it being somewhere in the linear history.
Organize agent context as a tree of frames rather than a linear chat log:
[Root Frame: "Build App"]
|
+---------------+---------------+
| |
[Frame A: Auth] [Frame B: API Routes]
(completed) (in progress)
| |
+----+----+ +-----+-----+
| | | |
[A1] [A2] [B1] [B2]
(done) (done) (in progress) (planned)
Each frame represents a discrete unit of work with its own goal, success criteria, and results. The full conversation history of each frame is preserved, but only relevant summaries cross-pollinate between frames.
- Push: Create a new child frame when starting a distinct subtask
- Pop: Return to parent frame when subtask completes (or fails/blocks)
Heuristics for when to push:
- Failure Boundary: Work that could be retried as a unit if it fails
- Context Switch: Different files, concepts, or subsystems
- Complexity Threshold: Work that benefits from its own summary
Every frame's complete history is saved to a log file. Nothing is truly lost - agents can browse previous frame logs if needed. This means compaction is additive context, not a replacement for lost information.
Each frame has immutable identity set at creation:
| Field | Description |
|---|---|
title |
Short name (2-5 words) - e.g., "User Authentication" |
successCriteria |
What defines "done" in concrete, verifiable terms |
successCriteriaCompacted |
Dense version for tree/context display |
The success criteria provide clear exit conditions and enable verification of completion.
When a frame completes, results are recorded:
| Field | Description |
|---|---|
results |
Detailed summary of what was accomplished |
resultsCompacted |
Dense version for context injection |
artifacts |
Files/resources produced |
decisions |
Key decisions made |
These results become the "handoff" to sibling and parent frames.
When working in Frame B1, the active context includes:
- B1's own working history (full conversation)
- Compaction of parent B (its successCriteria and any partial results)
- Compaction of grandparent Root (the overall goal)
- Compaction of completed sibling A (its results) - this is the cross-talk
- NOT included: The full linear history of A1, A2, or any deep exploration
This selective inclusion means the context window contains only what is relevant to the current work.
Context is structured as XML for parsing but contains prose for readability:
<stack-context id="root" status="in_progress">
<title>Build the application</title>
<success-criteria>Complete working app with auth and API</success-criteria>
<child id="A" status="completed">
<title>User Authentication</title>
<results>Implemented JWT-based auth with refresh tokens.
Created User model, auth middleware, login/logout routes.</results>
<artifacts>src/auth/*, src/models/User.ts</artifacts>
</child>
<child id="B" status="in_progress">
<title>API Routes</title>
<success-criteria>RESTful CRUD endpoints with pagination</success-criteria>
<child id="B1" status="in_progress">
<title>CRUD Endpoints</title>
<success-criteria>GET/POST/PUT/DELETE for resources</success-criteria>
<!-- Current working context -->
</child>
<child id="B2" status="planned">
<title>Pagination</title>
<success-criteria>Cursor-based pagination with limits</success-criteria>
</child>
</child>
</stack-context>Frames can exist in planned state before execution begins:
- Planned frames can have planned children (sketch B -> B1, B2, B3 before starting B)
- Plans are mutable - discoveries during execution can add/remove/modify planned frames
- When a frame is invalidated, all planned children cascade to invalidated
- This enables upfront planning without commitment
Frames move through a defined lifecycle:
| Status | Description |
|---|---|
planned |
Defined but not yet started |
in_progress |
Currently being worked on |
completed |
Successfully finished with results |
failed |
Attempted but did not succeed |
blocked |
Cannot proceed without external input |
invalidated |
No longer relevant (cascades to children) |
Both humans and agents can manage the frame tree:
- Human: Explicit commands (push, pop, plan, status)
- Agent: Autonomous decisions based on heuristics and context
- The system can suggest frame operations when patterns are detected
The crucial insight is structural separation of concerns:
| Linear History | Call Stack |
|---|---|
| Task A exploration -> Task A solution -> Task B exploration -> Task B solution | Task A's full history in Frame A, Task B's full history in Frame B, only compactions cross-pollinate |
| Context grows monotonically | Active context = current frame + ancestor compactions + sibling compactions |
| Compaction loses detail | Full logs persist, compaction is additive context not replacement |
| No retry boundary | Frame = natural retry/rollback unit |
| Implicit task relationships | Explicit parent/child/sibling structure |
An implementation of this specification requires:
- Frame State Manager: Track tree of frames, their status, and relationships
- Log Persistence Layer: Write full frame logs to disk for each frame
- Results Generator: Produce summaries when frames complete
- Context Assembler: Build active context from current frame + relevant ancestors and siblings
- Frame Controller: Handle push/pop/plan commands (human or agent-initiated)
- Plan Manager: Handle planned frames, activation, and cascade invalidation
- Token Budget Manager: Allocate context window space across ancestors, siblings, and current frame