diff --git a/docs/reference/operator-control-plane.md b/docs/reference/operator-control-plane.md
index b6cb7578..cf6000e9 100644
--- a/docs/reference/operator-control-plane.md
+++ b/docs/reference/operator-control-plane.md
@@ -154,6 +154,28 @@ The command does not require live Linear or GitHub observer access. It resolves
 local runs from the runtime database and can also perform a direct lookup when both
 `--run-id` and `--attempt` are supplied.
 
+## Sparse Linear Updates
+
+Sparse Linear updates are expected. A healthy lane may have only a run-start record,
+one or more phase-level progress projections, a PR handoff, and a terminal landing,
+closeout, cleanup, or needs-attention record. The absence of detailed checkpoint text,
+raw command output, heartbeat messages, token-pressure notes, or retry diagnostics in
+Linear does not mean that evidence is missing.
+
+Interpret the surfaces in this order:
+
+1. Use `status`, the dashboard, or `diagnose --json` for current local ownership,
+   run ids, attempts, health, and private-evidence references.
+2. Use `decodex evidence --run-id --attempt --json` for full
+   structured local evidence when the public summary is too terse.
+3. Use logs only to explain process diagnostics such as startup failures, connector
+   backoff, or maintenance warnings.
+4. Use Linear for public team-visible lifecycle state and collaboration context.
+
+Do not backfill Linear with private evidence just to make the issue history look like a
+complete execution transcript. If a teammate needs a public update, write or wait for
+the next allowlisted lifecycle summary instead of pasting local evidence payloads.
+
 Worktree visibility follows the owning dashboard section:
 
 - `Running Lanes` means the runtime DB still has an active lease, active attempt, or
diff --git a/docs/reference/test-suite.md b/docs/reference/test-suite.md
index fbd2496b..5cb1abff 100644
--- a/docs/reference/test-suite.md
+++ b/docs/reference/test-suite.md
@@ -14,8 +14,8 @@ standards for keeping, merging, or deleting tests.
 
 ## Current Snapshot
 
-This cleanup keeps 685 `nextest` tests plus one ignored live app-server test. Regenerate
-the runnable inventory with:
+This snapshot lists 805 `nextest` tests. The repo gate run for this inventory reported
+805 passed tests and 1 skipped test. Regenerate the runnable inventory with:
 
 ```sh
 cargo nextest list --workspace --all-targets --all-features
@@ -36,13 +36,13 @@ cargo nextest list --workspace --all-targets --all-features 2>/dev/null \
 
 | Group | Count | Primary surfaces | Owns |
 | --- | ---: | --- | --- |
-| Orchestrator | 359 | `apps/decodex/src/orchestrator/tests.rs`, `apps/decodex/src/orchestrator/tests/**/*.rs` | Intake, retry, review/landing, runtime cleanup, operator status, repo gates |
+| Orchestrator | 405 | `apps/decodex/src/orchestrator/tests.rs`, `apps/decodex/src/orchestrator/tests/**/*.rs` | Intake, retry, review/landing, runtime cleanup, operator status, repo gates |
 | Tracker tool bridge | 85 | `apps/decodex/src/agent/tracker_tool_bridge/tests.rs`, `apps/decodex/src/agent/tracker_tool_bridge/tests/**/*.rs` | Dynamic tracker tools, continuation guards, review handoff writes, closeout writes |
-| App-server protocol/runtime | 38 | `apps/decodex/src/agent/app_server/tests.rs`, `apps/decodex/src/agent/json_rpc.rs`, app-server protocol tests | JSON-RPC parsing, turn execution, dynamic tools, thread config, transport failures |
-| Runtime state and locks | 39 | `state::tests`, `runtime::tests` | Persistent local state, lock ownership, runtime database contracts |
-| Workflow and config parsing | 53 | `workflow::tests`, `config::tests` | `WORKFLOW.md`, project config, removed-field rejection, default policy |
-| Git, worktree, and landing helpers | 93 | `worktree::tests`, `manual::tests`, `commit_message::tests`, `github::tests`, `default_branch_sync::tests`, `pull_request::tests` | Git/worktree behavior, manual landing, GitHub/PR helpers, commit-message policy |
-| CLI, archive, and tracker integration | 18 | `cli::tests`, `archive_hygiene::tests`, `tracker::linear::tests` | User-facing commands, archive hygiene, direct Linear adapter behavior |
+| App-server protocol/runtime | 59 | `apps/decodex/src/agent/app_server/tests.rs`, `apps/decodex/src/agent/json_rpc.rs`, app-server protocol tests | JSON-RPC parsing, turn execution, dynamic tools, thread config, transport failures |
+| Runtime state, locks, and maintenance | 45 | `state::tests`, `runtime::tests`, `maintenance::tests` | Persistent local state, lock ownership, runtime database contracts, local retention |
+| Workflow and config parsing | 44 | `workflow::tests`, `config::tests`, `codex_config::tests` | `WORKFLOW.md`, project config, Codex config edits, removed-field rejection, default policy |
+| Git, worktree, landing, and recovery helpers | 108 | `worktree::tests`, `manual::tests`, `commit_message::tests`, `github::tests`, `default_branch_sync::tests`, `pull_request::tests`, `recovery::tests`, `git_credentials::tests` | Git/worktree behavior, manual landing, GitHub/PR helpers, recovery commands, commit-message policy |
+| Account, CLI, archive, and tracker integration | 59 | `accounts::tests`, `agent::codex_accounts::tests`, `agent::decodex_tool_bridge::tests`, `app_bridge::tests`, `cli::tests`, `archive_hygiene::tests`, `tracker::*::tests` | User-facing commands, account pools, app bridge parsing, archive hygiene, direct tracker adapter and public-text behavior |
 
 ## Orchestrator Inventory
 
@@ -53,46 +53,47 @@ large catch-all test file unless the behavior crosses several of these stages.
 | --- | ---: | --- |
 | `apps/decodex/src/orchestrator/tests/intake/workflow_reload.rs` | 4 | Workflow reload and cached policy snapshots |
 | `apps/decodex/src/orchestrator/tests/intake/eligibility.rs` | 7 | Intake eligibility and queue label safety |
-| `apps/decodex/src/orchestrator/tests/intake/run_and_prompting.rs` | 38 | Prompt construction, machine-only redaction, run setup |
+| `apps/decodex/src/orchestrator/tests/intake/run_and_prompting.rs` | 33 | Prompt construction, machine-only redaction, run setup |
 | `apps/decodex/src/orchestrator/tests/intake/prepare_issue_run.rs` | 10 | Worktree preparation and pre-run guards |
-| `apps/decodex/src/orchestrator/tests/intake/candidate_selection.rs` | 24 | Candidate ordering, retained lane preference, closeout dispatch policy |
-| `apps/decodex/src/orchestrator/tests/retry/scheduling.rs` | 28 | Retry timing, dry-run behavior, retry marker semantics |
-| `apps/decodex/src/orchestrator/tests/retry/selection.rs` | 16 | Retry queue selection and blocked retry candidates |
+| `apps/decodex/src/orchestrator/tests/intake/candidate_selection.rs` | 22 | Candidate ordering, retained lane preference, closeout dispatch policy |
+| `apps/decodex/src/orchestrator/tests/retry/scheduling.rs` | 24 | Retry timing, dry-run behavior, retry marker semantics |
+| `apps/decodex/src/orchestrator/tests/retry/selection.rs` | 14 | Retry queue selection and blocked retry candidates |
 | `apps/decodex/src/orchestrator/tests/runtime/repo_gate.rs` | 8 | Repo gate command selection, cleanliness, shell fallback, and failure classification |
-| `apps/decodex/src/orchestrator/tests/runtime/failure.rs` | 35 | Failure comments, runtime credentials, cleanup, lease release |
-| `apps/decodex/src/orchestrator/tests/recovery/reconciliation.rs` | 18 | Stale lease, recovery worktree, and reconciliation behavior |
+| `apps/decodex/src/orchestrator/tests/runtime/failure.rs` | 30 | Failure comments, runtime credentials, cleanup, lease release |
+| `apps/decodex/src/orchestrator/tests/recovery/reconciliation.rs` | 17 | Stale lease, recovery worktree, and reconciliation behavior |
 | `apps/decodex/src/orchestrator/tests/recovery/terminal_support.rs` | 0 | Shared retained recovery and closeout fixtures |
-| `apps/decodex/src/orchestrator/tests/recovery/closeout/dispatch.rs` | 4 | Direct closeout dispatch and PR validation |
-| `apps/decodex/src/orchestrator/tests/recovery/closeout/identity.rs` | 6 | Closeout identity reuse after retained runs |
+| `apps/decodex/src/orchestrator/tests/recovery/closeout/dispatch.rs` | 5 | Direct closeout dispatch and PR validation |
+| `apps/decodex/src/orchestrator/tests/recovery/closeout/identity.rs` | 4 | Closeout identity reuse after retained runs |
 | `apps/decodex/src/orchestrator/tests/recovery/closeout/cleanup.rs` | 6 | Retained closeout cleanup and cleanup blockers |
 | `apps/decodex/src/orchestrator/tests/recovery/terminal_failures.rs` | 8 | Terminal failure labeling and nonretryable attention |
-| `apps/decodex/src/orchestrator/tests/recovery/runtime_reentry.rs` | 25 | Runtime reentry, recovered worktrees, liveness, and live-run recovery |
+| `apps/decodex/src/orchestrator/tests/recovery/runtime_reentry.rs` | 27 | Runtime reentry, recovered worktrees, liveness, and live-run recovery |
 | `apps/decodex/src/orchestrator/tests/operator/status_support.rs` | 0 | Shared operator status fixtures |
-| `apps/decodex/src/orchestrator/tests/operator/status/control_plane.rs` | 3 | Registered project control-plane rows |
-| `apps/decodex/src/orchestrator/tests/operator/status/running_lanes.rs` | 22 | Running lanes, stalled lanes, active-run hydration, and local worktrees |
-| `apps/decodex/src/orchestrator/tests/operator/status/history.rs` | 4 | Run ledger and Linear history hydration |
-| `apps/decodex/src/orchestrator/tests/operator/status/text.rs` | 4 | Human-readable operator status text |
-| `apps/decodex/src/orchestrator/tests/operator/status/publishing.rs` | 6 | Snapshot publishing, degraded observers, and tracker backoff |
-| `apps/decodex/src/orchestrator/tests/operator/status/queue.rs` | 8 | Intake queue classifications and shared-claim visibility |
-| `apps/decodex/src/orchestrator/tests/operator/status/http.rs` | 17 | Operator dashboard HTTP pages/assets, `/livez`, WebSocket control, and removed snapshot-route responses |
-| `apps/decodex/src/orchestrator/tests/operator/status/dashboard.rs` | 33 | Dashboard client rendering contracts |
+| `apps/decodex/src/orchestrator/tests/operator/status/control_plane.rs` | 5 | Registered project control-plane rows |
+| `apps/decodex/src/orchestrator/tests/operator/status/running_lanes.rs` | 29 | Running lanes, stalled lanes, active-run hydration, and local worktrees |
+| `apps/decodex/src/orchestrator/tests/operator/status/history.rs` | 6 | Run ledger and Linear history hydration |
+| `apps/decodex/src/orchestrator/tests/operator/status/text.rs` | 8 | Human-readable operator status text |
+| `apps/decodex/src/orchestrator/tests/operator/status/publishing.rs` | 7 | Snapshot publishing, degraded observers, and tracker backoff |
+| `apps/decodex/src/orchestrator/tests/operator/status/queue.rs` | 10 | Intake queue classifications and shared-claim visibility |
+| `apps/decodex/src/orchestrator/tests/operator/status/http.rs` | 21 | Operator dashboard HTTP pages/assets, `/livez`, WebSocket control, and removed snapshot-route responses |
+| `apps/decodex/src/orchestrator/tests/operator/status/dashboard.rs` | 34 | Dashboard client rendering contracts |
+| `apps/decodex/src/orchestrator/tests/operator/status/agent_evidence.rs` | 4 | Agent evidence snapshots and private evidence readback |
 | `apps/decodex/src/orchestrator/tests/review_landing/status_support.rs` | 0 | Shared Review & Landing status fixtures |
-| `apps/decodex/src/orchestrator/tests/review_landing/status_rows.rs` | 18 | Review & Landing status rows and handoff lineage |
-| `apps/decodex/src/orchestrator/tests/review_landing/orchestration.rs` | 12 | Review orchestration, admin merge, and repair routing |
-| `apps/decodex/src/orchestrator/tests/review_landing/status_markers.rs` | 2 | Review orchestration marker handling and recovered targeted visibility |
-| `apps/decodex/src/orchestrator/tests/review_landing/classification_review.rs` | 13 | Review repair, request-pending, stale handoff, merged PR classification |
-| `apps/decodex/src/orchestrator/tests/review_landing/classification_checks.rs` | 15 | Required checks, GitHub token gates, GraphQL pagination/query shape |
+| `apps/decodex/src/orchestrator/tests/review_landing/status_rows.rs` | 17 | Review & Landing status rows and handoff lineage |
+| `apps/decodex/src/orchestrator/tests/review_landing/orchestration.rs` | 16 | Review orchestration, admin merge, and repair routing |
+| `apps/decodex/src/orchestrator/tests/review_landing/status_markers.rs` | 1 | Review orchestration marker handling and recovered targeted visibility |
+| `apps/decodex/src/orchestrator/tests/review_landing/classification_review.rs` | 12 | Review repair, request-pending, stale handoff, merged PR classification |
+| `apps/decodex/src/orchestrator/tests/review_landing/classification_checks.rs` | 14 | Required checks, GitHub token gates, GraphQL pagination/query shape |
 | `apps/decodex/src/orchestrator/tests/review_landing/review_state.rs` | 2 | Pull-request review-state conversion from GitHub GraphQL nodes |
 
 ## Tracker Bridge Inventory
 
 | File | Count | Group |
 | --- | ---: | --- |
-| `apps/decodex/src/agent/tracker_tool_bridge/tests/mutation/dispatch.rs` | 22 | Tool argument validation, state transitions, label mutations, closeout dispatch |
+| `apps/decodex/src/agent/tracker_tool_bridge/tests/mutation/dispatch.rs` | 24 | Tool argument validation, state transitions, label mutations, closeout dispatch |
 | `apps/decodex/src/agent/tracker_tool_bridge/tests/mutation/continuation.rs` | 13 | Continuation-blocking writes and reactivation safety |
-| `apps/decodex/src/agent/tracker_tool_bridge/tests/mutation/progress.rs` | 5 | Progress checkpoint comments and worktree path handling |
+| `apps/decodex/src/agent/tracker_tool_bridge/tests/mutation/progress.rs` | 7 | Progress checkpoint comments and worktree path handling |
 | `apps/decodex/src/agent/tracker_tool_bridge/tests/review/policy.rs` | 22 | Internal-review stop policy, repair/writeback behavior, checkpoint handling |
-| `apps/decodex/src/agent/tracker_tool_bridge/tests/review/handoff.rs` | 23 | Review handoff, repair complete, terminal finalize, closeout complete |
+| `apps/decodex/src/agent/tracker_tool_bridge/tests/review/handoff.rs` | 19 | Review handoff, repair complete, terminal finalize, closeout complete |
 
 ## Keep Standards
 
diff --git a/docs/reference/workspace-layout.md b/docs/reference/workspace-layout.md
index 52c6f3ce..64f072f5 100644
--- a/docs/reference/workspace-layout.md
+++ b/docs/reference/workspace-layout.md
@@ -115,10 +115,21 @@ Runtime state that belongs to the local operator, not to this repository, lives
 `~/.codex/decodex/`:
 
 - `runtime.sqlite3` is the single-machine control-plane database for all registered
-  projects.
+  projects. It owns active leases, attempts, private execution events, tracker/PR
+  caches, retained PR state, retry state, and project registration.
+- `agent-evidence//` stores local agent-readable diagnosis artifacts,
+  including `handoff-index.json`, `events.jsonl`, `blockers/*.json`, and
+  `runs///capsule.json`. This is a derived handoff view, not the
+  runtime source of truth and not a public mirror.
 - `accounts.jsonl` stores the optional shared ChatGPT account pool used for Codex
   app-server auth token injection and refresh.
-- `logs/` stores Decodex process logs.
+- `logs/` stores Decodex process logs. Logs are diagnostic text; structured execution
+  evidence belongs in `runtime.sqlite3`.
+- `projects//project.toml` stores the central service config for one
+  registered project.
+- `projects//WORKFLOW.md` stores that project's execution policy.
+- Project discovery comes from explicit registration, not from scanning Codex history
+  or repo-local config files.
 
 Repo-local Radar history that belongs to the current checkout, not to Git, lives under
 `.decodex/`:
@@ -128,14 +139,6 @@ Repo-local Radar history that belongs to the current checkout, not to Git, lives
 
 `.decodex/` is ignored by Git. Public curated artifacts and archive manifests remain in
 the checked-in tree.
-- `agent-evidence//` stores local agent-readable diagnosis artifacts,
-  including `handoff-index.json`, `events.jsonl`, `blockers/*.json`, and
-  `runs///capsule.json`.
-- `projects//project.toml` stores the central service config for one
-  registered project.
-- `projects//WORKFLOW.md` stores that project's execution policy.
-- Project discovery comes from explicit registration, not from scanning Codex history
-  or repo-local config files.
 
 This local control-plane state chooses registered projects. Once a checkout is selected,
 the matching project directory's `WORKFLOW.md` remains the execution contract for gates,
diff --git a/docs/runbook/self-dogfood-pilot.md b/docs/runbook/self-dogfood-pilot.md
index a46f2c83..cd8a03b9 100644
--- a/docs/runbook/self-dogfood-pilot.md
+++ b/docs/runbook/self-dogfood-pilot.md
@@ -527,45 +527,68 @@ wants to observe the self-bootstrap loop without reading source code.
 Worktrees`, the runtime DB-backed status view, the latest coarse Linear summaries, and
 the retained worktree named on the card.
 
-### Local versus Linear state
+### Local Evidence, Diagnostics, And Linear State
 
-Use this boundary when the dashboard, retained worktrees, and Linear issue state disagree.
+Use this boundary when the dashboard, retained worktrees, local evidence, logs, and
+Linear issue state disagree.
 
-Decodex stores runtime state in one SQLite database owned by the local Decodex
-installation:
+Decodex stores private runtime evidence in one SQLite database owned by the local
+Decodex installation:
 
 - registered projects and config fingerprints
 - active leases, dispatch slots, run attempts, retry schedules, protocol events, and
   local `Run Ledger` attempt rows
+- private execution events for full checkpoint payloads, verification notes, local head
+  evidence, and recovery details scoped by project, issue, run, and attempt
 - linked Git worktrees under `.worktrees/` plus shared Git administration under
   `.git/worktrees/*`
-- short-lived worktree heartbeat markers such as `.decodex-run-activity`
 - current `status` and dashboard snapshots derived from the runtime DB, live process,
   retained worktrees, and low-frequency connector cache
 
-Linear stores the collaboration surface that teammates and later machines can see:
+Decodex writes derived local handoff evidence under
+`~/.codex/decodex/agent-evidence//`. Use it to hand a repair agent a
+compact `handoff-index.json`, blocker snapshots, run capsules, and a pointer to
+`decodex evidence`. Do not treat those files as scheduling authority, GitHub/Linear
+collaboration records, or a replacement for SQLite.
+
+Logs and `.decodex-run-activity` are diagnostic. Logs explain process failures and
+maintenance warnings. The activity marker explains live child process and protocol
+liveness. Neither surface is the structured private evidence ledger, and neither should
+be pasted into Linear as execution history.
+
+Linear stores the public collaboration surface that teammates and later machines can
+see:
 
 - issue state such as `Todo`, `In Progress`, `In Review`, and terminal states
 - Decodex control labels such as `decodex:queued:`, `decodex:active:`,
   `decodex:manual-only`, and `decodex:needs-attention`
-- issue comments, including Linear execution ledger records for completed lane
-  outcomes, progress checkpoints, handoff notes, failure explanations, and human
-  replies
-- issue description, attachments, linked documents, and PR references that provide
-  shared issue context
+- Linear execution ledger comments for low-frequency lifecycle records such as
+  run-start, material progress phase, PR handoff, failure, landing, closeout, and
+  cleanup summaries
+- issue description, attachments, linked documents, human comments, and PR references
+  that provide shared issue context
 
 Do not treat Linear comments as the real-time runtime backend. Fine-grained timing,
-retry state, raw attempt history, agent activity, connector backoff, and recovery
-details belong in the Decodex runtime DB. If teammates need to understand the
-team-visible outcome of a completed lane, keep Linear issue state, labels, execution
-ledger comments, attachments, or the PR link current at the coarse lifecycle boundary.
+retry state, raw attempt history, full checkpoint text, agent activity, connector
+backoff, private evidence payloads, and recovery details belong in runtime SQLite or
+local diagnostic surfaces. Sparse Linear history is healthy when the public lifecycle
+summary is current and the full evidence can be read locally.
+
+For recovery, start with the operator dashboard or `status`, then inspect private
+evidence when the public summary is too terse:
+
+```sh
+decodex status --json
+decodex evidence XY-123 --run-id --attempt --json
+```
 
-For recovery, start with the operator dashboard or `status`, then inspect Linear state
-and comments for the team-visible lifecycle record. A retained worktree or runtime DB
-recovery row means "inspect this local lane"; it does not mean the team-visible issue
-state changed. Removing a short-lived heartbeat marker does not erase the runtime DB row
-or the Linear summary.
+Use `--include-payload` only for local repair. Do not paste full payloads into Linear or
+GitHub. Inspect Linear state and comments after local readback when you need the
+team-visible lifecycle record. A retained worktree or runtime DB recovery row means
+"inspect this local lane"; it does not mean the team-visible issue state changed.
+Removing a short-lived heartbeat marker does not erase the runtime DB row or the Linear
+summary.
 
 Decodex is intentionally Unix-only, and the control plane relies on Unix file-descriptor
 inheritance when the parent process hands the project dispatch-slot lock to the spawned
 hidden `_attempt` child.
@@ -620,16 +643,42 @@ After running manual commands from a lane, check `decodex project list` or the o
 
 ## Inspecting a failed run
 
-Start with Linear:
+Start with local runtime readback:
+
+```sh
+cargo run -p decodex --bin decodex -- status
+cargo run -p decodex --bin decodex -- status --json
+```
+
+Use the human-readable view when you need the current leased run, lane worktree
+ownership, and session-history summary at a glance. Use `--json` when you want stable
+identifiers such as `run_id`, `issue_id`, `thread_id`, `branch`, and
+repository-relative `worktree_path`.
+
+If the status row or run capsule points to private evidence, inspect it before treating
+the Linear summary as complete:
+
+```sh
+cargo run -p decodex --bin decodex -- evidence XY-123 --run-id --attempt --json
+```
+
+Then inspect Linear for the public collaboration state:
 
 - check the issue state
-- read the latest `decodex` comment for `run_id`, attempt number, timestamps, and next action
+- read the latest Decodex ledger comment for public `run_id`, attempt number,
+  timestamps, phase, and next action
 - if retries were exhausted, look for the `decodex:needs-attention` label
-- if the agent explicitly requested human attention, expect the issue to move back to `Todo` with `decodex:needs-attention` immediately instead of retrying
-- any issue that still carries `decodex:needs-attention` is intentionally ineligible for another automatic run until a human clears that label
-- if the failure comment says the label was unavailable on the team, expect the issue to remain in a non-startable guard state such as `In Progress` until a human moves it back to a startable state manually
-- if the issue is already terminal, expect the worktree to disappear on the next live pass or startup reconciliation
-- if the run failed as `stalled_run_detected`, expect the worktree to remain in place so you can inspect the partially completed lane before re-enabling automation
+- if the agent explicitly requested human attention, expect the issue to move back to
+  `Todo` with `decodex:needs-attention` immediately instead of retrying
+- any issue that still carries `decodex:needs-attention` is intentionally ineligible
+  for another automatic run until a human clears that label
+- if the failure comment says the label was unavailable on the team, expect the issue to
+  remain in a non-startable guard state such as `In Progress` until a human moves it
+  back to a startable state manually
+- if the issue is already terminal, expect the worktree to disappear on the next live
+  pass or startup reconciliation
+- if the run failed as `stalled_run_detected`, expect the worktree to remain in place so
+  you can inspect the partially completed lane before re-enabling automation
 
 Parent repo-gate retry note:
 
@@ -643,22 +692,13 @@ Parent repo-gate retry note:
 Linear reaches `Done` with the service queue/active labels and `decodex:needs-attention`
 cleaned up.
 
-Then inspect the worktree mentioned in the comment:
+Then inspect the worktree mentioned by status, private evidence, or the public ledger:
 
 ```sh
 git -C /absolute/path/to/hack-ink/decodex/.worktrees/XY-123 status --short
 git -C /absolute/path/to/hack-ink/decodex/.worktrees/XY-123 log --oneline --decorate -5
 ```
 
-Before dropping to local storage internals, inspect the supported runtime surface:
-
-```sh
-cargo run -p decodex --bin decodex -- status
-cargo run -p decodex --bin decodex -- status --json
-```
-
-Use the human-readable view when you need the current leased run, lane worktree ownership, and session-history summary at a glance. Use `--json` when you want a machine-readable snapshot with stable identifiers such as `run_id`, `issue_id`, `thread_id`, `branch`, and repository-relative `worktree_path`.
-
 The operator snapshot also exposes coarse liveness semantics so you do not have to infer progress from worktree file churn alone:
 
 - `phase = executing`: the lane is actively running
@@ -682,7 +722,7 @@ For the running-lane fields:
 
 If you pass `--limit`, it only caps the recent-run section. Running lanes remain uncapped in both the human-readable and JSON status views so the currently leased lanes stay visible.
 
-The runtime SQLite database is the supported recovery store, but operators should not debug it through ad hoc SQL first. If `status` is insufficient, inspect the tracker summary plus retained worktree lane directly:
+The runtime SQLite database is the supported recovery store, but operators should not debug it through ad hoc SQL first. If `status` and `decodex evidence` are insufficient, inspect the public tracker summary plus retained worktree lane directly:
 
 1. Read the Linear issue state, labels, comments, attachments, and linked PR for `XY-123`.
 2. Inspect the retained worktree:
@@ -692,7 +732,10 @@ The runtime SQLite database is the supported recovery store, but operators shoul
 git -C /absolute/path/to/hack-ink/decodex/.worktrees/XY-123 log --oneline --decorate -5
 ```
 
-Use the operator dashboard or `status` for run ids, attempts, and failure class; use the retained worktree when the failure happened inside `app-server` transport or thread lifecycle rather than during repo gate commands. Linear should carry only the coarse team-visible failure summary.
+Use the operator dashboard, `status`, and `decodex evidence` for run ids, attempts,
+failure class, and private execution evidence. Use the retained worktree when the
+failure happened inside `app-server` transport or thread lifecycle rather than during
+repo gate commands. Linear should carry only the coarse team-visible failure summary.
 
 ## Re-running after failure
 
diff --git a/docs/spec/agent-evidence.md b/docs/spec/agent-evidence.md
index 847baf64..33306f07 100644
--- a/docs/spec/agent-evidence.md
+++ b/docs/spec/agent-evidence.md
@@ -31,6 +31,15 @@ evidence that should not be mirrored to Linear. Agent evidence files may point a
 toward the current runtime context, but they do not replace the private execution event
 store.
 
+Boundary summary:
+
+| Surface | Role |
+| --- | --- |
+| Runtime SQLite private execution events | Authoritative structured local evidence for a run/attempt. |
+| Agent evidence files | Derived local handoff index, blocker snapshots, run capsules, and event-stream breadcrumbs for repair agents. |
+| Process logs | Diagnostic text for control-plane and process failures. Logs are not the private evidence ledger. |
+| Linear execution ledger comments | Public lifecycle mirror. Agent evidence must not be mirrored there. |
+
 ## Path Layout
 
 Agent evidence lives under the local Decodex home:
diff --git a/docs/spec/linear-execution-ledger.md b/docs/spec/linear-execution-ledger.md
index 0995a18d..ba7e24d4 100644
--- a/docs/spec/linear-execution-ledger.md
+++ b/docs/spec/linear-execution-ledger.md
@@ -93,6 +93,14 @@ fields. `failed_command` and `raw_error` are optional and must be omitted or rej
 when they contain host-local paths, credential-like names, private identity details,
 tokens, secrets, or other private runtime evidence.
 
+Linear frequency is deliberately sparse. A lane should normally create one start
+record, a new public progress projection only when the material public lifecycle signal
+changes, PR/handoff records when review state changes, and terminal failure, landing,
+closeout, or cleanup records at those coarse boundaries. Private-only updates such as a
+new checkpoint focus, next action, evidence item, verification note, raw command output,
+heartbeat, token pressure, or retry detail belong in runtime SQLite, agent evidence, or
+diagnostic logs, not in another Linear comment.
+
 ## Record envelope
 
 All field names are snake_case.
diff --git a/docs/spec/runtime.md b/docs/spec/runtime.md
index 166c58a7..783a7703 100644
--- a/docs/spec/runtime.md
+++ b/docs/spec/runtime.md
@@ -47,6 +47,16 @@ Defines: The runtime scope, source-of-truth boundaries, eligibility rules, lane
 runtime ownership.
 - The local SQLite database must not become a replacement for the human issue backlog. It is the operator control-plane state for this machine.
 
+The evidence boundary is ordered from private runtime authority to public collaboration
+mirror:
+
+| Surface | Boundary |
+| --- | --- |
+| Runtime SQLite `private_execution_events` | Structured private execution evidence for the local Decodex installation. This is where full checkpoint payloads, verification notes, local head evidence, and recovery detail belong. |
+| Agent evidence under `~/.codex/decodex/agent-evidence//` | Derived local handoff view for repair agents. It may reference private evidence readback commands and compact run capsules, but it is not scheduling authority and is not a public mirror. |
+| Logs under `~/.codex/decodex/logs/` and `.decodex-run-activity` | Diagnostic process and liveness signals. They may explain what a local process did, but they are not the structured execution ledger and must not be replayed as tracker state. |
+| Linear execution ledger comments | Low-frequency public projection for team-visible lifecycle state. They carry coarse start, progress phase, PR, handoff, failure, landing, closeout, and cleanup summaries only. |
+
 ### Operator snapshot recovery boundary
 
 Operator snapshots are local runtime views. They must remain useful when Linear is unavailable by reading the Decodex runtime SQLite database, retained worktrees, and locally cached connector state that already belong to this machine.
diff --git a/docs/spec/tracker-tools.md b/docs/spec/tracker-tools.md
index 22947de6..e39ac806 100644
--- a/docs/spec/tracker-tools.md
+++ b/docs/spec/tracker-tools.md
@@ -64,6 +64,23 @@ The follow-up MVP should support these issue-scoped operations:
 
 Additional operations such as richer metadata updates may be added later, but they are not required for the first PR-backed self-dogfood pilot.
 
+## Private And Public Outputs
+
+Tracker tools must keep local execution evidence private by default:
+
+- `issue_progress_checkpoint` stores the full normalized checkpoint payload in
+  runtime SQLite `private_execution_events` before any Linear write is attempted.
+- Its Linear output is only the public `progress_checkpoint` projection defined by
+  [`linear-execution-ledger.md`](./linear-execution-ledger.md). That projection is
+  keyed by public lifecycle signal, so private-only checkpoint changes do not create
+  another Linear comment.
+- `issue_comment` is not a generic comment escape hatch. It accepts only allowlisted
+  public comment kinds, currently `manual_attention`, and Decodex renders the
+  corresponding Linear ledger record from structured public fields.
+- Logs and `.decodex-run-activity` markers are diagnostic inputs for local operators.
+  They must not be copied into Linear through tracker tools and must not replace
+  private execution events.
+
 ## Completion signal contract
 
 At turn completion, the issue-scoped tool bridge must leave `decodex` with exactly one terminal completion signal for the leased issue and a matching explicit terminal-finalization call: