From 25f8406f7d3936cb0fc7158615dee9cdbe7f6a84 Mon Sep 17 00:00:00 2001
From: James Ross
Date: Mon, 30 Mar 2026 14:12:25 -0700
Subject: [PATCH] docs: improve security doc discoverability

---
 ARCHITECTURE.md                               |   3 +-
 CHANGELOG.md                                  |   1 +
 CONTRIBUTING.md                               |   2 +
 README.md                                     |  45 ++++--
 WORKFLOW.md                                   |   3 +
 docs/API.md                                   |   6 +
 docs/BACKLOG/README.md                        |   1 -
 docs/DOCS_CHECKLIST.md                        |   1 +
 docs/archive/BACKLOG/README.md                |   2 +
 ...-007-security-doc-discoverability-audit.md |   6 +-
 docs/design/README.md                         |   1 +
 ...-007-security-doc-discoverability-audit.md | 141 ++++++++++++++++++
 docs/legends/TR-truth.md                      |   4 +-
 13 files changed, 193 insertions(+), 23 deletions(-)
 rename docs/{ => archive}/BACKLOG/TR-007-security-doc-discoverability-audit.md (78%)
 create mode 100644 docs/design/TR-007-security-doc-discoverability-audit.md

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index f42c184..1f53ca4 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -4,7 +4,8 @@
 This document is the high-level map of the shipped `git-cas` system. It is
 intentionally not a full API reference. For command and method details, see
 [docs/API.md](./docs/API.md). For crypto and security guidance, see
-[SECURITY.md](./SECURITY.md).
+[SECURITY.md](./SECURITY.md). For attacker models, trust boundaries, and
+metadata exposure, see [docs/THREAT_MODEL.md](./docs/THREAT_MODEL.md).

 ## System Model

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 907497a..d5170d4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -30,6 +30,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **Planning lifecycle clarified** — live backlog items now exclude delivered work, archive directories now hold retired backlog history and reserved retired design space, landed cycle docs use explicit landed status, and the design/backlog indexes now reflect current truth instead of stale activity.
 - **Architecture map repaired** — [ARCHITECTURE.md](./ARCHITECTURE.md) now describes the shipped system instead of an older flat-manifest-only model, including Merkle manifests, the extracted `VaultService` and `KeyResolver`, current ports/adapters, and the real storage layout for trees and the vault.
 - **Architecture navigation clarified** — [ARCHITECTURE.md](./ARCHITECTURE.md) now distinguishes the public package boundary from internal domain helpers and links directly to [docs/THREAT_MODEL.md](./docs/THREAT_MODEL.md) as adjacent truth.
+- **Security doc discoverability improved** — [README.md](./README.md), [CONTRIBUTING.md](./CONTRIBUTING.md), [WORKFLOW.md](./WORKFLOW.md), [ARCHITECTURE.md](./ARCHITECTURE.md), [docs/API.md](./docs/API.md), and [docs/DOCS_CHECKLIST.md](./docs/DOCS_CHECKLIST.md) now link more directly to [SECURITY.md](./SECURITY.md) and [docs/THREAT_MODEL.md](./docs/THREAT_MODEL.md) so maintainers and agents can find the canonical security guidance from the docs they read first.
 - **GitHub Actions runtime maintenance** — CI and release workflows now run on `actions/checkout@v6` and `actions/setup-node@v6`, clearing the Node 20 deprecation warnings from GitHub-hosted runners.
 - **Ubuntu-based Docker test stages** — the local/CI Node, Bun, and Deno test images now build on `ubuntu:24.04`, copying runtime binaries from the official upstream images instead of inheriting Debian-based runtime images directly, and the final test commands now run as an unprivileged `gitstunts` user.
 - **Test conventions expanded** — `test/CONVENTIONS.md` now documents Git tree filename ordering, Docker-only integration policy, pinned integration `fileParallelism: false`, and direct-argv subprocess helpers.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e947bf6..a96cad7 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -321,6 +321,8 @@ Before making non-trivial changes, read:
 - [docs/BACKLOG/README.md](./docs/BACKLOG/README.md)
 - [docs/design/README.md](./docs/design/README.md)
 - [docs/design/0001-m18-relay-agent-cli.md](./docs/design/0001-m18-relay-agent-cli.md)
+- [SECURITY.md](./SECURITY.md)
+- [docs/THREAT_MODEL.md](./docs/THREAT_MODEL.md)
 - [docs/API.md](./docs/API.md)
 - [docs/RELEASE.md](./docs/RELEASE.md)
 - [COMPLETED_TASKS.md](./COMPLETED_TASKS.md)
diff --git a/README.md b/README.md
index e496dfc..5eeccd4 100644
--- a/README.md
+++ b/README.md
@@ -10,9 +10,9 @@

 ### Git, freebased: pure CAS that’ll knock your SHAs off. LFS hates this repo.

-Git isn’t source control.
+Git isn’t source control.
 Git is a content-addressed object database.
-We use the object database.
+We use the object database.

 `git-cas` chunks files into Git blobs (dedupe for free), optionally encrypts them, and emits a manifest + a real Git tree so you can commit/tag/ref it like any other artifact.

@@ -102,12 +102,16 @@ See [CHANGELOG.md](./CHANGELOG.md) for the full list of changes.
 ```js
 // Rotate a single recipient's key
 const rotated = await cas.rotateKey({
-  manifest, oldKey: aliceOldKey, newKey: aliceNewKey, label: 'alice',
+  manifest,
+  oldKey: aliceOldKey,
+  newKey: aliceNewKey,
+  label: 'alice',
 });

 // Rotate the vault passphrase (all entries, atomic commit)
 const { commitOid, rotatedSlugs, skippedSlugs } = await cas.rotateVaultPassphrase({
-  oldPassphrase: 'old-secret', newPassphrase: 'new-secret',
+  oldPassphrase: 'old-secret',
+  newPassphrase: 'new-secret',
 });
 ```

@@ -138,7 +142,10 @@ const manifest = await cas.storeFile({

 // Add a recipient later (no re-encryption)
 const updated = await cas.addRecipient({
-  manifest, existingKey: aliceKey, newRecipientKey: carolKey, label: 'carol',
+  manifest,
+  existingKey: aliceKey,
+  newRecipientKey: carolKey,
+  label: 'carol',
 });

 // List / remove recipients
@@ -167,7 +174,12 @@ See [CHANGELOG.md](./CHANGELOG.md) for the full list of changes.
 ```js
 const cas = new ContentAddressableStore({
   plumbing,
-  chunking: { strategy: 'cdc', targetChunkSize: 262144, minChunkSize: 65536, maxChunkSize: 1048576 },
+  chunking: {
+    strategy: 'cdc',
+    targetChunkSize: 262144,
+    minChunkSize: 65536,
+    maxChunkSize: 1048576,
+  },
 });
 ```

@@ -367,7 +379,8 @@ CLI flags always take precedence over `.casrc` values.
 - [Guide](./GUIDE.md) — progressive walkthrough
 - [API Reference](./docs/API.md) — full method documentation
 - [Architecture](./ARCHITECTURE.md) — hexagonal design overview
-- [Security](./SECURITY.md) — crypto design and threat model
+- [Security](./SECURITY.md) — cryptographic design, limits, and operational guidance
+- [Threat Model](./docs/THREAT_MODEL.md) — trust boundaries, exposed metadata, and explicit non-goals

 ## When to use git-cas (and when not to)

@@ -379,14 +392,14 @@ Use an **orphan branch**. Seriously. It's 5 git commands, zero dependencies, and

 That's git-cas. The orphan branch gives you none of:

-| | Orphan branch | git-cas |
-|---|---|---|
-| **Encryption** | None — plaintext forever in history | AES-256-GCM + passphrase KDF + multi-recipient + key rotation |
-| **Large files** | Bloats `git clone` for everyone | Chunked, restored on demand |
-| **Dedup** | None | Chunk-level content addressing |
-| **Integrity** | Git SHA-1 | SHA-256 per chunk + GCM auth tag |
-| **Lifecycle** | `git rm` (still in reflog) | Vault with audit trail + `git gc` reclaims |
-| **Compression** | None | gzip before encryption |
+|                 | Orphan branch                       | git-cas                                                       |
+| --------------- | ----------------------------------- | ------------------------------------------------------------- |
+| **Encryption**  | None — plaintext forever in history | AES-256-GCM + passphrase KDF + multi-recipient + key rotation |
+| **Large files** | Bloats `git clone` for everyone     | Chunked, restored on demand                                   |
+| **Dedup**       | None                                | Chunk-level content addressing                                |
+| **Integrity**   | Git SHA-1                           | SHA-256 per chunk + GCM auth tag                              |
+| **Lifecycle**   | `git rm` (still in reflog)          | Vault with audit trail + `git gc` reclaims                    |
+| **Compression** | None                                | gzip before encryption                                        |

 ### "Why not Git LFS?"

@@ -404,7 +417,7 @@ If your team uses GitHub and needs file locking + web UI previews, use LFS. If y

 ## License

-Apache-2.0
+Apache-2.0
 Copyright © 2026 [James Ross](https://github.com/flyingrobots)

 ---
diff --git a/WORKFLOW.md b/WORKFLOW.md
index e49f86a..728a73e 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
@@ -200,6 +200,9 @@ The minimum review must confirm:
 - `main` is the playback truth when docs and branches drift.
 - Doc-heavy branches should run
   [docs/DOCS_CHECKLIST.md](./docs/DOCS_CHECKLIST.md) before review.
+- When a doc makes security or threat claims, link [SECURITY.md](./SECURITY.md)
+  and [docs/THREAT_MODEL.md](./docs/THREAT_MODEL.md) instead of creating a
+  second canonical narrative.
 - Human CLI/TUI and agent CLI are separate surfaces over one shared core.
 - The human `--json` surface and the agent JSONL surface are not the same
   contract.
diff --git a/docs/API.md b/docs/API.md
index 630c6e8..8232c8a 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -2,6 +2,11 @@

 This document provides the complete API reference for git-cas.

+For cryptographic design, nonce and KDF guidance, and security-relevant
+implementation details, see [SECURITY.md](../SECURITY.md). For attacker models,
+trust boundaries, exposed metadata, and explicit non-goals, see
+[docs/THREAT_MODEL.md](./THREAT_MODEL.md).
+
 ## Table of Contents

 1. [ContentAddressableStore](#contentaddressablestore)
@@ -789,6 +794,7 @@ The vault stores the KDF parameters (algorithm, salt, iterations) in

 This does not make `refs/cas/vault` itself confidential. The vault remains a
 readable slug-to-tree index for repository readers. See
+[SECURITY.md](../SECURITY.md) for the cryptographic design details and
 [docs/THREAT_MODEL.md](./THREAT_MODEL.md) for the explicit boundary.

 This is not an implicit library-level `store()` or `restore()` behavior.
diff --git a/docs/BACKLOG/README.md b/docs/BACKLOG/README.md
index 961d264..24ad175 100644
--- a/docs/BACKLOG/README.md
+++ b/docs/BACKLOG/README.md
@@ -30,7 +30,6 @@ If the planning history is still useful, move it to
 Current backlog items:

 - [TR-005 — CasService Decomposition Plan](./TR-005-casservice-decomposition-plan.md)
-- [TR-007 — Security Doc Discoverability Audit](./TR-007-security-doc-discoverability-audit.md)
 - [TR-008 — Empty-State Phrasing Consistency](./TR-008-empty-state-phrasing-consistency.md)
 - [TR-009 — Pre-PR Doc Cross-Link Audit](./TR-009-pre-pr-doc-cross-link-audit.md)
 - [TR-011 — Streaming Encrypted Restore](./TR-011-streaming-encrypted-restore.md)
diff --git a/docs/DOCS_CHECKLIST.md b/docs/DOCS_CHECKLIST.md
index 08cb437..7e2aa4e 100644
--- a/docs/DOCS_CHECKLIST.md
+++ b/docs/DOCS_CHECKLIST.md
@@ -57,6 +57,7 @@ This checklist is most useful when a change touches files like:
 - [CONTRIBUTING.md](../CONTRIBUTING.md)
 - [WORKFLOW.md](../WORKFLOW.md)
 - [ARCHITECTURE.md](../ARCHITECTURE.md)
+- [SECURITY.md](../SECURITY.md)
 - [docs/API.md](./API.md)
 - [docs/THREAT_MODEL.md](./THREAT_MODEL.md)
 - [docs/BENCHMARKS.md](./BENCHMARKS.md)
diff --git a/docs/archive/BACKLOG/README.md b/docs/archive/BACKLOG/README.md
index 6aa035e..fc8b0e1 100644
--- a/docs/archive/BACKLOG/README.md
+++ b/docs/archive/BACKLOG/README.md
@@ -27,5 +27,7 @@ Landed archived backlog items:
   - landed as [TR-004 — Truth: Design Doc Lifecycle](../../design/TR-004-design-doc-lifecycle.md)
 - [TR-006 — Docs Maintainer Checklist](./TR-006-docs-maintainer-checklist.md)
   - landed as [TR-006 — Truth: Docs Maintainer Checklist](../../design/TR-006-docs-maintainer-checklist.md)
+- [TR-007 — Security Doc Discoverability Audit](./TR-007-security-doc-discoverability-audit.md)
+  - landed as [TR-007 — Truth: Security Doc Discoverability Audit](../../design/TR-007-security-doc-discoverability-audit.md)
 - [TR-010 — Planning Index Consistency Review](./TR-010-planning-index-consistency-review.md)
   - landed as [TR-010 — Truth: Planning Index Consistency Review](../../design/TR-010-planning-index-consistency-review.md)
diff --git a/docs/BACKLOG/TR-007-security-doc-discoverability-audit.md b/docs/archive/BACKLOG/TR-007-security-doc-discoverability-audit.md
similarity index 78%
rename from docs/BACKLOG/TR-007-security-doc-discoverability-audit.md
rename to docs/archive/BACKLOG/TR-007-security-doc-discoverability-audit.md
index 036af52..6fde1ac 100644
--- a/docs/BACKLOG/TR-007-security-doc-discoverability-audit.md
+++ b/docs/archive/BACKLOG/TR-007-security-doc-discoverability-audit.md
@@ -2,7 +2,7 @@

 ## Legend

-- [TR — Truth](../legends/TR-truth.md)
+- [TR — Truth](../../legends/TR-truth.md)

 ## Why This Exists

@@ -15,7 +15,7 @@ contains it.
 ## Target Outcome

 Audit the top-level doc surface and add or repair discoverability links to
-[SECURITY.md](../../SECURITY.md) and [docs/THREAT_MODEL.md](../THREAT_MODEL.md)
+[SECURITY.md](../../../SECURITY.md) and [docs/THREAT_MODEL.md](../../THREAT_MODEL.md)
 where they are materially relevant.

 ## Human Value
@@ -30,7 +30,7 @@ of citing secondary or partial summaries.

 ## Linked Invariants

-- [I-001 — Determinism, Trust, And Explicit Surfaces](../invariants/I-001-determinism-trust-and-explicit-surfaces.md)
+- [I-001 — Determinism, Trust, And Explicit Surfaces](../../invariants/I-001-determinism-trust-and-explicit-surfaces.md)

 ## Notes

diff --git a/docs/design/README.md b/docs/design/README.md
index ba35ee0..13bd231 100644
--- a/docs/design/README.md
+++ b/docs/design/README.md
@@ -44,6 +44,7 @@ Landed cycle docs:
 - [TR-003 — Truth: Benchmark Baselines](./TR-003-benchmark-baselines.md)
 - [TR-004 — Truth: Design Doc Lifecycle](./TR-004-design-doc-lifecycle.md)
 - [TR-006 — Truth: Docs Maintainer Checklist](./TR-006-docs-maintainer-checklist.md)
+- [TR-007 — Truth: Security Doc Discoverability Audit](./TR-007-security-doc-discoverability-audit.md)
 - [TR-010 — Truth: Planning Index Consistency Review](./TR-010-planning-index-consistency-review.md)

 Archived or retired cycle docs:
diff --git a/docs/design/TR-007-security-doc-discoverability-audit.md b/docs/design/TR-007-security-doc-discoverability-audit.md
new file mode 100644
index 0000000..64a924b
--- /dev/null
+++ b/docs/design/TR-007-security-doc-discoverability-audit.md
@@ -0,0 +1,141 @@
+# TR-007 — Truth: Security Doc Discoverability Audit
+
+## Status
+
+Landed
+
+## Linked Legend
+
+- [TR — Truth](../legends/TR-truth.md)
+
+## Linked Invariants
+
+- [I-001 — Determinism, Trust, And Explicit Surfaces](../invariants/I-001-determinism-trust-and-explicit-surfaces.md)
+
+## Context
+
+`git-cas` already has canonical security and threat-model docs:
+
+- [SECURITY.md](../../SECURITY.md)
+- [docs/THREAT_MODEL.md](../THREAT_MODEL.md)
+
+The remaining problem is discoverability.
+
+High-traffic docs still did not point readers to those canonical sources
+consistently, which made the right guidance easy to miss even when it already
+existed.
+
+## Human Users, Jobs, And Hills
+
+### Users
+
+- maintainers
+- contributors reading the repo front door and contributor guidance
+- operators evaluating security and metadata tradeoffs
+
+### Jobs
+
+- find the right security and threat guidance from the docs they already read
+- distinguish cryptographic design from threat boundary and non-goals
+- navigate to canonical guidance without hunting through the repo
+
+### Hill
+
+A maintainer or operator can reach the canonical security and threat-model docs
+from the main architecture, API, README, and contributor surfaces without
+guesswork.
+
+## Agent Users, Jobs, And Hills
+
+### Users
+
+- coding agents
+- review agents
+- documentation agents
+
+### Jobs
+
+- navigate directly to canonical security truth
+- avoid citing partial or secondary summaries when reviewing or planning
+- tell where crypto design guidance ends and threat-model guidance begins
+
+### Hill
+
+An agent can find the canonical security and threat-model docs from the repo's
+high-traffic docs instead of inferring around missing links.
+
+## Human Playback
+
+- Can a reader reach the security and threat-model docs from the front door?
+- Do the affected docs distinguish cryptographic design from threat boundary?
+- Did this cycle improve navigation without creating duplicate narratives?
+
+## Agent Playback
+
+- Can an agent find [SECURITY.md](../../SECURITY.md) and
+  [docs/THREAT_MODEL.md](../THREAT_MODEL.md) from the docs most likely to be
+  read first?
+- Can it tell which document to cite for crypto design versus trust boundaries?
+- Do the new links reduce ambiguity rather than multiply summaries?
+
+## Explicit Non-Goals
+
+- no rewrite of the security or threat-model documents themselves
+- no attempt to link every markdown file in the repo to security docs
+- no duplicate security narrative where a targeted link is enough
+
+## Decisions
+
+### Link Canonical Security Docs From High-Traffic Surfaces
+
+This cycle should focus on the docs people and agents are most likely to read
+first:
+
+- [README.md](../../README.md)
+- [CONTRIBUTING.md](../../CONTRIBUTING.md)
+- [WORKFLOW.md](../../WORKFLOW.md), only where security guidance is materially
+  relevant
+- [ARCHITECTURE.md](../../ARCHITECTURE.md)
+- [docs/API.md](../API.md)
+
+### Distinguish Security Design From Threat Boundary
+
+The links should not collapse `SECURITY.md` and `docs/THREAT_MODEL.md` into one
+concept. They answer different questions and should stay easy to distinguish.
+
+### Keep The Change Surgical
+
+This is a discoverability cycle, not a new security-writing cycle. Add links and
+short routing language where it materially improves navigation, and stop there.
+
+## Implementation Outline
+
+1. Add this cycle doc.
+2. Audit the high-traffic docs for missing or misleading security/threat-model
+   links.
+3. Add or repair the canonical links where they materially improve navigation.
+4. Archive the consumed backlog card, update the Truth indexes, and record the
+   change in [CHANGELOG.md](../../CHANGELOG.md).
+
+## Tests To Write First
+
+No new executable tests.
+
+This is a documentation-truth cycle. Verification is:
+
+- direct cross-check of the affected top-level docs and their new links
+- formatting validation for touched Markdown files
+- whitespace and diff validation
+
+## Risks And Unknowns
+
+- too many links can turn into clutter if this pattern is overused later
+- some relevant docs may still remain outside this bounded first-pass audit
+- contributor reading habits may still favor README-only navigation
+
+## Retrospective
+
+This was the right next Truth cycle after the workflow and checklist work.
+
+The repo already had the security truth. The missing piece was getting readers
+to it reliably from the docs they actually open first.
diff --git a/docs/legends/TR-truth.md b/docs/legends/TR-truth.md
index e47aa7e..20eee9f 100644
--- a/docs/legends/TR-truth.md
+++ b/docs/legends/TR-truth.md
@@ -76,12 +76,12 @@ Current Truth design docs:
 - [TR-003 — Truth: Benchmark Baselines](../design/TR-003-benchmark-baselines.md)
 - [TR-004 — Truth: Design Doc Lifecycle](../design/TR-004-design-doc-lifecycle.md)
 - [TR-006 — Truth: Docs Maintainer Checklist](../design/TR-006-docs-maintainer-checklist.md)
+- [TR-007 — Truth: Security Doc Discoverability Audit](../design/TR-007-security-doc-discoverability-audit.md)
 - [TR-010 — Truth: Planning Index Consistency Review](../design/TR-010-planning-index-consistency-review.md)

 Current Truth backlog items:

 - [TR-005 — CasService Decomposition Plan](../BACKLOG/TR-005-casservice-decomposition-plan.md)
-- [TR-007 — Security Doc Discoverability Audit](../BACKLOG/TR-007-security-doc-discoverability-audit.md)
 - [TR-008 — Empty-State Phrasing Consistency](../BACKLOG/TR-008-empty-state-phrasing-consistency.md)
 - [TR-009 — Pre-PR Doc Cross-Link Audit](../BACKLOG/TR-009-pre-pr-doc-cross-link-audit.md)
 - [TR-011 — Streaming Encrypted Restore](../BACKLOG/TR-011-streaming-encrypted-restore.md)
@@ -94,7 +94,7 @@ Truth work under this legend is currently focused on:
 - publishing benchmark guidance that matches shipped behavior
 - evaluating service decomposition where the current boundary is under strain
 - improving documentation review hygiene through a shared maintainer checklist
-- improving cross-link discoverability
+- improving security doc discoverability from high-traffic repo surfaces
 - running planning-index consistency reviews and keeping empty-state language
   consistent over time
 - investigating lower-memory restore paths for encrypted and compressed assets