From aa33f2af20d865686aa0cd710d5da465f5f32786 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 25 Jun 2026 09:22:00 +0000 Subject: [PATCH 1/3] =?UTF-8?q?docs(substrate):=20=C2=A78=20strong=20form?= =?UTF-8?q?=20=E2=80=94=20the=20substrate=20as=20a=20full-stack=20compiler?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Capstone of the Substrate Unification Thesis (#607). §0–§7 read one node five ways; §8 reads the value slab itself in the key's algebra: - §8.1 homogeneous facet [H]: value = N×16B (part_of:is_a) facets via a layout-preserving ValueSchema::Homogeneous (no ENVELOPE_LAYOUT_VERSION bump). Conflation trap named: scalar facets carry PQ codes, gated on F-1. - §8.2 classid dual-dispatch [H]: one radix lookup → ReadMode (codec) + ClassView (schema); failure = table drift (I-LEGACY-API-FEATURE-GATED). - §8.3 LEGO [S]: EdgeBlock click across domains via shared OGAR codebook; CONJECTURE until PROBE-OGAR-ADAPTER-UNICHARSET. - §8.4 view layer [S]: ClassView→askama with Redmine as donor (FieldFormat / Query / CustomField → cell-renderer / lenses / customattribute); hybrid (static shell + dynamic codebook cells) dissolves the compile-time vs runtime seam; row/table = the 4th projection of the node. - §8.5 inherits §4's gates; §8-specific KILL = homogeneity non-closure. Doc-only, zero code. AGENT_LOG cont.⁴² prepended. 
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi --- .claude/board/AGENT_LOG.md | 4 + .../knowledge/substrate-unification-thesis.md | 158 ++++++++++++++++++ 2 files changed, 162 insertions(+) diff --git a/.claude/board/AGENT_LOG.md b/.claude/board/AGENT_LOG.md index eb5b8536..0289b415 100644 --- a/.claude/board/AGENT_LOG.md +++ b/.claude/board/AGENT_LOG.md @@ -1,3 +1,7 @@ +## 2026-06-24 (cont.⁴²) — strong form §8: the substrate as a full-stack compiler (thesis capstone, doc-only) + +**Main thread (Opus), operator-directed ("Holy Grail" → "continue") + cross-session feedback.** Appended **§8 "The strong form — the substrate as a full-stack compiler"** to `substrate-unification-thesis.md` (the doc shipped cont.⁴¹). §0–§7 read ONE node five ways; §8 asks: what if the VALUE SLAB itself is homogeneous in the key's algebra and `classid` is a schema pointer? Then the 512-byte node becomes a *compilation unit*: data → index → schema → view. **§8.1 homogeneous facet** `[H]` — carve value as N×16-byte facets, each a `(part_of:is_a)` cascade (`facet_classid(4) | 6×(8:8)=12`); a NEW `ValueSchema::Homogeneous` ALONGSIDE the existing `ValueTenant` columns → layout-preserving, no `ENVELOPE_LAYOUT_VERSION` bump (vs a key re-carve = canon-level). **Conflation trap named up front:** not every facet is part_of:is_a — scalars (susceptance/price/timestamp) aren't hierarchical, forcing them into 8:8 is the §1 split-error in reverse; honest form = scalar facets carry PQ codes, `facet_classid` discriminates codec-per-facet, **gated on F-1** (faithful centroids) + F-code (lossless). 
**§8.2 classid dual-dispatch** `[H]` — one radix lookup yields BOTH `classid→ReadMode` (codec: place⊕residue = Helix ⊕ CAM-PQ, the OGAR deterministic-phase/stored-magnitude split) AND `classid→ClassView` (schema: rails/AST/ERP, the OGAR `has_function`/`inherits_from` harvest); failure mode = drift between the two tables (`I-LEGACY-API-FEATURE-GATED` in spirit). **§8.3 LEGO** `[S]` — EdgeBlock click across domains via shared OGAR codebook (`canonical_concept_id`); compile = SPO manifest→ClassView, run = SoA under `UnifiedStep`/semiring; bounds: shared-concept lattice only (else adapter bricks at the membrane), structure⊥flow, core-gap extended not hacked; CONJECTURE until `PROBE-OGAR-ADAPTER-UNICHARSET` green. **§8.4 view layer** `[S]` — ClassView→askama, **Redmine as donor** (sharpened by other-session feedback): `Redmine::FieldFormat`→codebook-kind→cell-renderer map (partonomy tile→link/enum, value-quantile→number/gauge, identity→reference), `Query`/`QueryColumn`→ClassView→lenses→cells, `CustomField`/`CustomValue`→customattribute lens per classid. **The one real seam** = askama compile-time vs custom-fields runtime; three reconciliations (codegen / generic-renderer / **hybrid=the answer**: static type-safe shell + dynamic codebook cells = the `jinja<>dynamic classview` arrow), all inside the firewall (build-time codegen-from-manifest = sanctioned "compile types", medcare-rs Iron Rule 7). **Payoff closes the loop:** row/table = the **4th projection** of the node (next to 3D scene/graph/splat — `TorsoMap` three tenants→four); §0's "one object, N readings" reaches the screen. **§8.5** — §8 inherits §4's gates (8.1→F-1+F-code, 8.2→F-collapse, 8.3/8.4→OGAR core-first probe); §8-specific KILL = **homogeneity non-closure** (if facets are irreducibly heterogeneous, §8 reduces to "key is a schema pointer"). 
Honest line: engineering rungs (§2 axes, #605/#607) shipped & real; the full-stack-compiler reading is a coherent bet whose every load-bearing joint already has a named, un-run probe. Doc-only, zero code, no collision. Rides a fresh PR on jirak (#607 merged → jirak==main). + ## 2026-06-24 (cont.⁴¹) — north-star: Substrate Unification Thesis + falsification ladder (zoom-out, doc-only) **Main thread (Opus), operator-directed ("zoom out — you have a vast open horizon but look at the shoes").** Stopped picking the next probe; wrote the substrate's north-star as a falsifiable thesis so the four converging sessions share ONE map instead of four nail-hammers. NEW `.claude/knowledge/substrate-unification-thesis.md` (READ BY: any session touching canonical_node / cascade key / place-buffer / codecs / "substrate" proposals). **Thesis (§0):** one 512-byte node, read N ways, IS every classical layer at once (PK / index / retrieval / inference / measurement), all the same prefix-and-table arithmetic — historic if true, "merely fast" if not; the program is deciding which. **Reframes captured:** (1) verification = proof-of-code (lossless containment / exact ancestry), NOT calibration (ICC/Berry-Esseen apply to the continuous embedding underneath, not the deterministic address on top — the seam is the centroid boundary); (2) every "improve" reduced to split-a-conflated-axis-pair → the mandate is "find the orthogonal basis + prove each axis a faithful code." **Basis (§2):** identity (helix place, ICC→1.0) ⊥ structure (part_of:is_a) ⊥ dynamics (BF16 buffer, ICC 0.51) ⊥ truth (NARS↔SL↔Beta bijection) ⊥ composition (semiring=retrieval-IS-inference) — five readings of one node, each anchored to a built artifact / measured number / cited theorem (SDM=attention 2111.05498, GNN=semiring-DP 2203.15544, PQ 1102.3828, CogNGen 2204.00619 as counterweight). 
**Self-reference (§3):** the ketchup = observer=observed (AGI threshold + measurement hazard); fix = split frozen-ruler (identity) from live-rubber (dynamics), same move cognition makes. **Falsification ladder (§4, ordered, each with a KILL):** F-code (prove it's a code) → F-1 (4⁴ vs flat-256 fidelity) → F-collapse (does the address beat a learned index/head? — the deciding gate, CogNGen the live counterweight) → F-update (RUM re-class cost → product class) → F-basis (does the split-program close?). **§6 states what kills the whole thesis up front** (keeps "better substrate" ✓ separate from "collapses the stack" `[H]`). Honest: thesis `[H]`, per-axis instances individually graded; convergence across 4 sessions is the strongest *evidence* but must be tested adversarially (shared blind spots vs shared truth). Doc-only, zero code, no collision. Rides a PR on jirak. diff --git a/.claude/knowledge/substrate-unification-thesis.md b/.claude/knowledge/substrate-unification-thesis.md index fbbb4845..a569796f 100644 --- a/.claude/knowledge/substrate-unification-thesis.md +++ b/.claude/knowledge/substrate-unification-thesis.md @@ -180,6 +180,164 @@ baseline — not when an ICC clears 0.75. --- +## 8. The strong form — the substrate as a full-stack compiler `[H]`/`[S]` + +§0–§7 read *one* node five ways. The strong form asks: what if the **value +slab itself** is homogeneous in the same algebra as the key — and the `classid` +in the key is not just a router but a **schema pointer**? Then the 512-byte node +stops being a record and becomes a *compilation unit*: data → index → schema → +view, all from one self-describing block. This is the most ambitious reading; +it is graded `[H]` where it reuses shipped structure and `[S]` where it bets on +unbuilt tooling. **Nothing here is canon yet** — it is the north-star's far end, +written down so the rungs point somewhere. 
+ +### 8.1 The homogeneous facet `[H]` — layout-preserving, not layout-breaking + +Carve the 480-byte value as **N × 16-byte facets**, each facet itself a +`(part_of:is_a)` cascade in the §2-Structure algebra: + +``` +facet (16 B) = facet_classid(4) | 6 × (8:8 part_of:is_a tile, 2 B each = 12) +value (480 B) = up to 30 homogeneous facets ← ValueSchema::Homogeneous +``` + +The key insight is **compatibility, not replacement**: this is a new +`ValueSchema::Homogeneous` variant *alongside* the existing `ValueTenant` SoA +columns — the 16/16/480 split (`canonical_node.rs`) is **untouched**, so it is +layout-preserving and needs no `ENVELOPE_LAYOUT_VERSION` bump (contrast a *key* +re-carve, which is canon-level and separate). The value's facets are read by +the *same* prefix-and-table arithmetic as the path tiers (§2), so "read the +value" becomes the same operation as "route the key." + +**The conflation trap, named up front (`[H]` gate):** not every facet is a +`part_of:is_a` mereology/taxonomy pair. Scalars (a susceptance, a price, a +timestamp) are *not* hierarchical — forcing them into an 8:8 tile is exactly +the §1 "split a conflated pair" error run in reverse. The honest form: scalar +facets carry **PQ codes** (Jégou [1102.3828]) in the same 16 bytes, and the +`facet_classid` discriminates codec-per-facet. **This is gated on F-1** +(codebook fidelity): if the centroid hierarchy isn't a faithful code, a PQ +facet is a lossy hash wearing an address's clothes, and the homogeneous slab +degrades to "compact but unfaithful." F-1 must be green before any scalar facet +ships. + +### 8.2 classid dual-dispatch `[H]` — one prefix, two resolutions + +The `classid(4)` already routes the codebook scope (OGAR canon: longest-prefix +binding). 
The strong form gives it a **second, parallel resolution** off the +same radix lookup: + +- **classid → ReadMode** (the *codec* axis): how to decode this node's value — + `place ⊕ residue` = Helix Place (identity, §2-Identity, #607) ⊕ CAM-PQ residue + (the scalar/centroid lanes). This is the **deterministic-place + stored- + magnitude** split the OGAR perturbation-encoding canon already pins (phase + deterministic, magnitude stored). `[H]` — the split is shipped in + perturbation-sim; the *per-classid* codec table is unbuilt. +- **classid → ClassView** (the *schema* axis): what this node's facets *mean* — + the class's field roster, edge roster, and method-resolution manifest + (OGAR `ClassView`, the `has_function`/`inherits_from`/`virtually_overrides` + SPO harvest). `[S]` — the harvest exists in OGAR; the lance-graph-side + ClassView read is a bet. + +One key, resolved once, yields *both* "how do I read these bytes" and "what do +these bytes mean." That co-resolution is the load-bearing claim — and its +honest failure mode is **drift between the two tables** (`I-LEGACY-API-FEATURE- +GATED` in spirit: same prefix, two semantics, must never silently diverge). + +### 8.3 LEGO across domains `[S]` — EdgeBlock click via shared codebook + +If two programs (an ERP, an OCR pipeline) mint nodes against the *same* OGAR +concept codebook, their `EdgeBlock` slots are **directly clickable**: an +out-of-family edge from domain A's node resolves, by `canonical_concept_id`, +into domain B's node — no adapter, no serialization, because both speak the one +codebook. "Compile on OGAR classes and do LEGO with class shapes" becomes: +**compile = SPO manifest → ClassView**; **run = SoA under `UnifiedStep` / +semiring** (§2-Composition). `[S]` — this is the OGAR core-first doctrine's +end state, explicitly CONJECTURE until `PROBE-OGAR-ADAPTER-UNICHARSET` is green. 
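A minimal sketch of the cross-domain click, under loud assumptions: `Codebook`, `Node`, `EdgeSlot`, `mint`, `click` and `build_demo_codebook` are hypothetical names invented for illustration and are not the `canonical_node.rs` or OGAR API. The real node is a 512-byte block, not a struct with string labels. The sketch only shows the shape of the claim: an edge stores a concept id, and resolution into another domain goes through the one shared codebook rather than a bespoke A-to-B translation layer. The codebook lookup itself still mediates every click.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `canonical_concept_id` in the shared codebook.
type ConceptId = u32;

/// Hypothetical, simplified node; illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    domain: &'static str,
    concept: ConceptId,
    label: &'static str,
}

/// Hypothetical edge slot: it stores only the target's concept id.
#[derive(Debug, Clone, Copy)]
struct EdgeSlot {
    target: ConceptId,
}

/// Shared codebook: a single id space that every domain mints against.
#[derive(Default)]
struct Codebook {
    by_concept: HashMap<ConceptId, Vec<Node>>,
}

impl Codebook {
    /// Register a node under its concept id.
    fn mint(&mut self, node: Node) {
        self.by_concept.entry(node.concept).or_default().push(node);
    }

    /// Resolve an edge into the nodes of `into_domain` that share the concept.
    /// An empty result means the concept is not shared, which is the case
    /// where an explicit adapter at the membrane would be needed.
    fn click(&self, edge: EdgeSlot, into_domain: &str) -> Vec<&Node> {
        self.by_concept
            .get(&edge.target)
            .map(|nodes| nodes.iter().filter(|n| n.domain == into_domain).collect())
            .unwrap_or_default()
    }
}

/// Two domains (an ERP and an OCR pipeline) minting against concept 7.
fn build_demo_codebook() -> Codebook {
    let mut cb = Codebook::default();
    cb.mint(Node { domain: "erp", concept: 7, label: "invoice record" });
    cb.mint(Node { domain: "ocr", concept: 7, label: "scanned invoice" });
    cb.mint(Node { domain: "erp", concept: 8, label: "ledger entry" });
    cb
}
```

Calling `click` with an `EdgeSlot` whose target is concept 7 and the domain `"ocr"` yields the OCR node. A target no domain has minted yields an empty result, which corresponds to the adapter-brick case rather than a silent failure.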
+ +**Bounds (the doctrine's own fences, not optional):** the click only works on a +**shared-concept lattice** — domains that don't share concepts get adapter +bricks at the membrane, paying the cost explicitly (OGAR consumer-preflight). +**Structure ⊥ flow**: the EdgeBlock click composes *structure*; it does not +import domain B's *control flow* into A. And per core-first: a Core gap is +*extended deliberately*, never hacked into the adapter. + +### 8.4 The view layer `[S]` — ClassView → askama, with Redmine as donor + +The far rung: the `ClassView` schema drives a row/table view the way +Redmine/OpenProject's metadata→issue-list machinery does — except *generated +from the schema*, not hand-maintained over 17 years. The theft is sharp because +Redmine already solved **exactly** the ClassView-renderer problem, and three of +its classes map almost 1:1: + +| Redmine class | What it does | ClassView analogue | +|---|---|---| +| `Redmine::FieldFormat` | per-type cell formatter registry (string/int/float/date/enum/list/user/link…), each knows render + edit | **codebook-kind → cell-renderer map**: partonomy tile → link/enum cell, value-quantile tile → number/gauge cell, identity → reference cell | +| `Query` / `QueryColumn` | "given a model + its fields, produce a row" | `ClassView → lenses → cells` — literally | +| `CustomField` / `CustomValue` | runtime, per-model, arbitrary-type fields | the **customattribute lens** — consumer schema per `classid` | + +So it is not "port a UI" — it is lifting a **proven metadata→row architecture**. + +**The one real seam — and its resolution.** Askama is **compile-time** (type- +checked); Redmine's custom fields are **runtime**. Three reconciliations: + +- *codegen* — emit one Askama template per ClassView at `build.rs` time. + Compile-checked ("compile on OGAR classes"), but needs schemas at build. 
+- *generic renderer* — one Askama *shell* (table/row skeleton, compile-checked) + + data-driven cell formatters resolved from the codebook at runtime (the + FieldFormat move). Fully dynamic. +- **hybrid (the answer)** — static type-safe shell + dynamic per-tile cells: + Askama gives the skeleton, the codebook gives the per-tile semantics. This is + the `jinja <> dynamic classview` arrow, and it dissolves the seam rather than + picking a side. + +All three stay inside the firewall: build-time codegen from a manifest is the +sanctioned "compile types" pattern (medcare-rs Iron Rule 7 — *not* runtime +serialization), and the hybrid's runtime cell-resolution reads the codebook (a +compile-time contract), it does not deserialize a wire payload. + +**The payoff closes the loop:** a Redmine-style row/table is just a **4th +projection** of the same node — next to the 3D scene / graph / splat +(`TorsoMap`'s "three tenants of one identity" → four). One ClassView, every +view. That is §0's "one object, N readings" reaching all the way to the screen. + +**Bounds:** transfer the donor's *patterns* (FieldFormat / Query / CustomField +idioms), **not** the 17-year accretion (the legacy cruft is the anti-goal). +**Structure ⊥ presentation**: ClassView builds the *default* view; bespoke views +are **override hooks**, not schema edits. Realistic coverage is ~80% generic + +explicit overrides — claiming 100% schema-driven view is the overclaim this +bound exists to catch. `[S]` — askama is shipped and standard, Redmine's +architecture is proven; the ClassView→cell codegen is entirely unbuilt. + +### 8.5 What the strong form adds to the ladder + +§8 does not get its own KILL — it **inherits** §4's gates and sequences behind +them: + +- **8.1 (homogeneous facet)** is gated on **F-1** (scalar facets need faithful + centroids) and **F-code** (the facet cascade must be a lossless code, not a + lossy hash). 
+- **8.2 (dual-dispatch)** is gated on **F-collapse** — if a learned index/head + matches the address (§4.2, CogNGen counterweight), then classid-as-schema is + elegant packing, not a new primitive. +- **8.3 / 8.4 (LEGO, view)** are gated on the **OGAR core-first probe** + (`PROBE-OGAR-ADAPTER-UNICHARSET`) — adapter parity must go byte-green before + cross-domain click or schema-driven view is more than a slogan. + +**What would kill the strong form specifically** (beyond §6): if facets turn out +to be *irreducibly heterogeneous* — every class needs a bespoke value layout and +no homogeneous 16-byte cascade fits — then 8.1 collapses and §8 reduces to "the +key is a schema pointer" (still useful, far less than claimed). That is the +8-specific entry on §6's ledger: **homogeneity non-closure** is to §8 what +F-basis non-closure is to the whole thesis. + +The honest one-line summary of §8: **the engineering rungs (§2 axes, #605, #607) +are real and shipped; the full-stack-compiler reading is a coherent bet whose +every load-bearing joint already has a named, un-run probe.** It earns a place +in the thesis precisely because it is falsifiable end-to-end, not because it is +proven. + +--- + ## Cross-references - `canonical_node.rs` — the 512B node (key/edges/value, ValueTenant/ValueSchema). From 0a3fe05819f4c87f2ca93211954390482acde4e7 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 25 Jun 2026 09:43:42 +0000 Subject: [PATCH 2/3] =?UTF-8?q?docs(plan):=20SoA=20value-tenant=20migratio?= =?UTF-8?q?n=20v1=20=E2=80=94=20harvest=20brief=20+=205+3=20sign-off?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Not the migration — the orientation that produces it. 4 sessions / 3 phases: P0 brief (this) → P1 one harvest session returns the per-tenant inventory → P2 two independent 5+3 sessions sign off (convergence = signal). 
- §2 WHERE to read, read-not-grep: canonical_node.rs (ground truth) → contract SoA columns + SoaEnvelope/#477 → cascade_key/place_buffer/INERTIA_SLOT prototypes → producers AND consumers (BBB pullers + q2 new_v2 blocker) → LATEST_STATE Contract Inventory. Pointers are leads, not facts. - §3 per-tenant gates: layout-preserving vs canon bump; I-LEGACY-API-FEATURE- GATED + field-isolation matrix; the conflation trap (scalars aren't hierarchical → PQ, gated on F-1); iron rules; no hot-path serialization. - §4 inventory schema. §5 migration body left a skeleton (additive ValueSchema::Homogeneous first; KEEP the irreducibly-heterogeneous = the honest §8.5 outcome). §6 names the 5 savants + 3 reviewers. INTEGRATION_PLANS prepended. Doc-only, zero code. Cross-ref thesis §8. Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi --- .claude/board/INTEGRATION_PLANS.md | 23 ++ .../plans/soa-value-tenant-migration-v1.md | 248 ++++++++++++++++++ 2 files changed, 271 insertions(+) create mode 100644 .claude/plans/soa-value-tenant-migration-v1.md diff --git a/.claude/board/INTEGRATION_PLANS.md b/.claude/board/INTEGRATION_PLANS.md index c2edf0ec..3c7abaf8 100644 --- a/.claude/board/INTEGRATION_PLANS.md +++ b/.claude/board/INTEGRATION_PLANS.md @@ -1,3 +1,26 @@ +## 2026-06-24 — soa-value-tenant-migration-v1 (BRIEF; harvest-pending) + +Plan: `.claude/plans/soa-value-tenant-migration-v1.md`. NOT the migration — +the orientation + **harvest brief** + **5+3 sign-off protocol** that produces it. +4 sessions / 3 phases: P0 (this brief) → P1 (**1 harvest session** returns the +inventory: per-tenant name / def_site / offset-width / axis / codec / producers / +consumers / migration_class / layout-preserving? / conflation_risk / gate / +board_status, + an honest completeness note) → P2 (**2 independent 5+3 sessions** +sign off, convergence=signal). 
§2 names WHERE to read (canonical_node.rs ground +truth → contract SoA columns + SoaEnvelope/#477 → cascade_key/place_buffer/ +INERTIA_SLOT prototypes → producers AND consumers incl. BBB pullers + q2 new_v2 +blocker → LATEST_STATE Contract Inventory) under the **read-not-grep** discipline +(grep locates, full Read comprehends; pointers are leads, not facts). §3 = the +per-tenant gates (layout-preserving vs canon-level bump; I-LEGACY-API-FEATURE-GATED ++ field-isolation matrix; the conflation trap incl. scalars-aren't-hierarchical→PQ +gated on F-1; iron rules; no hot-path serialization; producers⊥consumers). §5 +migration body left a SKELETON (additive `ValueSchema::Homogeneous`+`FacetCascade` +first → homogenize part_of:is_a tenants → PQ-code scalars after F-1 → KEEP the +irreducibly-heterogeneous = the honest §8.5 homogeneity-non-closure outcome). §6 +5+3 = savants truth-architect/iron-rule-savant/dto-soa-savant/baton-handoff-auditor/ +container-architect + reviewers brutally-honest-tester/overclaim-auditor/firewall-warden. +Doc-only. Rides PR on jirak. Cross-ref: substrate-unification-thesis §8. + ## 2026-06-23 — integration-actionhandler-rbac-orchestration-v1 (PLAN; shipped) Plan: `.claude/plans/integration-actionhandler-rbac-orchestration-v1.md`. The diff --git a/.claude/plans/soa-value-tenant-migration-v1.md b/.claude/plans/soa-value-tenant-migration-v1.md new file mode 100644 index 00000000..554895eb --- /dev/null +++ b/.claude/plans/soa-value-tenant-migration-v1.md @@ -0,0 +1,248 @@ +# SoA Value-Tenant Migration — Plan v1 (harvest brief + 5+3 sign-off) + +> **Status:** BRIEF (2026-06-24). This is NOT the migration — it is the +> orientation + harvest brief + sign-off protocol that *produces* the +> migration. The plan body (§5) is a skeleton filled only after the harvest +> session returns the inventory (§4) and both 5+3 panels (§6) sign off. 
+> +> **Authoring honesty:** every file/type pointer in §2 is named from +> in-context canon (CLAUDE.md, `substrate-unification-thesis.md`, the node +> canon, the iron rules) — **not** from a verified read this session. The +> harvest session's first job is to **Read each one fully and confirm or +> correct the pointer.** A pointer here is a lead, never a fact. + +--- + +## 0. Why a brief instead of a plan + +You cannot migrate value tenants you haven't *read*. The failure mode this +brief exists to prevent: a session greps `ValueTenant`, sees 9 hits, writes a +plan against the 9 hits, and misses the tenant that's constructed through a +`From` impl in a consumer crate, or the one whose bytes are reclaimed under a +feature flag. The migration's correctness is bounded by the inventory's +completeness, and the inventory's completeness is bounded by **read-depth**. + +So the work is **4 sessions, 3 phases**: + +| Phase | Session(s) | Output | +|---|---|---| +| 0 — orientation | this doc | the brief (§2–§4 + §6) | +| 1 — harvest | **1 session** | the inventory (§4 schema), filled | +| 2 — sign-off | **2 sessions**, each a 5+3 | two independent LAND/HOLD/REJECT verdicts on the filled plan | + +The two sign-off panels are run **independently** (diverse-redundancy, the +medcare MySQL-witness pattern): convergence between them is signal, divergence +names the real seam. The main thread reconciles and only then writes §5 for +real. + +--- + +## 1. What a "value tenant" is (the object of the migration) + +The canonical node is `key(16) | edges(16) | value(480)` = 512 B +(`canonical_node.rs`, operator-LOCKED, RESERVE-DON'T-RECLAIM). A **value +tenant** is a typed claim on some of those 480 value bytes — the `ValueSchema` +says how the slab is carved, a `ValueTenant` is one carve. The §8 strong form +(`substrate-unification-thesis.md`) proposes an *additive* `ValueSchema:: +Homogeneous` (N × 16-byte `(part_of:is_a)` facets) **alongside** the existing +tenants. 
The migration question for each existing tenant is therefore: +**KEEP as-is / homogenize to a facet / PQ-code it / deprecate** — and whether +that move is layout-preserving. + +--- + +## 2. Where to look — READ these, do not grep-to-conclude + +> **Read-discipline (WoA L40 / lance-graph reading-ladder):** `grep`/`sed`/ +> `tail`/`head` are **locators**, never comprehension. Use Grep to *find* the +> type; use **Read on the whole file** to understand it. For any file >2000 +> lines, multiple Reads with offset/limit covering the *entire* relevant +> region — never a single snippet. Declare `depth=full` only with a real read +> behind it (proof-of-read: file + ~3 section names you can cite). + +The harvest must Read, fully, in this order: + +1. **`canonical_node.rs`** (lance-graph core) — the authoritative layout. + `NodeGuid` / `EdgeBlock` / `NodeRow`, the `const _` 16/16/512 size asserts, + and **every** `ValueTenant` / `ValueSchema` definition + its byte offsets. + This file is the ground truth; everything else is a consumer of it. + +2. **`lance-graph-contract`** — the zero-dep type crate. The four BindSpace SoA + columns (`FingerprintColumns` / `QualiaColumn` / `MetaColumn` / `EdgeColumn`, + PR #223) and any tenant/schema enums that consumers build against. The + `SoaEnvelope` + the three-tier model (`docs/architecture/soa-three-tier- + model.md`, PR #477 tombstone — zero-copy creation→tombstone, no inter-mailbox + serialization). `last_active_cycle` (the renamed consumption stamp). + +3. **The dynamics tenant** — `INERTIA_SLOT` and the #509/#511/#513 arc + (`SoaMemberSpec` calibration). The BF16 buffer / impulse-permeability axis + (`perturbation-sim/src/place_buffer.rs`, #607). + +4. **The structure + identity tenants** — `perturbation-sim/src/cascade_key.rs` + (#605): V1/V2 spatial + V3 `(part_of:is_a)` 8:8 tile; and `helix_place` + (identity, #607). These are the *prototype* tenants the §8 facet generalizes. + +5. 
**Every producer/consumer that constructs or reads a tenant** — this is the + part a grep-only pass misses. Read (or have an Explore sub-agent map, then + Read the hits): + - `lance-graph-supervisor`, `symbiont`, `lance-graph-planner` (in-workspace). + - The BBB consumers that pull tenants by `*Port::class_id` / + `canonical_concept_id`: **smb-office-rs**, **medcare-rs**, **woa-rs**. + (Per the BBB barrier — they must *pull*, never construct a `*Bridge` or + copy the codebook; the harvest confirms they don't.) + - The q2 / OGAR consumers flagged in cont.³⁹ (e.g. q2 `osint-bake/fma.rs` + `NodeGuid::new_v2(...)` — a 7-group API that does **not** exist in + `canonical_node`; an `I-LEGACY-API-FEATURE-GATED` live blocker to record, + not silently fix). + +6. **The board registry** — `.claude/board/LATEST_STATE.md` § *Current Contract + Inventory*. This is the workspace's own list of what types exist; a tenant in + the code but not on the board (or vice-versa) is itself a finding. + +--- + +## 3. What to ALWAYS pay attention to (the gates every tenant must clear) + +For each tenant the harvest touches, hold these in mind — they are the +acceptance criteria the §6 panels will check: + +- **Layout class.** Is the proposed move *layout-preserving* (a new + `ValueSchema` variant alongside; offsets unchanged) or *layout-breaking* (the + 480 bytes re-carved)? Breaking ⇒ canon-level ⇒ needs `ENVELOPE_LAYOUT_VERSION` + bump **and** the operator's nod. The 16/16/480 split and the `const _` asserts + do not move without that. (RESERVE-DON'T-RECLAIM: a zeroed/unused region is + *not consulted*, never *compacted away*.) +- **`I-LEGACY-API-FEATURE-GATED`.** The same accessor name must never mean two + things under two feature flags. Any v1 accessor over bytes a v2 layout + reclaims ⇒ route through the canonical mapping OR feature-gate to a documented + no-op + migration pointer, **and** ship a *field-isolation matrix test* (write + each field, assert all others unchanged). 
Sprint-11 caught this 5×; expect + codex P1 to flag it. +- **The conflation trap** (`substrate-unification-thesis.md` §1 / §8.1). Does + the tenant fuse two axes that should be orthogonal? `part_of` ⊥ `is_a`; + location ⊥ impulse-permeability; **scalars are not hierarchical** — a + susceptance / price / timestamp forced into an 8:8 `(part_of:is_a)` tile is + the split-error run in reverse. Scalar tenants → PQ-code facet, **gated on + F-1** (codebook fidelity) + F-code (lossless containment), never a raw + homogenize. +- **Substrate iron rules.** `I-SUBSTRATE-MARKOV` (transition paths bundle, never + raw XOR); `I-VSA-IDENTITIES` (bundle *identities*, never content/CAM-PQ codes; + CAM-PQ is for search, VSA for bundling — separate tools). A homogenize that + superposes content violates this. +- **No serialization in the hot path** (ADR-022 / three-tier model). The tenant + is zero-copy from creation to Lance tombstone; the migration must not + introduce a serialize/deserialize step to change a carve. +- **Producers ⊥ consumers.** Every tenant has a write side and N read sides; + the migration is incomplete until *both* are accounted for. This is the + baton-handoff surface (cross-crate DTO match). + +--- + +## 4. What the harvest returns — the inventory schema + +One row per value tenant. The harvest fills this table and **nothing else** +(no migration code in Phase 1 — inventory only): + +| field | meaning | +|---|---| +| `name` | the tenant / `ValueSchema` variant | +| `def_site` | `file:line` of the definition (confirmed by Read, not grep) | +| `offset/width` | byte range within the 480 value bytes | +| `axis` | which §2 basis axis it serves — identity / structure / dynamics / truth / composition | +| `codec` | how the bytes decode — place⊕residue / PQ code / BF16 / raw / VSA | +| `producers` | crates that *write* it (`file:line`) | +| `consumers` | crates that *read* it (`file:line`), incl. 
BBB pull-sites | +| `migration_class` | **KEEP** / **homogenize-to-facet** / **PQ-code** / **deprecate** | +| `layout` | preserving / breaking (and the version-bump cost if breaking) | +| `conflation_risk` | does it fuse two axes? (yes → must split first) | +| `gate` | which falsifier must be green first — F-1 / F-code / F-collapse / operator-nod / none | +| `board_status` | present in LATEST_STATE Contract Inventory? (drift if not) | + +Plus a short prose **completeness note**: what the harvest could *not* confirm +by read (disk-walled crate, missing source, ambiguous From-impl) — named, not +hidden. Silent truncation reads as "covered everything" when it didn't. + +--- + +## 5. The migration body (SKELETON — written for real only after §4 + §6) + +Direction, fixed; specifics, deferred to the filled inventory: + +1. **Additive contract first.** Land `ValueSchema::Homogeneous` + a + `FacetCascade` type (`facet_classid(4) | 6×(8:8)=12` = 16 B) as a *new + variant alongside* `ValueTenant` — layout-preserving, no version bump. Behind + a feature flag. Field-isolation matrix tests from day one. +2. **Per-tenant, in migration_class order:** + - `homogenize-to-facet` tenants (those that already ARE `part_of:is_a`) move + first — lowest risk, prototype is `cascade_key` V3. + - `PQ-code` tenants (scalars) move only after **F-1 is green** (ndarray-side). + Until then they stay `KEEP`. + - `KEEP` tenants that are irreducibly heterogeneous stay `ValueTenant` — + this is expected and healthy (the §8.5 *homogeneity-non-closure* bound: + if everything resists homogenizing, §8 reduces to "key is a schema + pointer" and that's the honest finding, not a failure to force). + - `deprecate` tenants get the `I-LEGACY-API-FEATURE-GATED` treatment + (no-op + migration pointer + paired "corruption is observable" test). +3. **Consumer bump last**, per BBB barrier: consumers re-point to the new pull + API; no `*Bridge` construction. 
The q2 `new_v2` blocker is resolved *with the + operator*, not silently. +4. Every PR is doc-+-board-hygiene complete in the same commit (LATEST_STATE + Contract Inventory row, PR_ARC entry). + +--- + +## 6. The 5+3 sign-off (two independent sessions) + +Each sign-off session runs the OGAR 5+3 hardening pattern over the **filled** +plan (§4 inventory + §5 body): **5 research savants** produce object-level +findings, **3 brutally-honest reviewers** gate. Run the two sessions +independently; reconcile on the main thread. + +**5 research savants** (from the ensemble — match the axes the inventory touches): + +1. **`truth-architect`** — measurement-before-synthesis; is every migration_class + backed by a read/number, or is it a projection? Flags F-1-gated rows that + sneak forward. +2. **`iron-rule-savant`** — binary YIELDS/VIOLATES against the four iron rules + + AP catalogue. Any VIOLATES = auto-REJECT. +3. **`dto-soa-savant`** — does any tenant's move smuggle in a *new struct/trait/ + bridge* instead of a new SoA column/variant? (PR #223: capability lands as a + column, not a layer.) +4. **`baton-handoff-auditor`** — the producers⊥consumers surface: does each + tenant survive the cross-crate roundtrip after the carve change? CATCH-CRITICAL + / CATCH-LATENT / CLEAN. +5. **`container-architect`** (or `core-first-architect` for the OGAR/ClassView + seam) — the 480-byte layout itself: offsets, asserts, reserve-don't-reclaim, + version-bump accounting. + +**3 brutally-honest reviewers** (the gate): + +1. **`brutally-honest-tester`** — P0/P1/P2 ledger + binary LAND/HOLD/REJECT; + field-isolation matrix coverage is mandatory for any reclaim. +2. **`overclaim-auditor`** — every grade vs its evidence; "layout-preserving" + claimed without the asserts checked = flagged. +3. 
**`firewall-warden`** (or `dilution-collapse-sentinel`) — non-negotiables + (no hot-path serialization, no PII labels, BBB barrier) **and** that the + homogenize doesn't *dilute* a sharp tenant or *collapse* a distinct one into + a facet that loses its meaning. + +**Reconciliation rule:** the two panels' convergence is the strongest evidence; +where they disagree is the real seam → that tenant's row is re-opened, not +averaged. No tenant migrates on a single panel's say-so. + +--- + +## Cross-references + +- `substrate-unification-thesis.md` §1 (split-conflated-axes), §8.1 + (homogeneous facet, the conflation trap), §8.5 (homogeneity-non-closure KILL). +- `canonical_node.rs` — the 16/16/480 canon + `const _` asserts. +- `perturbation-sim/src/{cascade_key.rs (#605), place_buffer.rs (#607)}` — the + prototype identity/structure/dynamics tenants. +- Iron rules: `I-LEGACY-API-FEATURE-GATED` (the reclaim discipline + 5-instance + catalogue), `I-SUBSTRATE-MARKOV`, `I-VSA-IDENTITIES`. +- `docs/architecture/soa-three-tier-model.md` (PR #477) — zero-copy, no + inter-mailbox serialization. +- `.claude/agents/BOOT.md` — the savant ensemble + 5+3 pattern. +- `.claude/board/LATEST_STATE.md` § Contract Inventory — the type registry the + harvest cross-checks. 
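The §5 ordering gate and the §6 reconciliation rule can be sketched in a few lines of Rust. This is an illustration only, under stated assumptions: the type and function names (`MigrationClass`, `Verdict`, `Outcome`, `effective_class`, `reconcile`) are hypothetical and do not exist in the repository; reusing LAND/HOLD/REJECT as each panel's overall verdict is an assumption; and `Blocked` is an invented label for the case where both panels agree the tenant should not move.

```rust
/// The four migration classes named in the inventory table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MigrationClass {
    Keep,
    HomogenizeToFacet,
    PqCode,
    Deprecate,
}

/// Per-panel verdict (assumption: the tester's LAND/HOLD/REJECT reused per panel).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Land,
    Hold,
    Reject,
}

/// Result of reconciling two independent panels for one tenant row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Migrate, // both panels converge on LAND
    Reopen,  // panels disagree: the row is re-opened, not averaged
    Blocked, // hypothetical label: both panels agree on HOLD or REJECT
}

/// A PQ-code tenant moves only once F-1 is green; until then it stays KEEP.
fn effective_class(planned: MigrationClass, f1_green: bool) -> MigrationClass {
    match planned {
        MigrationClass::PqCode if !f1_green => MigrationClass::Keep,
        other => other,
    }
}

/// No tenant migrates on a single panel's say-so.
fn reconcile(a: Verdict, b: Verdict) -> Outcome {
    match (a, b) {
        (Verdict::Land, Verdict::Land) => Outcome::Migrate,
        (x, y) if x == y => Outcome::Blocked,
        _ => Outcome::Reopen,
    }
}
```

The only design point the sketch carries is that disagreement is a distinct outcome rather than a tie to be broken: `reconcile(Verdict::Land, Verdict::Hold)` yields `Outcome::Reopen`, never a softened migrate.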
From fdcc862092bb08bcd19b382fd7c8b0d9b8ed2850 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 25 Jun 2026 10:45:51 +0000 Subject: [PATCH 3/3] =?UTF-8?q?docs(substrate):=20fix=20=C2=A78=20ValueSch?= =?UTF-8?q?ema=20framing=20=E2=80=94=20ClassView=20reading,=20not=20a=20ne?= =?UTF-8?q?w=20variant=20(Codex=20P2=20#610)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex P2 + verified against canon: §8.1 and the SoA migration plan framed the homogeneous facet as a free additive ValueSchema::Homogeneous variant, which violates the #496 §0 anti-invention guardrail + #500 no-new-variant contract test (core-first-transcode-doctrine.md:72; genetics + OCR plans ride Full/ Compressed via classid → ClassView, never a new variant). - §8.1 reframed: the homogeneous facet is a classid → ClassView reading over an existing preset; ValueSchema enum + 16/16/480 layout both untouched. A new preset is an operator-lifted #496/#500 decision, never a thesis default. - migration plan §1/§3/§5: same correction; added the #500 guardrail as an explicit per-tenant gate; step 1 is now a ClassView reading, not a variant. - §8.3: softened 'no adapter, no serialization' — the per-port codebook lookup still mediates; only the bespoke A→B translation boundary disappears (CodeRabbit minor). - §8.1 fence given a 'text' language tag (markdownlint, CodeRabbit minor). - EPIPHANIES E-HOMOGENEOUS-FACET-IS-CLASSVIEW-NOT-VARIANT prepended. 
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi --- .claude/board/EPIPHANIES.md | 33 +++++++++++++ .../knowledge/substrate-unification-thesis.md | 40 ++++++++++------ .../plans/soa-value-tenant-migration-v1.md | 46 +++++++++++++------ 3 files changed, 90 insertions(+), 29 deletions(-) diff --git a/.claude/board/EPIPHANIES.md b/.claude/board/EPIPHANIES.md index 9b681655..e98c04dd 100644 --- a/.claude/board/EPIPHANIES.md +++ b/.claude/board/EPIPHANIES.md @@ -29,6 +29,39 @@ failure were caught and fixed without compounding drift. **Closes:** the 11-class cross-axis gap. **Next:** odoo-rs alignment_pin tighten + pin bump to OGAR main (5089c1e8 → 597ecb12 family). +## 2026-06-24 — E-HOMOGENEOUS-FACET-IS-CLASSVIEW-NOT-VARIANT — the §8 homogeneous facet is a `classid → ClassView` reading, NEVER a new `ValueSchema` enum variant + +**Status:** FINDING (caught by Codex P2 on PR #610, verified against canon). + +The thesis §8.1 + the SoA value-tenant migration plan v1 originally framed the +homogeneous-facet value as a *new additive* `ValueSchema::Homogeneous` variant +"alongside" the existing tenants, sold as layout-preserving + no-version-bump. +**That contradicts a locked guardrail.** Adding a `ValueSchema` enum case is a +*contract-surface* addition even when the byte layout is untouched, and the +canon forbids a plan/thesis from minting one: + +- `core-first-transcode-doctrine.md:72` — "`classid → ClassView` … **No new + layer, no new `ValueSchema` variant.**" +- `genetic-research-substrate-integration-v1.md:14-16` — the #496 §0 + anti-invention guardrail + the **#500 no-new-variant contract test**: + rows ride `Full` / `Compressed`; specialisation is via `classid → ClassView` + mint, not a variant. +- `ocr-canonical-soa-integration-v1.md:93-102` — "do **NOT** add a 5th + `ValueSchema::Ocr` … the rule that holds is **no new enum variant from a + plan**." 
+ +**Corrected framing (both docs fixed in the same PR):** the homogeneous facet is +a **ClassView reading that interprets the value bytes of an existing preset** — +the `ValueSchema` enum AND the 16/16/480 layout both stay untouched, no +`ENVELOPE_LAYOUT_VERSION` bump, no #500 violation. IF a dedicated preset is ever +judged necessary, that is an **operator decision lifting #496/#500** with the +contract test updated in the same change — never a free additive a thesis +assumes. **Lesson:** "layout-preserving" ≠ "contract-surface-preserving"; an +enum-variant add is the latter even when the bytes don't move. The default for +any new value shape is a ClassView reading over `Full`/`Compressed`, not a new +`ValueSchema` case. (Same PR also softened the §8.3 "no adapter" claim — the +per-port codebook lookup still mediates; only the bespoke A→B *translation* +boundary disappears — and tagged a bare markdown fence per CodeRabbit.) ## 2026-06-23 — E-OGAR-API-EDIT-PULL-FIRST — API-based file edits MUST pull-then-splice; uploading pre-edited local files regresses upstream main diff --git a/.claude/knowledge/substrate-unification-thesis.md b/.claude/knowledge/substrate-unification-thesis.md index a569796f..46754d45 100644 --- a/.claude/knowledge/substrate-unification-thesis.md +++ b/.claude/knowledge/substrate-unification-thesis.md @@ -191,23 +191,34 @@ it is graded `[H]` where it reuses shipped structure and `[S]` where it bets on unbuilt tooling. **Nothing here is canon yet** — it is the north-star's far end, written down so the rungs point somewhere. 
-### 8.1 The homogeneous facet `[H]` — layout-preserving, not layout-breaking +### 8.1 The homogeneous facet `[H]` — a ClassView reading, NOT a new contract variant Carve the 480-byte value as **N × 16-byte facets**, each facet itself a `(part_of:is_a)` cascade in the §2-Structure algebra: -``` +```text facet (16 B) = facet_classid(4) | 6 × (8:8 part_of:is_a tile, 2 B each = 12) -value (480 B) = up to 30 homogeneous facets ← ValueSchema::Homogeneous +value (480 B) = up to 30 homogeneous facets ← read by classid → ClassView ``` -The key insight is **compatibility, not replacement**: this is a new -`ValueSchema::Homogeneous` variant *alongside* the existing `ValueTenant` SoA -columns — the 16/16/480 split (`canonical_node.rs`) is **untouched**, so it is -layout-preserving and needs no `ENVELOPE_LAYOUT_VERSION` bump (contrast a *key* -re-carve, which is canon-level and separate). The value's facets are read by -the *same* prefix-and-table arithmetic as the path tiers (§2), so "read the -value" becomes the same operation as "route the key." +**Guardrail (do not misread this as a free additive).** The canon forbids a +plan or thesis from minting a `ValueSchema` enum variant: the #496 §0 +anti-invention guardrail + the **#500 no-new-variant contract test**, and +`core-first-transcode-doctrine.md` ("`classid → ClassView` … **no new layer, +no new `ValueSchema` variant**"). The genetics and OCR plans both ride the +existing `Full` / `Compressed` presets and specialise via `classid → ClassView` +mint — *not* a new variant. So the homogeneous facet is **a ClassView reading +that interprets the value bytes of an existing preset**, never a +`ValueSchema::Homogeneous` enum case. Both the 16/16/480 layout +(`canonical_node.rs`) **and** the `ValueSchema` enum stay untouched; no +`ENVELOPE_LAYOUT_VERSION` bump, no contract-surface addition. 
The value's +facets are read by the *same* prefix-and-table arithmetic as the path tiers +(§2), so "read the value" becomes the same operation as "route the key." + +**If** a dedicated variant is ever judged necessary (a leaner homogeneous +preset the existing ones can't express), that is an explicit **operator +decision lifting the #496/#500 guardrail** — with the contract test updated in +the same change — never an additive a thesis assumes for free. **The conflation trap, named up front (`[H]` gate):** not every facet is a `part_of:is_a` mereology/taxonomy pair. Scalars (a susceptance, a price, a @@ -246,10 +257,11 @@ GATED` in spirit: same prefix, two semantics, must never silently diverge). ### 8.3 LEGO across domains `[S]` — EdgeBlock click via shared codebook If two programs (an ERP, a OCR pipeline) mint nodes against the *same* OGAR -concept codebook, their `EdgeBlock` slots are **directly clickable**: an -out-of-family edge from domain A's node resolves, by `canonical_concept_id`, -into domain B's node — no adapter, no serialization, because both speak the one -codebook. "Compile on OGAR classes and do LEGO with class shapes" becomes: +concept codebook, their `EdgeBlock` slots are **conceptually clickable through +the shared codebook**: an out-of-family edge from domain A's node resolves, by +`canonical_concept_id`, into domain B's node — the per-port codebook lookup +still mediates, but the extra *translation* boundary (a bespoke A→B adapter, a +serialization hop) disappears, because both already speak the one codebook. "Compile on OGAR classes and do LEGO with class shapes" becomes: **compile = SPO manifest → ClassView**; **run = SoA under `UnifiedStep` / semiring** (§2-Composition). `[S]` — this is the OGAR core-first doctrine's end state, explicitly CONJECTURE until `PROBE-OGAR-ADAPTER-UNICHARSET` is green. 
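The §8.1 facet arithmetic (`facet_classid(4) | 6 × (8:8)` = 16 B, up to 30 facets in the 480-byte value) can be sketched as a read-only view over the existing value bytes. This is a minimal sketch, not the canonical implementation: `Tile`, `FacetView` and `read_facet` are hypothetical names, little-endian encoding of `facet_classid` is an assumption, and so is the `part_of`-then-`is_a` byte order inside a tile. It deliberately adds no `ValueSchema` variant; it only interprets bytes that are already there.

```rust
// Sizes taken from the text: 16-byte facets carved from the 480-byte value.
const FACET_BYTES: usize = 16;
const VALUE_BYTES: usize = 480;
const TILES_PER_FACET: usize = 6;
const MAX_FACETS: usize = VALUE_BYTES / FACET_BYTES;

// Compile-time checks in the spirit of the `const _` asserts the text mentions.
const _: () = assert!(FACET_BYTES == 4 + 2 * TILES_PER_FACET);
const _: () = assert!(MAX_FACETS == 30);

/// One (part_of:is_a) tile, 2 bytes. Byte order is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Tile {
    part_of: u8,
    is_a: u8,
}

/// Hypothetical read-only interpretation of one facet.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FacetView {
    facet_classid: u32,
    tiles: [Tile; TILES_PER_FACET],
}

/// Reads facet `index` out of the value slab without copying or re-carving it.
/// Returns `None` when `index` is past the 30th facet.
fn read_facet(value: &[u8; VALUE_BYTES], index: usize) -> Option<FacetView> {
    if index >= MAX_FACETS {
        return None;
    }
    let base = index * FACET_BYTES;
    let raw = &value[base..base + FACET_BYTES];
    // Assumption: facet_classid is stored little-endian in the first 4 bytes.
    let facet_classid = u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]);
    let mut tiles = [Tile { part_of: 0, is_a: 0 }; TILES_PER_FACET];
    for (i, tile) in tiles.iter_mut().enumerate() {
        let off = 4 + 2 * i;
        *tile = Tile { part_of: raw[off], is_a: raw[off + 1] };
    }
    Some(FacetView { facet_classid, tiles })
}
```

Because the view is computed from a borrowed `&[u8; 480]`, the 16/16/480 layout and its offsets are not touched. Which codec applies to a given facet (cascade tiles versus a PQ code for scalars) would be decided by `facet_classid`, which this sketch does not model.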
diff --git a/.claude/plans/soa-value-tenant-migration-v1.md b/.claude/plans/soa-value-tenant-migration-v1.md index 554895eb..9d208dc1 100644 --- a/.claude/plans/soa-value-tenant-migration-v1.md +++ b/.claude/plans/soa-value-tenant-migration-v1.md @@ -43,11 +43,14 @@ The canonical node is `key(16) | edges(16) | value(480)` = 512 B (`canonical_node.rs`, operator-LOCKED, RESERVE-DON'T-RECLAIM). A **value tenant** is a typed claim on some of those 480 value bytes — the `ValueSchema` says how the slab is carved, a `ValueTenant` is one carve. The §8 strong form -(`substrate-unification-thesis.md`) proposes an *additive* `ValueSchema:: -Homogeneous` (N × 16-byte `(part_of:is_a)` facets) **alongside** the existing -tenants. The migration question for each existing tenant is therefore: -**KEEP as-is / homogenize to a facet / PQ-code it / deprecate** — and whether -that move is layout-preserving. +(`substrate-unification-thesis.md`) reads the homogeneous facet as a +`classid → ClassView` *interpretation over an existing preset* (`Full` / +`Compressed`) — **not** a new `ValueSchema` enum variant (the #496 §0 +anti-invention guardrail + the #500 no-new-variant contract test forbid a plan +from minting one; see §3). The migration question for each existing tenant is +therefore: **KEEP as-is / homogenize via a ClassView facet reading / PQ-code it +/ deprecate** — and whether that move stays within the existing presets or +requires an operator-lifted guardrail. --- @@ -106,12 +109,21 @@ The harvest must Read, fully, in this order: For each tenant the harvest touches, hold these in mind — they are the acceptance criteria the §6 panels will check: -- **Layout class.** Is the proposed move *layout-preserving* (a new - `ValueSchema` variant alongside; offsets unchanged) or *layout-breaking* (the - 480 bytes re-carved)? Breaking ⇒ canon-level ⇒ needs `ENVELOPE_LAYOUT_VERSION` - bump **and** the operator's nod. The 16/16/480 split and the `const _` asserts - do not move without that. 
(RESERVE-DON'T-RECLAIM: a zeroed/unused region is - *not consulted*, never *compacted away*.) +- **Contract-surface class (the #500 guardrail).** A plan may **NOT** mint a + `ValueSchema` enum variant — that is a contract-surface addition against the + #496 §0 anti-invention guardrail, enforced by the **#500 no-new-variant + contract test** (`core-first-transcode-doctrine.md`: "`classid → ClassView` + … no new `ValueSchema` variant"; the genetics + OCR plans both ride `Full` / + `Compressed` and specialise via `classid → ClassView` mint). So the default + move is a **ClassView reading over an existing preset**, offsets and enum both + unchanged. A genuinely new preset is an **operator decision lifting #496/#500** + (with the contract test updated in the same change), never a plan default. +- **Layout class.** Even within the existing enum, is the byte carve + *preserving* (offsets unchanged) or *breaking* (the 480 bytes re-carved)? + Breaking ⇒ canon-level ⇒ needs `ENVELOPE_LAYOUT_VERSION` bump **and** the + operator's nod. The 16/16/480 split and the `const _` asserts do not move + without that. (RESERVE-DON'T-RECLAIM: a zeroed/unused region is *not + consulted*, never *compacted away*.) - **`I-LEGACY-API-FEATURE-GATED`.** The same accessor name must never mean two things under two feature flags. Any v1 accessor over bytes a v2 layout reclaims ⇒ route through the canonical mapping OR feature-gate to a documented @@ -168,10 +180,14 @@ hidden. Silent truncation reads as "covered everything" when it didn't. Direction, fixed; specifics, deferred to the filled inventory: -1. **Additive contract first.** Land `ValueSchema::Homogeneous` + a - `FacetCascade` type (`facet_classid(4) | 6×(8:8)=12` = 16 B) as a *new - variant alongside* `ValueTenant` — layout-preserving, no version bump. Behind - a feature flag. Field-isolation matrix tests from day one. +1. 
**ClassView reading first (no new contract variant).** Land the + `FacetCascade` *reading* (`facet_classid(4) | 6×(8:8)=12` = 16 B) as a + `classid → ClassView` interpretation over an existing preset (`Full` / + `Compressed`) — the `ValueSchema` enum and the 16/16/480 layout both + untouched, no version bump, no #500 violation. Behind a feature flag. + Field-isolation matrix tests from day one. (A dedicated `ValueSchema` preset + is out of scope for this plan — it is an operator-lifted #496/#500 decision, + filed separately if the existing presets prove insufficient.) 2. **Per-tenant, in migration_class order:** - `homogenize-to-facet` tenants (those that already ARE `part_of:is_a`) move first — lowest risk, prototype is `cascade_key` V3.