|
| 1 | +# KNOWLEDGE: Codec / SoA Facet Map — speed and fidelity are separable knobs |
| 2 | + |
| 3 | +## READ BY: truth-architect, family-codec-smith, palette-engineer, savant-architect, |
| 4 | +## cascade-architect, integration-lead, resonance-cartographer |
| 5 | + |
| 6 | +## STATUS: probe-backed map (ndarray PR #218, 10 reproducible probes). The holy |
| 7 | +## grail = ONE SoA where every facet composes for accuracy AND speed. |
| 8 | +## Mechanism established; white patches listed at the bottom. |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +## The one-line thesis (measured this session) |
| 13 | + |
| 14 | +**No single vector subsumes the others (Correction-6 / I-VSA-IDENTITIES category |
| 15 | +boundary). The unified representation is a STRUCT of orthogonal facets — one SoA |
| 16 | +column per category — and accuracy vs speed are TWO SEPARABLE KNOBS that compose: |
| 17 | +cascade-prune the coarse code (speed, lossless), then residue-refine only the |
| 18 | +survivors (fidelity, +bytes).** |
| 19 | + |
| 20 | +--- |
| 21 | + |
| 22 | +## The facets — one SoA column per native category |
| 23 | + |
| 24 | +| Facet (SoA column) | Codec | Category | Measured this session | Knob | |
| 25 | +|---|---|---|---|---| |
| 26 | +| place / semantic basin | HHTL (HEEL·HIP·TWIG) | hierarchical key | cascade prune (CLAM dfs-sieve 2.3×; CAM-PQ coarse→fine 16–128× lossless) | speed | |
| 27 | +| episodic basin | rolling floor (Belichtungsmesser / EWMA) | self-calibrating μ+3σ | ρ=1.0 tracking under SD drift; shipped global-Welford **inert** (bug) | speed/adaptivity | |
| 28 | +| position (high-D) | CAM-PQ | NN-recall position | recall ~0.66 vs truth; cascade-prunable losslessly (recall 1.0 vs flat) | — | |
| 29 | +| orientation (phase+mag) | helix-48 | 3-DOF direction | 24-bit lossless vs ≤f16; needs +1 sign bit; ⊥ HHTL (ρ≈0); +13.6× recon | — | |
| 30 | +| spatial perturbation | helix → Morton pyramid | parametric field | 32,768× amortized, on-demand exact at every level, fine-scale coherent | speed/memory | |
| 31 | +| relation + truth | CausalEdge64 (3×8 SPO + 2³ + f/c) | relational triple | SPO = 3× CAM-PQ palette + Pearl mask; entropy ρ=−0.78 reliability proxy | — | |
| 32 | +| reliability / entropy | entropy_class → CausalEdge64 spare [63:61] | Staunen↔Wisdom scalar | nars_entropy validated as reliability proxy | — | |
| 33 | +| value refinement | edge_codec CoarseResidue / turbovec | per-item residue | ICC 0.97–0.99, 14× error cut (vs coarse-only) | fidelity | |
| 34 | +| time / recurrence | EpisodicWitness64 | temporal | **NOT PROBED — white patch** | — | |
| 35 | + |
| 36 | +Bit budgets are the same order (≈6 bytes each) but the **domains differ** — the |
| 37 | +6-byte coincidence is why "one vector" is tempting and wrong. |
| 38 | + |
| 39 | +## The two knobs (the holy-grail mechanism, measured) |
| 40 | + |
| 41 | +- **SPEED = the cascade.** Coarse→fine prune (partial-ADC / HHTL lower bound is |
| 42 | + admissible) + 2×2/4×4 register-blocked LUT (FastScan/AMX `pshufb`) + Morton-order |
| 43 | + contiguity + rolling-floor adaptive cut. **16–128× fewer full evals at recall |
| 44 | + 1.000 vs flat** (`campq_cascade_probe`). Lossless — adds no error. |
| 45 | +- **FIDELITY = the residue plane.** Coarse centroid + signed-4-bit / SVD residue. |
| 46 | + **ICC 0.97–0.99, 14×** error cut (`edge_codec_compare`). Adds bytes, not error. |
| 47 | +- **They compose, orthogonally:** prune the coarse code to a small survivor set, |
| 48 | + then residue-refine only those. Fast AND accurate, each from its own mechanism. |
| 49 | + This composition is the holy grail's load-bearing claim (each half measured; |
| 50 | + the end-to-end compose is a white patch — see below). |
| 51 | + |
| 52 | +## The category boundary (the iron rule that kills the WRONG holy grail) |
| 53 | + |
| 54 | +Per Correction-6 (`bf16-hhtl-terrain.md`) + I-VSA-IDENTITIES: |
| 55 | +- Do NOT float-reconstruct a byte register (bgz-hhtl-d on Qwen: cos~0.1, dead). |
| 56 | +- Do NOT squeeze a relation OR a high-D point into a 3-DOF helix (`codec_overlap_probe`: |
| 57 | + helix recall 0.245 vs CAM-PQ 0.657 on high-D; SPO is a different category entirely). |
| 58 | +- Do NOT measure a router by reconstruction fidelity (it routes; only calibration matters). |
| 59 | +- ⇒ The SoA stays a struct of facets; new capability = a new column, not a fold. |
| 60 | + |
| 61 | +## The reproducible probe family (ndarray PR #218) |
| 62 | + |
| 63 | +`reliability` (Pearson/Spearman/Cronbach/ICC) · `edge_codec` (coarse/residue/PQ) · |
| 64 | +`entropy_ladder` (Staunen↔Wisdom). Probes: `edge_codec_compare`, |
| 65 | +`instrument_mtmm_probe`, `cakes_grail_probe`, `entropy_ladder_probe`, |
| 66 | +`helix_orthogonality_probe`, `helix_bitdepth_probe`, `morton_perturbation_probe`, |
| 67 | +`rolling_floor_probe`, `codec_overlap_probe`, `campq_cascade_probe`. Each settles a |
| 68 | +claim with a number; two found shipped bugs (Cascade Welford-inert; the bgz17 OOB |
| 69 | +gather, fixed). |
| 70 | + |
| 71 | +## White patches on the map (unbuilt / unmeasured — be honest) |
| 72 | + |
| 73 | +1. **EpisodicWitness64 / temporal facet** — referenced, never probed. Biggest gap. |
| 74 | +2. **End-to-end compose** — cascade-prune × residue-refine measured *separately*, |
| 75 | + never together as one `coarse→prune→refine` pipeline. |
| 76 | +3. **`cam_pq_cascade_search`** — probe-proven lossless, NOT wired into real `cam_pq.rs`. |
| 77 | +4. **AMX-accelerated CAM-PQ assignment** — proven pattern (`edge_residue_probe` 100% |
| 78 | + assign), not wired into `cam_pq.rs`. |
| 79 | +5. **`TD-CASCADE-WELFORD-INERT`** — shipped `Cascade::observe` never fires `ShiftAlert` |
| 80 | + per-sample (cumulative Δμ ≪ 2σ); needs windowed/EWMA. Found, not fixed. |
| 81 | +6. **Real COCA codebook** — every probe is synthetic-COCA-like (labeled); none run on |
| 82 | + the actual baked CAM index codebook. |
| 83 | +7. **Full SoA assembly** — facets validated individually; the unified SoA (all columns, |
| 84 | + one cascade sweep) is not assembled or measured end-to-end. |
| 85 | +8. **entropy_class → CausalEdge64 spare bits** (R2) — computed, not stored. |
| 86 | +9. **bf16-hhtl probe queue M1/M3/M4** — the routing-not-reconstruction versions, NOT RUN. |
| 87 | + |
| 88 | +## Cross-refs |
| 89 | + |
| 90 | +lance-graph: `.claude/knowledge/encoding-ecosystem.md` (encoding map), |
| 91 | +`.claude/knowledge/bf16-hhtl-terrain.md` (Correction chain incl. #6), |
| 92 | +`.claude/plans/entropy-ladder-spo-rung-v1.md` (R1–R6), `lance-graph-contract` |
| 93 | +(`CausalEdge64`, `EpisodicWitness64`, `EdgeCodecFlavor`, the BindSpace SoA columns). |
0 commit comments