Commit d285936

docs(knowledge): codec/SoA facet map — speed & fidelity are separable, composable knobs
Capstone map of the PR #218 probe family (10 probes). The holy grail = ONE SoA where every facet composes for accuracy AND speed, established mechanistically:

- No single vector subsumes the facets (Correction-6 / I-VSA-IDENTITIES boundary): the representation is a STRUCT of orthogonal facets, one column per category (HHTL place, helix orientation, CAM-PQ position, CausalEdge64 relation+truth, rolling-floor episodic basin, residue value, EpisodicWitness64 time).
- SPEED knob = the cascade (coarse→fine admissible prune + tiling + rolling floor + Morton order): 16–128× fewer full evals at recall 1.000 vs flat; lossless.
- FIDELITY knob = the residue plane (coarse + 4-bit/SVD): ICC 0.97–0.99, 14×; +bytes.
- They compose: cascade-prune the coarse code, residue-refine the survivors.

Includes the per-facet table with measured numbers, the category-boundary iron rules, the reproducible probe inventory, and an explicit WHITE-PATCHES list (EpisodicWitness64 unprobed, end-to-end compose unbuilt, cam_pq_cascade_search/AMX-assign not wired, Cascade Welford-inert bug, real-COCA not run, full SoA assembly pending).

https://claude.ai/code/session_01D2WSmezQBNC3bUdHuGfGmo
1 parent 8b034c3 commit d285936

1 file changed

Lines changed: 93 additions & 0 deletions

@@ -0,0 +1,93 @@
# KNOWLEDGE: Codec / SoA Facet Map (speed and fidelity are separable knobs)

## READ BY: truth-architect, family-codec-smith, palette-engineer, savant-architect,
## cascade-architect, integration-lead, resonance-cartographer

## STATUS: probe-backed map (ndarray PR #218, 10 reproducible probes). The holy
## grail = ONE SoA where every facet composes for accuracy AND speed.
## Mechanism established; white patches listed at the bottom.

---

## The one-line thesis (measured this session)

**No single vector subsumes the others (Correction-6 / I-VSA-IDENTITIES category
boundary). The unified representation is a STRUCT of orthogonal facets, one SoA
column per category, and accuracy vs speed are TWO SEPARABLE KNOBS that compose:
cascade-prune the coarse code (speed, lossless), then residue-refine only the
survivors (fidelity, +bytes).**

---

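
To make "a STRUCT of orthogonal facets, one SoA column per category" concrete, here is a minimal Python sketch of such a layout. Every name in it (`FacetSoA`, the column names, `prune_by_place`, `residues_for`) is hypothetical and chosen only to mirror the facet categories in this document; it is not the `lance-graph-contract` layout, and the exact-match prune is a stand-in for the real cascade.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FacetSoA:
    """Struct-of-arrays: one column per facet, one row index per item.

    Column names are illustrative assumptions, not the real contract.
    """
    place: List[int] = field(default_factory=list)        # hierarchical key
    position: List[bytes] = field(default_factory=list)   # PQ-style code
    orientation: List[int] = field(default_factory=list)  # packed direction
    relation: List[int] = field(default_factory=list)     # 64-bit edge word
    residue: List[bytes] = field(default_factory=list)    # refinement bytes

    def push(self, place, position, orientation, relation, residue) -> int:
        """Append one item to every column and return its row index."""
        self.place.append(place)
        self.position.append(position)
        self.orientation.append(orientation)
        self.relation.append(relation)
        self.residue.append(residue)
        return len(self.place) - 1

    def prune_by_place(self, key: int) -> List[int]:
        """Speed path: scan only the `place` column to pick survivor rows."""
        return [i for i, p in enumerate(self.place) if p == key]

    def residues_for(self, rows: List[int]) -> List[bytes]:
        """Fidelity path: read residue bytes for the survivors only."""
        return [self.residue[i] for i in rows]
```

The point of the layout is that the speed path and the fidelity path touch different columns, so adding a capability means adding a column rather than re-encoding an existing one.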
## The facets: one SoA column per native category

| Facet (SoA column) | Codec | Category | Measured this session | Knob |
|---|---|---|---|---|
| place / semantic basin | HHTL (HEEL·HIP·TWIG) | hierarchical key | cascade prune (CLAM dfs-sieve 2.3×; CAM-PQ coarse→fine 16–128× lossless) | speed |
| episodic basin | rolling floor (Belichtungsmesser / EWMA) | self-calibrating μ+3σ | ρ=1.0 tracking under SD drift; shipped global-Welford **inert** (bug) | speed/adaptivity |
| position (high-D) | CAM-PQ | NN-recall position | recall ~0.66 vs truth; cascade-prunable losslessly (recall 1.0 vs flat) | |
| orientation (phase+mag) | helix-48 | 3-DOF direction | 24-bit lossless vs ≤f16; needs +1 sign bit; ⊥ HHTL (ρ≈0); +13.6× recon | |
| spatial perturbation | helix → Morton pyramid | parametric field | 32,768× amortized, on-demand exact at every level, fine-scale coherent | speed/memory |
| relation + truth | CausalEdge64 (3×8 SPO + 2³ + f/c) | relational triple | SPO = 3× CAM-PQ palette + Pearl mask; entropy ρ=−0.78 reliability proxy | |
| reliability / entropy | entropy_class → CausalEdge64 spare [63:61] | Staunen↔Wisdom scalar | nars_entropy validated as reliability proxy | |
| value refinement | edge_codec CoarseResidue / turbovec | per-item residue | ICC 0.97–0.99, 14× error cut (vs coarse-only) | fidelity |
| time / recurrence | EpisodicWitness64 | temporal | **NOT PROBED (white patch)** | |

Bit budgets are the same order (≈6 bytes each) but the **domains differ**: the
6-byte coincidence is why "one vector" is tempting and wrong.

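
The table places `entropy_class` in the spare bits [63:61] of a 64-bit edge word. A minimal sketch of that packing follows, assuming only what the table states: three spare bits at the top of the word. The lower 61 bits are treated as opaque, and the function names are hypothetical rather than the `CausalEdge64` API.

```python
ENTROPY_SHIFT = 61                      # spare bits [63:61], per the facet table
ENTROPY_MASK = 0b111 << ENTROPY_SHIFT
WORD_MASK = (1 << 64) - 1


def set_entropy_class(edge: int, entropy_class: int) -> int:
    """Return `edge` with bits [63:61] replaced by `entropy_class` (0..7)."""
    if not 0 <= entropy_class <= 7:
        raise ValueError("entropy_class must fit in 3 bits")
    if not 0 <= edge <= WORD_MASK:
        raise ValueError("edge must be an unsigned 64-bit value")
    cleared = edge & ~ENTROPY_MASK & WORD_MASK   # keep the lower 61 bits as-is
    return cleared | (entropy_class << ENTROPY_SHIFT)


def get_entropy_class(edge: int) -> int:
    """Read the 3-bit class back out of bits [63:61]."""
    return (edge >> ENTROPY_SHIFT) & 0b111
```

Because the class lives in otherwise unused bits, storing it changes neither the rest of the word nor the column width.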
## The two knobs (the holy-grail mechanism, measured)

- **SPEED = the cascade.** Coarse→fine prune (partial-ADC / HHTL lower bound is
  admissible) + 2×2/4×4 register-blocked LUT (FastScan/AMX `pshufb`) + Morton-order
  contiguity + rolling-floor adaptive cut. **16–128× fewer full evals at recall
  1.000 vs flat** (`campq_cascade_probe`). Lossless: adds no error.
- **FIDELITY = the residue plane.** Coarse centroid + signed-4-bit / SVD residue.
  **ICC 0.97–0.99, 14×** error cut (`edge_codec_compare`). Adds bytes, not error.
- **They compose, orthogonally:** prune the coarse code to a small survivor set,
  then residue-refine only those. Fast AND accurate, each from its own mechanism.
  This composition is the holy grail's load-bearing claim (each half measured;
  the end-to-end compose is a white patch, see below).

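
The two mechanisms can be sketched on a toy scale. The Python below is an illustration under stated assumptions, not the `campq_cascade_probe` or `edge_codec_compare` code: subspaces are 1-D, the LUT holds squared distances, and all function names are hypothetical. It shows why a partial-ADC sum is an admissible lower bound (every term is non-negative, so pruning on it returns the same top-k as a flat scan) and what a signed 4-bit residue on top of a coarse centroid looks like.

```python
import heapq


def build_lut(query, codebooks):
    """lut[m][c] = squared distance from query[m] to centroid c of subspace m.

    Toy setting: each subspace is 1-D, so a centroid is a single number.
    """
    return [[(q - c) ** 2 for c in book] for q, book in zip(query, codebooks)]


def flat_topk(codes, lut, k):
    """Baseline: sum every subspace term for every item, then sort."""
    scored = [(sum(lut[m][c] for m, c in enumerate(code)), i)
              for i, code in enumerate(codes)]
    return sorted(scored)[:k]


def cascade_topk(codes, lut, k):
    """Same top-k, abandoning an item once its partial sum reaches the bound.

    Returns (top-k as (distance, index) pairs, count of items fully summed).
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    heap = []        # entries are (-distance, -index); heap[0] is the worst kept
    survivors = 0
    for i, code in enumerate(codes):
        bound = -heap[0][0] if len(heap) == k else None
        partial = 0
        for m, c in enumerate(code):
            partial += lut[m][c]
            if bound is not None and partial >= bound:
                break                    # cannot beat the current k-th best
        else:
            survivors += 1               # all subspaces summed, item accepted
            if bound is None:
                heapq.heappush(heap, (-partial, -i))
            else:
                heapq.heapreplace(heap, (-partial, -i))
    return sorted((-nd, -ni) for nd, ni in heap), survivors


def encode_residue(value, centroid, step):
    """Quantize (value - centroid) to a signed 4-bit integer in [-8, 7]."""
    q = round((value - centroid) / step)
    return max(-8, min(7, q))


def decode_residue(centroid, q, step):
    """Coarse centroid plus the dequantized residue."""
    return centroid + q * step
```

In a composed pipeline the residue would be decoded only for the rows `cascade_topk` returns; that end-to-end path is exactly the part this document lists as unmeasured.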
## The category boundary (the iron rule that kills the WRONG holy grail)

Per Correction-6 (`bf16-hhtl-terrain.md`) + I-VSA-IDENTITIES:
- Do NOT float-reconstruct a byte register (bgz-hhtl-d on Qwen: cos~0.1, dead).
- Do NOT squeeze a relation OR a high-D point into a 3-DOF helix (`codec_overlap_probe`:
  helix recall 0.245 vs CAM-PQ 0.657 on high-D; SPO is a different category entirely).
- Do NOT measure a router by reconstruction fidelity (it routes; only calibration matters).
- ⇒ The SoA stays a struct of facets; new capability = a new column, not a fold.

## The reproducible probe family (ndarray PR #218)

`reliability` (Pearson/Spearman/Cronbach/ICC) · `edge_codec` (coarse/residue/PQ) ·
`entropy_ladder` (Staunen↔Wisdom). Probes: `edge_codec_compare`,
`instrument_mtmm_probe`, `cakes_grail_probe`, `entropy_ladder_probe`,
`helix_orthogonality_probe`, `helix_bitdepth_probe`, `morton_perturbation_probe`,
`rolling_floor_probe`, `codec_overlap_probe`, `campq_cascade_probe`. Each settles a
claim with a number; two found shipped bugs (Cascade Welford-inert; the bgz17 OOB
gather, fixed).

## White patches on the map (unbuilt / unmeasured; be honest)

1. **EpisodicWitness64 / temporal facet**: referenced, never probed. Biggest gap.
2. **End-to-end compose**: cascade-prune × residue-refine measured *separately*,
   never together as one `coarse→prune→refine` pipeline.
3. **`cam_pq_cascade_search`**: probe-proven lossless, NOT wired into real `cam_pq.rs`.
4. **AMX-accelerated CAM-PQ assignment**: proven pattern (`edge_residue_probe` 100%
   assign), not wired into `cam_pq.rs`.
5. **`TD-CASCADE-WELFORD-INERT`**: shipped `Cascade::observe` never fires `ShiftAlert`
   per-sample (cumulative Δμ ≪ 2σ); needs windowed/EWMA. Found, not fixed.
6. **Real COCA codebook**: every probe is synthetic-COCA-like (labeled); none run on
   the actual baked CAM index codebook.
7. **Full SoA assembly**: facets validated individually; the unified SoA (all columns,
   one cascade sweep) is not assembled or measured end-to-end.
8. **entropy_class → CausalEdge64 spare bits** (R2): computed, not stored.
9. **bf16-hhtl probe queue M1/M3/M4**: the routing-not-reconstruction versions, NOT RUN.

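
Item 5 says a cumulative estimator goes inert while a windowed/EWMA one would not. The Python sketch below illustrates the general effect and does not reproduce `Cascade::observe`. It assumes a plain cumulative Welford accumulator, a hypothetical EWMA replacement with α = 0.05, a μ+3σ floor as in the facet table, and a synthetic stream whose spread jumps from ±1 to ±5. After many samples the cumulative update weight is 1/n, so the old regime dominates and the floor barely moves.

```python
import math


class Welford:
    """Cumulative mean/variance: every sample ever seen has equal weight."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def observe(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n          # update weight shrinks as 1/n
        self.m2 += d * (x - self.mean)

    def sd(self):
        return math.sqrt(self.m2 / self.n) if self.n else 0.0


class Ewma:
    """Exponentially weighted mean/variance: fixed weight alpha per sample."""

    def __init__(self, alpha):
        self.alpha, self.mean, self.var = alpha, 0.0, 0.0

    def observe(self, x):
        d = x - self.mean
        incr = self.alpha * d
        self.mean += incr
        self.var = (1.0 - self.alpha) * (self.var + d * incr)

    def sd(self):
        return math.sqrt(self.var)


def rolling_floor(est):
    """The mu + 3*sigma cut named in the facet table."""
    return est.mean + 3.0 * est.sd()


def run_drift(n_calm=10_000, n_wide=200, alpha=0.05):
    """Feed a +-1 stream, then a +-5 stream, to both estimators."""
    stream = [1.0 if i % 2 == 0 else -1.0 for i in range(n_calm)]
    stream += [5.0 if i % 2 == 0 else -5.0 for i in range(n_wide)]
    cumulative, ewma = Welford(), Ewma(alpha)
    for x in stream:
        cumulative.observe(x)
        ewma.observe(x)
    return cumulative, ewma
```

On this stream the cumulative floor stays below the new ±5 amplitude while the EWMA floor rises well above it, which is the tracking behaviour a windowed or EWMA fix would be expected to restore.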
## Cross-refs

lance-graph: `.claude/knowledge/encoding-ecosystem.md` (encoding map),
`.claude/knowledge/bf16-hhtl-terrain.md` (Correction chain incl. #6),
`.claude/plans/entropy-ladder-spo-rung-v1.md` (R1–R6), `lance-graph-contract`
(`CausalEdge64`, `EpisodicWitness64`, `EdgeCodecFlavor`, the BindSpace SoA columns).
