BANANASJIM · Jun 10, 2026
@@ -0,0 +1,149 @@
+# Proposal: packed HDR format decode for remote texture export
+
+## Motivation
+
+Issue #236 named R11G11B10_FLOAT and R9G9B9E5_SHAREDEXP as formats to support.
+PR #237 (621528f) scoped them out: `_decode_texture_png` rejects all non-Regular
+`ResourceFormatType`s with `-32002 "format not supported for remote decode"`.
+
+Both formats are common HDR render-target and light-probe formats:
+- R11G11B10_FLOAT is the standard G-buffer emission/radiance target in UE5, Unity HDRP,
+  and most modern engines. It stores three unsigned floats in 32 bits per pixel, the same
+  footprint as an RGBA8 target, which makes it a cheap HDR render target.
+- R9G9B9E5_SHAREDEXP appears as HDR skybox / IBL texture storage.
+
+Both can be unpacked in numpy with closed-form bit operations and no GPU round-trip, so
+they can follow the same local-decode path already used for Regular Float formats.
+
+## Design
+
+### Entry point
+
+`_decode_texture_png` currently has a hard gate at the top:
+
+```python
+if fmt.type != rd.ResourceFormatType.Regular:
+    return None
+```
+
+The fix adds two explicit branches **before** this gate, keyed on
+`fmt.type == rd.ResourceFormatType.R11G11B10` and
+`fmt.type == rd.ResourceFormatType.R9G9B9E5`. Each branch:
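+1. Returns `None` for MSAA textures (`msSamp > 1`); the existing MSAA guard sits below
+   the Regular gate and never sees these formats.
+2. Returns `None` unless `len(raw) == width * height * depth_lvl * 4` (4 bytes/pixel, fixed).
+3. Reinterprets `raw` as little-endian `uint32` and extracts float32 RGB via numpy bitops.
+4. Feeds the result into the existing Float display path: `nan_to_num`, `clip(0,1)`,
+   `_srgb_encode`, opaque alpha (255), RGBA PNG output.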
+
+The Regular gate is unchanged; every other non-Regular format still returns `None`.
+
+### Unpack functions
+
+Two private helpers (in `_helpers.py` alongside the existing helpers):
+
+**`_unpack_r11g11b10(words: np.ndarray) -> np.ndarray`**
+
+Input: uint32 array shape `(N,)`. Output: float32 array shape `(N, 3)` — R, G, B.
+
+Bit extraction (all shifts on the uint32 word):
+- R 11-bit: `words & 0x7FF`         (bits [0:11))
+- G 11-bit: `(words >> 11) & 0x7FF` (bits [11:22))
+- B 10-bit: `(words >> 22) & 0x3FF` (bits [22:32))
+
+For 11-bit component `x` (exp=5 bits, mant=6 bits, no sign):
+- `exp = x >> 6`, `mant = x & 0x3F`
+- exp == 0  → subnormal: `value = (mant / 64.0) * 2**-14`
+- exp == 31 → Inf/NaN (handled by nan_to_num downstream)
+- else      → normal: `value = (1.0 + mant / 64.0) * 2**(exp - 15)`
+
+For 10-bit component `x` (exp=5 bits, mant=5 bits, no sign):
+- `exp = x >> 5`, `mant = x & 0x1F`
+- exp == 0  → subnormal: `value = (mant / 32.0) * 2**-14`
+- exp == 31 → Inf/NaN
+- else      → normal: `value = (1.0 + mant / 32.0) * 2**(exp - 15)`
+
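+Vectorised implementation: build `exp` and `mant` arrays, then select between the three
+cases (subnormal / Inf-NaN / normal) with `np.where`. The Inf/NaN case should emit
+`np.inf` when `mant == 0` and `np.nan` otherwise: `nan_to_num` in the display path maps
+them to 1.0 and 0.0 respectively, which is what TC-2 (Inf clips to white) and TC-3 (NaN
+renders black) assert.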
+
+**`_unpack_r9g9b9e5(words: np.ndarray) -> np.ndarray`**
+
+Input: uint32 array shape `(N,)`. Output: float32 array shape `(N, 3)`.
+
+Bit extraction:
+- R mantissa 9-bit: `words & 0x1FF`           (bits [0:9))
+- G mantissa 9-bit: `(words >> 9) & 0x1FF`    (bits [9:18))
+- B mantissa 9-bit: `(words >> 18) & 0x1FF`   (bits [18:27))
+- Shared exponent 5-bit: `(words >> 27) & 0x1F` (bits [27:32))
+
+Decode: `value_c = mant_c * 2.0**(exp - 24)` (equivalent to `mant_c / 512.0 * 2^(exp-15)`).
+No Inf/NaN possible (the exponent has no reserved value in this format); shared exponent
+E=31 is valid and just produces large values which clip to 1 in the display path.
+
+### Integration into `_decode_texture_png`
+
+IMPORTANT (ordering): in the current code the `ResourceFormatType.Regular` gate
+(`if fmt.type != rd.ResourceFormatType.Regular: return None`) comes FIRST, and the MSAA
+guard (`if getattr(tex, "msSamp", 1) > 1: return None`) comes AFTER it. Packed formats are
+non-Regular, so they are rejected by the Regular gate before ever reaching the MSAA guard.
+The packed branch MUST therefore be inserted **before** the Regular gate, and it MUST:
+(a) perform its own MSAA check (the existing guard is below the Regular gate and is
+unreachable for non-Regular formats), and (b) compute `width`/`height`/`depth_lvl`
+locally, because those locals are not yet defined this early in the function.
+
+Insert immediately after `fmt = tex.format` (and after the `if not raw: return None`
+check), before the Regular gate:
+
+```python
+# Packed HDR formats: 4 bytes/pixel, closed-form numpy decode.
+if fmt.type in (rd.ResourceFormatType.R11G11B10, rd.ResourceFormatType.R9G9B9E5):
+    if getattr(tex, "msSamp", 1) > 1:
+        return None
+    width = max(1, tex.width >> mip)
+    height = max(1, tex.height >> mip)
+    depth_lvl = max(1, getattr(tex, "depth", 1) >> mip)
+    if len(raw) != width * height * depth_lvl * 4:
+        return None
+    words = np.frombuffer(raw, dtype=np.dtype("<u4")).reshape((depth_lvl * height, width))
+    flat = words.ravel()
+    if fmt.type == rd.ResourceFormatType.R11G11B10:
+        rgb = _unpack_r11g11b10(flat)
+    else:
+        rgb = _unpack_r9g9b9e5(flat)
+    rgb_img = rgb.reshape((depth_lvl * height, width, 3))
+    # Reuse Float display path.
+    sanitized = np.nan_to_num(rgb_img, nan=0.0, posinf=1.0, neginf=0.0)
+    f = np.clip(sanitized, 0.0, 1.0)
+    alpha = np.full((depth_lvl * height, width, 1), 255, np.uint8)
+    rgb8 = (_srgb_encode(f) * 255.0).round().astype(np.uint8)
+    out = np.concatenate([rgb8, alpha], axis=2)
+    buf = io.BytesIO()
+    Image.fromarray(out, mode="RGBA").save(buf, format="PNG")
+    return buf.getvalue()
+```
+
+Notes:
+- `depth_lvl * height` matches the 3D tiling logic already in the Regular path.
+- `BGRAOrder()` does not apply (R11G11B10 and R9G9B9E5 have no BGRA variant).
+- `is_depth` never applies (these are color formats; callers do not set it for HDR RTs).
+- MSAA is rejected by the explicit `msSamp` check inside this branch (the function's other
+  MSAA guard sits below the Regular gate and never sees non-Regular formats).
+- The length check uses `* 4` not `* cc * cbw` because `compCount`/`compByteWidth` are
+  not meaningful for packed formats — only `ElementSize()` (which equals 4) matters.
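The length check reduces to a small piece of arithmetic, sketched here in isolation. `packed_level_nbytes` is a hypothetical name used only for illustration; the mip shifts and the fixed factor of 4 mirror the branch above.

```python
def packed_level_nbytes(width: int, height: int, depth: int, mip: int) -> int:
    """Expected byte count of one mip level of a 32-bit packed format (hypothetical helper)."""
    w = max(1, width >> mip)
    h = max(1, height >> mip)
    d = max(1, depth >> mip)
    # Packed formats are always 4 bytes per pixel, regardless of compCount/compByteWidth.
    return w * h * d * 4


# A 2x2 texture needs 16 bytes, so the 4-byte payload in TC-5 and TC-11 is rejected.
print(packed_level_nbytes(2, 2, 1, 0))
```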
+
+### What is NOT changed
+
+- Local mode `SaveTexture` path: unchanged.
+- The `_decode_dtype` table: unchanged (packed formats never reach it).
+- All other non-Regular formats: still rejected via the existing gate.
+- `rt_overlay` guard: still blocked.
+- `_export_remote` and call sites in `texture.py`: no change needed; they already pass
+  `raw` to `_decode_texture_png` and propagate `None` → `-32002`.
+
+## Risks
+
+| Risk | Mitigation |
+|------|------------|
+| Bit-extraction off-by-one | Hand-computed known-value unit tests with exact uint32 words. |
+| Wrong `np.where` branch for subnormal vs. normal | Explicit test case for subnormal (exp=0, mant=1): the 11-bit value must be ~9.5e-7 (2^-20), not zero. |
+| Inf/NaN leaked to Image.fromarray | `nan_to_num` is applied before clip; verified by existing NaN test pattern on Float path. |
+| Length check wrong (using cc*cbw) | Spec explicitly mandates `* 4`; length test covers wrong-size rejection. |
+| Regression on existing Regular formats | New branches are fully guarded by `fmt.type`; Regular path code is untouched. |
@@ -0,0 +1,46 @@
+# Tasks: packed-hdr-decode
+
+## Phase A: unpack helpers
+
+- [x] Add `_unpack_r11g11b10(words: np.ndarray) -> np.ndarray` to
+  `src/rdc/handlers/_helpers.py` (place adjacent to `_decode_dtype`).
+  Vectorised numpy: extract R/G 11-bit and B 10-bit fields; apply subnormal /
+  normal / inf-nan cases via `np.where`; return float32 shape `(N, 3)`.
+
+- [x] Add `_unpack_r9g9b9e5(words: np.ndarray) -> np.ndarray` to the same file.
+  Extract R/G/B 9-bit mantissas and 5-bit shared exponent; decode as
+  `mant * 2.0**(exp - 24)`; return float32 shape `(N, 3)`.
+
+## Phase B: hook into `_decode_texture_png`
+
+- [x] In `_decode_texture_png`, insert the packed-HDR branch **before** the
+  `ResourceFormatType.Regular` gate (the existing MSAA guard sits below that gate and is
+  unreachable for non-Regular formats, so it cannot be relied on):
+  - Guard on `fmt.type in (rd.ResourceFormatType.R11G11B10, rd.ResourceFormatType.R9G9B9E5)`
+  - Own MSAA check: `if getattr(tex, "msSamp", 1) > 1: return None`
+  - Compute `width`/`height`/`depth_lvl` locally (those locals are defined only after the
+    Regular gate in the current code)
+  - Length check: `len(raw) != width * height * depth_lvl * 4` → return None
+  - Reinterpret as `uint32` LE, reshape to `(depth_lvl * height, width)`, ravel, call the
+    appropriate unpack helper, reshape back to `(depth_lvl * height, width, 3)`
+  - Apply Float display path: `nan_to_num`, `clip`, `_srgb_encode`, alpha=255, RGBA PNG
+
+## Phase C: unit tests
+
+- [x] Add TC-1 through TC-14 (from test-plan.md) to
+  `tests/unit/test_tex_stats_handler.py`, following the `_remote_state` / `_handle_request`
+  pattern used by the existing remote decode tests.
+  - Use `struct.pack("<I", <word>)` to construct raw bytes for each test vector.
+  - Pixel assertions use `img.getpixel((0, 0))` on the decoded PNG.
+- [x] TC-15 (MANDATORY): repurpose the existing
+  `test_tex_export_remote_packed_format_rejected` — it currently asserts R11G11B10
+  (`type=13`) is rejected with `-32002`, which this change breaks. Swap its fixture to a
+  still-unsupported non-Regular packed type (e.g. `R5G6B5` type=14 or `R10G10B10A2`
+  type=12), keep the `-32002 "not supported"` assertion, and rename the test.
+
+## Phase D: verification
+
+- [x] Run `pixi run lint` — no new lint errors.
+- [x] Run `pixi run test` — all existing tests pass; new TC-1 through TC-14 pass.
+- [ ] Real-GPU verify step per test-plan.md section "Manual / real-GPU verification"
+  (or mark DEFERRED with a tracking comment if no suitable capture is available).
@@ -0,0 +1,185 @@
+# Test plan: packed HDR format decode
+
+All tests follow the pattern in `tests/unit/test_tex_stats_handler.py`:
+`_remote_state(tex, raw, tmp_path)` + `_handle_request(rpc_request("tex_export", {...}), state)`.
+Format fields use `rd.ResourceFormat(type=..., compByteWidth=4, compCount=3, compType=1)`.
+- `rd.ResourceFormatType.R11G11B10 = 13`
+- `rd.ResourceFormatType.R9G9B9E5 = 16`
+
+---
+
+## Bit-vector construction reference
+
+### R11G11B10_FLOAT
+
+Per-pixel layout in a little-endian uint32:
+- R 11-bit: bits [0:11) — 5-bit exponent (bits 6-10), 6-bit mantissa (bits 0-5), no sign.
+- G 11-bit: bits [11:22).
+- B 10-bit: bits [22:32) — 5-bit exponent (bits 27-31 of the full word), 5-bit mantissa.
+
+Decode of an 11-bit component `x`:
+- exp = x >> 6, mant = x & 0x3F
+- exp == 0: value = (mant / 64) * 2^-14  (subnormal)
+- exp == 31: Inf (mant==0) or NaN (mant!=0)
+- else: value = (1 + mant/64) * 2^(exp-15)
+
+Decode of the 10-bit B component `x`:
+- exp = x >> 5, mant = x & 0x1F
+- exp == 0: value = (mant / 32) * 2^-14
+- exp == 31: Inf/NaN
+- else: value = (1 + mant/32) * 2^(exp-15)
+
+**Known-value uint32 words (LE):**
+
+| Color (R, G, B) | uint32 word | LE bytes | Notes |
+|-----------------|-------------|----------|-------|
+| (1.0, 0.5, 0.25) | `0x681C03C0` | `[0xC0,0x03,0x1C,0x68]` | R: exp=15 mant=0; G: exp=14 mant=0; B: exp=13 mant=0 |
+| max finite (all ch) | `0xF7FDFFBF` | `[0xBF,0xFF,0xFD,0xF7]` | R,G: exp=30 mant=63; B: exp=30 mant=31 |
+| Inf (all ch) | `0xF83E07C0` | `[0xC0,0x07,0x3E,0xF8]` | R,G: exp=31 mant=0; B: exp=31 mant=0 |
+| NaN (all ch) | `0xF87E0FC1` | `[0xC1,0x0F,0x7E,0xF8]` | R,G: exp=31 mant=1; B: exp=31 mant=1 |
+| subnormal (mant=1 all) | `0x00400801` | `[0x01,0x08,0x40,0x00]` | R,G: exp=0 mant=1; B: exp=0 mant=1 |
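The words in the table can be rebuilt from their (exp, mant) pairs, which is a quick way to re-derive a vector when adding a test. `pack_r11g11b10` below is a hypothetical test helper, not something the proposal adds.

```python
import struct


def pack_r11g11b10(r_exp: int, r_mant: int, g_exp: int, g_mant: int,
                   b_exp: int, b_mant: int) -> int:
    """Assemble a uint32 word from per-channel exponent and mantissa fields."""
    r_bits = (r_exp << 6) | r_mant  # 11 bits: 5-bit exponent, 6-bit mantissa
    g_bits = (g_exp << 6) | g_mant  # 11 bits: 5-bit exponent, 6-bit mantissa
    b_bits = (b_exp << 5) | b_mant  # 10 bits: 5-bit exponent, 5-bit mantissa
    return r_bits | (g_bits << 11) | (b_bits << 22)


vectors = {
    "(1.0, 0.5, 0.25)": pack_r11g11b10(15, 0, 14, 0, 13, 0),
    "max finite": pack_r11g11b10(30, 63, 30, 63, 30, 31),
    "Inf": pack_r11g11b10(31, 0, 31, 0, 31, 0),
    "NaN": pack_r11g11b10(31, 1, 31, 1, 31, 1),
    "subnormal": pack_r11g11b10(0, 1, 0, 1, 0, 1),
}
for name, word in vectors.items():
    # Print the word and its little-endian byte sequence, as listed in the table.
    print(f"{name}: 0x{word:08X} {struct.pack('<I', word).hex(' ')}")
```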
+
+### R9G9B9E5_SHAREDEXP
+
+Per-pixel layout in a little-endian uint32:
+- R mantissa 9-bit: bits [0:9)
+- G mantissa 9-bit: bits [9:18)
+- B mantissa 9-bit: bits [18:27)
+- Shared exponent 5-bit: bits [27:32)
+
+Decode: `value_c = mant_c * 2.0^(exp - 24)` (= `mant_c / 512 * 2^(exp-15)`).
+No reserved exponent values; no Inf/NaN possible.
+
+**Known-value uint32 words (LE):**
+
+| Color (R, G, B) | uint32 word | LE bytes | Build (E, rm, gm, bm) |
+|-----------------|-------------|----------|-----------------------|
+| (1.0, 1.0, 1.0) | `0xC0040201` | `[0x01,0x02,0x04,0xC0]` | E=24, m=1 each: `1 * 2^0 = 1.0` |
+| (1.0, 0.5, 0.25) | `0xB0040404` | `[0x04,0x04,0x04,0xB0]` | E=22, rm=4, gm=2, bm=1: `4*2^-2=1, 2*2^-2=0.5, 1*2^-2=0.25` |
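The same cross-check works for the shared-exponent words. Both function names below are hypothetical and exist only to illustrate the layout and the scalar form of the decode.

```python
def pack_r9g9b9e5(e: int, rm: int, gm: int, bm: int) -> int:
    """Assemble a uint32 word from a shared exponent and three 9-bit mantissas."""
    return rm | (gm << 9) | (bm << 18) | (e << 27)


def decode_r9g9b9e5(word: int) -> tuple:
    """Scalar reference decode: value_c = mant_c * 2**(E - 24)."""
    scale = 2.0 ** (((word >> 27) & 0x1F) - 24)
    return tuple(((word >> shift) & 0x1FF) * scale for shift in (0, 9, 18))


word = pack_r9g9b9e5(22, 4, 2, 1)
print(f"0x{word:08X}", decode_r9g9b9e5(word))
```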
+
+**Expected sRGB output bytes** (after clip + `_srgb_encode`):
+- 1.0 → 255, 0.5 → 188, 0.25 → 137, 0.0 → 0
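The expected bytes above can be reproduced with a scalar version of the display mapping. This sketch assumes that the project's `_srgb_encode` implements the standard piecewise sRGB transfer function; its body is not shown in this change, so `srgb_encode_sketch` and `to_byte` are hypothetical stand-ins.

```python
def srgb_encode_sketch(x: float) -> float:
    """Standard linear-to-sRGB transfer function on a value clipped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    if x <= 0.0031308:
        return 12.92 * x  # linear toe segment
    return 1.055 * x ** (1.0 / 2.4) - 0.055


def to_byte(x: float) -> int:
    """Quantise an encoded value to 8 bits, mirroring `(… * 255.0).round()`."""
    return round(srgb_encode_sketch(x) * 255.0)


for value in (1.0, 0.5, 0.25, 0.0):
    print(value, to_byte(value))
```

Values above 1.0, such as the 65408.0 produced by an all-ones R9G9B9E5 word, clip to 1.0 first and therefore also map to 255.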
+
+---
+
+## R11G11B10_FLOAT unit tests
+
+**TC-1: happy path (1.0, 0.5, 0.25)**
+- `fmt`: type=13, compByteWidth=4, compCount=3, compType=1, name="R11G11B10_FLOAT"
+- `tex`: 1×1, msSamp=1
+- `raw`: `struct.pack("<I", 0x681C03C0)` (4 bytes)
+- `rpc`: `tex_export`, id=<tex_id>
+- Assert: `resp["result"]` present; PNG RGBA; pixel[0,0][0] == 255, pixel[0,0][1] ≈ 188 (±2), pixel[0,0][2] ≈ 137 (±2), alpha == 255
+
+**TC-2: Inf clips to white**
+- `raw`: `struct.pack("<I", 0xF83E07C0)` (all-Inf)
+- Assert: pixel[0,0] == (255, 255, 255, 255)
+
+**TC-3: NaN renders black**
+- `raw`: `struct.pack("<I", 0xF87E0FC1)` (all-NaN)
+- Assert: pixel[0,0][0] == 0, pixel[0,0][1] == 0, pixel[0,0][2] == 0, alpha == 255
+
+**TC-4: subnormal is non-negative and very small**
+- `raw`: `struct.pack("<I", 0x00400801)` (exp=0 mant=1 for R,G,B)
+- Assert: `resp["result"]` present; pixel[0,0] == (0, 0, 0, 255) (the decoded values, ~9.5e-7 for R/G and ~1.9e-6 for B, sRGB-encode to 0 after rounding); no error
+
+**TC-5: wrong length rejected**
+- `tex`: 2×2
+- `raw`: `b"\x00" * 4` (should be 16 bytes)
+- Assert: `resp["error"]["code"] == -32002`
+
+**TC-6: MSAA rejected**
+- `tex`: 1×1, msSamp=4
+- `raw`: `struct.pack("<I", 0x681C03C0)`
+- Assert: `resp["error"]["code"] == -32002`
+
+**TC-7: 3D tiled (depth=2)**
+- `tex`: 1×1, depth=2
+- `raw`: `struct.pack("<2I", 0x681C03C0, 0x00000000)` (8 bytes = 2 slices)
+- Assert: `resp["result"]` present; PNG size == (1, 2); pixel[0,0] ≈ (255, 188, 137, 255); pixel[0,1] == (0, 0, 0, 255)
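The expected PNG size follows from how depth slices are stacked. The short sketch below covers only the reshape step from the proposal, with the TC-7 payload as input.

```python
import struct

import numpy as np

width, height, depth_lvl = 1, 1, 2
raw = struct.pack("<2I", 0x681C03C0, 0x00000000)  # slice 0, then slice 1

# Slices are stacked vertically: rows = depth_lvl * height, columns = width.
words = np.frombuffer(raw, dtype=np.dtype("<u4")).reshape((depth_lvl * height, width))

# Array shape is (rows, cols); an image built from it has size (width, rows),
# so slice 1 ends up at pixel (x=0, y=1).
print(words.shape, hex(int(words[0, 0])), hex(int(words[1, 0])))
```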
+
+---
+
+## R9G9B9E5_SHAREDEXP unit tests
+
+**TC-8: happy path (1.0, 1.0, 1.0)**
+- `fmt`: type=16, compByteWidth=4, compCount=3, compType=1, name="R9G9B9E5_SHAREDEXP"
+- `tex`: 1×1
+- `raw`: `struct.pack("<I", 0xC0040201)` (4 bytes)
+- Assert: `resp["result"]` present; pixel[0,0] == (255, 255, 255, 255)
+
+**TC-9: happy path (1.0, 0.5, 0.25)**
+- `raw`: `struct.pack("<I", 0xB0040404)`
+- Assert: pixel[0,0][0] == 255, pixel[0,0][1] ≈ 188 (±2), pixel[0,0][2] ≈ 137 (±2), alpha == 255
+
+**TC-10: zero value**
+- `raw`: `struct.pack("<I", 0x00000000)` (E=0, all m=0)
+- Assert: pixel[0,0] == (0, 0, 0, 255) — `0 * 2^(0-24) = 0`
+
+**TC-11: wrong length rejected**
+- `tex`: 2×2
+- `raw`: `b"\x00" * 4`
+- Assert: `resp["error"]["code"] == -32002`
+
+**TC-12: 3D tiled (depth=2)**
+- `tex`: 1×1, depth=2
+- `raw`: `struct.pack("<2I", 0xC0040201, 0x00000000)` (8 bytes)
+- Assert: PNG size == (1, 2); pixel[0,0] == (255, 255, 255, 255); pixel[0,1] == (0, 0, 0, 255)
+
+**TC-12b: max shared exponent (E=31, mantissa=511) clips to white**
+- `raw`: `struct.pack("<I", 0xFFFFFFFF)` (E=31, all mantissas=511 → each ch = 65408.0)
+- Assert: pixel[0,0] == (255, 255, 255, 255). Confirms E=31 is a valid (non-reserved)
+  exponent that produces large finite values clipped to 1, not Inf/NaN.
+
+---
+
+## Regression guard
+
+**TC-13: existing Regular Float format still works**
+- `fmt`: type=0 (Regular), compByteWidth=4, compCount=4, compType=1 (R32G32B32A32_FLOAT)
+- Verify that an existing test (e.g. `test_tex_export_remote_rgba32f_hdr_clip`) still passes
+  unchanged — confirms the new branches do not interfere with the Regular path.
+
+**TC-14: BC1 (block-compressed) still rejected**
+- `fmt`: type=2 (BC1), compByteWidth=0, compCount=4
+- Assert: `-32002 "not supported"` — confirms the non-Regular gate is intact for BC.
+
+**TC-15: repurpose the existing rejection test (MANDATORY — currently broken by this change)**
+- The existing `test_tex_export_remote_packed_format_rejected` (in
+  `tests/unit/test_tex_stats_handler.py`) builds an R11G11B10 (`type=13`) texture and
+  asserts `-32002 "not supported"`. After this change R11G11B10 **decodes**, so that test
+  WILL FAIL as written.
+- Required action: repurpose it. Replace the `type=13` fixture with a still-unsupported
+  non-Regular format that retains the rejection-test role — e.g. `R5G6B5` (`type=14`) or
+  `R10G10B10A2` (`type=12`), both present in the mock enum and not decoded by this change.
+  Keep the `-32002 "not supported"` assertion. Rename the test accordingly (e.g.
+  `test_tex_export_remote_unsupported_packed_format_rejected`).
+- Note: TC-14 (BC1) already guards the block-compressed path; TC-15 specifically preserves
+  coverage of a still-unsupported *packed* non-Regular type after R11G11B10/R9G9B9E5 became
+  decodable.
+
+---
+
+## Manual / real-GPU verification
+
+1. Find or create a RenderDoc capture that contains R11G11B10_FLOAT or R9G9B9E5_SHAREDEXP
+   render targets. Any modern engine HDR G-buffer pass or light probe capture works.
+   If no capture is available locally, generate one from a Vulkan sample (e.g. Sascha Willems
+   `hdr` sample) with `rdc capture`.
+
+2. Open the capture in remote-replay mode: `rdc open capture.rdc --proxy host:port`.
+
+3. Run `rdc rt <eid> -o /tmp/hdr_rt.png` for a draw event whose primary RT has one
+   of the packed formats. Verify:
+   - Command exits 0.
+   - The PNG file exists and opens in an image viewer showing a plausible HDR scene
+     (bright highlights clipped to white, not garbled noise).
+   - `file /tmp/hdr_rt.png` reports PNG, `identify /tmp/hdr_rt.png` (ImageMagick) reports
+     geometry matching the RT dimensions.
+
+4. Cross-check: use `SaveTexture` in local mode on the same event/resource. Compare the
+   two PNGs visually; they should be perceptually similar (same content, slight gamma
+   difference acceptable since local mode may use a different display mapping).
+
+5. If a capture is unavailable: fallback is unit vectors only (TC-1 through TC-12 above).
+   Mark the real-GPU step as DEFERRED and file a tracking comment in the PR.