
hi3516av300: ship vendor SPL as :emmc board variant#103

Merged
widgetii merged 1 commit into master from hi3516av300-emmc-variant
May 15, 2026
Conversation

@widgetii
Member

Why

We have two hi3516av300 cameras on the bench: one with SPI NOR flash (ether8), one with eMMC (ether1). They use different DDR chips, and the OpenIPC U-Boot for hi3516av300 ships an SPL targeting SPI NOR boards. On the eMMC board, defib's boot protocol completes every stage with ACKs from the bootrom, but the bootrom faithfully calls the agent at 0x81000000 — DDR isn't backed there, the CPU fetches garbage, and the link goes silent (0 bytes for 30s, no READY).

Two pieces here. The first builds on #102's variant infrastructure to carry an SPL blob; the second is the actual extracted-vendor variant.

What

SPL_BLOB schema addition

Optional profile field naming a binary file (resolved relative to the profile JSON's directory). The loader reads it into profile.spl_data. The agent-upload CLI prefers profile.spl_data over the downloaded U-Boot when set:

```python
if profile.spl_data is not None:
    spl_data = profile.spl_data  # variant SPL takes precedence
else:
    spl_data = cached_fw.read_bytes()  # fall back to OpenIPC U-Boot first 20K
```

Variant declaration looks like:

```json
{
  "name": "hi3516av300",
  "...": "...",
  "variants": {
    "emmc": { "SPL_BLOB": "hi3516av300-emmc-spl.bin" }
  }
}
```
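A minimal sketch of how a loader could resolve `SPL_BLOB` relative to the profile JSON and apply a variant block on top of the base profile. The `Profile` dataclass and `load_profile` function are hypothetical names for illustration, not defib's actual loader; only `SPL_BLOB`, `variants`, and `spl_data` come from this PR.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Profile:
    name: str
    spl_data: Optional[bytes] = None  # None: slice the SPL out of U-Boot instead


def load_profile(json_path: Path, variant: Optional[str] = None) -> Profile:
    """Load a profile, optionally overlaying one named variant block."""
    raw = json.loads(json_path.read_text())
    fields = {k: v for k, v in raw.items() if k != "variants"}
    if variant is not None:
        variants = raw.get("variants", {})
        if variant not in variants:
            raise KeyError(f"unknown variant {variant!r} for {raw['name']}")
        fields.update(variants[variant])  # variant keys override base keys

    spl_data = None
    blob_name = fields.get("SPL_BLOB")
    if blob_name is not None:
        # Resolve against the directory holding the profile JSON.
        blob_path = json_path.parent / blob_name
        if not blob_path.is_file():
            raise FileNotFoundError(f"SPL_BLOB not found: {blob_path}")
        spl_data = blob_path.read_bytes()
    return Profile(name=raw["name"], spl_data=spl_data)
```

With this shape, the base profile keeps `spl_data=None`, so the existing U-Boot slicing path is untouched unless a variant opts in.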

hi3516av300:emmc variant

20480 bytes extracted from a working eMMC av300 board's vendor U-Boot (eMMC offset 0, truncated at the gzip boundary at 0x5000). Lives at src/defib/profiles/data/hi3516av300-emmc-spl.bin.

End-to-end verified on real hardware:

| Camera | `--chip` | Result |
| --- | --- | --- |
| SPI NOR av300 | `hi3516av300` | agent READY at t=0.3s |
| eMMC av300 | `hi3516av300` | 0 bytes for 30s (pre-existing failure mode) |
| eMMC av300 | `hi3516av300:emmc` | agent READY at t=0.3s |

Failure-diagnostic content update

The diagnostic message from #102 now actually has a real variant to suggest:

```
Known board variants for hi3516av300: emmc
Try: defib agent upload -c hi3516av300:emmc ...
```
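For illustration, the hint above could be produced by something like the following. `variant_hint` is a hypothetical helper, not the function from #102; only the two output lines are taken from this PR.

```python
def variant_hint(chip: str, variants: list[str]) -> str:
    """Return the two-line hint, or an empty string when no variants exist."""
    if not variants:
        return ""
    ordered = sorted(variants)
    return (
        f"Known board variants for {chip}: {', '.join(ordered)}\n"
        f"Try: defib agent upload -c {chip}:{ordered[0]} ..."
    )


print(variant_hint("hi3516av300", ["emmc"]))
```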

Extraction recipe

Captured in kaeru `hi3516av300-emmc-variant-shipped-2026-05-15` for the next board family that hits this:

  1. Catch vendor U-Boot prompt (^C bombardment)
  2. `mmc dev 0` then `mmc read 0 0x82000000 0 0x40` — note: this U-Boot 2016.11 wants `mmc read DEV addr blk# cnt`, not `mmc read addr blk# cnt`
  3. `loady 0x81000000` the defib agent, then `go 0x81000000`
  4. `agent.read_memory(0x82000000, 0x6000)` to pull the bytes back
  5. Truncate at the byte before the `\x1f\x8b\x08` gzip signature (0x5000 here) to drop the gzipped U-Boot tail
  6. Drop into `src/defib/profiles/data/--spl.bin` and add a variant block
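Step 5 can be sketched in a few lines. This is an illustrative helper, not code from the repository: `truncate_at_gzip` is a hypothetical name, and the dump below is synthetic filler rather than real board data. It takes the first occurrence of the signature, so on a new board the resulting offset is worth checking by eye in case the same three bytes appear inside the SPL itself.

```python
GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID1, ID2, CM=8 (deflate)
MMC_BLOCK = 512               # bytes per eMMC block


def truncate_at_gzip(dump: bytes) -> bytes:
    """Keep everything before the first gzip signature in the dump."""
    idx = dump.find(GZIP_MAGIC)
    if idx < 0:
        raise ValueError("no gzip signature found in dump")
    return dump[:idx]


# `mmc read 0 0x82000000 0 0x40` loads 0x40 blocks, which covers the
# 0x6000 bytes pulled back in step 4.
assert 0x40 * MMC_BLOCK >= 0x6000

# Synthetic stand-in for the 0x6000-byte dump: filler, then a gzip header.
dump = b"\xaa" * 0x5000 + GZIP_MAGIC + b"\x00" * (0x6000 - 0x5000 - 3)
spl = truncate_at_gzip(dump)
print(hex(len(spl)), len(spl))
```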

Test plan

  • `uv run pytest tests/ -x --ignore=tests/fuzz` — 522 passed, 2 skipped (5 new tests in test_profiles.py covering blob resolution, missing-blob error path, blob-via-variant, real av300:emmc, real av300 base)
  • `uv run ruff check` + `mypy` on changed files — clean
  • Real-hardware: eMMC av300 reaches READY at t=0.3s with `--chip hi3516av300:emmc` (was 0 bytes for 30s before this PR)
  • Real-hardware: SPI NOR av300 still reaches READY at t=0.3s with the base `--chip hi3516av300` (no regression)
  • The `*.bin` gitignore rule got a negation rule for `src/defib/profiles/data/*.bin` so SPL blobs don't get hidden

Aside

Found a separate routeros power-controller bug while iterating: `power_off → power_on` over a port that was already off restores it to "off" because `power_off` saves the current state (off) and `power_on` restores it. Worked around in test scripts via `_set_poe(port, 'forced-on')`. Worth fixing separately.

🤖 Generated with Claude Code

eMMC-equipped hi3516av300 cameras (the variant we have on the bench, ether1)
need different DDR-init bytes than the SPI NOR variant (ether8) because the
two boards use different DDR chips. The OpenIPC U-Boot for hi3516av300 ships
one SPL targeting SPI NOR boards — on the eMMC board, defib's boot protocol
completes every stage with ACKs but the bootrom faithfully calls the agent
at 0x81000000 in unbacked DDR, the CPU fetches garbage, and the link goes
silent (no UART output, no READY frame).

Two pieces:

* `SPL_BLOB`: optional profile field naming a binary file (resolved
  relative to the profile JSON) the loader reads into `profile.spl_data`.
  When the variant block sets `SPL_BLOB`, the upload CLI uses those bytes
  as the SPL stage instead of slicing them out of the downloaded U-Boot.

* `hi3516av300:emmc` ships with `SPL_BLOB = "hi3516av300-emmc-spl.bin"`,
  20480 bytes extracted from a working eMMC av300 board's vendor U-Boot
  via the vendor-U-Boot `loady` route. End-to-end verified on real
  hardware: `defib agent upload -c hi3516av300:emmc -p /dev/ttyUSB2`
  produces a READY frame at t=0.3s on the eMMC board where the base
  profile produces 0 bytes for 30s. SPI NOR av300 regression-tested with
  the base profile — still works, READY at t=0.3s.

Extraction recipe (for the next board that hits this — captured fully in
kaeru `hi3516av300-emmc-variant-shipped-2026-05-15`):
  1. catch vendor U-Boot prompt (^C bombardment)
  2. `mmc dev 0` then `mmc read 0 0x82000000 0 0x40` (this U-Boot wants
     `mmc read DEV addr blk# cnt`, not the usual `mmc read addr blk# cnt`)
  3. `loady 0x81000000` defib agent, then `go 0x81000000`
  4. agent.read_memory(0x82000000, 0x6000) to pull the bytes back
  5. truncate at byte before the `\x1f\x8b\x08` gzip signature (0x5000
     here) to drop the gzipped U-Boot tail

The `*.bin` global gitignore otherwise hid the new SPL blob — added a
negation rule for `src/defib/profiles/data/*.bin`.

5 new tests in test_profiles.py (TestSplBlob: blob bytes resolution,
missing-blob error path, blob-via-variant; TestBoardVariants:
real-av300:emmc variant loads, real-av300 base still loads with
`spl_data=None`). test_cli.py's diagnostic tests pivoted to use
`hi3516ev300` for the "no variants" path (av300 now has one) and gained
a `test_when_variants_exist_for_real_shipped_chip` to lock in the
"Known board variants for hi3516av300: emmc" message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@widgetii widgetii merged commit 4193805 into master May 15, 2026
13 checks passed
@widgetii widgetii deleted the hi3516av300-emmc-variant branch May 15, 2026 14:38
widgetii added a commit that referenced this pull request May 15, 2026
## Why

`RouterOSController` saves the port's previous `poe-out` mode on
`power_off` so `power_on` can put it back where it was, which
preserves "auto-on" vs "forced-on" distinctions correctly. But if the
port was already "off" when `power_off` ran (recovering a parked
camera; bench is dark on a fresh shell), the saved "previous" mode is
literally "off", and `power_on` then "restored" the port to off.
Cycling a powered-down port left it powered down forever, which silently
broke every recovery flow that started from off.

Surfaced while testing #103: every fresh script that started with
`power_off → power_on` left the camera unpowered, until I switched to
`_set_poe(port, 'forced-on')` directly as a workaround.

## What

`power_on` must always result in a powered port. If the saved mode is
"off", promote to "forced-on" and log the promotion (visible at
`-v`):

```python
async def power_on(self, port: str) -> None:
    # Default to forced-on when no mode was saved by a prior power_off.
    restore_mode = self._saved_poe_out.pop(port, "forced-on")
    if restore_mode == "off":
        logger.info(
            "PoE ON: %s on %s (saved state was 'off' — promoting to 'forced-on')",
            port, self._host,
        )
        restore_mode = "forced-on"
    else:
        logger.info("PoE ON: %s on %s (restoring %s)", port, self._host, restore_mode)
    await self._set_poe(port, restore_mode)
```

## Test plan

- [x] `uv run pytest tests/ -x --ignore=tests/fuzz`: **527 passed, 2
skipped** (5 new tests)
- [x] `uv run ruff check` + `mypy` on changed files: clean
- [x] Verified on real hardware (MikroTik 10.216.128.2 / ether8,
currently powered down):
      ```
      before:    ether8 poe-out='off'
      power_off: saved='off'
      power_on:  saved=None, port now 'forced-on'
      ```

5 new tests in `tests/test_power.py::TestRouterOSPowerOnOff` covering
forced-on round-trip, auto-on round-trip (preserved, not blindly
clobbered to forced-on), **off→on promotion (the bug)**, no-prior-off
default, and double-power_off-doesn't-clobber-saved-state edge cases.
They use a `_PoeStateRouterOS` subclass that stubs the two network
primitives so we can exercise the save/restore state machine without a
real switch.
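
A rough in-memory model of that save/restore state machine, for readers
who want to see the cases side by side. `PoeStateMachine` is a
hypothetical stand-in, not the `_PoeStateRouterOS` test subclass; in
particular, keeping the first saved mode via `setdefault` is an
assumption about how the double-`power_off` case is handled.

```python
import asyncio


class PoeStateMachine:
    """In-memory stand-in for the save/restore logic; no network I/O."""

    def __init__(self, initial: dict[str, str]) -> None:
        self.ports = dict(initial)  # simulated poe-out mode per port
        self._saved = {}            # mode remembered by power_off

    async def power_off(self, port: str) -> None:
        # Keep the first saved mode so a second power_off cannot clobber it.
        self._saved.setdefault(port, self.ports[port])
        self.ports[port] = "off"

    async def power_on(self, port: str) -> None:
        mode = self._saved.pop(port, "forced-on")
        if mode == "off":
            mode = "forced-on"  # power_on must always end with a powered port
        self.ports[port] = mode


async def demo() -> dict[str, str]:
    sm = PoeStateMachine({"ether8": "off", "ether1": "auto-on"})
    for port in ("ether8", "ether1"):
        await sm.power_off(port)
        await sm.power_off(port)  # double power_off
        await sm.power_on(port)
    return sm.ports


print(asyncio.run(demo()))
```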

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Dmitry Ilyin <widgetii@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
widgetii added a commit that referenced this pull request May 15, 2026
…500-family) (#105)

## Why

The cv500-family eMMC av300 board (`hi3516av300:emmc`) gets defib's boot
protocol working as of #103, but the agent itself has no idea what eMMC
is — `flash_init` probes SPI NOR via FMC, finds nothing, and falls back
to no-flash. CMD_READ against the FLASH_MEM virtual window has nowhere
to route.

This PR teaches the agent to drive the Synopsys DesignWare MMC host
controller (EMMC_BASE=0x10100000), bring up an eMMC card, and stream
blocks back through the existing CMD_READ. End-to-end verified on the
real board: bytes returned through `defib agent read --chip
hi3516av300:emmc -p /dev/ttyUSB2 --address 0x14000000 --size 0x6000`
match the shipped `hi3516av300-emmc-spl.bin` exactly (≈10 KB/s).

## What

```
agent/Makefile     |  16 +-
agent/emmc_himci.c | 485 +++++++++++++++++++++++++++++++++++++++++++++++++++++
agent/emmc_himci.h |  36 ++++
agent/main.c       |  78 ++++++++-
agent/spi_flash.h  |   1 +
```

* **`agent/emmc_himci.{c,h}`**: minimal DWMMC driver (~470 LOC):
pinmux setup, CRG configuration, controller reset, CMD0/1/2/3/9/7
identification, CMD17 single-block read with FIFO drain. The bootrom's
PERISTAT-driven init returns -1 when `PERISTAT[9:8]==0` (the state the
chip lands in after UART fastboot + truncated vendor SPL), so we bypass
that path entirely and hardcode mode-1 pinmux values pulled from the
bootrom ROM at offset 0x7d98.

* **`agent/main.c`**: eMMC init runs after SPI flash probe fails. On
success it populates `flash_info` with `FLASH_TYPE_EMMC=2`,
CID-derived jedec_id, capacity, and 512-byte sector size.
`handle_read` adds an eMMC branch that loops block-by-block through
`emmc_read_block`. The `FLASH_MEM + flash_info.size` arithmetic
moved to `uint64_t`: eMMC capacity values near 4 GiB were overflowing
`uint32_t` and silently routing reads to the (non-existent) SPI flash
window.

* **`addr_readable`**: extended whitelist for `EMMC_BASE`,
`IO_CTRL0_BASE`, and the bootrom mask ROM (`BOOTROM_BASE..+64 KiB`),
each gated by per-SoC Makefile defines. The bootrom-ROM range is the
source of the pinmux table; eMMC + IO ranges are needed by the driver.

* **`spi_flash.h`**: adds `FLASH_TYPE_EMMC = 2` to the enum.

* **`Makefile`**: adds `EMMC_BASE / IO_CTRL0_BASE / SYSCTRL_BASE /
BOOTROM_BASE / BOOTROM_SIZE` under the `hi3516cv500` stanza and
threads the corresponding `-D` defines into CFLAGS. `emmc_himci.c`
joins `SRCS_C` only when `EMMC_BASE` is set, so non-eMMC SoCs are
unaffected.
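
The `uint32_t` overflow mentioned under `agent/main.c` is easy to
reproduce with plain integer arithmetic. The sketch below models it in
Python by masking to 32 bits; `FLASH_MEM = 0x14000000` is an assumption
taken from the `--address` used in the read command above, and the two
`in_window_*` helpers are hypothetical, not the agent's actual range
check.

```python
FLASH_MEM = 0x14000000   # assumed base of the flash virtual window
U32_MASK = 0xFFFFFFFF


def in_window_u32(addr: int, size: int) -> bool:
    """Range check with the end address truncated to 32 bits (the bug)."""
    end = (FLASH_MEM + size) & U32_MASK
    return FLASH_MEM <= addr < end


def in_window_u64(addr: int, size: int) -> bool:
    """Same check with the end address kept at full width (the fix)."""
    end = FLASH_MEM + size
    return FLASH_MEM <= addr < end


emmc_size = 0xFFFFFE00  # capped eMMC capacity from the limitations list
print(hex((FLASH_MEM + emmc_size) & U32_MASK))   # wrapped end address
print(in_window_u32(FLASH_MEM, emmc_size))       # wrapped end sits below the base
print(in_window_u64(FLASH_MEM, emmc_size))
```

With the truncated sum the window's end lands below its own base, so no
address can ever match; small SPI NOR sizes never reach the wrap, which
is why the bug only appeared with eMMC.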

## Known limitations (intentional MVP cuts)

* **1-bit bus mode** (skip CMD6 SWITCH). Setting CTYPE=1 without first
switching the card to 4-bit wedges the data path silently — a real 4-bit
path needs CMD6, follow-up.
* **HC capacity capped at uint32_t max - sector (0xFFFFFE00)**. True
capacity for ≥ 2 GiB eMMC lives in EXT_CSD SEC_COUNT, which needs CMD8
SEND_EXT_CSD (data-bearing command) to retrieve. Until that lands, the
host can address the first 4 GiB linearly — fine for partition dumps.
* **CMD17 single-block only** (no CMD18 multi-block / no CMD12 STOP).
* **No writes, no erase** — that's the whole "read-only MVP" framing.
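
To make the single-block limitation concrete, here is a rough Python
model of serving an arbitrary byte range from 512-byte block reads, in
the spirit of the block-by-block loop described for `handle_read`.
`read_range` and the fake card are hypothetical; the real loop lives in
C and calls `emmc_read_block`.

```python
from typing import Callable

SECTOR = 512  # sector size reported in flash_info


def read_range(read_block: Callable[[int], bytes], offset: int, length: int) -> bytes:
    """Assemble `length` bytes starting at byte `offset`, one block at a time."""
    out = bytearray()
    block = offset // SECTOR
    skip = offset % SECTOR  # bytes to drop from the first block only
    while len(out) < length:
        data = read_block(block)
        if len(data) != SECTOR:
            raise IOError(f"short read on block {block}")
        out += data[skip:skip + (length - len(out))]
        skip = 0
        block += 1
    return bytes(out)


# Fake 4-block card so the sketch runs without hardware.
image = bytes(range(256)) * 8


def fake_block(n: int) -> bytes:
    return image[n * SECTOR:(n + 1) * SECTOR]


chunk = read_range(fake_block, 500, 600)  # unaligned start, spans 3 blocks
print(len(chunk), chunk == image[500:1100])
```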

## Test plan

- [x] **Real hardware**: agent identifies eMMC (CID MID=0x88, OEM=0x0103,
matching vendor U-Boot's `Manufacturer ID: 0x88, OEM: 103, Name:
"D9D16"` boot log), 24 KiB read returns bytes matching the shipped
SPL blob exactly.
- [x] **C-side tests** (`make -C agent test HOST_CC=gcc`): 5412
assertions pass.
- [x] **Python suite** (`uv run pytest tests/ -x --ignore=tests/fuzz`):
527 passed, 2 skipped.
- [x] **Lint/mypy** clean on touched files.
- [x] **SPI NOR av300 not regressed**: the eMMC code is `#ifdef
EMMC_BASE` gated and only enters when `flash_init()` failed, so chips
with working SPI flash never see the eMMC path.
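
For anyone cross-checking a CID against a U-Boot boot log, the sketch
below splits the top two 32-bit CID words into the fields quoted above.
The bit positions follow the split U-Boot's MMC info printout uses as
far as I recall (treat that as an assumption), `decode_cid` is a
hypothetical helper, and the two words are synthetic values built from
the reported fields, not a capture from the board.

```python
def decode_cid(cid0: int, cid1: int) -> dict:
    """Split the two most significant CID words into MID, OEM, and name."""
    name_bytes = bytes([
        cid0 & 0xFF,
        (cid1 >> 24) & 0xFF,
        (cid1 >> 16) & 0xFF,
        (cid1 >> 8) & 0xFF,
        cid1 & 0xFF,
    ])
    return {
        "mid": (cid0 >> 24) & 0xFF,    # manufacturer ID
        "oem": (cid0 >> 8) & 0xFFFF,   # OEM/application ID
        "name": name_bytes.decode("ascii", errors="replace"),
    }


# Synthetic words assembled from MID=0x88, OEM=0x0103, Name="D9D16".
cid0 = (0x88 << 24) | (0x0103 << 8) | ord("D")
cid1 = int.from_bytes(b"9D16", "big")
print(decode_cid(cid0, cid1))
```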

## Gotchas captured in the PR for future debug

1. **CTYPE bus-width** must match the card's actual mode (default 1-bit
until CMD6).
2. **uint32_t overflow** at `FLASH_MEM + flash_info.size` near 4 GiB;
needs uint64_t arithmetic.
3. **CMD2/9 responses** are R2 long (no CRC); pass `EMMC_FLAG_NO_CRC`
to skip controller CRC check.

Followup work captured in kaeru
`agent-emmc-himci-read-mvp-2026-05-15`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Dmitry Ilyin <widgetii@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>