agent: post-erase verify must use register-mode read past 1 MB #91
Merged
flash_verify_erased read sample bytes directly from FLASH_MEM (the
memory-mapped window), which on hi3516ev300 wraps at 1 MB. For any
sector at offset ≥ 0x100000 the verify read returned bytes from
sector (offset % 0x100000) instead of the actual just-erased sector,
so the smoke test saw non-0xFF data and reported ACK_FLASH_ERROR
even though the erase succeeded.
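To make the aliasing concrete, here is a minimal sketch of the address arithmetic, assuming the window wraps modulo 1 MB exactly as described above. `WINDOW_SIZE`, `window_alias` and `window_read_is_aliased` are illustrative names, not identifiers from the agent.

```c
#include <stdint.h>

/* Assumption: the memory-mapped window wraps modulo 1 MB. */
#define WINDOW_SIZE 0x100000u

/* Hypothetical helper: the flash offset a direct window read actually reaches. */
static uint32_t window_alias(uint32_t flash_offset)
{
    return flash_offset % WINDOW_SIZE;
}

/* Hypothetical helper: nonzero when a direct read would land on another sector. */
static int window_read_is_aliased(uint32_t flash_offset)
{
    return window_alias(flash_offset) != flash_offset;
}
```

Under that assumption, a direct read at 0x00110000 lands on 0x00010000, while any offset below 0x100000 maps to itself, which is why only writes past 1 MB tripped the smoke test.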
Effect: write_flash to any offset past 1 MB on hi3516ev300 (and any
other SoC where the boot-mode memory window wraps at 1 MB) failed
with a spurious error. Visible on W25Q128 (16 MB NOR): 12 sectors of
a kernel write completed, then sector 13 at flash offset 0x110000
failed with ACK_FLASH_ERROR (0x02). The same chip programmed cleanly
via U-Boot's `sf write`, and the agent's higher-level CRC32 verify
also passed, so the bug was localised to the post-erase smoke test.
Fix: route the verify reads through flash_read() (register-mode SPI
READ via FMC normal-mode), the same path flash_read_full has used
since the 1 MB-window workaround landed. The verify path is exposed
to the same wrap hazard for the same reason.
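The difference between the two read paths can be sketched against a simulated flash array. This is not the agent's code: every `sim_*` name, the `flash_read`-like signature, the 4 KB sample stride and the 64 KB sector size are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_FLASH_SIZE    0x200000u /* 2 MB simulated NOR, enough to cross 1 MB */
#define SIM_WINDOW_SIZE   0x100000u /* assumed wrap point of the mapped window */
#define SIM_SECTOR_SIZE   0x10000u  /* assumed 64 KB erase sector */
#define SIM_SAMPLE_STRIDE 0x1000u   /* assumed spacing of the sampled bytes */

static uint8_t sim_flash[SIM_FLASH_SIZE];

/* Stand-in for a direct FLASH_MEM read: the address wraps at 1 MB. */
static uint8_t sim_window_read(uint32_t off)
{
    return sim_flash[off % SIM_WINDOW_SIZE];
}

/* Stand-in for flash_read(): a register-mode read that uses the full address.
 * The signature is an assumption. Returns 0 on success, -1 if out of range. */
static int sim_flash_read(uint32_t off, uint8_t *buf, size_t len)
{
    if (buf == NULL || off > SIM_FLASH_SIZE || len > SIM_FLASH_SIZE - off)
        return -1;
    memcpy(buf, &sim_flash[off], len);
    return 0;
}

/* Pre-fix behaviour: sample the sector through the wrapping window. */
static int verify_erased_window(uint32_t sector_off)
{
    for (uint32_t i = 0; i < SIM_SECTOR_SIZE; i += SIM_SAMPLE_STRIDE)
        if (sim_window_read(sector_off + i) != 0xFF)
            return 0;
    return 1;
}

/* Post-fix behaviour: sample the same bytes through the register-mode path. */
static int verify_erased_regmode(uint32_t sector_off)
{
    uint8_t b;
    for (uint32_t i = 0; i < SIM_SECTOR_SIZE; i += SIM_SAMPLE_STRIDE) {
        if (sim_flash_read(sector_off + i, &b, 1) != 0 || b != 0xFF)
            return 0;
    }
    return 1;
}

/* Scenario: everything holds programmed data except the sector at 0x110000. */
static void sim_setup(void)
{
    memset(sim_flash, 0x00, sizeof sim_flash);
    memset(&sim_flash[0x110000u], 0xFF, SIM_SECTOR_SIZE);
}
```

After `sim_setup()`, `verify_erased_window(0x110000)` reports failure because it samples the programmed sector at 0x10000, whereas `verify_erased_regmode(0x110000)` sees the erased bytes and passes.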
Confirmed on hardware against rack pod 10.216.128.69
(hi3516ev300 + W25Q128):
Before fix:
0x00050000: OK in 6.4s CRC match=True ← <1 MB
0x000C0000: OK in 6.8s CRC match=True ← <1 MB
0x00110000: FAIL in 6.1s ← =1 MB+0x10000
0x00350000: FAIL in 6.1s ← 3.3 MB
0x00F00000: FAIL in 6.1s ← 15 MB
After fix:
0x00050000: OK in 6.4s CRC match=True
0x000C0000: OK in 6.3s CRC match=True
0x00110000: OK in 6.2s CRC match=True ✓
0x00350000: OK in 6.3s CRC match=True ✓
0x00F00000: OK in 6.3s CRC match=True ✓
Full nor-neo install through the agent (kernel 2.0 MB + rootfs 4.2 MB)
now completes end-to-end in 92 s at 81 KB/s sustained, and Linux boots
to `openipc-hi3516ev300 login:`.
Suite: 480 passed / 2 skipped; agent C tests: 5406/5406 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
`flash_verify_erased` (the post-erase smoke test) read sample bytes directly from `FLASH_MEM` (the memory-mapped window). On hi3516ev300, and apparently every SoC where `flash_read_full` already takes the register-mode-read path with the comment "boot mode memory window wraps at 1 MB on some SoCs", that direct read wraps at 1 MB. For any sector at flash offset ≥ `0x100000` the verify read returned bytes from `(offset % 0x100000)` instead of the actual just-erased sector, the bytes weren't `0xFF`, and the smoke test reported `ACK_FLASH_ERROR (0x02)`, even though the erase had completed cleanly.

Visible on W25Q128 (16 MB NOR): 12 sectors of a kernel write completed, then sector 13 at flash offset `0x110000` failed. Same chip programmed fine via U-Boot's `sf write`, and the agent's higher-level CRC32 verify (which uses `flash_read()` indirectly) also succeeded when bypassing the smoke test, so the bug was localised to this one read path.

Fix: route the verify reads through `flash_read()`, the same register-mode SPI READ path `flash_read_full` has used since the 1 MB window workaround originally landed.

Verification on rack pod `10.216.128.69` (hi3516ev300 + W25Q128)

Full OpenIPC nor-neo install through the agent (kernel 2.0 MB + rootfs 4.2 MB) now completes end-to-end in 92 s at 81 KB/s sustained, and Linux boots to `openipc-hi3516ev300 login:`.

Test plan

`uv run pytest tests/ -x -v --ignore=tests/fuzz`: 480 passed / 2 skipped
`make -C agent test HOST_CC=gcc`: 5406/5406 agent C tests pass

🤖 Generated with Claude Code