restore: route through pod fastboot + pod TFTP when power=rack #94
Merged
Brings restore to parity with install (#88 + #93) for rack-controlled cameras:

* Phase 1: when power=rack, drive the bring-up via `run_rack_fastboot()` (which handles its own power-cycle and locks UART to the pod for the upload). The previous host-side frame-blast race (power-off → open serial → start session → power-on) is RouterOS-only; rack pods don't expose independent power_off/on and don't need it, because the pod's `/fastboot` does the whole sequence locally with microsecond ACK latency.
* Phase 5: add `--tftp-via=auto|pod|host` (default auto: pod when power=rack, host otherwise) and pick the TFTP backend with the same `AsyncExitStack` pattern install uses. The pod path stages every partition via `RackController.tftp_put` and sets `serverip=192.168.1.1` (the pod itself). `tftp_clear` is called BEFORE staging too, so a prior aborted run can't OOM the next one.
* `_replace_in_tftp(name, data)` unifies the UBI rootfs swap: the pod path re-POSTs to /tftp/<name>; the host path reassigns the in-memory dict.
* Wrap the partition-write loop + final reset in `try/finally` so the pod TFTP cleanup (or host UDP-socket close) always fires, even on a mid-loop failure. Without the wrap, a Phase-5 raise would skip `__aexit__` and leak ~7 MB of pod PSRAM until the next install.
* Drop the hard-coded "restore needs RouterOSController only" reject in the power-controller setup. RackController is now an accepted alternative. Vectis stays rejected (no independent off/on, no /fastboot equivalent).

### Live verification on rack pod 10.216.128.69

Synthetic dump dir at /tmp/cam_dump/ (mtd0..3 sized to match the 16 MB NOR layout):

    $ DEFIB_POWER_TYPE=rack DEFIB_RACK_HOST=10.216.128.69 \
        defib restore -c hi3516ev300 -i /tmp/cam_dump/ \
        -p rack://10.216.128.69 --power-cycle --flash-type nor
    Power: rack pod HTTP API
    Phase 1: Loading U-Boot to RAM
    Pod-side fastboot in progress…
    Phase 4: Network setup — Network OK (attempt 1)
    Phase 5: Writing flash
    Staging 7664 KB in pod PSRAM via POST /tftp/<name>...
    Pod TFTP ready on 192.168.1.1:69
    mtd1: 64KB → 0x40000 Written (7.5s)
    mtd2: 3072KB → 0x50000 Written (11.7s)
    mtd3: 4272KB → 0x350000 Written (15.7s)
    mtd0: 256KB → 0x0 Written (8.3s)
    Restore complete!

Camera reaches `openipc-hi3516ev300 login:` cleanly. exit=0.

Companion rack-firmware bump (local-only): UART_IDLE_TIMEOUT_S 60 → 600. The 60-second idle timer was killing the bridge socket mid-staging (~50 s of HTTP /tftp uploads with zero UART traffic counts as "idle" to the bridge); 600 s comfortably covers full installs and restores.

Suite: 486 passed / 2 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
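The clear-before-staging and `try/finally` points in the commit message can be illustrated with a minimal sketch. It is not the code from this PR: `FakePod`, `restore`, and `demo` are hypothetical stand-ins, and only the names `tftp_clear`, `tftp_put`, and `AsyncExitStack` come from the description. It assumes the stack is entered manually, so cleanup only runs if `aclose()` is reached.

```python
import asyncio
from contextlib import AsyncExitStack


class FakePod:
    """Hypothetical stand-in for the pod's PSRAM-backed TFTP staging area."""

    def __init__(self):
        self.staged = {}
        self.clear_calls = 0

    async def tftp_clear(self):
        self.clear_calls += 1
        self.staged.clear()

    async def tftp_put(self, name, data):
        self.staged[name] = data


async def restore(pod, partitions, fail_on=None):
    stack = AsyncExitStack()
    # Wipe leftovers from a prior aborted run before staging anything.
    await pod.tftp_clear()
    # Pre-register the cleanup hook so aclose() releases the staged data.
    stack.push_async_callback(pod.tftp_clear)
    try:
        for name, data in partitions.items():
            await pod.tftp_put(name, data)
        for name in partitions:
            if name == fail_on:  # simulated mid-loop write failure
                raise RuntimeError(f"write failed: {name}")
    finally:
        # Runs on success and on failure, so staged data is never leaked.
        await stack.aclose()


async def demo():
    pod = FakePod()
    parts = {"mtd1": b"env", "mtd2": b"kernel"}
    try:
        await restore(pod, parts, fail_on="mtd2")
    except RuntimeError:
        pass  # the failure still propagates to the caller
    return pod
```

Calling `asyncio.run(demo())` returns a pod whose staging area is empty even though the second partition write raised, which is the behaviour the `try/finally` wrap is meant to guarantee.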
## Summary

Brings `defib restore` to parity with `defib install` (#88 + #93) for rack-controlled cameras. Three pieces:

### Phase 1: fastboot when `power=rack`

The previous host-side frame-blast race (power-off → open serial → start session → power-on) is RouterOS-only. Rack pods don't expose independent `power_off`/`power_on` and don't need to, because the pod's `/fastboot` endpoint does the whole sequence locally with microsecond ACK latency. Drop the hard-coded "restore needs RouterOSController only" reject: `RackController` is now an accepted alternative. Vectis stays rejected.

### Phase 5: `--tftp-via=auto|pod|host` (default auto)

Same flag as `install`. Auto → pod when `power=rack`, host otherwise. The pod path stages every partition via `RackController.tftp_put`, sets `serverip=192.168.1.1` (the pod), and unifies the UBI rootfs file-swap through `_replace_in_tftp(name, data)`.

Two robustness improvements:

- `tftp_clear` BEFORE staging. A prior aborted run leaves PSRAM occupied; if the next run can't allocate, the 4 MB rootfs upload OOMs with a largest free block of only 256 KB. Wipe first.
- `try/finally` around Phase 5 + 6. A mid-loop write failure skipped `__aexit__` and leaked ~7 MB of pod PSRAM until the next install. The `try/finally` (with the cleanup hooks pre-registered on the `AsyncExitStack`) makes cleanup unconditional.

### Live verification on rack pod `10.216.128.69` (hi3516ev300)

Restored from a synthetic dump dir at `/tmp/cam_dump/` (mtd0..3 sized to match the 16 MB NOR layout). The camera reaches `openipc-hi3516ev300 login:` cleanly; `exit=0`.

### Companion rack-firmware change (local-only)

`UART_IDLE_TIMEOUT_S` 60 → 600. The 60-second idle timer was killing the bridge socket mid-staging: ~50 s of HTTP `/tftp` uploads counts as "idle" to the bridge (no host→pod UART traffic during that window). 600 s comfortably covers full installs and restores.

## Test plan

- `uv run pytest tests/ -x -v --ignore=tests/fuzz`: 486 passed / 2 skipped (no new unit tests; `_restore_async` is integration-only)
- `uv run ruff check src/defib/cli/app.py`: clean
- `uv run mypy src/defib/cli/app.py --ignore-missing-imports`: clean
- `defib restore --tftp-via host …` still works on existing RouterOS+host-TFTP setups; the host branch is byte-identical except for being inside the shared `AsyncExitStack`.
- `--tftp-via pod` without `DEFIB_POWER_TYPE=rack` → clean error message.

🤖 Generated with Claude Code
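For readers checking the last two test-plan items, the `--tftp-via` resolution rule described in the summary could look roughly like the sketch below. The function name `resolve_tftp_backend`, its signature, and the error text are assumptions made for illustration; only the flag values and the auto rule (pod when `power=rack`, host otherwise) come from this PR.

```python
def resolve_tftp_backend(tftp_via: str, power_type: str) -> str:
    """Map a --tftp-via value plus the power type to 'pod' or 'host'.

    Hypothetical helper: the real CLI may structure this differently.
    """
    if tftp_via not in ("auto", "pod", "host"):
        raise ValueError(f"unknown --tftp-via value: {tftp_via!r}")
    if tftp_via == "auto":
        # Auto picks the pod only when the camera is rack-controlled.
        return "pod" if power_type == "rack" else "host"
    if tftp_via == "pod" and power_type != "rack":
        # The pod backend needs a rack pod to stage files on.
        raise ValueError("--tftp-via pod requires power=rack")
    return tftp_via
```

Under this sketch, `resolve_tftp_backend("auto", "rack")` selects the pod, an explicit `host` is honoured even with a rack pod, and `pod` without a rack power type fails early with a readable message instead of partway through staging.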