agent+transport: 5.6× flash throughput over rack pod via baud switching #89
Merged
Three coordinated changes that make the flash agent's high-speed-UART
mode (`DEFAULT_FAST_BAUD = 921600`) work over rack-pod WiFi-bridged
links — previously the host-side `_port.baudrate = baud` path was
serial-only, so `defib agent {read,write,scan}` over `tcp://<pod>:9000`
silently fell back to `FALLBACK_BAUD = 115200` and ran at ~10 KB/s.
### 1. Host: `Transport.set_baudrate(baud)` abstraction
New method on `defib.transport.base.Transport`. Default raises
`NotImplementedError`. Overrides:
- **`SerialTransport`** — sets `self._port.baudrate` (was inlined in
`FlashAgentClient.set_baud`).
- **`Rfc2217Transport`** — already had `set_baudrate` from PR #64
(Vectis), now exposed through the ABC.
- **New `RackTransport(SocketTransport)`** with the pod's HTTP base
URL captured at construction; `set_baudrate` POSTs to
`/uart/baud {"rate": baud}`. New `rack://host[:bridge_port][?api=http_port]`
URL scheme in `serial_platform.create_transport` (defaults 9000 / 8080).
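As a rough illustration of the `rack://` scheme and the `/uart/baud` request described above, the following sketch parses the URL with the standard library and builds (but does not send) the POST. The names `RackEndpoint`, `parse_rack_url`, and `build_baud_request` are hypothetical and not taken from defib, and the `Content-Type` header is an assumption; only the defaults (9000 / 8080), the path, and the `{"rate": baud}` body come from the description.

```python
import json
import urllib.request
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit

DEFAULT_BRIDGE_PORT = 9000  # default bridge port stated in the PR
DEFAULT_API_PORT = 8080     # default HTTP API port stated in the PR


@dataclass(frozen=True)
class RackEndpoint:
    """Hypothetical container for the parts of a rack:// URL."""
    host: str
    bridge_port: int
    api_port: int

    @property
    def api_base(self) -> str:
        return f"http://{self.host}:{self.api_port}"


def parse_rack_url(url: str) -> RackEndpoint:
    """Parse rack://host[:bridge_port][?api=http_port]."""
    parts = urlsplit(url)
    if parts.scheme != "rack" or not parts.hostname:
        raise ValueError(f"not a rack:// URL: {url!r}")
    query = parse_qs(parts.query)
    api_port = int(query.get("api", [DEFAULT_API_PORT])[0])
    return RackEndpoint(parts.hostname, parts.port or DEFAULT_BRIDGE_PORT, api_port)


def build_baud_request(endpoint: RackEndpoint, baud: int) -> urllib.request.Request:
    """Build the POST /uart/baud request without sending it."""
    body = json.dumps({"rate": baud}).encode()
    return urllib.request.Request(
        f"{endpoint.api_base}/uart/baud",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```

Keeping request construction separate from sending makes the URL and body easy to assert on without a live pod.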
`FlashAgentClient.set_baud` now awaits `transport.set_baudrate(baud)`
instead of poking `_port.baudrate`. This works across all four transport
flavours and cleanly returns `False` if the transport doesn't support
baud changes (was: raw `AttributeError`).
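The pattern can be sketched as below, assuming the base method is a coroutine (the description says the client awaits it). The classes here are stand-ins written for illustration, not the real `defib.transport` classes, and the sketch leaves out the `CMD_SET_BAUD` exchange with the agent that the real client also has to perform.

```python
import asyncio


class Transport:
    """Stand-in for the base class; only the new method is modelled."""

    async def set_baudrate(self, baud: int) -> None:
        raise NotImplementedError(f"{type(self).__name__} cannot change baud rate")


class FakeSerialTransport(Transport):
    """Hypothetical transport that records the rate instead of touching a port."""

    def __init__(self) -> None:
        self.baudrate = 115200

    async def set_baudrate(self, baud: int) -> None:
        self.baudrate = baud


class PlainSocketTransport(Transport):
    """Hypothetical transport with no baud control; inherits the default."""


async def set_baud(transport: Transport, baud: int) -> bool:
    """Return True if the host side switched, False if unsupported."""
    try:
        await transport.set_baudrate(baud)
    except NotImplementedError:
        return False
    return True
```

With this shape the caller can stay at the fallback rate when `set_baud` returns `False`, rather than crashing on a missing `_port` attribute.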
### 2. Agent: stop auto-reverting on the post-switch verification
`handle_set_baud` used to switch UART, then `proto_recv(timeout=3000)`
for a verification packet from the host, reverting to 115200 if
nothing arrived. The "3000 ms" budget is a CPU-speed-dependent
busy-wait — `for (volatile int d=25; d>0; d--) {}` × `timeout_ms*100`
iterations — and on a fast Cortex-A7 the actual window collapses to
~300 ms.
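To make the collapse concrete, here is a small worked calculation. It assumes the wall-clock window is simply the loop count times the cost of one inner iteration, ignoring whatever else `proto_recv` does per poll; the function names are made up for this example.

```python
INNER_ITERATIONS = 25  # from `for (volatile int d=25; d>0; d--) {}`
OUTER_PER_MS = 100     # from `timeout_ms*100`


def total_iterations(timeout_ms: int) -> int:
    """Inner-loop iterations executed for a nominal timeout."""
    return timeout_ms * OUTER_PER_MS * INNER_ITERATIONS


def ns_per_iteration(timeout_ms: int, observed_ms: float) -> float:
    """Per-iteration cost implied by an observed wall-clock window."""
    return observed_ms * 1_000_000 / total_iterations(timeout_ms)


def window_ms(timeout_ms: int, ns_per_iter: float) -> float:
    """Wall-clock window for a given per-iteration cost."""
    return total_iterations(timeout_ms) * ns_per_iter / 1_000_000


# The nominal 3000 ms only holds if one iteration costs 400 ns; a window
# of ~300 ms implies roughly 40 ns per iteration, i.e. a CPU about 10x
# faster than the loop constants assume.
nominal_cost = ns_per_iteration(3000, 3000)
observed_cost = ns_per_iteration(3000, 300)
```

Under this model the timeout is a fixed budget of 7.5 million iterations, so any CPU that runs the loop faster shortens the window proportionally.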
Over a rack pod the host's `POST /uart/baud` itself takes ~1 s
(WiFi RTT + httpd dispatch), so the agent reverted to 115200 long
before any verification packet could land. Result: agent at 115200,
bridge at 921600, host reading 35 bytes of misclocked `0x80 0x00 …`
garbage forever.
Fix: drop the verification window. The agent stays at whatever baud
the last `CMD_SET_BAUD` selected. If the new rate doesn't work the
agent is unreachable until the next power-cycle / fastboot — both of
which the rack pod and RouterOS trivially provide.
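The failure and the fix can be summarised with a toy timeline model. This is only an illustration of the reasoning above, not agent code: `agent_baud_after_switch` is a hypothetical helper, and the 300 ms and 1000 ms figures are the approximate values quoted in the description.

```python
FALLBACK_BAUD = 115200
DEFAULT_FAST_BAUD = 921600


def agent_baud_after_switch(
    window_ms: float, host_switch_ms: float, revert_on_timeout: bool
) -> int:
    """Baud the agent ends up at after CMD_SET_BAUD in this simple model.

    window_ms: how long the agent really waits for a verification packet.
    host_switch_ms: how long the host side needs to change its own rate.
    revert_on_timeout: True models the old behaviour, False the fix.
    """
    verification_in_time = host_switch_ms < window_ms
    if revert_on_timeout and not verification_in_time:
        return FALLBACK_BAUD
    return DEFAULT_FAST_BAUD


# Old behaviour over the rack pod: the bridge switches to 921600 after
# about 1 s, but the agent already reverted, so the two ends disagree.
old_rack = agent_baud_after_switch(300, 1000, revert_on_timeout=True)
# Old behaviour over local pyserial: the host switch is near-instant.
old_serial = agent_baud_after_switch(300, 1, revert_on_timeout=True)
# New behaviour: the agent stays at the requested rate regardless.
new_rack = agent_baud_after_switch(300, 1000, revert_on_timeout=False)
```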
### 3. Pod: defensive UART hygiene around `/uart/baud`
`uart_bridge_set_baud` (rack repo, local-only): drain the TX FIFO at
the old rate before calling `uart_set_baudrate`, and read back the
actual divisor via `uart_get_baudrate`. Belt + braces: even with the
agent fix, in-flight bytes left over from the old rate would get clocked
out at the new rate and corrupt the agent's RX window. (See companion
rack-firmware commit on the `uart-bridge-flush-rx-on-accept` branch.)
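The ordering matters more than the individual calls, so here is a sketch of it against an in-memory fake. The pod firmware itself is not shown in this PR; `FakeUart` and `set_baud_safely` are invented for illustration and only mirror the drain, switch, read-back sequence described above.

```python
class FakeUart:
    """In-memory stand-in for a UART; not the pod's real driver."""

    def __init__(self, baud: int) -> None:
        self.baud = baud
        self.tx_fifo = bytearray()
        self.wire = []  # (baud, byte) pairs in the order they leave the FIFO

    def write(self, data: bytes) -> None:
        self.tx_fifo += data

    def drain_tx(self) -> None:
        # Everything still queued goes out at the rate currently configured.
        for byte in self.tx_fifo:
            self.wire.append((self.baud, byte))
        self.tx_fifo.clear()

    def set_baudrate(self, baud: int) -> None:
        self.baud = baud

    def get_baudrate(self) -> int:
        return self.baud


def set_baud_safely(uart: FakeUart, new_baud: int) -> int:
    """Drain at the old rate, switch, then report what was actually set."""
    uart.drain_tx()              # 1. flush in-flight bytes at the old rate
    uart.set_baudrate(new_baud)  # 2. only then change the rate
    return uart.get_baudrate()   # 3. read back the effective rate
```

On real hardware the read-back value can differ slightly from the request because of divisor rounding, which is one reason to return it rather than assume the requested value.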
### Live verification
Against the rack-pod prototype at 10.216.128.69 (hi3516ev300, 16 MB
W25Q128). 256 KB sustained flash read through the agent:
| Path | Rate | Speedup |
|---|---|---|
| 115200 (fallback) | 11.1 KB/s | 1.0× |
| 921600 (rack baud switch) | 61.9 KB/s | **5.57×** |
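For reference, the speedup and how far it sits from the raw baud ratio can be recomputed from the rounded figures in the table. The variable names are illustrative, and because the inputs are already rounded the result matches the table only to about two significant digits.

```python
FALLBACK_BAUD = 115200
DEFAULT_FAST_BAUD = 921600

slow_kb_per_s = 11.1  # measured at 115200 (from the table)
fast_kb_per_s = 61.9  # measured at 921600 (from the table)

# Measured speedup from the rounded table values.
measured_speedup = fast_kb_per_s / slow_kb_per_s

# Upper bound if throughput scaled perfectly with the line rate.
baud_ratio = DEFAULT_FAST_BAUD / FALLBACK_BAUD

# Fraction of the ideal scaling that was actually achieved.
scaling_efficiency = measured_speedup / baud_ratio

print(f"speedup ~{measured_speedup:.1f}x of a possible {baud_ratio:.0f}x "
      f"({scaling_efficiency:.0%} of ideal)")
```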
7 new transport tests (`tests/test_transport_rack.py`):
- `set_baudrate` POSTs correct URL + body
- HTTP / URL errors surface as `TransportError`
- `rack://` URL parsing with default + custom + ?api= query
Suite: **468 passed / 2 skipped**; ruff + mypy clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary

Three coordinated changes that make the flash agent's high-speed-UART mode (`DEFAULT_FAST_BAUD = 921600`) actually work over rack-pod WiFi-bridged links. Previously the host-side baud switch path (pyserial's `port.baudrate = baud`) silently failed on `SocketTransport`, so `defib agent {read,write,scan}` over `tcp://<pod>:9000` fell back to `FALLBACK_BAUD = 115200` and ran at ~10 KB/s.

## Live result on the prototype

256 KB sustained flash read at 0x14000000 through the agent over `rack://10.216.128.69`:

| Path | Rate | Speedup |
|---|---|---|
| 115200 (fallback) | 11.1 KB/s | 1.0× |
| 921600 (rack baud switch) | 61.9 KB/s | **5.57×** |

That is ~70 % of the theoretical 8× ceiling; the rest is COBS + windowed-ACK protocol overhead, which is the same as on the on-serial path.
## What changed

### 1. `Transport.set_baudrate(baud)` abstraction

New method on `defib.transport.base.Transport`. Default raises `NotImplementedError`. Overrides:

- `SerialTransport`: sets `self._port.baudrate` (was inlined in `FlashAgentClient.set_baud`).
- `Rfc2217Transport`: already had `set_baudrate` from PR #64, "Add OpenIPC Vectis support (RFC 2217 transport)"; just exposed through the ABC.
- `RackTransport(SocketTransport)`: captures the pod's HTTP base URL at construction; `set_baudrate` POSTs `{"rate": baud}` to `/uart/baud`. New `rack://host[:bridge_port][?api=http_port]` URL scheme in `serial_platform.create_transport` (defaults 9000 / 8080).

`FlashAgentClient.set_baud` now awaits `transport.set_baudrate(baud)`. This works across all four transport flavours and cleanly returns `False` when the transport refuses (was: raw `AttributeError`).

### 2. Agent: stop auto-reverting on the post-switch verification window

`handle_set_baud` used to switch UART, then `proto_recv(timeout=3000)` for a verification packet from the host, reverting to 115200 if nothing arrived. The "3000 ms" budget is a CPU-speed-dependent busy-wait (`for (volatile int d=25; d>0; d--) {}` × `timeout_ms*100` iterations), and on a fast Cortex-A7 the actual window collapses to ~300 ms.

Over a rack pod the host's `POST /uart/baud` itself takes ~1 s (WiFi RTT + httpd dispatch), so the agent reverted to 115200 long before any verification packet could land. Result: agent at 115200, bridge at 921600, host reading 35 bytes of misclocked `0x80 0x00 …` garbage forever.

Fix: drop the verification window. The agent stays at whatever baud the last `CMD_SET_BAUD` selected. If the new rate doesn't work the agent is unreachable until the next power-cycle / fastboot, both of which the rack pod and RouterOS trivially provide.

(This also matches the local-UART experience: defib has been using the same `set_baud` against MikroTik+pyserial-attached cameras successfully because pyserial's `port.baudrate=` is microsecond-fast, easily landing within the agent's collapsed ~300 ms window. The bug only surfaces when the host-side switch is on the wrong side of a high-RTT control plane.)

### 3. Pod firmware (rack repo, local-only; `uart-bridge-flush-rx-on-accept` branch)

Defensive UART hygiene around `/uart/baud`: drain the TX FIFO at the old rate before `uart_set_baudrate`, and read back the actual divisor via `uart_get_baudrate`. Belt + braces: even with the agent fix, in-flight bytes left over from the old rate would get clocked out at the new rate and corrupt the agent's RX window.

## Tests

7 new tests in `tests/test_transport_rack.py`:

- `set_baudrate` POSTs correct URL + body
- HTTP / URL errors surface as `TransportError`
- `rack://` URL parsing with default + custom + `?api=` query

Suite: 468 passed / 2 skipped; ruff + mypy clean.
## Test plan

- `uv run pytest tests/ -x -v --ignore=tests/fuzz`
- `uv run ruff check src/defib/ tests/`
- `uv run mypy src/defib/ --ignore-missing-imports`
- RackTransport. Confirm reading 256 KB at 921600 still works on a USB-serial-attached camera.

🤖 Generated with Claude Code