Releases: pulseengine/synth
v0.3.0 — AAPCS audit + optimizations + fuzz + safety policy
Major release closing out the AAPCS-clobber bug class and adding optimization, fuzz, and security-policy infrastructure.
Highlights
Bug-class closure: hardcoded R0..R3 in instruction emission
A recurring class of bugs across v0.1.x — instruction emission paths hardcoding ARM AAPCS param registers and clobbering live params. Systematically swept in this release:
- #106 — 24 i64 ops (Eq/Ne/Lt/Le/Gt/Ge × Signed/Unsigned, Mul/Div/Rem, Rotl/Rotr, Clz/Ctz/Popcnt, Extend8/16/32 Signed, I32WrapI64) re-routed through alloc_consecutive_pair
- #107 — CSE missing arms for MemLoad/MemStore/Extend consumers
- #108 — systematic audit: 47 hardcoded R0..R3 sites fixed, 6 new Opcode variants, 27 regression tests
- #109 — WasmOp::Call had no wasm_to_ir handler (caused fib compilation to silently miscompile)
- #101 — get_arm_reg silent R0 fallback replaced with diagnostic panic — future wasm_to_ir gaps now crash the compiler instead of producing miscompiled firmware
- #111 — I32WrapI64 no longer preassigns R0; defers to function-return epilogue
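The difference between a silent fallback and a diagnostic panic (#101) can be sketched as follows. This is a minimal illustration, not synth's actual code: `VReg`, `ArmReg`, `lookup_with_fallback`, and `lookup_or_panic` are hypothetical names, and the real `get_arm_reg` signature is not shown in these notes.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a compiler virtual register.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct VReg(pub u32);

/// Hypothetical stand-in for a physical ARM core register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArmReg {
    R0,
    R1,
    R2,
    R3,
    R4,
    R5,
}

/// Illustrative "before": an unmapped vreg silently becomes R0,
/// so later instructions read or clobber whatever lives in R0.
pub fn lookup_with_fallback(map: &HashMap<VReg, ArmReg>, v: VReg) -> ArmReg {
    map.get(&v).copied().unwrap_or(ArmReg::R0)
}

/// Illustrative "after": an unmapped vreg aborts compilation with a
/// message that points at the missing lowering.
pub fn lookup_or_panic(map: &HashMap<VReg, ArmReg>, v: VReg) -> ArmReg {
    match map.get(&v) {
        Some(reg) => *reg,
        None => panic!("no ARM register mapped for {:?}; missing wasm_to_ir handler?", v),
    }
}
```

With the fallback variant, a missing mapping is indistinguishable from a legitimate R0 assignment; with the panicking variant, the same gap stops the build before any firmware is produced.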
Optimization improvements
- #96 — (i32.const C)(i32.load offset=O) folds to a single 4-byte LDR rd, [base, #(C+O)] when C+O ≤ 4095. Drops from ≥10 bytes.
- #98 — (i64.shr_u 32; i32.wrap_i64) and friends lower to a direct hi/lo register rename. 83% size reduction on the canonical u64-packed FFI extraction pattern.
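The two rewrites above rest on simple checks, sketched here under stated assumptions. The function and type names (`fold_const_load_offset`, `RegPair`, `high_word_reg`, `high_word_value`) are hypothetical and do not come from synth; the 4095 bound is the 12-bit unsigned immediate of the 32-bit Thumb-2 LDR encoding.

```rust
/// Largest unsigned offset in the 32-bit Thumb-2 `LDR Rt, [Rn, #imm12]` form.
pub const LDR_IMM12_MAX: u32 = 4095;

/// Returns the folded offset when a constant address `c` plus the load's
/// static offset `o` fits the imm12 field; `None` means "do not fold".
pub fn fold_const_load_offset(c: u32, o: u32) -> Option<u32> {
    // checked_add rejects sums that would wrap, which must not be folded.
    let sum = c.checked_add(o)?;
    if sum <= LDR_IMM12_MAX {
        Some(sum)
    } else {
        None
    }
}

/// Hypothetical model of an i64 value held in two 32-bit registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegPair {
    pub lo: u8,
    pub hi: u8,
}

/// `(i64.shr_u 32; i32.wrap_i64)` needs no shift instruction:
/// the result is simply whatever register already holds the high word.
pub fn high_word_reg(pair: RegPair) -> u8 {
    pair.hi
}

/// Reference semantics of the same pattern on a plain u64.
pub fn high_word_value(x: u64) -> u32 {
    (x >> 32) as u32
}
```

For example, `fold_const_load_offset(4000, 95)` yields `Some(4095)` and folds, while `fold_const_load_offset(4000, 96)` yields `None` and keeps the longer sequence.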
Test + verification infrastructure
- #99 — 59 new semantic-correctness tests covering every i64 wasm op (closes the gap that allowed #93 to ship)
- #92 — 37 tests of coverage uplift for the v0.1.1 diff
- #100 — 4 cargo-fuzz harnesses + CI smoke gate. 2 gating, 2 exploration. The harnesses found the bugs that became #103/#104/#108/#109/#111/#112
- #105 — Spectre/csdb policy doc, aarch64 CVE audit (CVE-2026-34971 / CVE-2026-34944), arXiv 2604.17391 citation
- #110 — 842-line binary-safety design covering MPU/PMP/CFI/PAC/BTI/MSPLIM/etc with per-target applicability matrix
CI infrastructure
- #102 — Test + Clippy back to ubuntu-latest (unblocked the z3-sys-disk-full failures on smithy runners)
- Fuzz workflow split into gating + exploration matrices
Known follow-ups (v0.3.x patch track)
- #112 — i64-extend chain + Movw R0 clobber (fuzz-found post-#111)
- RISC-V cross-function calls + relocations (WIP branch)
- Promote exploration fuzz harnesses to gating after 2 weeks of clean smoke runs
Releases this cycle
- v0.1.1 — AAPCS regalloc fixes + Cortex-M7 hardening
- v0.2.0 — RISC-V RV32IMAC GA
- v0.2.1 — silicon-blocking memset codegen fix
- v0.3.0 — this release
🤖 Generated with Claude Code
v0.2.1 — silicon-blocking i64 codegen fix
Silicon-blocking patch
Fixes #93: synth-compiled memset (and other compiler_builtins functions using i64 extend/wrap ops in their loop bodies) no longer hangs on real Cortex-M silicon.
Root cause
optimizer_bridge::wasm_to_ir had no handler for I64ExtendI32U / I64ExtendI32S / I32WrapI64. Their result vregs were never mapped to ARM registers, and the downstream get_arm_reg's silent R0 fallback caused subsequent i64 shifts to read R0 as their rm_lo/rm_hi — destroying memset's destination pointer on every iteration.
Symptoms
- Real STM32G474RE silicon hung at memset+0x4c during Zephyr's z_bss_zero.
- Synth-emitted memset was 454 bytes vs picolibc's ~80 bytes and looped forever.
- PC bounced between two addresses indefinitely, observed via OpenOCD halt+resume sampling.
Fix
Adds the three missing op handlers in wasm_to_ir + ir_to_arm + analyze_i64_local_gets. 5 regression tests added (3 of which fail-before-fix and pass-after).
This is the first concrete silicon-blocking instance of the "i64 lowering has multiple holes" pattern. Follow-up issues #103 (AAPCS clobber in I64SetCond, found by cargo-fuzz harness) and the latent vreg-mapping gap surfaced by the defensive-panic work track the broader class.
v0.2.0 — RISC-V GA
First release with RV32IMAC as a first-class compilation target alongside ARM Cortex-M. Synth now emits real EM_RISCV ELF binaries and ships a complete bare-metal runtime (startup, linker script, PMP) for booting them.
RISC-V backend (new)
- New synth-backend-riscv crate with RV32IMAC encoder, ELF builder (EM_RISCV=0xF3), PMP allocator, instruction selector, bare-metal startup generator, and linker script generator.
- CLI: --backend riscv, --target riscv32imac/rv32imac/rv32i/rv32gc/rv64imac/rv64gc.
- New synth riscv-runtime command emits startup.c + linker.ld for cross-compilation.
- Selector covers the i32 surface (arithmetic, logic, shifts, comparisons, division with trap-on-zero), i32.load/store + sub-word load8/16 + store8/16, and control flow (block, loop, if/else, br, br_if). Locals for params 0..7.
- Encoder cross-validated against canonical RV32 hex encodings.
- Renode RV32IMAC platform (tests/renode/synth_riscv.repl) wired into Bazel CI.
- 98 backend-riscv tests, all passing.
- Offline integration smoke + end-to-end calculator demo.
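Cross-validating an encoder against canonical hex encodings might look like the sketch below. It covers only the RV32I I-type format and is not taken from the synth-backend-riscv crate; `encode_i_type` and `addi` are hypothetical names. The bit layout follows the RISC-V base ISA: imm[11:0] in bits 31..20, rs1 in 19..15, funct3 in 14..12, rd in 11..7, opcode in 6..0.

```rust
/// Encode an RV32I I-type instruction from its fields.
pub fn encode_i_type(opcode: u32, rd: u32, funct3: u32, rs1: u32, imm: i32) -> u32 {
    debug_assert!(opcode < 128 && rd < 32 && funct3 < 8 && rs1 < 32);
    debug_assert!((-2048..=2047).contains(&imm));
    // Keep the low 12 bits of the two's-complement immediate.
    let imm12 = (imm as u32) & 0xFFF;
    (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
}

/// ADDI rd, rs1, imm (OP-IMM opcode 0x13, funct3 0).
pub fn addi(rd: u32, rs1: u32, imm: i32) -> u32 {
    encode_i_type(0x13, rd, 0b000, rs1, imm)
}
```

Known encodings make convenient fixed points for such a check: `addi x0, x0, 0` (the canonical NOP) is 0x00000013, `addi a0, a0, 1` is 0x00150513, and `addi sp, sp, -16` is 0xFF010113.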
What's still out of scope
- i64 lowering for RV32 (register-pair arithmetic — a future minor release).
- RV32F/D float instructions.
- br_table jump-table emission.
- Cross-function calls + relocations for multi-function module linking.
- RISC-V Rocq proofs (deferred until the validator-pattern work lands; see issue #76).
Compatibility
Purely additive. ARM Cortex-M codegen is unchanged. The RISC-V backend is gated behind --backend riscv. v0.1.x WAT compilation flows continue to work without modification.
v0.1.1 — AAPCS regalloc fixes + Cortex-M7 hardening
Bug fixes (real-hardware-found)
- fix(opt): regalloc clobbers parameter registers in i64 ops — optimizer_bridge::ir_to_arm no longer hardcodes i64 ops to R0:R1 / R2:R3. New alloc_i64_pair picks free callee-saved pairs (R4..R11) skipping live param registers. Fixes silent corruption of i32 params when an i64 op runs before all params are read.
- fix(no-optimize): allocate stack frame + i64 local storage in select_with_stack — non-param locals now get a real stack frame, and i64 locals use 8-byte STR/LDR so the upper half doesn't get dropped. Fixes corruption of the callee-saved spill area.
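The pair-allocation idea in the first fix can be sketched as a scan over a liveness bitmask. This is an assumption-laden illustration, not the real alloc_i64_pair: the `RegMask` type, the `alloc_pair` name, and the choice to accept any adjacent pair (rather than only even-aligned ones) are all hypothetical.

```rust
/// Bitmask of ARM core registers in use; bit n set means Rn is live.
pub type RegMask = u16;

const FIRST_CALLEE_SAVED: u8 = 4; // R4
const LAST_CALLEE_SAVED: u8 = 11; // R11

/// Find a free adjacent pair (Rn, Rn+1) inside R4..R11, mark both
/// registers live, and return (lo, hi). Returns None when no pair is free.
pub fn alloc_pair(live: &mut RegMask) -> Option<(u8, u8)> {
    // `lo` stops at R10 so that `lo + 1` never exceeds R11.
    for lo in FIRST_CALLEE_SAVED..LAST_CALLEE_SAVED {
        let pair_bits: RegMask = 0b11 << lo;
        if (*live & pair_bits) == 0 {
            *live |= pair_bits;
            return Some((lo, lo + 1));
        }
    }
    None
}
```

Because the scan starts at R4, the parameter registers R0..R3 are never handed out for an i64 temporary, whether or not their bits are set in the mask. A caller that gets `None` would have to spill; that path is outside this sketch.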
Features
- feat(cli): --relocatable flag — forces ET_REL output even when the wasm has no imports, for linking into a host build system (e.g. Zephyr).
- feat(m7): Cortex-M7 hardware profiles
  - HardwareCapabilities::imxrt1062() — single-precision FPU, 16 MPU regions, 8 MB QSPI flash, 1 MB OCRAM
  - HardwareCapabilities::stm32h743() — double-precision FPU, 16 MPU regions, 2 MB Flash, 1 MB RAM
  - CLI --hardware {imxrt1062,stm32h743} and target-info wired up
  - Renode synth_cortex_m7.repl profile + cortex_m7_test.robot
  - MPU allocator tests proving 16-region operation on M7-class parts
Toolchain hygiene
- 8 clippy errors fixed (Rust 1.95 lint refresh): unnecessary_sort_by, collapsible_match, collapsible_if, manual_checked_division.