Releases: pulseengine/synth

v0.3.0 — AAPCS audit + optimizations + fuzz + safety policy

15 May 05:22
dfa1e17

Major release closing out the AAPCS-clobber bug class and adding optimization, fuzz, and security-policy infrastructure.

Highlights

Bug-class closure: hardcoded R0..R3 in instruction emission

A recurring class of bugs across v0.1.x: instruction emission paths hardcoded the ARM AAPCS parameter registers (R0..R3) and clobbered live parameters. This release sweeps them systematically:

  • #106 — 24 i64 ops (Eq/Ne/Lt/Le/Gt/Ge ×Signed/Unsigned, Mul/Div/Rem, Rotl/Rotr, Clz/Ctz/Popcnt, Extend8/16/32 Signed, I32WrapI64) re-routed through alloc_consecutive_pair
  • #107 — CSE missing arms for MemLoad / MemStore / Extend consumers
  • #108 — systematic audit: 47 hardcoded R0..R3 sites fixed, 6 new Opcode variants, 27 regression tests
  • #109 — WasmOp::Call had no wasm_to_ir handler (caused fib to be silently miscompiled)
  • #101 — get_arm_reg silent R0 fallback replaced with diagnostic panic; future wasm_to_ir gaps now crash the compiler instead of producing miscompiled firmware
  • #111 — I32WrapI64 no longer preassigns R0; defers to function-return epilogue
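
The difference between the old silent fallback and the new diagnostic panic (#101) can be sketched as follows. This is a minimal illustration, not the project's code: the `ArmReg` enum, the `HashMap`-based vreg map, and both function names are hypothetical stand-ins for whatever `get_arm_reg` uses internally.

```rust
use std::collections::HashMap;

/// Hypothetical register enum; the real backend has its own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArmReg {
    R0,
    R1,
    R2,
    R3,
    R4,
    R5,
}

/// Old behaviour (sketch): an unmapped vreg silently resolves to R0,
/// so later instructions read or overwrite a live parameter register.
pub fn lookup_with_fallback(map: &HashMap<u32, ArmReg>, vreg: u32) -> ArmReg {
    map.get(&vreg).copied().unwrap_or(ArmReg::R0)
}

/// New behaviour (sketch): an unmapped vreg stops compilation with a
/// diagnostic instead of producing a miscompiled binary.
pub fn lookup_or_panic(map: &HashMap<u32, ArmReg>, vreg: u32) -> ArmReg {
    match map.get(&vreg) {
        Some(reg) => *reg,
        None => panic!(
            "no ARM register mapped for vreg v{}; a wasm_to_ir handler may be missing",
            vreg
        ),
    }
}
```

With the fallback variant a missing handler surfaces only as wrong behaviour on the target; with the panicking variant it surfaces at compile time, at the first lookup of the unmapped vreg.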

Optimization improvements

  • #96 — (i32.const C)(i32.load offset=O) folds to a single 4-byte LDR rd, [base, #(C+O)] when C+O ≤ 4095. Drops from ≥10 bytes.
  • #98 — (i64.shr_u 32; i32.wrap_i64) and friends lower to a direct hi/lo register rename. 83% size reduction on the canonical u64-packed FFI extraction pattern.
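
The two rewrites above rest on simple arithmetic facts, sketched here under stated assumptions. The function names and the `LDR_IMM12_MAX` constant are hypothetical and only model the conditions described in #96 and #98; they are not taken from the synth source.

```rust
/// Largest offset accepted by the fold, matching the C+O <= 4095 condition
/// (the 12-bit unsigned immediate of the 4-byte LDR form).
const LDR_IMM12_MAX: u32 = 4095;

/// Sketch of the #96 condition: returns the combined immediate when the
/// constant address `c` plus the static `offset` fits, otherwise `None`
/// (the caller would keep the unfolded sequence).
pub fn fold_const_load_offset(c: u32, offset: u32) -> Option<u32> {
    let sum = c.checked_add(offset)?; // reject sums that overflow 32 bits
    if sum <= LDR_IMM12_MAX {
        Some(sum)
    } else {
        None
    }
}

/// Sketch of why #98 is a pure rename: for an i64 held as a (lo, hi)
/// register pair, `i64.shr_u 32` followed by `i32.wrap_i64` yields `hi`.
pub fn shr32_then_wrap(lo: u32, hi: u32) -> u32 {
    let value = ((hi as u64) << 32) | (lo as u64);
    (value >> 32) as u32
}
```

Since `shr32_then_wrap(lo, hi)` always equals `hi`, no shift instructions need to be emitted at all; the consumer can read the high register directly.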

Test + verification infrastructure

  • #99 — 59 new semantic-correctness tests covering every i64 wasm op (closes the gap that allowed #93 to ship)
  • #92 — 37 new tests raising coverage of the v0.1.1 diff
  • #100 — 4 cargo-fuzz harnesses + CI smoke gate. 2 gating, 2 exploration. The harnesses found the bugs that became #103/#104/#108/#109/#111/#112
  • #105 — Spectre/csdb policy doc, aarch64 CVE audit (CVE-2026-34971 / CVE-2026-34944), arXiv 2604.17391 citation
  • #110 — 842-line binary-safety design covering MPU/PMP/CFI/PAC/BTI/MSPLIM/etc with per-target applicability matrix

CI infrastructure

  • #102 — Test + Clippy back to ubuntu-latest (unblocked the z3-sys-disk-full failures on smithy runners)
  • Fuzz workflow split into gating + exploration matrices

Known follow-ups (v0.3.x patch track)

  • #112 — i64-extend chain + Movw R0 clobber (fuzz-found post-#111)
  • RISC-V cross-function calls + relocations (WIP branch)
  • Promote exploration fuzz harnesses to gating after 2 weeks of clean smoke runs

Releases this cycle

  • v0.1.1 — AAPCS regalloc fixes + Cortex-M7 hardening
  • v0.2.0 — RISC-V RV32IMAC GA
  • v0.2.1 — silicon-blocking i64 codegen fix
  • v0.3.0 — this release

🤖 Generated with Claude Code

v0.2.1 — silicon-blocking i64 codegen fix

11 May 17:09
a43a4e1

Silicon-blocking patch

Fixes #93: synth-compiled memset (and other compiler_builtins functions using i64 extend/wrap ops in their loop bodies) no longer hangs on real Cortex-M silicon.

Root cause

optimizer_bridge::wasm_to_ir had no handler for I64ExtendI32U / I64ExtendI32S / I32WrapI64. Their result vregs were never mapped to ARM registers, and the downstream get_arm_reg's silent R0 fallback caused subsequent i64 shifts to read R0 as their rm_lo/rm_hi — destroying memset's destination pointer on every iteration.

Symptoms

  • Real STM32G474RE silicon hung at memset+0x4c during Zephyr's z_bss_zero.
  • Synth-emitted memset was 454 bytes vs picolibc's ~80 bytes and looped forever.
  • PC bounced between two addresses indefinitely, observed via OpenOCD halt+resume sampling.

Fix

Adds the three missing op handlers in wasm_to_ir + ir_to_arm + analyze_i64_local_gets. 5 regression tests added, 3 of which fail before the fix and pass after it.
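
For reference, the semantics the three handlers must preserve on a 32-bit target can be sketched on a (lo, hi) register pair. This is an illustrative model only; the `Pair` alias and function names are hypothetical, and the real handlers emit ARM instructions instead of computing values.

```rust
/// An i64 modelled as a (lo, hi) pair of 32-bit registers.
pub type Pair = (u32, u32);

/// I64ExtendI32U: the high word is zero.
pub fn i64_extend_i32_u(x: u32) -> Pair {
    (x, 0)
}

/// I64ExtendI32S: the high word replicates the sign bit of the low word.
pub fn i64_extend_i32_s(x: u32) -> Pair {
    (x, ((x as i32) >> 31) as u32)
}

/// I32WrapI64: keep the low word, discard the high word.
pub fn i32_wrap_i64(p: Pair) -> u32 {
    p.0
}

/// Helper for checking the model against native 64-bit arithmetic.
pub fn to_u64(p: Pair) -> u64 {
    ((p.1 as u64) << 32) | (p.0 as u64)
}
```

A regression test in this style compares each pair-level result against Rust's native casts, for example `to_u64(i64_extend_i32_s(x)) == (x as i32 as i64) as u64`.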

This is the first concrete silicon-blocking instance of the "i64 lowering has multiple holes" pattern. The broader class is tracked by follow-up issue #103 (AAPCS clobber in I64SetCond, found by the cargo-fuzz harness) and by the latent vreg-mapping gap surfaced by the defensive-panic work.



v0.2.0 — RISC-V GA

10 May 15:23
5abf568

First release with RV32IMAC as a first-class compilation target alongside ARM Cortex-M. Synth now emits real EM_RISCV ELF binaries and ships a complete bare-metal runtime (startup, linker script, PMP) for booting them.

RISC-V backend (new)

  • New synth-backend-riscv crate with RV32IMAC encoder, ELF builder (EM_RISCV=0xF3), PMP allocator, instruction selector, bare-metal startup generator, and linker script generator.
  • CLI: --backend riscv, --target riscv32imac/rv32imac/rv32i/rv32gc/rv64imac/rv64gc.
  • New synth riscv-runtime command emits startup.c + linker.ld for cross-compilation.
  • Selector covers the i32 surface (arithmetic, logic, shifts, comparisons, division with trap-on-zero), i32.load/store + sub-word load8/16 + store8/16, and control flow (block, loop, if/else, br, br_if). Locals for params 0..7.
  • Encoder cross-validated against canonical RV32 hex encodings.
  • Renode RV32IMAC platform (tests/renode/synth_riscv.repl) wired into Bazel CI.
  • 98 backend-riscv tests, all passing.
  • Offline integration smoke + end-to-end calculator demo.
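
Cross-validating an encoder against canonical hex encodings can be sketched as below. The helper names are hypothetical and unrelated to the synth-backend-riscv API; the bit layouts follow the standard RV32I R-type and I-type formats.

```rust
/// R-type layout: funct7 | rs2 | rs1 | funct3 | rd | opcode.
pub fn encode_r(funct7: u32, rs2: u32, rs1: u32, funct3: u32, rd: u32, opcode: u32) -> u32 {
    (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
}

/// I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode.
/// The immediate is truncated to its low 12 bits (two's complement).
pub fn encode_i(imm: i32, rs1: u32, funct3: u32, rd: u32, opcode: u32) -> u32 {
    (((imm as u32) & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
}

/// add rd, rs1, rs2 (opcode 0x33, funct3 0, funct7 0).
pub fn add(rd: u32, rs1: u32, rs2: u32) -> u32 {
    encode_r(0, rs2, rs1, 0, rd, 0x33)
}

/// addi rd, rs1, imm (opcode 0x13, funct3 0).
pub fn addi(rd: u32, rs1: u32, imm: i32) -> u32 {
    encode_i(imm, rs1, 0, rd, 0x13)
}
```

Each encoder output is then compared against a known-good word, for example `addi sp, sp, -16` (x2) against `0xFF010113` and `add a0, a0, a1` (x10, x11) against `0x00B50533`.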

What's still out of scope

  • i64 lowering for RV32 (register-pair arithmetic — a future minor release).
  • RV32F/D float instructions.
  • br_table jump-table emission.
  • Cross-function calls + relocations for multi-function module linking.
  • RISC-V Rocq proofs (deferred until the validator-pattern work lands; see issue #76).

Compatibility

Purely additive. ARM Cortex-M codegen is unchanged. The RISC-V backend is gated behind --backend riscv. v0.1.x WAT compilation flows continue to work without modification.



v0.1.1 — AAPCS regalloc fixes + Cortex-M7 hardening

10 May 04:56
5c7ef0c

Bug fixes (real-hardware-found)

  • fix(opt): regalloc clobbers parameter registers in i64 ops — optimizer_bridge::ir_to_arm no longer hardcodes i64 ops to R0:R1 / R2:R3. New alloc_i64_pair picks free callee-saved pairs (R4..R11) skipping live param registers. Fixes silent corruption of i32 params when an i64 op runs before all params are read.
  • fix(no-optimize): allocate stack frame + i64 local storage in select_with_stack — non-param locals now get a real stack frame, and i64 locals use 8-byte STR/LDR so the upper half doesn't get dropped. Fixes corruption of the callee-saved spill area.
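
The pair-allocation idea from the first fix can be sketched as follows. This is a simplified model under assumptions, not the actual alloc_i64_pair: register usage is tracked as a hypothetical 16-bit mask (bit n set means Rn is in use), and any two adjacent free registers in R4..R11 count as a pair.

```rust
/// Returns the first (lo, hi) pair of adjacent free registers in R4..R11,
/// or `None` when no such pair exists (the caller would then spill).
/// Because only R4..R11 are considered, live parameters in R0..R3 are
/// never handed out, whatever their bits in `in_use` say.
pub fn alloc_pair_sketch(in_use: u16) -> Option<(u8, u8)> {
    (4u8..=10)
        .find(|&lo| (in_use & (0b11u16 << lo)) == 0) // both Rlo and Rlo+1 free
        .map(|lo| (lo, lo + 1))
}
```

For example, with nothing in use the sketch returns R4:R5; with R5 occupied it skips ahead to R6:R7, since neither R4:R5 nor R5:R6 is fully free.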

Features

  • feat(cli): --relocatable flag — forces ET_REL output even when the wasm has no imports, for linking into a host build system (e.g. Zephyr).
  • feat(m7): Cortex-M7 hardware profiles
    • HardwareCapabilities::imxrt1062() — single-precision FPU, 16 MPU regions, 8 MB QSPI flash, 1 MB OCRAM
    • HardwareCapabilities::stm32h743() — double-precision FPU, 16 MPU regions, 2 MB Flash, 1 MB RAM
    • CLI --hardware {imxrt1062,stm32h743} and target-info wired up
    • Renode synth_cortex_m7.repl profile + cortex_m7_test.robot
    • MPU allocator tests proving 16-region operation on M7-class parts

Toolchain hygiene

  • 8 clippy errors fixed (Rust 1.95 lint refresh): unnecessary_sort_by, collapsible_match, collapsible_if, manual_checked_division.
