Skip to content

refactor(cosim): CosimBackend seam + de-Metal run_cosim + module split + CpuBackend (#105 Phase 0 + Phase 1 steps 1-5)#118

Open
robtaylor wants to merge 31 commits into
mainfrom
cosim-backend-seam-phase0
Open

refactor(cosim): CosimBackend seam + de-Metal run_cosim + module split + CpuBackend (#105 Phase 0 + Phase 1 steps 1-5)#118
robtaylor wants to merge 31 commits into
mainfrom
cosim-backend-seam-phase0

Conversation

@robtaylor

@robtaylor robtaylor commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

What

Cosim backend portability (#105), Phase 0 + Phase 1 (in progress) — extract
the backend seam and de-Metal run_cosim so cosim can run on a CPU reference
backend (and later CUDA/HIP), not just Metal. Implements the target architecture
from ADR 0017 Amendment 2026-06-07 (Layer 1/2/3 + the peripheral contract)
and the staging in docs/plans/cosim-backend-portability.md /
docs/plans/cosim-phase1-cpu-backend.md.

Per the maintainer's call this is one PR, not a stack: it now carries Phase 0
and the completed Phase 1 steps below. Every commit is a Metal-only refactor
with zero behaviour change
, gated by a byte-identical fixture harness.

Phase 0 — the seam

  • edge_ops_mut accessor — factor the repeated unsafe from_raw_parts over
    a schedule edge's shared MTLBuffer behind ScheduleBuffers::edge_ops_mut/
    edge_ops.
  • CosimBackend trait + MetalBackendMetalBackend owns the
    MetalSimulator, the [2×state_size] design-state buffer, the per-edge
    schedule, and all GPU IO buffers. run_edges/profile_kernels/wait forward
    to the unchanged encode_and_commit_gpu_batch/profile_gpu_kernels/spin_wait.

Phase 1 — de-Metal run_cosim, module split, CpuBackend (steps 1–5 done)

  • Step 1 — de-Metal the run_edges seam (drop metal::Buffer); backend-owned
    VCD ring (enable_vcd_ring/vcd_snapshot); flash_set_in_reset.

  • Step 2 (2a/2b/2c) — relocate buffer setup+init into builders:
    build_flash_buffers, build_io_buffers (+ CPU bus_lanes), build_state_buffers
    (states/sram/xmask/blocks/event, incl. xprop seeding + SRAM preload).

  • Step 3 (3a/3b-i/3b-ii) — make the orchestration backend-neutral:

    • 3aCosimBackend::{state,state_mut,sram}; route the design-state/SRAM
      loop-body reads.
    • 3b-idecoded-records seam (ADR 0017 Layer 3): flash_d_i,
      flash_debug_snapshot, drain_uart_tx, drain_bus_beats,
      drain_wb_trace_debug, uart_decoder_debug. Flash const-params become
      agnostic locals; peripheral ring cursors move into MetalBackend. The
      run_cosim body now has zero backend.<field>.contents() reads.
    • 3b-iiMetalBackend::new fat constructor (-> (Self, Vec<BusTraceLane>));
      stimulus deposits via state_mut(); event_buffer freed by a Drop impl;
      CosimBackend::new/profile_kernels on the trait; run_cosim flips to
      run_cosim_generic<B: CosimBackend>
      with a thin private-MetalBackend shim.
  • Step 4 — physical module split cosim_metal.rscosim/{mod,metal}.rs: mod.rs (non-gated) holds the trait, run_cosim_generic<B>, scheduler + agnostic glue; metal.rs (#[cfg(feature="metal")]) holds MetalBackend, the GPU structs, build_*/encode_*, the run_cosim shim. cargo check --lib --no-default-features now compiles the agnostic half.
    The CosimBackend trait surface is now complete and CPU-ready — debug/profile
    methods have no-op defaults, so a CpuBackend need only implement the functional
    core.

  • Step 5 (5a/5b/5c)CpuBackend (in non-gated cosim/mod.rs): a CPU reference backend that runs jacquard cosim with no GPU feature. Design stepper via cpu_reference::simulate_block_v1[_xprop] (xprop X-mask state_prep ported from kernel_v1.metal); CPU ports of the UART-TX decoder and APB3 bus-trace beat-extraction FSMs (also from the shader); cmd_cosim's no-GPU path wired. All 7 cosim regression fixtures are byte-identical between CpuBackend (no-GPU build) and the Metal golden — cross-backend equivalence proven. This unlocks cosim regression on free Linux CI.

Scope / next (this PR or follow-up)

  • Step 7 — Linux cosim CI on ubuntu-latest (free runner): run the cosim
    fixtures via the no-GPU CpuBackend build. The only remaining Phase-1 step.

Verification (every commit)

  • All 7 Metal cosim fixtures byte-identical to a pre-refactor golden
    (dual_uart, apb_trace ±xprop, xprop_cosim, 2state, reg-init variants).
  • jtag_minimal 4M-edge replay PASS (data0_obs=0xCAFEBABE) — exercises the
    model-driven-clock per-edge path.
  • cargo test --lib --features metal298 pass.

Refs #105.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)

robtaylor added 21 commits June 7, 2026 15:20
…e_ops_mut

Phase 0 step 1 of the backend-portability seam (#105, ADR 0017 Amendment
2026-06-07). The three per-edge ops-patching closures (reset, model-driven
inputs, model clock edges) and the --check-with-cpu replay each open-coded
the same unsafe `from_raw_parts[_mut](buf.contents() as *mut/*const BitOp,
len)` slice over a schedule edge's shared MTLBuffer. Collapse those into
`ScheduleBuffers::edge_ops_mut`/`edge_ops` accessors.

The accessor name and signature deliberately match the `edge_ops_mut`
method on the forthcoming `CosimBackend` trait, so call sites need no
renaming when `ScheduleBuffers` is later subsumed into `MetalBackend`. On
Metal the slice is zero-copy over shared memory (the write is the upload);
a CUDA/HIP backend will back the same accessor with a host mirror +
dirty-flag + lazy upload.

Behaviour-preserving: all 7 Metal cosim fixtures byte-identical to the
pre-refactor golden; `cargo test --lib --features metal` 298 pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
Phase 0 of the cosim backend-portability seam (#105, ADR 0017 Amendment
2026-06-07). Factors `run_cosim`'s design execution + state ownership
behind a batch-granular `CosimBackend` trait, implemented by a new
`MetalBackend` that owns the `MetalSimulator`, the `[2 × state_size]`
design-state buffer, the per-edge schedule storage, and every GPU IO
buffer (flash / UART / bus-trace / SRAM / blocks / event).

Key points:
- The backend owns its schedule storage, materialised once via
  `init_schedule` from a backend-agnostic `Vec<Vec<BitOp>>` description;
  orchestration keeps only scalars (`edges_per_period`, `gcd_ps`). No
  parallel copy the backend re-materialises.
- `run_edges` / `profile_kernels` / `wait` forward to the UNCHANGED
  `MetalSimulator::encode_and_commit_gpu_batch` / `profile_gpu_kernels` /
  `spin_wait` using the owned fields, so the GPU-encoding logic is
  byte-for-byte unchanged from the pre-seam call site.
- The three per-edge ops-patching closures (reset / model-driven inputs /
  model clock edges) become free functions taking `&mut dyn CosimBackend`
  and patch through `edge_ops_mut` — the "closure-borrow" resolution noted
  in ADR 0017 (sequential calls, no long-lived backend borrow).
- `edge_ops_mut` takes `&mut self` so the seam enforces exclusive access
  at compile time over the interior-mutable Metal shared memory.

Trait scope is exactly what the Metal driver uses; `state()`/`state_mut()`
(plan's output_state/input_state_mut) and a backend-agnostic VCD ring land
in Phase 1 with `CpuBackend` (noted in-code). The module split to
cosim/{mod,metal}.rs remains separable/cosmetic.

Behaviour-preserving: all 7 Metal cosim fixtures byte-identical to the
pre-refactor golden; jtag_minimal 4M-edge replay PASS
(data0_obs=0xCAFEBABE); `cargo test --lib --features metal` 298 pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
Phase 0 (#105) landed on this branch: edge_ops_mut accessor + CosimBackend
trait/MetalBackend extraction, Metal bit-identical (harness + 4M JTAG
verified). Re-frame next-up as Phase 1 (CpuBackend + Linux CI) and record
the two Phase-0 deferrals (state accessors, de-Metal the VCD ring) that
feed into it.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
Concrete implementation plan for Phase 1 of #105, stacked on the Phase 0
seam (#118). Records the chosen interface approach (fat backend
constructor / ADR Layer-1/2 split), the trait additions to de-Metal
run_cosim's ~54 setup sites, the required cosim_metal.rs → cosim/{mod,metal}
module split, the CPU peripheral-decode strategy (reuse CppSpiFlash +
BusTraceDecoder; port UART TX decode to CPU), a 7-step bit-identical-gated
sequencing, and the cross-backend-equivalence testing strategy.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…ddressed)

Address the plan-reviewer's 4 critical issues + gaps (all evidenced against
the real code):

- Drop `flash_d_i` from the trait (Metal vs CppSpiFlash update d_i at
  different dispatch points); CpuBackend absorbs CppSpiFlash::step internally.
- Specify the `vcd_snapshot` ring contract ([input|output] slot layout,
  CpuBackend fills slot 0, drain loop refactored off raw ring.contents()).
- `--check-with-cpu` becomes a no-op-with-warning under CpuBackend (never
  compare the reference backend to itself).
- Decompose step 2 (the ~54-site relocation) into 2a/2b/2c, each Metal
  bit-identical; add the module-split compile gate (zero metal:: in mod.rs,
  `cargo check --lib` no-feature) with front-loaded exit assertions in
  steps 1/3.
- Note the de-Metaling of the ~100 lines of loop-body diagnostics
  (dff-dump/trace-signals/deep-diag).
- New CpuBackend::new asserts: reject SRAM+xprop (simulate_block_v1 has no
  sram_xmask) and timing-arrivals (GPU-ring readback); size state Vec to
  effective_state_size*2.
- Acknowledge WB-trace stub, input-only models / empty step_edge output,
  multi-stage (blocks × num_major_stages); fix the cmd_cosim hard-error
  ref (~1684, not the sim-path 507).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…nification)

Fill in the Tier-2 `GpuPeripheral` seam beyond the `encode_step` one-liner:
specify the single peripheral contract that the CPU model (Tier 1), GPU
kernel (Tier 2), and single source (Tier 3) all express, so they aren't two
parallel interfaces.

Records: (1) the observe→FSM→drive+emit shape is common to every peripheral
and the CPU `PeripheralModel` trait is already this contract and already
bidirectional (optional input/output halves cover GPIO/UART/bus/flash);
(2) the GPU half is three bespoke kernels with one common skeleton + a
hand-synced `#[repr(C)]` layout — the shared layout is the consistency
anchor, the hand-sync is the tax Tier 3 removes; (3) the key decision —
express ALL input drives as (position,value) ops applied through state_prep,
normalising flash's bespoke direct-write so input application is uniform
across peripherals and both substrates; (4) what deliberately stays
substrate-specific (ring drain, FSM body). Adds the Phase-1 implication that
the new CPU UART-TX decoder mirror `UartDecoderState` so Phase 2/3 fold into
one definition.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
… (P1.1)

Phase 1 step 1 of #105. Removes the `metal::Buffer` leak from the
`CosimBackend` trait and moves the per-edge VCD snapshot ring into the
backend, so the seam stops exposing Metal types:

- `run_edges(batch, schedule_offset)` drops its `Option<&metal::Buffer>`
  parameter; the ring is now a `MetalBackend` field sourced internally.
- `enable_vcd_ring()` / `vcd_snapshot(edge) -> &[u32]` trait methods replace
  the run_cosim-local `vcd_ring_buffer` + raw `ring.contents()` pointer-math
  drain (now a `vcd_snapshot(i)` loop over agnostic [input|output] slots).
- `flash_set_in_reset(bool)` replaces the direct
  `flash_state_buffer.contents() as *mut FlashState` in_reset write.

State/SRAM read accessors (`state`/`sram`) and the mutable `state_mut`, plus
the routing of the ~15 diagnostic `.contents()` reads (entangled with
flash-FlashState reads), are deferred to step 3 (make run_cosim generic) and
step 5 (CpuBackend), where they are handled coherently — kept off the trait
here to match exactly what the Metal driver calls today.

Behaviour-preserving: 7 cosim fixtures byte-identical to the Phase 0 golden
(incl. the VCD outputs that exercise the ring path); cargo test --lib
--features metal 298 pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
… new)

P1.1 landed (bc18f79): de-Metal run_edges seam + backend-owned VCD ring.
Record the precise step-2 site map (buffer alloc/init line numbers) and the
2a/2b/2c decomposition, and note #118 is being grown to Phase 0 + Phase 1.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…ash_buffers (P1.2a)

Phase 1 step 2a. Relocate the ~120-line GPU SPI-flash buffer allocation +
init (FlashState / FlashDinParams / FlashModelParams / flash_data firmware
load) out of run_cosim's inline body into an associated
`MetalBackend::build_flash_buffers` in the Metal impl. Pure relocation; a
permanent piece of the eventual MetalBackend::new (which composes
build_flash + build_uart + build_bus + build_state).

Bit-identical: 7 cosim fixtures byte-identical to the Phase 0 golden.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
… (P1.2b)

Phase 1 step 2b. Relocate the ~110-line UART + Wishbone-trace + bus-trace
buffer allocation + init (the gpu_io_step peripheral buffers) out of
run_cosim into an associated `MetalBackend::build_io_buffers`, returning the
seven buffers plus the CPU-side `bus_lanes` decoders. Pure relocation;
companion to build_flash_buffers in the eventual MetalBackend::new.

Bit-identical: 7 cosim fixtures byte-identical to the Phase 0 golden (incl.
dual_uart + apb_trace, which exercise these buffers).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
Flash + gpu_io_step buffer setup extracted to MetalBackend::build_flash_buffers
/ build_io_buffers (75c9ec0, a93812a), bit-identical. Record 2c (the trickier
state/sram/event/blocks extraction + xprop seed) and the MetalBackend::new
assembly as the next moves.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…next

Sync the handoff header to the current branch tip + CI status (fully green
across Phase 0 + P1.1 + P1.2a + P1.2b) and note the PR description should be
rescoped to Phase 0 + Phase 1.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…d_state_buffers (P1.2c)

Relocate the design-state, SRAM, blocks-program, and event buffer
allocation + buffer-intrinsic init out of run_cosim into a new static
MetalBackend::build_state_buffers, mirroring 2a/2b. Moved: states_buffer
alloc + fill, the xprop X-mask seed of both slots, sram_data/sram_xmask
alloc + fill, the SRAM ELF preload, the blocks_start/blocks_data no-copy
wrappers, and the leaked-Box event buffer.

The agnostic stimulus deposits (reg_init / reset / constant_ports /
set_flash_din) and sram_dumper stay inline in run_cosim, operating on a
states slice re-derived over the returned states_buffer.
timing_constraints_buffer also stays inline (not in 2c scope).

build_state_buffers returns event_buffer_ptr (the *mut EventBuffer from
Box::into_raw) so run_cosim keeps ownership of the leaked box; the two
existing drop(Box::from_raw(event_buffer_ptr)) sites are unchanged, and
no Drop impl is added. Metal output bit-identical (7 fixtures match the
golden; the only shasum delta is run_params.json::master_seed, which is
rand::random() and flips between runs of the same binary). 298 lib tests
pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
… 3 next

Step 2c extracted state/sram/event/blocks buffer setup into
MetalBackend::build_state_buffers; Metal bit-identical, 298 tests pass.
Next up: step 3 (generic run_cosim<B>). Also notes the golden.sums
run_params.json drop (per-run random master_seed, never bit-identical).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…reads (P1.3a)

Add CosimBackend::{state,state_mut,sram}; MetalBackend gains sram_len.
Route the ~14 read-only states_buffer/sram_data_buffer loop-body reads
through state()/sram(). run_cosim stays concrete-typed; the generic flip
+ flash/drain seam are 3b. Metal bit-identical, 298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…ric flip)

Step 3's one-liner under-specified the loop-body work. 3a (state/sram
accessors) landed @ d5a029f. Document the decoded-records seam (ADR 0017
Layer 3) for 3b-i (flash diagnostics + uart/wb/bus drains off concrete
fields) and 3b-ii (MetalBackend::new fat constructor + generic flip).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…rete fields (P1.3b-i)

Decoded-records seam (ADR 0017 Layer 3): add CosimBackend::{flash_d_i,
flash_debug_snapshot, drain_uart_tx, drain_bus_beats, drain_wb_trace_debug,
uart_decoder_debug, debug_flash_raw_tick0}. Flash const-params become
agnostic locals; uart/wb/bus ring read-cursors move into MetalBackend.
run_cosim body now has zero backend.<field>.contents() reads; stays
concrete-typed (generic flip is 3b-ii). Metal bit-identical, 298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
3a (state/sram accessors @ d5a029f) + 3b-i (decoded-records seam @ 32d31b8)
landed; run_cosim body now has zero backend.<field>.contents() reads.
Records the 3b-ii design: MetalBackend::new fat constructor, bus_lanes →
agnostic, event_buffer Drop impl, state_mut() deposit routing, generic flip.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…e_mut, Drop for event buffer (P1.3b-ii-a)

Fat constructor MetalBackend::new (MetalSimulator + build_state/flash/io +
timing buffer + struct literal) returning (Self, bus_lanes). Stimulus
deposits (reg_init/reset/constant_ports/set_flash_din) now go through
state_mut() after construction. event_buffer_ptr becomes a field freed by
a Drop impl, removing the two manual drop sites. run_cosim stays
concrete-typed (generic flip is 3b-ii-b). Metal bit-identical, 298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…Backend> (P1.3b-ii-b)

Add CosimBackend::new (fat constructor) + profile_kernels (default no-op);
move the write_params GPU-setup loop into MetalBackend::new. run_cosim_generic<B>
drives everything through trait methods — zero concrete MetalBackend/metal::
tokens in its body. Public run_cosim is a thin run_cosim_generic::<MetalBackend>
shim; jacquard.rs call site unchanged. Completes step 3. Metal bit-identical,
298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…; step 4 next

3b-ii-a (MetalBackend::new + deposits via state_mut + Drop @ 0bb150c) and
3b-ii-b (trait new/profile_kernels + run_cosim_generic<B> flip @ 14f6a27)
landed. run_cosim body is fully backend-neutral. Next: step 4 module split
(cosim/{mod,metal}.rs) gated by `cargo check --lib` no-feature compile.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
@robtaylor robtaylor changed the title refactor(cosim): extract CosimBackend seam + MetalBackend (#105 Phase 0) refactor(cosim): CosimBackend seam + de-Metal run_cosim (#105 Phase 0 + Phase 1 steps 1-3) Jun 9, 2026
robtaylor added 2 commits June 9, 2026 15:27
mod.rs (non-gated): CosimBackend trait, BitOp/FlashDebug, CosimOpts/Result,
run_cosim_generic<B>, patchers, scheduler + agnostic glue. metal.rs
(#[cfg(feature="metal")]): MetalBackend/MetalSimulator/ScheduleBuffers, GPU
structs, build_*/encode_*, the run_cosim shim. src/sim/mod.rs now `pub mod
cosim;` (non-gated); jacquard.rs paths updated. cargo check --lib
--no-default-features now compiles the agnostic half. Metal bit-identical,
298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…327080); step 5 next

Module split landed: cosim/mod.rs (non-gated agnostic) + cosim/metal.rs
(metal-gated GPU). cargo check --lib --no-default-features now compiles the
agnostic half. Next: step 5 CpuBackend (the trait is fully backend-neutral;
debug/profile methods have no-op defaults). Notes the bus_lanes-helper
extraction + dropping the temporary allow(dead_code) when CpuBackend lands.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
@robtaylor robtaylor changed the title refactor(cosim): CosimBackend seam + de-Metal run_cosim (#105 Phase 0 + Phase 1 steps 1-3) refactor(cosim): CosimBackend seam + de-Metal run_cosim + module split (#105 Phase 0 + Phase 1 steps 1-4) Jun 9, 2026
robtaylor added 5 commits June 9, 2026 15:34
CpuBackend run_edges models the --check-with-cpu CPU stepper; UART-TX FSM
must be ported from csrc/kernel_v1.metal (gpu_io_step), not Rust. CpuBackend
can't run end-to-end until step 6 wiring — note the 5+6-together vs
compile+unit-test verification options.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
CpuBackend (cosim/mod.rs, non-gated): Vec<u32> state/sram, run_edges via
cpu_reference::simulate_block_v1[_xprop] mirroring the --check-with-cpu
stepper, xprop X-mask state_prep ported from kernel_v1.metal. cmd_cosim's
no-GPU path now runs cosim on CpuBackend via run_cosim_cpu. UART/bus decode
+ flash stubbed (drain_* empty) for 5b/5c. Logic fixtures (xprop/2state/
noreginit/reginit VCDs) byte-identical to Metal golden; Metal path unchanged
(7 fixtures bit-identical, 298 tests pass).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…; 5b next

CpuBackend runs cosim with no GPU feature; 4 logic VCDs (incl. xprop)
byte-identical to Metal golden. Next: 5b (port UART-TX FSM from
kernel_v1.metal → drain_uart_tx, verify dual_uart), then 5c (bus-trace beat
extraction → drain_bus_beats + agnostic bus-lanes helper, verify apb_trace).
Notes the UnsafeCell interior-mutability review point.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…1.5b)

Port the gpu_io_step UART FSM (kernel_v1.metal:1189-1249) to CPU: per-channel
4-state decoder advancing current_cycle per edge, run after each edge's
simulate in run_edges, accumulating completed bytes drained by drain_uart_tx.
Per-channel tx_out_pos/cycles_per_bit derived agnostically (mirrors
build_io_buffers' UartParams). dual_uart_events.json byte-identical to Metal
golden on a no-GPU build; Metal path unchanged (7/7), 298 tests pass.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
… next

dual_uart byte-identical to golden on CPU. Next: 5c (port bus-trace beat
extraction → drain_bus_beats + extract agnostic bus-lanes builder so
CpuBackend::new returns real bus_lanes; verify apb_trace ±xprop). Then
step 7 Linux cosim CI.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
robtaylor added 2 commits June 9, 2026 19:51
…-lanes builder (P1.5c)

Port the gpu_io_step APB3 bus-trace extraction (kernel_v1.metal:1305-1352) to
CPU: per-bus gate rising-edge detection emitting RawBeats with a per-edge
current_tick, run after each edge's simulate. Extract the agnostic
build_bus_trace (positions + BusTraceLane decoders) into cosim/mod.rs, shared
by both backends; metal's build_bus_trace_params packs the positions into the
GPU struct. CpuBackend::new now returns real bus_lanes. All 7 cosim fixtures
byte-identical to the Metal golden on a no-GPU build; Metal path unchanged
(7/7), 298 tests pass. Completes CpuBackend Phase-1 functional parity.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
…e (@ a525a25); step 7 next

All 7 cosim fixtures byte-identical to the Metal golden on a no-GPU build.
Only step 7 (Linux cosim CI) remains for Phase 1. Records the CI-job options
(commit expected outputs vs extend compare_backend_vcds.py).

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
@robtaylor robtaylor changed the title refactor(cosim): CosimBackend seam + de-Metal run_cosim + module split (#105 Phase 0 + Phase 1 steps 1-4) refactor(cosim): CosimBackend seam + de-Metal run_cosim + module split + CpuBackend (#105 Phase 0 + Phase 1 steps 1-5) Jun 9, 2026
Commit the 7 cosim fixture expected outputs (tests/*/expected/) and a
scripts/ci/cosim_cpu_check.sh that runs them via the no-GPU CpuBackend build
and diffs against expected. New ubuntu-latest `cosim-cpu` CI job builds
`cargo build -r --bin jacquard` (no features) and runs the check — locking in
cross-backend equivalence on a free Linux runner. Completes Phase 1 of #105.

Co-developed-by: Claude Code v2.1.168 (claude-opus-4-8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant