Summary
meld fuse produces a core wasm module whose code section is fully rewritten compared to the inputs, but DWARF custom sections (.debug_info, .debug_line, .debug_str, …) carried by the input components are passed through byte-for-byte. Every DWARF address inside those passed-through sections is a byte offset into the original per-component code section and is therefore wrong against the merged code section.
The downstream consumer is pulseengine/witness (sibling repo). Witness uses gimli to build a (code-section byte offset) -> (file, line) map so it can attribute MC/DC br_if decisions to source lines. After meld fusion every offset is wrong, so witness produces incorrect coverage attribution for fused modules.
This issue tracks a phased fix.
Discovery (Phase 1, this issue)
Audit of meld-core/src/ on main (commit c52eb1b):
- Default custom-section policy is Merge, not Drop (see meld-core/src/lib.rs:110). The earlier assumption in CLI/agent docs was that DWARF was being dropped; in practice it is being passed through.
- merger.rs:2010-2012 naively concatenates every input core module's custom_sections vec into the merged module's vec without dedup, ordering, or address rewriting.
- lib.rs::encode_output (lines 1345-1356) emits them all at the end of the output module unless CustomSectionHandling::Drop is set.
- No code in meld-core/ parses, rewrites, or reconciles .debug_* sections. parser.rs ignores component-level custom sections entirely (lines 1082-1084) and only stores core-module custom sections.
Concrete data from the new discovery test on tests/wit_bindgen/fixtures/lists.wasm (P2 component embedding 2 core modules):
DWARF sections at top level of fused module: {
    ".debug_abbrev": 2,
    ".debug_info": 2,
    ".debug_line": 2,
    ".debug_loc": 2,
    ".debug_ranges": 2,
    ".debug_str": 2,
}
input code section (sum across embedded modules): 231531 bytes
fused code section: 213242 bytes
So today's fused module:
- carries duplicate DWARF sections (one set per input core module),
- has a code section of a different length (so addresses can't be coincidentally valid),
- gives gimli ambiguous and wrong input.
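The duplicate-section observation above can be reproduced with a small check. The following is a minimal sketch using only the standard library; `count_debug_sections` and `ambiguous_sections` are hypothetical helper names, and the input is assumed to be the list of custom-section names read from the fused module in emission order. It is not the code in the discovery test.

```rust
use std::collections::BTreeMap;

/// Counts how many times each `.debug_*` custom section name occurs.
/// `section_names` stands in for the custom-section names of the fused
/// module, in emission order (hypothetical input shape).
pub fn count_debug_sections(section_names: &[&str]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for name in section_names.iter().filter(|n| n.starts_with(".debug_")) {
        *counts.entry(name.to_string()).or_insert(0) += 1;
    }
    counts
}

/// Returns the section names that appear more than once, i.e. the sections
/// for which a DWARF reader would have to guess which copy to load.
pub fn ambiguous_sections(counts: &BTreeMap<String, usize>) -> Vec<&str> {
    counts
        .iter()
        .filter(|(_, n)| **n > 1)
        .map(|(name, _)| name.as_str())
        .collect()
}
```

Any non-empty result from `ambiguous_sections` signals that the fused module carries more than one copy of a DWARF section.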
Cross-repo dependency: witness
pulseengine/witness v0.11.x reads DWARF via:
- crates/witness-core/src/decisions.rs::extract_dwarf_sections: looks for .debug_abbrev, .debug_info, .debug_line, .debug_str, .debug_line_str, .debug_str_offsets, .debug_addr, .debug_rnglists, .debug_loclists.
- build_line_map: uses gimli to compute (code-section byte offset) -> (file, line).
- reconstruct_decisions: attributes each br_if byte offset to a source line.
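To illustrate why stale offsets give wrong attribution rather than none, here is a minimal sketch of an offset-to-line lookup. It is not witness's implementation: `LineMap`, `SourceLoc`, and `from_rows` are hypothetical names, the file name in the usage note is made up, and real line tables also carry end-of-sequence rows that this sketch ignores.

```rust
use std::collections::BTreeMap;

/// A (file, line) pair attributed to a code-section byte offset.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceLoc {
    pub file: String,
    pub line: u32,
}

/// Maps the starting code-section offset of each line-table row to its
/// source location (hypothetical stand-in for a real line map).
pub struct LineMap {
    rows: BTreeMap<u64, SourceLoc>,
}

impl LineMap {
    /// Builds the map from (start offset, file, line) rows.
    pub fn from_rows(rows: &[(u64, &str, u32)]) -> Self {
        let rows = rows
            .iter()
            .map(|&(offset, file, line)| {
                (offset, SourceLoc { file: file.to_string(), line })
            })
            .collect();
        LineMap { rows }
    }

    /// Returns the row with the greatest start offset that is <= `offset`,
    /// or `None` if `offset` lies before the first row.
    pub fn lookup(&self, offset: u64) -> Option<&SourceLoc> {
        self.rows.range(..=offset).next_back().map(|(_, loc)| loc)
    }
}
```

With rows at 0x10 (line 12), 0x18 (line 13), and 0x30 (line 20) in a made-up file, a br_if at 0x1a resolves to line 13. If fusion moves that instruction by 0x20 bytes while the table stays unchanged, the same lookup returns line 20: a plausible-looking but incorrect answer.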
Witness is intentionally NOT a meld-core dependency (it pulls wasmtime, walrus, gimli and lives in a separate workspace). End-to-end verification ("run witness on a meld-fused module, assert > X% of branches got source attribution") therefore has to live cross-repo — likely in pulseengine/wasm-component-examples release evidence, or as a scripted smoke check in CI.
Phased plan
Phase 1 — discovery (THIS ISSUE)
- meld-core/tests/dwarf_passthrough.rs pinning the lossy behavior (5 tests, all green today, flip when phases land)
Phase 1.5 — explicit policy
Document and surface the choice: today, debug-info-bearing components produce a fused module with wrong DWARF, which is worse than no DWARF. Options:
- Keep Merge default: silent corruption, witness gives wrong source lines.
- Switch default to Drop for .debug_* sections specifically: witness gives no attribution but no wrong attribution.
- Add a CLI flag --debug-info {drop,passthrough,remap} so users opt in.
Recommendation: add a .debug_*-aware policy distinct from generic custom-section handling, default to drop for .debug_* until Phase 2 ships, keep Merge for non-DWARF custom sections. Out of scope for the discovery PR; tracked here.
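One possible shape for the recommended policy is sketched below. All names (`DebugInfoPolicy`, `CustomSection`, `apply_policy`) are hypothetical and do not correspond to existing meld-core types; the sketch only assumes that a custom section is a name plus raw bytes and that DWARF sections are recognizable by the `.debug_` prefix.

```rust
use std::str::FromStr;

/// Hypothetical policy for `.debug_*` sections, mirroring the proposed
/// `--debug-info {drop,passthrough,remap}` flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DebugInfoPolicy {
    Drop,
    Passthrough,
    Remap,
}

impl Default for DebugInfoPolicy {
    /// Drop is the recommended default until the Phase 2 remap exists.
    fn default() -> Self {
        DebugInfoPolicy::Drop
    }
}

impl FromStr for DebugInfoPolicy {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "drop" => Ok(DebugInfoPolicy::Drop),
            "passthrough" => Ok(DebugInfoPolicy::Passthrough),
            "remap" => Ok(DebugInfoPolicy::Remap),
            other => Err(format!("unknown --debug-info value: {other}")),
        }
    }
}

/// Hypothetical stand-in for a parsed custom section.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CustomSection {
    pub name: String,
    pub data: Vec<u8>,
}

/// True for sections that belong to DWARF debug info.
pub fn is_dwarf(name: &str) -> bool {
    name.starts_with(".debug_")
}

/// Applies the DWARF policy. Non-DWARF sections are always kept, which
/// corresponds to leaving the generic handling at Merge.
pub fn apply_policy(
    sections: Vec<CustomSection>,
    policy: DebugInfoPolicy,
) -> Result<Vec<CustomSection>, String> {
    match policy {
        DebugInfoPolicy::Drop => {
            Ok(sections.into_iter().filter(|s| !is_dwarf(&s.name)).collect())
        }
        DebugInfoPolicy::Passthrough => Ok(sections),
        // Remap needs the Phase 2 machinery; fail loudly instead of
        // silently emitting stale addresses.
        DebugInfoPolicy::Remap => Err("remap is not available before Phase 2".to_string()),
    }
}
```

Keeping the DWARF decision in its own enum means the existing generic custom-section setting does not need a new variant, and an unsupported `remap` request produces an error rather than corrupted output.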
Phase 2 — DWARF remap
For each .debug_line line program, rewrite addresses from per-input code-section offsets to merged-code-section offsets using the function-body relocation map the merger already builds. Single-pass through the line program is enough — the addresses are sequential and the merger preserves function-body byte offsets within each rewritten body modulo index reencoding.
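The address translation described above can be sketched as a lookup over relocated function bodies. This is an illustration under stated assumptions, not the merger's actual data structure: `BodyReloc` and `RelocMap` are hypothetical names, and the sketch assumes body-relative offsets are unchanged by the rewrite. Where index re-encoding changes instruction widths, a finer-grained per-instruction map would be needed.

```rust
/// One relocated function body: the half-open range it occupied in the
/// input code section and where it starts in the merged code section.
/// Hypothetical shape; the merger's real relocation map may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BodyReloc {
    pub old_start: u64,
    pub old_end: u64,
    pub new_start: u64,
}

/// Relocation entries for one input module, sorted by `old_start`.
pub struct RelocMap {
    bodies: Vec<BodyReloc>,
}

impl RelocMap {
    pub fn new(mut bodies: Vec<BodyReloc>) -> Self {
        bodies.sort_by_key(|b| b.old_start);
        RelocMap { bodies }
    }

    /// Translates an input code-section offset to a merged one.
    /// Returns `None` when the offset is not inside any known body.
    pub fn remap(&self, old_addr: u64) -> Option<u64> {
        // Index of the first body that starts after `old_addr`.
        let idx = self.bodies.partition_point(|b| b.old_start <= old_addr);
        let body = self.bodies.get(idx.checked_sub(1)?)?;
        if old_addr < body.old_end {
            // Assumes the offset within the body is preserved.
            Some(body.new_start + (old_addr - body.old_start))
        } else {
            None
        }
    }
}

/// Remaps the address column of a line program in a single pass.
pub fn remap_rows(map: &RelocMap, addresses: &[u64]) -> Vec<Option<u64>> {
    addresses.iter().map(|&a| map.remap(a)).collect()
}
```

A real rewriter would also have to handle end-of-sequence rows, whose address is the first byte past the last instruction and therefore falls outside the half-open range used here.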
.debug_info DIEs that reference code addresses (DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges) need the same remap. .debug_ranges / .debug_rnglists need entry-by-entry rewriting. .debug_str and .debug_abbrev are address-free and can pass through.
Multi-input dedup: .debug_info becomes the concatenation of input compile units, with offset adjustments. .debug_str needs string-pool dedup. .debug_abbrev either merges (if abbreviation tables are byte-equal) or stays per-CU.
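The string-pool dedup step could look roughly like the sketch below. `merge_debug_str` and `MergedStrings` are hypothetical names. The sketch assumes each pool is a sequence of NUL-terminated strings and that references point at string starts; offsets that point into the middle of a string, which some toolchains may emit through suffix sharing, are not handled.

```rust
use std::collections::HashMap;

/// Result of merging several `.debug_str` pools.
pub struct MergedStrings {
    /// The merged, deduplicated pool.
    pub data: Vec<u8>,
    /// `offset_maps[i]` maps a string-start offset in input pool `i`
    /// to the offset of the same string in `data`.
    pub offset_maps: Vec<HashMap<u32, u32>>,
}

/// Merges NUL-terminated string pools, emitting each distinct string once.
pub fn merge_debug_str(pools: &[&[u8]]) -> MergedStrings {
    let mut data = Vec::new();
    let mut seen: HashMap<Vec<u8>, u32> = HashMap::new();
    let mut offset_maps = Vec::with_capacity(pools.len());

    for pool in pools {
        let mut map = HashMap::new();
        let mut start = 0usize;
        while start < pool.len() {
            // Length up to the next NUL; tolerate a missing final terminator.
            let len = pool[start..]
                .iter()
                .position(|&b| b == 0)
                .unwrap_or(pool.len() - start);
            let bytes = &pool[start..start + len];
            // Reuse the existing copy if this string was already emitted.
            let merged = *seen.entry(bytes.to_vec()).or_insert_with(|| {
                let at = data.len() as u32;
                data.extend_from_slice(bytes);
                data.push(0);
                at
            });
            map.insert(start as u32, merged);
            start += len + 1;
        }
        offset_maps.push(map);
    }
    MergedStrings { data, offset_maps }
}
```

The per-input offset maps are what a `.debug_info` rewriter would consult when patching string references in each compile unit.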
This is real DWARF surgery. Feasible with gimli (read) + a write path. Witness already has the gimli dep; meld-core would need to add it (or fork the writer-side logic into a thin standalone helper).
Phase 3 — adapters and inlined code
The merger generates new function bodies for cross-component adapters (memory.copy + cabi_realloc trampolines). These have NO source. Options:
- Synthesize DIEs that point at a placeholder file "<meld-adapter>" line N where N = the adapter's role (memory copy, realloc, lift, lower). Witness's truth-table view becomes correct: adapter br_ifs show up as "adapter" branches that don't need source-level MC/DC coverage.
- Accept the gap: leave adapter ranges out of the line map. Witness already has a strict-per-br_if fallback when DWARF is absent or sparse, so this degrades gracefully.
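The two options can be contrasted in a small sketch. Everything here is hypothetical: the `AdapterRole` variants follow the roles listed above, the role-to-line numbering is an arbitrary assumption, and `attribute` is an illustrative function rather than anything in meld-core or witness.

```rust
/// Roles an adapter function can play, per the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AdapterRole {
    MemoryCopy,
    Realloc,
    Lift,
    Lower,
}

/// Placeholder file name for synthesized adapter entries.
pub const ADAPTER_FILE: &str = "<meld-adapter>";

impl AdapterRole {
    /// Placeholder line number for this role. The numbering is an
    /// arbitrary assumption made for this sketch.
    pub fn placeholder_line(self) -> u32 {
        match self {
            AdapterRole::MemoryCopy => 1,
            AdapterRole::Realloc => 2,
            AdapterRole::Lift => 3,
            AdapterRole::Lower => 4,
        }
    }
}

/// Half-open merged-code-section range occupied by one adapter body.
pub struct AdapterRange {
    pub start: u64,
    pub end: u64,
    pub role: AdapterRole,
}

/// Attributes an offset inside adapter code.
/// With `synthesize == true` (first option) the offset maps to the
/// placeholder file and the role's line; with `false` (second option)
/// adapter code is left out of the line map entirely.
pub fn attribute(
    offset: u64,
    adapters: &[AdapterRange],
    synthesize: bool,
) -> Option<(&'static str, u32)> {
    if !synthesize {
        return None;
    }
    adapters
        .iter()
        .find(|r| r.start <= offset && offset < r.end)
        .map(|r| (ADAPTER_FILE, r.role.placeholder_line()))
}
```

Under the first option a consumer can recognize adapter branches by file name alone; under the second it sees no row for the offset and would rely on whatever fallback it already has for sparse DWARF.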
Variable-level debug info (.debug_loc, .debug_loclists) is explicitly out of scope for this epic — instruction → source mapping only.
Done criteria for THIS issue (discovery)
- meld-core/tests/dwarf_passthrough.rs
References
meld-core/src/lib.rs:110 — default CustomSectionHandling::Merge
meld-core/src/lib.rs:1345-1356 — encoder writes custom sections verbatim
meld-core/src/merger.rs:2010-2012 — naive per-module custom-section accumulation
meld-core/src/parser.rs:1279-1283 — core-module custom sections collected
meld-core/src/parser.rs:1082-1084 — component-level custom sections explicitly dropped
pulseengine/witness:crates/witness-core/src/decisions.rs:83-140 — DWARF section extraction contract
pulseengine/witness:crates/witness-core/src/decisions.rs:142-160 — DWARF address semantics ("byte offsets into the Code section")