Dem from stim text can use error decompositions#615
Conversation
Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>
Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>
Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>
b8f5989 to
acbd502
Compare
|
/ok to test acbd502 |
| if (!dets.empty() || !obs.empty()) { | ||
| detector_hits.push_back(dets); | ||
| observable_hits.push_back(obs); | ||
| rates.push_back(prob); |
There was a problem hiding this comment.
This changes the probability semantics for decomposed Stim errors. In Stim, error(p) A ^ B is still one correlated error mechanism with a suggested decomposition, but this creates multiple CUDA-Q columns each with probability p. Existing CUDA-Q consumers treat error_rates / error_rate_vec as per-column independent probabilities, so this would make the split components independently sampleable/weighted.
Can we avoid exposing decomposed columns as independent error mechanisms, or preserve the correlation explicitly, e.g. with shared error_ids plus clear consumer support? Otherwise decompose_errors=True should probably not populate ordinary error_rates in a way that looks probability-preserving.
There was a problem hiding this comment.
I agree with Vedika's concern that this changes the semantics of the error rate vec. I do think that we need to have both meanings. Since error_rates has an established meaning in this codebase (physical error rates), I suggest that we add another way rather than overloading the existing meaning of error_rates
| // Each segment delimited by '^' in the DEM text becomes its own column. | ||
| auto flush = [&]() { | ||
| if (!dets.empty() || !obs.empty()) { | ||
| detector_hits.push_back(dets); |
There was a problem hiding this comment.
This stores the raw component before duplicate detector targets are XOR-cancelled. For example, D0 D0 ^ D1 later becomes an all-zero detector column with a nonzero rate. That is not graphlike and will break consumers such as the PyMatching plugin, which requires each column to have one or two detector endpoints.
Can we canonicalize/XOR-cancel each component before appending, then reject decomposed components with 0 or more than 2 detector endpoints instead of emitting them?
There was a problem hiding this comment.
I agree that it probably makes the most sense to remove all-zero columns, but it looks like there is an existing test case that expects it as an output:
cudaqx/libs/qec/unittests/test_decoders.cpp
Lines 806 to 819 in 4bb93c6
Does it make sense for these all-zero columns to be removed only when decompose_errors=true, or to change this test?
| /// graphlike decomposition (components separated by '^' in the DEM text) are | ||
| /// expanded into one column per component; otherwise the '^' separators are | ||
| /// ignored and each error instruction produces a single column. | ||
| detector_error_model dem_from_stim_text(const std::string &dem_text, |
There was a problem hiding this comment.
Changing the only declaration from dem_from_stim_text(const std::string&) to dem_from_stim_text(const std::string&, bool) preserves source compatibility via the default argument, but removes the old exported C++ symbol. Any already-built downstream code/plugin linked against the one-argument symbol would fail to load.
Can we keep the one-argument overload and add a separate two-argument overload? The one-argument overload can simply delegate to dem_from_stim_text(dem_text, false).
There was a problem hiding this comment.
I agree that this is somewhat of a breaking change. However since we don't have a hard downstream dependency (we typically expect downstream to rebuild against the latest main/release), I would prefer that we don't add this overload just to preserve an old mangled symbol.
There was a problem hiding this comment.
That makes sense. If we expect downstream consumers to rebuild against the latest main/release, I’m fine not preserving the old one-arg mangled symbol here. Thanks for clarifying.
|
Thanks @eliotheinrich. I left a few more comments below. One testing gap that I think should be fixed in this PR: this adds new behavior to the public C++ At minimum, I’d like a small test in
Potential follow-up: docs/example clarity. The examples now show Could we either restructure the example to actually use the decomposed matrix where relevant, or track a follow-up to explicitly separate this as parser inspection only and state that direct |
melody-ren
left a comment
There was a problem hiding this comment.
Thanks Eliot! Could you please update the description to clarify that this PR doesn't decompose errors (already done by stim)?
| /// expanded into one column per component; otherwise the '^' separators are | ||
| /// ignored and each error instruction produces a single column. | ||
| detector_error_model dem_from_stim_text(const std::string &dem_text, | ||
| bool decompose_errors = false); |
There was a problem hiding this comment.
I want a refund on my high hopes due this name decompose_errors ;). I thought it meant that we're (cudaqx) doing the decomposition. Can we use something like use_stim_err_decomp or some other concise name to indicate that this flags toggles whether we'll ignore or use the stim decomposition? Feel free to suggest a better name. I don't quite like what I suggested but can't think of a better one
| if (!dets.empty() || !obs.empty()) { | ||
| detector_hits.push_back(dets); | ||
| observable_hits.push_back(obs); | ||
| rates.push_back(prob); |
There was a problem hiding this comment.
I agree with Vedika's concern that this changes the semantics of the error rate vec. I do think that we need to have both meanings. Since error_rates has an established meaning in this codebase (physical error rates), I suggest that we add another way rather than overloading the existing meaning of error_rates
Description
In the current implementation of
dem_from_stim_text, the suggested error decomposition^characters are ignored, but it is sometimes useful to output graphlike edge check matrices. For example,This PR adds a flag to
dem_from_stim_textwhich allows users to toggle if the^characters should be ignored (default) or should be used to generate an error-decomposed DEM. This is useful if the DEM will eventually be used for, i.e., pymatching.Runtime / performance impact
N/A
Self-review checklist
Please confirm each item before requesting review. Check
[x]or strikethrough and explain.
Before requesting review
Scope and size
(if so, an issue has been raised).
Tests
just when it is missing.
EXPECT_*/assertchecks areinsufficient for algorithmic correctness.
Documentation
tracked.
Code style
snake_casevscamelCase) forthe area being modified.
Dependencies
OSRB tickets filed.