Dem from stim text can use error decompositions by eliotheinrich · Pull Request #615 · NVIDIA/cudaqx

eliotheinrich · 2026-06-15T18:46:14Z

Description

In the current implementation of dem_from_stim_text, the suggested error decomposition ^ characters are ignored, but it is sometimes useful to output graphlike edge check matrices. For example,

import numpy as np
import stim
import cudaq_qec as qec
import beliefmatching

dem_text = "error(0.1) D0 ^ D1\n"

check_matrices = beliefmatching.detector_error_model_to_check_matrices(stim.DetectorErrorModel(dem_text))

# Current behavior: D0 and D1 merged into one column
dem_old = qec.dem_from_stim_text(dem_text)
print("dem_from_stim_text:")
print(np.array(dem_old.detector_error_matrix))          # [[1], [1]]

# Sometimes desirable: accept the error decomposition indicated by ^ and split into two columns
print("beliefmatching's edge_check_matrix:")
print(check_matrices.edge_check_matrix.toarray())                   # [[1, 0], [0, 1]

This PR adds a flag to dem_from_stim_text which allows users to toggle if the ^ characters should be ignored (default) or should be used to generate an error-decomposed DEM. This is useful if the DEM will eventually be used for, i.e., pymatching.

Runtime / performance impact

N/A

Self-review checklist

Please confirm each item before requesting review. Check [x] or strike
through and explain.

Before requesting review

I reviewed my own full diff in GitHub or my editor.
PR is in Draft if it is not yet ready for review.
Temporary / debugging changes have been removed.
Local test logs reviewed; no unexplained warnings or errors.
CI logs reviewed; no unexplained warnings or errors.
Full CI has been run.

Scope and size

PR is under ~1000 lines, or an exception is justified in the description.
Refactoring-only changes are isolated in their own PR(s).
No existing tests were disabled or modified just to make this PR pass
(if so, an issue has been raised).

Tests

New functionality has new tests.
Tests fail if the new functionality is broken (including crashes), not
just when it is missing.
Negative tests added where exceptions are expected.
Truth data added where simple EXPECT_* / assert checks are
insufficient for algorithmic correctness.
CI runtime impact considered; team notified if significant.

Documentation

Public-facing APIs have Doxygen docs.
User-visible behavior changes have public docs, or a follow-up is
tracked.

Code style

Naming follows the existing convention (snake_case vs camelCase) for
the area being modified.

Dependencies

No new third-party dependencies, or the team has been notified and
OSRB tickets filed.

Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>

copy-pr-bot · 2026-06-15T19:28:38Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

bmhowe23 · 2026-06-16T02:13:09Z

/ok to test acbd502

vedika-saravanan · 2026-06-16T15:13:07Z

+        if (!dets.empty() || !obs.empty()) {
+          detector_hits.push_back(dets);
+          observable_hits.push_back(obs);
+          rates.push_back(prob);


This changes the probability semantics for decomposed Stim errors. In Stim, error(p) A ^ B is still one correlated error mechanism with a suggested decomposition, but this creates multiple CUDA-Q columns each with probability p. Existing CUDA-Q consumers treat error_rates / error_rate_vec as per-column independent probabilities, so this would make the split components independently sampleable/weighted.

Can we avoid exposing decomposed columns as independent error mechanisms, or preserve the correlation explicitly, e.g. with shared error_ids plus clear consumer support? Otherwise decompose_errors=True should probably not populate ordinary error_rates in a way that looks probability-preserving.

I agree with Vedika's concern that this changes the semantics of the error rate vec. I do think that we need to have both meanings. Since error_rates has an established meaning in this codebase (physical error rates), I suggest that we add another way rather than overloading the existing meaning of error_rates

vedika-saravanan · 2026-06-16T15:13:59Z

+      // Each segment delimited by '^' in the DEM text becomes its own column.
+      auto flush = [&]() {
+        if (!dets.empty() || !obs.empty()) {
+          detector_hits.push_back(dets);


This stores the raw component before duplicate detector targets are XOR-cancelled. For example, D0 D0 ^ D1 later becomes an all-zero detector column with a nonzero rate. That is not graphlike and will break consumers such as the PyMatching plugin, which requires each column to have one or two detector endpoints.

Can we canonicalize/XOR-cancel each component before appending, then reject decomposed components with 0 or more than 2 detector endpoints instead of emitting them?

I agree that it probably makes the most sense to remove all-zero columns, but it looks like there is an existing test case that expects it as an output:

cudaqx/libs/qec/unittests/test_decoders.cpp

Lines 806 to 819 in 4bb93c6

TEST(StimDemGetDecoder, RepeatedDetectorOrObservableTargetsXorFold) {

const std::string dem_text = R"(error(0.1) D0 D0

error(0.1) L0 L0

)";

auto dem = cudaq::qec::dem_from_stim_text(dem_text);

ASSERT_EQ(dem.num_detectors(), 1u);

ASSERT_EQ(dem.num_observables(), 1u);

ASSERT_EQ(dem.num_error_mechanisms(), 2u);

EXPECT_EQ(dem.detector_error_matrix.at({0u, 0u}), 0u)

<< "duplicate D0 in error 0 should XOR-cancel to 0";

EXPECT_EQ(dem.observables_flips_matrix.at({0u, 1u}), 0u)

<< "duplicate L0 in error 1 should XOR-cancel to 0";

}

Does it make sense for these all-zero columns to be removed only when decompose_errors=true, or to change this test?

vedika-saravanan · 2026-06-16T15:14:58Z

+/// graphlike decomposition (components separated by '^' in the DEM text) are
+/// expanded into one column per component; otherwise the '^' separators are
+/// ignored and each error instruction produces a single column.
+detector_error_model dem_from_stim_text(const std::string &dem_text,


Changing the only declaration from dem_from_stim_text(const std::string&) to dem_from_stim_text(const std::string&, bool) preserves source compatibility via the default argument, but removes the old exported C++ symbol. Any already-built downstream code/plugin linked against the one-argument symbol would fail to load.

Can we keep the one-argument overload and add a separate two-argument overload? The one-argument overload can simply delegate to dem_from_stim_text(dem_text, false).

I agree that this is somewhat of a breaking change. However since we don't have a hard downstream dependency (we typically expect downstream to rebuild against the latest main/release), I would prefer that we don't add this overload just to preserve an old mangled symbol.

That makes sense. If we expect downstream consumers to rebuild against the latest main/release, I’m fine not preserving the old one-arg mangled symbol here. Thanks for clarifying.

vedika-saravanan · 2026-06-16T15:19:25Z

Thanks @eliotheinrich. I left a few more comments below.

One testing gap that I think should be fixed in this PR: this adds new behavior to the public C++ dem_from_stim_text API, but the new decompose_errors=true path is only covered through Python tests. Could we add C++ gtest coverage as well?

At minimum, I’d like a small test in libs/qec/unittests/test_decoders.cpp covering:

default / false behavior still ignores ^ and produces one column per Stim error instruction
true behavior splits ^ components into separate columns
edge behavior for duplicate detector targets after XOR cancellation, especially if we validate/reject empty decomposed components

Potential follow-up: docs/example clarity. The examples now show dem_from_stim_text(..., decompose_errors=True), but the decoder is still constructed from dem_text, which uses the default non-decomposed path. That can make readers think get_decoder(..., dem_text) is using the decomposed DEM when it is not.

Could we either restructure the example to actually use the decomposed matrix where relevant, or track a follow-up to explicitly separate this as parser inspection only and state that direct get_decoder(..., dem_text) uses the default non-decomposed parsing?

melody-ren

Thanks Eliot! Could you please update the description to clarify that this PR doesn't decompose errors (already done by stim)?

melody-ren · 2026-06-16T19:37:17Z

+/// expanded into one column per component; otherwise the '^' separators are
+/// ignored and each error instruction produces a single column.
+detector_error_model dem_from_stim_text(const std::string &dem_text,
+                                        bool decompose_errors = false);


I want a refund on my high hopes due this name decompose_errors ;). I thought it meant that we're (cudaqx) doing the decomposition. Can we use something like use_stim_err_decomp or some other concise name to indicate that this flags toggles whether we'll ignore or use the stim decomposition? Feel free to suggest a better name. I don't quite like what I suggested but can't think of a better one

melody-ren · 2026-06-16T19:39:43Z

+        if (!dets.empty() || !obs.empty()) {
+          detector_hits.push_back(dets);
+          observable_hits.push_back(obs);
+          rates.push_back(prob);


I agree with Vedika's concern that this changes the semantics of the error rate vec. I do think that we need to have both meanings. Since error_rates has an established meaning in this codebase (physical error rates), I suggest that we add another way rather than overloading the existing meaning of error_rates

eliotheinrich added 3 commits June 15, 2026 12:28

Added decompose_errors param to dem_from_stim_text

45c065a

Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>

Updated docs for new decompose_errors flag

2f724f8

Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>

Applied formatting patches

acbd502

Signed-off-by: Eliot Heinrich <eheinrich@nvidia.com>

eliotheinrich force-pushed the pr-dem-stim-decompose branch from b8f5989 to acbd502 Compare June 15, 2026 19:28

eliotheinrich marked this pull request as ready for review June 16, 2026 14:44

vedika-saravanan reviewed Jun 16, 2026

View reviewed changes

melody-ren reviewed Jun 16, 2026

View reviewed changes

	TEST(StimDemGetDecoder, RepeatedDetectorOrObservableTargetsXorFold) {
	const std::string dem_text = R"(error(0.1) D0 D0
	error(0.1) L0 L0
	)";

	auto dem = cudaq::qec::dem_from_stim_text(dem_text);
	ASSERT_EQ(dem.num_detectors(), 1u);
	ASSERT_EQ(dem.num_observables(), 1u);
	ASSERT_EQ(dem.num_error_mechanisms(), 2u);
	EXPECT_EQ(dem.detector_error_matrix.at({0u, 0u}), 0u)
	<< "duplicate D0 in error 0 should XOR-cancel to 0";
	EXPECT_EQ(dem.observables_flips_matrix.at({0u, 1u}), 0u)
	<< "duplicate L0 in error 1 should XOR-cancel to 0";
	}

Conversation

eliotheinrich commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Runtime / performance impact

Self-review checklist

Before requesting review

Scope and size

Tests

Documentation

Code style

Dependencies

Uh oh!

copy-pr-bot Bot commented Jun 15, 2026

Uh oh!

bmhowe23 commented Jun 16, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vedika-saravanan commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

melody-ren left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

eliotheinrich commented Jun 15, 2026 •

edited

Loading

vedika-saravanan commented Jun 16, 2026 •

edited

Loading