Add GNN-Based Predictor with DAG Preprocessing #430
Conversation
Actionable comments posted: 2
♻️ Duplicate comments (5)
tests/hellinger_distance/test_estimated_hellinger_distance.py (5)
186-192: `num_nodes` should be an integer, not a tensor.
PyTorch Geometric's `Data` class expects `num_nodes` to be a plain Python `int`. Wrapping it in `torch.tensor([n_nodes])` creates a 1D tensor, which can cause unexpected behavior during batching or graph operations.
```diff
 gnn_training_sample = Data(
     x=x,
     y=torch.tensor(labels_list[i], dtype=torch.float32),
     edge_index=edge_idx,
-    num_nodes=torch.tensor([n_nodes]),
+    num_nodes=n_nodes,
 )
```
219-221: Docstring is outdated after parameterization.
The docstring still references "the Hellinger distance model that was trained in the previous test," but this test now trains its own model and is parameterized for both RF and GNN backends. Update to reflect the current behavior.
```diff
 def test_train_and_qcompile_with_hellinger_model(
     source_path: Path, target_path: Path, device: Target, gnn: bool
 ) -> None:
-    """Test the entire predictor toolchain with the Hellinger distance model that was trained in the previous test."""
+    """Test the entire predictor toolchain for estimated_hellinger_distance with both RF and GNN backends."""
```
307-317: Consolidate duplicate `data_path` computation.
`data_path` is computed twice with the same value. Merge the cleanup loops to avoid redundant computation.
```diff
 data_path = get_path_training_data() / "training_data_aggregated"
 if data_path.exists():
     for file in data_path.iterdir():
-        if file.suffix == ".npy":
+        if file.suffix in (".npy", ".pt"):
             file.unlink()
-
-data_path = get_path_training_data() / "training_data_aggregated"
-if data_path.exists():
-    for file in data_path.iterdir():
-        if file.suffix == ".pt":
-            file.unlink()
```
319-323: Use tuple membership instead of chained `or` comparisons.
For readability and consistency, use `in` with a tuple.
```diff
-if file.suffix == ".joblib" or file.suffix == ".pth" or file.suffix == ".json":
+if file.suffix in (".joblib", ".pth", ".json"):
```
293-324: Consider converting cleanup test to a fixture.
`test_remove_files` acts as teardown for artifacts created by other tests. This couples correctness to test ordering and can break under parallel execution or when running a subset of tests with `-k`. Consider moving this logic to a module-scoped autouse fixture with `yield` to guarantee cleanup regardless of test selection or execution order. Example fixture approach:
```python
@pytest.fixture(scope="module", autouse=True)
def cleanup_test_artifacts(source_path: Path, target_path: Path):
    """Clean up test artifacts after all tests in the module complete."""
    yield
    # Cleanup logic here...
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
tests/hellinger_distance/test_estimated_hellinger_distance.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/hellinger_distance/test_estimated_hellinger_distance.py (3)
src/mqt/predictor/hellinger/utils.py (2)
- calc_device_specific_features (37-132)
- hellinger_distance (29-34)
src/mqt/predictor/ml/helper.py (3)
- TrainingData (575-585)
- create_dag (192-301)
- get_path_training_data (36-38)
src/mqt/predictor/rl/helper.py (1)
- get_path_training_data (97-99)
🪛 Ruff (0.14.6)
tests/hellinger_distance/test_estimated_hellinger_distance.py
158-158: Boolean-typed positional argument in function definition
(FBT001)
218-218: Boolean-typed positional argument in function definition
(FBT001)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 🐍 Test (windows-2022) / 🐍 windows-2022
- GitHub Check: 🐍 Test (ubuntu-24.04) / 🐍 ubuntu-24.04
- GitHub Check: 🐍 Test (macos-14) / 🐍 macos-14
🔇 Additional comments (2)
tests/hellinger_distance/test_estimated_hellinger_distance.py (2)
21-31: LGTM! The new imports for `torch`, `torch_geometric.data` (`Batch`, `Data`), and `create_dag` are appropriate for the GNN testing functionality.
254-265: Clarify the Windows + GNN timeout behavior.
The condition `sys.platform == "win32" and not gnn` means GNN tests on Windows skip the timeout warning path. If this is intentional (e.g., GNN compilation doesn't use the same timeout mechanism), a brief comment would help future maintainers understand this distinction; a hedged sketch of such a comment follows.
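If the skip is intentional, a short in-test comment along these lines would capture the rationale; the stated reason is only an assumption for the maintainers to confirm or correct:
```python
import sys

gnn = False  # parameterized in the real test

if sys.platform == "win32" and not gnn:
    # Assumption: the RF compilation path enforces a per-circuit timeout that
    # is unsupported on Windows and therefore warns, while the GNN path
    # bypasses that mechanism entirely.
    print("expect timeout warning")
```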
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
tests/device_selection/test_predictor_ml.py (4 hunks)
tests/hellinger_distance/test_estimated_hellinger_distance.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/device_selection/test_predictor_ml.py (2)
src/mqt/predictor/ml/helper.py (1)
- get_path_training_data (36-38)
src/mqt/predictor/ml/predictor.py (2)
- predict_device_for_figure_of_merit (800-880)
- Predictor (166-797)
🪛 Ruff (0.14.6)
tests/device_selection/test_predictor_ml.py
42-42: Boolean-typed positional argument in function definition
(FBT001)
42-42: Boolean-typed positional argument in function definition
(FBT001)
115-115: Boolean-typed positional argument in function definition
(FBT001)
tests/hellinger_distance/test_estimated_hellinger_distance.py
158-158: Boolean-typed positional argument in function definition
(FBT001)
219-219: Boolean-typed positional argument in function definition
(FBT001)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 🐍 Test (macos-14) / 🐍 macos-14
- GitHub Check: 🐍 Test (windows-2022) / 🐍 windows-2022
- GitHub Check: 🐍 Test (ubuntu-24.04) / 🐍 ubuntu-24.04
🔇 Additional comments (6)
tests/device_selection/test_predictor_ml.py (3)
38-43: Parameterization covers key scenarios appropriately. The test matrix exercises RF-only, GNN without verbose, and GNN with verbose modes. This provides good coverage for the new GNN path integration.
68-75: Artifact assertions correctly differentiate GNN and RF paths.
The conditional logic now properly validates:
- GNN path: `graph_dataset_expected_fidelity.pt` plus the shared `.npy` files
- RF path: `training_data_expected_fidelity.npy` plus the shared `.npy` files
This addresses the earlier feedback about asserting `names_list` and `scores_list` existence for the GNN path. A minimal sketch of such a check follows.
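A minimal sketch of a backend-aware artifact check; the shared file names below are assumptions based on the `names_list`/`scores_list` mention in the review:
```python
from pathlib import Path

def artifacts_present(data_path: Path, gnn: bool) -> bool:
    # Backend-specific training data file (names taken from the review).
    expected = (
        "graph_dataset_expected_fidelity.pt" if gnn
        else "training_data_expected_fidelity.npy"
    )
    # Shared files checked for both backends; exact names are hypothetical.
    shared = ("names_list_expected_fidelity.npy", "scores_list_expected_fidelity.npy")
    return all((data_path / name).exists() for name in (expected, *shared))
```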
114-117: Parameterized test correctly exercises both predictor backends.
The GNN flag is properly passed to `Predictor`, ensuring both RF and GNN backends are tested for the `_get_prepared_training_data` error path.
tests/hellinger_distance/test_estimated_hellinger_distance.py (3)
166-193: GNN training data construction is correct.
The branching logic properly:
- Uses `calc_device_specific_features` for RF and `create_dag` for GNN
- Constructs appropriate `TrainingData` objects for each path
- Correctly passes `n_nodes` as an integer to `Data` (addressing earlier feedback)
A condensed sketch of that branching follows this list.
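A condensed, runnable sketch of the RF/GNN branch; `create_dag` and `calc_device_specific_features` are stubbed here for illustration, and their real signatures in the PR may differ:
```python
import torch
from torch_geometric.data import Data

def create_dag(qc):  # stub for src/mqt/predictor/ml/helper.py:create_dag
    return torch.zeros((3, 40)), torch.empty((2, 0), dtype=torch.long), 3

def calc_device_specific_features(qc, device):  # stub
    return torch.zeros(40)

def build_sample(qc, device, gnn: bool):
    if gnn:
        x, edge_index, n_nodes = create_dag(qc)
        # num_nodes must be a plain int, per the review feedback above.
        return Data(x=x, edge_index=edge_index, num_nodes=n_nodes)
    return calc_device_specific_features(qc, device)

print(build_sample(None, None, gnn=True))
```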
196-214: Model training and evaluation paths are properly separated.
The conditional training logic correctly dispatches to `train_gnn_model` vs `train_random_forest_model`, and the evaluation uses appropriate methods for each model type. The tolerance comment at line 213 explains the rationale for `atol=2e-1`.
255-266: Clarify whether Windows timeout warning should apply to GNN path.
The condition `sys.platform == "win32" and not gnn` means the timeout warning test is skipped when `gnn=True` on Windows. Is this intentional because the GNN path handles timeouts differently, or should both paths warn on Windows? If GNN compilation also uses timeouts that aren't supported on Windows, consider:
```diff
-if sys.platform == "win32" and not gnn:
+if sys.platform == "win32":
```
Otherwise, add a comment explaining why GNN is exempt from this warning.
```python
@pytest.mark.parametrize("gnn", [False, True], ids=["rf", "gnn"])
def test_train_model_and_predict(device: Target, gnn: bool) -> None:
    """Test the training of the random forest regressor. The trained model is saved and used in the following tests."""
```
🧹 Nitpick | 🔵 Trivial
Docstring no longer accurately reflects parameterized test behavior.
The docstring still says "Test the training of the random forest regressor" but the test is now parameterized to cover both RF and GNN backends.
```diff
 @pytest.mark.parametrize("gnn", [False, True], ids=["rf", "gnn"])
 def test_train_model_and_predict(device: Target, gnn: bool) -> None:
-    """Test the training of the random forest regressor. The trained model is saved and used in the following tests."""
+    """Test model training and prediction for both RF and GNN backends."""
```
🧰 Tools
🪛 Ruff (0.14.6)
158-158: Boolean-typed positional argument in function definition
(FBT001)
🤖 Prompt for AI Agents
In tests/hellinger_distance/test_estimated_hellinger_distance.py around lines
157 to 159, the test docstring incorrectly states it only tests training of the
random forest regressor while the test is parameterized to run for both RF and
GNN; update the docstring to accurately describe that the test trains and saves
a model for both backends (random forest and GNN) and that the saved model is
used in subsequent tests, and briefly document the gnn parameter meaning (False
=> RF, True => GNN).
```python
def test_remove_files(source_path: Path, target_path: Path) -> None:
    """Remove files created during testing."""
    if source_path.exists():
        for file in source_path.iterdir():
            if file.suffix == ".qasm":
                file.unlink()
        source_path.rmdir()

    if target_path.exists():
        for file in target_path.iterdir():
            if file.suffix == ".qasm":
                file.unlink()
        target_path.rmdir()

    data_path = get_path_training_data() / "training_data_aggregated"
    if data_path.exists():
        for file in data_path.iterdir():
            if file.suffix in (".npy", ".pt"):
                file.unlink()

    model_path = get_path_training_data() / "trained_model"
    if model_path.exists():
        for file in model_path.iterdir():
            if file.suffix in (".joblib", ".pth", ".json"):
                file.unlink()
```
🧹 Nitpick | 🔵 Trivial
Consider converting cleanup to a fixture for test isolation.
test_remove_files acts as teardown but depends on test execution order. If tests are run in isolation or in parallel, cleanup may not occur. This was flagged in a previous review.
Convert to a session-scoped autouse fixture:
```python
@pytest.fixture(scope="module", autouse=True)
def cleanup_test_artifacts(source_path: Path, target_path: Path):
    """Clean up test artifacts after all tests in the module complete."""
    yield  # Let tests run
    # Cleanup logic here...
```
This ensures cleanup runs regardless of test selection or ordering.
🤖 Prompt for AI Agents
In tests/hellinger_distance/test_estimated_hellinger_distance.py around lines
294 to 318, replace the standalone test_remove_files teardown function with a
pytest fixture that runs automatically for the module (or session) and performs
cleanup after tests by yielding control and then executing the same removal
logic; make the fixture use scope="module" (or "session") and autouse=True,
accept source_path and target_path as parameters, ensure the post-yield cleanup
is idempotent (check existence before unlink/rmdir or use safe removal) and
covers data_path and model_path cleanup so cleanup runs regardless of test
ordering or isolation.
Actionable comments posted: 3
♻️ Duplicate comments (10)
src/mqt/predictor/ml/predictor.py (10)
108-109: Consider making `gnn` parameter keyword-only.
The `gnn` boolean parameter could be made keyword-only to prevent accidental positional usage and address Ruff's FBT001/FBT002 warnings consistently across the codebase. Apply this diff:
```diff
 path_training_data: Path | None = None,
 timeout: int = 600,
+*,
 gnn: bool = False,
 **gnn_kwargs: Unpack[TrainGNNKwargs],
```
328-338: Device score ordering may diverge from `self.devices` canonical order.
Line 330 uses `sorted(scores.keys())` to order device scores, but `self.devices` is sorted by `dev.description` at initialization (lines 190-192). If the sorted order of `scores.keys()` differs from the sorted order of device descriptions, the `y` tensor indices won't match the model's expected output ordering during training and inference. Use the canonical `self.devices` ordering:
```diff
 if self.gnn:
     x, _y, edge_idx, n_nodes, target_label = training_sample
-    name_device = sorted(scores.keys())
-    value_device = [scores[i] for i in name_device]
+    # Use canonical device ordering from self.devices
+    value_device = [scores.get(dev.description, -1.0) for dev in self.devices]
     gnn_training_sample = Data(
         x=x,
         y=torch.tensor(value_device, dtype=torch.float32),
```
422-422: Remove or clarify the commented assignment.
Line 422 assigns `scores_list = scores` with a commented-out alternative `# list(scores.values())`. Since `scores` is already a dict and is used directly, remove the comment to avoid confusion. Apply this diff:
```diff
-scores_list = scores  # list(scores.values())
+scores_list = scores
```
427-430: Clarify or remove the intermediate `y` tensor that gets discarded.
The `y` tensor created at line 429 (device index) is immediately discarded in `generate_training_data` (line 329: `x, _y, edge_idx, ...`) and replaced with the full device scores tensor. This intermediate value adds confusion without serving a purpose. Consider either:
- Not computing `y` here and returning a 4-tuple `(x, edge_index, number_of_gates, target_label)`, or
- Adding a comment explaining why this value exists but is unused.
```diff
 if self.gnn:
     x, edge_index, number_of_gates = create_dag(qc)
-    y = torch.tensor([[dev.description for dev in self.devices].index(target_label)], dtype=torch.float)
-    training_sample = (x, y, edge_index, number_of_gates, target_label)
+    # Note: y placeholder - actual targets are computed in generate_training_data
+    training_sample = (x, None, edge_index, number_of_gates, target_label)
```
504-507: Optuna objective has unused parameter and potential KFold failure.
Two issues in the `objective` method:
- Line 504: `num_outputs` is immediately overwritten by `max(1, len(self.devices))`, making the parameter unused. Either remove the parameter or honor the passed value.
- Line 507: `KFold(n_splits=k_folds)` will fail if `k_folds < 2`. For small datasets, guard this: `k_folds = max(2, k_folds)`.
Apply these fixes:
```diff
-num_outputs = max(1, len(self.devices))
+# num_outputs already passed as parameter, use it directly
 # Split into k-folds
+k_folds = max(2, k_folds)  # KFold requires at least 2 splits
 kf = KFold(n_splits=k_folds, shuffle=True)
```
If you want to keep the overwriting behavior, then remove `num_outputs` from the function signature and docstring.
859-859: Security risk: `torch.load` without `weights_only=True` can execute arbitrary code.
`torch.load(path)` without `weights_only=True` can execute arbitrary code if the model file is malicious. Since you're loading a state dict, apply this diff:
```diff
-gnn_model.load_state_dict(torch.load(path))
+gnn_model.load_state_dict(torch.load(path, weights_only=True))
```
627-633: Add parsed `mlp_units` to JSON metadata for inference.
Line 631 parses `mlp_units` from the string, but this parsed list is not stored in `json_dict`. While the string `"mlp"` is saved, prediction code may expect the parsed list. Add it to the JSON before saving:
```diff
 mlp_str = dict_best_hyper["mlp"]
 mlp_units = [] if mlp_str == "none" else [int(x) for x in mlp_str.split(",")]
 json_dict["num_outputs"] = len(self.devices) if self.figure_of_merit != "hellinger_distance" else 1
+json_dict["mlp_units"] = mlp_units  # Store parsed list for inference
```
607-607: Remove dead code: list comprehension result not assigned.
Line 607 computes `[dev.description for dev in self.devices]` but doesn't assign the result. This line should be removed, as the class labels are correctly stored at line 666. Apply this diff:
```diff
 else:  # task = "binary"
     task = "classification"
-    [dev.description for dev in self.devices]
 sampler_obj = TPESampler(n_startup_trials=10)
```
861-863: Use standard PyTorch Geometric attribute `num_nodes` instead of custom `num_gates`.
Line 861 creates a Data object with `num_gates=number_of_gates`, but PyTorch Geometric's `Data` class expects `num_nodes` as the standard attribute name. While custom attributes are permitted, using the standard name ensures compatibility with GNN operations that rely on this attribute. Apply this diff:
```diff
-feature_vector = Data(x=x, edge_index=edge_index, num_gates=number_of_gates).to(
+feature_vector = Data(x=x, edge_index=edge_index, num_nodes=number_of_gates).to(
     "cuda" if torch.cuda.is_available() else "cpu"
 )
```
696-701: Task handling in verbose evaluation always uses classification, ignoring regression tasks.
Line 698 unconditionally calls `evaluate_classification_model` even when `task == "regression"` (for Hellinger distance). This produces incorrect metrics for regression tasks. Branch on the `task` variable:
```diff
 if verbose:
     test_loader = DataLoader(training_data.X_test, batch_size=16, shuffle=False)
-    avg_loss_test, dict_results, _ = evaluate_classification_model(
-        model, test_loader, loss_fn=loss_fn, device=device, verbose=verbose
-    )
+    if task == "regression":
+        avg_loss_test, dict_results, _ = evaluate_regression_model(
+            model, test_loader, loss_fn=loss_fn, device=device, verbose=verbose
+        )
+    else:
+        avg_loss_test, dict_results, _ = evaluate_classification_model(
+            model, test_loader, loss_fn=loss_fn, device=device, verbose=verbose
+        )
     print(f"Test loss: {avg_loss_test:.4f}, {dict_results}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/mqt/predictor/ml/predictor.py (12 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/mqt/predictor/ml/predictor.py (2)
src/mqt/predictor/hellinger/utils.py (1)
- get_hellinger_model_path (135-144)
src/mqt/predictor/ml/helper.py (9)
- TrainingData (575-585)
- create_dag (192-301)
- create_feature_vector (164-189)
- evaluate_classification_model (323-402)
- evaluate_regression_model (405-469)
- get_openqasm3_gates (119-151)
- get_path_trained_model (48-50)
- get_path_trained_model_gnn (53-57)
- get_path_training_data (36-38)
🪛 Ruff (0.14.7)
src/mqt/predictor/ml/predictor.py
108-108: Boolean-typed positional argument in function definition
(FBT001)
108-108: Boolean default positional argument in function definition
(FBT002)
147-147: Logging statement uses f-string
(G004)
150-150: Logging statement uses f-string
(G004)
174-174: Boolean-typed positional argument in function definition
(FBT001)
174-174: Boolean default positional argument in function definition
(FBT002)
449-449: Boolean-typed positional argument in function definition
(FBT001)
449-449: Boolean default positional argument in function definition
(FBT002)
571-571: Boolean-typed positional argument in function definition
(FBT001)
571-571: Boolean default positional argument in function definition
(FBT002)
802-802: Boolean-typed positional argument in function definition
(FBT001)
802-802: Boolean default positional argument in function definition
(FBT002)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/mqt/predictor/ml/predictor.py (1)
171-190: Make `gnn` parameter keyword-only in `Predictor.__init__`.
The `gnn` parameter is currently positional, which triggers Ruff FBT001/FBT002. Making it keyword-only prevents accidental positional usage and maintains consistency with `setup_device_predictor`. Apply this diff:
```diff
 def __init__(
     self,
     devices: list[Target],
     figure_of_merit: figure_of_merit = "expected_fidelity",
-    gnn: bool = False,
+    *,
+    gnn: bool = False,
     logger_level: int = logging.INFO,
 ) -> None:
```
♻️ Duplicate comments (8)
src/mqt/predictor/_version.py (2)
9-11: Autogenerated `_version.py` should not be tracked in git (per header and prior review).
The header explicitly states "file generated by setuptools-scm" and "don't change, don't track in version control", but this file is committed and contains a concrete dev version and a `None` commit id. This will constantly drift from the real SCM state and cause noisy diffs and scanner findings. Strongly recommend:
- Remove `src/mqt/predictor/_version.py` from version control.
- Add this path to `.gitignore` (and/or configure `setuptools-scm`'s `write_to` as intended) so the file is only generated into sdists/wheels, not tracked in the repo.
This aligns with the earlier reviewer note that the file should not be tracked.
Also applies to: 36-39
21-27: Fix misuse of `TYPE_CHECKING` and unreachable type-alias branch.
`TYPE_CHECKING = False` makes the `if TYPE_CHECKING:` block unreachable at runtime, and it is not the standard sentinel (from `typing`) that type checkers special-case. This is what the "unreachable code" scanner warning is pointing at, and it adds indirection without benefit for such a small module. A simpler and clearer approach is to drop the conditional entirely and use regular runtime type aliases:
```diff
-from typing import TYPE_CHECKING
-
-if TYPE_CHECKING:
-    VERSION_TUPLE = tuple[int | str, ...]
-    COMMIT_ID = str | None
-else:
-    VERSION_TUPLE = object
-    COMMIT_ID = object
+from typing import Optional, Tuple, Union
+
+VERSION_TUPLE = Tuple[Union[int, str], ...]
+COMMIT_ID = Optional[str]
```
This removes unreachable code and keeps the annotations straightforward. Also ensure your `python_requires` matches the syntax level you rely on (PEP 604 unions etc.), or adjust the annotations accordingly.
Also applies to: 29-34
src/mqt/predictor/ml/helper.py (3)
306-320: `get_results_classes` still doesn't handle integer class label targets.
When targets are 1D or `[N, 1]` shaped integer class indices, `argmax(targets, dim=1)` will always return 0, corrupting accuracy metrics. Apply this diff to handle both one-hot and integer encodings:
```diff
 def get_results_classes(preds: torch.Tensor, targets: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
     """Return predicted and target class indices.

     Arguments:
         preds: model predictions
-        targets: ground truth targets
+        targets: ground truth targets (one-hot encoded or integer class labels)

     Returns:
         pred_idx: predicted class indices
         targets_idx: target class indices
     """
     pred_idx = torch.argmax(preds, dim=1)
-    targets_idx = torch.argmax(targets, dim=1)
+    if targets.dim() == 1:
+        targets_idx = targets.long()
+    elif targets.size(1) == 1:
+        targets_idx = targets.view(-1).long()
+    else:
+        targets_idx = torch.argmax(targets, dim=1)
     return pred_idx, targets_idx
```
325-405: Return type annotation mismatch.
Line 333 declares `dict[str, float]` but line 391 assigns a string to `metrics["classification_report"]`. This was flagged in a previous review. Apply this diff:
```diff
-) -> tuple[float, dict[str, float], tuple[np.ndarray, np.ndarray] | None]:
+) -> tuple[float, dict[str, float | str], tuple[np.ndarray, np.ndarray] | None]:
```
536-544: Validation evaluation calls hardcode `verbose=True`, causing noisy output.
Lines 539 and 543 pass `verbose=True` regardless of the outer function's `verbose` parameter. This was flagged in a previous review. Apply this diff:
```diff
 if task == "classification":
     val_loss, val_metrics, _ = evaluate_classification_model(
-        model, val_loader, loss_fn, device=str(device), verbose=True
+        model, val_loader, loss_fn, device=str(device), verbose=False
     )
 elif task == "regression":
     val_loss, val_metrics, _ = evaluate_regression_model(
-        model, val_loader, loss_fn, device=str(device), verbose=True
+        model, val_loader, loss_fn, device=str(device), verbose=False
     )
```
src/mqt/predictor/ml/predictor.py (3)
668-674: Redundant train/val split after data preparation already performed 70/30 split.
`_get_prepared_training_data()` at line 596 already performs a 70/30 train/test split. Lines 668-670 split `X_train` again (80/20), resulting in effective proportions of 56%/14%/30%. Consider using `training_data.X_test` as the validation set instead. This was flagged in a previous review. Apply this diff:
```diff
 optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
-x_train, x_val, _y_train, _y_val = train_test_split(
-    training_data.X_train, training_data.y_train, test_size=0.2, random_state=5
-)
 # Dataloader
-train_loader = DataLoader(x_train, batch_size=16, shuffle=True)
-
-val_loader = DataLoader(x_val, batch_size=16, shuffle=False)
+train_loader = DataLoader(training_data.X_train, batch_size=16, shuffle=True)
+val_loader = DataLoader(training_data.X_test, batch_size=16, shuffle=False)
```
680-688: Type mismatch: `device` is `torch.device` but `train_model` expects `str | None`.
Line 682 passes `device` (a `torch.device` object) to `train_model`, which expects `device: str | None`. This works via duck typing but is a type annotation mismatch. This was flagged in a previous review. Apply this diff:
```diff
 train_model(
     model,
     train_loader,
     optimizer,
     loss_fn,
     task=task,
     num_epochs=num_epochs,
-    device=device,
+    device=str(device),
     verbose=verbose,
     val_loader=val_loader,
     patience=30,
     min_delta=0.0,
     restore_best=True,
 )
```
626-655: Reduce code duplication in GNN instantiation.
The two GNN instantiation blocks are nearly identical, differing only in `output_dim`. This can be simplified. Apply this diff:
```diff
-json_dict["num_outputs"] = len(self.devices) if self.figure_of_merit != "hellinger_distance" else 1
-if self.figure_of_merit != "hellinger_distance":
-    model = GNN(
-        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
-        num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
-        hidden_dim=dict_best_hyper["hidden_dim"],
-        num_resnet_layers=dict_best_hyper["num_resnet_layers"],
-        mlp_units=mlp_units,
-        output_dim=len(self.devices),
-        dropout_p=dict_best_hyper["dropout"],
-        bidirectional=dict_best_hyper["bidirectional"],
-        use_sag_pool=dict_best_hyper["sag_pool"],
-        sag_ratio=0.7,
-        conv_activation=torch.nn.functional.leaky_relu,
-        mlp_activation=torch.nn.functional.leaky_relu,
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
-else:
-    model = GNN(
-        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
-        num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
-        hidden_dim=dict_best_hyper["hidden_dim"],
-        num_resnet_layers=dict_best_hyper["num_resnet_layers"],
-        mlp_units=mlp_units,
-        output_dim=1,
-        dropout_p=dict_best_hyper["dropout"],
-        bidirectional=dict_best_hyper["bidirectional"],
-        use_sag_pool=dict_best_hyper["sag_pool"],
-        sag_ratio=0.7,
-        conv_activation=torch.nn.functional.leaky_relu,
-        mlp_activation=torch.nn.functional.leaky_relu,
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
+num_outputs = 1 if self.figure_of_merit == "hellinger_distance" else len(self.devices)
+json_dict["num_outputs"] = num_outputs
+device_str = "cuda" if torch.cuda.is_available() else "cpu"
+model = GNN(
+    in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
+    num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
+    hidden_dim=dict_best_hyper["hidden_dim"],
+    num_resnet_layers=dict_best_hyper["num_resnet_layers"],
+    mlp_units=mlp_units,
+    output_dim=num_outputs,
+    dropout_p=dict_best_hyper["dropout"],
+    bidirectional=dict_best_hyper["bidirectional"],
+    use_sag_pool=dict_best_hyper["sag_pool"],
+    sag_ratio=0.7,
+    conv_activation=torch.nn.functional.leaky_relu,
+    mlp_activation=torch.nn.functional.leaky_relu,
+).to(device_str)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
src/mqt/predictor/_version.py (1 hunks)
src/mqt/predictor/ml/helper.py (4 hunks)
src/mqt/predictor/ml/predictor.py (12 hunks)
src/mqt/predictor/rl/predictorenv.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-04T06:59:40.314Z
Learnt from: MatthiasReumann
Repo: munich-quantum-toolkit/core PR: 1301
File: mlir/lib/Dialect/MQTOpt/Transforms/Transpilation/LayeredUnit.cpp:84-85
Timestamp: 2025-12-04T06:59:40.314Z
Learning: In the MQTOpt MLIR routing passes (NaiveRoutingPassSC, AStarRoutingPassSC), the input IR is guaranteed to contain only 1-qubit and 2-qubit gates. All 3+-qubit gates must be decomposed before routing; otherwise the input IR is invalid. This invariant allows skipTwoQubitBlock in LayeredUnit.cpp to safely assert wires.size() == 2.
Applied to files:
src/mqt/predictor/ml/helper.py
🧬 Code graph analysis (2)
src/mqt/predictor/ml/helper.py (1)
src/mqt/predictor/utils.py (1)
- calc_supermarq_features (94-146)
src/mqt/predictor/ml/predictor.py (1)
src/mqt/predictor/ml/helper.py (12)
- TrainingData (577-587)
- create_dag (192-303)
- create_feature_vector (164-189)
- evaluate_classification_model (325-404)
- evaluate_regression_model (407-471)
- get_openqasm3_gates (119-151)
- get_path_trained_model (48-50)
- get_path_trained_model_gnn (53-57)
- get_path_training_circuits (60-62)
- get_path_training_circuits_compiled (65-67)
- get_path_training_data (36-38)
- train_model (474-573)
🪛 Ruff (0.14.8)
src/mqt/predictor/ml/predictor.py
148-148: Logging statement uses f-string
(G004)
151-151: Logging statement uses f-string
(G004)
175-175: Boolean-typed positional argument in function definition
(FBT001)
175-175: Boolean default positional argument in function definition
(FBT002)
448-448: Boolean-typed positional argument in function definition
(FBT001)
448-448: Boolean default positional argument in function definition
(FBT002)
569-569: Boolean-typed positional argument in function definition
(FBT001)
569-569: Boolean default positional argument in function definition
(FBT002)
800-800: Boolean-typed positional argument in function definition
(FBT001)
800-800: Boolean default positional argument in function definition
(FBT002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 🐍 Test (ubuntu-24.04) / 🐍 ubuntu-24.04
- GitHub Check: 🐍 Test (macos-14) / 🐍 macos-14
- GitHub Check: 🐍 Test (windows-2022) / 🐍 windows-2022
🔇 Additional comments (13)
src/mqt/predictor/ml/helper.py (6)
18-26: LGTM on new imports. The added imports for numpy, torch, qiskit utilities, and sklearn metrics are appropriate for the new GNN and evaluation functionality.
53-58: LGTM on GNN path helper. Clean implementation using f-string and proper Path handling for the GNN checkpoint file.
119-152: LGTM on OpenQASM 3.0 gate list. The function includes a helpful reference to the specification URL and a maintenance note.
192-304: DAG featurization handles edge cases well.
The implementation correctly guards against:
- Empty edges (lines 267-270)
- Empty `topo_nodes` (lines 274-278)
One observation: `nodes` (line 234) and `topo_nodes` (line 273) come from different DAG methods, but critical path lookup at line 297 assumes all nodes in `nodes` exist in `dist_in`/`dist_out` (keyed by `topo_nodes`). This should be safe since both return operation nodes from the same DAG, but verify this assumption holds if Qiskit's DAG API changes.
405-472: LGTM on regression evaluator. Proper handling of edge cases including empty arrays and constant targets for R2 calculation.
576-587: LGTM on TrainingData type widening. The updated types properly support both classical numpy arrays and GNN graph data.
src/mqt/predictor/ml/predictor.py (7)
18-64: LGTM on new imports and type definitions. The imports and type aliases (`GNNSample`, `FeatureSample`, `TrainingSample`) are well-organized and support the new GNN functionality.
92-99: LGTM on TrainGNNKwargs TypedDict. Good use of `TypedDict` with `total=False` for optional GNN training configuration; a short sketch of the pattern follows.
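A minimal sketch of the `total=False` pattern; the field names below are hypothetical and may differ from the PR's actual `TrainGNNKwargs`:
```python
from typing import TypedDict
from typing_extensions import Unpack  # typing.Unpack on Python >= 3.12

class TrainGNNKwargs(TypedDict, total=False):
    # With total=False, every key is optional.
    num_epochs: int
    batch_size: int
    verbose: bool

def train_gnn_model(**gnn_kwargs: Unpack[TrainGNNKwargs]) -> None:
    epochs = gnn_kwargs.get("num_epochs", 10)  # key may be absent
    print(f"training for {epochs} epochs")

train_gnn_model(num_epochs=5)  # passing only a subset is valid
```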
329-342: LGTM on GNN training data generation. The use of canonical device ordering via `[scores.get(dev.description, -1.0) for dev in self.devices]` ensures consistent label ordering between training and inference.
502-506: LGTM on k_folds clamping. The `k_folds = max(2, k_folds)` guard prevents `KFold(n_splits=1)`, which would raise an error.
657-660: LGTM on class labels persistence. Storing `json_dict["class_labels"]` ensures device ordering is preserved for inference.
689-699: LGTM on task-aware verbose evaluation. The branching on `task` for evaluation (regression vs classification) is correct.
835-873: LGTM on GNN prediction path. Key improvements addressed from previous reviews:
- `mlp_units` parsed from `json_dict["mlp"]` (line 841)
- `class_labels` loaded from `json_dict["class_labels"]` (line 863)
- `torch.load(path, weights_only=True)` for security (line 857)
- Batch dimension squeezed before length check (line 866)
```python
def objective(
    self,
    trial: optuna.Trial,
    dataset: NDArray[np.float64] | list[torch_geometric.data.Data],
    task: str,
    in_feats: int,
    num_outputs: int,
    loss_fn: nn.Module,
    k_folds: int,
    batch_size: int = 32,
    num_epochs: int = 10,
    patience: int = 10,
    verbose: bool = False,
    device: str | None = None,
) -> float:
```
🧹 Nitpick | 🔵 Trivial
Make verbose parameter keyword-only.
The verbose: bool = False parameter at line 448 triggers Ruff FBT001/FBT002. Consider making it keyword-only for consistency.
Apply this diff:
```diff
 def objective(
     self,
     trial: optuna.Trial,
     dataset: NDArray[np.float64] | list[torch_geometric.data.Data],
     task: str,
     in_feats: int,
     num_outputs: int,
     loss_fn: nn.Module,
     k_folds: int,
     batch_size: int = 32,
     num_epochs: int = 10,
     patience: int = 10,
-    verbose: bool = False,
     device: str | None = None,
+    *,
+    verbose: bool = False,
 ) -> float:
```
🧰 Tools
🪛 Ruff (0.14.8)
448-448: Boolean-typed positional argument in function definition
(FBT001)
448-448: Boolean default positional argument in function definition
(FBT002)
🤖 Prompt for AI Agents
In src/mqt/predictor/ml/predictor.py around lines 436 to 450, make the verbose
parameter keyword-only by inserting a lone asterisk (*) immediately before the
verbose parameter in the objective function signature; this ensures verbose
cannot be passed positionally while leaving the other parameters and defaults
unchanged.
```diff
 def predict_device_for_figure_of_merit(
-    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity"
+    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity", gnn: bool = False
 ) -> Target:
```
🧹 Nitpick | 🔵 Trivial
Consider making gnn parameter keyword-only.
The gnn: bool = False parameter triggers Ruff FBT001/FBT002. Making it keyword-only maintains consistency with setup_device_predictor.
Apply this diff:
```diff
 def predict_device_for_figure_of_merit(
-    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity", gnn: bool = False
+    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity", *, gnn: bool = False
 ) -> Target:
```
🧰 Tools
) -> Target:🧰 Tools
🪛 Ruff (0.14.8)
800-800: Boolean-typed positional argument in function definition
(FBT001)
800-800: Boolean default positional argument in function definition
(FBT002)
🤖 Prompt for AI Agents
In src/mqt/predictor/ml/predictor.py around lines 799-801, the gnn parameter
should be keyword-only to satisfy Ruff FBT001/FBT002; update the function
signature to place a bare * before gnn (e.g., def
predict_device_for_figure_of_merit(qc: Path | QuantumCircuit, figure_of_merit:
figure_of_merit = "expected_fidelity", *, gnn: bool = False) -> Target) and then
update any call sites that pass gnn positionally to use the gnn=... keyword.
```python
# added replace_implicit_swaps=True for a warning to make sure that implicit swaps are also replaced
altered_qc = tk_to_qiskit(tket_qc, replace_implicit_swaps=True)
```
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🌐 Web query:
pytket-qiskit tk_to_qiskit replace_implicit_swaps parameter
💡 Result:
The tk_to_qiskit(..., replace_implicit_swaps=False) parameter (default False) tells the converter whether to replace pytket "implicit" qubit permutations with explicit SWAP gates before converting to Qiskit. If True, tk_to_qiskit calls Circuit.replace_implicit_wire_swaps() (so implicit wire-swaps are implemented as SWAP gates); if False a warning may be emitted when implicit permutations are present (controlled by the perm_warning flag). [1][2]
References:
- pytket-qiskit tk_to_qiskit docstring. [1]
- Implementation / Circuit.replace_implicit_wire_swaps() in pytket docs. [2]
Improve comment clarity for better code documentation.
The parameter replace_implicit_swaps=True is valid and well-documented in pytket-qiskit. When set to True, it converts implicit qubit permutations to explicit SWAP gates before converting to Qiskit. The current comment could be clearer about this intent. Consider revising to:
```diff
-# added replace_implicit_swaps=True for a warning to make sure that implicit swaps are also replaced
+# Replace implicit wire swaps with explicit SWAP gates during conversion to Qiskit
 altered_qc = tk_to_qiskit(tket_qc, replace_implicit_swaps=True)
```
🤖 Prompt for AI Agents
In src/mqt/predictor/rl/predictorenv.py around lines 382-383, the inline comment
for replace_implicit_swaps=True is vague; update it to explicitly state that
setting replace_implicit_swaps=True tells pytket-qiskit to convert implicit
qubit permutations into explicit SWAP gates before conversion to Qiskit so those
qubit mappings are preserved in the resulting Qiskit circuit, and mention that
this parameter is provided by pytket-qiskit for that purpose.
Actionable comments posted: 2
♻️ Duplicate comments (5)
src/mqt/predictor/_version.py (1)
1-39: This auto-generated file should not be tracked in version control.
The file header explicitly states "don't change, don't track in version control" (line 10). This file is generated by `setuptools-scm` at build time and should be added to `.gitignore` rather than committed. Additionally, lines 21-24 contain unreachable code: `TYPE_CHECKING = False` is hardcoded, making the `if TYPE_CHECKING:` block dead code.
src/mqt/predictor/ml/helper.py (2)
306-319: `get_results_classes` will corrupt target labels when targets are 1D or [N, 1] shaped.
The function applies `argmax(targets, dim=1)` unconditionally, which works for one-hot encoded targets but silently corrupts integer class labels. When targets are `[N, 1]` shaped (as done at lines 362-363 in `evaluate_classification_model`), `argmax` on dimension 1 returns all zeros regardless of actual values. Make the function robust to both encodings:
```diff
 def get_results_classes(preds: torch.Tensor, targets: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
     """Return predicted and target class indices.

     Arguments:
         preds: model predictions
-        targets: ground truth targets
+        targets: ground truth targets (one-hot encoded or integer class labels)

     Returns:
         pred_idx: predicted class indices
         targets_idx: target class indices
     """
     pred_idx = torch.argmax(preds, dim=1)
-    targets_idx = torch.argmax(targets, dim=1)
+    if targets.dim() == 1:
+        targets_idx = targets.long()
+    elif targets.size(1) == 1:
+        targets_idx = targets.view(-1).long()
+    else:
+        targets_idx = torch.argmax(targets, dim=1)
     return pred_idx, targets_idx
```
536-544: Validation evaluation hardcodes `verbose=True`, causing noisy output.
Lines 539 and 543 pass `verbose=True` to the evaluation functions regardless of the outer `verbose` parameter. This produces excessive output during training when the caller intended quiet operation.
```diff
 if task == "classification":
     val_loss, val_metrics, _ = evaluate_classification_model(
-        model, val_loader, loss_fn, device=str(device), verbose=True
+        model, val_loader, loss_fn, device=str(device), verbose=False
     )
 elif task == "regression":
     val_loss, val_metrics, _ = evaluate_regression_model(
-        model, val_loader, loss_fn, device=str(device), verbose=True
+        model, val_loader, loss_fn, device=str(device), verbose=False
     )
```
src/mqt/predictor/ml/predictor.py (2)
626-655: GNN instantiation is duplicated; consolidate using an `output_dim` variable.
The two GNN instantiation blocks (lines 627-640 and 642-655) are nearly identical, differing only in `output_dim`. This was flagged in previous reviews but remains unaddressed.
```diff
 json_dict["num_outputs"] = len(self.devices) if self.figure_of_merit != "hellinger_distance" else 1
-if self.figure_of_merit != "hellinger_distance":
-    model = GNN(
-        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
-        ...
-        output_dim=len(self.devices),
-        ...
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
-else:
-    model = GNN(
-        ...
-        output_dim=1,
-        ...
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
+output_dim = 1 if self.figure_of_merit == "hellinger_distance" else len(self.devices)
+model = GNN(
+    in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
+    num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
+    hidden_dim=dict_best_hyper["hidden_dim"],
+    num_resnet_layers=dict_best_hyper["num_resnet_layers"],
+    mlp_units=mlp_units,
+    output_dim=output_dim,
+    dropout_p=dict_best_hyper["dropout"],
+    bidirectional=dict_best_hyper["bidirectional"],
+    use_sag_pool=dict_best_hyper["sag_pool"],
+    sag_ratio=0.7,
+    conv_activation=torch.nn.functional.leaky_relu,
+    mlp_activation=torch.nn.functional.leaky_relu,
+).to("cuda" if torch.cuda.is_available() else "cpu")
```
668-688: Redundant train/val split and device type mismatch.
Double split: `_get_prepared_training_data()` already performs a 70/30 train/test split (lines 783-785). Lines 668-670 then split `X_train` again (80/20), resulting in effective 56%/14%/30% train/val/test proportions. Consider using `training_data.X_test` as the validation set instead.
Type mismatch: Line 682 passes `device` (a `torch.device` object) to `train_model`, but `train_model` expects `device: str | None`. This works due to duck typing but is technically incorrect.
```diff
 optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
-x_train, x_val, _y_train, _y_val = train_test_split(
-    training_data.X_train, training_data.y_train, test_size=0.2, random_state=5
-)
-train_loader = DataLoader(x_train, batch_size=16, shuffle=True)
-val_loader = DataLoader(x_val, batch_size=16, shuffle=False)
+train_loader = DataLoader(training_data.X_train, batch_size=16, shuffle=True)
+val_loader = DataLoader(training_data.X_test, batch_size=16, shuffle=False)
 train_model(
     model,
     train_loader,
     optimizer,
     loss_fn,
     task=task,
     num_epochs=num_epochs,
-    device=device,
+    device=str(device),
     verbose=verbose,
     ...
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
src/mqt/predictor/_version.py (1 hunks)
src/mqt/predictor/ml/helper.py (4 hunks)
src/mqt/predictor/ml/predictor.py (12 hunks)
src/mqt/predictor/rl/predictorenv.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-04T06:59:40.314Z
Learnt from: MatthiasReumann
Repo: munich-quantum-toolkit/core PR: 1301
File: mlir/lib/Dialect/MQTOpt/Transforms/Transpilation/LayeredUnit.cpp:84-85
Timestamp: 2025-12-04T06:59:40.314Z
Learning: In the MQTOpt MLIR routing passes (NaiveRoutingPassSC, AStarRoutingPassSC), the input IR is guaranteed to contain only 1-qubit and 2-qubit gates. All 3+-qubit gates must be decomposed before routing; otherwise the input IR is invalid. This invariant allows skipTwoQubitBlock in LayeredUnit.cpp to safely assert wires.size() == 2.
Applied to files:
src/mqt/predictor/ml/helper.py
🧬 Code graph analysis (1)
src/mqt/predictor/ml/helper.py (1)
src/mqt/predictor/utils.py (1)
- calc_supermarq_features (94-146)
🪛 Ruff (0.14.8)
src/mqt/predictor/ml/predictor.py
148-148: Logging statement uses f-string
(G004)
151-151: Logging statement uses f-string
(G004)
175-175: Boolean-typed positional argument in function definition
(FBT001)
175-175: Boolean default positional argument in function definition
(FBT002)
448-448: Boolean-typed positional argument in function definition
(FBT001)
448-448: Boolean default positional argument in function definition
(FBT002)
569-569: Boolean-typed positional argument in function definition
(FBT001)
569-569: Boolean default positional argument in function definition
(FBT002)
800-800: Boolean-typed positional argument in function definition
(FBT001)
800-800: Boolean default positional argument in function definition
(FBT002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 🐍 Test (ubuntu-24.04) / 🐍 ubuntu-24.04
- GitHub Check: 🐍 Test (macos-14) / 🐍 macos-14
- GitHub Check: 🐍 Test (windows-2022) / 🐍 windows-2022
🔇 Additional comments (6)
src/mqt/predictor/rl/predictorenv.py (1)
382-383: LGTM! Explicit implicit swap handling is appropriate.
The `replace_implicit_swaps=True` parameter ensures that implicit swaps from Tket circuits are explicitly materialized in the Qiskit circuit, preventing potential layout/routing inconsistencies downstream. Minor note: the comment mentions "for a warning" but no warning is actually emitted. Consider either removing that phrase or adding a `warnings.warn()` call if you want to notify users when implicit swaps are replaced, as in the sketch below.
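A minimal sketch of that optional warning; the wording and placement are illustrative only:
```python
import warnings

from pytket import Circuit
from pytket.extensions.qiskit import tk_to_qiskit

tket_qc = Circuit(2).H(0).CX(0, 1)  # toy circuit for illustration
altered_qc = tk_to_qiskit(tket_qc, replace_implicit_swaps=True)
# Hypothetical notification; the repository currently emits no warning here.
warnings.warn("Implicit wire swaps were replaced with explicit SWAP gates.", stacklevel=2)
```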
src/mqt/predictor/ml/helper.py (2)
192-303: LGTM! DAG featurization handles edge cases correctly.
The `create_dag` function properly handles:
- Empty edges (lines 267-270): creates `torch.empty((2, 0), dtype=torch.long)` for edge-less graphs
- Empty `topo_nodes` (lines 274-278): returns node features with zero critical flags
- Unknown gates: raises a clear `ValueError` with the gate name
The feature layout (one-hot gate, sin/cos params, arity, controls, num_params, critical_flag, fan_in, fan_out) is well-documented. The empty-edge guard is sketched below.
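A runnable sketch of the empty-edge guard described above, assuming only the torch / torch_geometric APIs:
```python
import torch
from torch_geometric.data import Data

edges: list[tuple[int, int]] = []  # e.g. a single-gate circuit has no DAG edges

# An edge-less graph still needs an edge_index of shape (2, 0), not (0,).
edge_index = (
    torch.tensor(edges, dtype=torch.long).t().contiguous()
    if edges
    else torch.empty((2, 0), dtype=torch.long)
)
data = Data(x=torch.zeros((1, 4)), edge_index=edge_index, num_nodes=1)
print(data)  # Data(x=[1, 4], edge_index=[2, 0], num_nodes=1)
```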
576-587: LGTM! TrainingData type widening supports both classical and GNN paths.
The widened types for `X_train`/`X_test` to support both numpy arrays and `torch_geometric.data.Data` lists, along with `y_train`/`y_test` supporting both numpy arrays and torch Tensors, correctly accommodate the dual RF/GNN training pipeline.
src/mqt/predictor/ml/predictor.py (3)
83-86: LGTM! Type aliases improve code readability.
The `GNNSample`, `FeatureSample`, and `TrainingSample` union types clearly document the expected data structures for both training paths.
835-873: LGTM! GNN prediction path handles model loading and inference correctly.
The implementation properly:
- Loads hyperparameters from JSON metadata (lines 837-838)
- Parses `mlp_units` from the stored string format (lines 840-841)
- Uses `weights_only=True` for secure model loading (line 857)
- Removes the batch dimension with `squeeze(0)` (line 866)
- Retrieves class labels from JSON metadata (line 863)
- Validates output/label length consistency (lines 868-870)
329-342: LGTM! GNN training sample uses canonical device ordering.
The code correctly uses `[scores.get(dev.description, -1.0) for dev in self.devices]` (line 331) to ensure device score ordering matches `self.devices`, which is sorted at initialization. This addresses the previous concern about ordering consistency between training and inference.
```python
    self,
    devices: list[Target],
    figure_of_merit: figure_of_merit = "expected_fidelity",
    gnn: bool = False,
```
🧹 Nitpick | 🔵 Trivial
Consider making gnn parameter keyword-only to avoid accidental positional usage.
The gnn boolean parameter at line 175 (and similarly at lines 448, 569, 800) is a positional argument, which Ruff flags with FBT001/FBT002. While setup_device_predictor correctly uses keyword-only (*, at line 107), the Predictor.__init__ and other methods do not.
```diff
 def __init__(
     self,
     devices: list[Target],
     figure_of_merit: figure_of_merit = "expected_fidelity",
+    *,
     gnn: bool = False,
     logger_level: int = logging.INFO,
 ) -> None:
```
Similarly for predict_device_for_figure_of_merit:
```diff
 def predict_device_for_figure_of_merit(
-    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity", gnn: bool = False
+    qc: Path | QuantumCircuit, figure_of_merit: figure_of_merit = "expected_fidelity", *, gnn: bool = False
 ) -> Target:
```
🪛 Ruff (0.14.8)
175-175: Boolean-typed positional argument in function definition
(FBT001)
175-175: Boolean default positional argument in function definition
(FBT002)
🤖 Prompt for AI Agents
In src/mqt/predictor/ml/predictor.py around line 175 (and also update signatures
at lines 448, 569, 800 and for predict_device_for_figure_of_merit), the boolean
parameter `gnn` is currently positional; change the function/method signatures
so `gnn` becomes a keyword-only parameter (e.g., introduce a `*` marker before
the keyword-only params or move `gnn` after an existing `*`), and update any
callers that relied on positional ordering to pass `gnn=` explicitly; ensure
signature consistency across Predictor.__init__ and the other mentioned methods
to satisfy Ruff FBT001/FBT002.
```python
# Prepare data
if training_data is None:
    training_data = self._get_prepared_training_data()
number_in_features = int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1)
```
🧹 Nitpick | 🔵 Trivial
Magic number for in_feats is error-prone and duplicated.
The formula int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1) appears multiple times (lines 597, 628, 643, 844) with slight variations and no documentation of what each component represents.
Consider extracting this to a helper function or constant:
```python
def get_gnn_input_features() -> int:
    """Calculate GNN input feature dimension.

    Components:
    - len(get_openqasm3_gates()) + 1: one-hot gate encoding (including 'measure')
    - 6: sin/cos for up to 3 parameters
    - 1: arity
    - 1: controls
    - 1: num_params
    - 1: critical_flag
    - 2: fan_in, fan_out
    """
    return len(get_openqasm3_gates()) + 1 + 6 + 1 + 1 + 1 + 1 + 2
```
Note: The current formula sums to 40, but create_dag concatenates: [onehots(28), params(6), arity(1), controls(1), num_params(1), critical_flag(1), fan_in(1), fan_out(1)] = 40 features. Verify this matches.
🤖 Prompt for AI Agents
In src/mqt/predictor/ml/predictor.py around line 597 (and also update
occurrences at ~628, ~643, ~844), the input-feature dimension is expressed as a
hard-to-read duplicated magic-number expression int(len(get_openqasm3_gates()) +
1 + 6 + 3 + 1 + 1 + 1); replace these with a single well-documented helper
function or constant (e.g. get_gnn_input_features()) that computes the sum with
named components (one-hot gate size, sin/cos params, arity, controls,
num_params, critical_flag, fan_in, fan_out), add a docstring listing each
component and its value, use that helper everywhere the magic expression
appears, and update any off-by-one discrepancies by verifying the helper result
matches the actual concatenation order in create_dag (onehots, params, arity,
controls, num_params, critical_flag, fan_in, fan_out).
Actionable comments posted: 1
♻️ Duplicate comments (7)
tests/hellinger_distance/test_estimated_hellinger_distance.py (3)
294-319: Cleanup test still exists instead of using fixture teardown.
Past reviews recommended converting `test_remove_files` to a module- or session-scoped autouse fixture with yield-based teardown to ensure cleanup runs regardless of test ordering or parallelization. This suggestion was previously raised, but the cleanup is still implemented as a test function; a fuller sketch of the proposed fixture follows.
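A self-contained sketch of the proposed teardown fixture, assuming `source_path` and `target_path` are themselves module-scoped fixtures; the suffix filter mirrors the existing cleanup test:
```python
from pathlib import Path

import pytest

@pytest.fixture(scope="module", autouse=True)
def cleanup_test_artifacts(source_path: Path, target_path: Path):
    """Remove files created during testing, regardless of test ordering."""
    yield  # run all tests in the module first
    for directory in (source_path, target_path):
        if directory.exists():
            for file in directory.iterdir():
                if file.suffix == ".qasm":
                    file.unlink()
            directory.rmdir()
```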
217-220: Same FBT001 issue as the previous test.
Apply the same parametrization fix here for consistency:
```diff
-@pytest.mark.parametrize("gnn", [False, True], ids=["rf", "gnn"])
-def test_train_and_qcompile_with_hellinger_model(
-    source_path: Path, target_path: Path, device: Target, gnn: bool
-) -> None:
+@pytest.mark.parametrize("model_type", ["rf", "gnn"])
+def test_train_and_qcompile_with_hellinger_model(
+    source_path: Path, target_path: Path, device: Target, model_type: str
+) -> None:
+    gnn = model_type == "gnn"
```
157-159: Boolean parameter remains despite past review addressing FBT001.
Past reviews suggested changing the `gnn: bool` parameter to a string-based parametrization (e.g., `model_type: str` with values `"rf"` and `"gnn"`), and this was marked as addressed in commit 1acc0cb. However, the current code still uses `gnn: bool`, triggering Ruff FBT001. If the team decided to keep the boolean parameter, consider silencing the warning in the linter configuration. Otherwise, apply the previously suggested fix:
```diff
-@pytest.mark.parametrize("gnn", [False, True], ids=["rf", "gnn"])
-def test_train_model_and_predict(device: Target, gnn: bool) -> None:
+@pytest.mark.parametrize("model_type", ["rf", "gnn"])
+def test_train_model_and_predict(device: Target, model_type: str) -> None:
+    gnn = model_type == "gnn"
```
src/mqt/predictor/ml/predictor.py (4)
436-450: Make `verbose` parameter keyword-only for consistency.
The `verbose: bool = False` parameter triggers Ruff FBT001/FBT002. A past review suggested making it keyword-only, which would align with the rest of the codebase and prevent accidental positional usage.
```diff
 def objective(
     self,
     trial: optuna.Trial,
     dataset: NDArray[np.float64] | list[torch_geometric.data.Data],
     task: str,
     in_feats: int,
     num_outputs: int,
     loss_fn: nn.Module,
     k_folds: int,
     batch_size: int = 32,
     num_epochs: int = 10,
     patience: int = 10,
-    verbose: bool = False,
     device: str | None = None,
+    *,
+    verbose: bool = False,
 ) -> float:
```
598-598: Magic number for GNN input features still duplicated.
Past reviews identified that the expression `int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1)` appears at lines 598, 629, 644, and 845 without documentation, and suggested extracting it to a helper function like `get_gnn_input_features()` with a docstring explaining each component (one-hot gates, sin/cos params, arity, controls, num_params, critical_flag, fan_in, fan_out).
175-175: Consider keyword-only boolean parameters for consistency.
The module has inconsistent handling of the `gnn` boolean parameter:
- `setup_device_predictor` (line 107) correctly uses keyword-only (`*, gnn: bool = False`)
- `Predictor.__init__` (line 175), `objective` (line 448), `train_gnn_model` (line 570), and `predict_device_for_figure_of_merit` (line 801) use positional boolean parameters
For consistency and to address Ruff FBT001/FBT002, consider making all boolean parameters keyword-only:
```diff
 def __init__(
     self,
     devices: list[Target],
     figure_of_merit: figure_of_merit = "expected_fidelity",
+    *,
     gnn: bool = False,
     logger_level: int = logging.INFO,
 ) -> None:
```
Apply similarly to other methods.
669-675: Redundant train/validation split persists after previous review.
Past reviews (marked as addressed in commits 0992bd0 to 77f4889) identified that `_get_prepared_training_data` already performs a 70/30 train/test split, and the additional 80/20 split at lines 669-671 results in unintended effective proportions (56%/14%/30%). The recommendation was to use `training_data.X_test` as the validation set.
```diff
 optimizer = torch.optim.Adam(model.parameters(), lr=dict_best_hyper["lr"])
-# Train-validation split (needed for early stopping and hyperparameter optimization)
-x_train, x_val, _y_train, _y_val = train_test_split(
-    training_data.X_train, training_data.y_train, test_size=0.2, random_state=5
-)
 # Dataloader
-train_loader = DataLoader(x_train, batch_size=16, shuffle=True)
-
-val_loader = DataLoader(x_val, batch_size=16, shuffle=False)
+train_loader = DataLoader(training_data.X_train, batch_size=16, shuffle=True)
+val_loader = DataLoader(training_data.X_test, batch_size=16, shuffle=False)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/mqt/predictor/ml/predictor.py (12 hunks)
tests/hellinger_distance/test_estimated_hellinger_distance.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
tests/hellinger_distance/test_estimated_hellinger_distance.py
158-158: Boolean-typed positional argument in function definition
(FBT001)
219-219: Boolean-typed positional argument in function definition
(FBT001)
src/mqt/predictor/ml/predictor.py
148-148: Logging statement uses f-string
(G004)
151-151: Logging statement uses f-string
(G004)
175-175: Boolean-typed positional argument in function definition
(FBT001)
175-175: Boolean default positional argument in function definition
(FBT002)
448-448: Boolean-typed positional argument in function definition
(FBT001)
448-448: Boolean default positional argument in function definition
(FBT002)
570-570: Boolean-typed positional argument in function definition
(FBT001)
570-570: Boolean default positional argument in function definition
(FBT002)
801-801: Boolean-typed positional argument in function definition
(FBT001)
801-801: Boolean default positional argument in function definition
(FBT002)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 🐍 Test (macos-14) / 🐍 macos-14
- GitHub Check: 🐍 Test (windows-2022) / 🐍 windows-2022
- GitHub Check: 🐍 Test (ubuntu-24.04) / 🐍 ubuntu-24.04
🔇 Additional comments (1)
tests/hellinger_distance/test_estimated_hellinger_distance.py (1)
255-266: Windows timeout warning logic appears inconsistent.
The test expects a Windows timeout warning only when `gnn=False` (line 255: `if sys.platform == "win32" and not gnn`). However, both branches pass the same `timeout=600` parameter to `compile_training_circuits`. Why would the warning behavior differ based on the GNN flag? If this is intentional (e.g., GNN compilation doesn't use timeouts), please add a comment explaining the rationale. Otherwise, consider whether the condition should be `if sys.platform == "win32"` without the `and not gnn` clause.
```python
if self.figure_of_merit != "hellinger_distance":
    model = GNN(
        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
        num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
        hidden_dim=dict_best_hyper["hidden_dim"],
        num_resnet_layers=dict_best_hyper["num_resnet_layers"],
        mlp_units=mlp_units,
        output_dim=len(self.devices),
        dropout_p=dict_best_hyper["dropout"],
        bidirectional=dict_best_hyper["bidirectional"],
        use_sag_pool=dict_best_hyper["sag_pool"],
        sag_ratio=0.7,
        conv_activation=torch.nn.functional.leaky_relu,
        mlp_activation=torch.nn.functional.leaky_relu,
    ).to("cuda" if torch.cuda.is_available() else "cpu")
else:
    model = GNN(
        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
        num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
        hidden_dim=dict_best_hyper["hidden_dim"],
        num_resnet_layers=dict_best_hyper["num_resnet_layers"],
        mlp_units=mlp_units,
        output_dim=1,
        dropout_p=dict_best_hyper["dropout"],
        bidirectional=dict_best_hyper["bidirectional"],
        use_sag_pool=dict_best_hyper["sag_pool"],
        sag_ratio=0.7,
        conv_activation=torch.nn.functional.leaky_relu,
        mlp_activation=torch.nn.functional.leaky_relu,
    ).to("cuda" if torch.cuda.is_available() else "cpu")
```
GNN instantiation duplication persists despite past review.
Past reviews (marked as addressed in commits 2a9457d to 77cca31) suggested consolidating the two GNN instantiation blocks that differ only in output_dim. However, the duplication remains in the current code.
```diff
-if self.figure_of_merit != "hellinger_distance":
-    model = GNN(
-        in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
-        num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
-        ...
-        output_dim=len(self.devices),
-        ...
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
-else:
-    model = GNN(
-        ...
-        output_dim=1,
-        ...
-    ).to("cuda" if torch.cuda.is_available() else "cpu")
+output_dim = 1 if self.figure_of_merit == "hellinger_distance" else len(self.devices)
+device_str = "cuda" if torch.cuda.is_available() else "cpu"
+model = GNN(
+    in_feats=int(len(get_openqasm3_gates()) + 1 + 6 + 3 + 1 + 1 + 1),
+    num_conv_wo_resnet=dict_best_hyper["num_conv_wo_resnet"],
+    hidden_dim=dict_best_hyper["hidden_dim"],
+    num_resnet_layers=dict_best_hyper["num_resnet_layers"],
+    mlp_units=mlp_units,
+    output_dim=output_dim,
+    dropout_p=dict_best_hyper["dropout"],
+    bidirectional=dict_best_hyper["bidirectional"],
+    use_sag_pool=dict_best_hyper["sag_pool"],
+    sag_ratio=0.7,
+    conv_activation=torch.nn.functional.leaky_relu,
+    mlp_activation=torch.nn.functional.leaky_relu,
+).to(device_str)
```
Description
This PR introduces a Graph Neural Network (GNN) as an alternative to the Random Forest model for predicting the best device to run a quantum circuit.
To support this, the preprocessing pipeline was redesigned: instead of manually extracting features from the circuit, the model now directly takes as input the Directed Acyclic Graph (DAG) representation of the quantum circuit.
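For illustration, here is a minimal, self-contained sketch of circuit-to-graph preprocessing in this spirit, using Qiskit's `circuit_to_dag` and a toy one-hot gate encoding; it is not the PR's actual `create_dag` in `src/mqt/predictor/ml/helper.py`, which computes a much richer feature layout:
```python
import torch
from qiskit import QuantumCircuit
from qiskit.converters import circuit_to_dag
from torch_geometric.data import Data

GATE_VOCAB = ["h", "cx", "rz", "measure"]  # assumed toy vocabulary

def circuit_to_graph(qc: QuantumCircuit) -> Data:
    dag = circuit_to_dag(qc)
    nodes = list(dag.topological_op_nodes())
    index = {node: i for i, node in enumerate(nodes)}
    # One-hot encode each gate name as the node feature.
    x = torch.zeros((len(nodes), len(GATE_VOCAB)))
    for node, i in index.items():
        x[i, GATE_VOCAB.index(node.op.name)] = 1.0
    # One edge per DAG dependency between two operation nodes
    # (input/output wire nodes are filtered out).
    edges = [
        (index[src], index[dst])
        for src, dst, _wire in dag.edges()
        if src in index and dst in index
    ]
    edge_index = (
        torch.tensor(edges, dtype=torch.long).t().contiguous()
        if edges
        else torch.empty((2, 0), dtype=torch.long)
    )
    return Data(x=x, edge_index=edge_index, num_nodes=len(nodes))

qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)
print(circuit_to_graph(qc))  # Data(x=[2, 4], edge_index=[2, 1], num_nodes=2)
```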
🚀 Major Changes
Graph Neural Network Integration
🎯 Motivation
🔧 Fixes and Enhancements
📦 Dependency Updates
- optuna>=4.5.0
- torch-geometric>=2.6.1
Checklist: