@ericspod ericspod commented Nov 23, 2025

Description

This is an attempt to create a slim Docker image that is smaller than the current one, to avoid running out of space during testing. Various fixes are included to account for test failures within the image. These all appear to be either real issues that need to be addressed (e.g. ONNX export) or fixes that should be integrated either way.

This excludes PyTorch 2.9 from the requirements for now to avoid legacy issues with ONNX, TorchScript, and other things. MONAI needs to be updated for PyTorch 2.9 support, specifically by dropping the use of TorchScript in places, as it is becoming obsolete in favour of torch.export.
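
As a rough illustration of the exclusion described above, the following sketch checks whether an installed version string falls in the excluded range. The helper names are hypothetical, and the assumption that every release from 2.9 upward is excluded (rather than only the 2.9 series) is mine; the actual requirement specifier used in this PR may differ.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '2.8.0+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_excluded(version: str, first_excluded: tuple[int, int] = (2, 9)) -> bool:
    """Assumption: every release at or above `first_excluded` is excluded."""
    # Tuples compare numerically, so (2, 10) is correctly treated as newer than (2, 9).
    return major_minor(version) >= first_excluded


if __name__ == "__main__":
    for candidate in ("2.8.0+cu128", "2.9.0", "2.10.1"):
        print(candidate, "excluded" if is_excluded(candidate) else "allowed")
```

In an environment with PyTorch installed, the same check could be applied to `torch.__version__`.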

Some tests fail without enough shared memory. The command I'm using to run the tests with GPUs 0 and 1 is: docker run -ti --rm --gpus '"device=0,1"' --shm-size=10gb -v $(pwd)/tests:/opt/monai/tests monai_slim /bin/bash
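
For anyone scripting this, here is a minimal sketch that assembles the same docker run invocation as an argument list. The function name and its parameters are hypothetical; only the flags, the mount target, and the image name monai_slim come from the command above.

```python
import shlex
from pathlib import Path


def docker_run_args(gpu_ids, shm_size="10gb", tests_dir="tests", image="monai_slim"):
    """Build the argument list for the docker run command quoted above."""
    # Docker expects the device list wrapped in literal double quotes,
    # which is why the shell form is written as '"device=0,1"'.
    device_spec = '"device=' + ",".join(str(i) for i in gpu_ids) + '"'
    host_tests = Path(tests_dir).resolve()  # stands in for $(pwd)/tests
    return [
        "docker", "run", "-ti", "--rm",
        "--gpus", device_spec,
        f"--shm-size={shm_size}",
        "-v", f"{host_tests}:/opt/monai/tests",
        image, "/bin/bash",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command; pass the list to subprocess.run to execute it.
    print(shlex.join(docker_run_args([0, 1])))
```

Keeping the arguments as a list avoids a second layer of shell quoting when the command is launched through `subprocess.run`.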

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

ericspod and others added 13 commits July 16, 2025 18:03
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <eric.kerfoot@gmail.com>
…n the slim Docker image, all of which appear to be real errors.

Signed-off-by: Eric Kerfoot <eric.kerfoot@gmail.com>
…ply.github.com>

I, Eric Kerfoot <17726042+ericspod@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 566c2bc

Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
I, Eric Kerfoot <eric.kerfoot@kcl.ac.uk>, hereby add my Signed-off-by to this commit: 510987d

Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
@Project-MONAI Project-MONAI deleted a comment from coderabbitai bot Nov 23, 2025
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
coderabbitai bot commented Nov 23, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


ericspod and others added 4 commits December 4, 2025 23:25
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
Signed-off-by: Eric Kerfoot <17726042+ericspod@users.noreply.github.com>
ericspod commented Dec 6, 2025

Nine tests in the image currently fail. The first four are related to auto3dseg and report that the value "image_stats" is missing from a config file; however, these tests pass when run in isolation. The other five relate to the GMM module, which cannot be compiled because nvcc is missing from the image. This is expected, since the CUDA toolkit is omitted for size reasons.
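
To reproduce the "passes in isolation" observation for the auto3dseg tests, each one can be run in its own interpreter process. The sketch below is one way to do that with the standard unittest command line; the helper names are hypothetical, the test IDs are the ones reported in the error output, and the working directory /opt/monai is assumed from the paths in the tracebacks.

```python
import subprocess
import sys

# Test IDs taken from the reported auto3dseg errors.
AUTO3DSEG_TESTS = [
    "tests.integration.test_auto3dseg_ensemble.TestEnsembleBuilder.test_ensemble",
    "tests.integration.test_auto3dseg_hpo.TestHPO.test_get_history",
    "tests.integration.test_auto3dseg_hpo.TestHPO.test_run_algo",
    "tests.integration.test_auto3dseg_hpo.TestHPO.test_run_optuna",
]


def isolation_command(test_id, python=sys.executable):
    """Command that runs a single test in a fresh interpreter."""
    return [python, "-m", "unittest", "-v", test_id]


def run_isolated(test_ids, cwd="/opt/monai"):
    """Run each test separately and map its ID to True (passed) or False (failed)."""
    results = {}
    for test_id in test_ids:
        proc = subprocess.run(isolation_command(test_id), cwd=cwd)
        results[test_id] = proc.returncode == 0
    return results


if __name__ == "__main__":
    for test_id, passed in run_isolated(AUTO3DSEG_TESTS).items():
        print("PASS" if passed else "FAIL", test_id)
```

If every test passes this way but fails in the full run, that would point towards state shared between tests rather than a problem in the tests themselves, though this sketch does not identify the cause.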

Output of the errors

======================================================================
ERROR: test_ensemble (tests.integration.test_auto3dseg_ensemble.TestEnsembleBuilder.test_ensemble)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/monai/monai/bundle/config_parser.py", line 158, in __getitem__
    look_up_option(k, config, print_all_options=False) if isinstance(config, dict) else config[int(k)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/utils/module.py", line 141, in look_up_option
    raise ValueError(f"Unsupported option '{opt_str}', " + supported_msg)
ValueError: Unsupported option 'image_stats', 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/monai/tests/integration/test_auto3dseg_ensemble.py", line 155, in test_ensemble
    bundle_generator.generate(self.work_dir, num_fold=1)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 660, in generate
    gen_algo.export_to_disk(output_folder, name, fold=f_id)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 193, in export_to_disk
    self.fill_records = self.fill_template_config(self.data_stats_files, self.output_path, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmp324gd5iq/workdir/algorithm_templates/dints/scripts/algo.py", line 79, in fill_template_config
  File "/opt/monai/monai/bundle/config_parser.py", line 161, in __getitem__
    raise KeyError(f"query key: {k}") from e
KeyError: 'query key: image_stats'

======================================================================
ERROR: test_get_history (tests.integration.test_auto3dseg_hpo.TestHPO.test_get_history)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/monai/monai/bundle/config_parser.py", line 158, in __getitem__
    look_up_option(k, config, print_all_options=False) if isinstance(config, dict) else config[int(k)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/utils/module.py", line 141, in look_up_option
    raise ValueError(f"Unsupported option '{opt_str}', " + supported_msg)
ValueError: Unsupported option 'image_stats', 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/monai/tests/integration/test_auto3dseg_hpo.py", line 129, in setUp
    bundle_generator.generate(work_dir, num_fold=1)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 660, in generate
    gen_algo.export_to_disk(output_folder, name, fold=f_id)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 193, in export_to_disk
    self.fill_records = self.fill_template_config(self.data_stats_files, self.output_path, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmp324gd5iq/workdir/algorithm_templates/dints/scripts/algo.py", line 79, in fill_template_config
  File "/opt/monai/monai/bundle/config_parser.py", line 161, in __getitem__
    raise KeyError(f"query key: {k}") from e
KeyError: 'query key: image_stats'

======================================================================
ERROR: test_run_algo (tests.integration.test_auto3dseg_hpo.TestHPO.test_run_algo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/monai/monai/bundle/config_parser.py", line 158, in __getitem__
    look_up_option(k, config, print_all_options=False) if isinstance(config, dict) else config[int(k)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/utils/module.py", line 141, in look_up_option
    raise ValueError(f"Unsupported option '{opt_str}', " + supported_msg)
ValueError: Unsupported option 'image_stats', 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/monai/tests/integration/test_auto3dseg_hpo.py", line 129, in setUp
    bundle_generator.generate(work_dir, num_fold=1)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 660, in generate
    gen_algo.export_to_disk(output_folder, name, fold=f_id)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 193, in export_to_disk
    self.fill_records = self.fill_template_config(self.data_stats_files, self.output_path, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmp324gd5iq/workdir/algorithm_templates/dints/scripts/algo.py", line 79, in fill_template_config
  File "/opt/monai/monai/bundle/config_parser.py", line 161, in __getitem__
    raise KeyError(f"query key: {k}") from e
KeyError: 'query key: image_stats'

======================================================================
ERROR: test_run_optuna (tests.integration.test_auto3dseg_hpo.TestHPO.test_run_optuna)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/monai/monai/bundle/config_parser.py", line 158, in __getitem__
    look_up_option(k, config, print_all_options=False) if isinstance(config, dict) else config[int(k)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/utils/module.py", line 141, in look_up_option
    raise ValueError(f"Unsupported option '{opt_str}', " + supported_msg)
ValueError: Unsupported option 'image_stats', 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/monai/tests/integration/test_auto3dseg_hpo.py", line 129, in setUp
    bundle_generator.generate(work_dir, num_fold=1)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 660, in generate
    gen_algo.export_to_disk(output_folder, name, fold=f_id)
  File "/opt/monai/monai/apps/auto3dseg/bundle_gen.py", line 193, in export_to_disk
    self.fill_records = self.fill_template_config(self.data_stats_files, self.output_path, **kwargs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tmp324gd5iq/workdir/algorithm_templates/dints/scripts/algo.py", line 79, in fill_template_config
  File "/opt/monai/monai/bundle/config_parser.py", line 161, in __getitem__
    raise KeyError(f"query key: {k}") from e
KeyError: 'query key: image_stats'

======================================================================
ERROR: test_cuda_0_2_batches_1_dimensions_1_channels_2_classes_2_mixtures (tests.networks.layers.test_gmm.GMMTestCase.test_cuda_0_2_batches_1_dimensions_1_channels_2_classes_2_mixtures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2595, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 127.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/tests/networks/layers/test_gmm.py", line 287, in test_cuda
    gmm = GaussianMixtureModel(features_tensor.size(1), mixture_count, class_count, verbose_build=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/networks/layers/gmm.py", line 44, in __init__
    self.compiled_extension = load_module(
                              ^^^^^^^^^^^^
  File "/opt/monai/monai/_extensions/loader.py", line 89, in load_module
    module = load(
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 1681, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2138, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2290, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2612, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'gmm_1_2_1_Linux_3_11_2_28_12_8'

======================================================================
ERROR: test_cuda_1_1_batches_1_dimensions_5_channels_2_classes_1_mixtures (tests.networks.layers.test_gmm.GMMTestCase.test_cuda_1_1_batches_1_dimensions_5_channels_2_classes_1_mixtures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2595, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 127.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/tests/networks/layers/test_gmm.py", line 287, in test_cuda
    gmm = GaussianMixtureModel(features_tensor.size(1), mixture_count, class_count, verbose_build=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/networks/layers/gmm.py", line 44, in __init__
    self.compiled_extension = load_module(
                              ^^^^^^^^^^^^
  File "/opt/monai/monai/_extensions/loader.py", line 89, in load_module
    module = load(
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 1681, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2138, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2290, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2612, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'gmm_5_2_1_Linux_3_11_2_28_12_8'

======================================================================
ERROR: test_cuda_2_1_batches_2_dimensions_2_channels_4_classes_4_mixtures (tests.networks.layers.test_gmm.GMMTestCase.test_cuda_2_1_batches_2_dimensions_2_channels_4_classes_4_mixtures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2595, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 127.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/tests/networks/layers/test_gmm.py", line 287, in test_cuda
    gmm = GaussianMixtureModel(features_tensor.size(1), mixture_count, class_count, verbose_build=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/networks/layers/gmm.py", line 44, in __init__
    self.compiled_extension = load_module(
                              ^^^^^^^^^^^^
  File "/opt/monai/monai/_extensions/loader.py", line 89, in load_module
    module = load(
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 1681, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2138, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2290, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2612, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'gmm_2_4_1_Linux_3_11_2_28_12_8'

======================================================================
ERROR: test_cuda_3_1_batches_3_dimensions_1_channels_2_classes_1_mixtures (tests.networks.layers.test_gmm.GMMTestCase.test_cuda_3_1_batches_3_dimensions_1_channels_2_classes_1_mixtures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2595, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 127.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/parameterized/parameterized.py", line 620, in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/tests/networks/layers/test_gmm.py", line 287, in test_cuda
    gmm = GaussianMixtureModel(features_tensor.size(1), mixture_count, class_count, verbose_build=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/monai/monai/networks/layers/gmm.py", line 44, in __init__
    self.compiled_extension = load_module(
                              ^^^^^^^^^^^^
  File "/opt/monai/monai/_extensions/loader.py", line 89, in load_module
    module = load(
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 1681, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2138, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2290, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2612, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'gmm_1_2_1_Linux_3_11_2_28_12_8_v1'

======================================================================
ERROR: test_load (tests.networks.layers.test_gmm.GMMTestCase.test_load)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2595, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 127.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/monai/tests/networks/layers/test_gmm.py", line 310, in test_load
    load_module("gmm", {"CHANNEL_COUNT": 2, "MIXTURE_COUNT": 2, "MIXTURE_SIZE": 3}, verbose_build=True)
  File "/opt/monai/monai/_extensions/loader.py", line 89, in load_module
    module = load(
             ^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 1681, in load
    return _jit_compile(
           ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2138, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2290, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/cpp_extension.py", line 2612, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'gmm_2_2_3_Linux_3_11_2_28_12_8'

It should be a simple matter of forcing the GMM module to build when building the image, but this also fails if used as a RUN command: python -c 'from monai._extensions import load_module;load_module("gmm", {"CHANNEL_COUNT": 2, "MIXTURE_COUNT": 2, "MIXTURE_SIZE": 3}, verbose_build=True)'
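
A guarded version of that pre-build step might look like the sketch below, which only attempts the build when a CUDA compiler can be found. The function name and the injectable `which` parameter are hypothetical; the `load_module` call is the one from the command above. Note that looking for nvcc on PATH is only an approximation, since PyTorch resolves the CUDA toolkit location through its own logic.

```python
import shutil


def prebuild_gmm(which=shutil.which):
    """Try to compile the GMM extension, skipping cleanly when nvcc is absent."""
    nvcc = which("nvcc")
    if nvcc is None:
        # Without a CUDA compiler the extension cannot be built, as in the slim image.
        return "skipped: nvcc not found on PATH"
    # Imported lazily so the guard itself works without MONAI installed.
    from monai._extensions import load_module

    load_module(
        "gmm",
        {"CHANNEL_COUNT": 2, "MIXTURE_COUNT": 2, "MIXTURE_SIZE": 3},
        verbose_build=True,
    )
    return f"built with {nvcc}"


if __name__ == "__main__":
    print(prebuild_gmm())
```

The same availability check could back a skip condition for the GMM tests, so that they are reported as skipped rather than as errors in images that omit the CUDA toolkit.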
