Fix: CUDA OOM issue after training before inference #96
Open
Hongbin10 wants to merge 1 commit into TIO-IKIM:main from
Conversation
Problem
When running run_cpp_net.py, training and inference are executed sequentially in the same process.
On hardware with limited GPU memory (tested on an NVIDIA L40S with 48 GB of VRAM), allocated GPU memory reached ~97% by the end of training. The experiment object, including the model, optimizer states, and gradients, was not explicitly released before inference was initiated, leaving insufficient VRAM for the inference model to load.
This caused a torch.OutOfMemoryError at the start of inference.
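For reference, the utilization figure above can be checked with PyTorch's built-in memory queries. This is a minimal sketch (not part of the PR), using only standard torch.cuda calls:

```python
import torch

def report_vram(tag: str, device: int = 0) -> None:
    """Print the fraction of total VRAM currently allocated by PyTorch."""
    allocated = torch.cuda.memory_allocated(device)
    total = torch.cuda.get_device_properties(device).total_memory
    print(f"[{tag}] allocated: {allocated / total:.1%} of {total / 1024**3:.0f} GB")

# e.g. call report_vram("after training") right before inference starts
```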
Root Cause
In both the checkpoint and regular run branches of run_cpp_net.py, inference = InferenceCellViTCPP(...) was called immediately after experiment.run_experiment(), without first releasing the training objects from GPU memory.
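For illustration, the pre-fix flow in both branches looked roughly like this (a simplified sketch; the constructor arguments are elided exactly as in the description above, and the branch-specific setup is omitted):

```python
# Simplified sketch of the pre-fix flow (both branches of run_cpp_net.py):
experiment.run_experiment()            # training; model, optimizer states,
                                       # and gradients stay resident on the GPU
inference = InferenceCellViTCPP(...)   # constructed immediately afterwards,
                                       # while training objects still hold VRAM
```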
Fix
Added explicit GPU memory cleanup between training and inference in both branches.
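The commit itself is not reproduced here; a cleanup of this kind (assuming the training objects are reachable through the experiment variable named above) typically looks like the following sketch:

```python
import gc
import torch

# Drop the last references to the training objects so their VRAM
# becomes reclaimable before the inference model is constructed.
del experiment               # releases the model, optimizer states, and gradients
gc.collect()                 # force Python to collect the now-unreferenced objects
torch.cuda.empty_cache()     # return cached allocator blocks to the CUDA driver
```

Note that torch.cuda.empty_cache() alone would not help: as long as experiment still references the model and optimizer, PyTorch's caching allocator cannot free their memory, so dropping the reference (here via del) has to happen first.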
Result
Training and inference now run successfully end-to-end in a single job without CUDA OOM errors.