
Commit fe6826b

KSGulin and bfineran authored
[Cherry-Pick] Fixes #858 and #863 (#865)
* Fix transformers batch size (#858)
* Update: remove deepsparse requirement to run yolov5
* Fix: set num_devices to 1 if no gpus
* Update: nit
* unwrap checkpoint_path on checkpoint recipe load in IC training (#863)
* unwrap checkpoint_path on checkpoint recipe load in IC training
* import zoo

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
1 parent e957f81 commit fe6826b

File tree

2 files changed (+7 lines, -1 line)

src/sparseml/pytorch/image_classification/utils/trainer.py

Lines changed: 5 additions & 0 deletions

```diff
@@ -33,6 +33,7 @@
     default_device,
     is_parallel_model,
 )
+from sparsezoo import Zoo
 
 
 _LOGGER = logging.getLogger(__file__)
@@ -327,6 +328,10 @@ def _run_train_epoch(
         )
 
     def _setup_checkpoint_manager(self):
+        if self.checkpoint_path and self.checkpoint_path.startswith("zoo"):
+            self.checkpoint_path = Zoo.load_model_from_stub(
+                self.checkpoint_path
+            ).download_framework_files(extensions=[".pth"])[0]
         checkpoint_state = torch.load(self.checkpoint_path)
         checkpoint_manager = None
         checkpoint_recipe = checkpoint_state.get("recipe")
```
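For context, the new branch resolves a SparseZoo stub (a checkpoint_path beginning with "zoo") to a local .pth file before torch.load reads it. Below is a minimal, self-contained sketch of that pattern, assuming only the sparsezoo calls visible in the hunk; the helper name is hypothetical.

```python
# Sketch of the stub-unwrapping step from the hunk above. Only the
# sparsezoo calls shown in the diff are used; resolve_checkpoint_path
# itself is a hypothetical helper, not part of the sparseml codebase.
from sparsezoo import Zoo


def resolve_checkpoint_path(checkpoint_path: str) -> str:
    """Return a local .pth path, downloading the model for zoo stubs."""
    if checkpoint_path and checkpoint_path.startswith("zoo"):
        # Resolve the stub to a zoo model, download its framework files,
        # and keep the first .pth checkpoint that comes back.
        checkpoint_path = Zoo.load_model_from_stub(
            checkpoint_path
        ).download_framework_files(extensions=[".pth"])[0]
    return checkpoint_path
```

Whatever path comes back can then be handed to torch.load, which is exactly what the surrounding _setup_checkpoint_manager code does next.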

src/sparseml/transformers/sparsification/trainer.py

Lines changed: 2 additions & 1 deletion

```diff
@@ -254,9 +254,10 @@ def create_optimizer(self):
             if torch.distributed.is_initialized()
             else self.args._n_gpu
         )
+        n_device = n_gpu if n_gpu > 0 else 1
         total_batch_size = (
             self.args.per_device_train_batch_size
-            * n_gpu
+            * n_device
             * self.args.gradient_accumulation_steps
         )
         self.manager_steps_per_epoch = math.ceil(
```
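This one-line fix matters because self.args._n_gpu is 0 on a CPU-only run, so the old product zeroed out total_batch_size and would break the manager_steps_per_epoch calculation that presumably divides by it just below. A minimal sketch of the corrected arithmetic, with hypothetical standalone parameters mirroring the TrainingArguments fields in the hunk:

```python
# Sketch of the corrected effective-batch-size arithmetic. The function
# and its parameters are hypothetical stand-ins for the trainer fields
# used in the hunk (self.args.per_device_train_batch_size, n_gpu,
# self.args.gradient_accumulation_steps).
def total_batch_size(
    per_device_train_batch_size: int,
    n_gpu: int,
    gradient_accumulation_steps: int,
) -> int:
    # CPU-only runs report n_gpu == 0; count one device instead so the
    # product cannot collapse to zero.
    n_device = n_gpu if n_gpu > 0 else 1
    return per_device_train_batch_size * n_device * gradient_accumulation_steps


# A 2-GPU run is unchanged: 8 * 2 * 4 == 64.
assert total_batch_size(8, 2, 4) == 64
# A CPU-only run now yields 8 * 1 * 4 == 32 rather than 0.
assert total_batch_size(8, 0, 4) == 32
```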
