add multi-gpu support by mike-ferguson · Pull Request #394 · brain-score/language

mike-ferguson · 2026-03-16T17:42:25Z

Add multi-GPU and memory-efficient model loading to HuggingfaceSubject

Changes

Multi-GPU support: When multiple CUDA GPUs are detected, HuggingfaceSubject now loads models with device_map='auto', automatically distributing layers across all available GPUs. Single-GPU and MPS (Apple Silicon) behavior is unchanged.
Memory-efficient loading: Added low_cpu_mem_usage=True to from_pretrained, reducing peak CPU RAM during checkpoint loading from ~3x to ~1x model size.
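
As a rough illustration of the loading logic described above, here is a minimal sketch. It is not the actual code in HuggingfaceSubject: `build_load_kwargs` and `load_model` are hypothetical names, and passing `low_cpu_mem_usage=True` unconditionally is an assumption based on the description. `device_map` and `low_cpu_mem_usage` are existing `from_pretrained` arguments in transformers and require the `accelerate` package.

```python
from typing import Any, Dict


def build_load_kwargs(num_cuda_gpus: int) -> Dict[str, Any]:
    """Hypothetical helper: pick from_pretrained keyword arguments by GPU count."""
    # Assumption: low_cpu_mem_usage=True is passed in every case.
    kwargs: Dict[str, Any] = {"low_cpu_mem_usage": True}
    if num_cuda_gpus > 1:
        # With several GPUs, let accelerate spread layers across all of them.
        kwargs["device_map"] = "auto"
    return kwargs


def load_model(model_id: str):
    """Hypothetical loader; needs torch, transformers and accelerate installed."""
    import torch
    from transformers import AutoModelForCausalLM

    n_gpus = torch.cuda.device_count() if torch.cuda.is_available() else 0
    # Single-GPU, MPS and CPU placement is left to the caller's existing code.
    return AutoModelForCausalLM.from_pretrained(model_id, **build_load_kwargs(n_gpus))
```

Keeping the keyword selection in a pure function means the multi-GPU branch can be checked without any GPU present.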

Why

Models over ~6B parameters in float32 need more than 24 GB for their weights alone (4 bytes per parameter), which exceeds a single 24 GB GPU. Previously, all weights were loaded onto GPU 0 regardless of how many GPUs were available, causing OOM kills on multi-GPU instances (e.g. g5.12xlarge). This change allows 7B to 13B models to run in fp32 on multi-GPU instances without any changes to individual model plugins.
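
The memory argument above can be checked with back-of-the-envelope arithmetic. The sketch below counts weights only (activations, buffers, and CUDA overhead are ignored, so real headroom is smaller), uses decimal GB, and assumes 24 GB per GPU; the function names are illustrative.

```python
def weights_gb(params_billion: float, bytes_per_param: int = 4) -> float:
    """Approximate weight size in GB: 1e9 params * bytes per param / 1e9 bytes."""
    return params_billion * bytes_per_param


def fits_in_gpus(params_billion: float, num_gpus: int, gpu_gb: float = 24.0) -> bool:
    """True if fp32 weights alone fit in the combined memory of num_gpus GPUs."""
    return weights_gb(params_billion) <= num_gpus * gpu_gb


# A 7B model needs 28 GB in fp32, more than one 24 GB GPU.
print(weights_gb(7), fits_in_gpus(7, num_gpus=1))
# A 13B model needs 52 GB, within the 96 GB of four 24 GB GPUs.
print(weights_gb(13), fits_in_gpus(13, num_gpus=4))
```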

Impact

No changes required to existing model plugins
No behavior change on single-GPU or CPU/MPS setups
Fixes OOM for medium-tier models (Mistral-7B, OPT-6.7B, Falcon-7B, Pythia-12B, etc.) on multi-GPU Batch instances

add multi-gpu support

9fc3695

mike-ferguson merged commit 1ee35a3 into main Mar 16, 2026
10 of 11 checks passed

KartikP added the OOM label Mar 16, 2026

mike-ferguson merged 1 commit into main from add_multi_gpu_support
