[Bugfix] [Offloading] Save disk-offloaded buffers, Save converted weights by kylesayrs · Pull Request #46902 · huggingface/transformers

kylesayrs · 2026-06-26T02:23:51Z

Purpose

Combine two bugfixes in one PR to enable saving models with disk-offloaded weights
- Fix saving models that are disk offloaded with offload_buffers=True
- Fix saving models that are disk offloaded and use weight conversions

Changes

Replace the get_parameter call with a get_parameter_or_buffer call
Expand load_offloaded_parameter to load a state dict of all weights associated with the checkpoint weight
- checkpoint weight (all/any) -> model weight (one) -> reverted weights (all)
- By using this loaded state dict to update the original state dict, we avoid redundant loading of offloaded weights
Loosen the requirements for the is_offloaded flag
- This is backwards compatible, as full disk offloading was never supported in previous releases anyway
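
To make the "checkpoint weight -> model weight -> reverted weights" chain concrete, here is a minimal pure-Python sketch. It is an illustration under assumptions, not the code in this PR: the names `FUSED_TO_CHECKPOINT`, `load_fused_from_disk`, `load_offloaded_checkpoint_weights` and `materialize` are hypothetical, tensors are replaced by plain lists, and `None` stands in for a weight that is still offloaded. It shows why updating the state dict with every reverted weight avoids reading the same offloaded model weight twice.

```python
# Hypothetical illustration only: none of these names are transformers APIs.
# One fused model weight on disk corresponds to several checkpoint weights.
FUSED_TO_CHECKPOINT = {
    "layers.0.experts.gate_up_proj": [
        "layers.0.experts.0.gate_proj",
        "layers.0.experts.0.up_proj",
    ],
}
CHECKPOINT_TO_FUSED = {
    ckpt: fused for fused, ckpts in FUSED_TO_CHECKPOINT.items() for ckpt in ckpts
}

disk_reads = []  # records every simulated read from the offload folder


def load_fused_from_disk(fused_name):
    """Stand-in for reading one offloaded model weight from disk."""
    disk_reads.append(fused_name)
    return [1.0, 2.0, 3.0, 4.0]  # pretend tensor: first half gate, second half up


def load_offloaded_checkpoint_weights(ckpt_name):
    """Load the model weight behind ckpt_name and revert it to all of its checkpoint weights."""
    fused_name = CHECKPOINT_TO_FUSED[ckpt_name]
    fused = load_fused_from_disk(fused_name)
    names = FUSED_TO_CHECKPOINT[fused_name]
    chunk = len(fused) // len(names)
    return {name: fused[i * chunk:(i + 1) * chunk] for i, name in enumerate(names)}


def materialize(state_dict):
    """Replace None placeholders (still-offloaded entries) with loaded values."""
    for name in list(state_dict):
        if state_dict[name] is None:
            # One disk read fills in every checkpoint weight of the same model weight.
            state_dict.update(load_offloaded_checkpoint_weights(name))
    return state_dict


state_dict = materialize({name: None for name in CHECKPOINT_TO_FUSED})
```

In this sketch the second checkpoint weight is already present by the time the loop reaches it, so the fused weight is read from disk only once.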

Testing

Disk-offloaded models with conversion mappings can now be saved:

from transformers import AutoModelForCausalLM

# Load model with full disk offload
model = AutoModelForCausalLM.from_pretrained(
    "inference-optimization/DSV4-tiny-empty",
    device_map="auto",
    max_memory={},
    offload_folder="offload_folder",
)

# Save the model
model.save_pretrained("tmp_save")

Used these changes to quantize RedHatAI/DeepSeek-V4-Pro-NVFP4-FP8 and RedHatAI/GLM-5.2-NVFP4-FP8

Suggested Reviewers

@SunMarc @zucchini-nlp @ArthurZucker

SunMarc

Thanks, I left a couple of comments, but I think @Cyrilvallez might have better ideas on how to deal with those, as he's the one who coded this!

SunMarc · 2026-06-30T14:22:46Z

            filename = os.path.join(save_directory, shard_file)
            shard_state_dict = {}
-            for tensor_name in tensor_names:
+            for tensor_name in sorted(tensor_names):


Any specific reason to sort?

kylesayrs · Jun 30, 2026 (edited)

Note that load_offloaded_parameter may load multiple weights for a single tensor, so it is possible to overload CPU memory by loading parameters in a bad order. In practice split_torch_state_dict_into_shards preserves weight locality, and sorting further reduces the chance that a bad load order occurs.

An example of bad load ordering would be:

"layers.0.experts.0.up_proj" -> loads "layers.0.experts.gate_up_proj"
"layers.1.experts.0.up_proj" -> loads "layers.1.experts.gate_up_proj"
"layers.2.experts.0.up_proj" -> loads "layers.2.experts.gate_up_proj"

In this scenario, three separate gate_up_proj weights have been loaded onto the CPU, but only three shard weights have been consumed by state_dict.pop, so the rest of the loaded data stays in memory.

Sorting reduces the chances that split_torch_state_dict_into_shards gives an adversarially bad ordering. It doesn't fix adversarially bad ordering between shards, but there's not much we can do about that.

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

github-actions · 2026-07-02T04:58:41Z

CI recap

Dashboard: View test results in Grafana
Latest run: 28566046728:2
Result: failure | Jobs: 14 | Tests: 72,772 | Failures: 1 | Duration: 16h 25m

Cyrilvallez · 2026-07-02T09:52:37Z

Hey @kylesayrs! I took the liberty of opening #47018 to fix the issue; I believe it is simpler and more robust in general!
Let me know if something is still unclear

kylesayrs changed the title ~~[Bugfix]~~ [Bugfix] Save disk-offloaded buffers Jun 26, 2026

kylesayrs mentioned this pull request Jun 26, 2026

[DSV4] DeepSeekV4 Pro vllm-project/llm-compressor#2858

Open

18 tasks

kylesayrs changed the title ~~[Bugfix] Save disk-offloaded buffers~~ [Bugfix] [Offloading] Save disk-offloaded buffers, Save converted weights Jun 26, 2026

kylesayrs force-pushed the kylesayrs/fix-disk-offloaded-ptr-buffer branch 2 times, most recently from fbe41de to 54dfbf8 Compare June 29, 2026 17:58

kylesayrs marked this pull request as ready for review June 29, 2026 17:59

SunMarc reviewed Jun 30, 2026

SunMarc requested a review from Cyrilvallez June 30, 2026 14:29

kylesayrs force-pushed the kylesayrs/fix-disk-offloaded-ptr-buffer branch from f8b104d to 41e9caf Compare July 1, 2026 15:49

kylesayrs added 10 commits July 2, 2026 00:45

fix tensor get

9e408df

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

proper disk saving

d38d3fa

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

cleanup

cd3dcec

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

cache meta_data_dict

c015d80

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

fix style

324a8b7

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

better comments

e1232e6

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

format

52d3e4f

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

break out load_offloaded_checkpoint_parameters

d9281e3

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

fix typo

c3caa61

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

fix style

dbc8c39

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

kylesayrs force-pushed the kylesayrs/fix-disk-offloaded-ptr-buffer branch from 7d663f8 to dbc8c39 Compare July 2, 2026 04:45

Cyrilvallez closed this Jul 2, 2026

Cyrilvallez mentioned this pull request Jul 2, 2026

Fix save_pretrained with offloading and weight conversions #47018

Open

kylesayrs mentioned this pull request Jul 2, 2026

[Bug]: Encountered a type error related to offload.device.type when running kimi_k2_example.py vllm-project/llm-compressor#2563

Closed

kylesayrs wants to merge 10 commits into huggingface:main from kylesayrs:kylesayrs/fix-disk-offloaded-ptr-buffer
