
Fix UnboundLocalError for tp_plan_alt when tp_plan is empty#44540

Open
YangKai0616 wants to merge 2 commits into huggingface:main from YangKai0616:fix-tp_plan

Conversation

@YangKai0616
Contributor

Per the title, an UnboundLocalError occurs when tp_plan is empty, raised at this point in core_model_loading.py:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/workspace/test_moe_tp_ep.py", line 6, in <module>
[rank0]:     model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b" , dtype=torch.bfloat16, tp_plan="auto", use_kernels=True)
[rank0]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/workspace/transformers/src/transformers/models/auto/auto_factory.py", line 381, in from_pretrained
[rank0]:     return model_class.from_pretrained(
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/workspace/transformers/src/transformers/modeling_utils.py", line 4137, in from_pretrained
[rank0]:     loading_info, disk_offload_index = cls._load_pretrained_model(model, state_dict, checkpoint_files, load_config)
[rank0]:                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/workspace/transformers/src/transformers/modeling_utils.py", line 4256, in _load_pretrained_model
[rank0]:     loading_info, disk_offload_index = convert_and_load_state_dict_in_model(
[rank0]:                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/workspace/transformers/src/transformers/core_model_loading.py", line 1178, in convert_and_load_state_dict_in_model
[rank0]:     if matched_tp_pattern := tp_plan_alt.search(renamed_key):
[rank0]:                              ^^^^^^^^^^^
[rank0]: UnboundLocalError: cannot access local variable 'tp_plan_alt' where it is not associated with a value

Reproduction script and command:

import torch
import time
from transformers import AutoModelForCausalLM, AutoTokenizer


model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", dtype=torch.bfloat16, tp_plan="auto", use_kernels=True)
print(model._tp_plan)

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True)

# distributed run
for i in range(5):
    s1 = time.time()
    outputs = model.generate(**inputs.to(model.device), max_new_tokens=100, do_sample=False)
    s2 = time.time()
    outputs = tokenizer.batch_decode(outputs[:, inputs["input_ids"].shape[-1]:])
    print(outputs[0])
    print(s2 - s1)

Command:

torchrun --nproc-per-node=4 test_moe_tp_ep.py

This PR fixes the issue by checking that tp_plan is non-empty before entering the weight-sharding condition logic.
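The shape of the fix can be sketched as follows: compile the pattern only when a plan exists, and guard the match so the variable is never read unbound. The helper name `convert_keys_fixed` is hypothetical and for illustration only; the actual change in core_model_loading.py may differ:

```python
import re


def convert_keys_fixed(tp_plan, keys):
    # Bind tp_plan_alt on every path: a real pattern when a plan exists,
    # None otherwise, so the later read can never be unbound.
    tp_plan_alt = re.compile("|".join(tp_plan)) if tp_plan else None
    matched = []
    for key in keys:
        # Only attempt the regex search when a tensor-parallel plan exists.
        if tp_plan_alt is not None and tp_plan_alt.search(key):
            matched.append(key)
    return matched


print(convert_keys_fixed([], ["model.layers.0.mlp"]))  # → [] instead of UnboundLocalError
```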

Hi @ArthurZucker , please help review. Thanks!
