Fix crash in Qwen2_5_VLProcessor when using batched input with padding=False #44535

Open
Anakintano wants to merge 2 commits into huggingface:main from Anakintano:fix-44521

Anakintano commented Mar 9, 2026

Problem

Qwen2_5_VLProcessor.apply_chat_template raises ValueError: setting an array element with a sequence when called with a batch of ≥2 conversations that include images under the default padding=False setting.

Root cause: mm_token_type_ids was built by calling np.array(text_inputs["input_ids"]) and np.zeros_like(text_inputs["input_ids"]) on the whole batch, which is a ragged list (variable-length sequences) when padding=False. NumPy ≥ 1.24 raises this ValueError when asked to build an array from inhomogeneous shapes.
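
The failure can be reproduced without the processor. The snippet below is a minimal sketch that assumes only NumPy; the token IDs and the build_2d helper are made up for illustration and are not taken from the Qwen tokenizer or from transformers.

```python
import numpy as np

# Two "tokenized" sequences of different lengths, as produced when padding=False.
# The IDs are hypothetical placeholders.
ragged_input_ids = [[101, 7, 7, 102], [101, 7, 102, 55, 56, 57]]


def build_2d(input_ids):
    """Try to build one array from the whole batch.

    Returns the array, or None if NumPy refuses the ragged input.
    """
    try:
        return np.array(input_ids)
    except ValueError:
        # NumPy >= 1.24 treats inhomogeneous shapes as an error.
        return None


# Ragged batch: rejected on NumPy >= 1.24.
batch_array = build_2d(ragged_input_ids)

# Rectangular batch (equal lengths, e.g. after padding): still works.
padded_array = build_2d([[101, 7, 7, 102], [101, 7, 102, 0]])
```

On NumPy ≥ 1.24, batch_array is None because the ValueError was raised, while padded_array is an ordinary 2-D array. This is why the crash only appears with padding=False and at least two sequences of different lengths.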

Fix

Iterate per-sequence instead of constructing a 2D array from a ragged list. Each ids_arr = np.array(ids) call receives a 1-D list, so the shape is always homogeneous.
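
The per-sequence approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual diff: build_mm_token_type_ids and IMAGE_TOKEN_ID are hypothetical names, the ID value is a placeholder, and marking image-token positions with 1 is assumed to be how the mask is defined.

```python
import numpy as np

IMAGE_TOKEN_ID = 7  # hypothetical placeholder, not the real Qwen2.5-VL image token ID


def build_mm_token_type_ids(input_ids, image_token_id=IMAGE_TOKEN_ID):
    """Return one 0/1 mask per sequence, with 1 at image-token positions."""
    mm_token_type_ids = []
    for ids in input_ids:
        ids_arr = np.array(ids)  # 1-D input, so the shape is always homogeneous
        type_ids = np.zeros_like(ids_arr)
        type_ids[ids_arr == image_token_id] = 1
        mm_token_type_ids.append(type_ids.tolist())
    return mm_token_type_ids


# Ragged batch, as produced with padding=False.
masks = build_mm_token_type_ids([[101, 7, 7, 102], [101, 7, 102, 55, 56, 57]])
```

The result stays a list of lists, so it remains ragged in the same way as input_ids and never needs a rectangular array.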

Changed in both:

  • src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py
  • src/transformers/models/qwen2_5_vl/processing_qwen2_5_vl.py (auto-generated copy, manually synced since make is unavailable on Windows)

Test

Added test_batched_apply_chat_template_no_padding in tests/models/qwen2_5_vl/test_processing_qwen2_5_vl.py to guard against regression.

Closes #44545

github-actions bot (Contributor) commented Mar 9, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_5_vl

…se (huggingface#44521)

`mm_token_type_ids` was constructed by calling `np.array()` and
`np.zeros_like()` on the full `text_inputs["input_ids"]` list, which
is ragged (variable-length sequences) when `padding=False`. NumPy ≥ 1.24
raises `ValueError: setting an array element with a sequence` for
inhomogeneous shapes.

Fix: iterate per-sequence so each `np.array()` call receives a 1-D list,
avoiding the ragged-array construction entirely.

umbilnm commented Mar 9, 2026

Hi @Anakintano! It looks like this PR fixes a different issue — the ValueError from constructing a ragged numpy array during batched processing without padding. That's a valid bug, but issue #44521 is specifically about assistant_masks being all zeros when multimodal inputs are present. These seem like two separate problems - would it make sense to unlink this PR from #44521 and open a separate issue for the batched padding crash instead?

Anakintano (Author) commented

Hi @umbilnm, thanks for pointing that out. You're right that this PR addresses a different issue than #44521.

I've opened a separate issue for the batched padding crash (#44545) and updated the PR to reference it instead.

Appreciate the clarification! 🙌😀

Development

Successfully merging this pull request may close these issues.

Qwen2_5_VLProcessor.apply_chat_template crashes on batched input when padding=False
