[Feature] Support GLM-5 MTP for vLLM Plugin by whx-sjtu · Pull Request #544 · ROCm/ATOM

whx-sjtu · 2026-04-11T13:25:43Z

Summary

Wire GLM-5 MTP draft-model registration and the load path in vLLM plugin mode, including draft/target namespace isolation and spec-decode weight remapping.
Align the sparse MLA/indexer metadata flow for speculative decoding so that the draft model receives decode/prefill metadata in plugin mode.
Fix the draft logits path to use the shared lm_head in MTP mode, which removes the zero-acceptance collapse in atom+vllm (run_vllm_offline.sh now reports a non-zero acceptance rate).
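As a rough illustration of the first and third points, the sketch below shows one way to split checkpoint entries into isolated target and draft namespaces while sharing lm_head between them. It is not the code from this PR: the prefixes "mtp." and "draft.", the name "lm_head.weight", and the function split_checkpoint are all hypothetical, chosen only to make the idea concrete.

```python
from typing import Any, Dict, Iterable, Tuple

# All names below are hypothetical; the PR does not list the real parameter names.
MTP_SOURCE_PREFIX = "mtp."        # assumed prefix of MTP weights in the checkpoint
DRAFT_NAMESPACE = "draft."        # assumed namespace used for the draft model
SHARED_NAMES = ("lm_head.weight",)  # weights both models should see


def split_checkpoint(
    named_weights: Iterable[Tuple[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Route each checkpoint entry to the target or draft namespace."""
    target: Dict[str, Any] = {}
    draft: Dict[str, Any] = {}
    for name, weight in named_weights:
        if name in SHARED_NAMES:
            # Share the same object so draft logits use the target's lm_head.
            target[name] = weight
            draft[name] = weight
        elif name.startswith(MTP_SOURCE_PREFIX):
            # Remap the MTP weight into the draft namespace only.
            draft[DRAFT_NAMESPACE + name[len(MTP_SOURCE_PREFIX):]] = weight
        else:
            target[name] = weight
    return target, draft
```

With this layout a draft weight can never overwrite a target weight of the same suffix, and the draft model computes logits from the same lm_head object as the target rather than from a separate, possibly uninitialized copy.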

Verification

bash /app/scripts/run_vllm_offline.sh
Observed "Avg Draft acceptance rate: 4.4%" in the latest run log.
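For readers unfamiliar with the metric, a draft acceptance rate is conventionally the share of proposed draft tokens that the target model accepts. The helper below is a minimal sketch under that assumption; the function name and the counts in the usage note are illustrative and are not taken from the run log.

```python
def draft_acceptance_rate(num_accepted_tokens: int, num_draft_tokens: int) -> float:
    """Return the percentage of proposed draft tokens accepted by the target."""
    if num_draft_tokens == 0:
        # No proposals were made, so report 0.0 instead of dividing by zero.
        return 0.0
    return 100.0 * num_accepted_tokens / num_draft_tokens
```

For example, with made-up counts of 44 accepted tokens out of 1000 proposed, draft_acceptance_rate(44, 1000) evaluates to about 4.4. A rate of exactly 0 over a long run is the "zero-acceptance collapse" that the summary refers to.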

Merge dependency

This PR is stacked on top of #399, "[Feat][Plugin] Enable Sparse MLA and GLM-5 for vLLM-ATOM" (plugin_sparse_mla); merge it after #399.

Signed-off-by: whx-sjtu <xiaowang990929@gmail.com>

Ensure draft/target namespaces and metadata wiring stay consistent for GLM-5 sparse MTP so speculative decoding no longer collapses to zero acceptance in atom+vllm. Made-with: Cursor

whx-sjtu added 2 commits April 9, 2026 08:29

fix(model_ops): handle dense and moe shuffle paths explicitly

4f0e044

Signed-off-by: whx-sjtu <xiaowang990929@gmail.com>

fix(plugin): align GLM-5 MTP draft path with vLLM speculative decode

a0b5fae

Ensure draft/target namespaces and metadata wiring stay consistent for GLM-5 sparse MTP so speculative decoding no longer collapses to zero acceptance in atom+vllm. Made-with: Cursor

whx-sjtu changed the title from ~~fix(plugin): align GLM-5 MTP draft path with vLLM speculative decode~~ to [Feature] Support GLM-5 MTP for vLLM Plugin on Apr 11, 2026

whx-sjtu marked this pull request as draft April 11, 2026 13:32

wuhuikx requested a review from ganyi1996ppo April 13, 2026 01:33

whx-sjtu wants to merge 2 commits into plugin_sparse_mla from whx/glm5-mtp-vllm-followup
