[AMD] [DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)#827

Open
functionstackx wants to merge 8 commits into main from claude/issue-826-20260301-0409


Conversation

@functionstackx
Contributor

@functionstackx functionstackx commented Mar 1, 2026

Add MiniMax M2.5 MXFP4 benchmark config for MI355x with vLLM v0.17.1, now that AMD's MXFP4 checkpoint is out: https://huggingface.co/amd/MiniMax-M2.5-MXFP4

  • Model: amd/MiniMax-M2.5-MXFP4
  • Image: vllm/vllm-openai-rocm:v0.17.1
  • TP=2 and TP=4 (matching MiniMax M2.5 FP8 pattern)
  • VLLM_ROCM_USE_AITER=1, with AITER MoE fallback for TP>=4
  • Seq lengths: 1k1k, 1k8k, 8k1k (conc 4-64)
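A minimal sketch of the sweep the bullets above describe, for readers who want to see the matrix spelled out. It is not the repo's actual config schema. Assumptions: "1k" and "8k" mean 1024 and 8192 tokens, concurrency doubles from 4 to 64, the "AITER MoE fallback" is done by setting `VLLM_ROCM_USE_AITER_MOE=0`, and `--trust-remote-code` is needed for this checkpoint. The helper names (`server_env`, `serve_command`, `benchmark_matrix`) are hypothetical.

```python
from itertools import product

MODEL = "amd/MiniMax-M2.5-MXFP4"
IMAGE = "vllm/vllm-openai-rocm:v0.17.1"

# Assumption: "1k" = 1024 tokens and "8k" = 8192 tokens (input, output).
SEQ_LENS = {"1k1k": (1024, 1024), "1k8k": (1024, 8192), "8k1k": (8192, 1024)}
TP_SIZES = [2, 4]
# Assumption: "conc 4-64" means doubling steps from 4 up to 64.
CONCURRENCIES = [4, 8, 16, 32, 64]


def server_env(tp: int) -> dict[str, str]:
    """Environment for the vLLM server at a given tensor-parallel size."""
    env = {"VLLM_ROCM_USE_AITER": "1"}
    if tp >= 4:
        # Assumption: the fallback disables only the AITER fused-MoE path.
        env["VLLM_ROCM_USE_AITER_MOE"] = "0"
    return env


def serve_command(tp: int) -> list[str]:
    """Argument list for launching the server inside the container."""
    return [
        "vllm", "serve", MODEL,
        "--tensor-parallel-size", str(tp),
        "--trust-remote-code",  # assumption: required for this checkpoint
    ]


def benchmark_matrix() -> list[dict]:
    """One entry per (sequence-length config, TP size, concurrency)."""
    return [
        {"seq": name, "isl": isl, "osl": osl, "tp": tp, "conc": conc}
        for (name, (isl, osl)), tp, conc in product(
            SEQ_LENS.items(), TP_SIZES, CONCURRENCIES
        )
    ]


if __name__ == "__main__":
    for tp in TP_SIZES:
        print(tp, server_env(tp), " ".join(serve_command(tp)))
    print(len(benchmark_matrix()), "benchmark points")
```

Dropping a TP size from `TP_SIZES` removes that whole slice of the sweep, which is the Python analogue of commenting a TP value out of the YAML config.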

Closes #826

Generated with Claude Code

@functionstackx functionstackx requested a review from a team March 1, 2026 04:14
@functionstackx functionstackx changed the title from "[AMD] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" to "[WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" Mar 1, 2026
@functionstackx
Contributor Author

@claude can u comment out TP=4 and only do TP=2 for now since TP=4 is bugged vllm-project/vllm#35637

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 1, 2026

Claude finished @functionstackx's task in 53s


Commented out TP=4 for MiniMax M2.1 MXFP4

Due to vllm-project/vllm#35637, TP=4 has been disabled. Only TP=2 will run.

  • Read current benchmark files
  • Comment out TP=4 in .github/configs/amd-master.yaml (all 3 seq-len configs)
  • Update perf-changelog.yaml with PR link and description
  • Commit and push changes

Changes made:

@functionstackx
Contributor Author

@chunfangamd I am seeing good MXFP4 performance compared to FP8. Just waiting for AMD to release an MXFP4 version of MiniMax M2.5 now; currently AMD only has an MXFP4 version of MiniMax M2.1.

image

@functionstackx functionstackx marked this pull request as draft March 1, 2026 23:23
@chunfangamd chunfangamd marked this pull request as ready for review March 4, 2026 09:09
Collaborator

@chunfangamd chunfangamd left a comment

Looks good to me

@chunfangamd chunfangamd changed the title from "[WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" to "Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" Mar 4, 2026
@chunfangamd chunfangamd enabled auto-merge (squash) March 4, 2026 09:11
@functionstackx functionstackx changed the title from "Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" to "[Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" Mar 4, 2026
@functionstackx functionstackx changed the title from "[Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)" to "Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)" Mar 20, 2026
@functionstackx functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from bd10495 to e849d65 Compare March 20, 2026 01:50
@functionstackx functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from 86cc700 to b82116b Compare March 20, 2026 01:57
@functionstackx functionstackx force-pushed the claude/issue-826-20260301-0409 branch from b82116b to 7dd6063 Compare March 20, 2026 01:59
@functionstackx
Contributor Author

@adibarra do u wanna take over this PR too? upgrading this PR to v0.18.1, this might work

@adibarra
Collaborator

adibarra commented Apr 2, 2026

Sure, I'll give it a shot

@adibarra adibarra force-pushed the claude/issue-826-20260301-0409 branch from ddd4f96 to 1f44d49 Compare April 6, 2026 13:54
@benenzhu
Collaborator

benenzhu commented Apr 6, 2026

@adibarra Hi, I think we need to wait for vLLM v0.19.1 for this. The fix PR is included in vLLM 0.19.1rc0; 0.19.0 still doesn't have the fix.
image

@adibarra
Collaborator

adibarra commented Apr 6, 2026

Sounds good, we'll wait till then!

@cquil11 cquil11 changed the title from "[DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)" to "[AMD] [DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4)" Apr 8, 2026
@benenzhu
Collaborator

@adibarra Run passed. Wait for VLLM v0.19.1's release.

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you



Development

Successfully merging this pull request may close these issues.

mi355 fp4 minimax vllm single node

6 participants