[AMD] [DNM, still merge in 0.18 as trust_remote_code=True is not passed to quark] Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) by functionstackx · Pull Request #827 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-01T04:14:09Z

Add MiniMax M2.5 MXFP4 benchmark config for MI355x with vLLM v0.17.1, now that AMD's MXFP4 checkpoint is out: https://huggingface.co/amd/MiniMax-M2.5-MXFP4

Model: amd/MiniMax-M2.5-MXFP4
Image: vllm/vllm-openai-rocm:v0.17.1
TP=2 and TP=4 (matching MiniMax M2.5 FP8 pattern)
VLLM_ROCM_USE_AITER=1, with AITER MoE fallback for TP>=4
Seq lengths: 1k1k, 1k8k, 8k1k (conc 4-64)
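
For readers who want to see how large the sweep described above is, here is a minimal sketch that enumerates it in Python. It is not the repository's actual config format. The concurrency steps (doubling from 4 to 64) and the token counts behind "1k" and "8k" are assumptions, `sweep_points` is a hypothetical helper name, and the AITER MoE fallback for TP>=4 is left out because the thread does not say how it is configured.

```python
from itertools import product

# Values taken from the PR description.
MODEL = "amd/MiniMax-M2.5-MXFP4"
IMAGE = "vllm/vllm-openai-rocm:v0.17.1"
TP_SIZES = [2, 4]

# (input_len, output_len) in tokens. Assumption: "1k" = 1024 and "8k" = 8192.
SEQ_LENS = {"1k1k": (1024, 1024), "1k8k": (1024, 8192), "8k1k": (8192, 1024)}

# Assumption: concurrency doubles from 4 to 64; the PR only states the range.
CONCURRENCIES = [4, 8, 16, 32, 64]


def sweep_points(tp_sizes=TP_SIZES):
    """Return one dict per (sequence length, TP size, concurrency) combination."""
    points = []
    for (name, (isl, osl)), tp, conc in product(SEQ_LENS.items(), tp_sizes, CONCURRENCIES):
        points.append({
            "model": MODEL,
            "image": IMAGE,
            "seq": name,
            "isl": isl,
            "osl": osl,
            "tp": tp,
            "conc": conc,
            "env": {"VLLM_ROCM_USE_AITER": "1"},
        })
    return points
```

Under these assumptions the full sweep has 3 x 2 x 5 = 30 points, and restricting it to TP=2 with `sweep_points([2])` leaves 15.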

Closes #826

Generated with Claude Code

functionstackx · 2026-03-01T04:43:02Z

@claude can u comment out TP=4 and only do TP=2 for now, since TP=4 is bugged (see vllm-project/vllm#35637)

Klaud-Cold · 2026-03-01T04:43:24Z

Claude finished @functionstackx's task in 53s

Commented out TP=4 for MiniMax M2.1 MXFP4

Due to vllm-project/vllm#35637, TP=4 has been disabled. Only TP=2 will run.

Read current benchmark files
Comment out TP=4 in .github/configs/amd-master.yaml (all 3 seq-len configs)
Update perf-changelog.yaml with PR link and description
Commit and push changes

Changes made:

.github/configs/amd-master.yaml: Commented out all 3 tp: 4 search-space entries under minimaxm2.1-fp4-mi355x-vllm with a reference to the vLLM issue
perf-changelog.yaml: Updated entry to reflect TP=2 only and added a link to PR #827
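
The effect of the change above can be sketched in Python as a filter that skips TP=4 entries while keeping a record of why. This is only an illustration: the real schema of `.github/configs/amd-master.yaml` is not shown in this thread, so `SEARCH_SPACE`, `DISABLED_TP`, and `active_entries` are hypothetical names and shapes.

```python
# Hypothetical in-memory form of the search-space entries; the real
# .github/configs/amd-master.yaml schema is not shown in this thread.
SEARCH_SPACE = [
    {"seq": seq, "tp": tp}
    for seq in ("1k1k", "1k8k", "8k1k")
    for tp in (2, 4)
]

# TP sizes to skip, mapped to the upstream issue that motivates skipping them.
DISABLED_TP = {4: "vllm-project/vllm#35637"}


def active_entries(entries, disabled=DISABLED_TP):
    """Split entries into those that run and those skipped, keeping the reason."""
    active, skipped = [], []
    for entry in entries:
        reason = disabled.get(entry["tp"])
        if reason is None:
            active.append(entry)
        else:
            skipped.append({**entry, "skipped_because": reason})
    return active, skipped
```

Keeping the issue reference next to each skipped entry plays the same role as the comment left in the YAML: it tells whoever re-enables TP=4 which upstream fix to check first.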

functionstackx · 2026-03-01T07:11:04Z

@chunfangamd I am seeing good MXFP4 performance compared to FP8. Just waiting for AMD to release an MXFP4 version of MiniMax M2.5 now; currently AMD only has an MXFP4 version of MiniMax M2.1.

chunfangamd

Looks good to me

functionstackx · 2026-04-02T00:33:55Z

@adibarra do u wanna take over this PR too? Upgrading this PR to v0.18.1 might work.

adibarra · 2026-04-02T02:13:58Z

Sure, I'll give it a shot

benenzhu · 2026-04-06T14:04:56Z

@adibarra Hi, I think we need to wait for vLLM v0.19.1 for this. The fix PR is included in vLLM 0.19.1rc0; 0.19.0 still doesn't have the fix.
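
As an illustration of the version gate described here, a small standard-library Python sketch can check whether a given vLLM version string is at or after the first release said to carry the fix. It takes 0.19.1rc0 as that first release, based only on this comment. `version_key` and `has_fix` are hypothetical helpers, and the parser handles only `X.Y.Z` and `X.Y.ZrcN` strings.

```python
import re

# Assumption from the thread: the fix first appears in vLLM 0.19.1rc0.
FIRST_FIXED = "0.19.1rc0"


def version_key(version):
    """Turn 'X.Y.Z' or 'X.Y.ZrcN' (optional leading 'v') into a sortable tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, rc = match.groups()
    # A final release ranks above every release candidate of the same X.Y.Z.
    pre = (1, 0) if rc is None else (0, int(rc))
    return (int(major), int(minor), int(patch), pre)


def has_fix(version, first_fixed=FIRST_FIXED):
    """Return True if `version` is at or after the first fixed release."""
    return version_key(version) >= version_key(first_fixed)
```

With this ordering, `has_fix("0.19.0")` is false while `has_fix("0.19.1rc0")` and `has_fix("0.19.1")` are true, which matches the plan to wait for the v0.19.1 release.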

adibarra · 2026-04-06T14:10:49Z

Sounds good, we'll wait till then!

benenzhu · 2026-04-10T09:35:09Z

@adibarra Run passed. Waiting for the vLLM v0.19.1 release.

github-actions · 2026-04-14T20:01:59Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

functionstackx requested a review from a team March 1, 2026 04:14

functionstackx requested review from billishyahao and chunfangamd as code owners March 1, 2026 04:14

github-project-automation bot added this to InferenceMAX Board Mar 1, 2026

functionstackx added AMD sweep-enabled labels Mar 1, 2026

functionstackx changed the title ~~[AMD] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx added the sweep-enabled label Mar 1, 2026

functionstackx removed the sweep-enabled label Mar 1, 2026

functionstackx marked this pull request as draft March 1, 2026 23:23

chunfangamd marked this pull request as ready for review March 4, 2026 09:09

chunfangamd approved these changes Mar 4, 2026

chunfangamd changed the title ~~[WIP] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

chunfangamd enabled auto-merge (squash) March 4, 2026 09:11

functionstackx changed the title ~~Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ [Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4) Mar 4, 2026

functionstackx changed the title ~~[Do Not Merge] [WIP till AMD releases MXFP4 of MiniMax M2.5] Add MiniMax M2.1 MXFP4 benchmark for MI355x vLLM (TP=2,4)~~ Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) Mar 20, 2026

functionstackx added sweep-enabled and removed sweep-enabled labels Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from bd10495 to e849d65 Compare March 20, 2026 01:50

functionstackx added the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch 2 times, most recently from 86cc700 to b82116b Compare March 20, 2026 01:57

functionstackx removed the sweep-enabled label Mar 20, 2026

functionstackx force-pushed the claude/issue-826-20260301-0409 branch from b82116b to 7dd6063 Compare March 20, 2026 01:59

functionstackx added the sweep-enabled label Mar 20, 2026

functionstackx added vllm/sglang release broken -need to wait and removed sweep-enabled labels Mar 29, 2026

adibarra added sweep-enabled and removed sweep-enabled labels Apr 4, 2026

Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.18.1 (TP=2,4)

1f44d49

adibarra force-pushed the claude/issue-826-20260301-0409 branch from ddd4f96 to 1f44d49 Compare April 6, 2026 13:54

fix changelog description

2b5662b

Switch to vLLM ROCm nightly to test trust_remote_code fix

3c301d0

adibarra removed the sweep-enabled label Apr 6, 2026

fix eval to see eval results

43fdb59

benenzhu added sweep-enabled and removed sweep-enabled labels Apr 6, 2026

benenzhu added 3 commits April 10, 2026 08:27

fix fp4 align with fp8 and fix eval issues

4b29aa3

fix fp4 align with fp8 and fix eval issues

def349c

fix fp4 align with fp8 and fix eval issues

041a41b

benenzhu added sweep-enabled and removed sweep-enabled labels Apr 10, 2026

Merge branch 'main' into claude/issue-826-20260301-0409

5959ab7

cquil11 requested review from 1am9trash, seungrokj and yctseng0211 as code owners April 14, 2026 20:01

functionstackx wants to merge 8 commits into main from claude/issue-826-20260301-0409

6 participants