[ATOM/ROCm] atom deepseek r1 fp4 mtp3 on mi355x #1028

Open
seungrokj wants to merge 2 commits into main from srok/atom_dsr1_fp4_mtp3

Conversation

@seungrokj
Collaborator

hi,

This is DeepSeek R1 FP4 on the ATOM framework, which supports MTP with 3 tokens.

Recipe link: https://github.com/ROCm/ATOM/blob/main/recipes/DeepSeek-R1.md#mxfp4-with-mtp

Regards,
Seungrok

Signed-off-by: seungrokj <seungrok.jung@amd.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

Comment thread perf-changelog.yaml Outdated
Signed-off-by: seungrokj <seungrok.jung@amd.com>
Collaborator

@cquil11 cquil11 left a comment

amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4
wait what is this?

@functionstackx
Contributor

do yall support this model in sglang...

@seungrokj seungrokj added the AMD label Apr 15, 2026
@seungrokj
Collaborator Author

> amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 wait what is this?

hi @cquil11

The new model, 1) below, quantizes the shared and routed experts of layer 61.
The old model, 2) below, left all of layer 61 unquantized.

==

  1. https://huggingface.co/amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4
    ->
    export exclude_layers="mlp.gate. *lm_head model.layers.61.eh_proj model.layers.61.shared_head.head model.layers.61.embed_tokens"
    vs
  2. https://huggingface.co/amd/DeepSeek-R1-0528-MXFP4
    ->
    exclude_layers="self_attn mlp.gate. lm_head model.layers.61."
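
To make the layer-61 difference concrete, here is a minimal sketch that checks a few module names against the two exclusion lists above. It rests on assumptions: the matching rule (glob match for patterns containing `*`, substring match otherwise) is a guess and not necessarily how the quantization tool interprets `exclude_layers`, and the module names in `SAMPLE_NAMES` are hypothetical, chosen only for illustration.

```python
from fnmatch import fnmatchcase

# Exclusion lists copied from the two checkpoints discussed above.
NEW_EXCLUDE = (
    "mlp.gate. *lm_head model.layers.61.eh_proj "
    "model.layers.61.shared_head.head model.layers.61.embed_tokens"
).split()
OLD_EXCLUDE = "self_attn mlp.gate. lm_head model.layers.61.".split()


def is_excluded(name: str, patterns: list[str]) -> bool:
    """Return True if `name` is skipped by any pattern.

    ASSUMPTION: patterns containing '*' are globs; all others are
    substring matches. The real tool may use a different rule.
    """
    for pattern in patterns:
        if "*" in pattern:
            if fnmatchcase(name, pattern):
                return True
        elif pattern in name:
            return True
    return False


# Hypothetical layer-61 module names, for illustration only.
SAMPLE_NAMES = [
    "model.layers.61.mlp.experts.0.gate_proj",     # routed expert
    "model.layers.61.mlp.shared_experts.up_proj",  # shared expert
    "model.layers.61.mlp.gate.weight",             # router gate
    "model.layers.61.eh_proj",
]

for name in SAMPLE_NAMES:
    new_state = "excluded" if is_excluded(name, NEW_EXCLUDE) else "quantized"
    old_state = "excluded" if is_excluded(name, OLD_EXCLUDE) else "quantized"
    print(f"{name}: new={new_state}, old={old_state}")
```

Under these assumptions, the layer-61 expert modules are quantized with the new list and excluded with the old one, while `mlp.gate.` and `eh_proj` stay excluded in both.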

@seungrokj
Collaborator Author

@functionstackx I haven't checked mtp3 with 'native' sglang yet. (atom-oot-sglang, rocm/atom-dev:sglang-latest, supports it though, with the same perf as atom.) I will check this with other folks and get back to you soon :D

@seungrokj seungrokj requested a review from cquil11 April 15, 2026 01:26
@functionstackx
Contributor

functionstackx commented Apr 15, 2026

> @functionstackx haven't checked mtp3 + 'native' sglang yet. (atom-oot-sglang, rocm/atom-dev:sglang-latest supports though; but same perf as atom). will check this with other folks and get back to you soon :D

Yes, plz let me know when this model ckpt is supported in upstream sglang and we can accept this PR for this atom model ckpt.

@cquil11
Collaborator

cquil11 commented Apr 15, 2026

To reiterate what @functionstackx said, we think it is a good general rule of thumb to only allow this model once it can run on the upstream SGLang image.
