[ATOM/ROCm] atom deepseek r1 fp4 mtp3 on mi355x #1028
Signed-off-by: seungrokj <seungrok.jung@amd.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
Signed-off-by: seungrokj <seungrok.jung@amd.com>
hi @functionstackx @cquil11 e2e perf: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24402497975
cquil11 left a comment:
amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4
wait what is this?
do yall support this model in sglang...
hi @cquil11 new model 1) below, quantized shared & routed experts of layers 61.
@functionstackx haven't checked mtp3 + 'native' sglang yet. (atom-oot-sglang,
Yes, plz let me know when this model ckpt is supported in upstream sglang and we can accept this PR for this atom model ckpt
to reiterate what @functionstackx said, we think it is a good general rule of thumb to only allow this model once it is able to run on the upstream SGLang image
hi,
This is DeepSeek R1 FP4 on the ATOM framework, which supports MTP with 3 tokens.
Recipe link: https://github.com/ROCm/ATOM/blob/main/recipes/DeepSeek-R1.md#mxfp4-with-mtp
Regards,
Seungrok