
[BugFix] enable deepseek r1 fp4 #527

Open

ZLkanyo009 wants to merge 1 commit into main from lingzha/enable-dpsk-fp4
Conversation

@ZLkanyo009 ZLkanyo009 commented Apr 9, 2026

Motivation

For FP4 DeepSeek, the attention part is fully BF16 while the MoE part is FP4. Therefore, the scale for the attention part is None. For regular DeepSeek FP8, the attention part is also FP8 and requires quantization. However, the case where scale is None was previously ignored. This PR fixes that bug.

Command

export AITER_QUICK_REDUCE_QUANTIZATION=INT4
export SGLANG_AITER_FP8_PREFILL_ATTN=0
export SGLANG_USE_AITER=1
export ATOM_ENABLE_DS_QKNORM_QUANT_FUSION=1
 
model_path=/workspace/model/DeepSeek-R1-0528-MXFP4/
export PYTHONPATH=/workspace/dpsk-r1-fp4/sglang/python:/workspace/dpsk-r1-fp4/ATOM_oot/ATOM
 
 
export SGLANG_PROFILE_RECORD_SHAPES=1
export SGLANG_PROFILE_WITH_STACK=1
export SGLANG_TORCH_PROFILER_DIR=/workspace/dpsk-r1-fp4/sglang/profile_log
 
# export SGLANG_EXTERNAL_MODEL_PACKAGE=atom.plugin.sglang.model_wrapper
export SGLANG_EXTERNAL_MODEL_PACKAGE=atom.plugin.sglang.models

export ATOM_PROFILE_MLA_ABSORBED_BMM=1

TORCHINDUCTOR_COMPILE_THREADS=128 python3 -m sglang.launch_server \
    --model-path $model_path \
    --host localhost \
    --port 8000 \
    --trust-remote-code \
    --tensor-parallel-size 8 \
    --kv-cache-dtype fp8_e4m3 \
    --mem-fraction-static 0.9 \
    --page-size 1 \
    --disable-radix-cache \
    --skip-server-warmup \
    --disable-cuda-graph > log.serve.atom.oot.fp4.log 2>&1

@ZLkanyo009 ZLkanyo009 force-pushed the lingzha/enable-dpsk-fp4 branch from d649906 to b3dd131 Compare April 13, 2026 03:01
@ZLkanyo009 ZLkanyo009 force-pushed the lingzha/enable-dpsk-fp4 branch from b3dd131 to 06eedc6 Compare April 13, 2026 03:06
@ZLkanyo009 ZLkanyo009 requested a review from zhuyuhua-v April 13, 2026 03:06
Contributor

@zhuyuhua-v zhuyuhua-v left a comment


LGTM

