2 changes: 1 addition & 1 deletion .github/configs/amd-master.yaml
@@ -2137,7 +2137,7 @@ dsr1-fp4-mi355x-sglang-disagg-8k1k-mtp:


dsv4-fp4-mi355x-sglang:
- image: lmsysorg/sglang-rocm:v0.5.12.post1-rocm720-mi35x-20260610
+ image: lmsysorg/sglang-rocm:v0.5.13-rocm720-mi35x-20260612
model: deepseek-ai/DeepSeek-V4-Pro
model-prefix: dsv4
runner: mi355x
@@ -60,7 +60,9 @@ start_gpu_monitor
PARALLEL_ARGS=(
--tensor-parallel-size "$TP"
)
+ CHUNKED_PREFILL_SIZE=8192
if [ "${DP_ATTENTION}" = "true" ]; then
+ CHUNKED_PREFILL_SIZE=$((8192 * TP))
PARALLEL_ARGS+=(
--dp "$TP"
--enable-dp-attention
@@ -85,7 +87,7 @@ sglang serve \
--swa-full-tokens-ratio 0.15 \
--page-size 256 \
--context-length $MAX_MODEL_LEN \
- --chunked-prefill-size 8192 \
+ --chunked-prefill-size $CHUNKED_PREFILL_SIZE \
--disable-shared-experts-fusion \
--tool-call-parser deepseekv4 \
--reasoning-parser deepseek-v4 \
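
The hunks above replace the hard-coded `--chunked-prefill-size 8192` with a value that scales with `TP` when `DP_ATTENTION` is `true`. The following is a minimal standalone sketch of that selection logic, not code from the PR: `compute_chunked_prefill_size` is a hypothetical helper introduced only for illustration, and the base value of 8192 is taken from the diff. The PR does not state why the size is multiplied by `TP`, so the sketch only reproduces the arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): mirrors the diff's logic for
# choosing the value passed to --chunked-prefill-size.
compute_chunked_prefill_size() {
  local tp="$1"            # tensor-parallel size, e.g. 8
  local dp_attention="$2"  # "true" when DP attention is enabled
  local base=8192          # base chunk size taken from the diff

  if [ "$dp_attention" = "true" ]; then
    # With DP attention the diff scales the base size by TP.
    echo $((base * tp))
  else
    echo "$base"
  fi
}

# Example: the tp8/dp8 configuration named in the changelog entry.
compute_chunked_prefill_size 8 true
# Example: the same TP without DP attention keeps the base size.
compute_chunked_prefill_size 8 false
```

For `TP=8` with DP attention enabled, this yields 65536 (8192 × 8); without DP attention it stays at 8192, matching the previous hard-coded value.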
8 changes: 8 additions & 0 deletions perf-changelog.yaml
@@ -3600,6 +3600,14 @@
- "MI355x DSR1-FP4: Include TP4 configurations for 8k1k"
- "Expand the TP sweep (included TP=4) for 8k/1k configuration for conc=4 to 64"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1692
+
+ - config-keys:
+ - dsv4-fp4-mi355x-sglang
+ description:
+ - "Bump image to lmsysorg/sglang-rocm:v0.5.13-rocm720-mi35x-20260612."
+ - "Fix the intermediate_pad setting in the MoE computation in sglang PR#27858. This avoids the unnecessary overhead of computing useless padding."
+ - "Correct the chunk prefill setting size under tp8/dp8 config."
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1715

- config-keys:
- dsv4-fp4-gb200-dynamo-sglang