[AMD] Performance Improvements for MI300X with GEMM and FP8 Enhancements#811
chunfangamd wants to merge 15 commits into main
- Pin aiter and sgl-kernel to the specific commits required by the
v0.5.8-rocm700-mi30x image.
- This patch is intended to work only with the image
lmsysorg/sglang:v0.5.8-rocm700-mi30x.
- Joint work with Zhentao Chen.
The previous aiter ref (9046b6f) changed get_mla_metadata_v1 to expect a Tensor for kv_last_page_lens, but the image's sglang still passed an int, which crashed during CUDA graph capture. The fix fresh-clones aiter at d2ca5a89, pins sgl-kernel to 8bd6447 (now at sglang/sgl-kernel), and uninstalls stale packages before rebuilding to avoid leftover C-extension conflicts.
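The ordering described above (uninstall stale packages, fresh-clone, pin, then rebuild) can be sketched as a dry-run plan. This is only an illustration: the two commit hashes come from the description, while the repository URLs, the scratch directory, and the pip package names are assumptions, and the image-specific rebuild commands are intentionally left out. Nothing here is executed; the script only prints the commands for review.

```python
import shlex

# Commit pins quoted in the PR description.
AITER_REF = "d2ca5a89"
SGL_KERNEL_REF = "8bd6447"

# Assumptions, not taken from the PR: repository URLs, scratch directory,
# and the pip package names passed to `pip uninstall`.
AITER_URL = "https://github.com/ROCm/aiter.git"
SGLANG_URL = "https://github.com/sgl-project/sglang.git"
STALE_PACKAGES = ["aiter", "sgl-kernel"]


def build_pin_plan(workdir="/tmp/pins"):
    """Return the ordered commands as argument lists; nothing is executed."""
    aiter_dir = f"{workdir}/aiter"
    sglang_dir = f"{workdir}/sglang"
    return [
        # 1. Uninstall stale packages first so leftover C extensions
        #    cannot conflict with the rebuilt ones.
        ["pip", "uninstall", "-y", *STALE_PACKAGES],
        # 2. Fresh clone of aiter (remove any old checkout), then pin it.
        ["rm", "-rf", aiter_dir],
        ["git", "clone", AITER_URL, aiter_dir],
        ["git", "-C", aiter_dir, "checkout", AITER_REF],
        # 3. sgl-kernel lives inside the sglang repo, so pin that repo.
        ["rm", "-rf", sglang_dir],
        ["git", "clone", SGLANG_URL, sglang_dir],
        ["git", "-C", sglang_dir, "checkout", SGL_KERNEL_REF],
        # 4. Rebuild steps are image-specific and deliberately left out.
    ]


def render(plan):
    """Format the plan as shell lines for review."""
    return [shlex.join(cmd) for cmd in plan]


if __name__ == "__main__":
    print("\n".join(render(build_pin_plan())))
```

Keeping the plan as data makes the ordering easy to review: the uninstall step always precedes any clone or checkout, which is the point of the fix.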
functionstackx left a comment
Can y'all update sglang with the latest aiter and we wait until the next sgl release image before updating inferenceMax? Ideally we wanna track actual images and not patchwork.
@chunfangamd So the idea is that this will land in SGLang 0.10.0?
as it contains the three previously mentioned improvements
e1c3d94 to 9c490e8
Yes, generally for less-new models, plz use the release builds when possible. +viz @cquil11 @Oseltamivir
@chunfangamd did this enhancement make it into SGLang 0.5.9?
@claude Can you check to see if the PRs / commits mentioned in this PR description are in SGLang 0.5.10 (https://github.com/sgl-project/sglang/releases/tag/v0.5.10)? This may involve searching the commit history on the SGLang repo.
Claude finished @cquil11's task in 2m 14s
Checking if PR-referenced changes are in SGLang 0.5.10
Results
All three PRs referenced in this PR description were merged after SGLang v0.5.9 (released Feb 24, 2026 at 01:14 UTC) and before SGLang v0.5.10 (released April 6, 2026).
Summary: Both SGLang PRs (#18624 and #18242) are confirmed to be in v0.5.10. The aiter GEMM config PR (#2024) was merged into the aiter repo before either SGLang PR, so it would be included if SGLang 0.5.10 uses an aiter version built after Feb 12, 2026.
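The date-window reasoning in the comment above can be sketched as a small check. This is a hedged illustration, not the tooling the bot used: the two release timestamps come from the comment (the v0.5.10 time of day is not stated, so midnight UTC is assumed), and the merge timestamp in the example is hypothetical.

```python
from datetime import datetime, timezone

# Release timestamps quoted in the comment above. The v0.5.10 time of day
# is not stated there; midnight UTC is an assumption.
V0_5_9_RELEASED = datetime(2026, 2, 24, 1, 14, tzinfo=timezone.utc)
V0_5_10_RELEASED = datetime(2026, 4, 6, tzinfo=timezone.utc)


def merged_in_window(merged_at, prev_release, next_release):
    """True if merged_at is after prev_release and no later than next_release."""
    if merged_at.tzinfo is None:
        # Naive datetimes cannot be compared safely against UTC release times.
        raise ValueError("merged_at must be timezone-aware")
    return prev_release < merged_at <= next_release


if __name__ == "__main__":
    # Hypothetical merge timestamp, for illustration only.
    example = datetime(2026, 3, 10, 12, 0, tzinfo=timezone.utc)
    print(merged_in_window(example, V0_5_9_RELEASED, V0_5_10_RELEASED))
```

A timestamp window is only a first-pass heuristic, since a release branch may be cut before the release date. Running `git merge-base --is-ancestor <commit> <tag>` against the release tag is a more direct check.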
@claude referring to your previous comment and investigation:
You need to confirm if SGLang 0.5.10 uses an aiter version built after Feb 12, 2026
Claude finished @cquil11's task in 1m 28s
Confirming aiter version in SGLang 0.5.10
Results
Yes, SGLang v0.5.10 uses an aiter version built well after Feb 12, 2026. SGLang v0.5.10's The aiter
Updated summary: All three PRs referenced in this PR description are confirmed to be in SGLang v0.5.10:
@claude, show me one sweep run link. Are we facing a hang issue?
Claude finished @chunfangamd's task in 1m 42s
Sweep runs for this PR branch
There are 2 sweep runs on the
No hang issues detected. In the most recent run (#22951504643), all 16 benchmark jobs, plus collect-evals, collect-results, and calc-success-rate completed successfully. The longest individual job was There are also no in-progress runs currently on this branch or repo-wide for the e2e-tests workflow.
@claude update perf changelog to be up to date
Claude finished @cquil11's task in 59s
Updating perf-changelog.yaml
Updated the perf-changelog entry for this PR to reflect the actual image transition: the old entry referenced a stale daily build (
….5.10 Co-authored-by: Cameron Quilici <cquil11@users.noreply.github.com>
…20260414 - updated the perf-changelog correspondingly
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook. If they are not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
Upgrade the DSR1 FP8 MI300X/MI325X images to lmsysorg/sglang-rocm:v0.5.10rc0-rocm700-mi30x-20260414:
Compared to the previous image lmsysorg/sglang:v0.5.9-rocm700-mi30x, we included the following features.
e2e Tests:
Co-authored with @zhentaocc