
Upgrade GLM-5 image to v0.5.10#1023

Open
chunfangamd wants to merge 8 commits into main from chun/glm5

Conversation

@chunfangamd
Collaborator

@chunfangamd chunfangamd commented Apr 10, 2026

Upgrade the GLM5-FP8-MI355X-SGLang image to v0.5.10rc0-rocm720-mi35x-20260413 after fixing the Slurm shared-cache issue. This upstream daily image ships the new aiter 0.1.12.post1, which improves performance.

    # Frameworks that use the shared cache skip the home-directory mount
    if [[ "$FRAMEWORK" == "atom" ]] || [[ "$FRAMEWORK" == "sglang" ]]; then
        SLURM_HOME_MOUNT=""
    else
        SLURM_HOME_MOUNT=" --container-mount-home "
    fi
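For reference, the same branch logic can be exercised in isolation. The sketch below wraps it in a helper function; the name `select_home_mount` is an illustration for this comment, not code from the repo:

```shell
# Hypothetical helper mirroring the PR's mount-flag selection.
select_home_mount() {
  local framework="$1"
  # atom and sglang rely on the shared Slurm cache, so no home mount is added
  if [[ "$framework" == "atom" || "$framework" == "sglang" ]]; then
    printf '%s' ""
  else
    printf '%s' " --container-mount-home "
  fi
}

select_home_mount sglang   # empty: no extra srun flag
select_home_mount vllm     # emits " --container-mount-home "
```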

Co-authored with: @1am9trash @zhentaocc
Thanks to @cquil11 @billishyahao

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If they are not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

@chunfangamd
Collaborator Author

@claude please activate the e2e test with test-config --config-files .github/configs/amd-master.yaml --config-keys glm5-fp8-mi355x-sglang

@Klaud-Cold
Collaborator

Klaud-Cold commented Apr 10, 2026

Claude finished @chunfangamd's task in 3s —— View job


E2E Test Triggered for GLM-5

  • Understand the request
  • Trigger e2e test workflow with test-config --config-files .github/configs/amd-master.yaml --config-keys glm5-fp8-mi355x-sglang --conc 4
  • Monitor workflow run — jobs queued, waiting for MI355X runners
  • Analyze benchmark results (pending runner availability)

Workflow triggered: Run #24243787005 (glm5-fp8-mi355x-sglang) on branch chun/glm5

Jobs created:

| Job | Status |
| --- | --- |
| get-jobs | ✅ Success |
| glm5_1k1k fp8 mi355x sglang \| tp=8 ep=1 conc-4 | ⌛ Queued (waiting for MI355X runner) |
| glm5_8k1k fp8 mi355x sglang \| tp=8 ep=1 conc-4 | ⌛ Queued (waiting for MI355X runner) |
| glm5_8k1k fp8 mi355x sglang \| tp=8 ep=1 conc-4 \| eval-only | ⌛ Queued (waiting for MI355X runner) |
| multi-node | ⏩ Skipped (not applicable) |

The workflow has been successfully dispatched. All 3 MI355X benchmark/eval jobs are queued and waiting for self-hosted MI355X runners to become available (~40min in queue so far). Once a runner picks up the jobs, the benchmarks will run automatically. Please monitor the run directly via the link above. You can tag @claude again once the run completes if you'd like me to analyze the results.

--- | View workflow run | Branch: chun/glm5

@chunfangamd chunfangamd self-assigned this Apr 10, 2026
@chunfangamd
Collaborator Author

@claude, I canceled the job https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24243787005.

@Klaud-Cold
Collaborator

Klaud-Cold commented Apr 10, 2026

Claude finished @chunfangamd's task in 15s —— View job


Acknowledged Job Cancellation

The e2e test run for glm5-fp8-mi355x-sglang has been canceled as requested.

If you'd like me to re-trigger the tests or help with anything else on this PR, just let me know!

--- | Branch: chun/glm5

@billishyahao
Collaborator

Hi @chunfangamd, here is a new sweep based on the sgl-dev nightly build:
https://inferencex.semianalysis.com/inference?unofficialRun=24379639570

@cquil11
Collaborator

cquil11 commented Apr 14, 2026

@billishyahao @chunfangamd Nightly is fine for now, since this is a new model and the pace of development is rapid. Feel free to change this PR to the nightly image.

