[plugin][dashboard] use nightly date tagged docker by zejunchen-zejun · Pull Request #503 · ROCm/ATOM

zejunchen-zejun · 2026-04-07T08:13:11Z

When running the vllm-atom benchmark, use only a date-tagged Docker image instead of the latest image.
CC: @wuhuikx

Copilot

Pull request overview

This PR improves reproducibility and presentation of ATOM vLLM OOT benchmark runs by resolving the floating vllm-latest Docker tag to a stable, date-tagged (or digest-pinned) image reference and by surfacing the resolved image digest in the benchmark dashboard while suppressing commit metadata in OOT detail views.

Changes:

Update the OOT benchmark workflow to resolve rocm/atom-dev:vllm-latest to a same-digest nightly tag (or digest-pin) and record the pulled image digest into the result payload.
Add a new .github/scripts/resolve_oot_image.py helper to perform the Docker Hub registry resolution from a floating tag to a stable reference.
Extend the OOT dashboard pipeline/UI to ingest and display the Docker image digest, and hide commit/message/author for ATOM-vLLM detail views.
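
As a rough illustration of the registry lookup described above, a minimal sketch of reading a tag's manifest digest from Docker Hub follows. It is not the PR's `resolve_oot_image.py`; the names `build_manifest_request` and `fetch_manifest_digest` are hypothetical, and it assumes the caller has already obtained a pull-scoped bearer token.

```python
import urllib.request

REGISTRY = "https://registry-1.docker.io"
# Accept both Docker and OCI manifest types so multi-arch images resolve too.
MANIFEST_TYPES = ", ".join([
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
])


def build_manifest_request(repository, tag, token=None):
    """Build a HEAD request for the manifest of repository:tag (hypothetical helper)."""
    url = f"{REGISTRY}/v2/{repository}/manifests/{tag}"
    request = urllib.request.Request(url, method="HEAD")
    request.add_header("Accept", MANIFEST_TYPES)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_manifest_digest(repository, tag, token=None, timeout=30):
    """Return the Docker-Content-Digest response header for a tag, or None if absent."""
    request = build_manifest_request(repository, tag, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Docker-Content-Digest")
```

Two tags that return the same digest point at the same image, which is what makes it possible to swap a floating tag for a dated one.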

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `.github/workflows/atom-vllm-oot-benchmark.yaml` | Resolves prebuilt OOT image to a stable nightly/digest reference and records the pulled digest into the benchmark payload. |
| `.github/scripts/resolve_oot_image.py` | New resolver script that maps a floating tag to a same-digest nightly tag (or digest-pinned fallback). |
| `.github/scripts/oot_benchmark_to_dashboard.py` | Adds `oot_image_digest` into the `extra` metadata string for dashboard ingestion. |
| `.github/dashboard/index.html` | Parses digest from `extra` and updates detail/popover rendering to show digest and hide commit metadata for ATOM-vLLM. |
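
To make the `extra` round trip concrete, here is a small sketch of writing a digest into a metadata string and reading it back. The `key=value; key=value` layout and the helper names `build_extra` and `parse_extra` are assumptions for illustration; the actual string format used by the PR is not shown in this conversation, and the dashboard side is JavaScript rather than Python.

```python
def build_extra(fields):
    """Join non-empty fields as 'key=value' pairs (assumed layout, not the PR's)."""
    return "; ".join(f"{key}={value}" for key, value in fields.items() if value)


def parse_extra(extra):
    """Split an 'extra' string back into a dict, ignoring malformed parts."""
    parsed = {}
    for part in extra.split(";"):
        key, separator, value = part.strip().partition("=")
        if separator and key:
            parsed[key] = value
    return parsed


extra = build_extra({"oot_image_digest": "sha256:0123abcd", "note": ""})
digest = parse_extra(extra).get("oot_image_digest")
```

A digest contains a colon but neither `;` nor `=`, so it survives this particular encoding unchanged.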


.github/scripts/resolve_vllm_atom_image.py

.github/scripts/resolve_oot_image.py

.github/dashboard/index.html

.github/workflows/atom-vllm-benchmark.yaml

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


.github/workflows/atom-vllm-benchmark.yaml

.github/scripts/resolve_oot_image.py

wuhuikx · 2026-04-09T02:40:37Z

Low — Reverse lookup in .github/scripts/resolve_oot_image.py performs a linear, per-tag digest fetch for nightly tags. As the number of tags grows, request volume increases and may hit Docker Hub rate limits, which can force a fallback to floating vllm-latest and weaken reproducibility stability.

resolve_oot_image.py
Lines 162–167

    candidates = nightly_candidates(list_tags(repository, token), preferred_version)
    matched_tag = None
    for candidate in candidates:
        if get_manifest_digest(repository, candidate.tag, token) == reference_digest:
            matched_tag = candidate.tag

Suggestion: Limit the scan window (for example, only the most recent N nightly tags), or use a tag-list API/flow that includes digests when possible to reduce N manifest requests.
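
The scan-window idea in the suggestion could look roughly like the sketch below. It assumes the candidates are already ordered newest first; `find_matching_tag`, `max_lookups`, and the injected `get_digest` callable are hypothetical names, not part of the PR's script.

```python
def find_matching_tag(candidates, reference_digest, get_digest, max_lookups=10):
    """Return the first tag whose digest matches, checking at most max_lookups tags.

    candidates: tag names ordered newest first (assumption).
    get_digest: callable mapping a tag name to its manifest digest.
    """
    for lookups, tag in enumerate(candidates):
        if lookups >= max_lookups:
            break  # stop before issuing more manifest requests
        if get_digest(tag) == reference_digest:
            return tag  # early exit keeps the request count small
    return None
```

Capping the number of lookups bounds the manifest requests per run regardless of how many nightly tags accumulate in the repository.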

zejunchen-zejun · 2026-04-09T06:41:49Z


The nightly Docker images are checked from newest to oldest for a matching digest, so the search overhead is very small.
If no matching image is found, we fall back to the digest-pinned image reference instead of the floating vllm-latest tag.
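
One way to read the fallback described in this reply is sketched below: prefer the matched nightly tag, otherwise pin the image by digest using the standard `repository@sha256:...` reference form. The function name `stable_reference` is hypothetical, and this is an interpretation of the reply rather than the PR's actual code.

```python
def stable_reference(repository, matched_tag, reference_digest):
    """Pick a reproducible image reference (illustrative helper, not from the PR)."""
    if matched_tag:
        # A dated nightly tag with the same digest was found.
        return f"{repository}:{matched_tag}"
    if not reference_digest.startswith("sha256:"):
        raise ValueError(f"unexpected digest format: {reference_digest!r}")
    # No dated tag matched: pin by content digest instead of a floating tag.
    return f"{repository}@{reference_digest}"
```

Either result names one immutable image, so a benchmark run can be repeated against the same bits later.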


Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


.github/scripts/resolve_vllm_atom_image.py

.github/workflows/atom-vllm-benchmark.yaml


Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


.github/scripts/resolve_vllm_atom_image.py


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


.github/scripts/resolve_vllm_atom_image.py

wuhuikx · 2026-04-11T05:33:26Z

Can we remove the OOT or oot naming?

zejunchen-zejun · 2026-04-12T12:12:15Z

Sure. I will remove the OOT terminology from this PR's code changes. Let's handle the remaining removals in PR #541.


Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

.github/workflows/atom-vllm-benchmark.yaml:1

The workflow name was changed to "ATOM vLLM Benchmark", but later steps still query the old workflow name ("ATOM vLLM OOT Benchmark") when downloading baseline runs via gh run list --workflow=.... That lookup will return no runs, so regression/baseline comparison will silently stop working. Update the gh run list filter to match the new workflow name (or switch to referencing the workflow file path / workflow ID to avoid breakage on future renames).
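
A sketch of the file-based lookup mentioned in this comment follows, written in Python around the `gh` CLI. The helper names are hypothetical, the default branch and the choice of the `databaseId` field are assumptions, and the PR's workflow may query runs differently.

```python
import json
import subprocess


def baseline_run_command(workflow_file, branch="main", limit=1):
    """Build a `gh run list` command keyed on the workflow file, not its display name."""
    return [
        "gh", "run", "list",
        "--workflow", workflow_file,  # file name survives display-name renames
        "--branch", branch,
        "--status", "success",
        "--limit", str(limit),
        "--json", "databaseId",
    ]


def latest_baseline_run_id(workflow_file, branch="main"):
    """Return the newest successful run id, or None when no run matches."""
    result = subprocess.run(
        baseline_run_command(workflow_file, branch),
        check=True, capture_output=True, text=True,
    )
    runs = json.loads(result.stdout)
    return runs[0]["databaseId"] if runs else None
```

Returning `None` explicitly lets the caller fail loudly when no baseline exists, rather than silently skipping the regression comparison.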
.github/workflows/atom-vllm-benchmark.yaml:186
This fallback uses the floating tag rocm/atom-dev:vllm-latest when resolution fails, which can undermine the PR goal of avoiding non-date-tagged (and non-stable) images for main-branch benchmarks. Consider failing the run when digest resolution cannot be performed, or add a more robust fallback that still produces a digest-pinned reference (e.g., retry/backoff, optional Docker Hub auth to avoid rate limits, or a second resolver using docker manifest inspect).
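
The retry/backoff option raised here could be sketched as a small wrapper around whatever call performs the digest lookup. The name `retry_with_backoff` and its defaults are illustrative assumptions; `attempts` is expected to be at least 1.

```python
import time


def retry_with_backoff(operation, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying on OSError with exponentially growing delays.

    urllib.error.URLError subclasses OSError, so network failures are covered.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Re-raising after the last attempt matches the "fail the run" option in the comment: the workflow stops instead of quietly benchmarking a floating tag.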



Copilot AI review requested due to automatic review settings April 7, 2026 08:13

Copilot started reviewing on behalf of zejunchen-zejun April 7, 2026 08:14

Copilot AI reviewed Apr 7, 2026

Copilot AI review requested due to automatic review settings April 8, 2026 03:32

zejunchen-zejun changed the title ~~[plugin][OOT dashboard] use nightly date tagged docker and neglect commit/message/author info~~ [plugin][OOT dashboard] use nightly date tagged docker Apr 8, 2026

Copilot started reviewing on behalf of zejunchen-zejun April 8, 2026 03:33

Copilot AI reviewed Apr 8, 2026

.github/workflows/atom-vllm-benchmark.yaml

.github/scripts/resolve_oot_image.py (outdated)

.github/scripts/resolve_oot_image.py (outdated)

Copilot AI review requested due to automatic review settings April 9, 2026 07:04

Copilot started reviewing on behalf of zejunchen-zejun April 9, 2026 07:06

zejunchen-zejun added 5 commits April 9, 2026 15:07

[plugin][OOT dashboard] use nightly date tagged docker

ee1acac

and neglect commit/message/author info

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

make lint happy

58f5bd3

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

retrieve message shown in detailed chart

63cbfa2

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

not fallback to vllm:latest, use docker with digest tag

dd9a31a

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

make lint happy

3d69252

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

zejunchen-zejun force-pushed the zejun/refine_oot_benchmark_0405 branch from 512283e to 3d69252 Compare April 9, 2026 07:08

Copilot AI reviewed Apr 9, 2026

.github/scripts/resolve_vllm_atom_image.py

.github/scripts/resolve_vllm_atom_image.py

.github/workflows/atom-vllm-benchmark.yaml

zejunchen-zejun added 2 commits April 10, 2026 21:49

Merge branch 'main' into zejun/refine_oot_benchmark_0405

7a41dff

only push data.js to dashboard

178e174

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

Copilot AI review requested due to automatic review settings April 10, 2026 14:18

Copilot started reviewing on behalf of zejunchen-zejun April 10, 2026 14:19

Copilot AI reviewed Apr 10, 2026

.github/scripts/resolve_vllm_atom_image.py

.github/scripts/resolve_vllm_atom_image.py

zejunchen-zejun added 2 commits April 10, 2026 22:58

split the workflow

96fb53f

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

Merge branch 'main' into zejun/refine_oot_benchmark_0405

fdb0cb6

Copilot AI review requested due to automatic review settings April 11, 2026 01:56

Copilot started reviewing on behalf of zejunchen-zejun April 11, 2026 01:58

Copilot AI reviewed Apr 11, 2026

.github/scripts/resolve_vllm_atom_image.py

zejunchen-zejun changed the title ~~[plugin][OOT dashboard] use nightly date tagged docker~~ [plugin][dashboard] use nightly date tagged docker Apr 12, 2026

replace the old name

57ac737

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

wuhuikx previously approved these changes Apr 13, 2026


divide the sglang nightly from sglang ci

98ac2d3

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

Copilot AI review requested due to automatic review settings April 13, 2026 05:43

zejunchen-zejun dismissed wuhuikx’s stale review via 98ac2d3 April 13, 2026 05:43

Copilot started reviewing on behalf of zejunchen-zejun April 13, 2026 05:45

Copilot AI reviewed Apr 13, 2026

wuhuikx previously approved these changes Apr 13, 2026

valarLip previously approved these changes Apr 13, 2026


align the job name

ef7dc87

Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>

zejunchen-zejun dismissed stale reviews from valarLip and wuhuikx via ef7dc87 April 13, 2026 10:12

valarLip approved these changes Apr 13, 2026


valarLip merged commit a6fe785 into main Apr 13, 2026
25 of 29 checks passed

valarLip deleted the zejun/refine_oot_benchmark_0405 branch April 13, 2026 11:45


4 participants
