test(ci): add dcgm-exporter compatibility unit test to validate-components workflow #8368

Merged
ganeshkumarashok merged 5 commits into main from suraj/test-dcgm-dependency
Apr 28, 2026

@surajssd
Member

What this PR does / why we need it:

Adds a TestDCGMExporterCompatibility unit test that validates dcgm-exporter package dependencies (datacenter-gpu-manager-4-core and datacenter-gpu-manager-4-proprietary) match the versions pinned in components.json — without requiring GPU VMs or any infrastructure.

Problem

The existing Test_DCGM_Exporter_Compatibility e2e test in the GPU E2E pipeline only triggers on changes to e2e/** or **/*.go files. Renovate PRs that bump DCGM package versions in parts/common/components.json don't match those path filters, so the compatibility check is silently skipped (as seen in #8354).

Solution

  • Add TestDCGMExporterCompatibility unit test in e2e/components/components_test.go that downloads .deb/.rpm packages from PMC and parses their dependency metadata natively in Go using blakesmith/ar + klauspost/compress/zstd (for .deb) and cavaliergopher/rpm (for .rpm) — no dpkg-deb or rpm CLI tools needed
  • Cover all three OS variants: Ubuntu 22.04, Ubuntu 24.04, and Azure Linux 3.0
  • Add a dcgm-compatibility job to .github/workflows/validate-components.yml so the test runs automatically on every PR, including Renovate dependency bumps
  • The existing GPU e2e test is kept for full integration coverage on scheduled GPU runs
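
As a rough illustration of the dependency check described above, the sketch below pulls the pinned version of one package out of a Debian-style `Depends` value. It is not the code from this PR: `dependencyVersion` is a hypothetical name, the `Depends` string in `main` is made up, and the sketch assumes the dependency is declared with an exact `=` relation and ignores alternatives written with `|`.

```go
package main

import (
	"fmt"
	"strings"
)

// dependencyVersion is a hypothetical helper (not taken from the PR). It scans
// a Debian-style Depends value and returns the version that the named package
// is pinned to with an exact "=" relation.
func dependencyVersion(depends, name string) (string, bool) {
	for _, clause := range strings.Split(depends, ",") {
		// Each clause looks like "pkg (op version)" or just "pkg".
		pkg, constraint, found := strings.Cut(strings.TrimSpace(clause), "(")
		if !found || strings.TrimSpace(pkg) != name {
			continue
		}
		constraint = strings.TrimSuffix(strings.TrimSpace(constraint), ")")
		op, version, ok := strings.Cut(strings.TrimSpace(constraint), " ")
		if !ok || op != "=" {
			continue // only exact pins are of interest in this sketch
		}
		return strings.TrimSpace(version), true
	}
	return "", false
}

func main() {
	// Illustrative input only; the real package metadata may differ.
	depends := "datacenter-gpu-manager-4-core (= 1:4.5.2-1), " +
		"datacenter-gpu-manager-4-proprietary (= 1:4.5.2-1), libc6 (>= 2.34)"
	for _, name := range []string{
		"datacenter-gpu-manager-4-core",
		"datacenter-gpu-manager-4-proprietary",
	} {
		version, ok := dependencyVersion(depends, name)
		fmt.Println(name, version, ok)
	}
}
```

The extracted value can then be compared with the version pinned in components.json for the same package.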

Contributor

Copilot AI left a comment

Pull request overview

Adds a lightweight, infra-free compatibility check to ensure dcgm-exporter package dependencies match the DCGM versions pinned in parts/common/components.json, and wires it into CI so Renovate bumps don’t bypass the validation.

Changes:

  • Added TestDCGMExporterCompatibility in e2e/components/components_test.go to download .deb/.rpm artifacts and parse dependency metadata in Go.
  • Added new Go module dependencies in e2e/go.mod and e2e/go.sum to support .deb (ar + tar + zstd) and .rpm parsing.
  • Added a dcgm-compatibility job to .github/workflows/validate-components.yml to run the new test on every PR.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| e2e/components/components_test.go | Introduces the new DCGM exporter compatibility test and package-metadata parsers. |
| e2e/go.mod | Adds required libraries for parsing .deb and .rpm metadata in Go tests. |
| e2e/go.sum | Records checksums for the newly added Go dependencies. |
| .github/workflows/validate-components.yml | Adds CI job to run the new compatibility unit test on PRs. |

…mponents` workflow

- Add `TestDCGMExporterCompatibility` unit test in `e2e/components/` that
downloads `dcgm-exporter` `.deb`/`.rpm` packages from PMC and verifies declared
dependencies on `datacenter-gpu-manager-4-core` and
`datacenter-gpu-manager-4-proprietary` match versions in `components.json`

- Parse package metadata natively in Go using `blakesmith/ar` +
`klauspost/compress/zstd` for `.deb` and `cavaliergopher/rpm` for `.rpm`,
eliminating the need for `dpkg-deb`/`rpm` CLI tools or VM infrastructure

- Add `dcgm-compatibility` job to `.github/workflows/validate-components.yml` so
  the test runs automatically on every PR (including Renovate dependency bumps)

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Cover `dcgm-exporter` compatibility check for all OS variants in
`components.json` — Ubuntu 22.04 was previously missing, so a
Renovate bump on that variant would bypass the version skew detection.

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
…n assertions

Future-proof against `previousLatestVersion` being added to
`components.json` — `GetExpectedPackageVersions` returns both
`latestVersion` and `previousLatestVersion` when present, so
asserting `len == 1` would break with a confusing error message.

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
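
One way to avoid asserting `len == 1`, as the commit message above describes, is to check membership instead. The sketch below is an assumption about the shape of such a check, not the assertion used in the PR; `versionAccepted` is a hypothetical name and the second version in `main` is invented.

```go
package main

import (
	"fmt"
	"slices"
)

// versionAccepted reports whether the version declared by the package is one
// of the versions pinned in components.json. The expected slice may hold one
// entry (latestVersion) or two (latestVersion and previousLatestVersion).
func versionAccepted(expected []string, actual string) bool {
	return slices.Contains(expected, actual)
}

func main() {
	// The second entry is a made-up previousLatestVersion for illustration.
	expected := []string{"1:4.5.2-1", "1:4.4.0-1"}
	fmt.Println(versionAccepted(expected, "1:4.5.2-1"))
	fmt.Println(versionAccepted(expected, "1:9.9.9-1"))
}
```

A membership check keeps passing when a second pinned version is added, whereas a length assertion would fail for a reason unrelated to compatibility.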
Log epoch, version, and release separately when parsing RPM
dependencies to verify that the `cavaliergopher/rpm` library correctly
decomposes the EVR fields used by `formatRPMVersion`.

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
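
For readers unfamiliar with EVR, the sketch below shows how epoch, version, and release are conventionally joined into a string such as `1:4.5.2-1`. It is a hypothetical stand-in, not the PR's `formatRPMVersion`, whose actual signature and behavior are not shown in this conversation; the choice to omit a zero epoch is an assumption of this sketch.

```go
package main

import "fmt"

// formatEVR is a hypothetical helper that joins RPM epoch, version, and
// release into the "epoch:version-release" form. A zero epoch and an empty
// release are omitted (an assumption made for this sketch).
func formatEVR(epoch int, version, release string) string {
	evr := version
	if release != "" {
		evr += "-" + release
	}
	if epoch > 0 {
		evr = fmt.Sprintf("%d:%s", epoch, evr)
	}
	return evr
}

func main() {
	// Values mirror the epoch/version/release fields logged by the test run.
	fmt.Println(formatEVR(1, "4.5.2", "1"))
}
```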
- Add `downloadWithRetry` helper with 60s `http.Client` timeout and
3 retries with exponential backoff to prevent CI hangs on transient
network failures

- Extract `parseDebControlField` to correctly parse RFC822-style
continuation lines in `.deb` control files, where long `Depends:`
values may wrap across multiple lines
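
The continuation-line handling described in the second bullet can be sketched as follows. This is a simplified illustration rather than the PR's `parseDebControlField`: `controlField` is a hypothetical name, the field match is case-sensitive, and the control text in `main` is made up.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// controlField returns the value of one field from a .deb control file.
// Lines that start with a space or tab continue the previous field, so they
// are appended to the value until the next field (or a blank line) begins.
func controlField(control, field string) (string, bool) {
	prefix := field + ":"
	var parts []string
	inField := false
	scanner := bufio.NewScanner(strings.NewReader(control))
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case inField && (strings.HasPrefix(line, " ") || strings.HasPrefix(line, "\t")):
			parts = append(parts, strings.TrimSpace(line)) // continuation line
		case inField:
			// Any other line ends the field we were collecting.
			return strings.TrimSpace(strings.Join(parts, " ")), true
		case strings.HasPrefix(line, prefix):
			inField = true
			parts = append(parts, strings.TrimSpace(strings.TrimPrefix(line, prefix)))
		}
	}
	return strings.TrimSpace(strings.Join(parts, " ")), inField
}

func main() {
	// Illustrative control file with a wrapped Depends value.
	control := "Package: dcgm-exporter\n" +
		"Depends: datacenter-gpu-manager-4-core (= 1:4.5.2-1),\n" +
		" datacenter-gpu-manager-4-proprietary (= 1:4.5.2-1)\n" +
		"Description: example\n"
	value, ok := controlField(control, "Depends")
	fmt.Println(ok)
	fmt.Println(value)
}
```

Reading only the first `Depends:` line would drop every dependency that wrapped onto a continuation line, which is the failure mode the commit guards against.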

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
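
The retry behavior from the first bullet of the commit above (a 60s `http.Client` timeout and exponential backoff) might look roughly like the sketch below. It is not the PR's `downloadWithRetry`: `fetchWithRetry`, `fetchOnce`, and `runFlakyDemo` are hypothetical names, the backoff starting value is invented, and treating "3 retries" as three attempts in total is an assumption. The demo uses a local `httptest` server so it runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// fetchWithRetry downloads url, retrying failed attempts and doubling the
// wait between attempts (exponential backoff).
func fetchWithRetry(client *http.Client, url string, attempts int, backoff time.Duration) ([]byte, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(backoff)
			backoff *= 2
		}
		body, err := fetchOnce(client, url)
		if err == nil {
			return body, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("download %s failed after %d attempts: %w", url, attempts, lastErr)
}

// fetchOnce performs a single GET and treats any non-200 status as an error.
func fetchOnce(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// runFlakyDemo starts a local server that fails twice with 503 and then
// succeeds, and returns the body plus the number of requests it received.
func runFlakyDemo() (string, int, error) {
	var count atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if count.Add(1) < 3 {
			http.Error(w, "try again", http.StatusServiceUnavailable)
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// The client timeout bounds each request so a stalled download cannot hang CI.
	client := &http.Client{Timeout: 60 * time.Second}
	data, err := fetchWithRetry(client, srv.URL, 3, 10*time.Millisecond)
	return string(data), int(count.Load()), err
}

func main() {
	body, requests, err := runFlakyDemo()
	fmt.Println(body, requests, err)
}
```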
Copilot AI review requested due to automatic review settings April 27, 2026 17:14
@surajssd surajssd force-pushed the suraj/test-dcgm-dependency branch from 52149e1 to 31bbb68 Compare April 27, 2026 17:14
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

@surajssd
Member Author

Tests ran locally:

➜  go test -v -run ^TestDCGMExporterCompatibility$ ./components/
Error loading .env file: open .env: no such file or directory
=== RUN   TestDCGMExporterCompatibility
=== RUN   TestDCGMExporterCompatibility/Ubuntu2204
    components_test.go:178: Expected versions from components.json:
    components_test.go:179:   dcgm-exporter: 4.8.1-ubuntu22.04u5
    components_test.go:180:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:181:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:185: Downloading dcgm-exporter package from https://packages.microsoft.com/repos/microsoft-ubuntu-jammy-prod/pool/main/d/dcgm-exporter/dcgm-exporter_4.8.1-ubuntu22.04u5_amd64.deb
    components_test.go:202: Actual versions from dcgm-exporter package:
    components_test.go:203:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:204:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:215: ✅ Version compatibility verified: dcgm-exporter 4.8.1-ubuntu22.04u5 is compatible with DCGM packages 1:4.5.2-1
=== RUN   TestDCGMExporterCompatibility/Ubuntu2404
    components_test.go:178: Expected versions from components.json:
    components_test.go:179:   dcgm-exporter: 4.8.1-ubuntu24.04u5
    components_test.go:180:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:181:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:185: Downloading dcgm-exporter package from https://packages.microsoft.com/repos/microsoft-ubuntu-noble-prod/pool/main/d/dcgm-exporter/dcgm-exporter_4.8.1-ubuntu24.04u5_amd64.deb
    components_test.go:202: Actual versions from dcgm-exporter package:
    components_test.go:203:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:204:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:215: ✅ Version compatibility verified: dcgm-exporter 4.8.1-ubuntu24.04u5 is compatible with DCGM packages 1:4.5.2-1
=== RUN   TestDCGMExporterCompatibility/AzureLinux3
    components_test.go:178: Expected versions from components.json:
    components_test.go:179:   dcgm-exporter: 4.8.1-5.azl3
    components_test.go:180:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:181:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:185: Downloading dcgm-exporter package from https://packages.microsoft.com/azurelinux/3.0/prod/cloud-native/x86_64/Packages/d/dcgm-exporter-4.8.1-5.azl3.x86_64.rpm
    components_test.go:200: RPM dependency datacenter-gpu-manager-4-core: epoch=1 version=4.5.2 release=1
    components_test.go:200: RPM dependency datacenter-gpu-manager-4-proprietary: epoch=1 version=4.5.2 release=1
    components_test.go:202: Actual versions from dcgm-exporter package:
    components_test.go:203:   datacenter-gpu-manager-4-core: 1:4.5.2-1
    components_test.go:204:   datacenter-gpu-manager-4-proprietary: 1:4.5.2-1
    components_test.go:215: ✅ Version compatibility verified: dcgm-exporter 4.8.1-5.azl3 is compatible with DCGM packages 1:4.5.2-1
--- PASS: TestDCGMExporterCompatibility (4.37s)
    --- PASS: TestDCGMExporterCompatibility/Ubuntu2204 (1.44s)
    --- PASS: TestDCGMExporterCompatibility/Ubuntu2404 (1.49s)
    --- PASS: TestDCGMExporterCompatibility/AzureLinux3 (1.44s)
PASS
ok  	github.com/Azure/agentbaker/e2e/components	5.501s

@ganeshkumarashok ganeshkumarashok merged commit 54341d2 into main Apr 28, 2026
33 of 37 checks passed
@ganeshkumarashok ganeshkumarashok deleted the suraj/test-dcgm-dependency branch April 28, 2026 18:00
4 participants