feat(cpu): add CPUHierarchy_v2 binding with per-CPU serial #129

Open
rvatkar wants to merge 4 commits into main from rvatkat/feat/cpu-hierarchy-v2-serial
@rvatkar rvatkar commented Jun 11, 2026

Collaborator

Adds Go bindings for dcgmGetCpuHierarchy_v2() so callers can access the Grace CPU serial returned by DCGM.

Changes

  • Add CPUHierarchy_v2 / CPUHierarchyCPU_v2 support.
  • Convert the v2 C hierarchy into Go structs, including Serial.
  • Keep the existing v1 CPU hierarchy API unchanged.
  • Add converter tests that exercise real C-struct conversion, including:
    • populated serial
    • empty hierarchy / NumCPUs == 0
    • CPU/core conversion coverage
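
For readers unfamiliar with this kind of conversion, here is a minimal sketch of the idea in plain Go. It is not the code in this PR: `cCPU`, `cHierarchy`, `cpuV2`, `hierarchyV2`, `toHierarchyV2`, the field names, and the array sizes are all hypothetical stand-ins for the cgo types, and the real DCGM struct layout may differ. It shows a NUL-terminated serial buffer becoming a Go string and an empty hierarchy producing an empty slice.

```go
package main

import (
	"bytes"
	"fmt"
)

// Hypothetical sizes; the real DCGM limits may differ.
const (
	maxCPUs      = 4
	serialLength = 16
)

// cCPU and cHierarchy mimic C structs with fixed-size arrays (assumed layout).
type cCPU struct {
	cpuID  uint32
	serial [serialLength]byte // NUL-terminated, like a C char array
}

type cHierarchy struct {
	numCPUs uint32
	cpus    [maxCPUs]cCPU
}

// cpuV2 and hierarchyV2 are illustrative Go-side result types.
type cpuV2 struct {
	CPUID  uint
	Serial string
}

type hierarchyV2 struct {
	NumCPUs uint
	CPUs    []cpuV2
}

// cString returns the bytes before the first NUL as a Go string.
func cString(buf []byte) string {
	if i := bytes.IndexByte(buf, 0); i >= 0 {
		return string(buf[:i])
	}
	return string(buf)
}

// toHierarchyV2 copies only the populated entries and clamps the count
// so a bad numCPUs value cannot index past the fixed-size array.
func toHierarchyV2(c cHierarchy) hierarchyV2 {
	n := c.numCPUs
	if n > maxCPUs {
		n = maxCPUs
	}
	h := hierarchyV2{NumCPUs: uint(n), CPUs: make([]cpuV2, 0, n)}
	for i := uint32(0); i < n; i++ {
		h.CPUs = append(h.CPUs, cpuV2{
			CPUID:  uint(c.cpus[i].cpuID),
			Serial: cString(c.cpus[i].serial[:]),
		})
	}
	return h
}

func main() {
	var c cHierarchy
	c.numCPUs = 1
	copy(c.cpus[0].serial[:], "SN-EXAMPLE") // made-up serial value
	fmt.Printf("%+v\n", toHierarchyV2(c))
}
```

Clamping the count is a defensive choice in this sketch only; whether the actual converter does the same is not stated in this PR.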

Validation

  • go test ./pkg/dcgm

@rvatkar rvatkar requested review from glowkey and nccurry and removed request for glowkey June 11, 2026 22:31
Comment thread pkg/dcgm/cpu.go Outdated
return toCpuHierarchy(c_hierarchy), nil
}

func cpuHierarchyError(operation string, result C.dcgmReturn_t) error {

@nccurry nccurry Jun 12, 2026

Collaborator

Do we need both of these functions? It's functions wrapping functions wrapping functions.
Maybe we do, maybe we don't.

Collaborator Author

I’ll simplify this. cpuHierarchyError is still useful because these APIs need to preserve the DCGM status as a wrapped *dcgm.Error, but wrapCPUHierarchyError is just an extra test seam. I’ll remove
that layer and test cpuHierarchyError directly.

Comment thread pkg/dcgm/cpu.go
Comment thread pkg/dcgm/cpu_test.go Outdated
assert.ErrorContains(t, err, tt.status)

var dcgmErr *Error
require.True(t, errors.As(err, &dcgmErr))

Collaborator

The repo enables testifylint’s error-is-as rule, so this assertion should use require.ErrorAs(t, err, &dcgmErr) instead of require.True(t, errors.As(...)); that also lets the errors import go away.

Collaborator Author

Fixed
