Misleading field label "GPU Device IDs Detected" in dcgmi diag output #282

@nizar-sw

DCGM Version: 4.4.1 / 4.5.2

Description:

The dcgmi diag output displays a field labeled "GPU Device IDs Detected" which shows identical values for all GPUs:

| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182       |

This is misleading because:

  • "GPU Device IDs" implies unique per-GPU identifiers
  • Users expect N different values for N GPUs
  • The actual values shown are PCI Device IDs (hardware SKU), which are identical for all GPUs of the same model
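The mismatch can be checked mechanically. Below is a minimal Python sketch, assuming the table layout shown in this issue (two `|`-separated cells per row); the helper name `device_id_summary` is hypothetical and not part of DCGM. It counts how many IDs the row lists and how many of them are distinct.

```python
def device_id_summary(diag_output, label="GPU Device IDs Detected"):
    """Return (number of IDs listed, set of distinct IDs) for the metadata row.

    Assumes the row looks like '| <label> | id, id, ... |' as in the
    dcgmi diag output quoted in this issue.
    """
    for line in diag_output.splitlines():
        # Drop the outer pipes, then split into the label and value cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] == label:
            ids = [v.strip() for v in cells[1].split(",") if v.strip()]
            return len(ids), set(ids)
    raise ValueError(f"row {label!r} not found")


# Row copied from the DCGM 4.4.1 output in this issue (8 GPUs).
sample = "| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182, 3182 |"
count, distinct = device_id_summary(sample)
print(count, sorted(distinct))  # 8 ['3182']
```

Eight entries collapse to a single distinct value, so the field cannot be used to tell the GPUs apart.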

Expected behavior:

Either:

  1. Rename the field to accurately reflect what it shows: "PCI Device IDs Detected" or "GPU Model IDs"
  2. Show identifiers that are actually unique per GPU (UUIDs, indices, or serial numbers)
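For comparison, here is a rough Python sketch of what per-GPU identifiers look like next to the PCI device ID. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` fields `index`, `uuid`, and `pci.device_id`; the function names are hypothetical, and the formatting of `pci.device_id` from `nvidia-smi` may differ from what dcgmi prints. This is not how DCGM collects the value, only an illustration of option 2.

```python
import csv
import io
import subprocess

# nvidia-smi --query-gpu field names used for this illustration.
QUERY_FIELDS = ["index", "uuid", "pci.device_id"]


def parse_gpu_identities(csv_text):
    """Parse 'csv,noheader' output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or unexpected lines
        rows.append(dict(zip(QUERY_FIELDS, (field.strip() for field in record))))
    return rows


def query_gpu_identities():
    """Run nvidia-smi and return the parsed per-GPU identifiers."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    )
    return parse_gpu_identities(result.stdout)
```

On a machine with identical GPUs, `query_gpu_identities()` would be expected to return rows whose `uuid` and `index` differ while `pci.device_id` repeats, which is the distinction the current label hides.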

Reproduction:

```bash
dcgmi diag --run 1
```

Run this on any system with multiple identical GPUs.

Repro stdout examples:

Successfully ran diagnostic for group.
+---------------------------+------------------------------------------------+
| Diagnostic                | Result                                         |
+===========================+================================================+
|-----  Metadata  ----------+------------------------------------------------|
| DCGM Version              | 4.4.1                                          |
| Driver Version Detected   | 580.95.05                                      |
| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182, 3182 |
|-----  Deployment  --------+------------------------------------------------|
| software                  | Pass                                           |
|                           | GPU0: Pass                                     |
|                           | GPU1: Pass                                     |
|                           | GPU2: Pass                                     |
|                           | GPU3: Pass                                     |
|                           | GPU4: Pass                                     |
|                           | GPU5: Pass                                     |
|                           | GPU6: Pass                                     |
|                           | GPU7: Pass                                     |
+---------------------------+------------------------------------------------+
Successfully ran diagnostic for group.
+---------------------------+------------------------------------------------+
| Diagnostic                | Result                                         |
+===========================+================================================+
|-----  Metadata  ----------+------------------------------------------------|
| DCGM Version              | 4.5.2                                          |
| Driver Version Detected   | 580.126.09                                     |
| GPU Device IDs Detected   | 3182, 3182, 3182, 3182, 3182, 3182, 3182       |
|-----  Deployment  --------+------------------------------------------------|
| software                  | Pass                                           |
|                           | GPU0: Pass                                     |
|                           | GPU1: Pass                                     |
|                           | GPU2: Pass                                     |
|                           | GPU3: Pass                                     |
|                           | GPU4: Pass                                     |
|                           | GPU5: Pass                                     |
|                           | GPU6: Pass                                     |
+---------------------------+------------------------------------------------+
