Skip to content

Conversation

@iximeow
Copy link
Member

@iximeow iximeow commented Dec 19, 2025

NVMe commands are unique in a device across queue ID (qid in various parts of Propolis) and command ID (cid in various parts of Propolis), but are ambiguous when looking at an entire VM: two different NVMe devices will have a "queue 1", and their distinct "queue 1" may also have "command 1234" submitted at the same time.

Pick a device ID for NVMe controllers, and plumb that through to distinguish queues similar to how we distinguish queues for block devices: a devq_id construction that makes a device and queue unique across the VM.

In the NVMe case, a devq_id is not uniquely identifying across the life of a VM, as the queues can be destroyed, recreated, the controller reset, etc. But absent such an administrative operation, this helps untangle what's happening across a VM with many disks.

Note that block attachment IDs and NVMe IDs are similar in name, and probably similar in practice, but are tracked distinctly. This is similar to the difference between block qeuue IDs and NVMe queue IDs: the architcture is similar, but the mapping between them is more arbitrary and may change as an implementation detail of Propolis.


a few of us (@ahl among others) were looking at the nvme probes as a "top-of-propolis" view of submission->completion time and found, to our surprise, that sometimes we'd get really bogus data. it took a bit of thinking to realize that we were probably seeing duplicate (qid, cid) tuples since we were loading a bunch of disks in the VM. I want to see that not happen any more for good measure, but from first principles the probes don't seem sufficient for analysis with multiple disks at the moment.

nvme-trace.d only looked at cid, not the queue ID, so it'd be even more ambiguous. it probably could use a sprinkling of devq_id/arg0 too.

NVMe commands are unique in a device across queue ID (`qid` in various
parts of Propolis) and command ID (`cid` in various parts of Propolis),
but are ambiguous when looking at an entire VM: two different NVMe
devices will have a "queue 1", and their distinct "queue 1" may also
have "command 1234" submitted at the same time.

Pick a device ID for NVMe controllers, and plumb that through to
distinguish queues similar to how we distinguish queues for block
devices: a `devq_id` construction that makes a device and queue unique
across the VM.

In the NVMe case, a `devq_id` is not uniquely identifying across the
life of a VM, as the queues can be destroyed, recreated, the controller
reset, etc. But absent such an administrative operation, this helps
untangle what's happening across a VM with many disks.

Note that block attachment IDs and NVMe IDs are similar in name, and
probably similar in practice, but are tracked distinctly. This is
similar to the difference between block qeuue IDs and NVMe queue IDs:
the architcture is similar, but the mapping between them is more
arbitrary and may change as an implementation detail of Propolis.
sqid,
u16::from(block_qid),
devsq_id,
block_devqid,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bringing block_qid -> block_devqid along here because otherwise we'd know which NVMe device notified a block queue ... somewhere ... with no way to find out which block attachment we'd just poked.

@iximeow iximeow force-pushed the ixi/nvme-probe-device-qid branch from b6f258d to f39a608 Compare December 19, 2025 22:48
@AlejandroME AlejandroME added the local storage Relating to the local storage project label Dec 19, 2025
@AlejandroME AlejandroME added this to the 18 milestone Dec 19, 2025
@iximeow
Copy link
Member Author

iximeow commented Dec 19, 2025

ok, that's what I get for building it and seeing the VM run, but not cargo test .. :)

Copy link
Contributor

@ahl ahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good stuff

// must have changed the type of QueueId such this is no longer absurd!
#[allow(clippy::absurd_extreme_comparisons)]
{
static_assertions::const_assert!(QueueId::MAX <= u16::MAX);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Clippy's defense u16::MAX <= u16::MAX is pretty absurd...

}
}

/// Combine an NVMe device and queue ID into a single u64 for probes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... to make a globally unique ID?

Copy link
Member Author

@iximeow iximeow Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't want to go quite that far: the NVMe controller can be reset, which will destroy queues, abandon any in-flight work, and probably have queues with the same ID re-created when the guest sets up the controller again. but I'll adjust this to say a bit more.

Comment on lines 731 to 732
/// The ID of the device this is a completion queue for. Not semantically
/// interesting for completion queues, but useful context in probes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The ID of the device this is a completion queue for. Not semantically
/// interesting for completion queues, but useful context in probes.
/// The ID of the device associated with this completion queue. Not semantically
/// interesting for completion queues, but useful context in probes.

Not trying to be grammatically picayune--took me a few reads before I understood the meaning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much appreciated. as much as I try not to, I invent novel sentence structures sometimes.. played boggle with the words here and I hope it's much more legible now.

Comment on lines +27 to +28
fn nvme_read_enqueue(devsq_id: u64, idx: u16, cid: u16, off: u64, sz: u64) {
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies to literally everyone for the grating formatting.

@iximeow iximeow force-pushed the ixi/nvme-probe-device-qid branch from 6c5a37f to c47deb6 Compare December 20, 2025 00:27
@iximeow
Copy link
Member Author

iximeow commented Dec 20, 2025

"[TEST - EVENT] Error obtaining artifact from source"

ah, well, if office internet is flaky i suppose it will be hard for that computer to get artifacts

@iximeow iximeow merged commit fc2def4 into master Dec 20, 2025
11 checks passed
@iximeow iximeow deleted the ixi/nvme-probe-device-qid branch December 20, 2025 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

local storage Relating to the local storage project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants