forked from NVIDIA/open-gpu-kernel-modules
Open
Description
NVIDIA Open GPU Kernel Modules Version
570.133, 570.144, 570.148, 570.153
Operating System and Version
Fedora 41
Hardware
RTX 5090 x2, RTX 4090 x2, RTX 3090 x2, A6000
AMD Ryzen 7 7800X3D
192GB RAM
MSI Carbon X670E
Kernel Release
6.14.9
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- I am running on a stable kernel release.
Build Command
Terminal output/Build Log
More Info
Hi there. I'm trying to use P2P between two RTX 5090s, but I get the following errors:
pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ export CUDA_VISIBLE_DEVICES=2,3
pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ ./p2pBandwidthLatencyTest
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 5090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 5090, pciBusID: 3, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0
***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
P2P Connectivity Matrix
D\D 0 1
0 1 1
1 1 1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1
0 1728.49 24.63
1 24.70 1761.56
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
Cuda failure /home/pancho/cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest/p2pBandwidthLatencyTest.cu:192: 'mapping of buffer object failed'
pancho@fedora:~/cuda-samples/build/Samples/0_Introduction/simpleP2P$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2
Checking GPU(s) for support of peer to peer memory access...
> Peer access from NVIDIA GeForce RTX 5090 (GPU0) -> NVIDIA GeForce RTX 5090 (GPU1) : Yes
> Peer access from NVIDIA GeForce RTX 5090 (GPU1) -> NVIDIA GeForce RTX 5090 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
CUDA error at /home/pancho/cuda-samples/Samples/0_Introduction/simpleP2P/simpleP2P.cu:130 code=205(cudaErrorMapBufferObjectFailed) "cudaDeviceEnablePeerAccess(gpuid[1], 0)"
I have other GPUs on my system (A6000, 3090s, 4090s) and P2P works fine on these cards.
I tried the patch mentioned in #29 (comment) but, as I noted in #29 (comment), it doesn't seem to work: I edited the files it mentions and saw no difference.
I also tried the https://github.com/tinygrad/open-gpu-kernel-modules/tree/570.148.08-p2p branch, but the issue persists.
I installed the driver first, then applied the patch with ./install.sh.
IOMMU and PCIe ACS are disabled.
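For anyone wanting to double-check the same two settings, here is a rough sketch of how they could be verified from Python. It assumes the usual `lspci -vvv` layout, where each device starts on an unindented line and ACS state appears on an indented `ACSCtl:` line with `+` meaning a flag is enabled, and it assumes that a populated /sys/kernel/iommu_groups directory means the IOMMU is active. The function names and the sample text are made up for illustration and are not taken from my system.

```python
import os

def acs_enabled_flags(lspci_text):
    """Map each PCI address to the ACS control flags shown as enabled ('+')."""
    result = {}
    device = None
    for line in lspci_text.splitlines():
        # Device header lines are unindented, e.g. "00:01.1 PCI bridge: ..."
        if line and not line[0].isspace():
            device = line.split()[0]
        elif "ACSCtl:" in line and device is not None:
            flags = line.split("ACSCtl:", 1)[1].split()
            enabled = [f[:-1] for f in flags if f.endswith("+")]
            if enabled:
                result[device] = enabled
    return result

def iommu_groups_present(path="/sys/kernel/iommu_groups"):
    """Return True if the kernel exposes at least one IOMMU group."""
    return os.path.isdir(path) and bool(os.listdir(path))

# Hypothetical sample in the shape of `lspci -vvv` output (not real data).
SAMPLE = (
    "00:01.1 PCI bridge: Example root port\n"
    "\t\tACSCtl:\tSrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-\n"
    "00:01.2 PCI bridge: Example root port\n"
    "\t\tACSCtl:\tSrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-\n"
)

print(acs_enabled_flags(SAMPLE))
```

On a real machine the text would come from running lspci -vvv as root (capability details are usually hidden from unprivileged users). An empty result from acs_enabled_flags together with False from iommu_groups_present would be consistent with both being disabled.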
Any help is appreciated.