Error on P2P with 2x RTX 5090: "mapping of buffer object failed" or code=205(cudaErrorMapBufferObjectFailed) "cudaDeviceEnablePeerAccess(gpuid[1], 0)" #44

@Panchovix

Description

NVIDIA Open GPU Kernel Modules Version

570.133, 570.144, 570.148, 570.153

Operating System and Version

Fedora 41

Hardware

2x RTX 5090 + 2x RTX 4090 + 2x RTX 3090 + A6000

AMD Ryzen 7 7800X3D

192GB RAM

MSI Carbon X670E

Kernel Release

6.14.9

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Build Command

Terminal output/Build Log

More Info

Hi there. I'm trying to use P2P between two RTX 5090s, but I get the following errors:

pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ export CUDA_VISIBLE_DEVICES=2,3
pancho@fedora:~/cuda-samples/build/Samples/5_Domain_Specific/p2pBandwidthLatencyTest$ ./p2pBandwidthLatencyTest 
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 5090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 5090, pciBusID: 3, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1
     0       1     1
     1       1     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 1728.49  24.63 
     1  24.70 1761.56 
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
Cuda failure /home/pancho/cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest/p2pBandwidthLatencyTest.cu:192: 'mapping of buffer object failed'
pancho@fedora:~/cuda-samples/build/Samples/0_Introduction/simpleP2P$ ./simpleP2P 
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2

Checking GPU(s) for support of peer to peer memory access...
> Peer access from NVIDIA GeForce RTX 5090 (GPU0) -> NVIDIA GeForce RTX 5090 (GPU1) : Yes
> Peer access from NVIDIA GeForce RTX 5090 (GPU1) -> NVIDIA GeForce RTX 5090 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
CUDA error at /home/pancho/cuda-samples/Samples/0_Introduction/simpleP2P/simpleP2P.cu:130 code=205(cudaErrorMapBufferObjectFailed) "cudaDeviceEnablePeerAccess(gpuid[1], 0)" 
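
For anyone reproducing this, the kernel log may carry more detail from the driver than the CUDA error string does. A minimal sketch for pulling those lines out, assuming the driver tags its kernel messages with `NVRM` or `Xid`; the function name `nvidia_kernel_msgs` is made up for illustration and is not part of any NVIDIA tool:

```shell
# Hypothetical helper: keep only kernel-log lines that look like NVIDIA
# driver messages. Reads log text on stdin, so it can be fed from
# `dmesg` or `journalctl -k`.
nvidia_kernel_msgs() {
    # `|| true` so that "no matching lines" is not treated as a failure.
    grep -E 'NVRM|Xid' || true
}

# Usage (root may be needed where dmesg is restricted):
#   sudo dmesg | nvidia_kernel_msgs
```

Running it right after the failing `cudaDeviceEnablePeerAccess` call keeps the relevant messages near the end of the output.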

I have other GPUs on my system (A6000, 3090s, 4090s) and P2P works fine on these cards.

I tried the patch mentioned in #29 (comment), but as I noted in #29 (comment), it doesn't seem to work: I edited the files mentioned, but there was no difference.

I have also tried the https://github.com/tinygrad/open-gpu-kernel-modules/tree/570.148.08-p2p branch, but the issue is still there.

In both cases I installed the driver first, then applied the patch with ./install.sh.

IOMMU and PCIe ACS are disabled.
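
One way these two settings can be double-checked from a running system is sketched below. It assumes a Linux host with sysfs mounted and `lspci` from pciutils installed; the function names `iommu_group_count` and `acs_enabled_devices` are hypothetical helpers, not existing commands:

```shell
# Hypothetical helpers for checking IOMMU and PCIe ACS state on Linux.

# Count IOMMU groups. A count of 0 typically means the kernel has no
# active IOMMU. The optional directory argument exists only so the
# function can be pointed at a test directory.
iommu_group_count() {
    dir="${1:-/sys/kernel/iommu_groups}"
    if [ -d "$dir" ]; then
        ls -1 "$dir" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Read `lspci -vvv` text on stdin and print every device whose ACS
# control register (ACSCtl) still has at least one feature switched on,
# i.e. at least one flag shown with a trailing '+'.
acs_enabled_devices() {
    awk '
        /^[^[:space:]]/    { dev = $0 }             # unindented line starts a device entry
        /ACSCtl:/ && /[+]/ { print dev; print $0 }  # report device and its ACSCtl line
    '
}

# Usage (lspci needs root to read the extended capabilities):
#   iommu_group_count
#   sudo lspci -vvv | acs_enabled_devices
```

If both settings are off as described, the first command should print `0` and the second should print nothing.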

Any help is appreciated.
