Description
NVIDIA Open GPU Kernel Modules Version
565.57.01-p2p
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
- I confirm that this does not happen with the proprietary driver package.
Operating System and Version
Ubuntu 22.04.4 LTS
Kernel Release
Linux nj01n182 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
- I am running on a stable kernel release.
Hardware: GPU
GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-fd8fde67-bf94-8d32-8e09-bde3e1d137c3)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-fcaa4dab-d13d-2b11-b8ce-2923ed46919d)
Describe the bug
Running simpleP2P completes successfully and reports "Test passed":
CUDA-capable device count: 2
Checking GPU(s) for support of peer to peer memory access...
Peer access from NVIDIA GeForce RTX 4090 (GPU0) -> NVIDIA GeForce RTX 4090 (GPU1) : Yes
Peer access from NVIDIA GeForce RTX 4090 (GPU1) -> NVIDIA GeForce RTX 4090 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
Allocating buffers (2048MB on GPU0, GPU1 and CPU Host)...
go0 alloc addr:0x7f4b0e000000
g1 alloc addr:0x7f4a8e000000
h0 alloc addr:0x7f4a0e000000
Creating event handles...
Preparing host buffer and memcpy to GPU0...
Run kernel on GPU1, taking source data from GPU0 and writing to GPU1...
Run kernel on GPU0, taking source data from GPU1 and writing to GPU0...
Copy data back to host from GPU0 and verify results...
Disabling peer access...
Shutting down...
Test passed
However, dmesg shows the following error logs:
[4113109.595315] NVRM: iovaspaceDestruct_IMPL: 4 left-over mappings in IOVAS 0x3e00
[4113109.695407] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695414] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.695584] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695587] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697156] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697158] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697161] NVRM: nvAssertFailedNoLog: Assertion failed: Sysmemdesc outlived its attached pGpu @ mem_desc.c:1509
[4113109.697234] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697237] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
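To make the pattern in the log easier to compare across runs, the assertions can be tallied by source location. The following is a minimal sketch, not part of the driver or any NVIDIA tooling: the function name `tally_nvrm_asserts` is hypothetical, and the regular expression assumes the `NVRM: nvAssertFailedNoLog: Assertion failed: <message> @ <file>:<line>` layout seen in the excerpt above.

```python
import re
from collections import Counter

# Assumed line layout, taken from the dmesg excerpt in this report:
# [ts] NVRM: nvAssertFailedNoLog: Assertion failed: <message> @ <file>:<line>
ASSERT_RE = re.compile(
    r"NVRM: nvAssertFailedNoLog: Assertion failed: (?P<msg>.+?) @ (?P<site>\S+:\d+)\s*$"
)


def tally_nvrm_asserts(dmesg_text):
    """Count NVRM assertion failures per (source location, message).

    Hypothetical helper for triage; lines that do not match the assumed
    layout (for example the iovaspaceDestruct_IMPL line) are ignored.
    """
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = ASSERT_RE.search(line)
        if match:
            counts[(match.group("site"), match.group("msg"))] += 1
    return counts


# Two lines copied from the log above, used as sample input.
sample = (
    "[4113109.695407] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592\n"
    "[4113109.697161] NVRM: nvAssertFailedNoLog: Assertion failed: Sysmemdesc outlived its attached pGpu @ mem_desc.c:1509\n"
)
for (site, msg), n in sorted(tally_nvrm_asserts(sample).items()):
    print(n, site, msg)
```

Applied to the full excerpt above, this groups the output into four hits each at io_vaspace.c:592 and io_vaspace.c:601, plus one at mem_desc.c:1509.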
To Reproduce
Run the simpleP2P CUDA sample.
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
No response