while running SimpleP2P, kernel reports "NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592" error #38

@legezywzh

NVIDIA Open GPU Kernel Modules Version

565.57.01-p2p

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Ubuntu 22.04.4 LTS

Kernel Release

Linux nj01n182 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-fd8fde67-bf94-8d32-8e09-bde3e1d137c3)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-fcaa4dab-d13d-2b11-b8ce-2923ed46919d)

Describe the bug

simpleP2P itself runs to completion and passes:
CUDA-capable device count: 2

Checking GPU(s) for support of peer to peer memory access...

Peer access from NVIDIA GeForce RTX 4090 (GPU0) -> NVIDIA GeForce RTX 4090 (GPU1) : Yes
Peer access from NVIDIA GeForce RTX 4090 (GPU1) -> NVIDIA GeForce RTX 4090 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
Allocating buffers (2048MB on GPU0, GPU1 and CPU Host)...
go0 alloc addr:0x7f4b0e000000
g1 alloc addr:0x7f4a8e000000
h0 alloc addr:0x7f4a0e000000
Creating event handles...
Preparing host buffer and memcpy to GPU0...
Run kernel on GPU1, taking source data from GPU0 and writing to GPU1...
Run kernel on GPU0, taking source data from GPU1 and writing to GPU0...
Copy data back to host from GPU0 and verify results...
Disabling peer access...
Shutting down...
Test passed

However, dmesg shows the following errors after the run:

[4113109.595315] NVRM: iovaspaceDestruct_IMPL: 4 left-over mappings in IOVAS 0x3e00
[4113109.695407] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695414] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.695584] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695587] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697156] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697158] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697161] NVRM: nvAssertFailedNoLog: Assertion failed: Sysmemdesc outlived its attached pGpu @ mem_desc.c:1509
[4113109.697234] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697237] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
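For triage, it can help to tally which assertion sites fire and how often. The following is a minimal sketch, not part of the driver or any NVIDIA tool: the function name `count_assert_sites` and the regular expression are my own, and the pattern assumes the `nvAssertFailedNoLog` line format seen in the log above.

```python
import re
from collections import Counter

# Assumed line format, taken from the dmesg excerpt in this report:
# [ts] NVRM: nvAssertFailedNoLog: Assertion failed: <expr> @ <file>:<line>
ASSERT_RE = re.compile(
    r"NVRM: nvAssertFailedNoLog: Assertion failed: (?P<expr>.+?) @ (?P<site>\S+:\d+)"
)

def count_assert_sites(log_text):
    """Return a Counter mapping (site, expression) to the number of hits."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ASSERT_RE.search(line)
        if match:
            counts[(match.group("site"), match.group("expr"))] += 1
    return counts

# The dmesg lines quoted in this report.
SAMPLE = """\
[4113109.595315] NVRM: iovaspaceDestruct_IMPL: 4 left-over mappings in IOVAS 0x3e00
[4113109.695407] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695414] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.695584] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.695587] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697156] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697158] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
[4113109.697161] NVRM: nvAssertFailedNoLog: Assertion failed: Sysmemdesc outlived its attached pGpu @ mem_desc.c:1509
[4113109.697234] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:592
[4113109.697237] NVRM: nvAssertFailedNoLog: Assertion failed: pIOVAS != NULL @ io_vaspace.c:601
"""

if __name__ == "__main__":
    for (site, expr), hits in count_assert_sites(SAMPLE).most_common():
        print(f"{hits:3d}  {site}  {expr}")
```

On the excerpt above this counts four hits each at io_vaspace.c:592 and io_vaspace.c:601, and one at mem_desc.c:1509. The `iovaspaceDestruct_IMPL` line is not an assertion and is deliberately not matched.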

To Reproduce

Run simpleP2P from the CUDA samples on the two RTX 4090 GPUs listed above, then check dmesg.

Bug Incidence

Always

nvidia-bug-report.log.gz

More Info

No response

Labels

bug