Support for GPUDirect RDMA #46

@Antinis

NVIDIA Open GPU Kernel Modules Version

570.124.06-p2p

Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.

  • I confirm that this does not happen with the proprietary driver package.

Operating System and Version

Ubuntu 22.04.5 LTS

Kernel Release

5.15.0-135-generic

Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.

  • I am running on a stable kernel release.

Hardware: GPU

NVIDIA GeForce RTX 3090 (UUID: GPU-02cdc648-49bd-4725-805d-b7d138b6c1d4)

Describe the bug

Apologies for filing a feature request in the bug section.

I would like to use GPUDirect RDMA on my GeForce GPU, with a Mellanox ConnectX network card for cross-node communication. GPUDirect P2P and GPUDirect RDMA should be very similar in principle, since both let another PCIe device read and write GPU memory directly, yet the current driver supports only GPUDirect P2P. I therefore have the following questions and would appreciate your advice:

  1. Compared to GPUDirect P2P, does the GSP firmware impose additional restrictions on GPUDirect RDMA? If not, are the restrictions enforced only at the software driver level?

  2. Do you have any general suggestions for enabling GPUDirect RDMA with this driver, or an estimate of the implementation difficulty?
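For context, here is a minimal sketch of how one might check which of the kernel modules usually involved in GPUDirect RDMA are loaded on a node. It only parses `/proc/modules`. The module list (`nvidia`, `nvidia_peermem`, `ib_core`, `mlx5_core`) is an assumption about a typical ConnectX setup, and the helper names are hypothetical. A loaded `nvidia_peermem` does not by itself show that RDMA to GPU memory works on a GeForce card with this driver.

```python
from pathlib import Path

# Kernel modules commonly involved in GPUDirect RDMA with Mellanox NICs.
# Which of these a given setup actually needs is an assumption of this sketch.
MODULES_OF_INTEREST = ("nvidia", "nvidia_peermem", "ib_core", "mlx5_core")


def parse_loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first field is the module name
    return names


def module_report(proc_modules_text, wanted=MODULES_OF_INTEREST):
    """Map each module of interest to True (loaded) or False (not loaded)."""
    loaded = parse_loaded_modules(proc_modules_text)
    return {name: name in loaded for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/modules")
    # Fall back to empty text on systems without /proc/modules.
    text = path.read_text() if path.exists() else ""
    for name, present in module_report(text).items():
        print(f"{name:16s} {'loaded' if present else 'not loaded'}")
```

Running the script on each node prints one line per module. If `nvidia_peermem` is missing, the NIC stack has no path for registering GPU memory, whatever the firmware allows.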

Thank you very much for any advice or hints!

To Reproduce

N/A

Bug Incidence

Once

nvidia-bug-report.log.gz

N/A

More Info

No response

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)
