@praneshgo
Contributor

Description

  • Introduced external GPU synchronization primitives enabling interoperability with Direct3D 12 and Vulkan graphics APIs.
  • Added APIs to acquire, wait on, and signal external GPU synchronization fences or semaphores in execution providers.
  • Enabled setup of CUDA in Graphics (CIG) contexts to synchronize CUDA streams with graphics API command queues/devices.
  • Extended NVIDIA TensorRT RTX execution provider with support for importing and synchronizing external semaphores from graphics APIs.
  • Added execution provider interface methods for managing external synchronization primitives and interop synchronization.
  • Added build options to enable DirectX and Vulkan interoperability features.
  • Added comprehensive GPU interoperability test validating D3D12 and ONNX Runtime synchronization with TensorRT RTX.
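
For readers unfamiliar with the wait/signal pattern that the bullets above refer to, here is a minimal CPU-only sketch. It assumes timeline-style semantics (a monotonically increasing value, as with a D3D12 fence or a Vulkan timeline semaphore). `TimelineFence` and `RunFrame` are hypothetical names for illustration only; they are not part of ONNX Runtime or this PR, and no GPU API is involved.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical CPU-side stand-in for a timeline fence. Not an ONNX Runtime type.
class TimelineFence {
 public:
  // Signal: advance the fence to `value` (never backwards) and wake waiters.
  void Signal(std::uint64_t value) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (value > completed_) completed_ = value;
    }
    cv_.notify_all();
  }

  // Wait: block until the fence has reached at least `value`.
  void Wait(std::uint64_t value) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return completed_ >= value; });
  }

  std::uint64_t CompletedValue() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return completed_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable cv_;
  std::uint64_t completed_ = 0;
};

// Simulates one frame. The "graphics" thread writes the input and signals 1;
// the "inference" thread waits on 1, computes, and signals 2; the caller
// waits on 2 before reading the output.
inline int RunFrame(int input) {
  TimelineFence fence;
  int shared_input = 0;
  int shared_output = 0;

  std::thread graphics([&] {
    shared_input = input;  // producer work
    fence.Signal(1);       // input is ready
  });
  std::thread inference([&] {
    fence.Wait(1);                     // do not read the input too early
    shared_output = shared_input * 2;  // stand-in for running the model
    fence.Signal(2);                   // output is ready
  });

  fence.Wait(2);
  graphics.join();
  inference.join();
  assert(fence.CompletedValue() == 2);
  return shared_output;
}
```

Without the `Wait(1)` call the consumer could read `shared_input` before it is written, which is the same class of failure as the garbage output observed when the interop calls are commented out.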

Motivation and Context

This commit focuses mainly on the design, with sample apps that use either the DX or the Vulkan graphics API in mind.
The implementation currently covers DX only; Vulkan support is planned to follow soon.
The NV TRT RTX EP receives the implementation, and virtual base APIs are added to the execution_provider.h header to provide a fallback path for other EPs.
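
A rough sketch of the "virtual base API with fallback" idea described above, under stated assumptions: the class names, `SketchStatus`, and the method signatures are hypothetical and do not mirror the actual declarations in execution_provider.h. The base class returns a "not implemented" status by default, and only an interop-capable EP overrides it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical minimal status type; ONNX Runtime has its own status class.
struct SketchStatus {
  bool ok;
  std::string message;
};

// Hypothetical stand-in for the execution provider base interface.
class EpSketch {
 public:
  virtual ~EpSketch() = default;

  // Fallback path: EPs that do not override this report "not implemented".
  virtual SketchStatus InteropWait(void* /*stream*/, std::uint64_t /*fence_value*/) {
    return {false, "InteropWait is not implemented by this EP"};
  }
};

// Hypothetical EP that supports interop and overrides the fallback.
class InteropEpSketch : public EpSketch {
 public:
  SketchStatus InteropWait(void* stream, std::uint64_t fence_value) override {
    if (stream == nullptr) return {false, "stream is nullptr"};  // early return
    last_waited_ = fence_value;  // a real EP would enqueue a GPU-side wait here
    return {true, ""};
  }
  std::uint64_t last_waited() const { return last_waited_; }

 private:
  std::uint64_t last_waited_ = 0;
};

// Hypothetical EP with no interop support; it inherits the fallback.
class PlainEpSketch : public EpSketch {};

// Dispatches to the single EP that was selected earlier, rather than
// looping over all active EPs.
inline SketchStatus WaitOnSelectedEp(EpSketch* selected_ep, void* stream,
                                     std::uint64_t fence_value) {
  assert(selected_ep != nullptr);
  return selected_ep->InteropWait(stream, fence_value);
}
```

With this shape, callers do not need to know which EP is interop-capable; an unsupported EP surfaces as an error status instead of a crash.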
- Fixed the Native Methods to include binding delegations
- Cleaned up GetExtSemaphore API to take better care of return values
- Made returns clearer while looping active EPs
- Fixed the pointer assignment and dereferencing in fallback path APIs
- Fixed the namespace in GetOrtFenceForGraphicsInterop API
- In InteropEpWait and InteropEPSignal, added early returns when stream is nullptr
- Added a compile parameter use_dx_for_interop. When specified, it defines the macro DX_FOR_INTEROP, which enables the d3d12.h inclusion and all other DX-specific logic at compile time
- Couple of small fixes
… and EpWait APIs

- Modified GetOrtFenceForGraphicsInterop API to use pointer to GraphicsInteropParams
- Added documentation for exposed APIs
- Updated error handling in EpSignal and EpWait APIs
…eropParams and addressing current review comments

- Adding Vulkan compilation support
- Adding more members to GraphicsInteropParams
- Addressing current review comments
Adding CIG support for DX - expected to work with TRT 1.3. Used the corresponding local fixes in TRT to verify that the sample app using CIG + NV TRT RTX EP works.
- Added a struct to store extSemFence and the corresponding selectedEp at GetExtSemaphore API
- Not looping for EPs in InteropWait and InteropSignal calls, using the selected EP instead.
… made an opaque struct

- Bifurcated fence params from graphicsInteropParams
- Made SemaphoreEpMap an opaque struct: it is defined in ORT, but the app does not use it directly and holds a void* instead
- Some additional code fixes called out by CodeRabbit
- Added an opaque OrtFence struct. Replaced void*/void** with OrtFence*/OrtFence** for fence related params.
- Replaced passing stream handle to EP with passing stream directly.
- Changed the name of SetupCigContextForEpDevice to SetupGraphicsInteropContextForEpDevice
- Made the datatypes of DX/Vulkan as void* and removed the corresponding DX/Vulkan headers in onnxruntime_c_api.h
- Moved the DX/Vulkan header inclusion into execution_provider.h
- Reinterpret casting into appropriate data types from void* where needed
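
The opaque-handle approach in the items above can be sketched roughly as follows. This is an illustration only: `FenceSketch`, `FakeNativeFence`, and the three free functions are hypothetical and are not the actual `OrtFence` API. The public part sees just a forward declaration and `void*` native handles, and the implementation casts back to the native type where needed (`static_cast` is sufficient for `void*` in this sketch).

```cpp
#include <cassert>
#include <cstdint>

// ---- What a public header might expose (no graphics headers needed) ----
struct FenceSketch;  // opaque to the application
FenceSketch* CreateFenceSketch(void* native_handle, std::uint64_t initial_value);
void* GetNativeHandle(const FenceSketch* fence);
void ReleaseFenceSketch(FenceSketch* fence);

// ---- What the implementation side might contain ----
struct FenceSketch {
  void* native_handle;  // e.g. a graphics API fence or semaphore, type-erased
  std::uint64_t value;  // last known fence value
};

FenceSketch* CreateFenceSketch(void* native_handle, std::uint64_t initial_value) {
  return new FenceSketch{native_handle, initial_value};
}

void* GetNativeHandle(const FenceSketch* fence) {
  assert(fence != nullptr);
  return fence->native_handle;
}

void ReleaseFenceSketch(FenceSketch* fence) { delete fence; }

// Hypothetical native type standing in for a graphics API object.
struct FakeNativeFence {
  int id;
};

// Only code that includes the native headers casts the void* back.
inline int NativeIdOf(const FenceSketch* fence) {
  return static_cast<FakeNativeFence*>(GetNativeHandle(fence))->id;
}
```

Compared with passing `void*`/`void**` for the fence itself, a named opaque type lets the compiler reject mixing a fence up with an unrelated pointer, while the header stays free of DX/Vulkan includes.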
Add a to-do regarding further changes needed in fallback code
Using pVkSempahore as the semaphore handle directly rather than as a pointer to VkSemaphore, hence renaming the variable accordingly
…RT RTX EP device.

Validated that it works as expected when the interop APIs are set up, and that it outputs garbage when the interop API calls are commented out.
@@ -0,0 +1,438 @@
// SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning test

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.