[Build] update cuda 13 package: fatbin compress mode and cuda archs #26516
Merged
Conversation
Contributor
Yes please. Nightly feeds are no longer publicly available, nobody has raised this issue, and all CUDA 13 builds hardcode the CUDA 12 cuBLAS and other libraries.
Contributor
This PR is also required to fix the CUDA 13 builds: #26518
tianleiwu pushed a commit that referenced this pull request on Nov 9, 2025
## Description

Fixes runtime library loading failures when building with CUDA 13 by replacing hardcoded CUDA 12 references with dynamic version detection. Related to #26516, which updates the CUDA 13 build pipelines, but this PR fixes the Python runtime code that was still hardcoded to CUDA 12.

## Problem

The build system correctly detects CUDA 13 via CMake, but the runtime Python code had CUDA 12 hardcoded in multiple locations, causing "CUDA 12 not found" errors on CUDA 13 systems.

## Solution

Modified onnxruntime/__init__.py and setup.py to dynamically use the detected CUDA version instead of hardcoded "12" strings.

## Changes

- Dynamic CUDA version extraction from build info
- Library paths now use f-strings with cuda_major_version
- Added CUDA 13 support to extras_require and dependency exclusions
- Fixed the TensorRT RTX package to use the correct CUDA version
- Updated version validation to accept CUDA 12+
- Fixed PyTorch compatibility checks to compare versions dynamically

## Impact

- CUDA 13 builds now load the correct libraries
- Backward compatible with CUDA 12
- Forward compatible with future CUDA versions

## Testing

Verified with a CUDA 13.0 build that library paths resolve correctly and preload_dlls() loads CUDA 13 libraries without errors.
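The description above can be illustrated with a minimal sketch of the pattern it describes: deriving the CUDA major version at runtime and building library names with f-strings instead of hardcoded "12" literals. The function and library names here are illustrative assumptions, not the actual onnxruntime internals.

```python
# Sketch: replace hardcoded "libcublas.so.12"-style strings with names
# derived from the detected CUDA version. `cuda_major_version` and
# `cuda_library_names` are hypothetical helpers for illustration.

def cuda_major_version(cuda_version: str) -> str:
    """Extract the major version (e.g. "13") from a string like "13.0"."""
    return cuda_version.split(".")[0]

def cuda_library_names(cuda_version: str) -> list[str]:
    """Build version-suffixed library names for the detected CUDA version."""
    major = cuda_major_version(cuda_version)
    # f-strings keep this forward compatible with future CUDA versions
    return [
        f"libcublas.so.{major}",
        f"libcublasLt.so.{major}",
        f"libcudart.so.{major}",
    ]

print(cuda_library_names("13.0"))
# ['libcublas.so.13', 'libcublasLt.so.13', 'libcudart.so.13']
```

Because the major version is computed rather than hardcoded, the same code path serves CUDA 12 builds unchanged, which is the backward-compatibility property claimed above.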
snnn approved these changes on Nov 10, 2025
Changes
Update the CUDA 13 Python packaging pipeline:
(1) Use fatbin compress mode = size, which significantly reduces package size.
(2) Update CMAKE_CUDA_ARCHITECTURES for CUDA 13. Since the package is now smaller, we can include more architectures.
(3) Fix the CUDA 13 packaging pipeline.
Note that the compress mode and CUDA arch settings are unchanged for CUDA 12.8, so the CUDA 12 wheel is larger than the CUDA 13 wheel. We can update the settings for CUDA 12.8 in a separate PR if needed.
The NuGet pipeline for CUDA 13 needs extra code changes; this PR only fixes the Python packaging pipeline.
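The two build knobs described above can be sketched as CMake defines assembled in Python (ONNX Runtime's builds are driven by a Python script). The helper name and the architecture list below are illustrative assumptions; the exact values used by the pipeline may differ.

```python
# Sketch: assemble the CMake defines for a CUDA 13 wheel build.
# `cuda13_cmake_defines` and the arch list are hypothetical examples.

def cuda13_cmake_defines(archs: list[str]) -> list[str]:
    """Build -D defines enabling fatbin size compression and extra archs."""
    return [
        # CMAKE_CUDA_ARCHITECTURES takes a semicolon-separated list
        f"-DCMAKE_CUDA_ARCHITECTURES={';'.join(archs)}",
        # nvcc's --compress-mode=size shrinks the embedded fatbin,
        # trading some decompression time at load for a smaller wheel
        "-DCMAKE_CUDA_FLAGS=--compress-mode=size",
    ]

print(cuda13_cmake_defines(["75", "80", "90", "120"]))
```

The size savings from compress mode are what make room for the additional architectures mentioned in point (2).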
Python GPU Wheel Size (CUDA Architectures + PTX)