
Conversation

@tianleiwu (Contributor) commented Nov 6, 2025

Changes

Update the CUDA 13 Python packaging pipeline:
(1) Use fatbin compress mode = size, which significantly reduces the package size.
(2) Update CMAKE_CUDA_ARCHITECTURES for CUDA 13. Since the compressed binaries are smaller, we can afford to include more architectures (see the sketch after this list).
(3) Fix the CUDA 13 packaging pipeline:

  • Use the correct manylinux Docker image (CUDA 13 instead of CUDA 12). The new Linux image ships CUDA 13.0.2 and cuDNN 9.14.
  • Pass the CUDA version properly when running build_linux_python_package.sh in Docker. (CUDA_VERSION in the Docker image was 12.8.1; we now pass "12.8" from the YAML for consistency.)
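As a rough sketch of items (1) and (2), assuming a recent CMake and an nvcc from CUDA 12.8 or newer (which introduced the --compress-mode flag); the actual pipeline wires these settings through the build scripts and may differ in detail:

```cmake
# Sketch only: size-optimized fatbin compression plus the wider CUDA 13
# architecture list shown in the wheel-size table below.
cmake_minimum_required(VERSION 3.28)

# Real SASS for each listed architecture, plus PTX for 120 as forward compatibility
# (the "75;80;86;89;90a;100a;120a + 120" combination from the table).
set(CMAKE_CUDA_ARCHITECTURES "75-real;80-real;86-real;89-real;90a-real;100a-real;120a-real;120-virtual")

project(cuda13_wheel_sketch LANGUAGES CXX CUDA)

# Ask nvcc to optimize the embedded fatbin for size rather than decompression speed.
string(APPEND CMAKE_CUDA_FLAGS " --compress-mode=size")
```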

Note that the compress mode and CUDA architecture settings are unchanged for CUDA 12.8, so the CUDA 12 wheel is larger than the CUDA 13 wheel. We can update the CUDA 12.8 settings in a separate PR if needed.

The NuGet pipeline for CUDA 13 needs additional code changes; this PR only fixes the Python packaging pipeline.

Python GPU Wheel Size (CUDA Architectures + PTX)

| CUDA | Windows | Linux |
| ---- | ------- | ----- |
| 12.8 | 221 MB (52;61;75;86;89 + 90 PTX) | 271 MB (60;70;75;80;86;90a + 90 PTX) |
| 13.0 | 186 MB (75;80;86;89;90a;100a;120a + 120 PTX) | 191 MB (75;80;86;89;90a;100a;120a + 120 PTX) |

@tianleiwu marked this pull request as draft November 6, 2025 22:40
@GrigoryEvko (Contributor)

Yes please. Nightly feeds are no longer publicly available, nobody has raised this issue, and all CUDA 13 builds currently hard-code the CUDA 12 cuBLAS and other libraries.

@GrigoryEvko (Contributor)

Also, this PR is required to fix CUDA 13 builds: #26518

@tianleiwu marked this pull request as ready for review November 7, 2025 20:13
@tianleiwu requested a review from snnn November 9, 2025 05:53
tianleiwu pushed a commit that referenced this pull request Nov 9, 2025
## Description
Fixes runtime library loading failures when building with CUDA 13 by
replacing hardcoded CUDA 12 references with dynamic version detection.

Related to #26516, which updates the CUDA 13 build pipelines; this PR fixes
the Python runtime code that was still hardcoded to CUDA 12.

## Problem
The build system correctly detects CUDA 13 via CMake, but the runtime
Python code had CUDA 12 hardcoded in multiple locations, causing "CUDA
12 not found" errors on CUDA 13 systems.

## Solution
Modified onnxruntime/__init__.py and setup.py to dynamically use the
detected CUDA version instead of hardcoded "12" strings.
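A minimal illustrative sketch of that idea, with hypothetical helper names (the real change lives in onnxruntime/__init__.py and setup.py and may be structured differently):

```python
# Hypothetical sketch: derive the CUDA major version at runtime and build
# library names from it instead of hardcoding "12".
import re


def cuda_major_from_build_info(build_info: str) -> str:
    """Extract the CUDA major version (e.g. "13") from a build-info string."""
    match = re.search(r"CUDA\s+(\d+)\.\d+", build_info)
    return match.group(1) if match else "12"  # fall back to CUDA 12 if not found


def cuda_library_names(cuda_major: str) -> list[str]:
    """Shared-library names parameterized by the detected major version."""
    return [
        f"libcudart.so.{cuda_major}",
        f"libcublas.so.{cuda_major}",
        f"libcublasLt.so.{cuda_major}",
    ]


if __name__ == "__main__":
    major = cuda_major_from_build_info("... CUDA 13.0, cuDNN 9.14 ...")
    print(cuda_library_names(major))  # ['libcudart.so.13', 'libcublas.so.13', 'libcublasLt.so.13']
```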

## Changes
- Dynamic CUDA version extraction from build info
- Library paths now use f-strings with cuda_major_version
- Added CUDA 13 support to extras_require and dependency exclusions (see the sketch after this list)
- Fixed TensorRT RTX package to use correct CUDA version
- Updated version validation to accept CUDA 12+
- Fixed PyTorch compatibility checks to compare versions dynamically
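For the extras_require point above, a rough sketch with placeholder package names (the actual dependency lists in setup.py may differ):

```python
# Hypothetical extras_require construction keyed on the detected CUDA major version;
# the package names below are illustrative placeholders, not necessarily the ones used.
cuda_major = "13"  # in the real setup.py this would come from the build configuration

extras_require = {
    "cuda": [
        f"nvidia-cuda-runtime-cu{cuda_major}",
        f"nvidia-cublas-cu{cuda_major}",
        f"nvidia-cudnn-cu{cuda_major}",
    ],
}

print(extras_require)
```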

## Impact
- CUDA 13 builds now load correct libraries
- Backward compatible with CUDA 12
- Forward compatible with future CUDA versions

## Testing
Verified with CUDA 13.0 build that library paths resolve correctly and
preload_dlls() loads CUDA 13 libraries without errors.
@tianleiwu enabled auto-merge (squash) November 10, 2025 03:30
@tianleiwu merged commit 8fceca0 into main Nov 10, 2025
123 of 127 checks passed
@tianleiwu deleted the tlwu/fix_cuda13_pipeline branch November 10, 2025 19:47