Make sure TRT EPs can load models when initializers are in memory #26721
Pull request overview
This PR refactors the initialization of graph initializers by moving the conversion of TensorProto initializers to OrtValues from the Graph constructor to an explicit call during graph transformation (before partitioning). This change ensures that execution providers can work with models that have initializers in memory, addressing issue #26653.
Key Changes:
- Added `Graph::ConvertInitializersIntoOrtValues()` method to explicitly convert large initializers to OrtValues with in-memory external data references
- Enhanced provider interfaces with move assignment operators and iterator support for TensorProtos
- Refactored TensorRT and NV TensorRT providers to handle initializers more uniformly using the new interfaces
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| include/onnxruntime/core/graph/graph.h | Added declaration for ConvertInitializersIntoOrtValues() method |
| onnxruntime/core/graph/graph.cc | Implemented new conversion method and removed old lambda-based conversion from constructor |
| onnxruntime/core/session/inference_session.cc | Added call to convert initializers before partitioning and improved exception handling |
| onnxruntime/core/session/provider_bridge_ort.cc | Implemented iterator interfaces and move assignment operators for provider bridge |
| onnxruntime/core/providers/shared_library/provider_interfaces.h | Added TensorProto iterator interfaces and updated method signatures for const-correctness |
| onnxruntime/core/providers/shared_library/provider_wrappedtypes.h | Updated wrapper types with move semantics and iterator support |
| onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc | Refactored initializer handling to use new iterator-based approach and simplified logic |
| onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc | Attempted refactoring of initializer handling with critical bugs in variable references |
| onnxruntime/test/ir/graph_test.cc | Updated test to validate explicit conversion behavior |
| onnxruntime/test/ir/utils_test.cc | Minor refactoring to use ASSERT_STATUS_OK macro |
This PR moves the conversion of initializers into in-memory OrtValues out of the Graph constructor and into an early graph transformation step that runs before partitioning. This avoids performing the conversion whenever a subgraph is constructed.
It also addresses bugs in TRT and NV TRT providers.
Addresses issue: #26653
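The ordering described here can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTensorProto`, `ToyOrtValue`, `ToyGraph`, and the byte `threshold` parameter are hypothetical stand-ins and do not reflect the real ONNX Runtime types or the actual signature of `Graph::ConvertInitializersIntoOrtValues()`. It shows a constructor that only stores initializers, with a separate, explicit conversion step that a session could call once before partitioning.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the real ONNX Runtime types.
struct ToyTensorProto {
  std::string name;
  std::vector<char> raw_data;  // inline bytes; emptied once converted
  bool external_in_memory;     // true when the bytes now live in a ToyOrtValue
};

struct ToyOrtValue {
  std::vector<char> buffer;
};

class ToyGraph {
 public:
  // The constructor only stores the initializers. No conversion happens
  // here, so constructing a (sub)graph never triggers it.
  explicit ToyGraph(std::vector<ToyTensorProto> initializers)
      : initializers_(std::move(initializers)) {}

  // Explicit conversion step. Initializers of at least `threshold` bytes
  // (an assumed cutoff for "large") are moved into ToyOrtValue storage and
  // the proto is flagged as an in-memory external reference.
  // Returns how many initializers were converted by this call.
  std::size_t ConvertInitializersIntoOrtValues(std::size_t threshold) {
    std::size_t converted = 0;
    for (auto& tensor : initializers_) {
      if (tensor.external_in_memory || tensor.raw_data.size() < threshold) {
        continue;  // already converted, or small enough to stay inline
      }
      ort_values_[tensor.name] = ToyOrtValue{std::move(tensor.raw_data)};
      tensor.raw_data.clear();
      tensor.external_in_memory = true;
      ++converted;
    }
    return converted;
  }

  bool HasOrtValue(const std::string& name) const {
    return ort_values_.count(name) != 0;
  }

 private:
  std::vector<ToyTensorProto> initializers_;
  std::map<std::string, ToyOrtValue> ort_values_;
};
```

In this toy, calling the conversion a second time is a no-op because already converted initializers are skipped, which is one simple way to keep an explicit step safe to call from a single, well-defined point in session initialization.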
Graph Initializer Conversion and Handling:
- Added `Graph::ConvertInitializersIntoOrtValues()` to convert all graph TensorProto initializers into OrtValues and create in-memory external data references, separating this logic from graph construction and making it reusable. (include/onnxruntime/core/graph/graph.h, onnxruntime/core/graph/graph.cc) [1] [2]
- (onnxruntime/core/graph/graph.cc) [1] [2] [3]

Provider Interface Enhancements:
- Added move assignment operators for `GraphProto` and `TensorProto` in both the provider interface (`ProviderHost`) and wrapper structs, allowing for more efficient object transfers and assignment. (onnxruntime/core/providers/shared_library/provider_interfaces.h, onnxruntime/core/providers/shared_library/provider_wrappedtypes.h) [1] [2] [3] [4]
- Added iterator interfaces (`TensorProto_ConstIterator`, `TensorProto_Iterator`) and corresponding methods to `TensorProtos` for clean iteration over initializer lists, improving code readability and maintainability. (onnxruntime/core/providers/shared_library/provider_interfaces.h, onnxruntime/core/providers/shared_library/provider_wrappedtypes.h) [1] [2] [3]

Execution Provider Logic Simplification:
- (onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc) [1] [2] [3]

Other Minor Improvements:
- `TensorProtos`. (onnxruntime/core/providers/shared_library/provider_interfaces.h, onnxruntime/core/providers/shared_library/provider_wrappedtypes.h) [1] [2]
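To make the iterator idea concrete, here is a minimal sketch of how an abstract iterator interface can be wrapped so that a range-based `for` works over a list of initializers. All names (`ToyTensor`, `ToyConstIteratorImpl`, `VectorIteratorImpl`, `ToyConstIterator`, `ToyTensors`, `CollectNames`) are hypothetical and chosen for illustration; they are not the signatures of `TensorProto_ConstIterator` or `TensorProtos` in the provider bridge.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyTensor {
  std::string name;
};

// Abstract iterator that a host side could implement (hypothetical interface).
struct ToyConstIteratorImpl {
  virtual ~ToyConstIteratorImpl() = default;
  virtual bool NotEqual(const ToyConstIteratorImpl& other) const = 0;
  virtual void Increment() = 0;
  virtual const ToyTensor& Deref() const = 0;
};

// One possible host-side implementation, backed by a std::vector.
class VectorIteratorImpl final : public ToyConstIteratorImpl {
 public:
  VectorIteratorImpl(const std::vector<ToyTensor>& items, std::size_t pos)
      : items_(&items), pos_(pos) {}

  bool NotEqual(const ToyConstIteratorImpl& other) const override {
    // Toy assumption: both iterators come from this same implementation.
    return pos_ != static_cast<const VectorIteratorImpl&>(other).pos_;
  }
  void Increment() override { ++pos_; }
  const ToyTensor& Deref() const override { return (*items_)[pos_]; }

 private:
  const std::vector<ToyTensor>* items_;
  std::size_t pos_;
};

// Thin wrapper exposing the operators a range-based for loop needs.
class ToyConstIterator {
 public:
  explicit ToyConstIterator(std::unique_ptr<ToyConstIteratorImpl> impl)
      : impl_(std::move(impl)) {}

  bool operator!=(const ToyConstIterator& other) const {
    return impl_->NotEqual(*other.impl_);
  }
  ToyConstIterator& operator++() {
    impl_->Increment();
    return *this;
  }
  const ToyTensor& operator*() const { return impl_->Deref(); }

 private:
  std::unique_ptr<ToyConstIteratorImpl> impl_;
};

class ToyTensors {
 public:
  void Add(std::string name) { items_.push_back(ToyTensor{std::move(name)}); }

  ToyConstIterator begin() const {
    return ToyConstIterator(
        std::unique_ptr<ToyConstIteratorImpl>(new VectorIteratorImpl(items_, 0)));
  }
  ToyConstIterator end() const {
    return ToyConstIterator(std::unique_ptr<ToyConstIteratorImpl>(
        new VectorIteratorImpl(items_, items_.size())));
  }

 private:
  std::vector<ToyTensor> items_;
};

// Caller code can iterate without knowing the concrete container type.
inline std::vector<std::string> CollectNames(const ToyTensors& tensors) {
  std::vector<std::string> names;
  for (const ToyTensor& tensor : tensors) {
    names.push_back(tensor.name);
  }
  return names;
}
```

The design choice illustrated is that the caller only sees the wrapper and its operators, while the concrete container stays behind a virtual interface, which is the usual reason to route iteration through an abstraction layer of this kind.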