
Contributor

@fs-eire fs-eire commented Nov 20, 2025

Description

This PR makes ORT prefer the initializer allocator when calling OpKernel::PrePack.

If an EP does not register an initializer allocator, the behavior is unchanged. Currently, WebGPU is the only EP that registers one.

Motivation and Context

Helps improve memory usage during prepacking.

@fs-eire fs-eire requested a review from Copilot November 20, 2025 01:51
Copilot finished reviewing on behalf of fs-eire November 20, 2025 01:54
Contributor

Copilot AI left a comment


Pull Request Overview

This PR enables the use of read-only initializer allocators for prepack operations, improving memory usage when an execution provider registers a dedicated initializer allocator (e.g., WebGPU EP). The change maintains backward compatibility by falling back to the standard allocator when no initializer-specific allocator is available.

  • Replaces GetAllocator with GetInitializerAllocator in the non-shared prepack path
  • Maintains backward compatibility for EPs without initializer allocators
  • Leverages existing GetInitializerAllocator infrastructure
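The fallback described in the overview can be sketched as follows. This is a toy model only: `Device`, `Allocator`, and `ToySessionState` are hypothetical stand-ins, not the ONNX Runtime types, and the real `GetInitializerAllocator` may differ in signature and details. The sketch assumes two per-device registries and prefers the initializer-specific one when it has an entry.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical device key (not the real OrtDevice).
struct Device {
  int type;  // e.g. 0 = CPU, 1 = GPU (assumed encoding for this sketch)
  int id;
  bool operator<(const Device& o) const {
    return type != o.type ? type < o.type : id < o.id;
  }
};

// Hypothetical allocator; only carries a name so the result can be inspected.
struct Allocator {
  std::string name;
};
using AllocatorPtr = std::shared_ptr<Allocator>;

class ToySessionState {
 public:
  void RegisterAllocator(const Device& d, AllocatorPtr a) {
    allocators_[d] = std::move(a);
  }
  void RegisterInitializerAllocator(const Device& d, AllocatorPtr a) {
    initializer_allocators_[d] = std::move(a);
  }

  // Returns the standard allocator registered for the device, or nullptr.
  AllocatorPtr GetAllocator(const Device& d) const {
    auto it = allocators_.find(d);
    return it == allocators_.end() ? nullptr : it->second;
  }

  // Prefers an initializer-specific allocator for the device; if none was
  // registered, falls back to the standard allocator (unchanged behavior).
  AllocatorPtr GetInitializerAllocator(const Device& d) const {
    auto it = initializer_allocators_.find(d);
    if (it != initializer_allocators_.end()) return it->second;
    return GetAllocator(d);
  }

 private:
  std::map<Device, AllocatorPtr> allocators_;
  std::map<Device, AllocatorPtr> initializer_allocators_;
};
```

In this model, a device with no initializer allocator gets exactly what `GetAllocator` would have returned, which is the backward-compatibility property the overview describes.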


  // we store the newly minted pre-packed data.
- AllocatorPtr session_cpu_alloc = GetAllocator(kernel->Info().GetDevice(OrtMemType::OrtMemTypeDefault));
+ AllocatorPtr session_initializer_alloc = GetInitializerAllocator(kernel->Info().GetDevice(OrtMemType::OrtMemTypeDefault));
Member

@yuslepukhin yuslepukhin Nov 20, 2025


AllocatorPtr session_initializer_allo

Q. Is it guaranteed to be accessible from CPU code? We need to compute the container key. Sorry, I do not remember the properties of the allocator.

Contributor Author


The original variable name is confusing. It actually returns the registered allocator by looking up the map using the OrtDevice as the key. For a node on the WebGPU EP, this returns a WebGPU allocator instead of a CPU one.
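To illustrate the point about the lookup, here is a minimal sketch of an allocator map keyed by device. Everything in it is hypothetical (`DeviceKey`, `ToyAllocator`, `LookupAllocator`, `AllocatesCpuMemory`) and it is not how ORT defines these types. It only shows that a device-keyed lookup returns whatever was registered for that device, so a variable named `..._cpu_alloc` can end up holding a non-CPU allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical device key (not the real OrtDevice).
enum class DeviceType { kCpu, kGpu };

struct DeviceKey {
  DeviceType type;
  int id;
  bool operator==(const DeviceKey& o) const {
    return type == o.type && id == o.id;
  }
};

// Simple hash that combines the device type and id.
struct DeviceKeyHash {
  std::size_t operator()(const DeviceKey& k) const {
    return std::hash<int>()(static_cast<int>(k.type)) * 31u +
           std::hash<int>()(k.id);
  }
};

// Hypothetical allocator that records which kind of memory it hands out.
struct ToyAllocator {
  std::string name;
  DeviceType device_type;
};

using AllocatorMap =
    std::unordered_map<DeviceKey, std::shared_ptr<ToyAllocator>, DeviceKeyHash>;

// Returns the allocator registered for the device, or nullptr if none.
std::shared_ptr<ToyAllocator> LookupAllocator(const AllocatorMap& m,
                                              const DeviceKey& d) {
  auto it = m.find(d);
  return it == m.end() ? nullptr : it->second;
}

// In this toy model, true only if an allocator exists and targets CPU memory.
bool AllocatesCpuMemory(const std::shared_ptr<ToyAllocator>& a) {
  return a && a->device_type == DeviceType::kCpu;
}
```

Under these assumptions, looking up a GPU device key yields the GPU allocator, and `AllocatesCpuMemory` returns false for it, regardless of what the receiving variable is called.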
