Conversation

@xenova (Contributor) commented Nov 26, 2025

Description

Compute Pow with an exponent of 2.0 more accurately on the WebGPU EP.
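
For context (an assumption about the cause, not a description of the exact WGSL this EP emits): GPU backends commonly lower `pow(x, y)` to an `exp2(y * log2(|x|))` identity, which loses precision in float32 even when the exact answer is representable, whereas squaring directly is exact whenever the product fits in the 24-bit mantissa. A minimal float32 sketch of the two paths for the reproduction value below:

```python
import numpy as np

x = np.float32(-2000.0)

# Direct squaring: 2000 * 2000 = 4,000,000 < 2**24, so the float32
# product is exactly representable and the multiply is exact.
direct = x * x

# pow via the exp2/log2 identity (a common shader lowering for pow):
# rounding log2(2000) to float32 perturbs the exponent, and exp2
# amplifies that perturbation into an absolute error of about 1 at 4e6.
via_log = np.exp2(np.float32(2.0) * np.log2(np.abs(x)))

print("direct :", direct)
print("via log:", via_log)
```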

Reproduction script:

from onnx import helper, TensorProto
import onnxruntime as ort
import numpy as np

# 1. Create the ONNX model
# Define input and output
input_info = helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 1])
output_info = helper.make_tensor_value_info('Y', TensorProto.FLOAT, [1, 1])

# Create a constant tensor for the exponent (2.0)
exponent_tensor = helper.make_tensor('exponent', TensorProto.FLOAT, [], [2.0])
exponent_node = helper.make_node('Constant', [], ['exponent_out'], value=exponent_tensor)

# Create the Pow node
# Pow takes two inputs: Base (X) and Power (exponent_out)
pow_node = helper.make_node(
    'Pow',
    inputs=['X', 'exponent_out'],
    outputs=['Y'],
    name='PowNode'
)

# Create the graph
graph_def = helper.make_graph(
    [exponent_node, pow_node],
    'test-model',
    [input_info],
    [output_info]
)

# Create the model
model_def = helper.make_model(graph_def, producer_name='onnx-example')
opset = model_def.opset_import[0]
opset.version = 13 # Ensure opset version supports the operations

# 2. Convert model to string (bytes)
model_str = model_def.SerializeToString()

# 3. Prepare input data (a large value that exposes the precision gap)
input_data = np.array([[-2e3]], dtype=np.float32)

# 4. Run on CPUExecutionProvider
sess_cpu = ort.InferenceSession(model_str, providers=['CPUExecutionProvider'])
res_cpu = sess_cpu.run(['Y'], {'X': input_data})[0]
print("CPU Result:", res_cpu)

# 5. Run on WebGpuExecutionProvider
sess_webgpu = ort.InferenceSession(model_str, providers=['WebGpuExecutionProvider'])
res_webgpu = sess_webgpu.run(['Y'], {'X': input_data})[0]
print("WebGPU Result:", res_webgpu)

# Compare results
diff = np.abs(res_cpu - res_webgpu)
max_diff = diff.max().item()
assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
print("Results match!")

currently produces

CPU Result: [[4.e+06]]
WebGPU Result: [[3.999999e+06]]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[1], line 56
     54 diff = np.abs(res_cpu - res_webgpu)
     55 max_diff = diff.max().item()
---> 56 assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
     57 print("Results match!")

AssertionError: Results do not match within tolerance! Max diff: 1.0

but with this PR:

CPU Result: [[4.e+06]]
WebGPU Result: [[4.e+06]]
Results match!

Motivation and Context

The inaccuracy propagates into downstream errors for certain models, especially those that compute pow(x, 2) on large values.
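
The "larger values" point can be illustrated with a quick sketch. Assuming the same hypothetical `exp2(2 * log2(|x|))` lowering as above (not necessarily the EP's exact implementation), the rounding error in the float32 exponent translates into a roughly constant *relative* error, so the *absolute* error grows with the magnitude of the result:

```python
import numpy as np

def square_via_log(x):
    # Square |x| through the exp2(2 * log2(|x|)) identity, entirely in
    # float32 -- a stand-in for how a shader might lower pow(x, 2.0).
    x = np.float32(x)
    return np.exp2(np.float32(2.0) * np.log2(np.abs(x)))

for x in (3.0, 30000.0):
    # Both exact squares (9 and 9e8) are exactly representable in float32,
    # so the direct multiply incurs no rounding error at all.
    exact = np.float32(x) * np.float32(x)
    approx = square_via_log(x)
    print(x, float(exact), float(approx), abs(float(approx) - float(exact)))
```

For small inputs the log-based path may round to the exact answer, but at larger magnitudes the absolute error becomes visible well above any reasonable tolerance, which is consistent with the max diff of 1.0 observed at 4e6 in the repro.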

cc @guschmue

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Nov 26, 2025
@guschmue guschmue enabled auto-merge (squash) November 26, 2025 21:50
@guschmue commented:

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@guschmue guschmue merged commit 4c43c66 into microsoft:main Nov 27, 2025
98 of 101 checks passed