@xenova xenova commented Nov 27, 2025

Description

According to IEEE 754, the most negative finite float32 value is -3.4028234663852886e+38, but it was being registered inline as the truncated -3.402823e+38f.

>>> import numpy as np
>>> np.finfo(np.float32).min
np.float32(-3.4028235e+38)
>>> np.finfo(np.float32).min.item()
-3.4028234663852886e+38

As a result, values more negative than the truncated constant were handled incorrectly. While this may seem like a small, irrelevant detail, it is essential for attention masking, where we use exactly this value as the mask fill, leading to large numerical errors down the line.
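A minimal check (assuming NumPy's float32 rounding matches the runtime's) shows that the truncated literal does not round to the same float32 value as the true minimum:

```python
import numpy as np

# The truncated literal rounds to a different (less negative) float32
# than the true float32 minimum, so comparisons against it miss the min.
truncated = np.float32(-3.402823e+38)
exact = np.finfo(np.float32).min  # -3.4028234663852886e+38

assert exact == np.float32(-3.4028234663852886e+38)
assert truncated != exact
assert truncated > exact  # the truncated constant is less negative
```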

Reproduction:

from onnx import helper, TensorProto
import onnxruntime as ort
import numpy as np

# 1. Create the ONNX model
# Define input and output
input_shape = [1, 2]
input_info = helper.make_tensor_value_info('X', TensorProto.FLOAT, input_shape)
output_info = helper.make_tensor_value_info('Y', TensorProto.FLOAT, input_shape)

# Create the Softmax node
# Softmax takes one input: X
softmax_node = helper.make_node(
    'Softmax',
    inputs=['X'],
    outputs=['Y'],
    name='SoftmaxNode',
    axis=-1 # Default axis is -1, usually applied to the last dimension
)

# Create the graph
graph_def = helper.make_graph(
    [softmax_node],
    'test-model',
    [input_info],
    [output_info]
)

# Create the model
model_def = helper.make_model(graph_def, producer_name='onnx-example')
opset = model_def.opset_import[0]
opset.version = 13 # Ensure opset version supports the operations

# 2. Convert model to string (bytes)
model_str = model_def.SerializeToString()

# 3. Prepare input data
np.random.seed(0)
input_data = np.array(
[[-3.40282346638528e+38, -3.40282346638528e+38]]
# [[-3.4028234663852886e+38, -3.4028234663852886e+38]]
).astype(np.float32)
print(input_data.tolist())

# 4. Run on CPUExecutionProvider
sess_cpu = ort.InferenceSession(model_str, providers=['CPUExecutionProvider'])
res_cpu = sess_cpu.run(['Y'], {'X': input_data})[0]
print("CPU Result:", res_cpu)

# 5. Run on WebGpuExecutionProvider
sess_webgpu = ort.InferenceSession(model_str, providers=['WebGpuExecutionProvider'])
res_webgpu = sess_webgpu.run(['Y'], {'X': input_data})[0]
print("WebGPU Result:", res_webgpu)

# Compare results
diff = np.abs(res_cpu - res_webgpu)
max_diff = diff.max().item()
print(diff)
print(f"Max diff: {max_diff}")
assert max_diff < 1e-5, f"Results do not match within tolerance! Max diff: {max_diff}"
print("Results match!")

Before:

[[-3.4028234663852886e+38, -3.4028234663852886e+38]]
CPU Result: [[0.5 0.5]]
WebGPU Result: [[0. 0.]]
[[0.5 0.5]]
Max diff: 0.5
AssertionError: Results do not match within tolerance! Max diff: 0.5

After:

[[-3.4028234663852886e+38, -3.4028234663852886e+38]]
CPU Result: [[0.5 0.5]]
WebGPU Result: [[0.5 0.5]]
[[0. 0.]]
Max diff: 0.0
Results match!
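For reference, the CPU result follows from the standard max-subtraction form of softmax: when every entry equals float32 min, the shifted inputs are all zero, so the output is uniform. A quick NumPy sketch (an illustration of the math, not the runtime's implementation):

```python
import numpy as np

def softmax(x):
    # Subtract the row max first so exp() stays finite even at float32 min.
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

x = np.full((1, 2), np.finfo(np.float32).min, dtype=np.float32)
print(softmax(x))  # [[0.5 0.5]]
```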

cc @guschmue

Fixes incorrect definition of most negative number
@xenova xenova changed the title [webgpu] Fix softmax(-inf) in float32 [webgpu] Fix softmax(-max_negative_number) in float32 Nov 27, 2025
@xenova xenova changed the title [webgpu] Fix softmax(-max_negative_number) in float32 [webgpu] Fix softmax(max_negative_number) in float32 Nov 27, 2025