Platform
a2a3 (Ascend 910B/C hardware)
Runtime Variant
tensormap_and_ringbuffer
Description
The `paged_attention` device test (`tests/device_tests/a2a3/tensormap_and_ringbuffer/paged_attention`) fails precision verification intermittently. It passes on most runs but occasionally produces output that does not match the golden values.
Steps to Reproduce
1. Use the batch test script `batch_pa_test.sh` to run the paged_attention test repeatedly (100 iterations):
bash batch_pa_test.sh
The script runs the test in a loop and stops early on the first precision failure.
2. Alternatively, run the test manually in a loop:
for i in $(seq 1 100); do
  python examples/scripts/run_example.py \
    -k tests/device_tests/a2a3/tensormap_and_ringbuffer/paged_attention/kernels \
    -g tests/device_tests/a2a3/tensormap_and_ringbuffer/paged_attention/golden.py \
    -p a2a3
done
A failure typically appears within about 85 runs, but it can occur on any iteration.
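If `batch_pa_test.sh` is not at hand, the manual loop can be wrapped in a small Python driver that tallies results and stops at the first failure. This is only a sketch: `classify_output` and `run_batch` are hypothetical helper names, and it assumes a precision failure can be recognized by the `TEST FAILED` string shown in the log under Actual Behavior. A run that crashes without printing that string would be counted as passed, so treat the tally as a rough guide.

```python
import subprocess
import sys

# Command copied from the manual reproduction loop; run from the repository root.
CMD = [
    sys.executable, "examples/scripts/run_example.py",
    "-k", "tests/device_tests/a2a3/tensormap_and_ringbuffer/paged_attention/kernels",
    "-g", "tests/device_tests/a2a3/tensormap_and_ringbuffer/paged_attention/golden.py",
    "-p", "a2a3",
]


def classify_output(text):
    """Label one run from its log text.

    Assumption: a precision failure always logs the string 'TEST FAILED'.
    """
    return "PRECISION_FAILED" if "TEST FAILED" in text else "PASSED"


def run_batch(iterations=100, stop_on_failure=True, run_once=None):
    """Run the test repeatedly and return a tally per label."""
    if run_once is None:
        def run_once():
            # Capture both streams, since the log stream used is not known here.
            proc = subprocess.run(CMD, capture_output=True, text=True)
            return proc.stdout + proc.stderr

    counts = {"PASSED": 0, "PRECISION_FAILED": 0}
    for i in range(1, iterations + 1):
        label = classify_output(run_once())
        counts[label] += 1
        if label == "PRECISION_FAILED":
            print(f"run {i}: precision failure")
            if stop_on_failure:
                break
    return counts
```

Calling `run_batch()` runs up to 100 iterations and returns a dictionary of counts; passing `stop_on_failure=False` lets all iterations run, which gives a rough failure rate.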
Expected Behavior
All 100 runs should pass precision verification (100 PASSED, 0 PRECISION_FAILED).
Actual Behavior
The test fails intermittently with a precision mismatch:
[INFO] Comparing out: shape=torch.Size([256, 16, 128]), dtype=torch.float32
[ERROR] TEST FAILED: Output 'out' does not match golden.
Mismatched elements: 478/524288
rtol=0.001, atol=0.001
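To put the mismatch count in perspective, the sketch below shows how such a count could be computed and what fraction of the output it represents. It assumes the harness uses the common `allclose`-style criterion, where an element mismatches when |out - golden| > atol + rtol * |golden| (the convention used by NumPy and PyTorch); the issue does not confirm which formula the test applies. `count_mismatches` and `mismatch_fraction` are hypothetical helper names.

```python
def count_mismatches(out, golden, rtol=1e-3, atol=1e-3):
    """Count elements outside tolerance.

    Assumption: allclose-style check, |out - golden| <= atol + rtol * |golden|.
    """
    return sum(
        1 for o, g in zip(out, golden)
        if abs(o - g) > atol + rtol * abs(g)
    )


def mismatch_fraction(mismatched, total):
    """Share of elements that failed the tolerance check."""
    return mismatched / total


# Element count of the reported output shape [256, 16, 128].
total = 256 * 16 * 128
print(total, f"{mismatch_fraction(478, total):.4%}")  # 524288 0.0912%
```

Under this reading, roughly 0.09% of the output elements fall outside tolerance in the failing run.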
Git Commit ID
2757be6
CANN Version
No response
Driver Version
No response
Host Platform
Linux (x86_64)
Additional Context
No response