[webgpu] Fused CopyKVCache and SplitPackedQKVWithRotaryEmbedding as SplitPackedQKVWithRotaryEmbeddingAndCopyKV #8206

Triggered via pull request November 19, 2025 07:19
Status Success
Total duration 1h 58m 55s
Artifacts 1

windows_tensorrt.yml

on: pull_request
Windows GPU TensorRT CI Pipeline: 37m 47s
Windows GPU TensorRT CI Pipeline Test Job: 1h 7m
Annotations

6 warnings
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1234
epilog offset from end of function exceeds 4095
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1227
epilog offset from end of function exceeds 4095
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1220
epilog offset from end of function exceeds 4095
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1213
epilog offset from end of function exceeds 4095
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1206
epilog offset from end of function exceeds 4095
Windows GPU TensorRT CI Pipeline: onnxruntime/core/mlas/lib/amd64/QgemmU8X8KernelAvx2.asm#L1199
epilog offset from end of function exceeds 4095

Artifacts

Produced during runtime
Name: build-artifacts
Size: 1.91 GB
Digest: sha256:c58027f0c81e33120eb2155b40d87256efe31c33ba99a7d841a58932f512e198