
Conversation

@peishenyan (Contributor)

Description

Support the local_window_size attribute in the GroupQueryAttention operator. This attribute enables sliding window attention and therefore affects the attention mask pattern.

When local_window_size is not equal to -1, a new attention mask pattern is built as follows to apply the sliding window:

     condition_1 (old attn_mask) ---> CumSum (axis=3, exclusive=true, reversed=true)
          |                             |
          |                           Lesser <--- local_window_size
          |                             |
      LogicalAnd <----------------- condition_2
          |
    new attn_mask
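
To make the diagram concrete, below is a minimal NumPy sketch of the same mask arithmetic. This is not the WebNN EP's actual C++ graph-building code; the (batch, heads, seq_q, seq_kv) mask layout, the True-means-attend convention, and the exact window-counting convention are illustrative assumptions.

    import numpy as np

    def sliding_window_mask(attn_mask: np.ndarray, local_window_size: int) -> np.ndarray:
        """Apply the sliding-window pattern from the diagram above.

        attn_mask: bool array of shape (batch, heads, seq_q, seq_kv); True = attend.
        """
        ones = attn_mask.astype(np.int64)
        # Reversed, exclusive CumSum along the key axis (axis=3): for each key
        # position j, count the allowed positions strictly after j in the row.
        rev_excl_cumsum = np.flip(np.cumsum(np.flip(ones, axis=3), axis=3), axis=3) - ones
        # condition_2 (Lesser): j lies within the last local_window_size allowed positions.
        condition_2 = rev_excl_cumsum < local_window_size
        # LogicalAnd of condition_1 (the old mask) and condition_2 gives the new mask.
        return attn_mask & condition_2

    # Example: a causal mask over 5 tokens with local_window_size = 2 keeps,
    # for each query row, only the last 2 allowed key positions.
    causal = np.tril(np.ones((1, 1, 5, 5), dtype=bool))
    print(sliding_window_mask(causal, 2)[0, 0].astype(int))
    # [[1 0 0 0 0]
    #  [1 1 0 0 0]
    #  [0 1 1 0 0]
    #  [0 0 1 1 0]
    #  [0 0 0 1 1]]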

Motivation and Context

Support local_window_size for WebNN GQA
@peishenyan (Contributor, Author)

PTAL, thanks. @Honry

@Honry (Contributor) left a comment


Thanks, LGTM. Please also mention the need to use 'expand' for mask_shape_ones_shape_constant in your commit message.

@Honry (Contributor) commented Nov 17, 2025

@fdwr, please take another look, thanks!

