Hello,
I noticed that the current implementation of SageAttention3 lacks support for the backward pass. This is surprising, since the SageAttention3 paper includes figures and data explicitly benchmarking backward-pass performance.
Since many users (myself included) intend to use this library for training and fine-tuning models rather than just inference, the missing gradient support is a major blocker; a minimal repro is sketched below.
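
To make the request concrete, here is a rough sketch of the kind of training step that is currently blocked. The `sageattn` import and signature below follow the README for earlier SageAttention releases; if SageAttention3 exposes a different entry point, the same pattern applies:

```python
import torch
from sageattention import sageattn  # assumed entry point; adjust for the SageAttention3 API

# Toy attention inputs: (batch, heads, seq_len, head_dim) in "HND" layout
q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16, requires_grad=True)
k = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16, requires_grad=True)
v = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16, requires_grad=True)

# Forward pass works today
out = sageattn(q, k, v, tensor_layout="HND", is_causal=False)

# This is the blocked step: without a registered backward kernel,
# autograd cannot produce gradients through the attention op
out.sum().backward()
print(q.grad)  # expected: gradients w.r.t. q once backward support lands
```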
Could you please prioritize adding backward pass support to align the repository with the capabilities presented in the paper?
Thank you!
