tml-epfl/sparse-attention-dynamics

Incremental Learning of Sparse Attention Patterns in Transformers

This is the official code for the paper "Incremental Learning of Sparse Attention Patterns in Transformers", presented at the EurIPS 2025 Workshop on Principles of Generative Modeling and accepted to the ICML 2026 Main Conference.

The analysis/ folder contains notebooks for regenerating the paper plots from the W&B project r-alvarezlucendo16/incremental-learning.

Installation

uv sync

Running Experiments

# List available experiments
bash run.sh

# Run a specific experiment
bash run.sh <experiment_name>

Configuration

Experiments are configured using Hydra with configs located in conf/.

  • Experiment configs in conf/experiments/ override base settings from conf/train.yaml
  • Component configs can be customized: model/, dataset/, optimizer/, scheduler/, loss/
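
The override rule above can be pictured with a small plain-Python sketch. Hydra performs the actual composition itself, so this is only an illustration of the precedence (experiment values win, untouched base values survive), not the repository's code. The `merge_config` helper and every key and value below are hypothetical and are not taken from conf/.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Leaf value, or a key missing from the base: the override wins.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for conf/train.yaml and one file in conf/experiments/.
base_cfg = {"optimizer": {"name": "adam", "lr": 1e-3}, "epochs": 10}
experiment_cfg = {"optimizer": {"lr": 1e-4}}

final_cfg = merge_config(base_cfg, experiment_cfg)
print(final_cfg)
```

Only the learning rate changes in the merged result. The optimizer name and epoch count keep their base values, which is the behavior to expect when an experiment config overrides a single field.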
