Simplify FA3 discovery#2849

Merged
vcherepanov-nv merged 1 commit into NVIDIA:main from vcherepanov-nv:simplify-fa3-dir on Apr 9, 2026

Conversation

@vcherepanov-nv
Collaborator

Description

Use FA3 install as is, without copying files.

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Don't copy flash_attn_interface.py to another location after FA3 install
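
The effect of this change can be sketched as follows. This is a hypothetical standalone example, not code from the PR: after `python setup.py install` in `flash-attention/hopper`, `flash_attn_interface` lands directly on the Python path, so the previous copy into a `flash_attn_3` site-packages directory is no longer needed.

```python
# Discovery pattern enabled by this PR (illustrative sketch):
# import flash_attn_interface straight from the installed package,
# falling back gracefully when FA3 is not installed.
try:
    from flash_attn_interface import flash_attn_func as flash_attn_func_v3
except ImportError:
    # FA3 not installed: disable the v3 path instead of crashing at import time.
    flash_attn_func_v3 = None

print("FA3 available:", flash_attn_func_v3 is not None)
```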

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>
@vcherepanov-nv vcherepanov-nv requested a review from cyanguwa April 8, 2026 04:13
@greptile-apps
Contributor

greptile-apps bot commented Apr 8, 2026

Greptile Summary

This PR simplifies FA3 discovery by removing a post-install file-copy step and instead relying on setup.py install to place flash_attn_interface directly on the Python path, making the import from flash_attn_interface import … work out of the box. The change also updates the QA test script to remove the now-unnecessary copy step.

Confidence Score: 5/5

Safe to merge — the simplification is correct and the two remaining findings are P2 quality/robustness suggestions that do not block the primary use case.

All findings are P2: one is a defensive-coding suggestion (wrap the else imports in a try/except ImportError) and one is a CI reproducibility note (pin FA3 git checkout to the named version tag). Neither represents a present defect when FA3 is correctly installed via setup.py install, which is the intended and documented path.

No files require special attention; optional hardening suggested in backends.py and test.sh.

Vulnerabilities

No security concerns identified.

Important Files Changed

Filename Overview
transformer_engine/pytorch/attention/dot_product_attention/backends.py Switches FA3 imports from a copied local file to direct from flash_attn_interface import …; the else branch has no ImportError guard, so a partially installed FA3 package would crash the whole module load.
transformer_engine/pytorch/attention/dot_product_attention/utils.py Minor update to installation instructions comment for FA3 — no logic changes, looks correct.
qa/L3_pytorch_FA_versions_test/test.sh Removes file-copy step after FA3 install; FA3 clone always uses HEAD without pinning to the listed version tag, risking non-reproducible CI.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[backends.py module load] --> B{get_pkg_version\nflash-attn-3}
    B -- PackageNotFoundError --> C[flash_attn_func_v3 = None\nflash_attn_varlen_func_v3 = None\nflash_attn_with_kvcache_v3 = None]
    B -- found --> D[from flash_attn_interface import ...]
    D -- success --> E[fa_utils.set_flash_attention_3_params\nv3_is_installed = True]
    D -- ImportError not caught --> F[Module load crash]
    C --> G[FA3 disabled gracefully]
    E --> H[FA3 available for use]

Comments Outside Diff (1)

  1. transformer_engine/pytorch/attention/dot_product_attention/backends.py, line 133-151 (link)

    P2 Unguarded ImportError in else branch

    If get_pkg_version("flash-attn-3") succeeds (metadata is found) but flash_attn_interface is somehow not importable (e.g., an unusual install layout or broken sys.path), the bare from flash_attn_interface import ... calls in the else block will raise an unhandled ImportError that bubbles up and prevents the entire backends module from loading — crashing TE's import entirely. Wrapping the imports in a nested try/except ImportError would degrade gracefully to the "not installed" state instead:

    try:
        fa_utils.fa3_version = PkgVersion(get_pkg_version("flash-attn-3"))
    except PackageNotFoundError:
        flash_attn_func_v3 = None
        flash_attn_varlen_func_v3 = None
        flash_attn_with_kvcache_v3 = None
    else:
        try:
            from flash_attn_interface import flash_attn_func as flash_attn_func_v3
            from flash_attn_interface import (
                flash_attn_varlen_func as flash_attn_varlen_func_v3,
            )
            from flash_attn_interface import (
                flash_attn_with_kvcache as flash_attn_with_kvcache_v3,
            )
            from flash_attn_interface import _flash_attn_forward as _flash_attn_fwd_v3
            from flash_attn_interface import _flash_attn_backward as _flash_attn_bwd_v3
            fa_utils.set_flash_attention_3_params()
        except ImportError:
            flash_attn_func_v3 = None
            flash_attn_varlen_func_v3 = None
            flash_attn_with_kvcache_v3 = None

Reviews (1): Last reviewed commit: "Simplify FA3 discovery" | Re-trigger Greptile

Comment on lines 35 to 38
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper && python setup.py install
python_path=`python -c "import site; print(site.getsitepackages()[0])"`
mkdir -p $python_path/flash_attn_3
cp flash_attn_interface.py $python_path/flash_attn_3/
cd ../../
fi
Contributor

P2 FA3 install always uses HEAD, not the specified version

For FA3 (fa_version = "3.0.0b1"), the script clones the default branch HEAD without checking out the tag or commit corresponding to that version. If the upstream flash-attention main branch advances its API between runs, tests may pass or fail inconsistently and won't actually validate the 3.0.0b1 release. Adding a git checkout after the clone would pin the test to the intended version:

Suggested change
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper && python setup.py install
python_path=`python -c "import site; print(site.getsitepackages()[0])"`
mkdir -p $python_path/flash_attn_3
cp flash_attn_interface.py $python_path/flash_attn_3/
cd ../../
fi
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention && git checkout v${fa_version} && cd hopper && python setup.py install
cd ../../

@vcherepanov-nv vcherepanov-nv merged commit 181322e into NVIDIA:main Apr 9, 2026
11 of 12 checks passed