    Repositories list

    • diffusers

      Public
      🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
      Python
      6.6k000 Updated Dec 8, 2025
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      3.7k000 Updated Dec 6, 2025
    • LeetCUDA

      Public
      📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
      Cuda
      8688.8k50 Updated Dec 4, 2025
    • Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
      Cuda
      278000 Updated Dec 3, 2025
    • Z-Image

      Public
      305100 Updated Nov 28, 2025
    • 🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
      C++
      7674.3k00 Updated Nov 28, 2025
    • 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
      Python
      3264.8k10 Updated Nov 28, 2025
    • 📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
      Python
      2446100 Updated Nov 28, 2025
    • .github

      Public
      0100 Updated Nov 25, 2025
    • cache-dit

      Public
      A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗DiTs.
      Python
      36400 Updated Nov 24, 2025
    • ffpa-attn

      Public
      🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
      Cuda
      1223500 Updated Nov 18, 2025
    • [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
      Python
      83000 Updated Oct 30, 2025
    • 🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bits weight only.
      Python
      0600 Updated Oct 28, 2025
    • Python
      166000 Updated Oct 28, 2025
    • ComfyUI

      Public
      The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
      Python
      11k000 Updated Oct 27, 2025
    • ⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
      Python
      01640 Updated Oct 24, 2025
    • Kandinsky 5.0: A family of diffusion models for Video & Image generation
      Python
      37000 Updated Oct 22, 2025
    • Wan2.1

      Public
      Wan: Open and Advanced Large-Scale Video Generative Models
      Python
      2.2k100 Updated Oct 17, 2025
    • Wan2.2

      Public
      Wan: Open and Advanced Large-Scale Video Generative Models
      Python
      1.4k000 Updated Oct 17, 2025
    • nunchaku

      Public
      [ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
      Python
      204200 Updated Oct 15, 2025
    • Enjoy the magic of Diffusion models!
      Python
      1k000 Updated Oct 13, 2025
    • flux-fast

      Public
      A forked version of flux-fast that makes flux-fast even faster with cache-dit.
      Python
      16400 Updated Oct 11, 2025
    • HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
      Python
      115100 Updated Oct 4, 2025
    • cache-dit for comfyui
      Python
      22220 Updated Sep 27, 2025
    • HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
      Python
      50100 Updated Sep 10, 2025
    • Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
      Python
      40000 Updated Sep 9, 2025
    • Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
      Python
      355100 Updated Sep 3, 2025
    • Model Compression Toolbox for Large Language Models and Diffusion Models
      Python
      72000 Updated Aug 14, 2025
    • SpargeAttention: A training-free sparse attention that can accelerate any model inference.
      Cuda
      68600 Updated Aug 7, 2025
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      26k000 Updated Aug 5, 2025