xlite-dev

51 repositories

diffusers
Public
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Python
•
Apache License 2.0
•6.6k•0•0•0•Updated Dec 8, 2025
sglang
Public
SGLang is a fast serving framework for large language models and vision language models.
Python
•
Apache License 2.0
•3.7k•0•0•0•Updated Dec 6, 2025
LeetCUDA
Public
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
cuda cuda-kernels cuda-demo cuda-toolkit cuda-library cuda-kernel learn-cuda cuda-cpp hgemm flash-attention
Cuda
•
GNU General Public License v3.0
•868•8.8k•5•0•Updated Dec 4, 2025
SageAttention
Public
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Cuda
•
Apache License 2.0
•278•0•0•0•Updated Dec 3, 2025
Z-Image
Public
Apache License 2.0
•305•1•0•0•Updated Nov 28, 2025
lite.ai.toolkit
Public
🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
tensorrt mnn ncnn onnx onnxruntime yolov5 tnn mnn-model yolox robustvideomatting
C++
•
GNU General Public License v3.0
•767•4.3k•0•0•Updated Nov 28, 2025
Awesome-LLM-Inference
Public
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
mla vllm llm-inference awesome-llm flash-attention tensorrt-llm paged-attention deepseek flash-attention-3 deepseek-v3
Python
•
GNU General Public License v3.0
•326•4.8k•1•0•Updated Nov 28, 2025
Awesome-DiT-Inference
Public
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
flux wan diffusion dit sora stable-diffusion sdxl sd15 deepcache open-sora-plan
Python
•
GNU General Public License v3.0
•24•461•0•0•Updated Nov 28, 2025
.github
Public
0•1•0•0•Updated Nov 25, 2025
cache-dit
Public
A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗DiTs.
Python
•
Apache License 2.0
•36•4•0•0•Updated Nov 24, 2025
ffpa-attn
Public
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
cuda attention sdpa mla mlsys tensor-cores flash-attention deepseek deepseek-v3 deepseek-r1
Cuda
•
GNU General Public License v3.0
•12•235•0•0•Updated Nov 18, 2025
ImageReward
Public
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Python
•
Apache License 2.0
•83•0•0•0•Updated Oct 30, 2025
longcat-video-fast
Public
🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bit weight-only quantization.
longcat longcat-video
Python
•0•6•0•0•Updated Oct 28, 2025
LongCat-Video
Public
Python
•
MIT License
•166•0•0•0•Updated Oct 28, 2025
ComfyUI
Public
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Python
•
GNU General Public License v3.0
•11k•0•0•0•Updated Oct 27, 2025
qwen-image-fast
Public
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
qwen-image qwen-image-lightning qwen-image-edit qwen-image-api qwen-image-lora
Python
•
Apache License 2.0
•0•16•4•0•Updated Oct 24, 2025
Kandinsky-5
Public
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Python
•
Apache License 2.0
•37•0•0•0•Updated Oct 22, 2025
Wan2.1
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python
•
Apache License 2.0
•2.2k•1•0•0•Updated Oct 17, 2025
Wan2.2
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python
•
Apache License 2.0
•1.4k•0•0•0•Updated Oct 17, 2025
nunchaku
Public
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Python
•
Apache License 2.0
•204•2•0•0•Updated Oct 15, 2025
DiffSynth-Studio
Public
Enjoy the magic of Diffusion models!
Python
•
Apache License 2.0
•1k•0•0•0•Updated Oct 13, 2025
flux-fast
Public
A forked version of flux-fast that makes flux-fast even faster with cache-dit.
Python
•16•4•0•0•Updated Oct 11, 2025
HunyuanImage-3.0
Public
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Python
•
Other
•115•1•0•0•Updated Oct 4, 2025
comfyui-cache-dit
Public
cache-dit for comfyui
Python
•2•22•2•0•Updated Sep 27, 2025
HunyuanImage-2.1
Public
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
Python
•
Other
•50•1•0•0•Updated Sep 10, 2025
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python
•
Apache License 2.0
•40•0•0•0•Updated Sep 9, 2025
Qwen-Image
Public
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Python
•
Apache License 2.0
•355•1•0•0•Updated Sep 3, 2025
deepcompressor
Public
Model Compression Toolbox for Large Language Models and Diffusion Models
Python
•
Apache License 2.0
•72•0•0•0•Updated Aug 14, 2025
SpargeAttn
Public
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
Cuda
•
Apache License 2.0
•68•6•0•0•Updated Aug 7, 2025
pytorch
Public
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python
•
Other
•26k•0•0•0•Updated Aug 5, 2025