What's Changed
- Add request queue to limit concurrent generation requests by @BBuf in #535
- News: 🔥🔥SGLang Diffusion x Cache-DiT ready!🔥🔥 by @DefTruth in #536
- feat: optimize async fp8 ulysses attn by @DefTruth in #537
- chore: delete useless code in serving by @BBuf in #539
- feat: make all_to_all comm unified by @DefTruth in #538
- feat: add refresh cache context api by @DefTruth in #542
- feat: support image edit model in serving by @BBuf in #541
- chore: simplify ulysses async flag by @DefTruth in #543
- chore: re-register sage attn backend by @DefTruth in #544
- feat: support any head num for ulysses by @DefTruth in #546
- feat: support uneven heads in ulysses w/o padding by @triple-Mu in #547
- chore: refactor FLUX.2 image editing tests by @BBuf in #548
- feat: Add ulysses for any heads w/o padding by @DefTruth in #549
- feat: add envs manager for cache-dit by @DefTruth in #550
- fix: ulysses fp8 and uneven head conflicts by @DefTruth in #552
Full Changelog: v1.1.7...v1.1.8