UPSTREAM PR #1274: feat: option to enable vae tiling automatically for large images #60
Overview
Analysis of 48,340 functions across two Stable Diffusion C++ binaries reveals modest performance changes between versions. Modified functions: 102 (0.21%), new functions: 43, removed: 0, unchanged: 48,195 (99.7%).
Power Consumption:
Function Analysis
The single intentional code change, the VAE tiling threshold feature, shows significant improvements:
Most performance changes occur in C++ standard library functions with no source modifications.
Regressions:
Improvements:
Other analyzed functions showed minor changes in error handling and JSON deserialization paths.
Additional Findings
Performance changes stem primarily from compiler/toolchain differences rather than application code. The VAE tiling enhancement provides intelligent memory management for large image generation while avoiding overhead on small images, justifying the minimal 0.64-0.68% power increase. RMSNorm destruction regression may accumulate during model cleanup but occurs outside core inference loops. Most regressions affect supporting infrastructure (STL containers, timing utilities) with sub-microsecond absolute impacts.
🔎 Full breakdown: Loci Inspector.
Note
Source pull request: leejet/stable-diffusion.cpp#1274
This allows controlling the use of VAE tiling according to the requested image size: tiling will be enabled only for images larger than a given threshold.
Koboldcpp has this same feature. It's mainly useful when the size isn't known at launch time, like in the sd-server, and when the size comes from an input image.
The size is specified as the side of a square image: --vae-tiling 768 enables tiling for images larger than 768x768. This form was chosen mainly because it's easy to test with an NxN image, and it's more meaningful than something like 'megapixels'. Another possibility would be specifying the size with a string like "768x768". A memory threshold would arguably be better, but harder to implement, since the exact usage depends a lot on the specific backend and flags.
To keep it compatible with current usage, and still avoid an extra flag, the value is optional: --vae-tiling alone is the same as --vae-tiling 1, and --vae-tiling 0 also means no tiling.
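The flag semantics described here can be illustrated with a minimal sketch. This is not the PR's actual code: the function names `parse_vae_tiling_threshold` and `should_tile_vae` are hypothetical, and comparing pixel areas is an assumption, since the description does not say how non-square images are measured against the NxN threshold.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: interpret the optional value of --vae-tiling.
// A bare flag (no value) is treated as a threshold of 1; negative or
// non-numeric input is clamped to 0, which disables tiling.
int parse_vae_tiling_threshold(const std::optional<std::string>& value) {
    if (!value) {
        return 1;
    }
    return std::max(0, std::atoi(value->c_str()));
}

// Hypothetical helper: decide whether to tile for a requested image size.
// Assumption: an image is "larger than NxN" when its pixel count exceeds
// N * N, so non-square images are compared by area.
bool should_tile_vae(int width, int height, int threshold_side) {
    if (threshold_side <= 0) {
        return false;  // a threshold of 0 means no tiling
    }
    const std::int64_t area  = static_cast<std::int64_t>(width) * height;
    const std::int64_t limit = static_cast<std::int64_t>(threshold_side) * threshold_side;
    return area > limit;
}
```

Under these assumptions, a threshold of 1 enables tiling for every image larger than a single pixel, which matches the existing always-on behavior of a bare --vae-tiling, while a threshold of 768 leaves a 768x768 request untiled and tiles a 1024x1024 one.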