I've implemented Diff2Flow for SD 1.5 and trained it following the paper. However, I'm unclear on the correct sampling approach after training. The core question: how should we sample from the trained model?
During training, the model sees:
- Non-uniform t_dm (obtained from t_fm ~ U(0,1) through the FM→DM conversion)
- Scaled interpolants x_dm = scale(t) * x_fm
- A velocity loss, computed through an epsilon→velocity conversion
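To pin down the notation, these are the relations I am assuming behind the three points above. They are my reading of the setup, not something confirmed by the authors: a variance-preserving schedule with coefficients $\alpha_t, \sigma_t$, the FM convention that $t_{\mathrm{fm}} = 0$ is noise and $t_{\mathrm{fm}} = 1$ is data, and $\mathrm{scale}(t) = \alpha_t + \sigma_t$.

$$
x_{\mathrm{dm}} = \alpha_t x_1 + \sigma_t \epsilon, \qquad
t_{\mathrm{fm}} = \frac{\alpha_t}{\alpha_t + \sigma_t}, \qquad
\mathrm{scale}(t) = \alpha_t + \sigma_t
$$

$$
x_{\mathrm{fm}} = \frac{x_{\mathrm{dm}}}{\mathrm{scale}(t)} = t_{\mathrm{fm}}\, x_1 + (1 - t_{\mathrm{fm}})\, \epsilon
$$

$$
\hat{x}_1 = \frac{x_{\mathrm{dm}} - \sigma_t \hat{\epsilon}}{\alpha_t}, \qquad
\hat{v} = \hat{x}_1 - \hat{\epsilon}
$$

Under these assumptions the velocity target is $v = x_1 - \epsilon$, and the last line is the epsilon→velocity conversion applied to the network output $\hat{\epsilon}$.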
Possible approaches I've considered:
1. Standard DDIM: treat the model as a regular diffusion model
2. Euler integration: sample in FM space, with an FM↔DM conversion at each step
3. DDIM with a wrapper: a model wrapper handles the conversion (like FlowModelObj)
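To make the Euler option concrete, here is a minimal sketch of how I picture it. Everything in it is an assumption on my side rather than the authors' method: the names `sd15_schedule`, `euler_sample`, `eps_model`, and `oracle_eps` are hypothetical, the schedule values are the scaled-linear betas I believe SD 1.5 uses, the latent is a single float instead of a tensor, and starting at t_dm = 999 and taking the final step to t_fm = 1 are my own choices.

```python
import math

def sd15_schedule(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """(alpha_t, sigma_t) per integer timestep for a scaled-linear beta schedule.

    The default values are assumed to match SD 1.5; adjust if yours differ.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    alphas, sigmas, alpha_bar = [], [], 1.0
    for i in range(num_steps):
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        alpha_bar *= 1.0 - beta
        alphas.append(math.sqrt(alpha_bar))
        sigmas.append(math.sqrt(1.0 - alpha_bar))
    return alphas, sigmas

def euler_sample(eps_model, x_fm, timesteps, alphas, sigmas):
    """Euler-integrate dx/dt = v in FM space, querying the model in DM space.

    eps_model(x_dm, t_dm) -> predicted noise (hypothetical signature).
    timesteps: strictly decreasing integer DM timesteps, e.g. [999, 949, ...].
    """
    # FM time of every visited DM timestep, plus the data endpoint t_fm = 1.
    t_fm = [alphas[t] / (alphas[t] + sigmas[t]) for t in timesteps] + [1.0]
    for i, t in enumerate(timesteps):
        a, s = alphas[t], sigmas[t]
        x_dm = (a + s) * x_fm                      # FM -> DM: undo the rescaling
        eps = eps_model(x_dm, t)                   # network call in DM coordinates
        x1 = (x_dm - s * eps) / a                  # implied clean sample
        v = x1 - eps                               # FM velocity (data minus noise)
        x_fm = x_fm + (t_fm[i + 1] - t_fm[i]) * v  # Euler step in FM time
    return x_fm

# Toy check: if every data point equals TARGET, the exact noise prediction
# is known in closed form, so the sampler should land on TARGET.
TARGET = 1.5
ALPHAS, SIGMAS = sd15_schedule()

def oracle_eps(x_dm, t):
    return (x_dm - ALPHAS[t] * TARGET) / SIGMAS[t]

sample = euler_sample(oracle_eps, x_fm=-0.7,
                      timesteps=list(range(999, -1, -50)),
                      alphas=ALPHAS, sigmas=SIGMAS)
print(sample)
```

The step size here is the difference of FM times, not of DM timesteps, so the loop accepts any decreasing DM grid. Whether this matches what the trained model expects is exactly what I'd like confirmed.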
The mismatch is that standard DDIM uses uniformly spaced timesteps {1000, 950, 900, ...} and expects the diffusion noise schedule, while the model was trained on non-uniform timesteps and scaled linear interpolants.
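The spacing mismatch can be checked numerically. The sketch below assumes the SD 1.5 scaled-linear schedule and the mapping t_fm = alpha / (alpha + sigma); `fm_times` and `nearest_dm` are hypothetical helper names, and the nearest-timestep lookup is just one simple way to invert the mapping.

```python
import math

def fm_times(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """t_fm = alpha / (alpha + sigma) for each integer DM timestep (assumed mapping)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    out, alpha_bar = [], 1.0
    for i in range(num_steps):
        alpha_bar *= 1.0 - (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
        out.append(a / (a + s))
    return out

T_FM = fm_times()

# (a) Uniform DM grid, as in standard DDIM: the FM step sizes vary.
dm_grid = list(range(999, -1, -50))
fm_of_dm_grid = [T_FM[t] for t in dm_grid]
fm_steps = [b - a for a, b in zip(fm_of_dm_grid, fm_of_dm_grid[1:])]

# (b) Uniform FM grid: the nearest DM timesteps are unevenly spaced.
def nearest_dm(t_fm):
    return min(range(len(T_FM)), key=lambda t: abs(T_FM[t] - t_fm))

fm_grid = [i / 20 for i in range(1, 20)]
dm_of_fm_grid = [nearest_dm(t) for t in fm_grid]
dm_steps = [a - b for a, b in zip(dm_of_fm_grid, dm_of_fm_grid[1:])]

print("FM step sizes on a uniform DM grid:", [round(s, 3) for s in fm_steps])
print("DM timesteps for a uniform FM grid:", dm_of_fm_grid)
```

Note that under this mapping the discrete schedule does not reach t_fm = 0 or t_fm = 1 exactly, so FM times near the ends are clamped to timesteps 999 and 0 in this sketch.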
Could you clarify which approach is correct for epsilon-prediction models trained with Diff2Flow? Is there special handling needed for the training/inference distribution mismatch?
Thanks for the great work on this paper!