Notes on https://arxiv.org/abs/2302.11552 + recovery likelihood. I would like to know the compute/time and data requirements, which of the pipelines are energy-based, and which pipelines perform best.
There is the diffusion model, whose forward(x, t) function computes the gradient (the score-like denoising output).
There is the EBM wrapped around the diffusion model. The EBM has an additional neg_logp_unnorm function that computes the negative unnormalized log probability (the energy). The EBM also has a forward function that computes the gradient, but this gradient is the gradient of the energy and requires a backprop through the diffusion model's network. Both interfaces are sketched below.
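A minimal PyTorch sketch of these two interfaces. The class and method names follow the description above, but the internals (in particular the placeholder energy definition) are assumptions for illustration, not the repo's actual code:

```python
import torch
import torch.nn as nn

class ToyDiffusionModel(nn.Module):
    """Stand-in denoiser: forward(x, t) returns the gradient-like output."""
    def __init__(self, dim=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x, t):
        t = t.float().view(-1, 1).expand(x.shape[0], 1)  # crude time conditioning
        return self.net(torch.cat([x, t], dim=-1))

class EBMDiffusionModel(nn.Module):
    """Wraps a diffusion model so it also exposes an unnormalized energy."""
    def __init__(self, net):
        super().__init__()
        self.net = net

    def neg_logp_unnorm(self, x, t):
        # Placeholder energy built from the denoiser output (one scalar per sample).
        pred = self.net(x, t)
        return (pred ** 2).sum(-1)

    def forward(self, x, t):
        # Gradient of the energy w.r.t. x; this backprops through self.net,
        # which is the extra cost mentioned above.
        with torch.enable_grad():
            x = x.detach().requires_grad_(True)
            energy = self.neg_logp_unnorm(x, t).sum()
            return torch.autograd.grad(energy, x, create_graph=True)[0]
```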
EBM models can be composed through their neg_logp_unnorm functions. ProductEBMDiffusionModel is the product of two distributions, which corresponds to the addition of the neg_logp_unnorm outputs: energies add when densities multiply.
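Continuing the sketch above, product composition needs no new machinery; a sketch mirroring the described ProductEBMDiffusionModel (again an illustrative assumption, not the repo's code):

```python
import torch.nn as nn

class ProductEBMDiffusionModel(nn.Module):
    """Product composition: p(x) ∝ p1(x) · p2(x), so negative log probs add."""
    def __init__(self, ebm_a, ebm_b):
        super().__init__()
        self.ebm_a, self.ebm_b = ebm_a, ebm_b

    def neg_logp_unnorm(self, x, t):
        # -log(p1 · p2) = -log p1 - log p2 (up to the unknown normalizer)
        return self.ebm_a.neg_logp_unnorm(x, t) + self.ebm_b.neg_logp_unnorm(x, t)

    def forward(self, x, t):
        # The gradient of a sum of energies is the sum of the gradients.
        return self.ebm_a(x, t) + self.ebm_b(x, t)
```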
Either the diffusion model or the EBM can be used to construct a PortableDiffusionModel (the DDPM wrapper), which is required for training. If the DDPM wrapper holds an EBM, it can additionally compute p_energy. The training loss behaves the same in both cases and does not use the EBM's neg_logp_unnorm, as the sketch below illustrates.
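A hypothetical sketch of a standard epsilon-prediction DDPM loss (not PortableDiffusionModel's actual internals) showing the point: only forward() is called, so wrapping the plain diffusion model or the EBM yields the same training objective:

```python
import torch

def ddpm_loss(model, x0, t, alphas_cumprod):
    # Standard epsilon-prediction MSE; `model` can be the plain diffusion
    # model or the EBM wrapper, since only forward() is used.
    noise = torch.randn_like(x0)
    a = alphas_cumprod[t].view(-1, 1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise  # forward-process sample
    return ((model(x_t, t) - noise) ** 2).mean()
```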
There are samplers that don't need energy, only gradients: AnnealedULASampler, AnnealedUHASampler.
There are samplers that require energy for their Metropolis accept/reject correction: AnnealedMALASampler, AnnealedMUHASampler. Both step types are contrasted in the sketch below.
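A simplified, unannealed sketch of the two step types (the repo's annealed samplers add temperature/step schedules on top of this; `grad_fn` is the EBM's forward and `energy_fn` its neg_logp_unnorm):

```python
import torch

def ula_step(grad_fn, x, t, step):
    # Unadjusted Langevin: the gradient of the energy is enough.
    g = grad_fn(x, t).detach()
    return x - step * g + (2 * step) ** 0.5 * torch.randn_like(x)

def mala_step(grad_fn, energy_fn, x, t, step):
    # Metropolis-adjusted Langevin: the accept/reject test compares energies,
    # which is why this family requires neg_logp_unnorm, not just gradients.
    gx = grad_fn(x, t).detach()
    prop = x - step * gx + (2 * step) ** 0.5 * torch.randn_like(x)
    gp = grad_fn(prop, t).detach()
    # Log densities of the Gaussian proposals q(prop | x) and q(x | prop).
    log_q_fwd = -((prop - x + step * gx) ** 2).sum(-1) / (4 * step)
    log_q_bwd = -((x - prop + step * gp) ** 2).sum(-1) / (4 * step)
    log_alpha = energy_fn(x, t) - energy_fn(prop, t) + log_q_bwd - log_q_fwd
    accept = (torch.rand_like(log_alpha).log() < log_alpha).float().unsqueeze(-1)
    return accept * prop + (1 - accept) * x
```

Both step types consume the gradient, but only the MALA/MUHA family additionally needs the energy itself, which is why a plain diffusion model (gradients only) cannot drive them without the EBM wrapper.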
Questions:
What is the sample quality if we take a pretrained diffusion model, wrap it as an EBM without any finetuning, and sample with an energy-based sampler like AnnealedMUHASampler?
Does further EBM training/finetuning improve sample quality?
If the diffusion model operates on latents (e.g. Stable Diffusion), do the answers to the questions above change? What tweaks are needed to work with latents?
Besides composing distributions, what other benefits do EBMs have over score-based models?