SDGAN: Tuning Stable Diffusion with an adversarial network
This is a research project aimed at enhancing Stable Diffusion with GAN training for better perceptual detail. No pretrained model weights are currently available. The code is still in flux, and the model may change in ways that are not backwards compatible with previously trained weights. Use at your own risk.
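As a rough illustration of the idea, the sketch below combines a standard diffusion reconstruction loss with an adversarial term from a small discriminator operating on latents. This is a minimal, hypothetical example, not the SDGAN implementation: the discriminator architecture, the hinge-loss formulation, and the `adv_weight` value are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentDiscriminator(nn.Module):
    """Hypothetical tiny discriminator over 4-channel SD latents.
    (Illustrative only; not the architecture used in this repo.)"""
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def generator_loss(pred_latents, target_latents, disc, adv_weight=0.1):
    # Diffusion reconstruction term plus an adversarial term that pushes
    # the discriminator's score on generated latents up. adv_weight is a
    # placeholder hyperparameter.
    rec = F.mse_loss(pred_latents, target_latents)
    adv = -disc(pred_latents).mean()
    return rec + adv_weight * adv

def discriminator_loss(disc, real_latents, fake_latents):
    # Hinge loss: penalize real scores below +1 and fake scores above -1.
    real = F.relu(1.0 - disc(real_latents)).mean()
    fake = F.relu(1.0 + disc(fake_latents.detach())).mean()
    return real + fake
```

In this formulation the generator keeps its usual denoising objective and the adversarial term only nudges it toward sharper, more realistic latents; detaching `fake_latents` in the discriminator step keeps the two updates separate.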
TODO
- ☐ Investigate using different samples for generator and discriminator training
- ☑ Test the effect of adding self-attention to all layers (failed: CUDA out of memory)
- ☐ Test the effect of different parameter initializations
- ☑ Investigate using cross-attention
BIBLIOGRAPHY (incomplete)
- High-Resolution Image Synthesis with Latent Diffusion Models
- Image-to-Image Translation with Conditional Adversarial Networks
- Diffusion-GAN: Training GANs with Diffusion
- Scaling up GANs for Text-to-Image Synthesis
- Fast Transformer Decoding: One Write-Head is All You Need
- Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
- Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models
- Refining activation downsampling with SoftPool
- Adversarial score matching and improved sampling for image generation