# Image Generation with Diffusion Models

Authors:

This project was conducted as part of a deep learning course and focuses on generating images with various diffusion model architectures. The objective was to develop and evaluate models capable of producing high-quality images through the reverse diffusion process. Training and evaluation used the LSUN Bedroom dataset.


## Training

1. Set up the training environment by installing the packages listed in `requirements.txt`.
2. Navigate to the `src` directory.
3. Run the training script:

```
python train.py --model_type ... --output_dir ... --num_epochs ...
```
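
For example, a run matching the 50-epoch DDPM checkpoint described below might look like the following (the flag values are assumptions based on the model types and epoch counts we report; check `train.py --help` for the exact options):

```
python train.py --model_type ddpm --output_dir ./ddpm_50_epochs_LSUN_bedroom --num_epochs 50
```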

## Evaluation

Once you have a trained model, you can evaluate it with the `model_evaluation.py` script:

```
python model_evaluation.py --experiment_dir ... -no_times ...
```

We used `-no_times 200` in our experiments.
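
For instance, evaluating the hypothetical 50-epoch DDPM run from the training example above:

```
python model_evaluation.py --experiment_dir ./ddpm_50_epochs_LSUN_bedroom -no_times 200
```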

## Model Checkpoints

Model checkpoints are available in our Hugging Face repositories. The checkpoints for each model type are stored in directories named after the training duration (`model_type_x_epochs_LSUN_bedroom`).
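
If the checkpoints are saved in the standard `diffusers` pipeline format, a minimal sketch for loading one and sampling from it could look like this (the repository id below is a placeholder, not the actual repo):

```python
from diffusers import DDPMPipeline

# Placeholder repository id -- substitute the actual Hugging Face repo.
pipeline = DDPMPipeline.from_pretrained("your-username/ddpm_50_epochs_LSUN_bedroom")
pipeline.to("cuda")  # optional: move to GPU if available

# Sample a small batch of bedroom images via the reverse diffusion process.
images = pipeline(batch_size=4).images
images[0].save("sample.png")
```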

We conducted experiments with three types of diffusion models: DDPM, DDIM, and PNDM. Each model was trained for 5, 10, and 50 epochs to evaluate its performance. The table below summarizes the results using the Fréchet Inception Distance (FID), which compares the distribution of generated images against real images; lower scores indicate higher-quality generation.

| Model Type | Epochs | FID | Animation |
|------------|-------:|-------:|-----------|
| DDPM | 5 | 34.94 | animation |
| DDIM | 5 | 128.27 | animation |
| PNDM | 5 | 127.71 | animation |
| DDPM | 10 | 10.05 | animation |
| DDIM | 10 | 56.86 | animation |
| PNDM | 10 | 61.44 | animation |
| DDPM | 50 | 12.73 | animation |
| DDIM | 50 | 14.49 | animation |
| PNDM | 50 | 23.44 | animation |

From these results, we observe that the DDPM model, especially when trained for 10 epochs, achieved the lowest FID score, indicating high-quality image generation. Despite the advancements in DDIM and PNDM models, they did not outperform DDPM in our experiments. This may be partly due to the smaller sample size used in our evaluations compared to the original benchmarks.
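
For reference, the sketch below shows one common way to compute FID with `torchmetrics` (which requires the `torch-fidelity` package); we are not asserting that `model_evaluation.py` uses this implementation.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# Both batches are uint8 tensors of shape (N, 3, H, W) with values in
# [0, 255]; random data stands in here for real LSUN Bedroom images and
# model samples.
real = torch.randint(0, 256, (200, 3, 256, 256), dtype=torch.uint8)
fake = torch.randint(0, 256, (200, 3, 256, 256), dtype=torch.uint8)

fid.update(real, real=True)
fid.update(fake, real=False)
print(f"FID: {fid.compute().item():.2f}")
```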

Animation from the DDPM model trained for 50 epochs:

*(animation: DDPM, 50 epochs)*


## Latent Space Interpolation

We also investigated the capability of the models to interpolate between latent vectors and generate intermediate images. This exploration demonstrated the models' ability to smoothly transition between different room layouts, lighting conditions, and furniture arrangements. The resulting interpolations highlight the potential of these models for applications such as interior design visualization and scene generation.
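
The interpolation itself can be done in the initial noise space. Below is a minimal sketch using spherical interpolation (slerp), a common choice for Gaussian latents; the exact scheme used in our experiments may differ, and the 256x256 image size is an assumption for LSUN Bedroom.

```python
import torch

def slerp(z1: torch.Tensor, z2: torch.Tensor, t: float) -> torch.Tensor:
    """Spherically interpolate between latents z1 and z2 at fraction t."""
    v1, v2 = z1.flatten(), z2.flatten()
    omega = torch.acos(torch.clamp(
        torch.dot(v1 / v1.norm(), v2 / v2.norm()), -1.0, 1.0))
    sin_omega = torch.sin(omega)
    return (torch.sin((1 - t) * omega) / sin_omega) * z1 \
         + (torch.sin(t * omega) / sin_omega) * z2

# Two starting noise samples and eight interpolated latents between them.
z_a = torch.randn(1, 3, 256, 256)
z_b = torch.randn(1, 3, 256, 256)
latents = [slerp(z_a, z_b, t.item()) for t in torch.linspace(0, 1, 8)]
# Each latent would then be run through the model's reverse diffusion
# process to produce the corresponding frame of the interpolation.
```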

Results from the DDPM model trained for 50 epochs: