Image Captioning for Cake Images using Stable Diffusion Augmentation

MissWildcard/PictureCue

PictureCue

Cakewalk: Image Captioning for Cake Images using Stable Diffusion Augmentation

Step0:
Before running the notebooks, make sure that you have downloaded the COCO dataset from http://images.cocodataset.org/

To download the dataset, you may use:

wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/zips/unlabeled2017.zip

Go to the directory containing the downloaded archives, unzip them, and then delete the archives:

unzip train2017.zip
unzip val2017.zip
unzip test2017.zip
unzip unlabeled2017.zip

rm train2017.zip
rm val2017.zip
rm test2017.zip
rm unlabeled2017.zip
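
Before moving on, it can help to confirm that the extraction worked. The following is a minimal sketch, not part of the repository: it assumes each archive unpacks into a folder with the same name as the archive (`train2017`, `val2017`, and so on), and `missing_coco_dirs` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: each archive unpacks into a folder named after the archive.
EXPECTED_DIRS = ("train2017", "val2017", "test2017", "unlabeled2017")


def missing_coco_dirs(root):
    """Return the expected COCO image folders that are absent or empty under root."""
    root = Path(root)
    missing = []
    for name in EXPECTED_DIRS:
        folder = root / name
        # A folder counts as present only if it exists and contains something.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "." is a placeholder for the directory where the archives were unzipped.
    print(missing_coco_dirs("."))
```

An empty list means all four folders were found and are non-empty.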

Step1:
Run 1_DatasetGeneration.ipynb to extract the required cake images and their captions (the 2.9k-image dataset).
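
To give an idea of what this extraction involves, here is a rough sketch of selecting cake images and their captions from COCO-format annotations. It is an illustration under assumptions, not the notebook's actual code: it assumes the standard COCO annotation layout (an instances file with `categories` and `annotations`, and a captions file with `annotations`), and `cake_captions` is a hypothetical helper. The annotation files themselves are not among the downloads listed in Step0, so their location is left to you.

```python
from collections import defaultdict


def cake_captions(instances, captions, category_name="cake"):
    """Map image_id -> list of captions for images containing the given category."""
    # Find the numeric id(s) of the category by its name.
    cat_ids = {c["id"] for c in instances["categories"] if c["name"] == category_name}
    # Keep every image that has at least one object annotation of that category.
    image_ids = {
        a["image_id"] for a in instances["annotations"] if a["category_id"] in cat_ids
    }
    result = defaultdict(list)
    for ann in captions["annotations"]:
        if ann["image_id"] in image_ids:
            result[ann["image_id"]].append(ann["caption"].strip())
    return dict(result)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two annotation files (made-up values).
    instances = {
        "categories": [{"id": 1, "name": "cake"}, {"id": 2, "name": "dog"}],
        "annotations": [
            {"image_id": 10, "category_id": 1},
            {"image_id": 11, "category_id": 2},
        ],
    }
    captions = {
        "annotations": [
            {"image_id": 10, "caption": "A chocolate cake on a plate. "},
            {"image_id": 11, "caption": "A dog on a couch."},
        ]
    }
    print(cake_captions(instances, captions))
```

With real data, the two dictionaries would come from `json.load` on the instances and captions annotation files.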

Step2:
Run 2_DatasetAugmentation_withSD.ipynb to generate synthetic cake images using Stable Diffusion.
Merge the synthetic images and the original images into a single folder (the 5.9k-image dataset).
At this point you may run Similaritymatrix.ipynb to get an understanding of the images and their captions.
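
The merge step above can be done by hand or scripted. Below is a minimal sketch, assuming the original and synthetic images live in two separate folders; the folder names, the `merge_image_folders` helper, and the file-name prefixes are all hypothetical and not taken from the notebooks.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def merge_image_folders(sources, destination):
    """Copy images from each (prefix, folder) pair into destination; return the count."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = 0
    for prefix, folder in sources:
        for path in sorted(Path(folder).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                # Prefix the file name so an original and a synthetic image
                # that share a name do not overwrite each other.
                shutil.copy2(path, destination / f"{prefix}_{path.name}")
                copied += 1
    return copied


# Example call with placeholder folder names:
# merge_image_folders([("orig", "cakes_original"), ("sd", "cakes_synthetic")], "cakes_merged")
```

Note that prefixing changes the file names, so any caption file keyed by file name would need the same prefix applied. If the two folders are known to have no name clashes, a plain copy without prefixes works as well.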

Step3:
Run 3_ModelTraining.ipynb to train your model on both the original dataset and the augmented dataset.
Give each model a descriptive name when saving it to your directory, so that the two trained models are easy to tell apart.

Step4:
We have saved the trained models at https://drive.google.com/drive/folders/1WpzQjFIvOQkhHSSTrdB5vAwkXzg-4lGt?usp=sharing
Load the required model and run 4_ModelEvaluation.ipynb to evaluate it using BLEU scores.
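
For readers unfamiliar with BLEU, the sketch below shows how a sentence-level score is computed from clipped n-gram precisions and a brevity penalty. It is a plain-Python illustration of the standard unsmoothed definition; the evaluation notebook may well use a library implementation with different settings (n-gram order, smoothing, corpus-level aggregation), so scores need not match. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights over 1..max_n grams."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate is shorter than n tokens
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    candidate = "a cake on a plate".split()
    references = ["a cake on a table".split()]
    print(sentence_bleu(candidate, references, max_n=2))
```

Captions are short, so higher-order n-grams often have no matches; that is why library implementations usually offer smoothing, which this sketch leaves out.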

For inference, you may use either of our models (one trained on the original 2.9k dataset and the other on the 5.9k augmented dataset).
You may use the sample cake images from Data/cake_samples.zip.
You can also use your own cake images with our models.
Run DEMO.ipynb for this.
