Image Captioning

Usage

Create a directory named 'models' in code directory
Download best_model.pt and place into models directory
best_model.pt specifies parameters for a single-cell LSTM-based RNN decoder with embedding and output FC layers to transform word tokens
place any images you want captions to be generated for in the img_data/images/ directory
outputs will be saved to generated_output/
for caption generation, simply run 'python main_generate.py'

main.py: Main driver class
generate_main.py: just runs image generation using pretrained network best_model.pt
experiment.py: Main experiment class. Initialized based on config - takes care of training, saving stats and plots, logging and resuming experiments.
generate_captions.py: stripped down version of experiment.py, only contains code for generation using pretrained model
dataset_factory: Factory to build datasets based on config
model_factory.py: Factory to build models based on config
constants.py: constants used across the project
file_utils.py: utility functions for handling files
caption_utils.py: utility functions to generate bleu scores
vocab.py: A simple Vocabulary wrapper
coco_dataset: A simple implementation of torch.utils.data.Dataset the Coco Dataset

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
__pycache__		__pycache__
.gitignore		.gitignore
README.md		README.md
caption_utils.py		caption_utils.py
coco_dataset.py		coco_dataset.py
constants.py		constants.py
dataset_factory.py		dataset_factory.py
default.json		default.json
experiment.py		experiment.py
file_utils.py		file_utils.py
generate_captions.py		generate_captions.py
main.py		main.py
main_generate.py		main_generate.py
model_factory.py		model_factory.py
model_factory_gen.py		model_factory_gen.py
savedVocab		savedVocab
test_ids.csv		test_ids.csv
train_ids.csv		train_ids.csv
val_ids.csv		val_ids.csv
vocab.py		vocab.py