VSE: Visual Semantic Embedding in PyTorch

Description

This repository contains a PyTorch implementation of visual-semantic embedding.
Training and evaluation are done on the MSCOCO dataset.

Requirements (libraries)

python>=3.7
numpy
matplotlib
pytorch>=1.1.0
torchvision
Pillow
faiss-cpu (for nearest neighbor search)
accimage (optional, for fast loading of images)
torchtext (for vocabulary)
spacy (for spacy tokenizer)

Run the following command before training to download the spaCy English model.

$ python -m spacy download en

For Anaconda Users

The environment.yml file contains the environment details for Anaconda users.
To create and activate the environment, run conda env create -f environment.yml && conda activate mse.

Preparation of Dataset

Go to the directory where the data should be stored and run download_coco.sh.
This directory is referred to as $ROOTPATH below.

Training

$ python train.py --root_path $ROOTPATH

Reported Scores

Image to Caption

Model	R@1	R@5	R@10	Med r
VSE++	41.3	71.1	81.2	2.0
Our Implementation	31.7	61.5	72.6	3.0

Caption to Image

Model	R@1	R@5	R@10	Med r
VSE++	30.3	59.4	72.4	4.0
Our Implementation	22.4	48.8	61.9	6.0
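
As a rough illustration of how the columns above are usually defined: R@K is the percentage of queries whose ground-truth match is ranked within the top K results, and Med r is the median rank of the ground-truth match. The sketch below assumes a list of 1-based ranks has already been computed, one per query; the function names and the sample ranks are hypothetical and are not taken from this repository's eval.py.

```python
from statistics import median


def recall_at_k(ranks, k):
    """Percentage of queries whose ground-truth item is ranked in the top k.

    `ranks` holds 1-based ranks, one per query (hypothetical input format).
    """
    return 100.0 * sum(r <= k for r in ranks) / len(ranks)


def summarize(ranks):
    """Collect the four metrics shown in the tables into one dict."""
    return {
        "R@1": recall_at_k(ranks, 1),
        "R@5": recall_at_k(ranks, 5),
        "R@10": recall_at_k(ranks, 10),
        "Med r": float(median(ranks)),
    }


# Made-up ranks for five queries, purely for illustration.
ranks = [1, 3, 12, 2, 7]
scores = summarize(ranks)
print(scores)
```

Higher R@K and lower Med r are better, which is the direction to read when comparing the two rows of each table.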

Evaluation, Visualization

$ python eval.py --root_path $ROOTPATH --checkpoint hogehoge.ckpt --image_path $IMAGE --caption $CAPTION

$IMAGE is the path to the reference image. It defaults to samples/sample1.jpg.
$CAPTION is the reference caption. It defaults to "the cat is walking on the street".
hogehoge.ckpt is a placeholder name; replace it with the path to your trained checkpoint.
Retrieval is performed on the MSCOCO validation set.
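
Conceptually, retrieval in a joint embedding space is a nearest neighbor search: the query (a caption or an image) is embedded, and the closest items of the other modality are returned. The requirements list faiss-cpu for this step; the sketch below instead uses a brute-force NumPy search with cosine similarity so that it stays self-contained. The function name, the similarity measure, and the toy 2-D vectors are assumptions for illustration, not the code in eval.py.

```python
import numpy as np


def top_k_by_cosine(query, gallery, k=3):
    """Return (indices, similarities) of the k gallery rows closest to query.

    Both inputs are L2-normalized first, so the dot product equals
    cosine similarity. Results are ordered best match first.
    """
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # one similarity per gallery item
    order = np.argsort(-sims)[:k]     # descending similarity
    return order, sims[order]


# Toy "image embeddings" (rows) and one "caption embedding"; values are made up.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
query = np.array([0.9, 0.1])

idx, sims = top_k_by_cosine(query, gallery, k=2)
print(idx, sims)
```

Brute-force search like this scales linearly with the gallery size, which is one reason a dedicated library such as faiss is commonly preferred for larger sets.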

TODO

add Flickr8k
add Flickr30k
clean up validation
find optimal hyperparameters
