Deep-Re-learning-with-Image-Captioning

This project explores the task of image captioning through a combined deep learning and collective intelligence approach. We are interested in how effectively and accurately our model can generate captions for any input image, and in how it can take in feedback on the results and improve its performance. After generating captions for images with an LSTM model, we built a Twitter bot using NodeJS's Twit to collect user feedback on these captions, which we can then run back through our model to generate better captions!

Dependencies and Requirements:

  • Bazel (to build and run the im2txt model)
  • TensorFlow (the im2txt LSTM captioning model)
  • A pretrained Inception v3 checkpoint (inception_v3.ckpt)
  • Node.js with the Twit package (for the feedback Twitter bot)
  • Optional: the MSCOCO dataset in TFRecord format, for additional training

Instructions on how to run the program:

  1. Build the model

    cd im2txt/
    INCEPTION_DIR="${HOME}/path where you placed inception_v3.ckpt"
    MODEL_DIR="${HOME}/your path to the pretrained model"

    bazel build -c opt //im2txt/...

    (the trailing /... is Bazel's wildcard for all targets under im2txt/)
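Before training, you can sanity-check that the Inception checkpoint is readable. A minimal Python sketch (this helper is not part of the repo; the path mirrors the INCEPTION_DIR variable above):

    # check_checkpoint.py -- hypothetical helper, not part of this repo.
    # Lists a few variables stored in the Inception v3 checkpoint to
    # verify the file is readable before kicking off training.
    import os

    import tensorflow as tf

    # Same file the training flags point at via INCEPTION_DIR.
    inception_checkpoint = os.path.expanduser("~/path/to/inception_v3.ckpt")

    reader = tf.train.NewCheckpointReader(inception_checkpoint)
    var_shapes = reader.get_variable_to_shape_map()
    print("checkpoint holds %d variables" % len(var_shapes))
    for name in sorted(var_shapes)[:5]:  # show the first few entries
        print(name, var_shapes[name])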
Optional (if you have MSCOCO data and want to train further on your own):

    COCO_DIR="${HOME}/where you placed your MSCOCO datasets in the correct tf format"
    INCEPTION_CHECKPOINT="${INCEPTION_DIR}/inception_v3.ckpt"

    bazel-bin/im2txt/train \
      --input_file_pattern="${COCO_DIR}/train-?????-of-00256" \
      --inception_checkpoint_file="${INCEPTION_CHECKPOINT}" \
      --train_dir="${MODEL_DIR}/train" \
      --train_inception=false \
      --number_of_steps=<number of steps you want to train for>
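If you converted MSCOCO yourself, a quick way to confirm the shards are usable is to count the records in a few of them. A minimal Python sketch, assuming TF 1.x and the train-?????-of-00256 shard naming used above (not part of the repo):

    # count_records.py -- hypothetical helper, not part of this repo.
    # Counts serialized examples in the MSCOCO training shards to confirm
    # the TFRecord conversion produced non-empty files.
    import glob
    import os

    import tensorflow as tf

    coco_dir = os.path.expanduser("~/path/to/mscoco")  # same location as COCO_DIR
    shards = sorted(glob.glob(os.path.join(coco_dir, "train-?????-of-00256")))
    print("found %d shards" % len(shards))

    for shard in shards[:3]:  # sample the first few shards
        count = sum(1 for _ in tf.python_io.tf_record_iterator(shard))
        print("%s: %d records" % (os.path.basename(shard), count))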
  2. Generate captions for any given image

    cd im2txt/
    CHECKPOINT_PATH="${HOME}/your path to the trained model"
    VOCAB_FILE="${HOME}/your path to the word_counts.txt in the trained model"
    IMAGE_FILE="${HOME}/your path to the image"

    bazel build -c opt //im2txt:run_inference

    bazel-bin/im2txt/run_inference \
      --checkpoint_path=${CHECKPOINT_PATH} \
      --vocab_file=${VOCAB_FILE} \
      --input_files=${IMAGE_FILE}
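run_inference uses word_counts.txt to map the model's word ids back to words. If you want to inspect that vocabulary yourself, here is a minimal Python sketch (not part of the repo), assuming each line holds a word followed by its corpus count:

    # load_vocab.py -- hypothetical helper, not part of this repo.
    # Reads word_counts.txt, assuming one "word count" pair per line,
    # and prints the most frequent vocabulary entries.
    import os

    vocab_file = os.path.expanduser("~/path/to/word_counts.txt")  # same file as VOCAB_FILE

    vocab = []
    with open(vocab_file) as f:
        for line in f:
            word, count = line.rsplit(" ", 1)
            vocab.append((word, int(count)))

    print("vocabulary size: %d" % len(vocab))
    for word, count in sorted(vocab, key=lambda wc: -wc[1])[:10]:
        print("%s\t%d" % (word, count))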
  3. Incorporate new data and retrain from the previous model checkpoint

    1. After you download the user feedback from the surveys as a csv file:
    • Run feedbackConverter on the feedback.csv file
    • Run tfrecordConverter on the output of feedbackConverter (in the code, replace the paths to the files with your own)
    (a minimal sketch of what this conversion does appears at the end of this section)
    2. Then, in the im2txt directory, run:

    DATA_DIR="${HOME}/your path to the tf_record file generated by tfrecordConverter"
    MODEL_DIR="${HOME}/your path to the model"

    bazel-bin/im2txt/train \
      --input_file_pattern="${DATA_DIR}/train-?????-of-00256" \
      --train_dir="${MODEL_DIR}" \
      --train_inception=true \
      --number_of_steps=200  # or any number of steps you want to iterate on

Training cumulatively saves on top of your previous model checkpoint on each iteration, so after the new training you can generate captions by following the same steps as in step 2.
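For reference, here is a minimal Python sketch of the conversion that feedbackConverter and tfrecordConverter perform in step 3. It is not the repo's actual code: it assumes the feedback CSV has image_path and caption columns (your export's column names may differ) and writes SequenceExamples in the layout im2txt's training reader expects, with the JPEG bytes in an image/data context feature and the tokenized caption ids in an image/caption_ids feature list.

    # feedback_to_tfrecord.py -- hypothetical sketch, not the repo's converters.
    # Turns user-feedback captions into an im2txt-style TFRecord file.
    import csv
    import os

    import tensorflow as tf

    FEEDBACK_CSV = os.path.expanduser("~/path/to/feedback.csv")
    VOCAB_FILE = os.path.expanduser("~/path/to/word_counts.txt")  # from the trained model
    # Name the output so it matches the --input_file_pattern passed to training.
    OUTPUT_FILE = os.path.expanduser("~/path/to/train-00000-of-00256")

    # Build word -> id from word_counts.txt; line order defines the ids.
    with open(VOCAB_FILE) as f:
        vocab = {line.rsplit(" ", 1)[0]: idx for idx, line in enumerate(f)}
    unk_id = len(vocab)  # out-of-vocabulary words get a catch-all id

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def _int64_feature_list(values):
        return tf.train.FeatureList(feature=[
            tf.train.Feature(int64_list=tf.train.Int64List(value=[v]))
            for v in values])

    writer = tf.python_io.TFRecordWriter(OUTPUT_FILE)
    with open(FEEDBACK_CSV) as f:
        for row in csv.DictReader(f):  # assumed columns: image_path, caption
            with open(os.path.expanduser(row["image_path"]), "rb") as img:
                encoded_image = img.read()
            # im2txt brackets captions with <S> and </S> sentence markers.
            tokens = ["<S>"] + row["caption"].lower().split() + ["</S>"]
            caption_ids = [vocab.get(t, unk_id) for t in tokens]

            example = tf.train.SequenceExample(
                context=tf.train.Features(feature={
                    "image/data": _bytes_feature(encoded_image),
                }),
                feature_lists=tf.train.FeatureLists(feature_list={
                    "image/caption_ids": _int64_feature_list(caption_ids),
                }))
            writer.write(example.SerializeToString())
    writer.close()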
