video_captioning_dataloader

According to the PyTorch documentation, code for processing data samples can get messy and hard to maintain, so dataset code should ideally be decoupled from model training code for better readability and modularity. PyTorch provides two data primitives, torch.utils.data.DataLoader and torch.utils.data.Dataset, that let you use pre-loaded datasets as well as your own data. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to give easy access to the samples. The PyTorch domain libraries also provide a number of pre-loaded datasets (such as FashionMNIST) that subclass torch.utils.data.Dataset and implement functions specific to the particular data. These can be used to prototype and benchmark your model.
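As a minimal sketch of how the two primitives fit together, the snippet below defines a toy Dataset and iterates over it with a DataLoader. The class name ToyClipDataset, the tensor shapes, and the batch size are illustrative assumptions and are not taken from this repository.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyClipDataset(Dataset):
    """Hypothetical dataset: random 'clips' paired with integer labels."""

    def __init__(self, num_items=8, frames=4, height=16, width=16, channels=3):
        # Seeded generator so the random clips are reproducible.
        generator = torch.Generator().manual_seed(0)
        self.clips = torch.rand(
            num_items, frames, height, width, channels, generator=generator
        )
        self.labels = torch.arange(num_items)

    def __len__(self):
        # Number of samples the DataLoader can draw from.
        return len(self.clips)

    def __getitem__(self, index):
        # One (sample, label) pair; the DataLoader stacks these into batches.
        return self.clips[index], self.labels[index]


dataset = ToyClipDataset()
loader = DataLoader(dataset, batch_size=4, shuffle=False)
first_clips, first_labels = next(iter(loader))
```

The training loop only sees batches from the loader, so the storage and loading details stay inside the Dataset class.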

requirements

cv2 (opencv-python)
csv (Python standard library)
numpy
pandas
pytorch (torch)
sklearn (scikit-learn)

data

A .csv file contains the paths to 5 videos and their corresponding captions. Each video is loaded by iterating through this file. Next, a dataframe is created in which each row contains an array of size (N*H*W*C) and a caption. A class named 'MovieCaptioningDataset' is created, which inherits from the Dataset class in torch.utils.data. An instance of this class is created for the training data and another for the validation data, each containing the model's input and output.
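The pipeline described above might look roughly like the following sketch. It is not the code in dataloader.py: the column names "frames" and "caption", the (N, H, W, C) array layout, the 80/20 split, and the synthetic arrays used in place of real videos are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset


class MovieCaptioningDataset(Dataset):
    """Sketch: wraps a dataframe with 'frames' and 'caption' columns (assumed names)."""

    def __init__(self, dataframe):
        # Reset the index so positions 0..len-1 are valid after a split.
        self.dataframe = dataframe.reset_index(drop=True)

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, index):
        row = self.dataframe.iloc[index]
        # Convert the stored uint8 frame array to a float tensor.
        frames = torch.from_numpy(row["frames"]).float()
        return frames, row["caption"]


# Synthetic stand-in for 5 loaded videos: 4 frames each, 48x72 pixels, 3 channels.
rng = np.random.default_rng(0)
frame_table = pd.DataFrame({
    "frames": [
        rng.integers(0, 256, size=(4, 48, 72, 3), dtype=np.uint8)
        for _ in range(5)
    ],
    "caption": [f"caption {i}" for i in range(5)],
})

# Assumed 80/20 split into training and validation rows.
train_frame, val_frame = train_test_split(
    frame_table, test_size=0.2, random_state=0
)
train_dataset = MovieCaptioningDataset(train_frame)
val_dataset = MovieCaptioningDataset(val_frame)
```

Captions are returned as plain strings here; a real captioning model would usually tokenize them before batching.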

usage

Set DISPLAY_FRAMES and SAVE_FRAMES to True or False to choose whether the sampled frames are displayed or saved to a folder. NUM_OF_SAMPLES determines the number of frames sampled from each video.

To run the .py file:

python dataloader.py

The .ipynb version has also been uploaded in the repository.

The weaknesses:

1. Data augmentation is not implemented.

Solution:

This library provides a useful tool for augmenting videos

2. The input resolution must be constant.

Solution:

This can be solved with the Albumentations library, which can resize every input frame to a constant height and width and apply various kinds of transforms to the sampled video frames.

contact:

[email protected]

Nadiam75/video_captioning_dataloader
