DatasetHelper

A collection of small Python components that make common dataset preparation tasks easier.

List of components

Dataset Splitter - Randomly splits a dataset into train, test, and validation sets in an 80:10:10 ratio.
CSV Generator - Generates a CSV file from a Pascal VOC dataset.
TFRecord Generator - Generates a TFRecord file from the CSV file of a Pascal VOC dataset.

Usage

Dataset Splitter

Steps to split a Pascal VOC dataset in Colab.

A sample dataset structure looks like this:

images.zip
  -- img1.jpg
  -- img1.xml
  -- img2.jpg
  -- img2.xml
  ...

Create the images directory.

!mkdir /content/images

(Optional) Unzip the dataset if it is in a zip archive. (Note: If your dataset is already unzipped, make sure all of its files are in the '/content/images/all' directory.)

!unzip -q images.zip -d /content/images/all

Create directories for the train, test, and validation sets.

!mkdir /content/images/train; mkdir /content/images/test; mkdir /content/images/validation

Download the DatasetSplitter.py file.

!wget https://raw.githubusercontent.com/MasoomBadi/DatasetHelper/main/DatasetSplitter.py

Run the Python script.

!python DatasetSplitter.py
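
For readers who want to see what a random 80:10:10 split involves, here is a minimal sketch. It is not the code in DatasetSplitter.py; the function names `split_stems` and `split_dataset`, the `seed` parameter, and the choice to move (rather than copy) files are assumptions made for illustration. It assumes every image is a `.jpg` with a matching `.xml` annotation in an `all` subdirectory.

```python
import random
import shutil
from pathlib import Path


def split_stems(stems, ratios=(0.8, 0.1, 0.1), seed=None):
    """Shuffle file stems and partition them into train/test/validation lists."""
    stems = sorted(stems)  # sort first so a fixed seed gives a repeatable split
    random.Random(seed).shuffle(stems)
    n_train = int(len(stems) * ratios[0])
    n_test = int(len(stems) * ratios[1])
    return {
        "train": stems[:n_train],
        "test": stems[n_train:n_train + n_test],
        "validation": stems[n_train + n_test:],  # remainder absorbs rounding
    }


def split_dataset(root, seed=None):
    """Move each image/annotation pair from root/all into root/<split>."""
    root = Path(root)
    source = root / "all"
    # Keep only images that have a matching annotation file.
    stems = [p.stem for p in source.glob("*.jpg") if p.with_suffix(".xml").exists()]
    splits = split_stems(stems, seed=seed)
    for name, members in splits.items():
        target = root / name
        target.mkdir(parents=True, exist_ok=True)
        for stem in members:
            for ext in (".jpg", ".xml"):
                shutil.move(str(source / (stem + ext)), str(target / (stem + ext)))
    return {name: len(members) for name, members in splits.items()}
```

Calling `split_dataset("/content/images", seed=42)` would return the number of pairs placed in each split. Keeping an image and its annotation together by stem is the important design point: splitting the `.jpg` and `.xml` files independently could separate a picture from its labels.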

CSV Generator & TFRecord Generator

Once your dataset is ready, run these scripts to generate a CSV file from the Pascal VOC dataset and create a TFRecord file from it.

(Optional) If you don't have a labelmap.txt file yet, you can create one with the script below.

labelmap.txt lists the classes used in your dataset, one per line.

%%bash
cat <<EOF > /content/labelmap.txt
Class1
Class2
Class3
Class4
EOF
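
If you would rather derive the class list from the annotations than type it by hand, something like the following sketch could work. It is not part of this repository; `collect_class_names` and `write_labelmap` are hypothetical helpers, and the sketch assumes standard Pascal VOC XML in which each `<object>` element holds its class in a `<name>` child. Classes are written in alphabetical order, which may not match the order your downstream scripts expect, so review the file before using it.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_class_names(annotation_dir):
    """Return the sorted, de-duplicated class names found in VOC XML files."""
    names = set()
    for xml_path in Path(annotation_dir).glob("*.xml"):
        root = ET.parse(xml_path).getroot()
        # In Pascal VOC, each <object> element carries its class in <name>.
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name and name.strip():
                names.add(name.strip())
    return sorted(names)


def write_labelmap(names, path):
    """Write one class name per line, overwriting any existing file."""
    Path(path).write_text("\n".join(names) + "\n")
```

For example, `write_labelmap(collect_class_names("/content/images/all"), "/content/labelmap.txt")` would scan the unsplit dataset and write the result.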

Get the scripts.

!wget https://raw.githubusercontent.com/MasoomBadi/DatasetHelper/main/CSVGenerator.py
!wget https://raw.githubusercontent.com/MasoomBadi/DatasetHelper/main/TFRecordGenerator.py

Run the scripts to generate the CSV files and create the TFRecord files.

!python3 CSVGenerator.py
!python3 TFRecordGenerator.py --csv_input=images/train_labels.csv --labelmap=labelmap.txt --image_dir=images/train --output_path=train.tfrecord
!python3 TFRecordGenerator.py --csv_input=images/validation_labels.csv --labelmap=labelmap.txt --image_dir=images/validation --output_path=val.tfrecord
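
To make the CSV step less of a black box, the sketch below shows one way Pascal VOC annotations can be flattened into CSV rows, one row per bounding box. It is an illustration, not the contents of CSVGenerator.py: the column names in `COLUMNS` and the helpers `voc_rows` and `write_csv` are assumptions, and the real script may use a different layout.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed column layout; the real CSVGenerator.py may differ.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_rows(xml_path):
    """Yield one row per annotated object in a Pascal VOC XML file."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Some annotation tools write coordinates as floats, so parse via float.
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        yield [filename, width, height, obj.findtext("name"), *coords]


def write_csv(annotation_dir, csv_path):
    """Write all objects found under annotation_dir to csv_path; return the row count."""
    xml_paths = sorted(Path(annotation_dir).glob("*.xml"))
    rows = [row for path in xml_paths for row in voc_rows(path)]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

An image with three labelled objects produces three rows that share the same filename, width, and height.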
