Scribe

Scribe is a command-line utility for converting the binary data files of the MNIST data set (see: http://yann.lecun.com/exdb/mnist/) into a collection of BMP images.
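For readers unfamiliar with the binary files, here is a minimal sketch of reading the header of an MNIST image file. The layout (four big-endian 32-bit integers: magic number 2051, image count, rows, columns) comes from the MNIST page linked above, not from Scribe's source. The names ImageHeader, read_be_u32 and parse_image_header are hypothetical and only illustrate the idea.

```rust
/// Header of an MNIST image file: the values that follow the magic number.
/// Hypothetical type for illustration; not taken from Scribe's source.
#[derive(Debug, PartialEq)]
struct ImageHeader {
    count: u32,
    rows: u32,
    cols: u32,
}

/// Magic number that opens an MNIST image file (0x00000803).
const IMAGE_MAGIC: u32 = 2051;

/// Reads one big-endian u32 at `offset`, or None if the slice is too short.
fn read_be_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let chunk = bytes.get(offset..offset + 4)?;
    Some(u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}

/// Validates the magic number and extracts the three header fields.
fn parse_image_header(bytes: &[u8]) -> Result<ImageHeader, String> {
    let field = |offset: usize| {
        read_be_u32(bytes, offset)
            .ok_or_else(|| "file is shorter than the 16-byte header".to_string())
    };
    let magic = field(0)?;
    if magic != IMAGE_MAGIC {
        return Err(format!("unexpected magic number {}", magic));
    }
    Ok(ImageHeader {
        count: field(4)?,
        rows: field(8)?,
        cols: field(12)?,
    })
}
```

The pixel data starts right after these 16 bytes, one unsigned byte per pixel, so each image occupies rows * cols bytes.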

Running the script

To compile, run cargo build --release. You will find the executable in the directory ./target/release.

Then, at the same level as the src directory, create the following directories:

./out/0
./out/1
./out/2
./out/3
./out/4
./out/5
./out/6
./out/7
./out/8
./out/9
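If you would rather not create the ten directories by hand, a small helper along these lines could do it. This is only a sketch: create_digit_dirs is a hypothetical function, not part of Scribe, and it assumes you pass the out directory that sits next to src.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Creates `<base>/0` through `<base>/9`, creating `base` itself if needed.
/// `create_dir_all` also succeeds when a directory already exists,
/// so calling this more than once is harmless.
fn create_digit_dirs(base: &Path) -> io::Result<()> {
    for digit in 0..10 {
        fs::create_dir_all(base.join(digit.to_string()))?;
    }
    Ok(())
}
```

Calling create_digit_dirs(Path::new("./out")) from the project root would produce the layout listed above.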

The script requires 4 parameters, in this order:

- The location of the image data file, relative to the current working directory
- The location of the corresponding labels file, relative to the current working directory
- A 1-based index describing which image to start with
- How many images to read
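To make the parameter list concrete, here is a rough sketch of how four such arguments could be validated. It is not Scribe's actual parsing code: the Args struct, the parse_args function and the error messages are all assumptions made for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical container for the four parameters; not Scribe's actual type.
#[derive(Debug, PartialEq)]
struct Args {
    images: PathBuf,
    labels: PathBuf,
    start: usize, // 1-based, exactly as typed on the command line
    count: usize,
}

/// Parses the arguments that follow the program name.
fn parse_args(args: &[String]) -> Result<Args, String> {
    if args.len() != 4 {
        return Err(format!("expected 4 arguments, got {}", args.len()));
    }
    let start: usize = args[2]
        .parse()
        .map_err(|e| format!("invalid start index {:?}: {}", args[2], e))?;
    if start == 0 {
        // The index is 1-based, so 0 can never refer to an image.
        return Err("start index is 1-based and must be at least 1".to_string());
    }
    let count: usize = args[3]
        .parse()
        .map_err(|e| format!("invalid image count {:?}: {}", args[3], e))?;
    Ok(Args {
        images: PathBuf::from(&args[0]),
        labels: PathBuf::from(&args[1]),
        start,
        count,
    })
}
```

In a real program the slice would come from std::env::args().skip(1), and start - 1 would give the zero-based offset of the first image to read.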

For example, if your directory structure looks like this:

./mnist
  t10k-images-idx3-ubyte
  t10k-labels-idx1-ubyte
  train-images-idx3-ubyte
  train-labels-idx1-ubyte

You could invoke the script like this to dump all 10,000 test images:

./target/release/scribe ./mnist/t10k-images-idx3-ubyte ./mnist/t10k-labels-idx1-ubyte 1 10000

As another example, you could dump just 3,000 images, starting from image number 6,000:

./target/release/scribe ./mnist/t10k-images-idx3-ubyte ./mnist/t10k-labels-idx1-ubyte 6000 3000

The script writes its images to a directory called out, with one folder per digit: all the zeros go into ./out/0, all the ones into ./out/1, and so on.

Each image is named using the pattern d{type}-{id}.bmp. For example, d5-0040.bmp is a 5 (denoted by d5) and is the 41st 5 out of all the 5s read from the dataset. The id is 0-indexed, which is why 0040 is the 41st image.
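The naming scheme can be sketched as a per-digit counter. The function below is hypothetical, not Scribe's code, and the four-digit zero padding is an assumption inferred from the d5-0040.bmp example.

```rust
/// Assigns a file name to each label, numbering every digit independently
/// from zero in the order the labels are read.
/// Assumes every label is in 0..=9; any other value would panic on indexing.
fn file_names(labels: &[u8]) -> Vec<String> {
    let mut seen = [0usize; 10]; // how many images of each digit so far
    labels
        .iter()
        .map(|&label| {
            let digit = label as usize;
            // Four-digit padding is inferred from the README example.
            let name = format!("d{}-{:04}.bmp", label, seen[digit]);
            seen[digit] += 1;
            name
        })
        .collect()
}
```

With the labels 5, 0, 5 this yields d5-0000.bmp, d0-0000.bmp and d5-0001.bmp, because each digit keeps its own count.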

You can also print a help message by invoking the script with a single option: --help, like so: scribe --help.

The tool does not have very robust error reporting and is not configurable aside from the options described here, but it is functional!
