First commit, with reference implementations for all benchmarks
0 parents, commit 77a6647. Showing 312 changed files with 27,064 additions and 0 deletions.
@@ -0,0 +1,6 @@
[submodule "object_detection/caffe2"]
	path = object_detection/caffe2/caffe2
	url = https://github.com/caffe2/caffe2.git
[submodule "object_detection/detectron"]
	path = object_detection/caffe2/detectron
	url = https://github.com/ddkang/Detectron.git
@@ -0,0 +1,52 @@
# MLPerf Reference Implementations

This is a repository of reference implementations for the MLPerf benchmark. These implementations are valid starting points for benchmark implementations, but they are not fully optimized and are not intended to be used for "real" performance measurements of software frameworks or hardware.

# Preliminary release (v0.5)

This release is very much an "alpha" release -- it could be improved in many ways. The benchmark suite is still being developed and refined; see the Suggestions section below to learn how to contribute.

We anticipate a significant round of updates at the end of May based on input from users.

# Contents

We provide reference implementations for each of the 7 benchmarks in the MLPerf suite.

* image_classification - Resnet-50 v1 applied to Imagenet.
* object_detection - Mask R-CNN applied to COCO.
* speech_recognition - DeepSpeech2 applied to Librispeech.
* translation - Transformer applied to WMT English-German.
* recommendation - Neural Collaborative Filtering applied to MovieLens 20 Million (ml-20m).
* sentiment_analysis - Seq-CNN applied to the IMDB dataset.
* reinforcement - Mini-go applied to predicting pro game moves.

Each reference implementation provides the following:

* Code that implements the model in at least one framework.
* A Dockerfile which can be used to run the benchmark in a container.
* A script which downloads the appropriate dataset.
* A script which runs and times training the model.
* Documentation on the dataset, model, and machine setup.

# Running Benchmarks

These benchmarks have been tested on the following machine configuration:

* 16 CPUs, one Nvidia P100.
* Ubuntu 16.04, including docker with nvidia support.
* 600GB of disk (though many benchmarks require less).

Generally, a benchmark can be run with the following steps:

1. Set up docker & dependencies. There is a shared script (install_cuda_docker.sh) to do this. Some benchmarks have additional setup, mentioned in their READMEs.
2. Download the dataset using `./download_dataset.sh`. This should be run outside of docker, on your host machine, from the directory the script is in (it may make assumptions about the CWD).
3. Optionally, run `verify_dataset.sh` to ensure the dataset was successfully downloaded.
4. Build and run the docker image; the command to do this is included with each benchmark.

Each benchmark will run until the target quality is reached and then stop, printing timing results.
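The run-until-target-quality loop described above can be sketched as follows. This is a hypothetical illustration, not the actual `run_and_time.sh` logic; `train_one_epoch`, `evaluate`, and `target` stand in for benchmark-specific pieces.

```python
import time

def time_to_quality(train_one_epoch, evaluate, target):
    """Train until the quality target is reached, then report wall-clock time.

    train_one_epoch and evaluate are placeholder callables; evaluate()
    returns the current quality metric (e.g. accuracy in [0, 1]).
    """
    start = time.time()
    epochs = 0
    while True:
        epochs += 1
        train_one_epoch()
        if evaluate() >= target:
            break
    return epochs, time.time() - start
```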

Some of these benchmarks are rather slow or take a long time to run on the reference hardware (i.e. 16 CPUs and one P100). We expect to see significant performance improvements with more hardware and optimized implementations.

# Suggestions

We are still in the early stages of developing MLPerf, and we are looking for areas to improve, partners, and contributors. If you have recommendations for new benchmarks, or otherwise would like to be involved in the process, please reach out to `[email protected]`. For technical bugs or support, email `[email protected]`.
@@ -0,0 +1,42 @@
# 1. Problem
Which benchmark is implemented, e.g. image classification
# 2. Directions
### Steps to configure machine
Ideally, a list of command lines
### Steps to download and verify data
Ideally, a list of command lines
### Steps to run and time
Ideally, a list of command lines
# 3. Dataset/Environment
### Publication/Attribution
Cite the paper describing the dataset, plus any additional attribution requested by the dataset authors
### Data preprocessing
What preprocessing is done to the dataset?
### Training and test data separation
How is the test set extracted?
### Training data order
In what order is the training data traversed?
### Test data order
In what order is the test data traversed?
### Simulation environment (RL models only)
Describe the simulation environment briefly, if applicable.
# 4. Model
### Publication/Attribution
Cite the paper describing the model, plus any additional attribution requested by the code authors
### List of layers
Brief summary of the structure of the model
### Weight and bias initialization
How are the weights and biases initialized?
### Loss function
Name/description of the loss function used
### Optimizer
Name of the optimizer used
# 5. Quality
### Quality metric
What is the target quality metric?
### Quality target
What is the numeric quality target?
### Evaluation frequency
How many training items between quality evaluations (typically all, evaluated every epoch)
### Evaluation thoroughness
How many test items per quality evaluation (typically all)
@@ -0,0 +1,143 @@
# 1. Problem
This benchmark uses a 50-layer residual network (ResNet-50 v1) to classify images. This is a fork of https://github.com/tensorflow/models/tree/master/official/resnet.


## Disclaimer

The current timing scripts do not time all of the data pre-processing. The preprocessing described in the dataset section is not included in the timing. [Some of the preprocessing](https://github.com/tensorflow/models/blob/master/official/resnet/imagenet_preprocessing.py), though, is included in the timing. This is an artifact of the difficulty and lack of automation in the data processing and downloading, and it will be remedied in the future.


# 2. Directions
### Steps to configure machine

To set up the environment on Ubuntu 16.04 (16 CPUs, one P100, 100 GB disk), you can use these commands. This may vary on a different operating system or graphics card.

    # Install docker
    sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo apt-key fingerprint 0EBFCD88
    sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"
    sudo apt update
    # sudo apt install docker-ce -y
    sudo apt install docker-ce=18.03.0~ce-0~ubuntu -y --allow-downgrades

    # Install nvidia-docker2
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu16.04/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update
    sudo apt install nvidia-docker2 -y

    sudo tee /etc/docker/daemon.json <<EOF
    {
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }
    EOF
    sudo pkill -SIGHUP dockerd

    sudo apt install -y bridge-utils
    sudo service docker stop
    sleep 1;
    sudo iptables -t nat -F
    sleep 1;
    sudo ifconfig docker0 down
    sleep 1;
    sudo brctl delbr docker0
    sleep 1;
    sudo service docker start

    ssh-keyscan github.com >> ~/.ssh/known_hosts
    git clone [email protected]:mlperf/reference.git

### Steps to download and verify data
Unfortunately, data downloading and preprocessing is a somewhat cumbersome process. Please refer to the instructions here:

https://github.com/tensorflow/models/tree/master/research/inception#getting-started


### Steps to run and time

We assume that the pre-processed Imagenet data has already been mounted at `/imn`.

    cd ~/reference/image_classification/tensorflow/
    IMAGE=`sudo docker build . | tail -n 1 | awk '{print $3}'`
    SEED=2
    NOW=`date "+%F-%T"`
    sudo docker run -v /imn:/imn --runtime=nvidia -t -i $IMAGE "./run_and_time.sh" $SEED | tee benchmark-$NOW.log
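The `tail -n 1 | awk '{print $3}'` pipeline above just pulls the image id out of the last line of the `docker build` output. As a sketch, the same parsing step in Python (assuming the output ends in a line like `Successfully built <id>`) is:

```python
def image_id_from_build_output(output):
    """Return the image id from the last line of `docker build` output,
    e.g. '... Successfully built 5ca81979cbc2' -> '5ca81979cbc2'."""
    return output.strip().splitlines()[-1].split()[2]
```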
For reference,

    $ ls /imn
    imagenet  lost+found

# 3. Dataset/Environment
### Publication/Attribution
We use Imagenet (http://image-net.org/):

    O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma,
    Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet
    large scale visual recognition challenge. arXiv:1409.0575, 2014.


### Data preprocessing
The dataset is extensively preprocessed in several ways, including image processing, batching, and TF formatting. The first pass does conversion and scaling (e.g. png to jpg). The second step groups images into larger batches and converts them into a Tensorflow format. There is also cropping and augmentation, mean color subtraction, bounding boxes, etc.

For more information on preprocessing, see this file and documentation:
https://github.com/tensorflow/models/tree/master/research/inception#getting-started
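One of the steps mentioned above, central cropping, can be sketched as follows. This is a hypothetical standalone illustration; the real preprocessing lives in the linked TensorFlow code.

```python
def center_crop(image, size):
    """Crop a size x size window from the center of an image, where the
    image is a list of rows and each row is a list of pixel values."""
    h, w = len(image), len(image[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]
```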
### Training and test data separation
This is provided by the Imagenet dataset and the original authors.

### Training data order
Each epoch goes over all the training data, shuffled every epoch.

### Test data order
We use all the data for evaluation. We don't specify an order of data traversal for evaluation.

# 4. Model
### Publication/Attribution

See the following papers for more background:

[1] [Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Dec 2015.

[2] [Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Jul 2016.


### Structure & Loss

In brief, this is a 50-layer v1 residual network. Refer to [Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf) for the layer structure and loss function.


### Weight and bias initialization

Weight initialization is done as described in [Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification](https://arxiv.org/abs/1502.01852).
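A minimal sketch of that scheme: He initialization draws weights from a zero-mean Gaussian with standard deviation sqrt(2/fan_in). This standalone version is illustrative only, not the TensorFlow code used here.

```python
import math
import random

def he_normal(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out weight matrix from N(0, 2/fan_in)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]
```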

### Optimizer
We use an SGD momentum-based optimizer. The momentum and learning rate are scaled based on the batch size.
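The README does not give the exact scaling rule, but a common convention (assumed here, not confirmed by the source) is linear scaling of the learning rate against a base batch size:

```python
def scaled_lr(base_lr, batch_size, base_batch_size=256):
    """Linearly scale the learning rate with batch size (an assumed rule;
    the reference code defines the actual schedule)."""
    return base_lr * batch_size / base_batch_size
```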

# 5. Quality
### Quality metric
Percent of correct classifications on the Imagenet test dataset.
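The metric can be sketched as top-1 accuracy, i.e. the fraction of images whose highest-scoring class matches the label (a hypothetical standalone version of what the evaluation code computes):

```python
def top1_accuracy(scores, labels):
    """Fraction of examples whose highest-scoring class equals the label.

    scores: a list of per-class score lists; labels: correct class indices.
    """
    correct = sum(
        1 for s, y in zip(scores, labels)
        if max(range(len(s)), key=s.__getitem__) == y
    )
    return correct / len(labels)
```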

### Quality target
We run to 0.749 accuracy (74.9% correct classifications).

### Evaluation frequency
We evaluate after every epoch.

### Evaluation thoroughness
Every test example is used each time.
@@ -0,0 +1,3 @@
#!/bin/bash

# TODO
@@ -0,0 +1,36 @@
FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

WORKDIR /research

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    build-essential \
    git \
    python \
    python-pip

ENV HOME /research
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH

RUN apt-get install -y python-setuptools
RUN apt-get install -y python-pip python3-pip virtualenv htop
RUN pip3 install --upgrade numpy scipy sklearn tf-nightly-gpu

# Mount data into the docker
ADD . /research/resnet

WORKDIR /research/resnet
RUN pip3 install -r official/requirements.txt

ENTRYPOINT ["/bin/bash"]
@@ -0,0 +1,38 @@
Install
==========

In order to run this, you must first set things up; for now, see Transformer's README.


Downloading Data
==========

Downloading data is TBD.


Processing Data
=============

TBD.


Running the Benchmark
============

You first must build the Dockerfile:

    docker build .

Remember the image name/number.

1. Make sure /imn on the host contains the pre-processed data. (Scripts for this TODO.)
2. Choose your random seed (below we use 77).
3. Enter your docker image name (below we use 5ca81979cbc2, which you won't have).

Then, execute the following:

    sudo docker run -v /imn:/imn --runtime=nvidia -t -i 5ca81979cbc2 "./run_and_time.sh" 77 | tee benchmark.log
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@
MNIST-data
labels.txt
@@ -0,0 +1,17 @@
# Docker image for running examples in Tensorflow models.
# base_image depends on whether we are running on GPUs or non-GPUs
FROM ubuntu:latest

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    build-essential \
    git \
    python \
    python-pip \
    python-setuptools

RUN pip install tf-nightly

# Checkout tensorflow/models at HEAD
RUN git clone https://github.com/tensorflow/models.git /tensorflow_models
@@ -0,0 +1,18 @@
# Docker image for running examples in Tensorflow models.
# base_image depends on whether we are running on GPUs or non-GPUs
FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    build-essential \
    git \
    python \
    python-pip \
    python-setuptools

RUN pip install tf-nightly-gpu

# Checkout tensorflow/models at HEAD
RUN git clone https://github.com/tensorflow/models.git /tensorflow_models