Please help me get high accuracy in Faster R-CNN and with parameter tuning #923

Open
wants to merge 39 commits into
base: master
b3b260a
training imagenet on faster rcnn
andrewliao11 Feb 2, 2016
520d2f7
upload README
andrewliao11 Feb 3, 2016
1be564f
upload new README
andrewliao11 Feb 3, 2016
92a55b6
add prototxt
andrewliao11 Feb 3, 2016
90cf42d
modify demo.py
andrewliao11 Feb 3, 2016
d9dac04
upload demo image and modify demo.py
andrewliao11 Feb 3, 2016
6d78a08
upload output image and modify demo.py
andrewliao11 Feb 3, 2016
0bd2faf
Delete output_demo_01.jpg.png
andrewliao11 Feb 3, 2016
ddaec17
Delete output_demo_02.jpg.png
andrewliao11 Feb 3, 2016
1fc3168
Delete output_demo_03.jpg.png
andrewliao11 Feb 3, 2016
623a153
Delete output_demo_04.jpg
andrewliao11 Feb 3, 2016
fc753bc
Delete output_demo_05.jpg
andrewliao11 Feb 3, 2016
a6992b3
Update README(demo image and videos)
andrewliao11 Feb 4, 2016
7877474
Update README
andrewliao11 Feb 16, 2016
915518c
Update README
andrewliao11 Feb 16, 2016
be50159
typo
andrewliao11 Feb 16, 2016
b7906b9
update shell script part
andrewliao11 Mar 5, 2016
c189714
Update README.md
andrewliao11 Mar 19, 2016
bcb7241
Update README.md
andrewliao11 May 4, 2016
3c0371c
add my log file
andrewliao11 Jun 20, 2016
3c07d1b
add image
andrewliao11 Jun 20, 2016
327978d
Add files via upload
andrewliao11 Jun 20, 2016
bd9d46b
add experiment
andrewliao11 Jun 20, 2016
3b7f730
Update README.md
andrewliao11 Jun 20, 2016
d19c5db
Update README.md
andrewliao11 Jun 20, 2016
12cca09
Update README.md
andrewliao11 Jun 20, 2016
a332ef3
Update README.md
andrewliao11 Jul 14, 2016
5fd4a95
Update README.md
andrewliao11 Jul 19, 2016
6701c86
Create .keep
andrewliao11 Aug 22, 2016
e5953e3
Add files via upload
andrewliao11 Aug 22, 2016
9a43c70
Update README.md
andrewliao11 Aug 22, 2016
04a5822
Update README.md
andrewliao11 Aug 22, 2016
8a9f8c8
Delete ILSVRC2012_val_00037038.xml
andrewliao11 Aug 22, 2016
f9a16cd
Add files via upload
andrewliao11 Aug 22, 2016
c73fee7
Update README.md
andrewliao11 Aug 22, 2016
4de5a22
Update README.md
andrewliao11 Aug 22, 2016
6acd4bb
Update README.md
andrewliao11 Apr 1, 2017
3de4e4c
Update README.md
andrewliao11 Apr 22, 2017
e9f19bb
Update README.md
andrewliao11 Jul 15, 2017
286 changes: 103 additions & 183 deletions README.md
@@ -1,201 +1,121 @@
### Disclaimer

The official Faster R-CNN code (written in MATLAB) is available [here](https://github.com/ShaoqingRen/faster_rcnn).
If your goal is to reproduce the results in our NIPS 2015 paper, please use the [official code](https://github.com/ShaoqingRen/faster_rcnn).
# Training Faster RCNN on Imagenet
[![Readme Score](http://readme-score-api.herokuapp.com/score.svg?url=andrewliao11/py-faster-rcnn-imagenet)](http://clayallsopp.github.io/readme-score?url=andrewliao11/py-faster-rcnn-imagenet)

This repository contains a Python *reimplementation* of the MATLAB code.
This Python implementation is built on a fork of [Fast R-CNN](https://github.com/rbgirshick/fast-rcnn).
There are slight differences between the two implementations.
In particular, this Python port
- is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
- gives similar, but not exactly the same, mAP as the MATLAB version
- is *not compatible* with models trained using the MATLAB code due to the minor implementation differences
- **includes approximate joint training** that is 1.5x faster than alternating optimization (for VGG16) -- see these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more information
For the basic ideas behind Faster R-CNN, check out [Video Object Detection using Faster R-CNN](https://andrewliao11.github.io/object/detection/2016/07/23/detection/)!

# *Faster* R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Feel free to contact me via email. I'll try to give you a hand if I can, lol.

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)
## Preparing data

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.

Please see the official [README.md](https://github.com/ShaoqingRen/faster_rcnn/blob/master/README.md) for more details.

Faster R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1506.01497) and was subsequently published in NIPS 2015.

### License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

### Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

@inproceedings{renNIPS15fasterrcnn,
Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
Title = {Faster {R-CNN}: Towards Real-Time Object Detection
with Region Proposal Networks},
Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
Year = {2015}
}

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: hardware](#requirements-hardware)
3. [Basic installation](#installation-sufficient-for-the-demo)
4. [Demo](#demo)
5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
6. [Usage](#usage)

### Requirements: software

1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))

**Note:** Caffe *must* be built with support for Python layers!

```make
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
```

You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
3. [optional] MATLAB (required for PASCAL VOC evaluation only)

### Requirements: hardware

1. For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
2. For training with VGG16, you'll need a K40 (~11G of memory)

### Installation (sufficient for the demo)

1. Clone the Faster R-CNN repository
```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```

2. We'll call the directory that you cloned Faster R-CNN into `FRCN_ROOT`

*Ignore notes 1 and 2 if you followed step 1 above.*

**Note 1:** If you didn't clone Faster R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
```Shell
git submodule update --init --recursive
```
**Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `faster-rcnn` branch (or equivalent detached state). This will happen automatically *if you followed step 1 instructions*.

3. Build the Cython modules
```Shell
cd $FRCN_ROOT/lib
make
```

4. Build Caffe and pycaffe
```Shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

5. Download pre-computed Faster R-CNN detectors
```Shell
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```

This will populate the `$FRCN_ROOT/data` folder with `faster_rcnn_models`. See `data/README.md` for details.
These models were trained on VOC 2007 trainval.

### Demo

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.

**Python**

To run the demo
```Shell
cd $FRCN_ROOT
./tools/demo.py
```
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

### Beyond the demo: installation for training and testing models
1. Download the training, validation, test data and VOCdevkit

```Shell
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```

2. Extract all of these tars into one directory named `VOCdevkit`

```Shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

3. It should have this basic structure

```Shell
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
```

4. Create symlinks for the PASCAL VOC dataset
```
ILSVRC13
└─── ILSVRC2013_DET_val
│        *.JPEG (image files, e.g. ILSVRC2013_val_00000565.JPEG)
└─── ILSVRC2013_DET_bbox_val
│        *.xml (see the example ./misc/ILSVRC2012_val_00018464.xml in this repo)
└─── data
     │   meta_det.mat
     └─── det_lists
              val1.txt, val2.txt
```
`meta_det.mat` holds the category list, which is read as shown [here](https://github.com/andrewliao11/py-faster-rcnn-imagenet/blob/master/lib/datasets/imagenet.py#L26/).
Load the `meta_det.mat` file with:
```
classes = sio.loadmat(os.path.join(self._devkit_path, 'data', 'meta_det.mat'))
```

```Shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
6. Follow the next sections to download pre-trained ImageNet models
## Construct IMDB file
There are several files you need to modify.

### Download pre-trained ImageNet models
#### factory_imagenet.py
This file is in the directory **$FRCNN_ROOT/lib/datasets** ($FRCNN_ROOT is where your Faster R-CNN checkout is located) and is called by train_net_imagenet.py.
It is the interface for loading the imdb.
```
for split in ['train', 'val', 'val1', 'val2', 'test']:
    name = 'imagenet_{}'.format(split)
    devkit_path = '/media/VSlab2/imagenet/ILSVRC13'  # change to your own ILSVRC13 root
    # Bind split/devkit_path as default arguments so each lambda keeps its own values
    __sets[name] = (lambda split=split, devkit_path=devkit_path:
                    datasets.imagenet.imagenet(split, devkit_path))
```
#### imagenet.py
##### In function `__init__(self, image_set, devkit_path)`
We have to enlarge the number of categories from 20+1 to 200+1. Note that in the ImageNet dataset, an object category is identified by a WordNet ID such as "n02691156" rather than by a name such as "airplane".
```
# 'val1'/'val2' both read images from ILSVRC2013_DET_val (trailing digit stripped)
self._data_path = os.path.join(self._devkit_path, 'ILSVRC2013_DET_' + self._image_set[:-1])
synsets = sio.loadmat(os.path.join(self._devkit_path, 'data', 'meta_det.mat'))
# Index 0 is reserved for the background class
self._classes = ('__background__',)
self._wnid = (0,)
for i in xrange(200):
    self._classes = self._classes + (synsets['synsets'][0][i][2][0],)
    self._wnid = self._wnid + (synsets['synsets'][0][i][1][0],)
# Lookup tables from WordNet ID / class name to class index
self._wnid_to_ind = dict(zip(self._wnid, xrange(self.num_classes)))
self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
```
- `self._classes` holds the class names
- `self._wnid` holds the WordNet ID of each category
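To make the two lookup tables concrete, here is a small sketch that builds them from a hand-written list instead of `meta_det.mat`. Only the pair `n02691156` / `airplane` comes from the text above. The other two entries are made-up placeholders, and the helper name `label_from_annotation` is an assumption for illustration.

```python
# Sketch of the class-name and WordNet-ID lookup tables.
# Only the first pair is taken from the README; the rest are placeholders.
synsets = [
    ('n02691156', 'airplane'),
    ('n00000001', 'placeholder_a'),  # made-up ID, illustration only
    ('n00000002', 'placeholder_b'),  # made-up ID, illustration only
]

# Index 0 is reserved for the background class in both tuples.
classes = ('__background__',) + tuple(name for _, name in synsets)
wnids = (0,) + tuple(wnid for wnid, _ in synsets)

wnid_to_ind = dict(zip(wnids, range(len(wnids))))
class_to_ind = dict(zip(classes, range(len(classes))))


def label_from_annotation(raw_name):
    """Normalise the <name> value of an annotation and map it to a class index."""
    return wnid_to_ind[str(raw_name).lower().strip()]


print(label_from_annotation(' N02691156 '))  # 1
print(class_to_ind['airplane'])              # 1
```

Annotations carry WordNet IDs rather than readable names, so labels are resolved through `wnid_to_ind`, while `class_to_ind` stays useful for reporting.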

Pre-trained ImageNet models can be downloaded for the three networks described in the paper: ZF and VGG16.
##### In function `_load_imagenet_annotation(self, index)`
In the PASCAL VOC dataset, all coordinates start from 1, so the original loader subtracts 1 to make them start from 0. This does not hold for ImageNet, so we should not subtract 1.
The lines therefore need to be modified to:
```
for ix, obj in enumerate(objs):
    # ImageNet boxes are used as-is (no "- 1" as in the PASCAL VOC loader)
    x1 = float(get_data_from_tag(obj, 'xmin'))
    y1 = float(get_data_from_tag(obj, 'ymin'))
    x2 = float(get_data_from_tag(obj, 'xmax'))
    y2 = float(get_data_from_tag(obj, 'ymax'))
    # Map the WordNet ID stored in the "name" tag to a class index
    cls = self._wnid_to_ind[str(get_data_from_tag(obj, "name")).lower().strip()]
```
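The coordinate handling can be tried in isolation with the sketch below. It is not the repository's loader: it uses `xml.etree.ElementTree` from the standard library rather than the `get_data_from_tag` helper, and the XML snippet, the box values and the `one_based` flag are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with a single object; the box values are made up.
XML = """<annotation>
  <object>
    <name>n02691156</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def load_boxes(xml_text, one_based=False):
    """Return (wnid, [x1, y1, x2, y2]) per object.

    one_based=True mimics the PASCAL VOC loader, which subtracts 1 from every
    coordinate; one_based=False keeps the values as-is, as described for ImageNet.
    """
    offset = 1.0 if one_based else 0.0
    boxes = []
    for obj in ET.fromstring(xml_text).findall('object'):
        bndbox = obj.find('bndbox')
        box = [float(bndbox.find(tag).text) - offset
               for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((obj.find('name').text.lower().strip(), box))
    return boxes


print(load_boxes(XML))                  # [('n02691156', [10.0, 20.0, 110.0, 220.0])]
print(load_boxes(XML, one_based=True))  # [('n02691156', [9.0, 19.0, 109.0, 219.0])]
```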
Note that Faster R-CNN does not need to run selective search, which is the main difference from Fast R-CNN.
## Modify the prototxt
Under the directory **$FRCNN_ROOT/**
#### train.prototxt
Change the number of classes to 200+1:
```
param_str: "'num_classes': 201"
```
In layer "bbox_pred", change the number of outputs to (200+1)*4:
```
num_output: 804
```
You can modify the **test.prototxt** in the same way.
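The numbers above follow from simple arithmetic, sketched here. The function name and dictionary keys are assumptions for illustration. The `cls_score` entry is not mentioned above; it is included on the assumption that, as in the stock py-faster-rcnn prototxt files, the classification layer has one output per class including background.

```python
def detection_head_sizes(num_foreground_classes):
    """Compute prototxt sizes for a given number of object categories."""
    num_classes = num_foreground_classes + 1  # +1 for __background__
    return {
        'num_classes': num_classes,    # value used in param_str
        'cls_score': num_classes,      # one score per class (assumption, see above)
        'bbox_pred': num_classes * 4,  # 4 box regression targets per class
    }


print(detection_head_sizes(200)['bbox_pred'])  # 804
print(detection_head_sizes(20)['bbox_pred'])   # 84
```

Running it with 20 gives the PASCAL VOC sizes, which is a quick way to check which values in a prototxt depend on the class count.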

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```
## [Last step] Modify the shell script
Under the directory **$FRCNN_ROOT/experiments/scripts**
#### faster_rcnn_end2end_imagenet.sh
You can specify which datasets to train and test on, and which pre-trained model to initialize from:
```
ITERS=100000
DATASET_TRAIN=imagenet_val1
DATASET_TEST=imagenet_val2
NET_INIT=data/imagenet_models/${NET}.v2.caffemodel
```
VGG16 comes from the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but is provided here for your convenience.
ZF was trained at MSRA.
## Start to Train Faster RCNN On Imagenet!
Run **$FRCNN_ROOT/experiments/scripts/faster_rcnn_end2end_imagenet.sh**.
The .sh file is used in the same way as in the original [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn).

### Usage
## Experiment
This is the mean/median AP at different iterations. The highest mean AP is reached at 90000 iterations.
![](https://github.com/andrewliao11/py-faster-rcnn/blob/master/asset/mAP_imagenet.png?raw=true)

To train and test a Faster R-CNN detector using the **alternating optimization** algorithm from our NIPS 2015 paper, use `experiments/scripts/faster_rcnn_alt_opt.sh`.
Output is written underneath `$FRCN_ROOT/output`.
The original Faster R-CNN work reports **59.9% mAP** on PASCAL VOC 2007, which contains only 20 categories. My result is low by comparison, but this is the trade-off for the increased diversity of object categories. My network achieves **33.1% mAP**.

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
The low accuracy is due to:
- A smaller dataset (ImageNet validation1)
- More diverse object categories

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)
So here I present the result on the overlapping categories. On the object categories that also appear in PASCAL VOC 2007 (12 categories), my model achieves **48.7% mAP**, which is much higher than the mAP over all 200 categories.
![](https://github.com/andrewliao11/py-faster-rcnn/blob/master/asset/mAP_overlap.png?raw=true)
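As a sketch of how such a restricted mAP can be computed from per-category AP values: the category names and AP numbers below are placeholders chosen so the arithmetic is easy to follow. They are not the results of this experiment, and the function name `mean_ap` is an assumption.

```python
def mean_ap(ap_per_category, categories=None):
    """Average the AP over all categories, or over a chosen subset."""
    if categories is None:
        categories = list(ap_per_category)
    if not categories:
        raise ValueError('no categories selected')
    return sum(ap_per_category[c] for c in categories) / float(len(categories))


# Placeholder AP values (NOT measured results).
ap = {'cat_a': 0.5, 'cat_b': 0.25, 'cat_c': 0.75}
overlap = ['cat_a', 'cat_c']  # hypothetical subset shared with another dataset

print(mean_ap(ap))           # 0.5
print(mean_ap(ap, overlap))  # 0.625
```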

To train and test a Faster R-CNN detector using the **approximate joint training** method, use `experiments/scripts/faster_rcnn_end2end.sh`.
Output is written underneath `$FRCN_ROOT/output`.
I also present the per-category AP for each category in ImageNet:
![](https://github.com/andrewliao11/py-faster-rcnn/blob/master/asset/mAP_200.png?raw=true)

```Shell
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
## Demo
Just run **demo.py** to visualize detections on pictures!
![demo_02](https://github.com/andrewliao11/py-faster-rcnn/blob/master/tools/output_demo_02.jpg?raw=true)

### Faster R-CNN with tracker on videos
[![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/wY7LADoEuFs/0.jpg)](http://www.youtube.com/watch?v=wY7LADoEuFs)

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these [slides](https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0) for more details.
Original video: https://www.jukinmedia.com/videos/view/5655
## Reference
[How to train fast rcnn on imagenet](http://sunshineatnoon.github.io/Train-fast-rcnn-model-on-imagenet-without-matlab/)
Binary file added asset/loss_bbox.png
Binary file added asset/loss_cls.png
Binary file added asset/loss_rpn_bbox.png
Binary file added asset/loss_rpn_cls.png
Binary file added asset/mAP_200.png
Binary file added asset/mAP_imagenet.png
Binary file added asset/mAP_overlap.png
Binary file added data/demo/demo_01.jpg
Binary file added data/demo/demo_02.jpg
Binary file added data/demo/demo_03.jpg
48 changes: 48 additions & 0 deletions experiments/scripts/faster_rcnn_end2end_imagenet.sh
@@ -0,0 +1,48 @@
#!/bin/bash
# Usage:
# ./experiments/scripts/faster_rcnn_end2end_imagenet.sh GPU NET [--set ...]
# Example:
# ./experiments/scripts/faster_rcnn_end2end_imagenet.sh 0 ZF \
# --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES "[400,500,600,700]"

set -x
set -e

export PYTHONUNBUFFERED="True"

GPU_ID=$1
NET=$2
NET_lc=${NET,,}
ITERS=100000
DATASET_TRAIN=imagenet_val1
DATASET_TEST=imagenet_val2

array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:2:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}

LOG="experiments/logs/faster_rcnn_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"

NET_INIT=data/imagenet_models/${NET}.v2.caffemodel

time ./tools/train_net_imagenet.py --gpu ${GPU_ID} \
--solver models/${NET}/faster_rcnn_end2end/solver.prototxt \
--weights ${NET_INIT} \
--imdb ${DATASET_TRAIN} \
--iters ${ITERS} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}

set +x
NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'`
set -x

time ./tools/test_net_imagenet.py --gpu ${GPU_ID} \
--def models/${NET}/faster_rcnn_end2end/test.prototxt \
--net ${NET_FINAL} \
--imdb ${DATASET_TEST} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}
37 changes: 37 additions & 0 deletions experiments/scripts/test_faster_rcnn_end2end_imagenet.sh
@@ -0,0 +1,37 @@
#!/bin/bash
# Usage:
# ./experiments/scripts/test_faster_rcnn_end2end_imagenet.sh GPU NET [--set ...]
# Example:
# ./experiments/scripts/test_faster_rcnn_end2end_imagenet.sh 0 ZF \
# --set EXP_DIR foobar RNG_SEED 42 TRAIN.SCALES "[400,500,600,700]"

set -x
set -e

export PYTHONUNBUFFERED="True"

GPU_ID=$1
NET=$2
NET_lc=${NET,,}
ITERS=100000
DATASET_TRAIN=imagenet_val1
DATASET_TEST=imagenet_val2

array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:2:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}

LOG="experiments/logs/faster_rcnn_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"

NET_INIT=data/imagenet_models/${NET}.v2.caffemodel
NET_FINAL=/home/andrewliao11/py-faster-rcnn/output/faster_rcnn_end2end/val1/vgg16_faster_rcnn_iter_70000.caffemodel

time ./tools/test_net_imagenet.py --gpu ${GPU_ID} \
--def models/${NET}/faster_rcnn_end2end/test.prototxt \
--net ${NET_FINAL} \
--imdb ${DATASET_TEST} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}