forked from sfzhang15/ATSS
Commit 5f1c9fe (0 parents), showing 291 changed files with 24,435 additions and 0 deletions.
`.flake8`
@@ -0,0 +1,8 @@
# This is an example .flake8 config, used when developing *Black* itself.
# Keep in sync with setup.cfg which is used for source packages.

[flake8]
ignore = E203, E266, E501, W503
max-line-length = 80
max-complexity = 18
select = B,C,E,F,W,T4,B9
`.gitattributes`
@@ -0,0 +1,2 @@
# Auto detect text files and perform LF normalization
* text=auto
`.gitignore`
@@ -0,0 +1,31 @@
# compilation and distribution
__pycache__
_ext
*.pyc
*.so
atss.egg-info/
build/
dist/

# pytorch/python/numpy formats
*.pth
*.pkl
*.npy

# ipython/jupyter notebooks
*.ipynb
**/.ipynb_checkpoints/

# Editor temporaries
*.swn
*.swo
*.swp
*~

# Pycharm editor settings
.idea

# project dirs
/datasets
/models
/experiments
`ABSTRACTIONS.md`
@@ -0,0 +1,65 @@
## Abstractions
The main abstractions introduced by `maskrcnn_benchmark` that are useful to
have in mind are the following:

### ImageList
In PyTorch, the first dimension of the input to the network generally represents
the batch dimension, and thus all elements of the same batch have the same
height / width.
In order to support images with different sizes and aspect ratios in the same
batch, we created the `ImageList` class, which internally holds a batch of
images (of possibly different sizes). The images are padded with zeros such that
they have the same final size and are batched over the first dimension. The original
sizes of the images before padding are stored in the `image_sizes` attribute,
and the batched tensor in `tensors`.
We provide a convenience function `to_image_list` that accepts a few different
input types, including a list of tensors, and returns an `ImageList` object.

```python
import torch

from maskrcnn_benchmark.structures.image_list import to_image_list

images = [torch.rand(3, 100, 200), torch.rand(3, 150, 170)]
batched_images = to_image_list(images)

# it is also possible to make the final batched image size a multiple of a given number
batched_images_32 = to_image_list(images, size_divisible=32)
```
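Continuing the example above, the two attributes mentioned earlier can be inspected directly. This is a small illustrative sketch; the shapes shown assume the two example images:

```python
# the padded, batched tensor: both images are zero-padded to the
# per-dimension maximum, so the shape is (2, 3, 150, 200)
print(batched_images.tensors.shape)

# the original sizes of each image before padding, as (height, width) pairs
print(batched_images.image_sizes)  # [(100, 200), (150, 170)]
```
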
### BoxList
The `BoxList` class holds a set of bounding boxes (represented as an `Nx4` tensor) for
a specific image, as well as the size of the image as a `(width, height)` tuple.
It also contains a set of methods that allow performing geometric
transformations on the bounding boxes (such as cropping, scaling and flipping).
The class accepts bounding boxes in two different input formats (a conversion
example follows the list):
- `xyxy`, where each box is encoded by its `x1`, `y1`, `x2` and `y2` coordinates, and
- `xywh`, where each box is encoded by its `x1`, `y1`, `w` and `h`.
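Boxes can be converted between the two modes with the `convert` method. Here is a minimal sketch (the box values are illustrative):

```python
import torch

from maskrcnn_benchmark.structures.bounding_box import BoxList

# one box given in xyxy mode on a 100x200 image
boxes = BoxList(torch.tensor([[10., 10., 50., 50.]]),
                image_size=(100, 200), mode='xyxy')

# returns a new BoxList holding the same box in x, y, width, height mode
boxes_xywh = boxes.convert('xywh')
```
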
Additionally, each `BoxList` instance can also hold arbitrary additional information
for each bounding box, such as labels, visibility, probability scores etc.

Here is an example of how to create a `BoxList` from a list of coordinates:
```python
import torch

from maskrcnn_benchmark.structures.bounding_box import BoxList, FLIP_LEFT_RIGHT

width = 100
height = 200
boxes = [
  [0, 10, 50, 50],
  [50, 20, 90, 60],
  [10, 10, 50, 50]
]
# create a BoxList with 3 boxes
bbox = BoxList(boxes, image_size=(width, height), mode='xyxy')

# perform some box transformations; the API is similar to PIL.Image
bbox_scaled = bbox.resize((width * 2, height * 3))
bbox_flipped = bbox.transpose(FLIP_LEFT_RIGHT)

# add labels for each bbox
labels = torch.tensor([0, 10, 1])
bbox.add_field('labels', labels)

# BoxList also supports a few operations, like indexing
# here, we select boxes 0 and 2
bbox_subset = bbox[[0, 2]]
```
`CODE_OF_CONDUCT.md`
@@ -0,0 +1,5 @@
# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.
`CONTRIBUTING.md`
@@ -0,0 +1,39 @@
# Contributing to Mask-RCNN Benchmark
We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
Minor changes and improvements will be released on an ongoing basis. Larger changes (e.g., changesets implementing a new paper) will be released on a more periodic basis.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and includes sufficient instructions to reproduce the issue.

Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## Coding Style
* 4 spaces for indentation rather than tabs
* 80 character line length
* PEP8 formatting following [Black](https://black.readthedocs.io/en/stable/)

## License
By contributing to Mask-RCNN Benchmark, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
`INSTALL.md`
@@ -0,0 +1,76 @@
## Installation

### Requirements:
- PyTorch >= 1.0. Installation instructions can be found at https://pytorch.org/get-started/locally/.
- torchvision==0.2.1
- cocoapi
- yacs
- matplotlib
- GCC >= 4.9, < 6.0
- (optional) OpenCV for the webcam demo

### Option 1: Step-by-step installation

```bash
# first, make sure that your conda is set up properly with the right environment
# for that, check that `which conda`, `which pip` and `which python` point to the
# right path. From a clean conda env, this is what you need to do

conda create --name ATSS
conda activate ATSS

# this installs the right pip and dependencies for the fresh python
conda install ipython

# ATSS and coco api dependencies
pip install ninja yacs cython matplotlib tqdm

# follow the PyTorch installation in https://pytorch.org/get-started/locally/
# we give the instructions for CUDA 9.0
conda install -c pytorch pytorch torchvision=0.2.1 cudatoolkit=9.0

export INSTALL_DIR=$PWD

# install pycocotools. Please make sure you have installed cython.
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install

# install PyTorch Detection
cd $INSTALL_DIR
git clone https://github.com/sfzhang15/ATSS.git
cd ATSS

# the following will install the lib with
# symbolic links, so that you can modify
# the files if you want and won't need to
# re-build it
python setup.py build develop --no-deps

unset INSTALL_DIR

# or if you are on macOS
# MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build develop
```
### Option 2: Docker Image (Requires CUDA, Linux only)
*The following steps are for the original maskrcnn-benchmark. Please change the repository name if needed.*

Build image with defaults (`CUDA=9.0`, `CUDNN=7`, `FORCE_CUDA=1`):

    nvidia-docker build -t maskrcnn-benchmark docker/

Build image with other CUDA and CUDNN versions:

    nvidia-docker build -t maskrcnn-benchmark --build-arg CUDA=9.2 --build-arg CUDNN=7 docker/

Build image with FORCE_CUDA disabled:

    nvidia-docker build -t maskrcnn-benchmark --build-arg FORCE_CUDA=0 docker/

Build and run image with built-in jupyter notebook (note that the password is used to log in to the notebook):

    nvidia-docker build -t maskrcnn-benchmark-jupyter docker/docker-jupyter/
    nvidia-docker run -td -p 8888:8888 -e PASSWORD=<password> -v <host-dir>:<container-dir> maskrcnn-benchmark-jupyter
`LICENSE`
@@ -0,0 +1,25 @@
ATSS for non-commercial purposes

Copyright (c) 2019 the authors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
`README.md`
@@ -0,0 +1,84 @@
# Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection

[![License](https://img.shields.io/badge/license-BSD-blue.svg)](LICENSE)

By [Shifeng Zhang](http://www.cbsr.ia.ac.cn/users/sfzhang/), [Cheng Chi](https://chicheng123.github.io/), [Yongqiang Yao](https://github.com/yqyao), [Zhen Lei](http://www.cbsr.ia.ac.cn/users/zlei/), [Stan Z. Li](http://www.cbsr.ia.ac.cn/users/szli/).

## Introduction

In this work, we first point out that the essential difference between anchor-based and anchor-free detection is actually **how to define positive and negative training samples**. We then propose Adaptive Training Sample Selection (ATSS), which automatically selects positive and negative samples according to the statistical characteristics of each object; it significantly improves the performance of both anchor-based and anchor-free detectors and bridges the gap between them. Finally, we demonstrate that tiling multiple anchors per location on the image to detect objects is not necessary under current practice. Extensive experiments conducted on MS COCO support our analysis and conclusions. With the newly introduced ATSS, we improve state-of-the-art detectors by a large margin to 50.7% AP without introducing any overhead. For more details, please refer to our [paper](https://arxiv.org/pdf/1912.0xxxx).
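To make the selection rule concrete, here is a simplified sketch of the ATSS assignment for a single ground-truth box, written against the description in the paper; the function and argument names are ours for illustration, not this repository's API:

```python
import torch

def atss_assign(anchor_centers, gt_box, ious, anchors_per_level, topk=9):
    """Simplified ATSS positive-sample selection for one ground-truth box.

    anchor_centers:    (N, 2) tensor of anchor center coordinates
    gt_box:            (4,) tensor, ground-truth box in xyxy format
    ious:              (N,) tensor of IoUs between anchors and the gt box
    anchors_per_level: number of anchors on each pyramid level
    """
    gt_center = torch.stack([(gt_box[0] + gt_box[2]) / 2,
                             (gt_box[1] + gt_box[3]) / 2])
    distances = (anchor_centers - gt_center).pow(2).sum(dim=1).sqrt()

    # 1) on each pyramid level, select the top-k anchors whose centers are
    #    closest to the center of the ground-truth box
    candidate_idxs, start = [], 0
    for num in anchors_per_level:
        level_distances = distances[start:start + num]
        _, level_topk = level_distances.topk(min(topk, num), largest=False)
        candidate_idxs.append(level_topk + start)
        start += num
    candidate_idxs = torch.cat(candidate_idxs)

    # 2) the IoU threshold adapts to the statistics of the candidates:
    #    mean plus standard deviation of the candidate IoUs
    candidate_ious = ious[candidate_idxs]
    iou_threshold = candidate_ious.mean() + candidate_ious.std()

    # 3) candidates with IoU >= threshold become positives (the paper also
    #    requires the anchor center to lie inside the ground-truth box)
    return candidate_idxs[candidate_ious >= iou_threshold]
```

In the full method, an anchor selected by several ground-truth boxes is assigned to the one with the highest IoU, and all remaining anchors are negatives.
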
## Installation
This ATSS implementation is based on [FCOS](https://github.com/tianzhi0549/FCOS) and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark), and its installation is the same as theirs. Please check [INSTALL.md](INSTALL.md) for installation instructions.

## A quick demo
Once the installation is done, you can download *ATSS_R_50_FPN_1x.pth* from [Google](https://drive.google.com/open?id=1t8RLdQ6fsFXa0kzPIQ7541uZeQeMXP73) or [Baidu](https://pan.baidu.com/s/1bYXjWJE35kHLpQAIeWtZ0g) to run a quick demo.

    # assume that you are under the root directory of this project,
    # and you have activated your virtual environment if needed.
    python demo/atss_demo.py

## Inference
The inference command line on the coco minival split:

    python tools/test_net.py \
        --config-file configs/atss/atss_R_50_FPN_1x.yaml \
        MODEL.WEIGHT ATSS_R_50_FPN_1x.pth \
        TEST.IMS_PER_BATCH 4

Please note that:
1) If your model's name is different, please replace `ATSS_R_50_FPN_1x.pth` with your own.
2) If you encounter an out-of-memory error, please try to reduce `TEST.IMS_PER_BATCH` to 1.
3) If you want to evaluate a different model, please change `--config-file` to its config file (in [configs/atss](configs/atss)) and `MODEL.WEIGHT` to its weights file.

## Models
For your convenience, we provide the following trained models. All models are trained with 16 images in a mini-batch and frozen batch normalization (i.e., consistent with the models in [FCOS](https://github.com/tianzhi0549/FCOS) and [maskrcnn_benchmark](https://github.com/facebookresearch/maskrcnn-benchmark)).

Model | Multi-scale training | Testing time / im | AP (minival) | AP (test-dev) | Link
--- |:---:|:---:|:---:|:---:|:---:
ATSS_R_50_FPN_1x | No | 44ms | 39.3 | 39.3 | [Google](https://drive.google.com/open?id=1t8RLdQ6fsFXa0kzPIQ7541uZeQeMXP73)/[Baidu](https://pan.baidu.com/s/1bYXjWJE35kHLpQAIeWtZ0g)
ATSS_dcnv2_R_50_FPN_1x | No | 54ms | 43.2 | 43.0 | [Google](https://drive.google.com/open?id=1_Zl6sVrNZbvawxtMdvNSE9wgURmkLLka)/[Baidu](https://pan.baidu.com/s/1baZJMCCy_waR0hhChgEQFA)
ATSS_R_101_FPN_2x | Yes | 57ms | 43.5 | 43.6 | [Google](https://drive.google.com/open?id=1jenAgiLLqome8nn5ghV7wmknfr1Xg_Dw)/[Baidu](https://pan.baidu.com/s/1hiAew46s877dpgAZ-AweLw)
ATSS_dcnv2_R_101_FPN_2x | Yes | 73ms | 46.1 | 46.3 | [Google](https://drive.google.com/open?id=17S-M6UILyS18s5RW1T6lWFi8nrKMhwd7)/[Baidu](https://pan.baidu.com/s/1eakRoQIqR-UmjWT4RM8vyQ)
ATSS_X_101_32x8d_FPN_2x | Yes | 110ms | 44.8 | 45.1 | [Google](https://drive.google.com/open?id=1jFTdsQD2KfR9Dh1NgX05_02wfQxlnmD3)/[Baidu](https://pan.baidu.com/s/1uO3ZLstI7tkVQBayjRy-6w)
ATSS_dcnv2_X_101_32x8d_FPN_2x | Yes | 143ms | 47.7 | 47.7 | [Google](https://drive.google.com/open?id=19E7vh7YCq0ZpvRIaswDMWGRmwcGK56Bz)/[Baidu](https://pan.baidu.com/s/1pOMZGb3UZb7u_lTqUk55Mw)
ATSS_X_101_64x4d_FPN_2x | Yes | 112ms | 45.5 | 45.6 | [Google](https://drive.google.com/open?id=1ECj7mQwZowiTsSwDXU5Q_Ab2tG-Byhsk)/[Baidu](https://pan.baidu.com/s/1LxNkz0To_mGWGRbtzA78bw)
ATSS_dcnv2_X_101_64x4d_FPN_2x | Yes | 144ms | 47.7 | 47.7 | [Google](https://drive.google.com/open?id=1Lmhtn71AgJC_6B5iqU8-PG_rYanKEr2k)/[Baidu](https://pan.baidu.com/s/1nzX-lUvZfnV--fj6OwsnmQ)

[1] *The testing time is taken from [FCOS](https://github.com/tianzhi0549/FCOS), because our method only redefines positive and negative training samples without incurring any additional overhead.* \
[2] *1x and 2x mean the model is trained for 90K and 180K iterations, respectively.* \
[3] *All results are obtained with a single model and without any test-time data augmentation such as multi-scale testing, flipping, etc.* \
[4] *`dcnv2` denotes deformable convolutional networks v2. Note that for ResNet based models, we apply deformable convolutions from stage c3 to c5 in the backbones. For ResNeXt based models, only stages c4 and c5 use deformable convolutions. All models use deformable convolutions in the last layer of the detector towers.* \
[5] *The model `ATSS_dcnv2_X_101_64x4d_FPN_2x` with multi-scale testing achieves 50.7% AP on COCO test-dev. Please use `TEST.BBOX_AUG.ENABLED True` to enable multi-scale testing.*
## Training

The following command line will train ATSS_R_50_FPN_1x on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):

    python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --master_port=$((RANDOM + 10000)) \
        tools/train_net.py \
        --config-file configs/atss/atss_R_50_FPN_1x.yaml \
        DATALOADER.NUM_WORKERS 2 \
        OUTPUT_DIR training_dir/atss_R_50_FPN_1x

Please note that:
1) If you want to use fewer GPUs, please change `--nproc_per_node` to the number of GPUs. No other settings need to be changed. The total batch size does not depend on `nproc_per_node`. If you want to change the total batch size, please change `SOLVER.IMS_PER_BATCH` in [configs/atss/atss_R_50_FPN_1x.yaml](configs/atss/atss_R_50_FPN_1x.yaml), or override it on the command line like the other options above.
2) The models will be saved into `OUTPUT_DIR`.
3) If you want to train ATSS with other backbones, please change `--config-file`.
## Contributing to the project
Any pull requests or issues are welcome.

## Citations
Please cite our paper in your publications if it helps your research:
```
@article{zhang2019bridging,
  title   = {Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection},
  author  = {Zhang, Shifeng and Chi, Cheng and Yao, Yongqiang and Lei, Zhen and Li, Stan Z.},
  journal = {arXiv preprint arXiv:1912.0xxxx},
  year    = {2019}
}
```