forked from naver-ai/calm
Commit 483a1c6, authored and committed by SeongJoonOh on Jun 11, 2021 (0 parents).
Showing 76 changed files with 3,267,599 additions and 0 deletions.
CALM
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
CALM
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

--------------------------------------------------------------------------------------

This project contains subcomponents with separate copyright notices and license terms.
Your use of the source code for these subcomponents is subject to the terms and conditions of the following licenses.

=====

pytorch/vision
https://github.com/pytorch/vision

BSD 3-Clause License

Copyright (c) Soumith Chintala 2016,
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
  contributors may be used to endorse or promote products derived from
  this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=====
## Keep CALM and Improve Visual Feature Attribution

Jae Myung Kim<sup>1*</sup>, Junsuk Choe<sup>1*</sup>, Zeynep Akata<sup>2</sup>, Seong Joon Oh<sup>1†</sup>
<sub>\* Equal contribution</sub> <sub>† Corresponding author</sub>

<sup>1</sup> <sub>NAVER AI LAB</sub> <sup>2</sup> <sub>University of Tübingen</sub>

<p align="center">
<img src="teaser.png" width="70%" title="" alt="CAM vs CALM">
</p>

### Abstract
The class activation mapping, or CAM, has been the cornerstone of feature attribution methods for multiple vision tasks. Its simplicity and effectiveness have led to wide applications in the explanation of visual predictions and in weakly-supervised localization tasks. However, CAM has its own shortcomings: the computation of attribution maps relies on ad-hoc calibration steps that are not part of the training computational graph, making it difficult to understand the real meaning of the attribution values. In this paper, we improve CAM by explicitly incorporating a latent variable encoding the location of the cue for recognition into the formulation, thereby subsuming the attribution map into the training computational graph. The resulting model, ***class activation latent mapping***, or ***CALM***, is trained with the expectation-maximization algorithm. Our experiments show that CALM identifies discriminative attributes for image classifiers more accurately than CAM and other visual attribution baselines. CALM also improves over prior art on weakly-supervised object localization benchmarks.
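The latent-variable formulation summarized above can be sketched schematically as follows. This is our paraphrase of the abstract, not the paper's exact notation: with input `x`, class `y`, and a latent variable `z` ranging over feature-map locations,

```
p(y|x)  =  Σ_z p(y|z,x) · p(z|x)        (recognition marginalizes over cue location z)
s_y(z)  ∝  p(y,z|x) = p(y|z,x) · p(z|x) (attribution map for class y, same graph)
```

The two released training variants plausibly correspond to maximizing this likelihood with expectation-maximization (CALM_EM) or as a direct marginal likelihood (CALM_ML); see the paper for the exact objectives.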
### Dataset downloading
For the ImageNet and CUB datasets, please follow the common procedure for downloading them. <br>
For ImageNetV2, CUBV2, and OpenImages30k, please follow the procedure introduced on the [wsol-evaluation page](https://github.com/naver-ai/wsolevaluation#2-dataset-downloading-and-license).

### How to use models
You can train CALM models with:
```
$ python main.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM_EM \
    --dataset=CUB \
    --use_bn=True --large_feature_map=True
```
You can evaluate the models on two different metrics:
```
$ python eval_pixel_perturb.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM_EM \
    --dataset=CUB \
    --use_bn=True --large_feature_map=True \
    --use_load_checkpoint=True \
    --load_checkpoint=checkpoint_name/ \
    --score_map_process=jointll --norm_type=clipping &
$ python eval_cue_location.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM_EM \
    --dataset=CUB \
    --use_bn=True --large_feature_map=True \
    --use_load_checkpoint=True \
    --load_checkpoint=checkpoint_name/ \
    --score_map_process=jointll --norm_type=clipping --threshold_type=log &
```
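The `--norm_type` flag above selects how raw score maps are rescaled before evaluation. The exact definitions live in the repository's evaluation code; the sketch below is only a plausible reading of the three choices (`max`, `minmax`, `clipping`), and the function name is hypothetical:

```python
def normalize_scores(scores, norm_type="minmax"):
    """Rescale a flat list of attribution scores.

    Hypothetical illustration of the --norm_type choices; the exact
    definitions are in the repository's evaluation code.
    """
    lo, hi = min(scores), max(scores)
    if norm_type == "max":
        # Scale so the peak score becomes 1 (negative scores stay negative).
        return [s / hi for s in scores]
    if norm_type == "minmax":
        # Affinely map the observed range onto [0, 1].
        return [(s - lo) / (hi - lo) for s in scores]
    if norm_type == "clipping":
        # Clip scores into [0, 1], discarding out-of-range mass.
        return [min(max(s, 0.0), 1.0) for s in scores]
    raise ValueError(f"unknown norm_type: {norm_type}")
```

Whichever definition the code uses, the choice matters for thresholded metrics: `minmax` always spans the full [0, 1] range, while `clipping` preserves absolute score values.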
### Pretrained weights
For those who wish to use pretrained CALM weights:

| Model name | Dataset | cls. accuracy | weights |
|:-------:|:--------:|:--------:|:--------:|
| CALM_EM | CUB | 71.8 | [link](https://drive.google.com/file/d/1XfBEc1Lh24WqJZP1aLLrwGSlj_INMVef/view?usp=sharing) |
| CALM_EM | OpenImages | 70.1 | [link](https://drive.google.com/file/d/11250uUiRNafuTbnx2h4kryiVI-7218mW/view?usp=sharing) |
| CALM_EM | ImageNet | 70.4 | [link](https://drive.google.com/file/d/17451YI9KANnkmmn2ix0N-uHYZljsq4Lc/view?usp=sharing) |
| CALM_ML | CUB | 59.6 | [link](https://drive.google.com/file/d/1JgupGC2EoIX8wqpYgKPS-3kSLF-29Z8g/view?usp=sharing) |
| CALM_ML | OpenImages | 70.9 | [link](https://drive.google.com/file/d/1QHhiRjO_Oz_yIl64PJqMeCJzK1KUlffA/view?usp=sharing) |
| CALM_ML | ImageNet | 70.6 | [link](https://drive.google.com/file/d/131VHERtxDC-45MhIKgGok-1WxCXUlvdC/view?usp=sharing) |

### Explainability scores
Cue localization and remove-and-classify results. More details about the metrics are in the paper.

| Cue localization <br> (the higher, the better) | Remove-and-classify <br> (the lower, the better) |
|:-------------------------:|:-------------------------:|
| ![](img_cue_localization.png) | ![](img_remove_and_classify.png) |
### License

```
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
```
"""
CALM
Copyright (c) 2021-present NAVER Corp.
MIT license
"""

import argparse
import importlib
import os
import shutil
from os.path import join as ospj

import munch

from util import Logger

_DATASET_NAMES = ('CUB', 'ILSVRC', 'OpenImages')
_ARCHITECTURE_NAMES = ('vgg16', 'resnet50', 'inception_v3')
_ATTRIBUTION_METHODS = ('CAM', 'CALM-EM', 'CALM-ML')
_SCORE_MAP_METHOD_NAMES = ('activation_map', 'backprop')
_SCORE_MAP_PROCESS_NAMES = (
    'vanilla', 'vanilla-saliency', 'vanilla-superclass',
    'jointll', 'jointll-superclass', 'jointll-superclass-mean',
    'gtcond', 'gtcond-superclass', 'gtcond-superclass-mean',
    'saliency',
    'input_grad', 'integrated_grad', 'smooth_grad', 'var_grad')
_NORM_TYPES = ('max', 'minmax', 'clipping')
_THRESHOLD_TYPES = ('even', 'log')
_SPLITS = ('train', 'val', 'test')
_LOGGER_TYPE = ('PythonLogger',)  # trailing comma keeps this a tuple, not a str


def mch(**kwargs):
    return munch.Munch(dict(**kwargs))


def str2bool(v):
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')
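The `str2bool` helper above exists because argparse's built-in `type=bool` treats any non-empty string, including `"False"`, as true. Combined with `nargs='?'` and `const=True`, it gives each boolean flag three usable spellings, as this standalone sketch (which re-defines the helper so it runs on its own) shows:

```python
import argparse


def str2bool(v):
    # Same logic as the helper above: accept common spellings of true/false.
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    raise argparse.ArgumentTypeError('Boolean value expected.')


parser = argparse.ArgumentParser()
parser.add_argument('--use_bn', type=str2bool, nargs='?',
                    const=True, default=False)

print(parser.parse_args([]).use_bn)                  # omitted -> False (default)
print(parser.parse_args(['--use_bn']).use_bn)        # bare flag -> True (const)
print(parser.parse_args(['--use_bn', 'no']).use_bn)  # explicit value -> False
```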

def configure_data_paths(args):
    train, val, test = set_data_path(
        dataset_name=args.dataset_name,
        data_root=args.data_root
    )
    data_paths = mch(train=train, val=val, test=test)
    return data_paths


def set_data_path(dataset_name, data_root):
    if dataset_name == 'ILSVRC':
        train = test = ospj(data_root, dataset_name)
        val = ospj(data_root, 'ImageNetV2')
    elif dataset_name == 'CUB':
        train = test = ospj(data_root, dataset_name, 'images')
        val = ospj(data_root, 'CUBV2')
    elif dataset_name == 'OpenImages':
        train = val = test = ospj(data_root, dataset_name)
    else:
        raise ValueError("Dataset {} unknown.".format(dataset_name))
    return train, val, test


def configure_mask_root(args):
    mask_root = ospj(args.mask_root, 'OpenImages')
    return mask_root


def configure_log_folder(args):
    log_folder = ospj(args.save_root, args.experiment_name)

    if os.path.isdir(log_folder):
        if args.override_cache:
            shutil.rmtree(log_folder, ignore_errors=True)
        else:
            raise RuntimeError("Experiment with the same name exists: {}"
                               .format(log_folder))
    os.makedirs(log_folder)
    return log_folder

def configure_log(args):
    log_file_name = ospj(args.log_folder, 'log.log')
    Logger(log_file_name)


def configure_reporter(args):
    reporter = importlib.import_module('util').Reporter
    reporter_log_root = ospj(args.log_folder, 'reports')
    if not os.path.isdir(reporter_log_root):
        os.makedirs(reporter_log_root)
    return reporter, reporter_log_root


def configure_pretrained_path(args):
    pretrained_path = None
    return pretrained_path

def get_configs():
    parser = argparse.ArgumentParser()

    # Util
    parser.add_argument('--seed', type=int)
    parser.add_argument('--experiment_name', type=str, default='result')
    parser.add_argument('--override_cache', type=str2bool, nargs='?',
                        const=True, default=False)
    parser.add_argument('--workers', default=4, type=int,
                        help='number of data loading workers (default: 4)')
    parser.add_argument('--use_load_checkpoint', type=str2bool, nargs='?',
                        const=True, default=False)
    parser.add_argument('--load_checkpoint', type=str, default=None,
                        help='folder name for loading checkpoint')
    parser.add_argument('--is_different_checkpoint', type=str2bool,
                        nargs='?', const=True, default=False)
    parser.add_argument('--save_root', type=str, default='save')
    parser.add_argument('--logger_type', type=str,
                        default='PythonLogger', choices=_LOGGER_TYPE)

    # Data
    parser.add_argument('--dataset_name', type=str, default='CUB',
                        choices=_DATASET_NAMES)
    parser.add_argument('--data_root', metavar='/PATH/TO/DATASET',
                        default='dataset/',
                        help='path to dataset images')
    parser.add_argument('--metadata_root', type=str, default='metadata/')
    parser.add_argument('--mask_root', metavar='/PATH/TO/MASKS',
                        default='dataset/',
                        help='path to masks')
    parser.add_argument('--proxy_training_set', type=str2bool, nargs='?',
                        const=True, default=False,
                        help='Efficient hyper_parameter search with a proxy '
                             'training set.')

    # Setting
    parser.add_argument('--architecture', default='resnet50',
                        choices=_ARCHITECTURE_NAMES,
                        help='model architecture: ' +
                             ' | '.join(_ARCHITECTURE_NAMES) +
                             ' (default: resnet50)')
    parser.add_argument('--attribution_method', type=str, default='CAM',
                        choices=_ATTRIBUTION_METHODS)
    parser.add_argument('--is_train', type=str2bool, nargs='?',
                        const=True, default=True)
    parser.add_argument('--epochs', default=40, type=int,
                        help='number of total epochs to run')
    parser.add_argument('--pretrained', type=str2bool, nargs='?',
                        const=True, default=True,
                        help='Use pre-trained model.')
    parser.add_argument('--cam_curve_interval', type=float, default=.001,
                        help='CAM curve interval')
    parser.add_argument('--resize_size', type=int, default=256,
                        help='input resize size')
    parser.add_argument('--crop_size', type=int, default=224,
                        help='input crop size')
    # Common hyperparameters
    parser.add_argument('--batch_size', default=64, type=int,
                        help='Mini-batch size (default: 64); this is the total '
                             'batch size of all GPUs on the current node when '
                             'using Data Parallel or Distributed Data Parallel')
    parser.add_argument('--lr', default=0.01, type=float,
                        help='initial learning rate', dest='lr')
    parser.add_argument('--lr_decay_frequency', type=int, default=30,
                        help='How frequently do we decay the learning rate?')
    parser.add_argument('--lr_classifier_ratio', type=float, default=10,
                        help='Multiplicative factor on the classifier layer.')
    parser.add_argument('--momentum', default=0.9, type=float,
                        help='momentum')
    parser.add_argument('--weight_decay', default=1e-4, type=float,
                        help='weight decay (default: 1e-4)',
                        dest='weight_decay')
    parser.add_argument('--use_bn', type=str2bool, nargs='?',
                        const=True, default=False)
    parser.add_argument('--large_feature_map', type=str2bool, nargs='?',
                        const=True, default=False)
    parser.add_argument('--iou_thresholds', nargs='+',
                        type=int, default=[30, 50, 70])

    # Method-specific hyperparameters
    parser.add_argument('--smoothing_ksize', type=int, default=1)
    parser.add_argument('--score_map_method', type=str, default='activation_map',
                        choices=_SCORE_MAP_METHOD_NAMES)
    parser.add_argument('--score_map_process', type=str, default='vanilla',
                        choices=_SCORE_MAP_PROCESS_NAMES)
    parser.add_argument('--norm_type', default='minmax', type=str,
                        choices=_NORM_TYPES)
    parser.add_argument('--threshold_type', default='even', type=str,
                        choices=_THRESHOLD_TYPES)
    parser.add_argument('--smooth_grad_nr_iter', type=int, default=50,
                        help='SmoothGrad number of samples')
    parser.add_argument('--smooth_grad_sigma', type=float, default=4.0,
                        help='SmoothGrad sigma multiplier')
    parser.add_argument('--integrated_grad_nr_iter', type=int, default=50,
                        help='IntegratedGradients number of steps')
    args = parser.parse_args()

    args.log_folder = configure_log_folder(args)
    configure_log(args)

    args.data_root = args.data_root.strip('"')
    args.data_paths = configure_data_paths(args)
    args.metadata_root = ospj(args.metadata_root, args.dataset_name)
    args.mask_root = configure_mask_root(args)
    args.reporter, args.reporter_log_root = configure_reporter(args)
    args.pretrained_path = configure_pretrained_path(args)

    return args