Commit 483a1c6: initial commit

SeongJoonOh authored and committed Jun 11, 2021 (0 parents)

Showing 76 changed files with 3,267,599 additions and 0 deletions.
20 changes: 20 additions & 0 deletions LICENSE
@@ -0,0 +1,20 @@
CALM
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
62 changes: 62 additions & 0 deletions NOTICE
@@ -0,0 +1,62 @@
CALM
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

--------------------------------------------------------------------------------------

This project contains subcomponents with separate copyright notices and license terms.
Your use of the source code for these subcomponents is subject to the terms and conditions of the following licenses.

=====

pytorch/vision
https://github.com/pytorch/vision

BSD 3-Clause License

Copyright (c) Soumith Chintala 2016,
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=====
90 changes: 90 additions & 0 deletions README.md
@@ -0,0 +1,90 @@
## Keep CALM and Improve Visual Feature Attribution

Jae Myung Kim<sup>1*</sup>, Junsuk Choe<sup>1*</sup>, Zeynep Akata<sup>2</sup>, Seong Joon Oh<sup>1&dagger;</sup>
<sub>\* Equal contribution</sub> <sub>&dagger; Corresponding author</sub>

<sup>1</sup> <sub>NAVER AI LAB</sub> <sup>2</sup> <sub>University of T&uuml;bingen</sub>


<p align="center">
<img src="teaser.png" width="70%" title="" alt="CAM vs CALM"></img>
</p>

### Abstract
The class activation mapping, or CAM, has been the cornerstone of feature attribution methods for multiple vision tasks. Its simplicity and effectiveness have led to wide applications in the explanation of visual predictions and weakly-supervised localization tasks. However, CAM has its own shortcomings. The computation of attribution maps relies on ad-hoc calibration steps that are not part of the training computational graph, making it difficult for us to understand the real meaning of the attribution values. In this paper, we improve CAM by explicitly incorporating a latent variable encoding the location of the cue for recognition in the formulation, thereby subsuming the attribution map into the training computational graph. The resulting model, ***class activation latent mapping***, or ***CALM***, is trained with the expectation-maximization algorithm. Our experiments show that CALM identifies discriminative attributes for image classifiers more accurately than CAM and other visual attribution baselines. CALM also shows performance improvements over prior arts on the weakly-supervised object localization benchmarks.
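
As a rough illustration of this formulation, the sketch below computes an attribution map from two hypothetical heads using the decomposition p(y, z | x) = p(y | z, x) p(z | x) described above. The tensor shapes and head names are assumptions for illustration only, not the authors' implementation:
```
import torch
import torch.nn.functional as F

def calm_attribution(class_logits, location_logits):
    # class_logits:    (B, C, H, W) per-location class scores (hypothetical conv head)
    # location_logits: (B, 1, H, W) per-location cue scores (hypothetical head)
    b, c, h, w = class_logits.shape
    # p(y | z, x): normalize over classes at each spatial location z.
    p_y_given_z = F.softmax(class_logits, dim=1)
    # p(z | x): normalize over the h*w spatial locations.
    p_z = F.softmax(location_logits.view(b, -1), dim=1).view(b, 1, h, w)
    # Joint likelihood p(y, z | x): the attribution map for class y is
    # this joint, viewed as a function of the latent location z.
    joint = p_y_given_z * p_z
    # Marginalizing out z recovers the classifier output p(y | x),
    # so the attribution map lives inside the training computational graph.
    p_y = joint.sum(dim=(2, 3))
    return joint, p_y

# Example with random tensors:
joint, p_y = calm_attribution(torch.randn(2, 10, 7, 7), torch.randn(2, 1, 7, 7))
```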

### Dataset downloading
For the ImageNet and CUB datasets, please follow each dataset's standard download procedure. <br>
For ImageNetV2, CUBV2, and OpenImages30k, please follow the procedure described on the [wsol-evaluation page](https://github.com/naver-ai/wsolevaluation#2-dataset-downloading-and-license).
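
Based on the paths composed in `set_data_path` of `config.py`, the code expects a layout along these lines under `--data_root` (illustrative; only the directory names used in `config.py` are fixed):
```
dataset/
├── ILSVRC/        # ImageNet train/test images
├── ImageNetV2/    # ImageNet validation split
├── CUB/
│   └── images/    # CUB train/test images
├── CUBV2/         # CUB validation split
└── OpenImages/    # OpenImages30k (masks go under --mask_root/OpenImages)
```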

### How to use the models
You can train a CALM model with:
```
$ python main.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM-EM \
    --dataset_name=CUB \
    --use_bn=True --large_feature_map=True
```

You can evaluate a trained model on two different metrics:
```
$ python eval_pixel_perturb.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM-EM \
    --dataset_name=CUB \
    --use_bn=True --large_feature_map=True \
    --use_load_checkpoint=True \
    --load_checkpoint=checkpoint_name/ \
    --score_map_process=jointll --norm_type=clipping
$ python eval_cue_location.py --experiment_name=experiment_name/ \
    --architecture=resnet50 \
    --attribution_method=CALM-EM \
    --dataset_name=CUB \
    --use_bn=True --large_feature_map=True \
    --use_load_checkpoint=True \
    --load_checkpoint=checkpoint_name/ \
    --score_map_process=jointll --norm_type=clipping --threshold_type=log
```

### Pretrained weights
Pretrained CALM weights are available below:
| Model name | Dataset | cls. accuracy (%) | weights |
|:-------:|:--------:|:--------:|:--------:|
| CALM_EM | CUB | 71.8 | [link](https://drive.google.com/file/d/1XfBEc1Lh24WqJZP1aLLrwGSlj_INMVef/view?usp=sharing) |
| CALM_EM | OpenImages | 70.1 | [link](https://drive.google.com/file/d/11250uUiRNafuTbnx2h4kryiVI-7218mW/view?usp=sharing) |
| CALM_EM | ImageNet | 70.4 | [link](https://drive.google.com/file/d/17451YI9KANnkmmn2ix0N-uHYZljsq4Lc/view?usp=sharing) |
| CALM_ML | CUB | 59.6 | [link](https://drive.google.com/file/d/1JgupGC2EoIX8wqpYgKPS-3kSLF-29Z8g/view?usp=sharing) |
| CALM_ML | OpenImages | 70.9 | [link](https://drive.google.com/file/d/1QHhiRjO_Oz_yIl64PJqMeCJzK1KUlffA/view?usp=sharing) |
| CALM_ML | ImageNet | 70.6 | [link](https://drive.google.com/file/d/131VHERtxDC-45MhIKgGok-1WxCXUlvdC/view?usp=sharing) |
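
If you want to load a downloaded checkpoint outside the provided scripts, a minimal PyTorch sketch follows. The filename and the `state_dict` key are assumptions, so inspect the file first:
```
import torch

# Hypothetical loading sketch; the checkpoint layout is an assumption here.
checkpoint = torch.load('calm_em_cub.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)  # key name not documented
# Build the matching architecture (e.g., the resnet50 CALM model in this repo),
# then restore the weights with: model.load_state_dict(state_dict)
```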

### Explainability scores
Below are the cue localization and remove-and-classify results; see the paper for details on the metrics.
Cue localization <br> (the higher, the better) | Remove-and-classify <br> (the lower, the better)
:-------------------------:|:-------------------------:
![](img_cue_localization.png) | ![](img_remove_and_classify.png)

### License

```
Copyright (c) 2021-present NAVER Corp.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
```
213 changes: 213 additions & 0 deletions config.py
@@ -0,0 +1,213 @@
"""
CALM
Copyright (c) 2021-present NAVER Corp.
MIT license
"""

import argparse
import munch
import importlib
import os

from os.path import join as ospj
import shutil

from util import Logger

_DATASET_NAMES = ('CUB', 'ILSVRC', 'OpenImages')
_ARCHITECTURE_NAMES = ('vgg16', 'resnet50', 'inception_v3')
_ATTRIBUTION_METHODS = ('CAM', 'CALM-EM', 'CALM-ML')
_SCORE_MAP_METHOD_NAMES = ('activation_map', 'backprop')
_SCORE_MAP_PROCESS_NAMES = (
'vanilla', 'vanilla-saliency', 'vanilla-superclass',
'jointll', 'jointll-superclass', 'jointll-superclass-mean',
'gtcond', 'gtcond-superclass', 'gtcond-superclass-mean',
'saliency',
'input_grad', 'integrated_grad', 'smooth_grad', 'var_grad')
_NORM_TYPES = ('max', 'minmax', 'clipping')
_THRESHOLD_TYPES = ('even', 'log')
_SPLITS = ('train', 'val', 'test')
_LOGGER_TYPE = ('PythonLogger')


def mch(**kwargs):
return munch.Munch(dict(**kwargs))


def str2bool(v):
if v.lower() in ('yes', 'true', 't', 'y', '1'):
return True
elif v.lower() in ('no', 'false', 'f', 'n', '0'):
return False
else:
raise argparse.ArgumentTypeError('Boolean value expected.')


def configure_data_paths(args):
train, val, test = set_data_path(
dataset_name=args.dataset_name,
data_root=args.data_root
)
data_paths = mch(train=train, val=val, test=test)
return data_paths


def set_data_path(dataset_name, data_root):
if dataset_name == 'ILSVRC':
train = test = ospj(data_root, dataset_name)
val = ospj(data_root, 'ImageNetV2')
elif dataset_name == 'CUB':
train = test = ospj(data_root, dataset_name, 'images')
val = ospj(data_root, 'CUBV2')
elif dataset_name == 'OpenImages':
train = val = test = ospj(data_root, dataset_name)
else:
raise ValueError("Dataset {} unknown.".format(dataset_name))
return train, val, test


def configure_mask_root(args):
mask_root = ospj(args.mask_root, 'OpenImages')
return mask_root


def configure_log_folder(args):
log_folder = ospj(args.save_root, args.experiment_name)

if os.path.isdir(log_folder):
if args.override_cache:
shutil.rmtree(log_folder, ignore_errors=True)
else:
raise RuntimeError("Experiment with the same name exists: {}"
.format(log_folder))
os.makedirs(log_folder)
return log_folder


def configure_log(args):
log_file_name = ospj(args.log_folder, 'log.log')
Logger(log_file_name)


def configure_reporter(args):
reporter = importlib.import_module('util').Reporter
reporter_log_root = ospj(args.log_folder, 'reports')
if not os.path.isdir(reporter_log_root):
os.makedirs(reporter_log_root)
return reporter, reporter_log_root


def configure_pretrained_path(args):
pretrained_path = None
return pretrained_path


def get_configs():
parser = argparse.ArgumentParser()

# Util
parser.add_argument('--seed', type=int)
parser.add_argument('--experiment_name', type=str, default='result')
parser.add_argument('--override_cache', type=str2bool, nargs='?',
const=True, default=False)
parser.add_argument('--workers', default=4, type=int,
help='number of data loading workers (default: 4)')
parser.add_argument('--use_load_checkpoint', type=str2bool, nargs='?',
const=True, default=False)
parser.add_argument('--load_checkpoint', type=str, default=None,
help='folder name for loading ckeckpoint')
parser.add_argument('--is_different_checkpoint', type=str2bool,
nargs='?', const=True, default=False)
parser.add_argument('--save_root', type=str, default='save')
parser.add_argument('--logger_type', type=str,
default='PythonLogger', choices=_LOGGER_TYPE)

# Data
parser.add_argument('--dataset_name', type=str, default='CUB',
choices=_DATASET_NAMES)
parser.add_argument('--data_root', metavar='/PATH/TO/DATASET',
default='dataset/',
help='path to dataset images')
parser.add_argument('--metadata_root', type=str, default='metadata/')
parser.add_argument('--mask_root', metavar='/PATH/TO/MASKS',
default='dataset/',
help='path to masks')
parser.add_argument('--proxy_training_set', type=str2bool, nargs='?',
const=True, default=False,
help='Efficient hyper_parameter search with a proxy '
'training set.')

# Setting
parser.add_argument('--architecture', default='resnet18',
choices=_ARCHITECTURE_NAMES,
help='model architecture: ' +
' | '.join(_ARCHITECTURE_NAMES) +
' (default: resnet18)')
parser.add_argument('--attribution_method', type=str, default='CAM',
choices=_ATTRIBUTION_METHODS)
parser.add_argument('--is_train', type=str2bool, nargs='?',
const=True, default=True)
parser.add_argument('--epochs', default=40, type=int,
help='number of total epochs to run')
parser.add_argument('--pretrained', type=str2bool, nargs='?',
const=True, default=True,
help='Use pre_trained model.')
parser.add_argument('--cam_curve_interval', type=float, default=.001,
help='CAM curve interval')
parser.add_argument('--resize_size', type=int, default=256,
help='input resize size')
parser.add_argument('--crop_size', type=int, default=224,
help='input crop size')

# Common hyperparameters
parser.add_argument('--batch_size', default=64, type=int,
help='Mini-batch size (default: 256), this is the total'
'batch size of all GPUs on the current node when'
'using Data Parallel or Distributed Data Parallel')
parser.add_argument('--lr', default=0.01, type=float,
help='initial learning rate', dest='lr')
parser.add_argument('--lr_decay_frequency', type=int, default=30,
help='How frequently do we decay the learning rate?')
parser.add_argument('--lr_classifier_ratio', type=float, default=10,
help='Multiplicative factor on the classifier layer.')
parser.add_argument('--momentum', default=0.9, type=float,
help='momentum')
parser.add_argument('--weight_decay', default=1e-4, type=float,
help='weight decay (default: 1e-4)',
dest='weight_decay')
parser.add_argument('--use_bn', type=str2bool, nargs='?',
const=True, default=False)
parser.add_argument('--large_feature_map', type=str2bool, nargs='?',
const=True, default=False)
parser.add_argument('--iou_thresholds', nargs='+',
type=int, default=[30, 50, 70])

# Method-specific hyperparameters
parser.add_argument('--smoothing_ksize', type=int, default=1)
parser.add_argument('--score_map_method', type=str, default='activation_map',
choices=_SCORE_MAP_METHOD_NAMES)
parser.add_argument('--score_map_process', type=str, default='vanilla',
choices=_SCORE_MAP_PROCESS_NAMES)
parser.add_argument('--norm_type', default='minmax', type=str,
choices=_NORM_TYPES)
parser.add_argument('--threshold_type', default='even', type=str,
choices=_THRESHOLD_TYPES)
parser.add_argument('--smooth_grad_nr_iter', type=int, default=50,
help='SmoothGrad number of sampling')
parser.add_argument('--smooth_grad_sigma', type=float, default=4.0,
help='SmoothGrad sigma multiplier')
parser.add_argument('--integrated_grad_nr_iter', type=int, default=50,
help='IntegratedGradient number of steps')
args = parser.parse_args()

args.log_folder = configure_log_folder(args)
configure_log(args)

args.data_root = args.data_root.strip('"')
args.data_paths = configure_data_paths(args)
args.metadata_root = ospj(args.metadata_root, args.dataset_name)
args.mask_root = configure_mask_root(args)
args.reporter, args.reporter_log_root = configure_reporter(args)
args.pretrained_path = configure_pretrained_path(args)

return args
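
For context, a minimal, hypothetical entry point consuming this configuration could look as follows; the repository's actual `main.py` is not part of this excerpt:
```
# Hypothetical sketch; the real training loop lives in main.py of this repository.
from config import get_configs

if __name__ == '__main__':
    args = get_configs()          # parses CLI flags, creates the log folder
    print(args.architecture, args.dataset_name)
    print(args.data_paths.train)  # munch object with train/val/test paths
```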