A suite of tests to assess the faithfulness of attention for explainability. This code accompanies the paper "Evaluating the Faithfulness of Saliency-based Explanations for Deep Learning Models for Temporal Colour Constancy", published in the proceedings of the 2022 Workshop on Explainable Artificial Intelligence (XAI).
ID | Code | What is tested? | Reference |
---|---|---|---|
WP1 | `acc` | Impact of attention on accuracy | [4] |
WP2 | `mlp` | Impact on accuracy of using attention to guide a non-contextual MLP model | [4] |
WP3 | `adv` | Impact of adversarial attention distributions on predictions | [2, 4] |
SS1 | `ers` (single) | Impact of single-weight erasure on predictions | [3] |
SS2 | `ers` (multi) | Impact of groups-of-weights erasure on predictions | [3] |
A1 | `rand_params` | Impact of model parameters randomization on attention weights | [1] |
A2 | `rand_labels` | Impact of training labels randomization on attention weights | [1] |
1. Adebayo et al., "Sanity Checks for Saliency Maps", NeurIPS (2018) [Paper - Code]
2. Jain & Wallace, "Attention is not Explanation", NAACL (2019) [Paper - Code]
3. Serrano & Smith, "Is Attention Interpretable?", ACL (2019) [Paper - Code]
4. Wiegreffe & Pinter, "Attention is not not Explanation", EMNLP (2019) [Paper - Code]
- Color Constancy
- Temporal Color Constancy (TCC)
- TCC: presented in Qian et al., "A Benchmark for Temporal Color Constancy", arXiv (2020) [Paper - Code]. The dataset can be downloaded at www.cs.ubc.ca/research/tcc/
The code has been developed and tested on Ubuntu 20.10 using Python 3.9 and some functionalities introduced in torch 1.9.0. Please install the required Python packages listed in `requirements.txt`. Using a virtualenv is not compulsory but strongly advised.
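For instance, a minimal setup sketch could look like the following (the virtual environment name `venv` is an arbitrary choice):

```bash
# Create and activate an isolated Python 3.9 environment
python3.9 -m venv venv
source venv/bin/activate

# Install the project dependencies
pip install --upgrade pip
pip install -r requirements.txt
```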
Paths to the dataset are relative to a base path configurable inside `auxiliary/settings.py` via the `PATH_TO_DATASET = path/to/dataset` setting. Please ensure the desired dataset is stored at that path under a folder named coherently with the paths used inside the user-defined PyTorch dataset class. For example, the TCC dataset implementation at `classes/tasks/ccc/multiframe/data/TCC.py` refers to a folder named `tcc`, which should be found at `path/to/dataset/tcc`.
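As an illustration, assuming the base dataset folder is `~/datasets` (an arbitrary choice), the expected layout for TCC could be prepared as follows:

```bash
# Create the base dataset folder and the "tcc" subfolder expected by
# classes/tasks/ccc/multiframe/data/TCC.py
mkdir -p ~/datasets/tcc

# Download the TCC dataset from www.cs.ubc.ca/research/tcc/ and extract it
# into ~/datasets/tcc, then set PATH_TO_DATASET in auxiliary/settings.py to
# the absolute path of the base folder (e.g. /home/<user>/datasets)
```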
All tests can be run from the shell using the scripts (`.sh` files) in the `eval/tests` subdirectories. Each script exposes multiple configuration options that can be edited within the file, where they are also documented.
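In general, the workflow for any test is to review the options documented inside the corresponding script, edit them as needed, and launch the script from the repository root; a sketch using the accuracy test script described below:

```bash
# 1. Review and edit the configuration options documented inside the script
nano eval/tests/acc/acc_test.sh

# 2. Launch the script from the repository root
sh eval/tests/acc/acc_test.sh
```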
- Run `sh eval/tests/acc/acc_test.sh` to measure the impact on accuracy of using learned attention weights versus random attention weights
- Run `sh eval/tests/adv/variance_test.sh` to train the models with multiple random seeds. This step is optional but strongly advised
- Run `sh eval/tests/adv/adv_test.sh` to train the adversarial attention models
- Run `sh eval/analysis/adv/adv_analysis.sh` to analyze the test output data
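Put together, the adversarial workflow (assuming the repository root as working directory) is:

```bash
# Optional but advised: train the models with multiple random seeds
sh eval/tests/adv/variance_test.sh

# Train the adversarial attention models
sh eval/tests/adv/adv_test.sh

# Analyze the test output data
sh eval/analysis/adv/adv_analysis.sh
```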
- Run `sh eval/tests/mlp/mlp_test.sh` to train the MLP models either guided by imposed attention weights, learning their own attention weights, or not using attention at all
- Run `sh eval/tests/ers/save_grads.sh` to save the attention gradients of each model at test time. These values will be used to ground the gradient-based criteria for the multi-weight erasure
- Run `sh eval/tests/ers/erasure.sh` to run either the single- or the multi-weight erasure
- Run `sh eval/analysis/ers/ers_analysis.sh` to analyze the test output data
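Analogously, the erasure workflow chains the three steps above (the single/multi mode is presumably selected via the options inside the erasure script):

```bash
# Save the attention gradients of each model at test time (used to ground
# the gradient-based criteria for the multi-weight erasure)
sh eval/tests/ers/save_grads.sh

# Run either the single- or the multi-weight erasure
sh eval/tests/ers/erasure.sh

# Analyze the test output data
sh eval/analysis/ers/ers_analysis.sh
```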
- Run `sh eval/tests/ers/rand_params_test.sh` to run the parameter randomization test
- Run `sh eval/tests/ers/rand_labels_test.sh` to run the label randomization test