Lee, J.H., Wagstaff, K.L. Visualizing image content to explain novel image discovery. Data Min Knowl Disc 34, 1777–1804 (2020). https://doi.org/10.1007/s10618-020-00700-0
Lee, Jake; Wagstaff, Kiri, 2024, "Dataset for "Visualizing image content to explain novel image discovery"", https://doi.org/10.48577/jpl.3FUOOD, JPL Open Repository
This repository contains supplemental scripts and data used in the experiments presented in the paper.
Compile the image data set - It is recommended that the image filename include the class information. The images can be in class subfolders or in a single folder.
Preprocess the image set - We recommend first scaling and center-cropping your images to 227x227 pixels.
We used ImageMagick:
mogrify -path imageset/# -format jpg -resize "227x227^" -gravity center -crop 227x227+0+0 +repage imageset/#/*.jpg
Caffe also provides a tool: https://github.com/BVLC/caffe/blob/master/tools/extra/resize_and_crop_images.py
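The scale-then-center-crop step can also be reasoned about in plain Python. The sketch below only computes the geometry (resized size and crop box) that a fill-resize followed by a centered 227x227 crop would use; it does not read or write images. The function name `center_crop_geometry` is hypothetical, and the rounding is an assumption that may differ by a pixel from what ImageMagick does.

```python
def center_crop_geometry(width, height, target=227):
    """Return ((new_w, new_h), crop_box) for a fill-resize plus center crop.

    Hypothetical helper, not part of this repository.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale so the shorter side becomes `target`, keeping the aspect ratio.
    scale = target / min(width, height)
    # Rounding here is an assumption; guard against dropping below `target`.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    # Center a target x target window inside the resized image.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


if __name__ == "__main__":
    size, box = center_crop_geometry(640, 480)
    print("resized to", size, "crop box", box)
```

The returned box uses the (left, top, right, bottom) convention, so it could be handed to an image library's crop call if you prefer to preprocess in Python.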
Download and install DEMUD - Available at https://github.com/wkiri/DEMUD
Extract features - Extract features from the images using the feature extraction script in this repository. The extracted features will be saved as a CSV file whose first column is the image name.
You will need to install Caffe and specify the trained Caffe model from which the features will be extracted. We used one of Caffe's pre-trained models with a modified deploy.prototxt.
The pre-trained model is available at https://github.com/BVLC/caffe/tree/master/models/bvlc_reference_caffenet.
The modified prototxt is available in this repository.
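To sanity-check an extracted feature file before passing it to DEMUD, a small reader like the one below may help. It is a minimal sketch that assumes each row is the image name followed by numeric feature values, with no header row; the function name `load_feature_csv` and the sample rows are made up for illustration.

```python
import csv
import io


def load_feature_csv(handle):
    """Read rows of `name, f1, f2, ...` into parallel lists.

    Hypothetical helper; assumes no header row.
    """
    names, features = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        names.append(row[0])
        features.append([float(value) for value in row[1:]])
    return names, features


if __name__ == "__main__":
    # Two made-up rows with three features each; real vectors are much longer.
    sample = io.StringIO("cat_001.jpg,0.0,1.5,2.25\ndog_007.jpg,3.0,0.0,0.5\n")
    names, features = load_feature_csv(sample)
    print(names, [len(vector) for vector in features])
```

For a file on disk, open it with `open(path, newline="")` and pass the handle to the same function. Checking that every vector has the same length is a quick way to catch a truncated extraction.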
Run DEMUD on features - Configure DEMUD by adding the path to the feature CSV in the DEMUD configuration file, at the floatdatafile line.
Run DEMUD. An example run:
python demud.py -v --init-item=svd --n=300 --k=4096 --svdmethod=increm-brand --note=balfc6
-v indicates this is a run on CNN features in a CSV.
--init-item=svd sets the first item initialization to full SVD initialization.
--n=300 sets DEMUD to select the first 300 items.
--k=4096 sets the number of principal components used during SVD to a maximum of 4096.
--svdmethod=increm-brand sets the SVD method to incremental SVD as described by Brand, 2002.
--note=balfc6 will append "balfc6" to the end of the results directory name.
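When running several DEMUD configurations, it can be convenient to assemble the command from Python. The sketch below uses only the flags shown in the example run; the function name `build_demud_command` and its parameter names are hypothetical, and it assumes `demud.py` is in the current working directory.

```python
import shlex


def build_demud_command(n_select=300, n_components=4096, note="balfc6"):
    """Assemble the argument list for the example DEMUD run.

    Hypothetical helper; only wraps flags that appear in the example above.
    """
    return [
        "python", "demud.py",
        "-v",                        # CNN features stored in a CSV
        "--init-item=svd",           # full SVD initialization for the first item
        f"--n={n_select}",           # number of items to select
        f"--k={n_components}",       # maximum number of principal components
        "--svdmethod=increm-brand",  # incremental SVD (Brand, 2002)
        f"--note={note}",            # suffix for the results directory name
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_demud_command(n_select=100, note="trial")))
```

The returned list can be passed to `subprocess.run(command, check=True)` once the printed command looks right.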
Visualize the explanations - Use the visualization script in this repository to generate visualizations of the explanations. This script was modified from, and its associated models were trained with, code provided for Dosovitskiy and Brox, 2016 (NIPS). The original source is available online.
Calculate and plot discovery rates - Use the discovery-rate script in this repository to calculate nAUCt scores and generate discovery plots.
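As a rough illustration of what a discovery curve and a normalized area under it look like, the sketch below counts how many distinct classes have been seen after each selection and scales the area under the first t points to a 0-100 range. This is not the repository's script: the function names are hypothetical, and the normalization (dividing by the area of a curve that sits at the total class count for all t selections) is an assumption, so consult the paper for the exact nAUCt definition.

```python
def discovery_curve(selected_labels):
    """Cumulative count of distinct classes after each selection."""
    seen, curve = set(), []
    for label in selected_labels:
        seen.add(label)
        curve.append(len(seen))
    return curve


def normalized_discovery_area(selected_labels, n_classes, t):
    """Area under the first t points of the discovery curve, scaled to 0-100.

    Hypothetical helper; the normalization below is an assumption.
    """
    if t <= 0 or n_classes <= 0:
        raise ValueError("t and n_classes must be positive")
    curve = discovery_curve(selected_labels[:t])
    # Assumed normalization: a curve at n_classes for all t selections.
    return 100.0 * sum(curve) / (n_classes * t)


if __name__ == "__main__":
    # Made-up class labels in the order the items were selected.
    labels = ["a", "a", "b", "c"]
    print(discovery_curve(labels))
    print(normalized_discovery_area(labels, n_classes=3, t=4))
```

Class labels for each selected image could come from the filename or the class subfolder, which is why the first step recommends encoding the class there.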
Organize results - Use the results-organization script in this repository to generate PDFs that display the selected images and their visualized explanations.
Documentation for each file and script is available in its respective sub-directory.