Observer Dependent Lossy Image Compression


Requirements

  • Python 3 (tested with versions 3.5.2 and 3.6.5)
  • TensorFlow (tested with versions 1.8.0 and 1.9.0)
  • Other required Python packages are specified in requirements.txt

Pretrained models

In order to prepare CNNs trained for ImageNet classification, run

cd code/
python --model CNN_MODEL

where CNN_MODEL is one of densenet_121, inception_resnet_v2, inception_v3, mobilenet, resnet_50, vgg16, xception. This downloads the model weights if necessary and then freezes the graph as a .pb file in code/resources/keras_models/.

Compress a sample image

To compress and classify an image, run the following command

cd code/
python --image /path/to/image_file --compression COMPRESSION_METHOD --alpha ALPHA --quality Q_PARAM

where the compression is specified as follows:

  • COMPRESSION_METHOD is one of rnn, jpeg, webp or bpg,
  • ALPHA determines the degree of classification oriented compression - either 0, 0.5 or 1.0 and
  • Q_PARAM controls the compression rate; for RNN compression this must be an integer in 1, ..., 8; for BPG, JPEG and WEBP compression the usual parameters apply.
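
The constraints listed above can be checked before launching a compression run. The following is a minimal sketch, not part of this repository: the helper name check_compression_args is hypothetical, and it only encodes the rules stated in this section (the quality range for JPEG, WebP and BPG is left unchecked because it depends on the respective encoder).

```python
# Hypothetical helper, not part of the repository: it only mirrors the
# parameter rules stated in this README.
VALID_METHODS = ("rnn", "jpeg", "webp", "bpg")
VALID_ALPHAS = (0.0, 0.5, 1.0)


def check_compression_args(method, alpha, quality):
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if method not in VALID_METHODS:
        problems.append("unknown compression method: {!r}".format(method))
    if float(alpha) not in VALID_ALPHAS:
        problems.append("alpha must be 0, 0.5 or 1.0, got {!r}".format(alpha))
    # Only the RNN codec has a quality range documented here (1, ..., 8).
    if method == "rnn" and not (isinstance(quality, int) and 1 <= quality <= 8):
        problems.append("rnn quality must be an integer in 1..8, got {!r}".format(quality))
    return problems


if __name__ == "__main__":
    for issue in check_compression_args("rnn", 0.5, 9):
        print(issue)
```

Returning a list rather than raising on the first error lets a caller report every invalid setting at once.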

If you additionally want to classify the image, make sure you have downloaded the corresponding pretrained weights (as described in Section "Pretrained models"). You also need to download the ILSVRC2012 devkit and store it in the directory code/resources/imagenet/meta/. The classifier is then set via the flag --classifier CLASSIFIER, where CLASSIFIER is one of densenet_121, inception_resnet_v2, inception_v3, mobilenet, resnet_50, vgg16, xception.

Downloading and preparing image data


ImageNet

If you want to use ImageNet data, make sure you have an ImageNet account with a username and access key. Then go through the following steps:

  1. Download the devkit from the official ImageNet homepage and extract it to code/resources/imagenet/meta/.
  2. Download the ILSVRC2012 training and validation datasets, see e.g. the script provided by TensorFlow.
  3. Create tfrecords files with training and validation data by following the steps in Section "Store data in tfrecords files".

Stanford Dogs

  1. Download the Images and Lists tar files from the Stanford Dogs dataset website
  2. Extract them both to the same folder (e.g. ~/data/stanford_dogs/)


CUB-200-2011

  1. Download the All Images and Annotations tar file from the CUB-200-2011 dataset website
  2. Extract to a folder (e.g. ~/data/cub200/)
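
The extraction step for either dataset can also be scripted. The following is a minimal sketch using Python's standard tarfile module; the function name extract_archives and the archive file names in the usage example are placeholders, not names taken from this repository.

```python
import os
import tarfile


def extract_archives(archive_paths, target_dir):
    """Extract every tar archive in archive_paths into the same target_dir."""
    target_dir = os.path.expanduser(target_dir)
    os.makedirs(target_dir, exist_ok=True)
    for path in archive_paths:
        # Mode "r:*" lets tarfile detect the compression (plain, gz, bz2, xz).
        with tarfile.open(os.path.expanduser(path), "r:*") as archive:
            # Only extract archives obtained from a source you trust.
            archive.extractall(target_dir)
    # Return the top-level entries so the caller can verify the result.
    return sorted(os.listdir(target_dir))


if __name__ == "__main__":
    # Placeholder file names; substitute the archives you actually downloaded.
    downloads = ["~/Downloads/images.tar", "~/Downloads/lists.tar"]
    if all(os.path.exists(os.path.expanduser(p)) for p in downloads):
        print(extract_archives(downloads, "~/data/stanford_dogs/"))
```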

Store data in tfrecords files

Throughout this repo we store data in tfrecords files, both for evaluation and training. Tfrecords files are generated using the script code/ for Stanford Dogs, CUB-200-2011 and ImageNet datasets. The following assumes that you have downloaded the necessary data to the right locations. Then, run the script as follows:

cd code/
python --dataset DATASET --split SPLIT --data_dir /path/to/data/dir --target_dir /path/to/records

where DATASET is one of imagenet, stanford_dogs, cub200 and SPLIT is either train or val.
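
Since both splits are usually needed, the two invocations can be generated in one go. The following is a minimal sketch: build_record_commands is a hypothetical helper, and the script path is deliberately left as an argument that the caller must supply. Only the flags shown above are used.

```python
def build_record_commands(script, dataset, data_dir, target_dir,
                          splits=("train", "val")):
    """Build one argument list per split for the tfrecords script.

    `script` is the path of the conversion script inside code/; it is
    passed in by the caller because this sketch does not assume its name.
    """
    if dataset not in ("imagenet", "stanford_dogs", "cub200"):
        raise ValueError("unknown dataset: {!r}".format(dataset))
    commands = []
    for split in splits:
        commands.append([
            "python", script,
            "--dataset", dataset,
            "--split", split,
            "--data_dir", data_dir,
            "--target_dir", target_dir,
        ])
    return commands


if __name__ == "__main__":
    # SCRIPT is a placeholder; replace it with the actual script path.
    for cmd in build_record_commands("SCRIPT", "stanford_dogs",
                                     "/path/to/data/dir", "/path/to/records"):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True, cwd="code/") to execute the conversions one after another.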

Evaluate Accuracy and MS-SSIM on a dataset

You can evaluate accuracy and MS-SSIM on different datasets using one of the scripts code/ or code/. This assumes that you have trained models available and data stored as tfrecords files. The following is an example that evaluates accuracy on ImageNet images compressed with the RNN method:

cd code/
python --dataset imagenet \
    --compression rnn \
    --records /path/to/records/ \
    --rnn_ckpt_dir /path/to/rnn/ckpt/
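
For reference, classification accuracy on a set of decoded images reduces to counting how often the true label is among the top-ranked predictions. The following is a minimal, framework-free sketch of that computation; the function top_k_accuracy is illustrative only, and the evaluation scripts in this repository may compute the metric differently.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of examples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per example.
    labels: list of integer class indices, one per example.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda idx: row[idx], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    example_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    example_labels = [1, 1, 2]
    print(top_k_accuracy(example_scores, example_labels, k=1))
    print(top_k_accuracy(example_scores, example_labels, k=2))
```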

Train your own models

Train RNN compression

If you want to train RNN compression, make sure you have followed the previous steps and have the training data in the appropriate format. Note that if ALPHA > 0, you first have to download the VGG-16 weights. Then run

cd code/
python --train_records /path/to/train_records/ \
    --val_records /path/to/val_records/ \
    --job_id JOB_ID \
    --alpha ALPHA \
    --config src/compression/rnn/configs/config.json \
    --vgg_weights /path/to/vgg_weights.npy

where JOB_ID is an id for the training session and ALPHA is a float in [0,1].
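
The two rules stated in this section (ALPHA lies in [0,1], and VGG-16 weights are needed whenever ALPHA > 0) can be checked before a long training job is started. The following is a minimal sketch; check_training_args is a hypothetical helper and not part of the training script.

```python
def check_training_args(alpha, vgg_weights=None):
    """Validate ALPHA and the VGG-16 weights path; return alpha as a float."""
    alpha = float(alpha)
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1], got {}".format(alpha))
    # Per this README, VGG-16 weights are required as soon as alpha > 0.
    if alpha > 0.0 and not vgg_weights:
        raise ValueError("alpha > 0 requires a path to the VGG-16 weights")
    return alpha


if __name__ == "__main__":
    print(check_training_args("0.5", "/path/to/vgg_weights.npy"))
```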

Train CNN classifiers for fine grained visual categorization

You can train your own classifiers for fine-grained visual categorization using the script code/. It is possible to either train the models from scratch or initialize the feature extractor with ImageNet weights. For the latter, you can download the weights from TF-Slim and extract the checkpoints to code/resources/tf_slim_models. The config files containing the hyperparameters used in the paper are in code/src/classification/fine_grained_categorization/training/configs/.

The following is an example that fine-tunes Inception-V3 on the Stanford Dogs dataset, initializing the feature extractor with ImageNet weights and subsequently fine-tuning all layers:

cd code/
python --classifier inception_v3 \
    --dataset stanford_dogs \
    --train_records /path/to/train_records/ \
    --val_records /path/to/val_records/ \
    --job_id JOB_ID \
    --pretrained_model resources/tf_slim_models/inception_v3/inception_v3.ckpt