hamedhaghighi/UTLC

ITALIC: ITerative and Attentive Lossless Image Compression

Density estimators fall broadly into two groups: auto-regressive and non-auto-regressive. Auto-regressive methods predict the probability of each pixel conditioned on the previous pixels, whereas non-auto-regressive approaches estimate each pixel's probability independently. Auto-regressive models such as PixelCNN are powerful density estimators, but they are slow in the decoding phase. Non-auto-regressive models decode faster, but at the cost of a worse negative log-likelihood (NLL). Popular models such as Multi-scale PixelCNN lie between these two extremes, but their fixed architecture does not allow the user to select a specific time/bpp operating point as needed. In this project, we attempt to mitigate this problem by designing a recursive architecture that can establish a time/bpp trade-off.
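The two families, and the middle ground this project aims for, can be written as factorizations of the image likelihood. This is only a sketch in notation that is not from the repository: x_1, ..., x_N are the pixels, c is the conditioning code, and S_1, ..., S_K are assumed disjoint pixel groups, one per step.

```latex
% Auto-regressive: N sequential conditionals (slow decoding, strong model)
p(x) = \prod_{i=1}^{N} p\left(x_i \mid x_1, \dots, x_{i-1}\right)

% Non-auto-regressive: all pixels predicted independently (fast decoding)
p(x) = \prod_{i=1}^{N} p\left(x_i\right)

% K-step grouping: pixels in group S_k are predicted in parallel,
% conditioned on the code c and on all groups decoded in earlier steps
p(x \mid c) = \prod_{k=1}^{K} \prod_{i \in S_k}
    p\left(x_i \mid x_{S_1 \cup \dots \cup S_{k-1}},\, c\right)
```

With K = 1 the last form predicts every pixel in a single pass given c, and with K = N and one pixel per group it becomes fully auto-regressive, so K acts as the time/bpp knob.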

Dependencies

  • numpy
  • torch==1.1
  • bcolz
  • tqdm

Dataset

CIFAR-10

Overview

Our approach recursively estimates the probability of an image conditioned on a low-level code (in our project, a BPG code) and uses Arithmetic Coding (AC) for the encoding/decoding procedure. Depending on the time/bpsp required, we change the number of recursions (steps) used to encode/decode an image. The number of pixels encoded/decoded in each step is also adjustable: we feed the network a pre-determined binary mask for each step, in which the ones mark the pixels encoded/decoded in that step. With this technique, the number of steps can range from one to the total number of pixels in the image.

The network's architecture is inspired by the RNAN method, which restores distorted images with the help of a self-attention mechanism. The non-local attention allows the network to become more sensitive to the important parts of the image. Because we estimate density conditioned on the BPG code, our model can be interpreted as a probabilistic image enhancer, so the RNAN architecture fits our setting well. Moreover, non-local blocks increase the receptive field to the size of the image, which improves predictability. The architecture of the network is shown in Fig.1.

Fig.1 architecture of the network.
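The per-step binary masks described in the overview can be illustrated with a small sketch. The function name `make_mask_schedule` and the shuffled round-robin assignment are assumptions made for illustration only; the repository may order pixels differently. The one property taken from the text is that each mask marks the pixels handled in its step, and the step count can range from one to the total number of pixels.

```python
import random


def make_mask_schedule(height, width, num_steps, seed=0):
    """Split an image grid into `num_steps` disjoint binary masks.

    Hypothetical helper: a 1 in mask k marks a pixel that would be
    encoded/decoded in step k. Every pixel appears in exactly one mask.
    """
    if not 1 <= num_steps <= height * width:
        raise ValueError("num_steps must be between 1 and height * width")

    coords = [(r, c) for r in range(height) for c in range(width)]
    # Assumption: a fixed pseudo-random order, so that the encoder and the
    # decoder can rebuild the same pre-determined schedule from the seed.
    random.Random(seed).shuffle(coords)

    masks = [[[0] * width for _ in range(height)] for _ in range(num_steps)]
    for index, (r, c) in enumerate(coords):
        masks[index % num_steps][r][c] = 1  # round-robin over the steps
    return masks


if __name__ == "__main__":
    # A 32x32 image (the CIFAR-10 resolution) split over 4 steps.
    schedule = make_mask_schedule(32, 32, num_steps=4)
    for step, mask in enumerate(schedule):
        print(step, sum(map(sum, mask)))  # number of pixels in each step
```

Using fewer steps puts more pixels in each mask, which means fewer network passes but less context per pixel. Using more steps moves towards the fully sequential case.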

Results

We evaluate our bpsp (bits per sub-pixel) on the CIFAR-10 test set and compare it with recent lossless image compression methods in Table.1.

Table.1 bpsp comparison
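For readers unfamiliar with the metric, bpsp can be estimated from the probabilities a model assigns to the true sub-pixel values. The following is a minimal sketch and not the repository's evaluation code. The function name and the `side_info_bits` parameter (standing in for the size of a conditioning code such as the BPG code) are assumptions. The result is the ideal code length; an arithmetic coder comes close to it with a small overhead.

```python
import math


def bits_per_subpixel(true_value_probs, side_info_bits=0.0):
    """Ideal bpsp given the model's probability for each true sub-pixel value.

    true_value_probs: one probability in (0, 1] per sub-pixel
                      (height * width * channels entries for one image).
    side_info_bits:   hypothetical extra bits spent on a conditioning code.
    """
    if not true_value_probs:
        raise ValueError("need at least one sub-pixel probability")
    # Each sub-pixel costs -log2(p) bits under an ideal entropy coder.
    model_bits = -sum(math.log2(p) for p in true_value_probs)
    return (model_bits + side_info_bits) / len(true_value_probs)


if __name__ == "__main__":
    # Toy input: 8 sub-pixels, each true value predicted with probability 0.5.
    print(bits_per_subpixel([0.5] * 8))
    # Same predictions, plus 8 bits of side information.
    print(bits_per_subpixel([0.5] * 8, side_info_bits=8))
```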

Usage:

To train the model:

  • python main.py

Generation WIP ...
