Invertible Image Rescaling with FrEIA

This is a reimplementation of Xiao et al.'s Invertible Image Rescaling using the FrEIA framework; it was my final project for Part II of the Cambridge CS Tripos.

You can read my full writeup here.

Repository overview

The repository is divided into two projects: mnist_generation and image_rescaling.

  • mnist_generation is intended as a way of exploring FrEIA in isolation, with training experiments that run in minutes on a local CPU (see the sketch after this list).
  • image_rescaling models were trained on the Cambridge HPC GPU cluster, requiring integration with model-saving and progress-tracking callbacks.
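
As a rough illustration of what exploring FrEIA in isolation looks like, here is a minimal sketch of an invertible network in the style of FrEIA's quickstart; the block count and subnet width are illustrative choices, not the settings used in this repo:

```python
import torch
import torch.nn as nn
import FrEIA.framework as Ff
import FrEIA.modules as Fm

# Constructor for the small fully-connected subnet inside each
# coupling block (hidden width 512 is an arbitrary choice here).
def subnet_fc(dims_in, dims_out):
    return nn.Sequential(
        nn.Linear(dims_in, 512), nn.ReLU(),
        nn.Linear(512, dims_out),
    )

# An invertible network over flattened 28x28 MNIST digits.
inn = Ff.SequenceINN(28 * 28)
for _ in range(8):  # 8 coupling blocks, chosen arbitrarily
    inn.append(Fm.AllInOneBlock, subnet_constructor=subnet_fc, permute_soft=True)

x = torch.randn(16, 28 * 28)   # a dummy batch standing in for MNIST
z, log_jac_det = inn(x)        # forward pass: images -> latents
x_rec, _ = inn(z, rev=True)    # inverse pass recovers the inputs
print(torch.allclose(x, x_rec, atol=1e-4))
```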

Both projects follow a modified version of the Cookiecutter Data Science structure. YAML config files set up experiments; Slurm queues jobs on the HPC; wandb tracks runs; and PyTorch provides the underlying models. A rough sketch of this setup flow follows below.
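
As a hedged sketch of that flow (the config filename, keys, and training helper below are hypothetical, not the repo's actual schema):

```python
import yaml
import wandb

def train_one_step(cfg, step):
    """Placeholder for a real training step; returns a dummy loss."""
    return 1.0 / (step + 1)

# Read an experiment definition from a YAML config file.
with open("configs/irn_4x.yaml") as f:
    cfg = yaml.safe_load(f)   # e.g. {"scale": 4, "lr": 2e-4, "batch_size": 16}

# Track the run with wandb; on the HPC, a Slurm sbatch script
# would queue and launch this training script.
run = wandb.init(project="IRN-FrEIA", config=cfg)
for step in range(100):
    wandb.log({"train/loss": train_one_step(cfg, step)})
run.finish()
```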

The repo also contains IRN research ideas in experiments.md.

Links

CORE PAPER

BACKGROUND PAPERS

COMPETING PAPERS

  • HCFlow: Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling https://arxiv.org/pdf/2108.05301.pdf
    • They train on image sizes of 160x160 instead of 144x144...
    • They use a grad_norm clip value of 100 instead of 10...
    • They use a grad_value clip value of 5, where IRN uses none...
    • They perform an additional lr step at 450000 samples...
    • They use an initial learning rate of 2.5e-4 instead of 2e-4...
    • They use a (mean-reduced) loss with weights (r=1, g=0.05, d=0.00001) instead of a (sum-reduced) loss with weights (r=1, g=16, d=1)... (see the loss sketch after this list)
    • DIV2K 4x PSNR/SSIM: 35.23/0.9346, 4.4M params
  • FGRN: Approaching the Limit of Image Rescaling via Flow Guidance https://arxiv.org/pdf/2111.05133.pdf
    • Uses two non-invertible networks for compressed<->upscaled, one invertible network for compressed<->downscaled. I am slightly dubious as to how useful that really is.
    • In section 4.5, they train an IRN with z=0 instead of resampling z, and find that it achieves similar results. They conclude that this means z does not encode the information lost in downscaling. I think they might misunderstand the purpose of z. A better experiment would be to sample around z=0 and see whether the samples achieve similar PSNR (I expect they would if z ~ N(0,1), but might not if z = 0); this is sketched after this list.
    • DIV2K 4x PSNR/SSIM: 35.15/0.9322, 3.35M params
  • AIDN: Scale-arbitrary Invertible Image Downscaling https://arxiv.org/pdf/2201.12576.pdf
    • Outperforms IRN on non-power-of-two image rescaling
    • DIV2K 4x PSNR/SSIM: 34.94/?, 3.8M params
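
To make the HCFlow loss comparison above concrete, here is a minimal sketch of a three-term IRN-style objective. The individual terms are simplified stand-ins (L1 reconstruction, L2 guidance, a squared-norm latent penalty), not either paper's exact formulation; only the (r, g, d) weights and the sum-vs-mean reduction are the quantities being contrasted:

```python
import torch
import torch.nn.functional as F

def irn_style_loss(hr, hr_rec, lr, lr_guide, z, r, g, d, reduction):
    """Weighted combination of reconstruction, guidance, and distribution terms.

    Simplified stand-ins: L1 on the HR reconstruction, L2 between the
    model's LR output and a bicubic guide image, and a squared-norm
    penalty pushing the latent z towards N(0, 1).
    """
    l_recon = F.l1_loss(hr_rec, hr, reduction=reduction)
    l_guide = F.mse_loss(lr, lr_guide, reduction=reduction)
    l_distr = (z ** 2).sum() if reduction == "sum" else (z ** 2).mean()
    return r * l_recon + g * l_guide + d * l_distr

hr, hr_rec = torch.rand(1, 3, 144, 144), torch.rand(1, 3, 144, 144)
lr, lr_guide = torch.rand(1, 3, 36, 36), torch.rand(1, 3, 36, 36)
z = torch.randn(1, 3 * (144 * 144 - 36 * 36))  # latent dims lost in 4x downscaling

irn_loss    = irn_style_loss(hr, hr_rec, lr, lr_guide, z, r=1, g=16,   d=1,    reduction="sum")
hcflow_loss = irn_style_loss(hr, hr_rec, lr, lr_guide, z, r=1, g=0.05, d=1e-5, reduction="mean")
```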
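
And for the FGRN z=0 point: a hedged sketch of the experiment suggested above, sampling z at increasing scales and comparing the PSNR of the upscaled outputs. Here upscale is a placeholder for a trained IRN's inverse pass, and psnr is a standard helper; neither comes from the repo:

```python
import torch
import torch.nn.functional as F

def psnr(a, b, max_val=1.0):
    """Peak signal-to-noise ratio between two images, in dB."""
    mse = torch.mean((a - b) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)

def upscale(lr, z):
    """Placeholder for a trained IRN's inverse pass (LR + latent z -> HR).
    Here it is just bicubic upsampling perturbed by z, for illustration."""
    hr = F.interpolate(lr, scale_factor=4, mode="bicubic", align_corners=False)
    return hr + 0.01 * z

hr_gt = torch.rand(1, 3, 144, 144)  # dummy ground-truth HR image
lr = F.interpolate(hr_gt, scale_factor=0.25, mode="bicubic", align_corners=False)

# sigma=0 reproduces the paper's z=0 case; larger sigmas sample around it.
for sigma in [0.0, 0.25, 0.5, 1.0]:
    z = sigma * torch.randn_like(hr_gt)
    val = psnr(upscale(lr, z), hr_gt).item()
    print(f"sigma={sigma}: PSNR={val:.2f} dB")
```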

CAMBRIDGE HPC

OTHER
