Merge pull request #94 from Atm1801/DIRE

⚡ Add Summary for DIRE
vlgiitr · May 21, 2024 · 0961609
2 parents bea27dc + 378d27e
commit 0961609
Showing 2 changed files with 58 additions and 0 deletions.
diff --git a/images/DIRE.png b/images/DIRE.png
diff --git a/summaries/DIRE.md b/summaries/DIRE.md
@@ -0,0 +1,58 @@
+
+# DIRE for Diffusion-Generated Image Detection
+Zhendong Wang, Jianmin Bao, Wengang Zhou, Weilun Wang, Hezhen Hu, Hong Chen, Houqiang Li **ICCV**  **2023**
+
+## Summary
+
+
+This paper builds a detector that distinguishes real images from diffusion-generated images. It proposes a novel image representation called **DI**ffusion **R**econstruction **E**rror (DIRE), which measures the error between an input image and its reconstruction by a pre-trained diffusion model. DIRE rests on the observation that a pre-trained diffusion model reconstructs diffusion-generated images more accurately than it reconstructs real images.
+
+
+## Contributions
+
+
+- Proposes DIRE, a novel image representation for detecting diffusion-generated images.
+- Introduces DiffusionForensics, a new benchmark dataset for diffusion-generated image detectors, with images from three domains (LSUN-Bedroom, ImageNet and CelebA-HQ) generated by eleven different diffusion models.
+
+
+## Method
+
+
+To judge whether an input image x<sub>0</sub> was generated by a diffusion model, we take a pre-trained diffusion model and apply the DDIM inversion process, which gradually adds Gaussian noise to x<sub>0</sub>. The DDIM generation process is then run on the resulting noisy image to reconstruct the input, producing a recovered version x'<sub>0</sub>. DIRE is defined as the absolute difference between the input and its reconstruction:
+
+
+$$
+DIRE(x_{0}) = |x_{0} - x'_{0}|
+$$
+
+<img  src='../images/DIRE.png'> **Illustration of the difference between a real sample and a generated sample**
+
+p<sub>g</sub>(x) represents the distribution of generated images, while p<sub>r</sub>(x) represents the distribution of real images. x<sub>g</sub> and x<sub>r</sub> denote a generated sample and a real sample, respectively. After the DDIM inversion and reconstruction process, x<sub>g</sub> and x<sub>r</sub> become x'<sub>g</sub> and x'<sub>r</sub>, respectively.
+
+Because a sample x<sub>g</sub> from the generated distribution p<sub>g</sub>(x) and its reconstruction x'<sub>g</sub> belong to the same distribution, the DIRE value for x<sub>g</sub> is relatively low. Conversely, the reconstruction x'<sub>r</sub> of a real image x<sub>r</sub> is likely to differ significantly from the original, resulting in a high-amplitude DIRE.
+
+
+Thus, we compute the DIRE representations of both real and diffusion-generated images and train a binary classifier on them with a binary cross-entropy loss.
+
+
+## Results
+
+- A binary classifier trained on DIRE significantly outperformed existing detectors, including CNNDetection, GANDetection, SBI, PatchForensics and F3Net, at detecting:
+	* Diffusion-generated bedroom images
+	* Diffusion-generated face images
+	* Generated ImageNet images
+	* GAN-generated bedroom images
+
+- Robustness was evaluated under two classes of degradation, Gaussian blur and JPEG compression. DIRE kept its perfect performance, with no drop under either degradation.
+
+- Alternative classifier inputs were also compared against DIRE: RGB images, reconstructed images (REC), and the combination of RGB and DIRE (RGB&DIRE). Using DIRE alone as input achieved significantly higher accuracy.
+
+
+## Two-Cents
+
+The proposed image representation DIRE yields a novel, accurate and robust detector that outperforms current SOTA detection models by a wide margin.
+
+## Resources
+
+- [Paper](https://arxiv.org/pdf/2303.09295.pdf)
+- [Implementation](https://github.com/ZhendongWang6/DIRE)
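
A note on the Method section of `summaries/DIRE.md`: the DIRE formula and the hypothesis behind it can be illustrated with a small sketch. This is not the authors' implementation. The function `toy_reconstruct` is a hypothetical stand-in for DDIM inversion followed by DDIM generation: it can only reproduce pixel values on a coarse grid, which plays the role of the generated distribution p<sub>g</sub>(x). The names `dire`, `mean_dire`, `generated` and `real` are likewise assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]


def dire(x0: Image, reconstruct: Callable[[Image], Image]) -> Image:
    """Per-pixel absolute error |x0 - x0'| between an image and its reconstruction."""
    x0_rec = reconstruct(x0)
    return [
        [abs(p - p_rec) for p, p_rec in zip(row, row_rec)]
        for row, row_rec in zip(x0, x0_rec)
    ]


def toy_reconstruct(x: Image, step: float = 0.25) -> Image:
    """HYPOTHETICAL stand-in for DDIM inversion + generation.

    Snaps every pixel to the nearest multiple of `step`, so only images
    already on that grid (the toy "generated distribution") are
    reproduced exactly. A real pipeline would run a pre-trained
    diffusion model here instead.
    """
    return [[round(p / step) * step for p in row] for row in x]


def mean_dire(x0: Image, reconstruct: Callable[[Image], Image]) -> float:
    """Average DIRE over all pixels, a scalar summary of the error map."""
    error_map = dire(x0, reconstruct)
    n_pixels = sum(len(row) for row in error_map)
    return sum(sum(row) for row in error_map) / n_pixels


# A "generated" sample lies on the grid; a "real" sample does not.
generated: Image = [[0.0, 0.25], [0.5, 1.0]]
real: Image = [[0.1, 0.3], [0.62, 0.9]]

print("mean DIRE (generated):", mean_dire(generated, toy_reconstruct))
print("mean DIRE (real):     ", mean_dire(real, toy_reconstruct))
```

In this toy setting the on-grid image is reconstructed exactly, so its DIRE is zero everywhere, while the off-grid image has a mean DIRE of roughly 0.09. That gap is the signal the binary classifier described in the summary would be trained on. With an actual diffusion model the separation is statistical rather than exact.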