Commit: Repaint and LaMa Demo (IDEA-Research#259)

* refine

* add lama demo

* refine readme

* update

* update

* add repaint
rentainhe authored May 15, 2023
1 parent a53b989 commit f90b798
Showing 10 changed files with 321 additions and 9 deletions.
9 changes: 6 additions & 3 deletions README.md
@@ -14,8 +14,9 @@ We are very willing to **help everyone share and promote new projects** based on
The **core idea** behind this project is to **combine the strengths of different models in order to build a very powerful pipeline for solving complex problems**. It's worth mentioning that this is a workflow for combining strong expert models, where **all parts can be used separately or in combination, and can be replaced with any similar but different models (like replacing Grounding DINO with GLIP or other detectors / replacing Stable-Diffusion with ControlNet or GLIGEN / combining with ChatGPT)**.

**🍇 Updates**
- **`2023/05/14`**: Release [PaintByExample](./playground/generation/PaintByExample/) demo with SAM.
- **`2023/05/11`**: We decided to share more interesting demos in [playground](./playground/) and we've already tested [DeepFloyd](./playground/generation/DeepFloyd/) for image generation and style transfer, and shared some notes about using IF.
- **`2023/05/15`**: Release [LaMa](./playground/LaMa/) and [RePaint](./playground/RePaint/) demos; thanks to [Tao Yu](https://github.com/geekyutao) for the nice tips.
- **`2023/05/14`**: Release [PaintByExample](./playground/PaintByExample/) demo with SAM.
- **`2023/05/11`**: We decided to share more interesting demos in [playground](./playground/) and we've already tested [DeepFloyd](./playground/DeepFloyd/) for image generation and style transfer, and shared some notes about using IF.
- **`2023/05/05`**: Release a simpler code for automatic labeling (combined with Tag2Text model): please see [automatic_label_simple_demo.py](./automatic_label_simple_demo.py)
- **`2023/05/03`**: Check out the [Automated Dataset Annotation and Evaluation with GroundingDINO and SAM](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb), which is an amazing tutorial on automatic labeling! Thanks a lot to [Piotr Skalski](https://github.com/SkalskiP) and [Roboflow](https://github.com/roboflow/notebooks)!

@@ -42,9 +43,11 @@ The **core idea** behind this project is to **combine the strengths of different
- [Interactive Fashion-Edit Playground: Click for Segmentation And Editing](#dancers-interactive-editing)
- [Interactive Human-face Editing Playground: Click And Editing Human Face](#dancers-interactive-editing)
- [3D Box Via Segment Anything](#camera-3d-box-via-segment-anything)
- [Playground: More Interesting and Imaginative Demos](./playground/)
- [Playground: More Interesting and Imaginative Demos with Grounded-SAM](./playground/)
- [DeepFloyd: Image Generation with Text Prompt](./playground/DeepFloyd/)
- [PaintByExample: Exemplar-based Image Editing with Diffusion Models](./playground/PaintByExample/)
- [LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions](./playground/LaMa/)
- [RePaint: Inpainting using Denoising Diffusion Probabilistic Models](./playground/RePaint/)


## Preliminary Works
3 changes: 0 additions & 3 deletions grounded_sam_inpainting_demo.py
@@ -138,9 +138,6 @@ def show_box(box, ax, label):
inpaint_prompt = args.inpaint_prompt
output_dir = args.output_dir
cache_dir=args.cache_dir
# if not os.path.exists(cache_dir):
# print(f"create your cache dir:{cache_dir}")
# os.mkdir(cache_dir)
box_threshold = args.box_threshold
text_threshold = args.text_threshold
inpaint_mode = args.inpaint_mode
4 changes: 4 additions & 0 deletions playground/DeepFloyd/README.md
@@ -122,6 +122,8 @@ export FORCE_MEM_EFFICIENT_ATTN=1
### Dream
The `text-to-image` mode for DeepFloyd:
```bash
cd playground/DeepFloyd

export FORCE_MEM_EFFICIENT_ATTN=1
python dream.py
```
@@ -147,6 +149,8 @@ Download the original image from [here](https://github.com/IDEA-Research/detrex-
</div>

```bash
cd playground/DeepFloyd

export FORCE_MEM_EFFICIENT_ATTN=1
python style_transfer.py
```
87 changes: 87 additions & 0 deletions playground/LaMa/README.md
@@ -0,0 +1,87 @@
## LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions

:grapes: [[Official Project Page](https://advimman.github.io/lama-project/)] &nbsp; :apple:[[LaMa Cleaner](https://github.com/Sanster/lama-cleaner)]

We use the well-organized [lama-cleaner](https://github.com/Sanster/lama-cleaner) codebase to simplify the demo code for users.

<div align="center">

![](https://raw.githubusercontent.com/senya-ashukha/senya-ashukha.github.io/master/projects/lama_21/ezgif-4-0db51df695a8.gif)

</div>

## Abstract

> Modern image inpainting systems, despite the significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on: a new inpainting network architecture that uses fast Fourier convolutions, which have the image-wide receptive field; a high receptive field perceptual loss; and large training masks, which unlock the potential of the first two components. Our inpainting network improves the state-of-the-art across a range of datasets and achieves excellent performance even in challenging scenarios, e.g. completion of periodic structures. Our model generalizes surprisingly well to resolutions that are higher than those seen at train time, and achieves this at lower parameter & compute costs than the competitive baselines.
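
To make the "image-wide receptive field" idea concrete, here is a toy PyTorch sketch of an FFC-style spectral branch — a simplified illustration, not the official LaMa implementation; the layer shapes and single 1×1 convolution are assumptions:

```python
import torch
import torch.nn as nn

class SpectralTransform(nn.Module):
    """Toy FFC spectral branch: a 1x1 conv applied in the Fourier domain
    couples every spatial location, giving an image-wide receptive field."""

    def __init__(self, channels: int):
        super().__init__()
        # real and imaginary parts are stacked along the channel axis
        self.conv = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ffted = torch.fft.rfft2(x, norm="ortho")        # to frequency domain
        f = torch.cat([ffted.real, ffted.imag], dim=1)  # complex -> 2C real channels
        f = self.conv(f)                                # mix all positions globally
        real, imag = f.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag),
                                s=x.shape[-2:], norm="ortho")  # back to pixels

# quick shape check
out = SpectralTransform(8)(torch.randn(1, 8, 64, 64))
assert out.shape == (1, 8, 64, 64)
```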

## Table of Contents
- [Installation](#installation)
- [LaMa Demos](#lama-demos)
  - [LaMa Demo with lama-cleaner](#lama-demo-with-lama-cleaner)
  - [LaMa with SAM](#lama-with-sam)


## TODO
- [x] LaMa Demo with lama-cleaner
- [x] LaMa with SAM
- [ ] LaMa with GroundingDINO
- [ ] LaMa with Grounded-SAM


## Installation
We're using lama-cleaner for this demo; install it as follows:
```bash
pip install lama-cleaner
```
Please refer to [lama-cleaner](https://github.com/Sanster/lama-cleaner) for more details.

Then install Grounded-SAM following the [Grounded-SAM Installation](https://github.com/IDEA-Research/Grounded-Segment-Anything#installation) guide for the extension demos.

## LaMa Demos
Here we provide the demos for `LaMa`.

### LaMa Demo with lama-cleaner

```bash
cd playground/LaMa
python lama_inpaint_demo.py
```
With the highly organized lama-cleaner code, this demo can be done in about 20 lines. The result will be saved as `lama_inpaint_demo.jpg`:

<div align="center">

| Input Image | Mask | Inpaint Output |
|:----:|:----:|:----:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/example.jpg?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/lama_inpaint_demo.jpg?raw=true) |

</div>

### LaMa with SAM

```bash
cd playground/LaMa
python sam_lama.py
```

**Tips:**
For better inpainting results, **dilate the mask first** so it is slightly larger and covers the whole target region (thanks to [Inpaint-Anything](https://github.com/geekyutao/Inpaint-Anything) and [Tao Yu](https://github.com/geekyutao) for this tip).
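
A minimal sketch of the dilation step — this mirrors the `dilate_mask` helper in `sam_lama.py` below; the kernel size of 15 is the default used there:

```python
import cv2
import numpy as np

def dilate_mask(mask: np.ndarray, dilate_factor: int = 15) -> np.ndarray:
    """Grow the binary mask so it fully covers the region to remove."""
    kernel = np.ones((dilate_factor, dilate_factor), np.uint8)
    return cv2.dilate(mask.astype(np.uint8), kernel, iterations=1)
```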


The `original mask` and `dilated mask` are shown as follows:

<div align="center">

| Mask | Dilated Mask |
|:---:|:---:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/sam_demo_mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/dilated_mask.png?raw=true) |

</div>


And the inpaint result will be saved as `sam_lama_demo.jpg`:

| Input Image | SAM Output | Dilated Mask | LaMa Inpaint |
|:---:|:---:|:---:|:---:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/paint_by_example/input_image.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/paint_by_example/demo_with_point_prompt.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/dilated_mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/sam_lama_demo.jpg?raw=true) |

25 changes: 25 additions & 0 deletions playground/LaMa/lama_inpaint_demo.py
@@ -0,0 +1,25 @@
import cv2
import PIL.Image
import PIL.ImageOps
import requests
import numpy as np
from lama_cleaner.model.lama import LaMa
from lama_cleaner.schema import Config


def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image


img_url = "https://raw.githubusercontent.com/Sanster/lama-cleaner/main/assets/dog.jpg"
mask_url = "https://user-images.githubusercontent.com/3998421/202105351-9fcc4bf8-129d-461a-8524-92e4caad431f.png"

image = np.asarray(download_image(img_url))
mask = np.asarray(download_image(mask_url).convert("L"))

# pass "cuda" instead of "cpu" for faster inference on GPU
model = LaMa("cpu")
config = Config(
    hd_strategy="Original",            # inpaint at the original resolution
    ldm_steps=20,
    hd_strategy_crop_margin=128,
    hd_strategy_crop_trigger_size=800,
    hd_strategy_resize_limit=800,
)
result = model(image, mask, config)
cv2.imwrite("lama_inpaint_demo.jpg", result)
96 changes: 96 additions & 0 deletions playground/LaMa/sam_lama.py
@@ -0,0 +1,96 @@
# !pip install lama-cleaner segment-anything

import requests
import cv2
import numpy as np
import PIL.Image
import PIL.ImageOps
from PIL import Image

from segment_anything import sam_model_registry, SamPredictor

from lama_cleaner.model.lama import LaMa
from lama_cleaner.schema import Config

"""
Step 1: Download and preprocess demo images
"""
def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image


img_url = "https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/paint_by_example/input_image.png?raw=true"


init_image = download_image(img_url)
init_image = np.asarray(init_image)


"""
Step 2: Initialize SAM and LaMa models
"""

DEVICE = "cuda:1"

# SAM
SAM_ENCODER_VERSION = "vit_h"
SAM_CHECKPOINT_PATH = "/comp_robot/rentianhe/code/Grounded-Segment-Anything/sam_vit_h_4b8939.pth"
sam = sam_model_registry[SAM_ENCODER_VERSION](checkpoint=SAM_CHECKPOINT_PATH).to(device=DEVICE)
sam_predictor = SamPredictor(sam)
sam_predictor.set_image(init_image)

# LaMa
model = LaMa(DEVICE)


"""
Step 3: Get masks with SAM by prompt (box or point) and inpaint the mask region by example image.
"""

input_point = np.array([[350, 256]])
input_label = np.array([1]) # positive label

masks, _, _ = sam_predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=False,
)
masks = masks.astype(np.uint8) * 255
# mask_pil = Image.fromarray(masks[0]) # simply save the first mask


"""
Step 4: Dilate Mask to make it more suitable for LaMa inpainting
The idea behind dilate mask is to mask a larger region which will be better for inpainting.
Borrowed from Inpaint-Anything: https://github.com/geekyutao/Inpaint-Anything/blob/main/utils/utils.py#L18
"""

def dilate_mask(mask, dilate_factor=15):
    mask = mask.astype(np.uint8)
    mask = cv2.dilate(
        mask,
        np.ones((dilate_factor, dilate_factor), np.uint8),
        iterations=1,
    )
    return mask


def save_array_to_img(img_arr, img_p):
    Image.fromarray(img_arr.astype(np.uint8)).save(img_p)

# [1, 512, 512] to [512, 512] and save mask
save_array_to_img(masks[0], "./mask.png")

mask = dilate_mask(masks[0], dilate_factor=15)

save_array_to_img(mask, "./dilated_mask.png")

"""
Step 5: Run LaMa inpaint model
"""
result = model(init_image, mask, Config(hd_strategy="Original", ldm_steps=20, hd_strategy_crop_margin=128, hd_strategy_crop_trigger_size=800, hd_strategy_resize_limit=800))
cv2.imwrite("sam_lama_demo.jpg", result)
4 changes: 2 additions & 2 deletions playground/PaintByExample/README.md
@@ -38,7 +38,7 @@ Here we provide the demos for `PaintByExample`

### PaintByExample Diffuser Demos
```bash
cd playground/generation/PaintByExample
cd playground/PaintByExample
python paint_by_example.py
```
**Notes:** set `cache_dir` to save the pretrained weights to a specific folder. The paint result will be saved as `paint_by_example_demo.jpg`:
@@ -59,7 +59,7 @@ In this demo, we did the inpainting task by:
2. Inpaint with mask and example image

```bash
cd playground/generation/PaintByExample
cd playground/PaintByExample
python sam_paint_by_example.py
```
**Notes:** We set a larger `num_inference_steps` (around 200 to 500) to get a higher-quality image. We've also found that the mask region strongly influences the final result (e.g., a panda cannot be inpainted well into a dog-shaped region); this needs more testing.
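
For reference, a minimal sketch of what raising `num_inference_steps` looks like with the diffusers `PaintByExamplePipeline` — the image paths here are hypothetical and the `Fantasy-Studio/Paint-by-Example` checkpoint is an assumption; the actual demo lives in `sam_paint_by_example.py`:

```python
from PIL import Image
from diffusers import PaintByExamplePipeline

# hypothetical local files; the real demo builds the mask with SAM
init_image = Image.open("input_image.png").resize((512, 512))
mask_image = Image.open("mask.png").resize((512, 512))
example_image = Image.open("example.jpg").resize((512, 512))

pipe = PaintByExamplePipeline.from_pretrained(
    "Fantasy-Studio/Paint-by-Example",
    cache_dir="./weights",  # set cache_dir to control where weights are stored
).to("cuda")

result = pipe(
    image=init_image,
    mask_image=mask_image,
    example_image=example_image,
    num_inference_steps=200,  # 200-500 for higher quality, per the note above
).images[0]
result.save("paint_by_example_demo.jpg")
```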
7 changes: 6 additions & 1 deletion playground/README.md
@@ -9,6 +9,11 @@ We will try more interesting **base models** and **build more fun demos** in the
- [DeepFloyd: Text-to-Image Generation](./DeepFloyd/)
- [Dream: Text-to-Image Generation](./DeepFloyd/dream.py)
- [Style Transfer](./DeepFloyd/style_transfer.py)
- [Paint By Example](./PaintByExample/)
- [Paint by Example: Exemplar-based Image Editing with Diffusion Models](./PaintByExample/)
- [Diffuser Demo](./PaintByExample/paint_by_example.py)
- [PaintByExample with SAM](./PaintByExample/sam_paint_by_example.py)
- [LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions](./LaMa/)
- [LaMa Demo](./LaMa/lama_inpaint_demo.py)
- [LaMa with SAM](./LaMa/sam_lama.py)
- [RePaint: Inpainting using Denoising Diffusion Probabilistic Models](./RePaint/)
- [RePaint Demo](./RePaint/repaint.py)
55 changes: 55 additions & 0 deletions playground/RePaint/README.md
@@ -0,0 +1,55 @@
## RePaint: Inpainting using Denoising Diffusion Probabilistic Models

:grapes: [[Official Project Page](https://github.com/andreas128/RePaint)]

<div align="center">

![](https://user-images.githubusercontent.com/11280511/150803812-a4729ef8-6ad4-46aa-ae99-8c27fbb2ea2e.png)

</div>

## Abstract

> Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks. RePaint outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions.
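
To make the conditioning step concrete, here is a minimal sketch of the per-iteration blending the abstract describes — an illustrative outline under simplified notation, not the official implementation:

```python
import torch

def repaint_condition(x_unknown: torch.Tensor,
                      x_known_noised: torch.Tensor,
                      mask: torch.Tensor) -> torch.Tensor:
    """One RePaint conditioning step: keep the model's reverse-diffusion sample
    inside the hole (mask == 1) and the forward-diffused ground truth elsewhere."""
    return mask * x_unknown + (1.0 - mask) * x_known_noised

# toy shapes: a batch of one 256x256 RGB sample
x_t = repaint_condition(torch.randn(1, 3, 256, 256),
                        torch.randn(1, 3, 256, 256),
                        torch.ones(1, 1, 256, 256))
```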


## Table of Contents
- [Installation](#installation)
- [RePaint Demos](#repaint-demos)
  - [RePaint Diffuser Demos](#repaint-diffuser-demos)


## TODO
- [x] RePaint Diffuser Demo
- [ ] RePaint with SAM
- [ ] RePaint with GroundingDINO
- [ ] RePaint with Grounded-SAM

## Installation
We're using RePaint with diffusers; install diffusers as follows:
```bash
pip install diffusers==0.16.1
```
Then install Grounded-SAM following the [Grounded-SAM Installation](https://github.com/IDEA-Research/Grounded-Segment-Anything#installation) guide for the extension demos.

## RePaint Demos
Here we provide the demos for `RePaint`.


### RePaint Diffuser Demos
```bash
cd playground/RePaint
python repaint.py
```
**Notes:** set `cache_dir` to save the pretrained weights to a specific folder. The paint result will be saved as `repaint_demo.jpg`:

<div align="center">

| Input Image | Mask | Inpaint Result |
|:----:|:----:|:----:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/repaint/celeba_hq_256.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/repaint/mask_256.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/repaint/repaint_demo.jpg?raw=true) |


</div>


40 changes: 40 additions & 0 deletions playground/RePaint/repaint.py
@@ -0,0 +1,40 @@
from io import BytesIO

import torch

import PIL.Image
import requests
from diffusers import RePaintPipeline, RePaintScheduler


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png"
mask_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png"

# Load the original image and the mask as PIL images
original_image = download_image(img_url).resize((256, 256))
mask_image = download_image(mask_url).resize((256, 256))

# Load the RePaint scheduler and pipeline based on a pretrained DDPM model
DEVICE = "cuda:1"
CACHE_DIR = "/comp_robot/rentianhe/weights/diffusers/"
scheduler = RePaintScheduler.from_pretrained("google/ddpm-ema-celebahq-256", cache_dir=CACHE_DIR)
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler, cache_dir=CACHE_DIR)
pipe = pipe.to(DEVICE)

generator = torch.Generator(device=DEVICE).manual_seed(0)
output = pipe(
    image=original_image,
    mask_image=mask_image,
    num_inference_steps=250,  # total denoising steps
    eta=0.0,  # 0.0 is deterministic (DDIM-like); 1.0 matches DDPM sampling
    jump_length=10,  # "j" in the RePaint paper: steps per jump in the resampling schedule
    jump_n_sample=10,  # number of resamplings at each jump
    generator=generator,
)
inpainted_image = output.images[0]
inpainted_image.save("./repaint_demo.jpg")
