Repaint and LaMa Demo (IDEA-Research#259)
* refine
* add lama demo
* refine readme
* update
* update
* add repaint
Showing 10 changed files with 321 additions and 9 deletions.

**playground/LaMa/README.md** (new file, 87 lines)
## LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions

:grapes: [[Official Project Page](https://advimman.github.io/lama-project/)] :apple: [[LaMa Cleaner](https://github.com/Sanster/lama-cleaner)]

We use the well-organized [lama-cleaner](https://github.com/Sanster/lama-cleaner) codebase to keep the demo code simple for users.

<div align="center">

![](https://user-images.githubusercontent.com/61612323/230764397-1a17aa34-3646-4529-845d-17f02694e4c3.png)

</div>

## Abstract

> Modern image inpainting systems, despite the significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on: a new inpainting network architecture that uses fast Fourier convolutions, which have the image-wide receptive field; a high receptive field perceptual loss; and large training masks, which unlock the potential of the first two components. Our inpainting network improves the state-of-the-art across a range of datasets and achieves excellent performance even in challenging scenarios, e.g. completion of periodic structures. Our model generalizes surprisingly well to resolutions that are higher than those seen at train time, and achieves this at lower parameter & compute costs than the competitive baselines.
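
To give a feel for the "image-wide receptive field" mentioned above, here is a toy sketch of the spectral branch of a fast Fourier convolution in PyTorch (our own simplification; the paper's full FFC block also has a local convolutional branch, normalization, and activations). A pointwise convolution applied in the frequency domain mixes information across the entire image at once:

```python
import torch
import torch.nn as nn

class ToySpectralConv(nn.Module):
    """Toy sketch: a 1x1 conv in the Fourier domain has a global receptive field."""
    def __init__(self, channels: int):
        super().__init__()
        # Real and imaginary parts are stacked along the channel dimension.
        self.conv = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")          # (b, c, h, w//2 + 1), complex
        freq = torch.cat([freq.real, freq.imag], dim=1)  # (b, 2c, h, w//2 + 1), real
        freq = self.conv(freq)                           # mix all frequencies pointwise
        real, imag = freq.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```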

## Table of Contents
- [Installation](#installation)
- [LaMa Demos](#lama-demos)
  - [LaMa Demo with lama-cleaner](#lama-demo-with-lama-cleaner)
  - [LaMa with SAM](#lama-with-sam)

## TODO
- [x] LaMa Demo with lama-cleaner
- [x] LaMa with SAM
- [ ] LaMa with GroundingDINO
- [ ] LaMa with Grounded-SAM
## Installation
We use lama-cleaner for this demo; install it as follows:
```bash
pip install lama-cleaner
```
Please refer to [lama-cleaner](https://github.com/Sanster/lama-cleaner) for more details.
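
As an aside, lama-cleaner also ships an interactive web UI, handy for quick code-free experiments (command taken from the lama-cleaner README):

```bash
lama-cleaner --model=lama --device=cpu --port=8080
```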

Then install Grounded-SAM following the [Grounded-SAM Installation](https://github.com/IDEA-Research/Grounded-Segment-Anything#installation) instructions for the extension demos.
## LaMa Demos
Here we provide the demos for `LaMa`.

### LaMa Demo with lama-cleaner

```bash
cd playground/LaMa
python lama_inpaint_demo.py
```

Thanks to the well-organized lama-cleaner API, this demo takes only about 20 lines of code. The result will be saved as `lama_inpaint_demo.jpg`:

<div align="center">

| Input Image | Mask | Inpaint Output |
|:----:|:----:|:----:|
| ![](https://raw.githubusercontent.com/Sanster/lama-cleaner/main/assets/dog.jpg) | ![](https://user-images.githubusercontent.com/3998421/202105351-9fcc4bf8-129d-461a-8524-92e4caad431f.png) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/lama_inpaint_demo.jpg?raw=true) |

</div>

### LaMa with SAM

```bash
cd playground/LaMa
python sam_lama.py
```

**Tips**
To get better inpainting results, **dilate the mask first** so that it is slightly larger and fully covers the region to be removed (many thanks to [Inpaint-Anything](https://github.com/geekyutao/Inpaint-Anything) and [Tao Yu](https://github.com/geekyutao) for this tip); see the sketch below.
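
As a minimal sketch of that step (the same helper appears in `sam_lama.py` further down), dilation with a square kernel grows the mask by roughly half the kernel size in every direction:

```python
import cv2
import numpy as np

def dilate_mask(mask: np.ndarray, dilate_factor: int = 15) -> np.ndarray:
    """Grow a binary {0, 255} mask so inpainting fully covers the object."""
    kernel = np.ones((dilate_factor, dilate_factor), np.uint8)
    return cv2.dilate(mask, kernel, iterations=1)
```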

The `original mask` and `dilated mask` are shown below:

<div align="center">

| Mask | Dilated Mask |
|:---:|:---:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/dilated_mask.png?raw=true) |

</div>

The inpainting result will be saved as `sam_lama_demo.jpg`:

| Input Image | SAM Output | Dilated Mask | LaMa Inpaint |
|:---:|:---:|:---:|:---:|
| ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/paint_by_example/input_image.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/dilated_mask.png?raw=true) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/lama/sam_lama_demo.jpg?raw=true) |

**playground/LaMa/lama_inpaint_demo.py** (new file, 25 lines)
```python
import cv2
import numpy as np
import PIL.Image
import PIL.ImageOps
import requests

from lama_cleaner.model.lama import LaMa
from lama_cleaner.schema import Config


def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)  # respect EXIF orientation
    image = image.convert("RGB")
    return image


img_url = "https://raw.githubusercontent.com/Sanster/lama-cleaner/main/assets/dog.jpg"
mask_url = "https://user-images.githubusercontent.com/3998421/202105351-9fcc4bf8-129d-461a-8524-92e4caad431f.png"

image = np.asarray(download_image(img_url))
mask = np.asarray(download_image(mask_url).convert("L"))  # single-channel mask

# Set the device to "cuda" for faster inference.
model = LaMa("cpu")
result = model(
    image,
    mask,
    Config(
        hd_strategy="Original",             # inpaint at the original resolution
        ldm_steps=20,
        hd_strategy_crop_margin=128,
        hd_strategy_crop_trigger_size=800,
        hd_strategy_resize_limit=800,
    ),
)
cv2.imwrite("lama_inpaint_demo.jpg", result)
```
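
A note on the `Config` above: `hd_strategy="Original"` inpaints at full resolution. lama-cleaner also offers `"Crop"` and `"Resize"` strategies for large images (per its `HDStrategy` schema); a sketch of the crop variant:

```python
from lama_cleaner.schema import Config

# Sketch: inpaint only a crop around the mask once the image exceeds 800 px.
crop_config = Config(
    hd_strategy="Crop",
    ldm_steps=20,
    hd_strategy_crop_margin=128,        # context kept around the mask crop
    hd_strategy_crop_trigger_size=800,  # cropping kicks in above this size
    hd_strategy_resize_limit=800,
)
```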

**playground/LaMa/sam_lama.py** (new file, 96 lines)
```python
# Requires lama-cleaner plus segment-anything (installed with Grounded-SAM).

import cv2
import numpy as np
import PIL.Image
import PIL.ImageOps
import requests
from PIL import Image

from segment_anything import sam_model_registry, SamPredictor

from lama_cleaner.model.lama import LaMa
from lama_cleaner.schema import Config


"""
Step 1: Download and preprocess the demo image
"""
def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)  # respect EXIF orientation
    image = image.convert("RGB")
    return image


img_url = "https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/paint_by_example/input_image.png?raw=true"

init_image = download_image(img_url)
init_image = np.asarray(init_image)


"""
Step 2: Initialize the SAM and LaMa models
"""

DEVICE = "cuda:1"

# SAM
SAM_ENCODER_VERSION = "vit_h"
SAM_CHECKPOINT_PATH = "/comp_robot/rentianhe/code/Grounded-Segment-Anything/sam_vit_h_4b8939.pth"
sam = sam_model_registry[SAM_ENCODER_VERSION](checkpoint=SAM_CHECKPOINT_PATH).to(device=DEVICE)
sam_predictor = SamPredictor(sam)
sam_predictor.set_image(init_image)

# LaMa
model = LaMa(DEVICE)


"""
Step 3: Get a mask with SAM from a prompt (box or point) and inpaint the masked region with LaMa.
"""

input_point = np.array([[350, 256]])  # (x, y) pixel coordinates on the object
input_label = np.array([1])  # 1 = positive (foreground) point

masks, _, _ = sam_predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=False
)
masks = masks.astype(np.uint8) * 255  # bool -> {0, 255}


"""
Step 4: Dilate the mask to make it more suitable for LaMa inpainting.
The idea behind dilating the mask is to cover a slightly larger region, which inpaints better.
Borrowed from Inpaint-Anything: https://github.com/geekyutao/Inpaint-Anything/blob/main/utils/utils.py#L18
"""

def dilate_mask(mask, dilate_factor=15):
    mask = mask.astype(np.uint8)
    mask = cv2.dilate(
        mask,
        np.ones((dilate_factor, dilate_factor), np.uint8),
        iterations=1
    )
    return mask


def save_array_to_img(img_arr, img_p):
    Image.fromarray(img_arr.astype(np.uint8)).save(img_p)


# [1, 512, 512] -> [512, 512]: save the first (and only) mask
save_array_to_img(masks[0], "./mask.png")

mask = dilate_mask(masks[0], dilate_factor=15)
save_array_to_img(mask, "./dilated_mask.png")


"""
Step 5: Run the LaMa inpainting model
"""
result = model(
    init_image,
    mask,
    Config(
        hd_strategy="Original",
        ldm_steps=20,
        hd_strategy_crop_margin=128,
        hd_strategy_crop_trigger_size=800,
        hd_strategy_resize_limit=800,
    ),
)
cv2.imwrite("sam_lama_demo.jpg", result)
```
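
Step 3 of the script above mentions box prompts as an alternative; with `SamPredictor`, a box is given in XYXY pixel coordinates. A sketch reusing the `sam_predictor` from the script (the box values here are illustrative, not tuned for the demo image):

```python
import numpy as np

# Illustrative XYXY box around the object, replacing the point prompt.
input_box = np.array([250, 150, 450, 360])

masks, _, _ = sam_predictor.predict(
    box=input_box,
    multimask_output=False,
)
```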

**playground/RePaint/README.md** (new file, 55 lines)
## RePaint: Inpainting using Denoising Diffusion Probabilistic Models

:grapes: [[Official Project Page](https://github.com/andreas128/RePaint)]

<div align="center">

![](https://user-images.githubusercontent.com/11280511/236838708-26978a0d-0a04-4e24-a8f5-a4a7b1a58757.png)

</div>

## Abstract

> Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks. RePaint outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions.
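
The conditioning trick described above can be made concrete with a short schematic (our own sketch with illustrative names, not the `diffusers` internals): at every reverse step, the known region is re-noised from the original image and stitched together with the freshly denoised unknown region. The paper's resampling loop, controlled by `jump_length`/`jump_n_sample` in the demo below, repeatedly jumps back and forth in time to harmonize the two regions and is omitted here.

```python
import torch

def repaint_reverse_step(model, scheduler, x_t, x0_known, known_mask, t):
    # Schematic RePaint step; known_mask is 1 where pixels are given.
    # 1) Forward-diffuse the known pixels of the original image to this step's noise level.
    noise = torch.randn_like(x0_known)
    x_known = scheduler.add_noise(x0_known, noise, t)
    # 2) One ordinary DDPM reverse (denoising) step on the current sample.
    eps = model(x_t, t).sample
    x_unknown = scheduler.step(eps, t, x_t).prev_sample
    # 3) Stitch: given pixels come from (1), generated pixels from (2).
    return known_mask * x_known + (1 - known_mask) * x_unknown
```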

## Table of Contents
- [Installation](#installation)
- [RePaint Demos](#repaint-demos)
  - [RePaint Diffuser Demos](#repaint-diffuser-demos)

## TODO
- [x] RePaint Diffuser Demo
- [ ] RePaint with SAM
- [ ] RePaint with GroundingDINO
- [ ] RePaint with Grounded-SAM

## Installation
We use RePaint with diffusers; install diffusers as follows:
```bash
pip install diffusers==0.16.1
```
Then install Grounded-SAM following the [Grounded-SAM Installation](https://github.com/IDEA-Research/Grounded-Segment-Anything#installation) instructions for the extension demos.

## RePaint Demos
Here we provide the demos for `RePaint`.

### RePaint Diffuser Demos
```bash
cd playground/RePaint
python repaint.py
```
**Notes:** set `cache_dir` to save the pretrained weights to a specific folder. The inpainting result will be saved as `repaint_demo.jpg`:

<div align="center">

| Input Image | Mask | Inpaint Result |
|:----:|:----:|:----:|
| ![](https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png) | ![](https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png) | ![](https://github.com/IDEA-Research/detrex-storage/blob/main/assets/grounded_sam/repaint/repaint_demo.jpg?raw=true) |

</div>

**playground/RePaint/repaint.py** (new file, 40 lines)
```python
from io import BytesIO

import torch

import PIL.Image
import requests
from diffusers import RePaintPipeline, RePaintScheduler


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png"
mask_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png"

# Load the original image and the mask as PIL images
original_image = download_image(img_url).resize((256, 256))
mask_image = download_image(mask_url).resize((256, 256))

# Load the RePaint scheduler and pipeline based on a pretrained DDPM model
DEVICE = "cuda:1"
CACHE_DIR = "/comp_robot/rentianhe/weights/diffusers/"
scheduler = RePaintScheduler.from_pretrained("google/ddpm-ema-celebahq-256", cache_dir=CACHE_DIR)
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler, cache_dir=CACHE_DIR)
pipe = pipe.to(DEVICE)

generator = torch.Generator(device=DEVICE).manual_seed(0)
output = pipe(
    image=original_image,
    mask_image=mask_image,
    num_inference_steps=250,
    eta=0.0,           # amount of extra noise per step; 0.0 keeps steps deterministic
    jump_length=10,    # how far RePaint jumps back in time when resampling
    jump_n_sample=10,  # how many times each region is resampled per jump
    generator=generator,
)
inpainted_image = output.images[0]
inpainted_image.save("./repaint_demo.jpg")
```