Commit 9d95db0

committed
big update
1 parent e763eea commit 9d95db0

6 files changed

+171
-39
lines changed


README.md

+71-31
@@ -1,42 +1,80 @@
11
# sd-webui-ddsd
2-
A script that searches for specific keywords, inpaints them, and then upscales them
2+
An extension that performs post-processing automatically.
33

44
## What is
55
### Upscale
6-
Upscaling an image by a specific factor. Utilizes a tiled approach to scale with less memory
6+
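A tool that cuts the image into tiles of a given size and upscales each tile. Uses less VRAM while upscaling.
7+
#### Upscale How to use
8+
1. Select the upscaler model to use for enlarging.
9+
2. Select the scale factor.
10+
3. Set the width and height to the largest image size you can generate in a single pass (to keep generation as fast as possible).
11+
4. Check `before running`.
12+
1. When checked, upscaling runs first, which improves inpainting quality. However, inpainting then requires more VRAM.
13+
5. Generate!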
714
### Detect Detailer
8-
Inpainting with additional prompts after mask search with specific keywords. Add counts separated by semicolons
15+
A tool that searches the image for specific keywords and then inpaints the matches.
916
#### Detect Detailer How to use
10-
0. Enable Inpaint Inner(or Outer) Mask Area(Use I2I Only)
11-
1. When using the inpaint inner option, the mask is created only inside the inpaint mask.
12-
2. When using the inpaint outer option, the mask is created only outside the inpaint mask.
13-
1. Input dino prompt
14-
1. Inpaint the dino prompt multiple times, separated by tabs.
15-
2. Additional options can be controlled.
16-
3. Each dino prompt can be calculated with AND, OR, XOR, NOR, and NAND gates.
17-
1. face OR (body NAND outfit) -> Create a body mask that does not overlap with the outfit. And composited with a face mask.
18-
2. Use parentheses sparingly. Parentheses operations consume more VRAM because they generate masks in advance.
19-
4. Option values of each dino prompt can be entered by separating them with colons.
17+
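0. Restrict the inpainting area (I2I only).
18+
1. The Inner option searches only inside the area painted in the I2I inpaint mask.
19+
2. The Outer option searches only outside the area painted in the I2I inpaint mask.
20+
1. Enter the search keywords.
21+
1. Enter the keywords to search for (face, person, etc.).
22+
1. Keywords can also be phrases (happy face, running dog).
23+
2. Keywords can be separated with `.` (face. arm, face. chest).
24+
2. Additional options are available for search keywords.
25+
1. Use `<area:type>` to select a specific region.
26+
1. Region types are left, right, top, bottom, and all.
27+
2. Use `<file:filename>` to use a specific mask file.
28+
1. The file is located in models/ddsdmask.
29+
3. Use `<model:type>` to detect with a specific model.
30+
1. type is face_media_full, face_media_short, or a model file name.
31+
2. Model files are located in models/yolo.
32+
4. Besides type1 and type2, dilation and confidence can also be given, as in `<type1:type2:dilation:confidence>`.
33+
1. confidence is used only by the model type.
34+
3. Detected masks can be combined with gate operators such as AND, OR, XOR, NAND, and NOR.
35+
1. face OR (body NAND outfit) -> The parenthesized body NAND outfit is evaluated first, and the result is then OR-ed with face.
36+
2. Use parentheses as sparingly as possible. Heavy use consumes a lot of VRAM.
37+
3. Operators are evaluated sequentially from left to right.
38+
4. Several options can be set on each search keyword.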
2039
1. face:0:0.4:4 OR outfit:2:0.5:8
21-
2. Each option, in order, is prompt, detection level (0-2:default 0), box threshold (0-1:default 0.3), and dilation value (0-128:default 8).
22-
3. You can omit it if you wish. Replace with default value if omitted.
23-
2. Input positive prompt
24-
1. Inpaint the positive prompt multiple times, separated by semicolons.
25-
3. Input negative prompt
26-
1. Inpaint the negative prompt multiple times, separated by semicolons.
27-
4. Check the option to separate and inpaint the unconnected mask.
28-
1. When separating and inpainting, the number of inpaintings increases. But quality rises.
29-
5. Select a small area of pixels to remove from the inpainting area when inpainting by isolation.
30-
6. Generate!
40+
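2. In order, the fields are the prompt to search for, SAM detection level (0-2), sensitivity (0-1), and dilation value (0-512).
41+
3. Omitted values are set to their defaults.
42+
2. Enter the positive prompt.
43+
1. Enter the positive prompt to apply during inpainting.
44+
3. Enter the negative prompt.
45+
1. Enter the negative prompt to apply during inpainting.
46+
4. Adjust Denoising, CFG, Steps, Clip skip, Ckpt, and Vae.
47+
1. These options affect how inpainting behaves.
48+
5. Check the Split Mask option.
49+
1. When checked, disconnected mask regions are inpainted separately.
50+
1. Separate inpainting improves quality, but it requires more inpainting passes, so generation is slower.
51+
6. Check the Remove Area option.
52+
1. Works only when the Split Mask option is enabled.
53+
2. During split inpainting, regions at or below a certain size are excluded from inpainting.
54+
6. Generate!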
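The colon-separated keyword options described in the list above (for example `face:0:0.4:4`) can be read roughly as in the following minimal sketch. `parse_keyword` and `to_number` are hypothetical names, not functions from the extension. The defaults (level 0, threshold 0.3, dilation 8) are taken from the README text removed in this commit and are assumed to still apply; clamping out-of-range values is also an assumption.

```python
def to_number(text, kind, default, low, high):
    """Convert text to a number, using `default` when it is empty or invalid."""
    try:
        value = kind(text)
    except (TypeError, ValueError):
        return default
    # Assumption: out-of-range values are clamped rather than rejected.
    return max(low, min(high, value))


def parse_keyword(token):
    """Split 'prompt:level:threshold:dilation'; trailing fields may be omitted."""
    parts = [part.strip() for part in token.split(":")]
    parts += [""] * (4 - len(parts))  # pad omitted fields with empty strings
    prompt, level, threshold, dilation = parts[:4]
    return {
        "prompt": prompt,
        "sam_level": to_number(level, int, 0, 0, 2),
        "threshold": to_number(threshold, float, 0.3, 0.0, 1.0),
        "dilation": to_number(dilation, int, 8, 0, 512),
    }


if __name__ == "__main__":
    for token in "face:0:0.4:4 OR outfit:2:0.5:8".split(" OR "):
        print(parse_keyword(token))
```

Padding the field list before unpacking is what lets a bare keyword such as `face` fall back to defaults for every option.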
55+
### Postprocessing
56+
Post-processing applied to the final generated image.
57+
#### Postprocessing How to use
58+
1. Select the post-processing to apply.
59+
2. Generate!
60+
### Watermark
61+
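Stamps your own mark on the final generated image.
62+
#### Watermark How to use
63+
1. Select the type of watermark (text or image).
64+
2. Enter the text or image for the selected type.
65+
3. Set its size and position.
66+
4. Use Padding to set how far it is offset from that position.
67+
5. Use Alpha to set how transparent it is.
68+
6. Generate!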
3169
## Installation
32-
1. Download [CUDA](https://developer.nvidia.com/cuda-toolkit-archive) and [cuDNN](https://developer.nvidia.com/rdp/cudnn-archive)
33-
1. You need current CUDA and cuDNN version
34-
2. This is [CUDA 117](https://drive.google.com/file/d/1HRTOLTB44-pRcrwIw9lQak2OC2ohNle3/view?usp=share_link) and [cuDNN](https://drive.google.com/file/d/1QcgaxUra0WnCWrCLjsWp_QKw1PKcvqpj/view?usp=share_link)
35-
3. After installing CUDA, overwrite cuDNN in the folder where you installed CUDA
36-
4. Easy install support version. (torch == 1.13.1+cu117, torch==2.0.0+cu117 , torch==2.0.0+cu118)
37-
2. Install from the extensions tab with url `https://github.com/NeoGraph-K/sd-webui-ddsd`
38-
3. Start Sd web UI
39-
4. It takes some time to install sam model and dino model
70+
1. Download [CUDA](https://developer.nvidia.com/cuda-toolkit-archive) and [cuDNN](https://developer.nvidia.com/rdp/cudnn-archive)
71+
1. Install the `CUDA` and `cuDNN` versions that match your WebUI.
72+
1. Google Drive links for convenient download: [CUDA 117](https://drive.google.com/file/d/1HRTOLTB44-pRcrwIw9lQak2OC2ohNle3/view?usp=share_link) [cuDNN](https://drive.google.com/file/d/1QcgaxUra0WnCWrCLjsWp_QKw1PKcvqpj/view?usp=share_link)
73+
2. After installing `CUDA`, overwrite that folder with `cuDNN`.
74+
3. Some versions support Easy Install. `CUDA` and `cuDNN` are not required for them.
75+
1. Supported versions (torch == 1.13.1+cu117, torch==2.0.0+cu117 , torch==2.0.0+cu118)
76+
2. Install from the Extensions tab with `https://github.com/NeoGraph-K/sd-webui-ddsd`, or download it and extract it into `extension/`.
77+
3. Fully restart the WebUI.
4078

4179
## Credits
4280

@@ -51,3 +89,5 @@ IDEA-Research/[GroundingDINO](https://github.com/IDEA-Research/GroundingDINO)
5189
IDEA-Research/[Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything)
5290

5391
continue-revolution/[sd-webui-segment-anything](https://github.com/continue-revolution/sd-webui-segment-anything)
92+
93+
Bing-su/[adetailer](https://github.com/Bing-su/adetailer)

install.py

+4
@@ -72,8 +72,11 @@ def install_groundingdino():
7272

7373
    with open(req_file) as file:
7474
        for lib in file:
75+
            version = None
7576
            lib = lib.strip()
7677
            lib = 'skimage' if lib == 'scikit-image' else lib
78+
            if '==' in lib:
79+
                lib, version = [x.strip() for x in lib.split('==')]
7780
            if not launch.is_installed(lib):
7881
                if lib == 'pycocotools':
7982
                    install_pycocotools()
@@ -90,6 +93,7 @@ def install_groundingdino():
9093
                        f'sd-webui-ddsd requirement: pillow_lut'
9194
                    )
9295
                else:
96+
                    lib = lib if version is None else lib + '==' + version
9397
                    launch.run_pip(
9498
                        f'install {lib}',
9599
                        f'sd-webui-ddsd requirement: {lib}'
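
The install.py change above checks whether a package is installed by its bare name, then reattaches the pinned version before calling pip. A minimal standalone sketch of that split-and-rejoin step follows. `split_requirement` and `pip_target` are hypothetical helper names, and only `==` pins are handled, as in the diff; other specifiers such as `>=` are not covered.

```python
def split_requirement(line):
    """Return (name, version) for a requirements line; version is None when unpinned."""
    line = line.strip()
    if "==" in line:
        name, version = (part.strip() for part in line.split("==", 1))
        return name, version
    return line, None


def pip_target(name, version):
    """Rebuild the string handed to pip, keeping the pin when there is one."""
    return name if version is None else f"{name}=={version}"


if __name__ == "__main__":
    for raw in ["scipy", "ultralytics==8.0.87", "mediapipe==0.9.3.0"]:
        name, version = split_requirement(raw)
        print(name, version, pip_target(name, version))
```

Keeping the name and the version separate lets the installed check use the import name while the install command still honors the pin.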

requirements.txt

+3-1
@@ -3,4 +3,6 @@ segment_anything
33
groundingdino
44
scipy
55
scikit-image
6-
pillow_lut
6+
pillow_lut
7+
ultralytics==8.0.87
8+
mediapipe==0.9.3.0

scripts/ddsd.py

+10-2
@@ -25,6 +25,7 @@
2525
grounding_models_path = os.path.join(models_path, "grounding")
2626
sam_models_path = os.path.join(models_path, "sam")
2727
lut_models_path = os.path.join(models_path, 'lut')
28+
yolo_models_path = os.path.join(models_path, 'yolo')
2829
ddsd_config_path = os.path.join(os.path.dirname(os.path.dirname(__file__)),'config')
2930

3031
ckpt_model_name_pattern = re.compile('([\\w\\.\\[\\]\\\\\\+\\(\\)]+)\\s*\\[.*\\]')
@@ -56,6 +57,12 @@ def modeltitle(path, shorthash):
5657
return models
5758

5859
def startup():
60+
    if (len(list_models(yolo_models_path, '.pth')) == 0) and (len(list_models(yolo_models_path, '.pt')) == 0):
61+
        print("No detection yolo models found, downloading...")
62+
        load_file_from_url('https://huggingface.co/Bingsu/adetailer/resolve/main/face_yolov8m.pt',yolo_models_path)
63+
        load_file_from_url('https://huggingface.co/Bingsu/adetailer/resolve/main/face_yolov8n.pt',yolo_models_path)
64+
        load_file_from_url('https://huggingface.co/Bingsu/adetailer/resolve/main/face_yolov8s.pt',yolo_models_path)
65+
5966
    if (len(list_models(grounding_models_path, '.pth')) == 0):
6067
        print("No detection groundingdino models found, downloading...")
6168
        load_file_from_url('https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swint_ogc.pth',grounding_models_path)
@@ -870,7 +877,8 @@ def postprocess(self, p, res, *args, **kargs):
870877
self.change_vae_model(self.vae)
871878
opts.CLIP_stop_at_last_layers = self.clip_skip
872879
if len(self.image_results) < 1: return
873-
if p.n_iter > 1 or p.batch_size > 1:
880+
final_count = len(res.images)
881+
if (p.n_iter > 1 or p.batch_size > 1) and final_count != p.n_iter * p.batch_size:
874882
grid = res.images[0]
875883
res.images = res.images[1:]
876884
grid_texts = res.infotexts[0]
@@ -879,7 +887,7 @@ def postprocess(self, p, res, *args, **kargs):
879887
res.images = [image for sub in images for image in sub]
880888
infos = [[info] * (len(masks) + 1) for masks, info in zip(self.image_results, res.infotexts)]
881889
res.infotexts = [info for sub in infos for info in sub]
882-
if p.n_iter > 1 or p.batch_size > 1:
890+
if (p.n_iter > 1 or p.batch_size > 1) and final_count != p.n_iter * p.batch_size:
883891
res.images = [grid] + res.images
884892
res.infotexts = [grid_texts] + res.infotexts
885893

scripts/ddsd_bs.py

+72
@@ -0,0 +1,72 @@
1+
from __future__ import annotations
2+
3+
import os
4+
import torch
5+
6+
import mediapipe as mp
7+
import numpy as np
8+
9+
from PIL import Image, ImageDraw
10+
from ultralytics import YOLO
11+
12+
from modules import safe
13+
from modules.shared import cmd_opts
14+
from modules.paths import models_path
15+
16+
yolo_models_path = os.path.join(models_path, 'yolo')
17+
18+
def mediapipe_face_detect(image, model_type, confidence):
19+
    width, height = image.size
20+
    image_np = np.array(image)
21+
22+
    mp_face_detection = mp.solutions.face_detection
23+
    with mp_face_detection.FaceDetection(model_selection=model_type, min_detection_confidence=confidence) as face_detector:
24+
        predictor = face_detector.process(image_np)
25+
26+
    if predictor.detections is None: return None
27+
28+
    bboxes = []
29+
    for detection in predictor.detections:
30+
31+
        bbox = detection.location_data.relative_bounding_box
32+
        x1 = bbox.xmin * width
33+
        y1 = bbox.ymin * height
34+
        x2 = x1 + bbox.width * width
35+
        y2 = y1 + bbox.height * height
36+
        bboxes.append([x1,y1,x2,y2])
37+
38+
    return create_mask_from_bbox(image, bboxes)
39+
40+
def ultralytics_predict(image, model_type, confidence, device):
41+
    models = [os.path.join(yolo_models_path,x) for x in os.listdir(yolo_models_path) if (x.endswith('.pt') or x.endswith('.pth')) and os.path.splitext(os.path.basename(x))[0].upper() == model_type]
42+
    if len(models) == 0: return None
43+
    model = YOLO(models[0])
44+
    predictor = model(image, conf=confidence, show_labels=False, device=device)
45+
    bboxes = predictor[0].boxes.xyxy.cpu().numpy()
46+
    if bboxes.size == 0: return None
47+
    bboxes = bboxes.tolist()
48+
    return create_mask_from_bbox(image, bboxes)
49+
50+
def create_mask_from_bbox(image, bboxes):
51+
    mask = Image.new('L', image.size, 0)
52+
    draw = ImageDraw.Draw(mask)
53+
    for bbox in bboxes:
54+
        draw.rectangle(bbox, fill=255)
55+
    return np.array(mask)
56+
57+
def bs_model(image, model_type, confidence):
58+
    print(model_type, confidence)
59+
    image = Image.fromarray(image)
60+
    orig = torch.load
61+
    torch.load = safe.unsafe_torch_load
62+
    if model_type == 'FACE_MEDIA_FULL':
63+
        mask = mediapipe_face_detect(image, 1, confidence)
64+
    elif model_type == 'FACE_MEDIA_SHORT':
65+
        mask = mediapipe_face_detect(image, 0, confidence)
66+
    else:
67+
        device = ''
68+
        if getattr(cmd_opts, 'lowvram', False) or getattr(cmd_opts, 'medvram', False):
69+
            device = 'cpu'
70+
        mask = ultralytics_predict(image, model_type, confidence, device)
71+
    torch.load = orig
72+
    return mask
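
`bs_model` swaps `torch.load` for `safe.unsafe_torch_load` and restores it on its last lines, so an exception raised during detection would leave the replacement in place. One possible guard, sketched here and not part of the commit, is a context manager that restores the attribute in a `finally` block. `swapped_attr` is a hypothetical helper, and `fake_torch` is a stand-in object so the example runs without PyTorch.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def swapped_attr(obj, name, replacement):
    """Temporarily replace obj.<name>, restoring the original even on error."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        setattr(obj, name, original)


# Stand-in for the torch module (assumption: only the `load` attribute matters here).
fake_torch = SimpleNamespace(load=lambda path: f"safe:{path}")


def unsafe_load(path):
    """Stand-in for the unrestricted loader."""
    return f"unsafe:{path}"


def demo():
    """Fail inside the swapped region, then report which loader is active."""
    try:
        with swapped_attr(fake_torch, "load", unsafe_load):
            inside = fake_torch.load("m.pt")
            raise RuntimeError(f"detector failed after {inside}")
    except RuntimeError:
        pass
    return fake_torch.load("m.pt")


if __name__ == "__main__":
    print(demo())
```

In the extension, the same pattern would wrap the three detection branches in `with swapped_attr(torch, 'load', safe.unsafe_torch_load):`.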

scripts/ddsd_utils.py

+11-5
@@ -7,6 +7,7 @@
77
from glob import glob
88
from PIL import Image, ImageDraw, ImageFont
99
from scripts.ddsd_sam import sam_predict, clear_cache, dilate_mask
10+
from scripts.ddsd_bs import bs_model
1011
from modules.devices import torch_gc
1112
from skimage import measure
1213

@@ -75,10 +76,11 @@ def dino_detect_from_prompt(prompt:str, detailer_sam_model, detailer_dino_model,
7576
if inpaint_mask_mode == 'Outer': return cv2.bitwise_and(result, cv2.bitwise_not(image_mask))
7677
return None
7778

78-
def dino_prompt_token_file(prompt:str, image_np_zero):
79-
usage_type, usage, dilation = prompt_spliter(prompt, ':', 3)
79+
def dino_prompt_token_file(prompt:str, image_np_zero, image_np_rgb):
80+
usage_type, usage, dilation, confidence = prompt_spliter(prompt, ':', 4)
8081
usage_type = usage_type.upper()
8182
usage = usage.upper()
83+
confidence = try_convert(confidence, float, 0.3, 0, 1)
8284
if usage_type == 'AREA':
8385
if usage == 'LEFT':
8486
image_np_zero[:,:image_np_zero.shape[1] // 2] = 255
@@ -100,6 +102,10 @@ def dino_prompt_token_file(prompt:str, image_np_zero):
100102
h, w = image_np_zero.shape[:2]
101103
image = image.resize((w, h))
102104
image_np_zero = np.array(image)
105+
if usage_type == 'MODEL':
106+
mask = bs_model(image_np_rgb, usage, confidence)
107+
if mask is None: return image_np_zero
108+
image_np_zero = mask
103109
return dilate_mask(image_np_zero, try_convert(dilation, int, 2, 0, 512))
104110

105111
def dino_prompt_detector(prompt:str, model_set, image_set):
@@ -128,7 +134,7 @@ def dino_prompt_detector(prompt:str, model_set, image_set):
128134
try_convert(sam_level.strip(), int, 0, 0, 2))
129135
if left is None: left = image_set[3].copy()
130136
else:
131-
left = dino_prompt_token_file(match.group(1), image_set[3].copy())
137+
left = dino_prompt_token_file(match.group(1), image_set[3].copy(), image_set[2].copy())
132138
else:
133139
left = result_group[left.strip()]
134140
if not isinstance(right, np.ndarray):
@@ -143,7 +149,7 @@ def dino_prompt_detector(prompt:str, model_set, image_set):
143149
try_convert(sam_level.strip(), int, 0, 0, 2))
144150
if right is None: right = image_set[3].copy()
145151
else:
146-
right = dino_prompt_token_file(match.group(1), image_set[3].copy())
152+
right = dino_prompt_token_file(match.group(1), image_set[3].copy(), image_set[2].copy())
147153
else:
148154
right = result_group[right.strip()]
149155
spliter[:3] = [combine_masks(left, operator, right)]
@@ -159,7 +165,7 @@ def dino_prompt_detector(prompt:str, model_set, image_set):
159165
try_convert(sam_level.strip(), int, 0, 0, 2))
160166
if target is None: return image_set[3].copy()
161167
else:
162-
target = dino_prompt_token_file(match.group(1), image_set[3].copy())
168+
target = dino_prompt_token_file(match.group(1), image_set[3].copy(), image_set[2].copy())
163169
return target
164170

165171
def mask_spliter_and_remover(mask, area):
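
`dino_prompt_detector` collapses the first three tokens with `spliter[:3] = [combine_masks(left, operator, right)]`, which gives the strict left-to-right evaluation the README describes. The sketch below illustrates that reduction under simplifying assumptions: masks are flat lists of 0/255 values instead of NumPy arrays, `combine` and `evaluate` are hypothetical names, and only AND, OR, and XOR are included because this diff does not show how NAND and NOR are defined.

```python
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def combine(left, operator, right):
    """Apply one gate element-wise to two masks of equal length."""
    gate = GATES[operator]
    return [gate(a, b) for a, b in zip(left, right)]


def evaluate(tokens):
    """Reduce [mask, op, mask, op, mask, ...] strictly from left to right."""
    tokens = list(tokens)
    while len(tokens) > 1:
        left, operator, right = tokens[:3]
        tokens[:3] = [combine(left, operator, right)]
    return tokens[0]


face = [255, 255, 0, 0]
body = [0, 255, 255, 0]
hair = [255, 0, 0, 0]

if __name__ == "__main__":
    # (face OR body) AND hair, not face OR (body AND hair)
    print(evaluate([face, "OR", body, "AND", hair]))
```

Because there is no operator precedence, `face OR body AND hair` is computed as `(face OR body) AND hair`; parentheses are the only way to change the order.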
