
The issue of facial contamination in LoRA #166

Open
chaorenai opened this issue Sep 11, 2024 · 11 comments

Comments

@chaorenai

With a character LoRA, if the output is a group photo, the trained character's face contaminates the faces of the other people in the group. I've tried various methods, such as adjusting the dataset, lowering the learning rate, and layer-wise training, but the issue can't be resolved. What exactly is going wrong?

@fofr
Contributor

fofr commented Sep 11, 2024

Have you tried using regularisation images?

@kuzman123

kuzman123 commented Sep 11, 2024

You need to provide more info: trained with or without captions? Network dim/alpha? If you've trained at dim 128, for example, your LoRA weights are most likely huge, and weaker tokens (the faces of random AI-generated humans) can't break through. In any case, to generate images with several different subjects you really NEED to use attention masking and inpainting (I'm using ComfyUI for that, and it is amazing what you can achieve with masks + inpaint).

What I like to do is find an image with the desired composition (one main subject with minions beside it, in your example), make a mask image for that subject, and use it as an attention mask in IPAdapter, or just face-swap.

I don't think there is any other way to generate the image you want, because you will always have to compromise on the strength of your LoRA: by reducing it, you reduce the impact on the other characters, but the likeness also decreases.
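As a toy illustration of the compositing idea behind attention masking: the identity features are applied only inside the masked region, and everything outside stays untouched. This NumPy sketch is purely conceptual (real IPAdapter attention masks in ComfyUI operate on cross-attention maps, not pixel arrays), but the blend is the same principle.

```python
import numpy as np

def masked_blend(base, lora, mask):
    """Apply `lora` features only where mask == 1, keep `base` elsewhere."""
    mask = mask.astype(base.dtype)
    return mask * lora + (1.0 - mask) * base

base = np.zeros((4, 4))   # stand-in for the untouched generation
lora = np.ones((4, 4))    # stand-in for the LoRA-influenced features
mask = np.zeros((4, 4))
mask[:, :2] = 1           # left half = the trained character

out = masked_blend(base, lora, mask)
print(out[0])             # left half 1.0 (LoRA applied), right half 0.0 (base)
```

The key point: the other faces in the composition never receive the LoRA's influence at all, which is why masking sidesteps the contamination rather than fixing it in training.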

@chaorenai
Author

> You need to provide more info: trained with or without captions? Network dim/alpha? […] in order to generate images with more different subjects, you just NEED to use attention masking […]

Thank you for your reply. I was about to give up, but you gave me hope. Here's my training process:

- All of the data was captioned, using natural-language captions generated by ChatGPT-4.0 or LLaMA 3.1.
- For the learning rate, I tested 0.00015, 0.00025, 0.0003, and 0.0004.
- Regarding dim/alpha: if I'm not using layer-wise training, I mainly use 16/1; if I am, it's 128/128.
- For steps, I've tested 1000, 3000, 6000, and 10,000, but none of them solved the face-contamination issue.

Could you please teach me how to use attention masking specifically?

Do you have X (formerly Twitter) or YouTube? I would like to follow you.

@kuzman123

I never EVER use captions for training faces, just trigger words (ohwx man, ohwx woman, girl, etc.). The default LR of 1e-4 (0.0001) is good. Set dim/alpha to 32/32. Optimizer: I prefer Adafactor, but you can use AdamW8bit, Prodigy, and so on. 150 dataset repeats, save every 10-15 epochs. All of this falls apart if the dataset is not good, of course.
I'm not on X; you can reach me on Instagram at @artproai
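For reference, those settings map roughly onto a kohya-ss `sd-scripts`-style invocation like the sketch below. This is an assumption about the trainer: the flag names are from `sd-scripts` and may differ in whatever tool you are using, so treat it as a shape, not a recipe.

```
# Hypothetical sd-scripts style invocation for the settings above
accelerate launch train_network.py \
  --network_module networks.lora \
  --network_dim 32 \
  --network_alpha 32 \
  --learning_rate 1e-4 \
  --optimizer_type Adafactor \
  --save_every_n_epochs 10
```

Dataset repeats are usually set in the dataset config (e.g. a `num_repeats` field) rather than on the command line.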

@chaorenai
Author

> I never EVER use captions for training faces, just trigger words […] All of this falls apart if the dataset is not good, of course. I'm not on X, you can Instagram me @artproai

I know that using masks, InstantID, and inpainting during generation can control the output. However, what I hope to achieve is solving the face-contamination issue during LoRA training itself. I've tried various methods like regularization and layer-wise training, but they all failed... I'll register an Instagram account and add you there to say thanks. Thanks again!

@chaorenai
Author

> Have you tried using regularisation images?

I've been following you on X and also left a comment on your X post about this issue. Regularization has been tested, but it still doesn't solve the problem. When you trained character LoRAs, did you experience face contamination when generating two-person or multi-person photos? How did you resolve it?

@GXcells

GXcells commented Sep 13, 2024

Nothing works. Many, many people have tried and discussed it on the Discord, and it is basically impossible.
Try training a LoKr with SimpleTuner. People have managed to train several people in the same LoKr, so in principle it is possible to avoid bleeding with LoKr. I haven't seen it with my own eyes, though.
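If you go the LoKr route, SimpleTuner selects the algorithm through a LyCORIS config file. The sketch below is a guess at a minimal `lycoris_config.json`; the key names (`algo`, `linear_dim`, `factor`) follow LyCORIS conventions, but verify them against the SimpleTuner and LyCORIS docs before using.

```json
{
  "algo": "lokr",
  "multiplier": 1.0,
  "linear_dim": 10000,
  "linear_alpha": 1,
  "factor": 16
}
```

In LyCORIS, a very large `linear_dim` with a `factor` is a common way to let LoKr use its full-dimension factorization rather than a fixed rank.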

@chaorenai
Author

> Nothing works. Many many many people tried and discussed it in the discord and that is basically impossible. Try training a Lokr with simple-tuner. […]

If the same LoRA is trained on several people, then when generating group photos with that LoRA, the output will also be limited to those specific people, right?

@GXcells

GXcells commented Sep 14, 2024 via email

@chaorenai
Author

https://huggingface.co/TheLastBen/The_Hound — this LoRA solves the face-contamination problem, but I wonder if it has anything to do with the subject being a celebrity?

@chaorenai
Author

> https://huggingface.co/TheLastBen/The_Hound This Lora solves the problem of facial pollution […]

That one trained only 2 layers, and the dim value is relatively low. After careful testing there is still face contamination, but it is relatively minor.
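Restricting training to a couple of transformer blocks, as The_Hound appears to do, can be expressed in some trainers by filtering which modules get LoRA weights. The YAML sketch below uses an ai-toolkit-style layout; the `only_if_contains` key and the specific block paths are assumptions for illustration, so check this trainer's actual config schema.

```yaml
network:
  type: lora
  linear: 16
  linear_alpha: 16
  network_kwargs:
    # Hypothetical filter: only attach LoRA weights to these blocks
    only_if_contains:
      - "transformer.single_transformer_blocks.7"
      - "transformer.single_transformer_blocks.20"
```

Training fewer blocks shrinks the LoRA's footprint on the model, which plausibly explains the reduced (but not eliminated) contamination you observed.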

4 participants