Add HunyuanVideo #10106

Closed
hlky opened this issue Dec 3, 2024 · 6 comments · May be fixed by #10376

Comments

@hlky
Collaborator

hlky commented Dec 3, 2024

HunyuanVideo

We present HunyuanVideo, a novel open-source video foundation model that exhibits performance in video generation that is comparable to, if not superior to, leading closed-source models. HunyuanVideo features a comprehensive framework that integrates several key contributions, including data curation, image-video joint model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models.

Code
Weights

🎥 Demo (HQ version)

demo.mp4

@hlky
Collaborator Author

hlky commented Dec 17, 2024

Closed by #10136

@hlky hlky closed this as completed Dec 17, 2024
@svjack

svjack commented Dec 21, 2024

> Closed by #10136

What about LoRA support?
It seems the LoRA blocks that HunyuanVideoLoraLoaderMixin supports (for example in the text encoder) are not the same as the ones in LoRAs that ComfyUI supports. Loading such a LoRA fails with:

ValueError: Target modules {'img_attn_qkv', 'txt_attn_proj', 'txt_mod.linear', 'img_mod.linear', 'img_attn_proj', 'txt_attn_qkv', 'linear1', 'fc2', 'modulation.linear', 'fc1', 'linear2'} not found in the base model. Please check the target modules and try again.

@a-r-r-o-w
Member

@svjack LoRA loading support was added in #10254, and training support was added here: a-r-r-o-w/finetrainers#126

@svjack

svjack commented Dec 21, 2024

> @svjack LoRA loading support was added in #10254, and training support was added here: a-r-r-o-w/finetrainers#126

import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

from enhance_a_video import enable_enhance, inject_feta_for_hunyuanvideo, set_enhance_weight

model_id = "tencent/HunyuanVideo"
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16, revision="refs/pr/18"
)
pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, transformer=transformer, revision="refs/pr/18", torch_dtype=torch.bfloat16
)

# LoRA from https://huggingface.co/svjack/Genshin_Impact_XiangLing_Low_Res_HunyuanVideo_lora_early
pipe.load_lora_weights("Genshin_Impact_XiangLing_Low_Res_HunyuanVideo_lora_early/xiangling_ep2_lora.safetensors")
# Raises:
# ValueError: Target modules {'img_attn_qkv', 'txt_attn_proj', 'txt_mod.linear', 'img_mod.linear', 'img_attn_proj', 'txt_attn_qkv', 'linear1', 'fc2', 'modulation.linear', 'fc1', 'linear2'} not found in the base model. Please check the target modules and try again.

@a-r-r-o-w
Member

It looks like this LoRA was not trained with the diffusers codebase, so its layer names differ from what the loader expects (they seem to come from the original HunyuanVideo codebase). I have seen a few LoRAs trained with the original codebase, so I'll add support for loading them soon. For now, only diffusers-format LoRAs are supported.
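For anyone who wants to check a LoRA file before loading it, here is a minimal sketch (not part of diffusers) that flags checkpoint keys containing the module names listed in the ValueError above. The helper names and the example key strings are hypothetical, and the sketch assumes dot-separated key names. It only spots the names seen in this thread, so it is not a full format detector.

```python
# Module names copied from the ValueError quoted in this thread. They appear to
# come from the original HunyuanVideo codebase rather than from diffusers.
ORIGINAL_CODEBASE_MODULES = {
    "img_attn_qkv", "img_attn_proj", "img_mod.linear",
    "txt_attn_qkv", "txt_attn_proj", "txt_mod.linear",
    "modulation.linear", "linear1", "linear2", "fc1", "fc2",
}


def find_original_codebase_modules(keys):
    """Return the original-codebase module names that occur in the given keys.

    Hypothetical helper: assumes keys are dot-separated parameter names.
    """
    found = set()
    for key in keys:
        # Pad with dots so only whole dotted components match,
        # e.g. "fc1" does not match "mlp_fc10".
        padded = f".{key}."
        for name in ORIGINAL_CODEBASE_MODULES:
            if f".{name}." in padded:
                found.add(name)
    return found


def looks_like_original_format(keys):
    """True if any key uses a module name from the error message above."""
    return bool(find_original_codebase_modules(keys))


if __name__ == "__main__":
    # Illustrative key names only; real checkpoints may use other prefixes.
    example_keys = [
        "transformer.double_blocks.0.img_attn_qkv.lora_A.weight",
        "transformer.single_blocks.3.linear1.lora_B.weight",
    ]
    print(sorted(find_original_codebase_modules(example_keys)))
```

The keys of a `.safetensors` file can be listed without loading the tensors via `safetensors.safe_open(path, framework="pt")` and its `keys()` method, then passed to `looks_like_original_format`.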

@junsukha

@a-r-r-o-w
Is there an example or script that shows how to train a LoRA for HunyuanVideo?
