Add HunyuanVideo #10106

hlky · 2024-12-03T20:09:15Z

HunyuanVideo

We present HunyuanVideo, a novel open-source video foundation model that exhibits performance in video generation that is comparable to, if not superior to, leading closed-source models. HunyuanVideo features a comprehensive framework that integrates several key contributions, including data curation, image-video joint model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models.

Code
Weights

🎥 Demo (HQ version)

demo.mp4

hlky · 2024-12-17T21:39:19Z

Closed by #10136

svjack · 2024-12-21T12:21:12Z

> Closed by #10136

How about LoRA support?
It seems the LoRA blocks that HunyuanVideoLoraLoaderMixin supports, such as those in the text encoder, are not the same as the ones ComfyUI supports. Loading fails with:

ValueError: Target modules {'img_attn_qkv', 'txt_attn_proj', 'txt_mod.linear', 'img_mod.linear', 'img_attn_proj', 'txt_attn_qkv', 'linear1', 'fc2', 'modulation.linear', 'fc1', 'linear2'} not found in the base model. Please check the target modules and try again.

a-r-r-o-w · 2024-12-21T12:38:53Z

@svjack LoRA loading support was added in #10254, and training support was added here: a-r-r-o-w/finetrainers#126

svjack · 2024-12-21T12:56:14Z

> @svjack LoRA loading support was added in #10254, and training support was added here: a-r-r-o-w/finetrainers#126

import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import export_to_video

from enhance_a_video import enable_enhance, inject_feta_for_hunyuanvideo, set_enhance_weight

model_id = "tencent/HunyuanVideo"
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16, revision="refs/pr/18"
)
pipe = HunyuanVideoPipeline.from_pretrained(
    model_id, transformer=transformer, revision="refs/pr/18", torch_dtype=torch.bfloat16
)

# LoRA weights downloaded from https://huggingface.co/svjack/Genshin_Impact_XiangLing_Low_Res_HunyuanVideo_lora_early
pipe.load_lora_weights("Genshin_Impact_XiangLing_Low_Res_HunyuanVideo_lora_early/xiangling_ep2_lora.safetensors")

ValueError: Target modules {'img_attn_qkv', 'txt_attn_proj', 'txt_mod.linear', 'img_mod.linear', 'img_attn_proj', 'txt_attn_qkv', 'linear1', 'fc2', 'modulation.linear', 'fc1', 'linear2'} not found in the base model. Please check the target modules and try again.

a-r-r-o-w · 2024-12-21T12:58:42Z

It seems this LoRA was not trained with the diffusers codebase, so its layer names differ from what diffusers expects (they appear to come from the original HunyuanVideo codebase). Since I have seen a few LoRAs trained with the original codebase, I'll add support for loading them soon. For now, only diffusers-format LoRAs are supported.
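
For anyone hitting the same error, a rough way to check which format a LoRA file uses is to look at its parameter names. The sketch below is only a heuristic, not part of diffusers: the helper `find_original_format_markers` is hypothetical, the marker names are copied from the ValueError above, and the example keys are made up for illustration.

```python
# Module names copied from the ValueError in this thread. They appear to come
# from the original HunyuanVideo codebase rather than from diffusers.
ORIGINAL_FORMAT_MARKERS = {
    "img_attn_qkv", "img_attn_proj", "img_mod.linear",
    "txt_attn_qkv", "txt_attn_proj", "txt_mod.linear",
    "modulation.linear", "linear1", "linear2", "fc1", "fc2",
}


def find_original_format_markers(state_dict_keys):
    """Return the original-codebase module names found in the given keys.

    Hypothetical helper: a non-empty result suggests the LoRA was trained
    with the original codebase and uses its layer naming.
    """
    found = set()
    for key in state_dict_keys:
        # Pad with dots so markers only match whole dotted name components.
        padded = f".{key}."
        for marker in ORIGINAL_FORMAT_MARKERS:
            if f".{marker}." in padded:
                found.add(marker)
    return found


# Illustrative key names only; real checkpoints may use different prefixes.
original_style_keys = [
    "transformer.double_blocks.0.img_attn_qkv.lora_A.weight",
    "transformer.single_blocks.3.linear1.lora_B.weight",
]
diffusers_style_keys = [
    "transformer.transformer_blocks.0.attn.to_q.lora_A.weight",
]

print(sorted(find_original_format_markers(original_style_keys)))
print(sorted(find_original_format_markers(diffusers_style_keys)))
```

To run this on a real file, the keys could be read with `safe_open(path, framework="pt")` from the `safetensors` package and passed in via `.keys()`. A non-empty result would suggest the file uses the original naming and needs conversion before `load_lora_weights` can accept it.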

junsukha · 2024-12-26T05:01:58Z

@a-r-r-o-w
Is there an example or script showing how to train a LoRA for HunyuanVideo?

hlky added the New pipeline/model label Dec 3, 2024

hlky closed this as completed Dec 17, 2024

a-r-r-o-w mentioned this issue Dec 25, 2024

[LoRA] Support original format loras for HunyuanVideo #10376

Open
