
Infinite creation... #163

Open
kyoushinosensei opened this issue Oct 3, 2024 · 0 comments

Comments

@kyoushinosensei

Hello everyone. Can someone tell me why, with these settings, training keeps running and still hasn't stopped after 5 hours? There are only 9 images...

(Screenshots attached: "2024-10-03 09-09-51 - AnyDesk", "2024-10-03 09-17-41 - AnyDesk")

Full log:

```
[2024-10-03 04:51:04] [INFO] Running N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab\train.bat
[2024-10-03 04:51:04] [INFO] (env) (base) N:\Games\pinokio\api\fluxgym.git>accelerate launch --mixed_precision bf16
  --num_cpu_threads_per_process 1 sd-scripts/flux_train_network.py
  --pretrained_model_name_or_path "N:\Games\pinokio\api\fluxgym.git\models\unet\flux1-dev.sft"
  --clip_l "N:\Games\pinokio\api\fluxgym.git\models\clip\clip_l.safetensors"
  --t5xxl "N:\Games\pinokio\api\fluxgym.git\models\clip\t5xxl_fp16.safetensors"
  --ae "N:\Games\pinokio\api\fluxgym.git\models\vae\ae.sft"
  --cache_latents_to_disk --save_model_as safetensors --sdpa --persistent_data_loader_workers
  --max_data_loader_n_workers 2 --seed 42 --gradient_checkpointing --mixed_precision bf16
  --save_precision bf16 --network_module networks.lora_flux --network_dim 4
  --optimizer_type adafactor --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False"
  --split_mode --network_args "train_blocks=single" --lr_scheduler constant_with_warmup --max_grad_norm 0.0
  --sample_prompts="N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab\sample_prompts.txt"
  --sample_every_n_steps="100" --learning_rate 8e-4
  --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --fp8_base --highvram
  --max_train_epochs 5 --save_every_n_epochs 4
  --dataset_config "N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab\dataset.toml"
  --output_dir "N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab" --output_name funi-funilab
  --timestep_sampling shift --discrete_flow_shift 3.1582 --model_prediction_type raw
  --guidance_scale 1 --loss_type l2
[2024-10-03 04:51:08] [INFO] The following values were not passed to `accelerate launch` and had defaults used instead:
[2024-10-03 04:51:08] [INFO]   `--num_processes` was set to a value of `1`
[2024-10-03 04:51:08] [INFO]   `--num_machines` was set to a value of `1`
[2024-10-03 04:51:08] [INFO]   `--dynamo_backend` was set to a value of `'no'`
[2024-10-03 04:51:08] [INFO] To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[2024-10-03 04:51:12] [INFO] highvram is enabled
[2024-10-03 04:51:12] [WARNING] cache_latents_to_disk is enabled, so cache_latents is also enabled (train_util.py:4022)
[2024-10-03 04:51:12] [INFO] t5xxl_max_token_length: 512 (flux_train_network.py:155)
[2024-10-03 04:51:13] [INFO] FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will then be set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
[2024-10-03 04:51:13] [INFO] You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used, so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
[2024-10-03 04:51:14] [INFO] Loading dataset config from N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab\dataset.toml (train_network.py:280)
[2024-10-03 04:51:14] [INFO] prepare images. (train_util.py:1872)
[2024-10-03 04:51:14] [INFO] get image size from name of cache files: 100%|██████████| 9/9 [00:00<00:00, 9026.48it/s]
[2024-10-03 04:51:14] [INFO] set image size from cache files: 9/9
[2024-10-03 04:51:14] [INFO] found directory N:\Games\pinokio\api\fluxgym.git\datasets\funi-funilab contains 9 image files
[2024-10-03 04:51:14] [INFO] 72 train images with repeating. (train_util.py:1913)
[2024-10-03 04:51:14] [INFO] 0 reg images. (train_util.py:1916)
[2024-10-03 04:51:14] [WARNING] no regularization images (train_util.py:1921)
[2024-10-03 04:51:14] [INFO] [Dataset 0] (config_util.py:570)
  batch_size: 1
  resolution: (1024, 1024)
  enable_bucket: False
  network_multiplier: 1.0
  [Subset 0 of Dataset 0]
    image_dir: "N:\Games\pinokio\api\fluxgym.git\datasets\funi-funilab"
    image_count: 9
    num_repeats: 8
    shuffle_caption: False
    keep_tokens: 1
    keep_tokens_separator:
    caption_separator: ,
    secondary_separator: None
    enable_wildcard: False
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    caption_prefix: None
    caption_suffix: None
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1
    token_warmup_step: 0
    alpha_mask: False
    is_reg: False
    class_tokens: Funi
    caption_extension: .txt
[2024-10-03 04:51:14] [INFO] loading image sizes: 9/9 (train_util.py:909)
[2024-10-03 04:51:14] [INFO] prepare dataset (train_util.py:917)
[2024-10-03 04:51:14] [INFO] preparing accelerator; accelerator device: cuda (train_network.py:345)
[2024-10-03 04:51:14] [INFO] Building Flux model dev (flux_utils.py:45)
[2024-10-03 04:51:14] [INFO] Loading state dict from N:\Games\pinokio\api\fluxgym.git\models\unet\flux1-dev.sft
[2024-10-03 04:51:14] [INFO] Loaded Flux: <All keys matched successfully>
[2024-10-03 04:51:14] [INFO] prepare split model; load state dict for lower; load state dict for upper; prepare upper model (flux_train_network.py:110-125)
[2024-10-03 04:52:11] [INFO] split model prepared (flux_train_network.py:140)
[2024-10-03 04:52:11] [INFO] Building CLIP; Loading state dict from N:\Games\pinokio\api\fluxgym.git\models\clip\clip_l.safetensors; Loaded CLIP: <All keys matched successfully>
[2024-10-03 04:52:11] [INFO] Loading state dict from N:\Games\pinokio\api\fluxgym.git\models\clip\t5xxl_fp16.safetensors; Loaded T5xxl: <All keys matched successfully>
[2024-10-03 04:52:11] [INFO] Building AutoEncoder; Loading state dict from N:\Games\pinokio\api\fluxgym.git\models\vae\ae.sft; Loaded AE: <All keys matched successfully>
[2024-10-03 04:52:11] [INFO] import network module: networks.lora_flux
[2024-10-03 04:52:12] [INFO] [Dataset 0] caching latents with caching strategy; checking cache validity: 9/9; no latents to cache
[2024-10-03 04:52:12] [INFO] move vae and unet to cpu to save memory (flux_train_network.py:208)
[2024-10-03 04:52:12] [INFO] move text encoders to gpu (flux_train_network.py:216)
[2024-10-03 04:52:40] [INFO] [Dataset 0] caching Text Encoder outputs with caching strategy; checking cache validity: 9/9; no Text Encoder outputs to cache
[2024-10-03 04:52:40] [INFO] cache Text Encoder outputs for sample prompt: N:\Games\pinokio\api\fluxgym.git\outputs\funi-funilab\sample_prompts.txt
[2024-10-03 04:52:40] [INFO] cache Text Encoder outputs for prompt: Funi fantasy forest
[2024-10-03 04:52:40] [INFO] UserWarning: 1Torch was not compiled with flash attention. (modeling_clip.py:480, Triggered internally at aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555)
[2024-10-03 04:52:41] [INFO] move t5XXL back to cpu (flux_train_network.py:256)
[2024-10-03 04:52:45] [INFO] move vae and unet back to original device (flux_train_network.py:261)
[2024-10-03 04:52:45] [INFO] create LoRA network. base dim (rank): 4, alpha: 1 (lora_flux.py:594)
[2024-10-03 04:52:45] [INFO] neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
[2024-10-03 04:52:45] [INFO] train single blocks only (lora_flux.py:605)
[2024-10-03 04:52:45] [INFO] create LoRA for Text Encoder 1: 72 modules. (lora_flux.py:744)
[2024-10-03 04:52:45] [INFO] create LoRA for FLUX single blocks: 114 modules. (lora_flux.py:765)
[2024-10-03 04:52:45] [INFO] enable LoRA for text encoder: 72 modules
[2024-10-03 04:52:45] [INFO] enable LoRA for U-Net: 114 modules
[2024-10-03 04:52:45] [INFO] FLUX: Gradient checkpointing enabled.
[2024-10-03 04:52:45] [INFO] prepare optimizer, data loader etc.
[2024-10-03 04:52:45] [INFO] Text Encoder 1 (CLIP-L): 72 modules, LR 0.0008 (lora_flux.py:1018)
[2024-10-03 04:52:45] [INFO] use Adafactor optimizer | {'relative_step': False, 'scale_parameter': False, 'warmup_init': False}
[2024-10-03 04:52:45] [INFO] override steps. steps for 5 epochs is: 360
[2024-10-03 04:52:45] [INFO] enable fp8 training for U-Net.
[2024-10-03 04:52:45] [INFO] enable fp8 training for Text Encoder.
[2024-10-03 04:53:39] [INFO] prepare CLIP-L for fp8: set to torch.float8_e4m3fn, set embeddings to torch.bfloat16 (flux_train_network.py:467)
[2024-10-03 04:53:40] [INFO] running training
[2024-10-03 04:53:40] [INFO] num train images * repeats: 72
[2024-10-03 04:53:40] [INFO] num reg images: 0
[2024-10-03 04:53:40] [INFO] num batches per epoch: 72
[2024-10-03 04:53:40] [INFO] num epochs: 5
[2024-10-03 04:53:40] [INFO] batch size per device: 1
[2024-10-03 04:53:40] [INFO] gradient accumulation steps: 1
[2024-10-03 04:53:40] [INFO] total optimization steps: 360
[2024-10-03 04:54:31] [INFO] steps: 0%| | 0/360 [00:00<?, ?it/s]
[2024-10-03 04:54:31] [INFO] unet dtype: torch.float8_e4m3fn, device: cuda:0
[2024-10-03 04:54:31] [INFO] text_encoder [0] dtype: torch.float8_e4m3fn, device: cuda:0
[2024-10-03 04:54:31] [INFO] text_encoder [1] dtype: torch.bfloat16, device: cpu
[2024-10-03 04:54:31] [INFO] epoch 1/5
[2024-10-03 04:54:41] [INFO] epoch is incremented. current_epoch: 0, epoch: 1 (train_util.py:701)
[2024-10-03 04:55:03] [INFO] FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead. (torch\utils\checkpoint.py:1399)
```
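For what it's worth, the step count the script reports follows directly from the dataset settings in this log: sd-scripts runs each image `num_repeats` times per epoch, for `--max_train_epochs` epochs. A quick sanity check of the numbers (all values taken from the log above; the 5-hour runtime is from the question, and the per-step estimate is just arithmetic, not a measurement):

```python
# Reconstruct the step count reported in the log above.
images = 9        # "contains 9 image files"
num_repeats = 8   # [Subset 0]: num_repeats: 8
batch_size = 1    # batch size per device: 1
epochs = 5        # --max_train_epochs 5

steps_per_epoch = images * num_repeats // batch_size
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 72 360 -- matches "total optimization steps: 360"

# If 5 hours have elapsed without finishing, each optimization step is taking at least:
seconds_per_step = 5 * 3600 / total_steps
print(round(seconds_per_step))  # 50 (seconds per step, as a lower bound)
```

So the run is doing 360 steps, not 9, and at roughly 50+ seconds per step it would indeed take over 5 hours. Whether ~50 s/step is expected depends on the hardware; note the command uses `--split_mode`, which splits the model between devices to save VRAM and is typically much slower than keeping the whole model on the GPU, so a long run here may be slow rather than stuck.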
