ZeroDivisionError: division by zero #95

NoteToSelfFindGoodNickname · 2023-02-01T00:02:45Z

Environment name is set as "ST" as per environment.yaml
anaconda3/miniconda3 detected in C:\Users\tomwe\miniconda3
Starting conda environment "ST" from C:\Users\tomwe\miniconda3
warning: redirecting to https://github.com/devilismyfriend/StableTuner.git/
Latest git hash: ef51982

(ST) C:\Users\tomwe\st4>accelerate "launch" "--mixed_precision=fp16" "scripts/trainer.py" "--attention=xformers" "--model_variant=base" "--normalize_masked_area_loss" "--unmasked_probability=0.0" "--max_denoising_strength=1.0" "--disable_cudnn_benchmark" "--use_text_files_as_captions" "--sample_step_interval=50" "--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base" "--pretrained_vae_name_or_path=" "--output_dir=models/iconex" "--seed=3434554" "--resolution=512" "--train_batch_size=24" "--num_train_epochs=100" "--mixed_precision=fp16" "--use_bucketing" "--aspect_mode=dynamic" "--aspect_mode_action_preference=add" "--use_8bit_adam" "--gradient_checkpointing" "--gradient_accumulation_steps=1" "--learning_rate=3e-6" "--lr_warmup_steps=0" "--lr_scheduler=constant" "--train_text_encoder" "--concepts_list=stabletune_concept_list.json" "--num_class_images=200" "--save_every_n_epoch=5" "--n_save_sample=2" "--sample_height=512" "--sample_width=512" "--dataset_repeats=1" "--add_sample_prompt=an apple by iconex" "--sample_on_training_start"
The following values were not passed to accelerate launch and had defaults used instead:
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
Booting Up StableTuner
Please wait a moment as we load up some stuff...
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link

CUDA SETUP: Loading binary C:\Users\tomwe\miniconda3\envs\ST\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
C:\Users\tomwe\miniconda3\envs\ST\lib\site-packages\diffusers\configuration_utils.py:195: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.If you were trying to load a scheduler, please use <class 'diffusers.schedulers.scheduling_ddpm.DDPMScheduler'>.from_pretrained(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Creating Auto Bucketing Dataloader
Rounded resolution to: 512
Preloading images...
** Processing C:/Users/tomwe/Desktop/auswahl: 100%|█████████████████████████████| 165/165 [00:00<00:00, 10560.81it/s]
** Number of buckets: 1
** Bucket (512, 512) found 35 images, will drop 11 images due to batch size 24
Number of image-caption pairs: 24

** Validation Set: val, steps: 1, repeats: 1

Loading Latent Cache from models\iconex\logs\latent_cache
Latents are ready.
Traceback (most recent call last):
File "C:\Users\tomwe\st4\scripts\trainer.py", line 2902, in <module>
main()
File "C:\Users\tomwe\st4\scripts\trainer.py", line 2216, in main
args.num_train_epochs = math.ceil(args.max_train_steps / num_update_steps_per_epoch)
ZeroDivisionError: division by zero
Traceback (most recent call last):
File "C:\Users\tomwe\miniconda3\envs\ST\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\tomwe\miniconda3\envs\ST\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\tomwe\miniconda3\envs\ST\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "C:\Users\tomwe\miniconda3\envs\ST\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "C:\Users\tomwe\miniconda3\envs\ST\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "C:\Users\tomwe\miniconda3\envs\ST\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\tomwe\miniconda3\envs\ST\python.exe', 'scripts/trainer.py', '--attention=xformers', '--model_variant=base', '--normalize_masked_area_loss', '--unmasked_probability=0.0', '--max_denoising_strength=1.0', '--disable_cudnn_benchmark', '--use_text_files_as_captions', '--sample_step_interval=50', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2-1-base', '--pretrained_vae_name_or_path=', '--output_dir=models/iconex', '--seed=3434554', '--resolution=512', '--train_batch_size=24', '--num_train_epochs=100', '--mixed_precision=fp16', '--use_bucketing', '--aspect_mode=dynamic', '--aspect_mode_action_preference=add', '--use_8bit_adam', '--gradient_checkpointing', '--gradient_accumulation_steps=1', '--learning_rate=3e-6', '--lr_warmup_steps=0', '--lr_scheduler=constant', '--train_text_encoder', '--concepts_list=stabletune_concept_list.json', '--num_class_images=200', '--save_every_n_epoch=5', '--n_save_sample=2', '--sample_height=512', '--sample_width=512', '--dataset_repeats=1', '--add_sample_prompt=an apple by iconex', '--sample_on_training_start']' returned non-zero exit status 1


NoteToSelfFindGoodNickname · 2023-02-01T01:00:47Z

There seems to be a bug in the recalculation that happens after images are dropped due to batch size.
For example, I had 29 images with a batch size of 24, so 5 images were dropped.
Once I changed the batch size to 29, the error went away.
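
To make the arithmetic concrete, here is a minimal sketch of how the divisor in the failing line could reach zero. It assumes `num_update_steps_per_epoch` is computed as `math.ceil(num_batches / gradient_accumulation_steps)`, which is the usual pattern in diffusers-style training scripts but is not confirmed for StableTuner's `trainer.py`. The helpers `dropped_images` and `epochs_for_steps` are hypothetical names used only for illustration.

```python
import math


def dropped_images(num_images: int, batch_size: int) -> int:
    """Images left over after filling as many complete batches as possible."""
    return num_images % batch_size


def epochs_for_steps(max_train_steps: int, num_batches: int,
                     gradient_accumulation_steps: int = 1) -> int:
    """Guarded version of the division that fails in the traceback.

    Assumption: the divisor is derived from the number of batches like this;
    the real trainer.py may compute it differently.
    """
    num_update_steps_per_epoch = math.ceil(num_batches / gradient_accumulation_steps)
    if num_update_steps_per_epoch == 0:
        # Zero batches means zero update steps per epoch, so the division
        # below would raise ZeroDivisionError. Fail with a clearer message.
        raise ValueError(
            "No complete batches available; lower the batch size or add images."
        )
    return math.ceil(max_train_steps / num_update_steps_per_epoch)


print(dropped_images(35, 24))  # 11, as in the log above
print(dropped_images(29, 24))  # 5, as in the example in this comment
print(epochs_for_steps(100, num_batches=1))  # 100
```

Under this assumption, the crash would occur whenever the dataloader ends up with no batches at all, and a check like the one above would turn it into an actionable error rather than a bare `ZeroDivisionError`.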
