Error when running Colab notebook #52
Comments
Setting --mixed_precision="fp16" gets another error: CalledProcessError: Command '['/usr/bin/python3',
I also had the issue; be sure to run all the previous steps, including the experimental step.
I had a very similar error, which was also resolved once I stopped skipping past the Experimental steps.
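For context, a minimal sketch of the likely failure mode, assuming (this is an assumption, not the notebook's actual code) that the launch cell interpolates a variable defined by the earlier Training parameters cell:

```python
# Hypothetical reconstruction of how skipping the "Training parameters"
# cell can produce --mixed_precision='' (not the notebook's actual code).
MIXED_PRECISION = ""  # would be "fp16" if the Training parameters cell had run

cmd = (
    "accelerate launch "
    f'--mixed_precision="{MIXED_PRECISION}" '  # expands to --mixed_precision=""
    "train_lora_dreambooth.py"                 # hypothetical script name
)
print(cmd)
# accelerate launch --mixed_precision="" train_lora_dreambooth.py
# -> argparse rejects '' because it is not one of 'no', 'fp16', 'bf16'
```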
I get the error below when I run the training cell in Colab FineTuning_colab.ipynb.
I also ran the Training parameters cell, and all parameters parsed.
No LSB modules are available.
Description: Ubuntu 20.04.5 LTS
diffusers==0.11.1
lora-diffusion @ file:///content/lora
torchvision @ https://download.pytorch.org/whl/cu118/torchvision-0.15.1%2Bcu118-cp39-cp39-linux_x86_64.whl
transformers==4.25.1
xformers==0.0.16rc425
2023-04-16 09:29:59.351268: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-16 09:30:00.246985: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Copy-and-paste the text below in your GitHub issue
Accelerate version: 0.15.0
Accelerate default config: Not found
2023-04-16 09:30:04.094704: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-16 09:30:04.940115: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
usage: accelerate <command> [<args>] launch
[-h]
[--config_file CONFIG_FILE]
[--cpu]
[--mps]
[--multi_gpu]
[--tpu]
[--use_mps_device]
[--dynamo_backend {no,eager,aot_eager,inductor,nvfuser,aot_nvfuser,aot_cudagraphs,ofi,fx2trt,onnxrt,ipex}]
[--mixed_precision {no,fp16,bf16}]
[--fp16]
[--num_processes NUM_PROCESSES]
[--num_machines NUM_MACHINES]
[--num_cpu_threads_per_process NUM_CPU_THREADS_PER_PROCESS]
[--use_deepspeed]
[--use_fsdp]
[--use_megatron_lm]
[--gpu_ids GPU_IDS]
[--same_network]
[--machine_rank MACHINE_RANK]
[--main_process_ip MAIN_PROCESS_IP]
[--main_process_port MAIN_PROCESS_PORT]
[--rdzv_conf RDZV_CONF]
[--max_restarts MAX_RESTARTS]
[--monitor_interval MONITOR_INTERVAL]
[-m]
[--no_python]
[--main_training_function MAIN_TRAINING_FUNCTION]
[--downcast_bf16]
[--deepspeed_config_file DEEPSPEED_CONFIG_FILE]
[--zero_stage ZERO_STAGE]
[--offload_optimizer_device OFFLOAD_OPTIMIZER_DEVICE]
[--offload_param_device OFFLOAD_PARAM_DEVICE]
[--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
[--gradient_clipping GRADIENT_CLIPPING]
[--zero3_init_flag ZERO3_INIT_FLAG]
[--zero3_save_16bit_model ZERO3_SAVE_16BIT_MODEL]
[--deepspeed_hostfile DEEPSPEED_HOSTFILE]
[--deepspeed_exclusion_filter DEEPSPEED_EXCLUSION_FILTER]
[--deepspeed_inclusion_filter DEEPSPEED_INCLUSION_FILTER]
[--deepspeed_multinode_launcher DEEPSPEED_MULTINODE_LAUNCHER]
[--fsdp_offload_params FSDP_OFFLOAD_PARAMS]
[--fsdp_min_num_params FSDP_MIN_NUM_PARAMS]
[--fsdp_sharding_strategy FSDP_SHARDING_STRATEGY]
[--fsdp_auto_wrap_policy FSDP_AUTO_WRAP_POLICY]
[--fsdp_transformer_layer_cls_to_wrap FSDP_TRANSFORMER_LAYER_CLS_TO_WRAP]
[--fsdp_backward_prefetch_policy FSDP_BACKWARD_PREFETCH_POLICY]
[--fsdp_state_dict_type FSDP_STATE_DICT_TYPE]
[--megatron_lm_tp_degree MEGATRON_LM_TP_DEGREE]
[--megatron_lm_pp_degree MEGATRON_LM_PP_DEGREE]
[--megatron_lm_num_micro_batches MEGATRON_LM_NUM_MICRO_BATCHES]
[--megatron_lm_sequence_parallelism MEGATRON_LM_SEQUENCE_PARALLELISM]
[--megatron_lm_recompute_activations MEGATRON_LM_RECOMPUTE_ACTIVATIONS]
[--megatron_lm_use_distributed_optimizer MEGATRON_LM_USE_DISTRIBUTED_OPTIMIZER]
[--megatron_lm_gradient_clipping MEGATRON_LM_GRADIENT_CLIPPING]
[--aws_access_key_id AWS_ACCESS_KEY_ID]
[--aws_secret_access_key AWS_SECRET_ACCESS_KEY]
[--debug]
training_script
...
accelerate <command> [<args>] launch: error: argument --mixed_precision: invalid choice: '' (choose from 'no', 'fp16', 'bf16')
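The final line is argparse's choices validation rejecting the empty string. A minimal standalone reproduction of that behavior (the parser here is a simplified stand-in, not Accelerate's actual parser):

```python
# Simplified stand-in for Accelerate's launch parser, reproducing the
# "invalid choice: ''" error when the flag receives an empty value.
import argparse

parser = argparse.ArgumentParser(prog="accelerate <command> [<args>] launch")
parser.add_argument("--mixed_precision", choices=["no", "fp16", "bf16"])

# Exits with: error: argument --mixed_precision: invalid choice: ''
parser.parse_args(["--mixed_precision", ""])
```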