nohup.out

  File "run.py", line 148
    soft_prompt_path=f'soft_prompt/soft_prompt_{model_name}_{n_tokens}.model')
                                                                            ^
SyntaxError: invalid syntax
  File "run.py", line 148
    soft_prompt_path=f'soft_prompt/soft_prompt_{model_name}_{n_tokens}.model')
                                                                            ^
SyntaxError: invalid syntax
  File "run.py", line 148
    soft_prompt_path=f'soft_prompt/soft_prompt_{model_name}_{n_tokens}.model')
                                                                            ^
SyntaxError: invalid syntax
  File "run.py", line 148
    soft_prompt_path=f'soft_prompt/soft_prompt_{model_name}_{n_tokens}.model')
                                                                            ^
SyntaxError: invalid syntax
***** Running training *****
  Num examples = 87599
  Num Epochs = 4
  Instantaneous batch size per device = 6
  Total train batch size (w. parallel, distributed & accumulation) = 12
  Gradient Accumulation steps = 1
  Total optimization steps = 29200
***** Running training *****
  Num examples = 87599
  Num Epochs = 4
  Instantaneous batch size per device = 10
  Total train batch size (w. parallel, distributed & accumulation) = 20
  Gradient Accumulation steps = 1
  Total optimization steps = 17520
***** Running training *****
  Num examples = 87599
  Num Epochs = 4
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 32
  Gradient Accumulation steps = 1
  Total optimization steps = 10952
***** Running training *****
  Num examples = 87599
  Num Epochs = 4
  Instantaneous batch size per device = 32
  Total train batch size (w. parallel, distributed & accumulation) = 64
  Gradient Accumulation steps = 1
  Total optimization steps = 5476
***** Running training *****
  Num examples = 87599
  Num Epochs = 4
  Instantaneous batch size per device = 128
  Total train batch size (w. parallel, distributed & accumulation) = 256
  Gradient Accumulation steps = 1
  Total optimization steps = 1372


Training completed. Do not forget to share your model on huggingface.co/models =)


Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 12
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 6
  Total train batch size (w. parallel, distributed & accumulation) = 18
  Gradient Accumulation steps = 1
  Total optimization steps = 19244
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 12
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 12
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 12
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 12
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 21648
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 21648
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 21648
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 21648


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 24
  Total train batch size (w. parallel, distributed & accumulation) = 24
  Gradient Accumulation steps = 1
  Total optimization steps = 14432
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 74160
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 37080
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 117384
  Num Epochs = 4
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 58692


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (730 > 512). Running this sequence through the model will result in indexing errors


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (724 > 512). Running this sequence through the model will result in indexing errors
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
Using the `WAND_DISABLED` environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 173176
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 173176
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 173176
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 173176
***** Running training *****
  Num examples = 86588
  Num Epochs = 4
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 173176
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-43294
Configuration saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-43294/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-43294/pytorch_model.bin
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-43294
Configuration saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-43294/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-43294/pytorch_model.bin
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-43294
Configuration saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-43294/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-43294/pytorch_model.bin
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-43294
Configuration saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-43294/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-43294/pytorch_model.bin
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-43294
Configuration saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-43294/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-43294/pytorch_model.bin
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-86588
Configuration saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-86588/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-86588/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-43294] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-86588
Configuration saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-86588/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-86588/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-43294] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-86588
Configuration saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-86588/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-86588/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-43294] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-86588
Configuration saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-86588/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-86588/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-43294] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-86588
Configuration saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-86588/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-86588/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-43294] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-129882
Configuration saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-129882/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-129882/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-86588] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-129882
Configuration saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-129882/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-129882/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-86588] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-129882
Configuration saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-129882/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-129882/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-86588] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-129882
Configuration saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-129882/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-129882/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-86588] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-129882
Configuration saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-129882/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-129882/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-86588] due to args.save_total_limit
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-173176
Configuration saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-173176/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-173176/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/1/2022-02-05-131209/artifact/checkpoint-129882] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


Saving model checkpoint to prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-173176
Configuration saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-173176/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-173176/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/5/2022-02-05-131219/artifact/checkpoint-129882] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


Saving model checkpoint to prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-173176
Configuration saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-173176/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-173176/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/10/2022-02-05-131228/artifact/checkpoint-129882] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-173176
Configuration saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-173176/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-173176/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/20/2022-02-05-131238/artifact/checkpoint-129882] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors
Saving model checkpoint to prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-173176
Configuration saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-173176/config.json
Model weights saved in prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-173176/pytorch_model.bin
Deleting older checkpoint [prompt_tuning/SQuAD/t5-large/50/2022-02-05-131247/artifact/checkpoint-129882] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)


Token indices sequence length is longer than the specified maximum sequence length for this model (540 > 512). Running this sequence through the model will result in indexing errors