generate_v2-51255444.err
Running command git clone --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-8ymk3egx/transformers_48cfc922af2040f8b4547ebe1ba2c81f
Running command git clone --quiet https://github.com/huggingface/peft.git /tmp/pip-install-8ymk3egx/peft_782f692bd3894683a816e9286738ee65
Running command git clone --quiet https://github.com/huggingface/accelerate.git /tmp/pip-install-8ymk3egx/accelerate_c78f2c58cb1949f9952d4f6bb3f4c973
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so'), PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so.11.0')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
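
The warning above means bitsandbytes found more than one CUDA runtime in the environment's lib directory and will pick one arbitrarily. A minimal check, reusing the conda env path from the warning itself (adjust for your own environment), just lists the candidates so the stale copy can be removed or the search path narrowed:

    # Sketch: list the libcudart copies bitsandbytes is choosing between.
    from pathlib import Path

    libdir = Path("/home/brc4cb/.conda/envs/falcon_40B/lib")
    for lib in sorted(libdir.glob("libcudart.so*")):
        print(lib)  # e.g. libcudart.so, libcudart.so.11.0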
Loading checkpoint shards:   0%|          | 0/9 [00:00<?, ?it/s]
Loading checkpoint shards:  11%|█         | 1/9 [00:08<01:08,  8.55s/it]
Loading checkpoint shards:  22%|██▏       | 2/9 [00:14<00:47,  6.76s/it]
Loading checkpoint shards:  33%|███▎      | 3/9 [00:19<00:36,  6.11s/it]
Loading checkpoint shards:  44%|████▍     | 4/9 [00:24<00:29,  5.85s/it]
Loading checkpoint shards:  56%|█████▌    | 5/9 [00:30<00:22,  5.66s/it]
Loading checkpoint shards:  67%|██████▋   | 6/9 [00:36<00:18,  6.04s/it]
Loading checkpoint shards:  78%|███████▊  | 7/9 [00:42<00:11,  5.77s/it]
Loading checkpoint shards:  89%|████████▉ | 8/9 [00:47<00:05,  5.63s/it]
Loading checkpoint shards: 100%|██████████| 9/9 [00:51<00:00,  5.27s/it]
Loading checkpoint shards: 100%|██████████| 9/9 [00:51<00:00,  5.77s/it]
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/transformers/generation/utils.py:1261: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )
warnings.warn(
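
This deprecation fires because generation parameters were set by mutating the pretrained model's config in place; the supported route is a GenerationConfig object, as the warning's link describes. A minimal sketch of the recommended pattern (the parameter values here are illustrative, not taken from the log):

    # Sketch: pass generation settings via GenerationConfig instead of
    # editing model.config directly.
    from transformers import GenerationConfig

    gen_config = GenerationConfig(
        max_new_tokens=128,   # illustrative budget, not from the log
        do_sample=True,
        temperature=0.7,
    )
    outputs = model.generate(input_ids, generation_config=gen_config)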
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.
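
These two lines mean generate() received bare input_ids with no attention_mask and no pad_token_id, so transformers fell back to the eos token (id 11, matching the log). A hedged sketch of the usual fix, where prompt, tokenizer, and model are assumed names not shown in the log:

    # Sketch: tokenize with return_tensors so the attention_mask comes
    # along, and set pad_token_id explicitly (Falcon has no pad token).
    enc = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        pad_token_id=tokenizer.eos_token_id,
    )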
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/transformers/generation/utils.py:1355: UserWarning: Using `max_length`'s default (20) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
warnings.warn(
Input length of input_ids is 43, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.
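
The prompt is already 43 tokens, so the default max_length=20 is exhausted before generation begins and no new tokens can be produced. The fix the warning recommends is max_new_tokens, which bounds only the generated continuation. Continuing the sketch above (enc and the token budget are illustrative):

    # Sketch: bound the new tokens, not the total sequence length.
    outputs = model.generate(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        max_new_tokens=256,  # counts generated tokens only
        pad_token_id=tokenizer.eos_token_id,
    )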
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/transformers/generation/utils.py:1454: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.
warnings.warn(
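
This last warning means the model was placed on CUDA while the tokenized inputs stayed on CPU, which at best slows generation. A minimal sketch of the fix the warning itself suggests, again with assumed names; note that with a model sharded by device_map, model.device reports the first shard's device:

    # Sketch: move the encoded inputs to the device the model lives on.
    enc = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**enc, max_new_tokens=256,
                             pad_token_id=tokenizer.eos_token_id)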