generated from bryanchrist/qlora
generate-51580390.err
Running command git clone --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-z1l97u7j/transformers_be9717155dc64daeb03cd6106377e328
Running command git clone --quiet https://github.com/huggingface/peft.git /tmp/pip-install-z1l97u7j/peft_07052e86b4204313a81f66b6f1c4e2cc
Running command git clone --quiet https://github.com/huggingface/accelerate.git /tmp/pip-install-z1l97u7j/accelerate_a13a368fef2d480e82b7193723fe8a6f
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so.11.0'), PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
Loading checkpoint shards: 0%| | 0/14 [00:00<?, ?it/s]
Loading checkpoint shards: 7%|▋ | 1/14 [00:03<00:48, 3.72s/it]
Loading checkpoint shards: 14%|█▍ | 2/14 [00:05<00:33, 2.78s/it]
Loading checkpoint shards: 21%|██▏ | 3/14 [00:07<00:27, 2.49s/it]
Loading checkpoint shards: 29%|██▊ | 4/14 [00:10<00:23, 2.35s/it]
Loading checkpoint shards: 36%|███▌ | 5/14 [00:12<00:20, 2.27s/it]
Loading checkpoint shards: 43%|████▎ | 6/14 [00:14<00:17, 2.23s/it]
Loading checkpoint shards: 50%|█████ | 7/14 [00:16<00:15, 2.19s/it]
Loading checkpoint shards: 57%|█████▋ | 8/14 [00:18<00:13, 2.18s/it]
Loading checkpoint shards: 64%|██████▍ | 9/14 [00:20<00:10, 2.16s/it]
Loading checkpoint shards: 71%|███████▏ | 10/14 [00:22<00:08, 2.15s/it]
Loading checkpoint shards: 79%|███████▊ | 11/14 [00:25<00:06, 2.15s/it]
Loading checkpoint shards: 86%|████████▌ | 12/14 [00:27<00:04, 2.15s/it]
Loading checkpoint shards: 93%|█████████▎| 13/14 [00:29<00:02, 2.14s/it]
Loading checkpoint shards: 100%|██████████| 14/14 [00:30<00:00, 1.78s/it]
Loading checkpoint shards: 100%|██████████| 14/14 [00:30<00:00, 2.16s/it]
slurmstepd: error: *** JOB 51580390 ON udc-an37-19 CANCELLED AT 2023-07-14T08:26:02 ***