forked from artidoro/qlora
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathjob1-51151638.err
33 lines (32 loc) · 3.14 KB
/
job1-51151638.err
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
Running command git clone --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-o4gpk804/transformers_acab85b3478f442e899ab0360e168505
Running command git clone --quiet https://github.com/huggingface/peft.git /tmp/pip-install-o4gpk804/peft_9afeedb2a2ed4faea7ce3fc85f134bf7
Running command git clone --quiet https://github.com/huggingface/accelerate.git /tmp/pip-install-o4gpk804/accelerate_54857e1be5094f07a6f3a0447c2a826c
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so'), PosixPath('/home/brc4cb/.conda/envs/falcon_40B/lib/libcudart.so.11.0')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get `CUDA error: invalid device function` errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/transformers/configuration_utils.py:483: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
/home/brc4cb/.conda/envs/falcon_40B/lib/python3.9/site-packages/transformers/modeling_utils.py:2192: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
Loading checkpoint shards: 0%| | 0/9 [00:00<?, ?it/s]Loading checkpoint shards: 11%|█ | 1/9 [00:08<01:06, 8.31s/it]Loading checkpoint shards: 22%|██▏ | 2/9 [00:15<00:51, 7.38s/it]Loading checkpoint shards: 33%|███▎ | 3/9 [00:21<00:43, 7.17s/it]Loading checkpoint shards: 44%|████▍ | 4/9 [00:31<00:40, 8.14s/it]Loading checkpoint shards: 56%|█████▌ | 5/9 [00:37<00:30, 7.51s/it]Loading checkpoint shards: 67%|██████▋ | 6/9 [00:45<00:22, 7.45s/it]Loading checkpoint shards: 78%|███████▊ | 7/9 [00:53<00:15, 7.59s/it]Loading checkpoint shards: 89%|████████▉ | 8/9 [01:00<00:07, 7.42s/it]Loading checkpoint shards: 100%|██████████| 9/9 [01:06<00:00, 7.09s/it]Loading checkpoint shards: 100%|██████████| 9/9 [01:06<00:00, 7.40s/it]
Traceback (most recent call last):
File "/gpfs/gpfs0/project/SDS/research/christ_research/falcon/qlora/qlora.py", line 807, in <module>
train()
File "/gpfs/gpfs0/project/SDS/research/christ_research/falcon/qlora/qlora.py", line 678, in train
data_module = make_data_module(tokenizer=tokenizer, args=args)
File "/gpfs/gpfs0/project/SDS/research/christ_research/falcon/qlora/qlora.py", line 576, in make_data_module
dataset = load_data(args.dataset)
File "/gpfs/gpfs0/project/SDS/research/christ_research/falcon/qlora/qlora.py", line 540, in load_data
raise NotImplementedError(f"Dataset {dataset_name} not implemented yet.")
NotImplementedError: Dataset = not implemented yet.