Closed
Description
Checklist
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
Hello, is it possible to fine-tune without the bfloat16 dtype? I set bfloat16 to false in the finetune script, but I still get a related error at runtime. My GPUs are 32GB V100s, so I would like to switch to float16 — is that feasible? It does not seem to be, because after replacing every bfloat16 in the code with float16, I get a different error:
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 34, in do_one_step
data = pin_memory(data, device)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 60, in pin_memory
return type(data)({k: pin_memory(sample, device) for k, sample in data.items()}) # type: ignore[call-arg]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 60, in <dictcomp>
return type(data)({k: pin_memory(sample, device) for k, sample in data.items()}) # type: ignore[call-arg]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/pin_memory.py", line 55, in pin_memory
return data.pin_memory(device)
RuntimeError: cannot pin 'torch.cuda.HalfTensor' only dense CPU tensors can be pinned
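The RuntimeError indicates that the DataLoader's pin_memory worker received a tensor that was already on the GPU (`torch.cuda.HalfTensor`); `pin_memory()` only works on dense CPU tensors. A minimal sketch of one possible workaround, assuming the training loop uses a standard `torch.utils.data.DataLoader`: keep the dataset's tensors on the CPU and disable pinning (for HF Trainer-based scripts, this would correspond to `dataloader_pin_memory=False`, if the script exposes that argument):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset of float16 CPU tensors; in the real finetune script the
# dataset yields model inputs. The key point is they must stay on the CPU —
# only dense CPU tensors can be pinned, and moving them to CUDA inside the
# dataset triggers the "cannot pin 'torch.cuda.HalfTensor'" error.
ds = TensorDataset(torch.randn(8, 4, dtype=torch.float16))

# pin_memory=False skips the pinned staging buffer entirely; batches are
# moved to the GPU later in the training loop instead.
loader = DataLoader(ds, batch_size=2, pin_memory=False)
batch, = next(iter(loader))
print(batch.dtype)  # torch.float16
```

Note that switching from bfloat16 to float16 also changes numerical behavior (fp16 has a much smaller exponent range), so loss-scaling issues may surface separately from this pinning error.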
Reproduction
GPUS=2 PER_DEVICE_BATCH_SIZE=1 sh shell/internvl2.0/2nd_finetune/internvl2_1b_qwen2_0_5b_dynamic_res_2nd_finetune_lora.sh
Environment
Python 3.10
2× V100 32GB