Confusion surrounding bnb_4bit_compute_dtype, torch_dtype, and prepare_model_for_kbit_training #1515

Open
xiaobingbuhuitou opened this issue Feb 14, 2025 · 0 comments


xiaobingbuhuitou commented Feb 14, 2025

I want to implement QLoRA fine-tuning with PEFT for a base model whose weights are stored in float32.

When I load the base model with `from_pretrained("PATH", BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_use_double_quant=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16))` and do not set `torch_dtype`, the model's dtype becomes float16 and the returned `last_hidden_state` is also float16. If I instead set `torch_dtype=torch.float32`, both the model's dtype and `last_hidden_state` stay float32. But as soon as I wrap the quantized model with `prepare_model_for_kbit_training()`, everything is cast back to float32.

My questions are:

1. Does calling `prepare_model_for_kbit_training()` make `bnb_4bit_compute_dtype` and `torch_dtype` ineffective?
2. When is it necessary to call `prepare_model_for_kbit_training()`?
3. What actually determines the dtype of the base model and of `last_hidden_state`?

Thank you.
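For reference, here is a minimal sketch of what I am doing. It assumes a causal LM checkpoint at a placeholder path `"PATH"` and recent versions of transformers, peft, and bitsandbytes; it just loads the model with and without `torch_dtype` and prints the parameter dtypes before and after `prepare_model_for_kbit_training()`:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

MODEL_PATH = "PATH"  # placeholder for a float32 base checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for compute on dequantized weights
)

# Case 1: no torch_dtype -> in my runs the non-quantized modules end up in float16.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=bnb_config,
)
print("no torch_dtype:", model.dtype)

# Case 2: torch_dtype=torch.float32 -> non-quantized modules and the returned
# hidden states stay in float32.
model_fp32 = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=bnb_config,
    torch_dtype=torch.float32,
)
print("torch_dtype=float32:", model_fp32.dtype)

# Case 3: after prepare_model_for_kbit_training the remaining floating-point
# parameters show up as float32 again, which is the behaviour I am asking about.
model_kbit = prepare_model_for_kbit_training(model)
for name, param in model_kbit.named_parameters():
    if param.dtype != torch.uint8:  # skip the packed 4-bit weight storage
        print(name, param.dtype)
        break
```

My current reading of the PEFT source is that `prepare_model_for_kbit_training` upcasts the non-quantized parameters (layer norms and similar) to float32 for training stability, which would explain what I see, but I am not sure whether that is the intended interaction with `bnb_4bit_compute_dtype` and `torch_dtype`.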
