This repository has been archived by the owner on Aug 1, 2024. It is now read-only.
QLoRA for ESM-2 and Error: EsmForSequenceClassification does not support gradient checkpointing #607
Amelie-Schreiber asked this question in Q&A (Unanswered)
Replies: 1 comment
- Any fix for this? QLoRA/NF4 quantization fails badly when I try it: lots of bugs and odd behavior. It also only works through hacks that leave mixed dtypes in the model, which prevents PEFT from saving it.
Bug description
ESM-2 models do not appear to be compatible with QLoRA: preparing a quantized model for training fails with `EsmForSequenceClassification does not support gradient checkpointing`.
Reproduction steps
Code to reproduce:
This next part produces the error:
Expected behavior
The script should simply prepare the model for training with QLoRA (Quantized Low-Rank Adaptation). See here for an example, which is linked to in this article.
Logs
Please paste the command line output:
Additional context
This is a basic attempt at training a QLoRA for ESM-2 models such as
facebook/esm2_t6_8M_UR50D
for a sequence classification task. The error is not task dependent, though; I hit the same error when trying to train a token classifier. Any assistance on making ESM-2 models compatible with QLoRA would be greatly appreciated.