Description
Since the `transformers` library was updated to 4.46.2, LoRA no longer works in silnlp. If you train a model using LoRA and evaluate on the validation set at each checkpoint, the ClearML scalars page shows the BLEU score on the validation set increasing steadily (in my experiment, to 30+). However, at the end of training, the BLEU score on the test set is much lower (in my experiment, about 8). Manually running the validation data through the trained model also gives a very low BLEU score. Our hypothesis is that the LoRA adapter is not getting saved properly, so it is not used during inference after the model is saved. Most likely, there is a bug in the `_merge_and_delete_adapter` function in `hugging_face_config.py`. It is also possible that the bug is in the `peft` library itself - Isaac has said that its support for LoRA in NLLB is not very stable.
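For reference, below is a minimal sketch (not silnlp's actual `_merge_and_delete_adapter` code) of how a LoRA adapter is typically merged back into the base NLLB weights with `peft` before saving. The model name is real, but the adapter and output paths are hypothetical. If the merge step were skipped, or if the weights written to disk were not the merged ones, inference after training would effectively use only the base model, which would match the symptom described above.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the base model and attach the trained LoRA adapter.
base = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
model = PeftModel.from_pretrained(base, "path/to/lora_adapter")  # hypothetical adapter path

# Fold the LoRA deltas into the base weights and save the merged model.
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged_model")  # hypothetical output path
```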