
LoRA no longer works in silnlp #688

Open

Description

@laura-burdick-sil

Since the transformers library was updated to 4.46.2, LoRA no longer works in silnlp. If you train a model with LoRA and evaluate on the validation set at each checkpoint, the ClearML scalars page shows the validation BLEU score increasing steadily (in my experiment, to 30+). However, at the end of training, the BLEU score on the test set is much lower (in my experiment, about 8), and manually running the validation data through the trained model also yields a very low BLEU score.

Our hypothesis is that the LoRA adapter is not being saved properly, so it is not used during inference after the model is saved. Most likely there is a bug in the `_merge_and_delete_adapter` function in `hugging_face_config.py`. It is also possible that the bug is in the peft library; Isaac has said that its support for LoRA in NLLB is not very stable.
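For reference, here is a minimal sketch of what a merge-and-save path typically looks like with the peft API, with a round-trip check that would catch an adapter being silently dropped. The model name, paths, and test sentence below are placeholders, not the actual silnlp configuration, and this is not the `_merge_and_delete_adapter` implementation itself.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

BASE_MODEL = "facebook/nllb-200-distilled-600M"  # placeholder base model
ADAPTER_DIR = "checkpoints/lora_adapter"  # placeholder adapter path
MERGED_DIR = "checkpoints/merged_model"  # placeholder output path

base = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL)
peft_model = PeftModel.from_pretrained(base, ADAPTER_DIR)
peft_model.eval()

# Reference output from the adapter-attached model, so we can verify
# that merging and saving did not silently drop the LoRA weights.
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
inputs = tokenizer("a short test sentence", return_tensors="pt")
with torch.no_grad():
    before = peft_model.generate(**inputs, max_new_tokens=32)

# merge_and_unload folds the LoRA deltas into the base weights and
# returns a plain transformers model with no peft wrappers attached.
merged = peft_model.merge_and_unload()
merged.save_pretrained(MERGED_DIR)
tokenizer.save_pretrained(MERGED_DIR)

# Reload from disk and compare. If the reloaded model's output differs
# from the adapter-attached model's, the adapter was lost somewhere in
# the merge/save step, which would match the symptoms described above.
reloaded = AutoModelForSeq2SeqLM.from_pretrained(MERGED_DIR)
reloaded.eval()
with torch.no_grad():
    after = reloaded.generate(**inputs, max_new_tokens=32)
print("outputs match:", torch.equal(before, after))
```

If the outputs match here but test BLEU is still low, the problem is more likely in how inference loads the saved checkpoint than in the merge itself.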

Metadata

Labels

bug (Something isn't working)
pipeline 4: train (Issue related to training a model.)
pipeline 6: infer (Issue related to using a trained model to translate.)
research (Research topics)
