Various errors in orpo_finetuning_example.ipynb notebook #63

fairydreaming · 2024-12-07T14:49:13Z

In orpo_finetuning_example.ipynb, the dataset currently used is "trl-lib/ultrafeedback_binarized", loaded with split="train". Previously it was "mlabonne/orpo-dpo-mix-40k" with split="all". The current dataset causes a failure during trainer creation:

Shouldn't it be split=None when loading the dataset?
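
For reference, a minimal sketch of the difference this question is about. It assumes the failure comes from code that later indexes the loaded object by split name; the original error output is not reproduced here, so that is a guess. With split="train", load_dataset returns a single Dataset; with split=None it returns a DatasetDict keyed by split name. The helper get_train_split is hypothetical and not part of the notebook; it relies on DatasetDict being a dict subclass.

```python
def get_train_split(loaded, split_name="train"):
    """Return the training split from either a mapping of splits or a single split.

    Hypothetical helper (not from the notebook): datasets.DatasetDict
    subclasses dict, so a mapping check tells it apart from a single Dataset.
    """
    if isinstance(loaded, dict):
        # split=None case: pick the requested split out of the mapping.
        return loaded[split_name]
    # split="train" case: the object already is the requested split.
    return loaded


# Usage in the notebook would look roughly like this (needs network access):
# from datasets import load_dataset
# raw = load_dataset("trl-lib/ultrafeedback_binarized", split=None)
# train_dataset = get_train_split(raw)
```

Either loading style then works with downstream code that expects a single training split.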

Also the title in the Exercise section of the notebook is:

Exercise: Aligning SmolLM2 with DPOTrainer

Did you mean ORPOTrainer?

Finally, I noticed that you changed the optimizer to optim="paged_adamw_8bit". This optimizer requires the bitsandbytes library, which is not listed in requirements.txt.
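
One possible workaround, sketched under assumptions: check whether bitsandbytes is importable and fall back to a plain optimizer name if it is not. The helper pick_optimizer is hypothetical, and "adamw_torch" is used here as the fallback value for the optim argument; adding bitsandbytes to requirements.txt would be the more direct fix.

```python
import importlib.util


def pick_optimizer(preferred="paged_adamw_8bit", fallback="adamw_torch"):
    """Return `preferred` only if bitsandbytes can be found, else `fallback`.

    Hypothetical helper: find_spec only checks that the package is
    installed; it does not verify that it works on the current hardware.
    """
    has_bnb = importlib.util.find_spec("bitsandbytes") is not None
    return preferred if has_bnb else fallback


optim = pick_optimizer()
print(f"Using optim={optim!r}")
```

The resulting string could then be passed as the optim value in the training config in place of the hard-coded one.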

fairydreaming · 2024-12-07T15:30:03Z

After fixing these problems and finishing the training run I examined the metrics:

Accuracy stays at 0.5 the whole time and the margins barely budge. Does this indicate that the training run failed (at least regarding the preference alignment part)?
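
For context on how to read these two numbers, here is a small sketch of how preference metrics of this kind are typically derived. It assumes accuracy is the fraction of pairs in which the chosen response receives a higher reward than the rejected one, and the margin is the mean difference between the two. The function name and the sample values are made up for illustration; this is not the trainer's own code.

```python
def preference_metrics(chosen_rewards, rejected_rewards):
    """Compute (accuracy, mean margin) over paired reward values.

    Assumed definitions: accuracy counts pairs where chosen > rejected,
    margin is the average of (chosen - rejected).
    """
    pairs = list(zip(chosen_rewards, rejected_rewards))
    accuracy = sum(1 for c, r in pairs if c > r) / len(pairs)
    margin = sum(c - r for c, r in pairs) / len(pairs)
    return accuracy, margin


# Made-up rewards where the model prefers the chosen answer only half the time.
acc, margin = preference_metrics(
    [-0.25, -0.5, -0.75, -1.0],
    [-0.5, -0.25, -1.0, -0.75],
)
print(acc, margin)
```

Under these assumed definitions, an accuracy stuck at 0.5 with a margin near zero would mean the model ranks chosen over rejected responses no better than chance.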

fairydreaming · 2024-12-13T19:54:53Z

@mmeendez8 Did you manage to fix the orpo_finetuning_example.ipynb notebook so that the preference alignment training actually works, judging by the reported metrics? I see that you fixed most of the issues I reported here. Did you get the same metric values as I did?

mmeendez8 · 2024-12-14T18:40:20Z

No, sorry. I just went through the notebook quickly to check it and found a couple of issues. I need to spend some time looking into that.

ltejedor · 2025-01-28T21:29:33Z

I had the same experience re: model not training

Sample output: After training:
user
Write a haiku about programming
assistant
I am a programming language. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data. I am a language that allows you to create and manipulate data

Results from w&b
