Add an example using Optuna and Transformers #304
base: main
Conversation
Thanks for your work! However, I don't think it's all that different from the current hyperparameter search docs in Transformers, except that it's a more complete example. @merveenoyan @sergiopaniego what do you think?
Just for the record, I had actually wanted to include support for the Transformers library in their optuna-integration package. But since backend support is already provided by the Transformers library itself, I contributed a starting example to their repo. This PR builds on that example and provides a more hands-on approach for users to understand how to apply HPO to transformer models 🙂
@ParagEkbote the cookbook mostly contains end-to-end applied AI recipes where library integrations shine 💫 rather than minimal examples. It would be great to make this a more applied ML type of recipe.
…sh to hub to make it more applied.
I have now added the following improvements to the recipe to make it more applied:
Could you please review the changes? cc: @stevhliu, @merveenoyan
Thanks so much for the contribution, @ParagEkbote! 🙌
To make it more aligned with the rest of the recipes, it would be great to add a bit more context about the problem this recipe is aiming to solve. Including visuals or showcasing the results would also go a long way in making the explanation clearer and more engaging. I’d recommend checking out the other recipes for inspiration 😄 Also, make sure it runs as it is in Colab
What do you think?
What type of visuals or results could help the recipe? Also, are there any specific sections where you think additional context is required? cc: @sergiopaniego
For example, we currently don't have any output throughout the notebook, so we aren't displaying any results 😄
What does this PR do?
In this end-to-end tutorial, we use the Optuna library to perform hyperparameter optimization (HPO) on a BERT model with the IMDB dataset.
First, we load and preprocess the dataset and define the model we want to tune. Then, we define the evaluation metrics and wrap everything in the Trainer class, along with a search space covering the learning rate, weight decay, and batch size. Lastly, we visualize the results.
Please let me know if any modifications are required and I will make the necessary changes.
Who can review?
@stevhliu