System Info
transformers commit: 52ea4aa (main at time of writing)
The rest of the environment info isn't relevant to this issue.
Who can help?
trainer: @muellerzr @SunMarc
Information
Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction
Run any example with a warmup scheduler and observe that the effective LR is 0 for the first step, unnecessarily wasting compute. See the similar discussion of this issue on torchtune: pytorch/torchtune#2010. The relevant code is in transformers/src/transformers/optimization.py, line 182 at commit 52ea4aa.
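For illustration, a minimal sketch of the failure mode: the lambda below paraphrases the linear-warmup schedule at that line (the function and argument names here are mine, not the exact source), and the second half checks the same behavior through the public get_linear_schedule_with_warmup API.

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Paraphrase of the linear warmup-then-decay LR lambda in optimization.py
# (names are illustrative, not copied from the source):
def warmup_lr_lambda(current_step, num_warmup_steps, num_training_steps):
    if current_step < num_warmup_steps:
        # current_step == 0  ->  0 / num_warmup_steps == 0.0
        return float(current_step) / float(max(1, num_warmup_steps))
    # linear decay from 1.0 to 0.0 after warmup
    return max(
        0.0,
        float(num_training_steps - current_step)
        / float(max(1, num_training_steps - num_warmup_steps)),
    )

print(warmup_lr_lambda(0, 100, 1000))  # 0.0 -> the first step is a no-op

# The same zero shows up through the public API:
model = torch.nn.Linear(2, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=1000
)
print(scheduler.get_last_lr())  # [0.0] -> weights cannot change on step 0
```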
Evaluating that schedule for step 0 returns an LR factor of 0, so the weights will not be updated on the first optimizer step.

The proposed solution in the torchtune discussion is to add a min_lr argument with the default set to a small number (e.g. 1e-7), and return max(min_lr, computed_lr).
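A sketch of what that proposal could look like if the clamp lives inside the LR lambda — the min_lr name and the 1e-7 default come from the torchtune discussion above; the rest is my assumption, not a patch:

```python
def warmup_lr_lambda_with_floor(current_step, num_warmup_steps,
                                num_training_steps, min_lr=1e-7):
    # Identical schedule to the current one, but the returned factor is
    # clamped from below so step 0 still produces a (tiny) weight update.
    if current_step < num_warmup_steps:
        computed_lr = float(current_step) / float(max(1, num_warmup_steps))
    else:
        computed_lr = float(num_training_steps - current_step) / float(
            max(1, num_training_steps - num_warmup_steps)
        )
    return max(min_lr, computed_lr)

print(warmup_lr_lambda_with_floor(0, 100, 1000))  # 1e-07 instead of 0.0
```

Note that a LambdaLR-style schedule returns a multiplicative factor, so this floors the factor rather than the absolute LR; whether to clamp the factor or the final LR (and whether the floor should also apply at the decayed end of training) is a design choice for the maintainers.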
Expected behavior
I expect every optimizer step to adjust the weights of my model unless there is a good reason not to.