Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEAT] Add option to support user defined learning rate scheduler for NeuralForecast Models #998

Merged
merged 7 commits into from
May 23, 2024

Conversation

JQGoh
Copy link
Contributor

@JQGoh JQGoh commented May 9, 2024

Rationale

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@JQGoh JQGoh force-pushed the feat/user-defined-scheduler branch 3 times, most recently from e3de57f to 2a7d761 Compare May 9, 2024 18:35
JQGoh added 2 commits May 10, 2024 02:42
Fix assertion check

Fix iterable issue

fix parameter passed to user-defined lr_scheduler
@JQGoh JQGoh force-pushed the feat/user-defined-scheduler branch from 2a7d761 to 17a895b Compare May 9, 2024 18:42
@JQGoh
Copy link
Contributor Author

JQGoh commented May 9, 2024

@jmoralez Please review. Hope this might be useful to the community too.

@elephaint
Copy link
Contributor

@JQGoh Thanks for your work, the PR looks good to me. @cchallu @jmoralez I'm happy to add this as feature to neuralforecast, wdyt?

@JQGoh
Copy link
Contributor Author

JQGoh commented May 16, 2024

@jmoralez @cchallu If you have some time, appreciate your review on this. As I notice recently there has been new models added to neuralforecast, would love to get this merged (subject to approval/reviewed) and later the other newly implemented models shall consider these arguments

@jmoralez
Copy link
Member

Can you please add warnings if the user provides lr_scheduler_kwargs but doesn't provide lr_scheduler (that the kwargs will be ignored)? We got a similar request for the optimizer by the sktime folks.

@JQGoh
Copy link
Contributor Author

JQGoh commented May 16, 2024

Can you please add warnings if the user provides lr_scheduler_kwargs but doesn't provide lr_scheduler (that the kwargs will be ignored)? We got a similar request for the optimizer by the sktime folks.

@jmoralez
Sounds good to me. I shall add warnings for

  • user provides lr_scheduler_kwargs but doesn't provide lr_scheduler
  • user provides optimizer_kwargs but doesn't provide optimizer.

If you have the link/reference to the ask by sktime folks, can mention in this PR too. Thanks

@jmoralez
Copy link
Member

sktime/sktime#6235 (comment)

@BrunoBelucci
Copy link

BrunoBelucci commented May 22, 2024

I was facing the same issue when I wanted to use a custom scheduler and you have a solution that is pretty much what I have, the only thing that I am missing in it is that I think we should also be able to change the other arguments of "lr_scheduler_config" like the frequency or the interval (https://lightning.ai/docs/pytorch/stable/common/lightning_module.html#configure-optimizers for a list of other parameters). My solution was in fact to add 2 new kwargs, lr_scheduler_kwargs as you have and lr_scheduler_config as expected by lightning, only adding the actual scheduler to it in configure_optimizers with lr_scheduler_config['scheduler'] = self.lr_scheduler(optimizer=optimizer, **lr_scheduler_kwargs). I think that if someone is bothering to manually change the lr_scheduler it makes sense to give them full control over it.

@JQGoh
Copy link
Contributor Author

JQGoh commented May 22, 2024

I was facing the same issue when I wanted to use a custom scheduler and you have a solution that is pretty much what I have, the only thing that I am missing in it is that I think we should also be able to change the other arguments of "lr_scheduler_config" like the frequency or the interval (https://lightning.ai/docs/pytorch/stable/common/lightning_module.html#configure-optimizers for a list of other parameters). My solution was in fact to add 2 new kwargs, lr_scheduler_kwargs as you have and lr_scheduler_config as expected by lightning, only adding the actual scheduler to it in configure_optimizers with lr_scheduler_config['scheduler'] = self.lr_scheduler(optimizer=optimizer, **lr_scheduler_kwargs). I think that if someone is bothering to manually change the lr_scheduler it makes sense to give them full control over it.

@BrunoBelucci
Thanks for your suggestion. I lean toward allowing the users to have more freedom in customizing the behaviour, including frequency.

On top of the prepared PR, think I may need an additional arg called lr_scheduler_config that represents the set of arguments as detailed in https://lightning.ai/docs/pytorch/stable/common/lightning_module.html#configure-optimizers, but it shall ignore the "scheduler" (since we will specify it separately, plus we can ensure that lr_scheduler using the same optimizer).
If Nixtla team agrees that lr_scheduler_config is a nice to have feature, I can prepare this in the next PR.
cc: @jmoralez

@jmoralez
Copy link
Member

I'm not really a fan of adding so many arguments to all models. Would it be possible to instead provide a function that overrides the configure_optimizers method? Either by calling it or monkey patching.

@BrunoBelucci
Copy link

Honestly, I think that in this case, the cleanest solution is to add one more argument. We already have arguments for the trainer, optimizer, and scheduler. Introducing a different approach for configuring the scheduler, which is part of the same system, could confuse users and require them to write additional boilerplate code to achieve their goals. Here are some scenarios to further develop my point:

  1. Using the CosineAnnealingLR Scheduler
    The default scheduler configuration is fine because the scheduler is not affected by it.

  2. Using the OneCycleLR Scheduler
    The default scheduler configuration is fine if the user knows that the default is {'frequency': 1, 'interval': 'step'}. They would only need to calculate the number of steps to pass as total_steps in the lr_scheduler_kwargs. However, if they do not know this, they would need to check the code (since it is not documented) to ensure the model trains as intended.

  3. Using the ReduceLROnPlateau Scheduler
    In this case, the user needs to change the scheduler configuration because, typically, we want to identify a plateau by epoch rather than by step. Additionally, the user has to specify the "monitor" metric they want. Without the ability to pass the scheduler configuration directly, users would need to use one approach to pass the scheduler and its kwargs and another approach (such as calling another function or monkey patching) to overwrite the scheduler configuration.

I see no reason for users to have to handle each scenario above differently. Additionally, this approach would allow us to document the default behavior clearly.

@jmoralez
Copy link
Member

What I mean is that we currently have two arguments (for the optimizer), this adds two more (for the scheduler) and you're proposing adding a fifth, when all of these are going to be used in the same method (configure_optimizers) and they all could've been just one where the user passes a callable that takes the model parameters and returns what pytorch lightning expects from the configure_optimizers method.

Given that sktime is already using the optimizer arguments it'd be a bad experience to deprecate them and introduce a new one that takes a function, so I think we should move forward with adding more to not deprecate something we recently introduced, I just wish we'd done the single argument approach from the start.

Copy link
Member

@jmoralez jmoralez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough tests, as always!

@jmoralez jmoralez merged commit 5fc342c into Nixtla:main May 23, 2024
14 checks passed
@JQGoh
Copy link
Contributor Author

JQGoh commented May 23, 2024

What I mean is that we currently have two arguments (for the optimizer), this adds two more (for the scheduler) and you're proposing adding a fifth, when all of these are going to be used in the same method (configure_optimizers) and they all could've been just one where the user passes a callable that takes the model parameters and returns what pytorch lightning expects from the configure_optimizers method.

Given that sktime is already using the optimizer arguments it'd be a bad experience to deprecate them and introduce a new one that takes a function, so I think we should move forward with adding more to not deprecate something we recently introduced, I just wish we'd done the single argument approach from the start.

@jmoralez
I also wished that I could have implemented it differently, did not consider about the option of modifying the default configure_optimizes behavior.

By the way, shall we re-consider this implementation and revert this PR?

I implemented the option of modifying configure_optimizers behavior via set_configure_optimizers at BaseModel level, please check the work in
#1015

@MLfreakPy
Copy link

Thank you so much for the implementation of a custom lr scheduler!

@BrunoBelucci :
Could you explain how it is possible to implement a ReduceLROnPlateau scheduler? I think some of your earlier discussion evolved around it, however, I couldn't make it work yet.
Currently I am passing through the lr_scheduler and lr_scheduler_kwargs parameters. However, I get the below error when implementing ReduceLROnPlateau. Other schedulers, like OneCycle, work perfectly though!

MisconfigurationException('The lr scheduler dict must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": {"scheduler": scheduler, "monitor": "your_loss"}}').

@JQGoh
Copy link
Contributor Author

JQGoh commented Jul 30, 2024

Thank you so much for the implementation of a custom lr scheduler!

@BrunoBelucci : Could you explain how it is possible to implement a ReduceLROnPlateau scheduler? I think some of your earlier discussion evolved around it, however, I couldn't make it work yet. Currently I am passing through the lr_scheduler and lr_scheduler_kwargs parameters. However, I get the below error when implementing ReduceLROnPlateau. Other schedulers, like OneCycle, work perfectly though!

MisconfigurationException('The lr scheduler dict must include a monitor when a ReduceLROnPlateau scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": {"scheduler": scheduler, "monitor": "your_loss"}}').

@MLfreakPy
Could you open a new issue and continue the discussions related to this topic? You could link this PR as a reference. As this PR has already addressed the original purpose, it will be more beneficial for the public users to follow the discussed topic. Thanks for the interest in this and when I have more time, will definitely check about the use of ReduceLROnPlateau!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants