[FEATURE] - Can gocron read from a persistent storage the scheduled jobs and schedule #533
Any update on this?
@varsraja I'm not opposed to having this feature in gocron. If it were to be done, I'd like to have an interface that could be implemented for multiple databases (redis, etc.). For the implementation in gocron, I'm thinking something like:
What details would be required to be stored? A unique identifier for the job - use the job
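A minimal sketch of what such a storage interface might look like; the `JobStore` and `JobRecord` names and the field set are illustrative assumptions, not an existing gocron API:

```go
package jobstore

import (
	"context"
	"time"

	"github.com/google/uuid"
)

// JobRecord is a hypothetical, storage-friendly snapshot of a scheduled job.
// The field set is only a guess at "what details would be required".
type JobRecord struct {
	ID         uuid.UUID `json:"id"`
	Name       string    `json:"name"`
	Tags       []string  `json:"tags"`
	Task       string    `json:"task"`       // function name, resolved via a registry at load time
	Parameters []any     `json:"parameters"` // arguments passed to the function
	LastRun    time.Time `json:"last_run"`
	NextRun    time.Time `json:"next_run"`
}

// JobStore is the hypothetical interface a redis, sql, or file backend could implement.
type JobStore interface {
	Save(ctx context.Context, rec JobRecord) error
	LoadAll(ctx context.Context) ([]JobRecord, error)
}
```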
@JohnRoesler Appreciate the response. I was also looking for something along the lines you have suggested.
Good news, the latest release added uuids for jobs, so we now have unique identifiers. The fields on the job struct are private, so we'll just want to make a new struct, something like JobStorage, that has public fields for the things that will need to be stored. Then when the job occurs, we could instantiate a JobStorage struct and save it to the database via the interface that will be defined.
Hm, or perhaps, since the interface will be within the gocron project, it can just accept a Job and then convert it to a JobStorage object for a sql implementation 🤔 Some poking around at the implementation will help flesh out some of these details.
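The `JobRecord` sketched above could play the role of that JobStorage struct; a rough sketch of the conversion, continuing the hypothetical jobstore package (the `jobLike` interface is a stand-in for whatever getters the job exposes, for example the v2 Job accessors):

```go
// jobLike is a stand-in for the accessors the conversion needs; gocron v2's
// Job interface exposes getters along these lines.
type jobLike interface {
	ID() uuid.UUID
	Name() string
	Tags() []string
	LastRun() (time.Time, error)
	NextRun() (time.Time, error)
}

// snapshot copies the public details of a job into the storable form
// (error handling elided to keep the sketch short).
func snapshot(j jobLike) JobRecord {
	last, _ := j.LastRun()
	next, _ := j.NextRun()
	return JobRecord{
		ID:      j.ID(),
		Name:    j.Name(),
		Tags:    j.Tags(),
		LastRun: last,
		NextRun: next,
	}
}
```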
Is it possible to store to a file (JSON dump) as well, apart from databases?
Certainly. I think the beauty of the interface is that it can be implemented in whatever way you'd like. As far as the implementation goes, please do have a go at it if you'd like!
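On the JSON-dump question, a file-backed implementation of the hypothetical JobStore above could be as small as this sketch, continuing the same package (adds `encoding/json` and `os` to its imports; no file locking or atomic writes):

```go
// FileStore is a hypothetical JobStore backed by a single JSON file.
type FileStore struct {
	Path string
}

func (f FileStore) LoadAll(ctx context.Context) ([]JobRecord, error) {
	data, err := os.ReadFile(f.Path)
	if os.IsNotExist(err) {
		return nil, nil // no file yet means no stored jobs
	}
	if err != nil {
		return nil, err
	}
	var recs []JobRecord
	err = json.Unmarshal(data, &recs)
	return recs, err
}

func (f FileStore) Save(ctx context.Context, rec JobRecord) error {
	recs, err := f.LoadAll(ctx)
	if err != nil {
		return err
	}
	// Upsert by job ID, then rewrite the whole file.
	found := false
	for i := range recs {
		if recs[i].ID == rec.ID {
			recs[i] = rec
			found = true
		}
	}
	if !found {
		recs = append(recs, rec)
	}
	data, err := json.MarshalIndent(recs, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(f.Path, data, 0o600)
}
```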
Looking forward to this feature. I'm using tags and UUIDs to manage jobs, which is very inconvenient.
I wonder if I can try to contribute to this?
@4zore4 if you are interested in contributing - let's look at adding it to the v2 branch (as that's the future 😄)
Ok, I will try to add this feature in the v2 version.
@4zore4 I think having a separate struct for the job loading - one that isn't the internalJob or the public Job - would be best. You'll need to consider which fields from the job are important to store/load. My initial thoughts on what you need and don't need from the internalJob:
type internalJob struct {
- ctx context.Context~
- cancel context.CancelFunc~
+ id uuid.UUID
+ name string
+ tags []string
+ jobSchedule
+ lastRun, nextRun time.Time
+ function any
+ parameters []any
- timer clockwork.Timer
+ singletonMode bool
+ singletonLimitMode LimitMode
~ limitRunsTo *limitRunsTo // for this to be useful, you'd also have to store the # of runs
// when the scheduler is shutting down
~ startTime time.Time // this isn't useful beyond the initial run
~ startImmediately bool // this isn't useful beyond the initial run - but if you set
// start immediately, would you want your job to also start
// immediately when a new scheduler pod started? I don't
// think so, you'd want it to continue as close to where it left
// off as possible.
// event listeners
+ afterJobRuns func(jobID uuid.UUID)
+ beforeJobRuns func(jobID uuid.UUID)
+ afterJobRunsWithError func(jobID uuid.UUID, err error)
}
Another thing we need to make sure is handled: when scheduling the next run, if the lastRun is far enough in the past, the next run may also be in the past. I don't think v2 handles that yet.
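On that last point, the restore path would need to fast-forward a schedule whose stored next run is already in the past. A minimal sketch of the idea (not gocron's implementation; the `next` func is a stand-in for whatever the jobSchedule computes):

```go
package restore

import "time"

// nextRunAfterRestore advances from the stored lastRun until the computed
// next run is in the future, so a job that slept through several occurrences
// resumes close to where it left off instead of firing a backlog of past runs.
// Whether missed occurrences should instead run once is a user-facing choice.
func nextRunAfterRestore(lastRun, now time.Time, next func(time.Time) time.Time) time.Time {
	n := next(lastRun)
	for !n.After(now) {
		n = next(n)
	}
	return n
}
```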
Your idea is very good. As you said, whether or not to execute expired tasks is a choice that should be left to the user. I'll try to write a demo at the end of the week, if I'm not lazy. It's worth mentioning that I have used this library in my company's projects. Thank you very much for your contribution.
Any update on this?
Hello, happy new year. About the startTime field, I believe it's useful to add it to the jobStorage struct, because if a user wants to execute a job (with the OneTimeJob function) in 2 hours and the scheduler shuts down during that period, the job will be lost. I'm trying to implement this feature on this project; I am not a very experienced programmer but, you know, I'm trying.
Another thought - to make storing it the simplest - I think looking into converting the job export structure to some sort of string could be worthwhile. Then the export would be to a string, and the import would decode that string back into jobs. Or a slice of strings... so it's not really long in the event of many, many jobs.
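A sketch of that export-to-string idea, reusing the hypothetical `JobRecord` from earlier in the thread (JSON is chosen purely for illustration):

```go
// EncodeJobs flattens the export structs into a single string for storage.
func EncodeJobs(recs []JobRecord) (string, error) {
	data, err := json.Marshal(recs)
	return string(data), err
}

// DecodeJobs turns a previously stored string back into export structs.
func DecodeJobs(s string) ([]JobRecord, error) {
	var recs []JobRecord
	err := json.Unmarshal([]byte(s), &recs)
	return recs, err
}
```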
Any thoughts about how to save the function? I am thinking about saving only the function name.
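Since Go can't serialize a function value, storing only the function name implies keeping a registry that maps stored names back to callables when jobs are reloaded. A rough sketch, with all names hypothetical:

```go
package registry

import "fmt"

// tasks maps a stable, storable name to the function that should run.
// Jobs are persisted with the name and parameters, never the function itself.
var tasks = map[string]func(args ...any) error{}

// Register associates a name with a task function, typically at start-up.
func Register(name string, fn func(args ...any) error) {
	tasks[name] = fn
}

// Resolve looks up the function for a name loaded from storage.
func Resolve(name string) (func(args ...any) error, error) {
	fn, ok := tasks[name]
	if !ok {
		return nil, fmt.Errorf("no task registered for %q", name)
	}
	return fn, nil
}
```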
Sorry, I haven't updated it yet, because the company is busy near the end of the year. But I think your idea is great and consistent with mine, and I have implemented the demo.
What fields does the redisJob have?
I am also very interested in this and may end up implementing it with gorm/mysql. I need a background task queue that can survive shutdowns and be distributed long term.
I have been thinking about this while working on other components in my project, and while the example from @4zore4 uses pointers, it will not work at scale IMHO. My thought jumped to using a wrapper package on a scheduler, which I already have, to use the lock and elector system and manage all jobs. You would need to create many job names or types and register them to the task functions that handle them. You could then store this, load all tasks up on boot, and pick up where you left off. You can't store the job struct data in memory, especially with multiple nodes running (function pointers). So, some "job manager" abstraction is needed for this. I'm open to thoughts on how this might be designed, but I'll likely end up with an MVP for my needs, which is at least MIT licensed, so others can use it as an example before I put any effort into making it reusable.
I thought I would provide an update. I have forked go-cron some and can prob create a PR soon with the change, (added But... I have implemented a cron system abstraction here: https://github.com/LumeWeb/portal/blob/e44bd0f59300b2d7ee164cef4714543639a65c48/service/cron.go. Overall I think it makes the most sense to just create a layer on top vs. trying to make the library support it directly. Kudos!
@pcfreak30 thanks for sharing that! Yes, I agree with your sense that having it be separate would be best. Then it can wrap gocron as the core scheduling library without introducing a bunch of complexity that many won't need/use.
For the purposes of restoring jobs from a data store, I think we'll need a method, perhaps a
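Whatever form that method takes, the caller-side restore might look roughly like the sketch below. It assumes the hypothetical JobStore/JobRecord and Resolve registry sketched earlier in this thread, and uses gocron v2's NewJob; rebuilding the real schedule from the stored record is glossed over with a fixed duration:

```go
package persistence

import (
	"context"
	"time"

	"github.com/go-co-op/gocron/v2"
)

// restoreJobs re-registers stored jobs on a fresh scheduler. It leans on the
// JobStore/JobRecord and Resolve sketches from earlier in this thread.
func restoreJobs(ctx context.Context, s gocron.Scheduler, store JobStore) error {
	recs, err := store.LoadAll(ctx)
	if err != nil {
		return err
	}
	for _, rec := range recs {
		fn, err := Resolve(rec.Task) // registry lookup by stored function name
		if err != nil {
			return err
		}
		// A real implementation would rebuild the original schedule (cron
		// expression, duration, one-time start, ...) from the record; a
		// fixed duration keeps the sketch short.
		if _, err := s.NewJob(
			gocron.DurationJob(time.Minute),
			gocron.NewTask(fn, rec.Parameters...),
			gocron.WithName(rec.Name),
			gocron.WithTags(rec.Tags...),
		); err != nil {
			return err
		}
	}
	return nil
}
```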
See https://github.com/LumeWeb/portal/blob/e0caec59acc68a5be80535add4b1b9f32747e0dd/service/cron.go#L94 for inspiration on what I am doing atm. If you have any thoughts on how your idea or another could refactor this code to be better, I'm all ears :)
Hey, having a persistence layer can also help with distributed locks, since most storage solutions have functionality around locking. What's the status of this proposal? Having gocron integrated with postgres/other storage solutions would open up a ton of possibilities.
@Nikola-Milovic I understand what you mean, but I concluded that it's better as an abstraction layer. You can see what I have ended up with at https://github.com/LumeWeb/portal/blob/ee3347796fbbb8d42657f08994255d296998a056/service/cron.go (prob still has bugs). My system supports redis and uses gorm for the authoritative data; you can take inspiration from it. I also have several PRs open for things I need to implement what I've done, and I run on a fork atm.
You could probably create a dedicated higher-level package focusing on this. gocron IMHO is best as a lower-level lego.
Is your feature request related to a problem?
We are running gocron in a containerized environment where jobs are scheduled based on REST API inputs. We would like to use some sort of persistence to store the received job details, like frequency etc. When the container restarts, or the instance where the container is running restarts, we would like the gocron scheduler to fetch the existing job status and schedule the next run of each job accordingly.
Describe the solution you'd like
On a restart, the scheduler would be able to load data from persistent storage and continue the subsequent runs as it would have if the system hadn't rebooted/restarted. The scheduler would update the last run time in the persistent storage so that on a fresh startup it can recalculate the next run time.
Describe alternatives you've considered
Have a wrapper function responsible for updating persistent storage with when each run was last scheduled. On a fresh restart, it would load the scheduler by recalibrating the start time from the last run time and re-adding the jobs.
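A rough sketch of that wrapper alternative, reusing the hypothetical JobStore/JobRecord types from the discussion above (names and fields are assumptions):

```go
// withPersistence wraps a task so every successful run records its completion
// time; after a restart, the stored LastRun can be used to recalculate the
// next run before the job is re-added to the scheduler.
func withPersistence(ctx context.Context, store JobStore, rec JobRecord, task func() error) func() error {
	return func() error {
		if err := task(); err != nil {
			return err
		}
		rec.LastRun = time.Now()
		return store.Save(ctx, rec)
	}
}
```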