Automatically exit process after all jobs are finished #2

Open
robhawkes opened this issue Dec 16, 2015 · 2 comments

@robhawkes
Contributor

Currently nothing tells us when all jobs have finished, so the process stays open indefinitely. We need a way to detect when everything is done, so that we can perform last-minute actions (like generating the GeoJSON index) and then exit the process.

One method would be to count how many buildings are to be processed and then wait for that many to complete. The problem is that each building spawns an unknown number of jobs, so how would you know when they have all finished?
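
A minimal sketch of one way around the unknown job count: instead of counting buildings, count jobs in flight (increment on spawn, decrement on completion) and combine that with a flag for the end of the input. This assumes all spawn and completion events can be observed from one process; `JobTracker` and its method names are hypothetical and not part of this project.

```python
import threading


class JobTracker:
    """Counts jobs in flight without knowing the total number in advance."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0
        self._input_done = False
        self.all_done = threading.Event()

    def job_spawned(self):
        # Call this before the spawning (parent) job reports completion,
        # so the counter cannot reach zero while follow-up work remains.
        with self._lock:
            self._pending += 1

    def job_finished(self):
        with self._lock:
            self._pending -= 1
            self._check()

    def input_finished(self):
        # Call this once the input stream has ended.
        with self._lock:
            self._input_done = True
            self._check()

    def _check(self):
        # Done only when no more input can arrive AND nothing is in flight.
        if self._input_done and self._pending == 0:
            self.all_done.set()


tracker = JobTracker()
tracker.job_spawned()      # building A: first job
tracker.job_spawned()      # building A: follow-up job, spawned before the first finishes
tracker.job_finished()     # first job done, one still running
tracker.input_finished()   # the stream has ended
still_running = tracker.all_done.is_set()
tracker.job_finished()     # last job done
finished = tracker.all_done.is_set()
```

With workers in separate processes, the same two pieces of state (a pending counter and an "input ended" flag) would have to live somewhere shared, such as Redis.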

Another method would be to store a simple state for each building in Redis that is set to "completed" when the final job for that building has finished. However, something would have to monitor this, and I'm unsure when or how often it should check. It would also still be possible for all earlier buildings to finish processing and trigger an exit while a rather large building is still being streamed in for processing.
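
To illustrate the premature-exit risk described above, here is a rough sketch that uses a plain dict as a stand-in for the per-building state in Redis. The function names and the `stream_ended` flag are assumptions made for this example, not existing code.

```python
def naive_safe_to_exit(states):
    """Exit check that looks only at the buildings registered so far."""
    return all(state == "completed" for state in states.values())


def safe_to_exit(states, stream_ended):
    """Exit check that also requires the input stream to have ended."""
    if not stream_ended:
        # A building may still be streaming in and not be registered yet.
        return False
    return all(state == "completed" for state in states.values())


# Hypothetical snapshot: two small buildings are done, while a large
# building is still being streamed in and has no state entry yet.
states = {"building-1": "completed", "building-2": "completed"}

premature = naive_safe_to_exit(states)                # exits too early
guarded = safe_to_exit(states, stream_ended=False)    # keeps waiting

# Later: the stream has ended and the large building has finished too.
states["building-3"] = "completed"
final = safe_to_exit(states, stream_ended=True)
```

The guard only helps if each building's state is written before the stream-ended flag is set, so that a building never goes missing from the check.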

Another approach would be to periodically check for incomplete jobs that share the batch ID of the input file. If there are none left, or all of them are completed, it might be safe to assume that everything has finished. In theory, there should always be at least one active job for the current batch while a building is left to be processed, as new jobs are spawned before the previous jobs complete.
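
The polling idea could look roughly like the sketch below. It is written against a hypothetical `fetch_jobs` callable that returns the current jobs, rather than against any real queue API. The `Job` fields and state names are assumptions, as is the choice to treat failed jobs as finished so that one failure cannot keep the process alive forever.

```python
import time
from dataclasses import dataclass


@dataclass
class Job:
    batch_id: str
    state: str  # assumed states: "inactive", "active", "complete", "failed"


FINISHED_STATES = ("complete", "failed")


def incomplete_jobs(jobs, batch_id):
    """Return the jobs of the given batch that have not finished yet."""
    return [
        job for job in jobs
        if job.batch_id == batch_id and job.state not in FINISHED_STATES
    ]


def wait_for_batch(fetch_jobs, batch_id, interval=1.0, sleep=time.sleep):
    """Poll until the batch has no incomplete jobs; return the number of waits."""
    polls = 0
    while incomplete_jobs(fetch_jobs(), batch_id):
        polls += 1
        sleep(interval)
    return polls
```

Once `wait_for_batch` returns, the caller could generate the GeoJSON index and exit. The `sleep` parameter exists only so the loop can be exercised without real delays.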

Even something as simple as finding out how many buildings there are is tricky: the system streams buildings in for processing, so we don't know how many there are until the very end, after a lot of them have already finished processing. Even if we did know how many buildings there were, we still wouldn't know how many jobs have been spawned or which ones to look out for.

@robhawkes
Contributor Author

I've pushed an update that implements a first stab at automatically exiting the process once all jobs are completed.

@meetar
Member

meetar commented Jun 1, 2016

Still seems to hang, generally after these notices:

Parser ended
Stream ended
Building count: 57089

…Though sometimes it also then manages to write the index:

Number of GeoJSON footprints: 57089
Saved GeoJSON index: converted/Reinickendorf

So if these processes are able to complete when the job is done, perhaps they can also trigger the exit? Or could there be something like a hung task which is preventing the queue from fully emptying?
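
One way to test the hung-task theory is to look for active jobs that have not reported progress for a while. The sketch below assumes each active job records a last-progress timestamp. That assumption, the `stalled_jobs` helper, and the sample data are all hypothetical.

```python
def stalled_jobs(active_jobs, now, timeout_seconds):
    """Return IDs of active jobs with no progress for more than timeout_seconds.

    active_jobs maps a job ID to the timestamp (in seconds) of its last
    progress update.
    """
    return sorted(
        job_id
        for job_id, last_update in active_jobs.items()
        if now - last_update > timeout_seconds
    )


# Made-up example: job "a" last reported at t=100, job "b" at t=190.
suspects = stalled_jobs({"a": 100.0, "b": 190.0}, now=200.0, timeout_seconds=60)
```

If the list is non-empty long after "Stream ended" is logged, a stuck job is a plausible reason the queue never fully empties.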

meetar reopened this Jun 1, 2016