Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: stopping the crawlers gracefully with
BasicCrawler.stop()
#2792feat: stopping the crawlers gracefully with
BasicCrawler.stop()
#2792Changes from 3 commits
284274b
55210ec
d5e469a
e538881
42cd320
e7388de
e62befc
d72105e
05b43b9
1611602
8e9ea84
287580c
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the motivation for not making the method async and ignoring the promise instead? feels handy to be able to await it (and users can always decide themselves to ignore the promise, unlike with the current solution)
also I don't think you need the
async
in hereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I planned it like this in the beginning, see 284274b
The reasons for moving to this was:
a. parity w/ Python version
b.
^ this looks like a completely harmless piece of code, but will cause deadlock.
requestHandler
is waiting forautoscaledPool.pause
, which is waiting for all tasks to finish (including the aforementionedrequestHandler
).You and me both, but
@typescript-eslint/promise-function-async
thinks otherwise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I'm definitely for making this
async
, but we need to address these somehow. Also, note that even with the implementation from 284274b ,crawler.stop
resolves onceautoscaledPool.abort()
has resolved, but that (afaik) doesn't mean thecrawler.run()
has resolved.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly not sure that there is a good reason for an
async stop()
instead of non-async one. If you are expected to call it from a request handler, waiting for the crawler to stop there leads to a deadlock and the reasons are quite intuitive.Should someone need to wait for the crawler to finish outside of the request handler, they can always wait for the promise from
crawler.run()
to resolve.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always a bit wary about floating promises, but as long as you guys are fine with it (and there is the catch clause to stop it from killing the whole process), let's see how it goes.
I'll still add the promised tests from #2803 , just to stay on the safer side.