Events and callbacks
When an event occurs, the framework calls the corresponding callback to handle it. By default, each callback simply logs the event. By overriding them, you can define the logic of the crawler.
The following callbacks are available:
Callback which gets called when the crawler is started.
Typically used to perform initialization (create handles/connections) before the crawling starts.
Callback which gets called when the browser loads the page.
Typically used to find URLs to follow and extract specific data from the HTML source.
Callback which gets called when the page does not load in the browser within the timeout period.
Callback which gets called when the content type is not HTML.
Gives you the opportunity to download the non-HTML resource instead of parsing it.
Callback which gets called when a request is redirected.
Callback which gets called when a request error occurs.
Callback which gets called when the crawler is stopped.
Typically used to perform resource cleanup (close handles/connections) after the run.
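
The override pattern described above can be sketched as follows. This is a minimal, framework-independent illustration, not the framework's actual API: the class names (`BaseCrawler`, `LinkCollector`), the callback names (`on_start`, `on_page_load`, and so on), their parameters, and the `dispatch` helper are all hypothetical, chosen only to mirror the events listed above. The base class logs every event by default, and the subclass overrides just the callbacks it cares about.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("crawler")


class BaseCrawler:
    """Hypothetical base class: every callback only logs its event by default."""

    def on_start(self):
        log.info("crawler started")

    def on_page_load(self, url, html):
        log.info("page loaded: %s", url)

    def on_page_load_timeout(self, url):
        log.info("page load timed out: %s", url)

    def on_non_html_content(self, url, content_type):
        log.info("non-HTML content (%s): %s", content_type, url)

    def on_redirect(self, url, location):
        log.info("redirect: %s -> %s", url, location)

    def on_request_error(self, url, error):
        log.info("request error for %s: %s", url, error)

    def on_stop(self):
        log.info("crawler stopped")

    def dispatch(self, event, *args):
        """Look up the callback for an event name and call it (hypothetical helper)."""
        getattr(self, "on_" + event)(*args)


class LinkCollector(BaseCrawler):
    """Overrides three callbacks; the others keep the default logging behavior."""

    def on_start(self):
        # Initialization before crawling starts: here just in-memory state.
        self.links = []
        self.closed = False

    def on_page_load(self, url, html):
        # Find URLs to follow in the HTML source (naive regex, illustration only).
        self.links.extend(re.findall(r'href="([^"]+)"', html))

    def on_stop(self):
        # Resource cleanup after the run.
        self.closed = True


# Simulate a short run by firing events by hand.
crawler = LinkCollector()
crawler.dispatch("start")
crawler.dispatch("page_load", "http://example.com/", '<a href="/a">A</a><a href="/b">B</a>')
crawler.dispatch("redirect", "http://example.com/old", "http://example.com/new")
crawler.dispatch("stop")
print(crawler.links)
```

In this sketch the redirect event still goes through the inherited default and is only logged, while the page-load event runs the overridden logic and collects the two links.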