-
As a follow-up, I would say it would be necessary to, for instance, start 5 HttpSpiders, each with its own params to be used inside the parse function.
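Roughly what I have in mind, as a sketch only: the module names, the selector params, and the shared GenericParse helper are made up; the only Crawly pieces assumed are the standard Crawly.Spider callbacks, Crawly.Utils.request_from_url/1, and Crawly.Engine.start_spider/1. Each spider is a thin wrapper that carries its own params and delegates the actual parsing to one shared function.

```elixir
defmodule GenericParse do
  # Shared parse logic: the per-spider params decide which selectors to apply.
  def parse_item(response, params) do
    {:ok, document} = Floki.parse_document(response.body)

    items =
      document
      |> Floki.find(params.item_selector)
      |> Enum.map(fn element -> %{title: Floki.text(element)} end)

    requests =
      document
      |> Floki.find(params.link_selector)
      |> Floki.attribute("href")
      # Assumes the hrefs are already absolute URLs.
      |> Enum.map(&Crawly.Utils.request_from_url/1)

    %{items: items, requests: requests}
  end
end

defmodule BlogSpider do
  use Crawly.Spider

  # Params baked into this particular spider module.
  @params %{item_selector: "article h2", link_selector: "a.next"}

  @impl Crawly.Spider
  def base_url, do: "https://blog.example.com"

  @impl Crawly.Spider
  def init, do: [start_urls: ["https://blog.example.com/"]]

  @impl Crawly.Spider
  def parse_item(response), do: GenericParse.parse_item(response, @params)
end

# The other four spiders would be the same thin wrapper with different @params,
# and each one is started with:
# Crawly.Engine.start_spider(BlogSpider)
```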
-
Hi,
I've been studying the code and playing around with Crawly.
I would like to create a spider that computes both the item rules and the new_links from data I fetch from the DB.
For instance, I could create a pool of HTTP spiders, one per domain, and those spiders would get the item properties and new links from an element described by a DB field. This way, I could let users personalize the crawl themselves using a couple of params.
But from what I saw in crawly-ui, it uses fixed (already compiled) spider modules for this purpose.
Is there an easy way to do this, or should I refactor the module for my purpose?
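To make the idea concrete, here is a rough sketch of what such a DB-driven spider could look like. The CrawlConfigs.get_by_domain/1 lookup and its fields (item_selectors, link_selector) are hypothetical stand-ins for the DB row; the rest sticks to the standard Crawly.Spider callbacks and Floki, and it assumes one item per page and absolute link URLs.

```elixir
defmodule DbDrivenSpider do
  use Crawly.Spider

  @impl Crawly.Spider
  def base_url, do: "https://example.com"

  @impl Crawly.Spider
  def init, do: [start_urls: ["https://example.com/"]]

  @impl Crawly.Spider
  def parse_item(response) do
    # Hypothetical DB lookup: one row per domain holding the selectors to apply.
    host = URI.parse(response.request.url).host
    config = CrawlConfigs.get_by_domain(host)

    {:ok, document} = Floki.parse_document(response.body)

    # Build one item from the field -> CSS selector mapping stored in the DB.
    item =
      Map.new(config.item_selectors, fn {field, selector} ->
        {field, document |> Floki.find(selector) |> Floki.text()}
      end)

    # new_links: follow whatever the stored link selector matches.
    requests =
      document
      |> Floki.find(config.link_selector)
      |> Floki.attribute("href")
      # Assumes absolute URLs; relative ones would need to be resolved first.
      |> Enum.map(&Crawly.Utils.request_from_url/1)

    %{items: [item], requests: requests}
  end
end
```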
I really like the way you set up the state machine (middlewares, pipelines, backoff times, etc.).
Thank you for your attention.