Is there a way to stop the spider's duplicate check with Redis? #242
Comments
@Germey Any ideas?
@milkeasd
The way I see it, let developers customize their communication rules and add a disable option for the duplicate check.
@milkeasd
@milkeasd could you please provide your code or some sample code?
@LuckyPigeon it doesn't work. Setting ...
Maybe there should be a custom dupefilter option here: scrapy-redis/src/scrapy_redis/dupefilter.py, line 128 at commit 48a7a89.
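For context on that reference, the Redis-backed filter's `request_seen` is, roughly, a fingerprint plus a single Redis `SADD` per request. This is a paraphrased sketch of `scrapy_redis.dupefilter.RFPDupeFilter.request_seen`, not an exact copy of the pinned revision; it shows why every scheduled request costs one round trip to the Redis master:

```python
# Paraphrased sketch of scrapy_redis.dupefilter.RFPDupeFilter.request_seen;
# see the pinned file/line above for the exact code at that revision.
def request_seen(self, request):
    fp = self.request_fingerprint(request)
    # SADD returns how many members were added: 0 means the fingerprint
    # was already present, i.e. the request was seen before. Each call
    # is one network round trip to the Redis master.
    added = self.server.sadd(self.key, fp)
    return added == 0
```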
From Scrapy's docs: https://doc.scrapy.org/en/latest/topics/settings.html#dupefilter-class
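Since DUPEFILTER_CLASS accepts any importable class, one way to disable the check without breaking the scrapy-redis scheduler (which constructs its dupefilter with Redis-specific arguments that Scrapy's default filter does not accept) is to subclass the scrapy-redis filter and short-circuit `request_seen`. A minimal, untested sketch; the module path `myproject.dupefilter` is a placeholder:

```python
# myproject/dupefilter.py  (hypothetical module path)
from scrapy_redis.dupefilter import RFPDupeFilter


class NoopDupeFilter(RFPDupeFilter):
    """Keeps the scrapy-redis dupefilter interface so the scheduler can
    still instantiate it, but never reports a request as seen and never
    touches Redis for the check."""

    def request_seen(self, request):
        # Skip the fingerprint lookup and the Redis SADD round trip.
        return False
```

Then point the setting at it in settings.py: `DUPEFILTER_CLASS = "myproject.dupefilter.NoopDupeFilter"`.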
Hi, everyone! I've made a little change in ...
My spider was extremely slow when run with scrapy-redis, because there is a big delay between the slave and the master. I want to reduce the communication to just fetching the start_urls periodically, or only when all start_urls are done. Is there any way to do so?
Moreover, I want to stop the duplicate check to reduce the number of connections.
But I can't change DUPEFILTER_CLASS to Scrapy's default one; it raises an error.
Is there any other way to stop the duplicate check?
Or are there any ideas that could help speed up the process?
Thanks
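On the latency point above: scrapy-redis pops start URLs from Redis in batches when the spider idles, so raising the batch size makes each round trip to the master deliver more work. A hedged settings sketch, assuming a scrapy-redis version that reads REDIS_START_URLS_BATCH_SIZE (older versions fall back to CONCURRENT_REQUESTS); the Redis URL is a placeholder:

```python
# settings.py - sketch; assumes a scrapy-redis version that reads
# REDIS_START_URLS_BATCH_SIZE (otherwise it falls back to CONCURRENT_REQUESTS).
REDIS_URL = "redis://master-host:6379"  # placeholder master address

# Pop more start URLs per idle cycle so each round trip to the Redis
# master delivers a bigger batch of work, instead of a network hop
# for every few URLs.
REDIS_START_URLS_BATCH_SIZE = 256
CONCURRENT_REQUESTS = 64
```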