
[Question] Fetch request url from redis fail #285

Open
KokoTa opened this issue Aug 13, 2023 · 8 comments

KokoTa commented Aug 13, 2023

Description

If I insert the start URL into Redis before running Scrapy, it works.

But if I run Scrapy first and then insert the URL, the idle handler fails with:

2023-08-13 17:11:59 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method RedisMixin.spider_idle of <TestHtmlSpider 'test_html' at 0x2b05c4162d0>>
Traceback (most recent call last):
  File "C:\Users\KokoTa\AppData\Local\Programs\Python\Python311\Lib\site-packages\scrapy\utils\signal.py", line 43, in send_catch_log
    response = robustApply(
               ^^^^^^^^^^^^
  File "C:\Users\KokoTa\AppData\Local\Programs\Python\Python311\Lib\site-packages\pydispatch\robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\KokoTa\AppData\Local\Programs\Python\Python311\Lib\site-packages\scrapy_redis\spiders.py", line 208, in spider_idle
    self.schedule_next_requests()
  File "C:\Users\KokoTa\AppData\Local\Programs\Python\Python311\Lib\site-packages\scrapy_redis\spiders.py", line 197, in schedule_next_requests
    self.crawler.engine.crawl(req, spider=self)
TypeError: ExecutionEngine.crawl() got an unexpected keyword argument 'spider'

I can't feed URLs dynamically, and Scrapy crashes.
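The TypeError in the traceback comes from a change in the engine's crawl() signature. A minimal stand-in illustrates the failure mode (these classes are illustrative stubs, not Scrapy's real ones):

```python
class OldEngine:
    """Mimics ExecutionEngine before Scrapy 2.10: a `spider` keyword is accepted."""
    def crawl(self, request, spider=None):
        return request

class NewEngine:
    """Mimics ExecutionEngine in Scrapy 2.10+: the `spider` keyword is gone."""
    def crawl(self, request):
        return request

OldEngine().crawl("req", spider="me")      # fine on older versions
try:
    NewEngine().crawl("req", spider="me")  # fails like the traceback above
except TypeError as err:
    print(err)  # message mirrors the one in the log
```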


Shleif91 commented Sep 4, 2023

Same error... Found a solution?


gc1423 commented Sep 14, 2023

Passing a spider argument to the crawl() method of scrapy.core.engine.ExecutionEngine is no longer supported as of Scrapy 2.10.0 (see the release notes).

As a workaround, pin Scrapy to 2.9.0.
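If pinning is acceptable, one way to do it (assuming a requirements.txt workflow) is:

```text
# requirements.txt: last Scrapy release whose engine.crawl() still accepts `spider`
scrapy==2.9.0
```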

RoyHuang2 added a commit to RoyHuang2/scrapy-redis that referenced this issue Nov 13, 2023
@GeorgeA92

It looks like pull request #286, which fixes this, has existed since August.
The fix can easily be applied to an app on the current scrapy-redis version by overriding the schedule_next_requests method:

from scrapy import version_info as scrapy_version  # needed for the check below

class SomeSpider(RedisSpider):
    ## vvv add this override to the spider code
    def schedule_next_requests(self):
        """Schedules a request if available"""
        # TODO: While there is capacity, schedule a batch of redis requests.
        for req in self.next_requests():
            # see https://github.com/scrapy/scrapy/issues/5994
            if scrapy_version >= (2, 6):
                self.crawler.engine.crawl(req)
            else:
                self.crawler.engine.crawl(req, spider=self)
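The version check above relies on Python's tuple comparison against scrapy.version_info. A stdlib-only sketch of the same dispatch pattern (the engine classes here are stand-ins, not Scrapy's):

```python
def crawl_compat(engine, request, spider, version):
    """Dispatch to the right engine.crawl() signature based on a version tuple."""
    if version >= (2, 6):  # `spider` argument deprecated in 2.6, removed in 2.10
        return engine.crawl(request)
    return engine.crawl(request, spider=spider)

class ModernEngine:
    def crawl(self, request):
        return ("no-spider", request)

class LegacyEngine:
    def crawl(self, request, spider):
        return ("with-spider", request, spider)

print(crawl_compat(ModernEngine(), "req", None, (2, 11, 0)))  # ('no-spider', 'req')
print(crawl_compat(LegacyEngine(), "req", "sp", (2, 5, 1)))   # ('with-spider', 'req', 'sp')
```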

@xuexingdong

Hoping the fixed version is released soon.


jordinl commented May 16, 2024

@rmax would it be possible to release a fix for this? I'm also encountering this issue


migrant commented Jun 18, 2024

The same problem...

@georgeJzzz

@rmax would it be possible to release a fix for this? I'm also encountering this issue. Thanks

Owner

rmax commented Jul 4, 2024

Thank you for your patience. V0.8.0 has been released 🎉
