ValueError: SMAZ can only process ASCII text #11

Open
lopuhin opened this issue Sep 13, 2017 · 0 comments
lopuhin commented Sep 13, 2017

I thought this could not possibly happen, but somehow non-ASCII URLs can reach SMAZ:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/twisted/internet/task.py", line 517, in _oneWorkUnit
    result = next(self._iterator)
  File "/usr/local/lib/python3.5/site-packages/scrapy/utils/defer.py", line 63, in <genexpr>
    work = (callable(elem, *args, **named) for elem in iterable)
  File "/usr/local/lib/python3.5/site-packages/scrapy/core/scraper.py", line 183, in _process_spidermw_output
    self.crawler.engine.crawl(request=output, spider=spider)
  File "/usr/local/lib/python3.5/site-packages/scrapy/core/engine.py", line 210, in crawl
    self.schedule(request, spider)
  File "/usr/local/lib/python3.5/site-packages/scrapy/core/engine.py", line 216, in schedule
    if not self.slot.scheduler.enqueue_request(request):
  File "/usr/local/lib/python3.5/site-packages/scrapy_redis/scheduler.py", line 167, in enqueue_request
    self.queue.push(request)
  File "/dd_crawler/dd_crawler/queue.py", line 90, in push
    data = self._encode_request(request)
  File "/dd_crawler/dd_crawler/queue.py", line 392, in _encode_request
    return struct.pack('h', depth) + parent + url_compress(request.url)
  File "/dd_crawler/dd_crawler/queue.py", line 374, in url_compress
    return smaz.compress(url, compression_tree=smaz_tree).encode('latin1')
  File "/usr/local/lib/python3.5/site-packages/lib/smaz.py", line 399, in compress
    raise ValueError('SMAZ can only process ASCII text.')
ValueError: SMAZ can only process ASCII text