http.client.BadStatusLine: http/1.1 200 OK #24

Open
chris-aeviator opened this issue Sep 13, 2021 · 0 comments


I'm getting a lot of these errors. Some pages extract just fine, and all the WARC files I'm reading contain HTML. The error itself is strange, since "200 OK" is not a bad status line.

Error on record <urn:uuid:3b608490-1308-11ec-a263-3905f05120b4>
Traceback (most recent call last):
  File "/home/korny/.conda/envs/ploomber-gpt/lib/python3.9/site-packages/warcat/tool.py", line 108, in process
    self.action(record)
  File "/home/korny/.conda/envs/ploomber-gpt/lib/python3.9/site-packages/warcat/tool.py", line 216, in action
    response = util.parse_http_response(data)
  File "/home/korny/.conda/envs/ploomber-gpt/lib/python3.9/site-packages/warcat/util.py", line 273, in parse_http_response
    response.begin()
  File "/home/korny/.conda/envs/ploomber-gpt/lib/python3.9/http/client.py", line 319, in begin
    version, status, reason = self._read_status()
  File "/home/korny/.conda/envs/ploomber-gpt/lib/python3.9/http/client.py", line 301, in _read_status
    raise BadStatusLine(line)
http.client.BadStatusLine: http/1.1 200 OK

My code is:

import warcat.tool
tool = warcat.tool.ExtractTool(
    ['/tmp/my.warc'],
    out_dir='/tmp/out/',
    preserve_block=False,
    keep_going=True,
)
tool.process()
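
A possible explanation, offered as a sketch and not a confirmed diagnosis of these WARC files: Python's http.client checks that the version token starts with "HTTP/" in upper case, so the lowercase "http/1.1" shown in the error message would be rejected no matter what status code follows. The standalone example below uses only the standard library to reproduce that behaviour and shows one workaround, upper-casing the leading token before parsing. The names `_FakeSocket`, `parse_response`, and `normalize_status_line` are hypothetical helpers written for this illustration; they are not part of warcat.

```python
import http.client
import io
import re


class _FakeSocket:
    """Hypothetical socket stand-in so HTTPResponse can read from bytes."""

    def __init__(self, data: bytes):
        self._file = io.BytesIO(data)

    def makefile(self, *args, **kwargs):
        return self._file


def parse_response(data: bytes) -> http.client.HTTPResponse:
    """Parse raw HTTP response bytes with the standard library parser."""
    response = http.client.HTTPResponse(_FakeSocket(data))
    response.begin()  # raises BadStatusLine on a malformed status line
    return response


def normalize_status_line(data: bytes) -> bytes:
    """Upper-case a leading 'http/' token; leave everything else untouched."""
    return re.sub(rb"\Ahttp/", b"HTTP/", data, count=1, flags=re.IGNORECASE)


# Assumed sample payload mimicking the status line from the traceback.
raw = b"http/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"

try:
    parse_response(raw)
    rejected = False
except http.client.BadStatusLine:
    rejected = True

fixed = parse_response(normalize_status_line(raw))
status = fixed.status
body = fixed.read()
```

If this is indeed the cause, the records that fail would be the ones whose servers (or capture tools) wrote the protocol name in lower case, which would also explain why other pages in the same files extract without errors.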