Traceback (most recent call last):
  File "/usr/bin/zimit", line 541, in <module>
    zimit()
  File "/usr/bin/zimit", line 443, in zimit
    return warc2zim(warc2zim_args)
  File "/app/zimit/lib/python3.10/site-packages/warc2zim/main.py", line 811, in warc2zim
    return warc2zim.run()
  File "/app/zimit/lib/python3.10/site-packages/warc2zim/main.py", line 433, in run
    self.add_items_for_warc_record(record)
  File "/app/zimit/lib/python3.10/site-packages/warc2zim/main.py", line 646, in add_items_for_warc_record
    payload_item = WARCPayloadItem(record, self.head_insert, self.css_insert)
  File "/app/zimit/lib/python3.10/site-packages/warc2zim/main.py", line 179, in __init__
    self.title = parse_title(self.content)
  File "/app/zimit/lib/python3.10/site-packages/warc2zim/main.py", line 714, in parse_title
    soup = BeautifulSoup(content, "html.parser")
  File "/app/zimit/lib/python3.10/site-packages/bs4/__init__.py", line 348, in __init__
    self._feed()
  File "/app/zimit/lib/python3.10/site-packages/bs4/__init__.py", line 434, in _feed
    self.builder.feed(self.markup)
  File "/app/zimit/lib/python3.10/site-packages/bs4/builder/_htmlparser.py", line 377, in feed
    parser.feed(markup)
  File "/usr/lib/python3.10/html/parser.py", line 110, in feed
    self.goahead(0)
  File "/usr/lib/python3.10/html/parser.py", line 178, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.10/html/parser.py", line 263, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib/python3.10/_markupbase.py", line 144, in parse_marked_section
    sectName, j = self._scan_name( i+3, i )
  File "/usr/lib/python3.10/_markupbase.py", line 390, in _scan_name
    raise AssertionError(
AssertionError: expected name token at '<![\x05�\x069�y�\x00"���@��\x11H'
FATAL: exception not rethrown
Before that, we have many times in the log:
[WARNING] Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
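The traceback above shows the crash happening while a title is extracted from a record whose payload looks like binary data rather than HTML. Below is a minimal sketch of one possible mitigation, not warc2zim's actual fix: treat any parser failure as "no title" instead of aborting the whole run. The names `safe_parse_title` and `_TitleParser` are hypothetical, and the sketch uses the standard library's `html.parser` directly rather than BeautifulSoup so that it stays self-contained.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self.done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def safe_parse_title(content: bytes) -> str:
    """Return the document title, or "" if the payload cannot be parsed."""
    parser = _TitleParser()
    try:
        # Undecodable bytes become U+FFFD instead of raising.
        parser.feed(content.decode("utf-8", errors="replace"))
        parser.close()
    except Exception:
        # Some Python versions raise AssertionError on input such as '<![\x05...';
        # a missing title is preferable to a failed conversion.
        return ""
    return "".join(parser.parts).strip()
```

With a guard like this, a record that is not really HTML would simply end up without a title, and the conversion could continue with the next record.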
A youzim.it run of https://archives.nyphil.org/ failed after reporting many characters that could not be decoded.
Task is here.
Command used:
Final error: the AssertionError traceback above, preceded many times in the log by the decoding warning.