An incorrect h tag can cause the machine to freeze due to excessive memory use #143

Open
vokiput opened this issue Aug 27, 2024 · 0 comments

vokiput commented Aug 27, 2024

markdownify in ./.venv/lib/python3.8/site-packages (0.13.1)

The issue was found with the atheris fuzzing library.

The code to reproduce the issue:

from markdownify import markdownify
markdownify("<html><body><h5555555555>My First Heading</h5555555555><p>My first paragraph.</p></body></html>")

My machine froze (Ubuntu 20.04, 16 GB RAM). Memory usage went to 100% within 2-3 seconds, and the only way to recover was to power the machine off and on again.

The only valid heading tags are h1 through h6; anything else should be ignored.
This may be an edge case, but a string like the one in the example could be sent to a server that runs markdownify in order to cause resource exhaustion.
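
A minimal sketch of the check proposed above, assuming the heading level is parsed from the tag name. `heading_level` is a hypothetical helper for illustration, not part of markdownify's API, and this is not necessarily how the maintainers would fix it:

```python
import re
from typing import Optional

# Hypothetical helper (not markdownify code): accept only h1..h6.
_VALID_HEADING = re.compile(r"h([1-6])", re.IGNORECASE)


def heading_level(tag_name: str) -> Optional[int]:
    """Return the heading level (1-6) for a valid heading tag, else None."""
    match = _VALID_HEADING.fullmatch(tag_name)
    return int(match.group(1)) if match else None
```

With a guard like this, a tag such as h5555555555 yields None and can be treated as an unknown tag, so no string of that many '#' characters is ever built.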

Related cases (these will be fixed once the original issue is fixed):

import sys
from markdownify import markdownify
markdownify(f"<h5{sys.maxsize // 10}>")
Traceback (most recent call last):
  File "/home/redacted/code/other/atheris/pythonProject/test_unit_009-1.py", line 22, in <module>
    markdownify(f"<h5{sys.maxsize // 10}>")
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 433, in markdownify
    return MarkdownConverter(**options).convert(html)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 105, in convert
    return self.convert_soup(soup)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 108, in convert_soup
    return self.process_tag(soup, convert_as_inline=False, children_only=True)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 151, in process_tag
    text += self.process_tag(el, convert_children_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 156, in process_tag
    text = convert_fn(node, text, convert_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 188, in convert_tag
    return self.convert_hn(n, el, text, convert_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 283, in convert_hn
    hashes = '#' * n
MemoryError

and

import sys
from markdownify import markdownify
markdownify(f"<h{sys.maxsize + 1}>")
Traceback (most recent call last):
  File "/home/redacted/code/other/atheris/pythonProject/test_unit_009-1.py", line 15, in <module>
    markdownify(f"<h{sys.maxsize + 1}>")
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 433, in markdownify
    return MarkdownConverter(**options).convert(html)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 105, in convert
    return self.convert_soup(soup)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 108, in convert_soup
    return self.process_tag(soup, convert_as_inline=False, children_only=True)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 151, in process_tag
    text += self.process_tag(el, convert_children_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 156, in process_tag
    text = convert_fn(node, text, convert_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 188, in convert_tag
    return self.convert_hn(n, el, text, convert_as_inline)
  File "/home/redacted/code/other/atheris/pythonProject/.venv/lib/python3.8/site-packages/markdownify/__init__.py", line 283, in convert_hn
    hashes = '#' * n
OverflowError: cannot fit 'int' into an index-sized integer
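
Both tracebacks end at the same line, hashes = '#' * n, so the failure can be shown without markdownify at all. The sketch below reproduces the OverflowError and shows clamping as one possible alternative to ignoring the tag. `heading_prefix` and its `max_level` parameter are hypothetical names for illustration, not markdownify code:

```python
import sys


def heading_prefix(n: int, max_level: int = 6) -> str:
    """Hypothetical stand-in for the failing line: clamp n to 1..max_level."""
    n = max(1, min(max_level, n))
    return "#" * n


# Repeating a string more than sys.maxsize times cannot even be attempted,
# which is the OverflowError shown in the second traceback.
try:
    "#" * (sys.maxsize + 1)
except OverflowError as exc:
    print(type(exc).__name__)  # prints "OverflowError"
```

Clamping turns any oversized level into h6 instead of dropping it. Whether that or ignoring the tag is the better behaviour is a design choice for the maintainers.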