Horizontal spaces / Tabs in a line result in text being read as two lines | TEXT_PRESERVE_WHITESPACE not working as intended #2810
Labels: not a bug
Describe the bug (mandatory)
When a line of text in my PDF contains a tab (or other horizontal whitespace), the tab is converted to a newline character, so the line is read as two lines. Passing flags=2 or flags=fitz.TEXT_PRESERVE_WHITESPACE does not resolve the issue.
To Reproduce (mandatory)
test.pdf
import fitz

doc = fitz.open("/test.pdf")
for page in doc:
    print(page.get_text(option="text", flags=fitz.TEXT_PRESERVE_WHITESPACE))
Output
Test
this is a test
Expected behavior (optional)
Print the text line by line and keep the whitespace within each line:
Test this is a test
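A possible workaround, offered only as a sketch: instead of relying on the "text" output, take the word tuples from page.get_text("words") and regroup them by vertical position, so that pieces separated by a wide horizontal gap land on one line again. The helper names join_words_by_line and page_lines and the y_tolerance default are hypothetical, not part of PyMuPDF. The sketch assumes that the two pieces share (roughly) the same baseline, and it joins words with a single space rather than reproducing the original tab.

```python
def join_words_by_line(words, y_tolerance=3.0):
    """Regroup word tuples into visual lines by their bottom y coordinate.

    `words` is expected in the shape returned by page.get_text("words"):
    (x0, y0, x1, y1, text, block_no, line_no, word_no).
    `y_tolerance` (in points) is an assumed threshold; tune it per document.
    """
    lines = []  # each entry: (reference bottom y, list of word tuples)
    for w in sorted(words, key=lambda w: (w[3], w[0])):
        if lines and abs(lines[-1][0] - w[3]) <= y_tolerance:
            lines[-1][1].append(w)      # same visual line as the previous word
        else:
            lines.append((w[3], [w]))   # start a new visual line
    # Within each line, order words left to right and join with one space.
    return [
        " ".join(w[4] for w in sorted(ws, key=lambda w: w[0]))
        for _, ws in lines
    ]


def page_lines(path):
    """Yield regrouped lines for every page of the PDF at `path`."""
    import fitz  # PyMuPDF; imported here so the helper above runs without it

    doc = fitz.open(path)
    for page in doc:
        yield from join_words_by_line(page.get_text("words"))
```

With the file from this report, the usage would be `for line in page_lines("/test.pdf"): print(line)`. Whether this yields a single line depends on the two fragments having bottom coordinates within y_tolerance of each other, which has not been checked against test.pdf.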