Closed
Description of the bug
In the test case below, the font name (which contains Chinese characters) appears to be decoded incorrectly when extracted with get_fonts() or get_text('rawdict'). Please look into it, thanks.
How to reproduce the bug
import fitz  # PyMuPDF

doc = fitz.Document('sample.pdf')
doc[0].get_fonts()
# output:
#[(6,
# 'ttf',
# 'TrueType',
# 'BCDEEE+å\x8d\x8eæ\x96\x87仿å®\x8b', <- from PDF Viewer, the name should be 华文仿宋
# 'F1',
# 'WinAnsiEncoding')]
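The garbled name looks like the UTF-8 bytes of 华文仿宋 decoded as Latin-1. That is only a guess from the output above, not confirmed PyMuPDF behaviour. A minimal workaround sketch under that assumption (repair_font_name is a hypothetical helper, not part of PyMuPDF):

```python
def repair_font_name(name: str) -> str:
    """Hypothetical helper: undo UTF-8 bytes that were decoded as Latin-1."""
    try:
        # Map each character back to its original byte, then decode as UTF-8.
        return name.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Not this kind of mojibake (or already correct): leave unchanged.
        return name


# The name from the output above, written with escapes for the raw bytes.
garbled = "BCDEEE+\xe5\x8d\x8e\xe6\x96\x87\xe4\xbb\xbf\xe5\xae\x8b"
repaired = repair_font_name(garbled)
print(repaired)
```

Names that are already correct, or that cannot be round-tripped this way, are returned unchanged, so the helper should be safe to apply to every entry returned by get_fonts().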
PyMuPDF version
1.23.8
Operating system
Windows
Python version
3.8