When the file name has Chinese, an error occurred when converting pdf #937

Open
x1y9 opened this issue Feb 11, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@x1y9

x1y9 commented Feb 11, 2025

Bug

When the file name contains Chinese characters, an error occurs when converting the PDF.
If I rename the file to an English name, everything works.

Steps to reproduce

docling 中文.pdf -vv
ERROR:docling.datamodel.document:An unexpected error occurred while opening the document 中文.pdf
Traceback (most recent call last):

Docling version

Docling version: 2.21.0
Docling Core version: 2.18.0
Docling IBM Models version: 3.3.1
Docling Parse version: 3.3.0
Python: cpython-311 (3.11.8)
Platform: Windows-10-10.0.22631-SP0

Python version

Python 3.11.8

@x1y9 x1y9 added the bug Something isn't working label Feb 11, 2025
@PeterStaar-IBM
Contributor

@x1y9 Can you provide an example file so we can reproduce this? I want to solve this ASAP.

@x1y9
Author

x1y9 commented Feb 11, 2025

Rename any PDF file to "中文.pdf", then run docling to convert it.

@dolfim-ibm
Contributor

We cannot reproduce the issue. Could it be something specific to Windows? Can you please share the stack trace produced?

@happyTonakai

Check if your full path contains Chinese characters
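
One quick way to run that check, as a minimal sketch using only the Python standard library: the helper below lists which components of a path contain non-ASCII characters. The name `non_ascii_parts` is hypothetical and not part of docling, and the sketch only inspects the path string; it does not reproduce or diagnose the conversion error itself.

```python
from pathlib import PurePath


def non_ascii_parts(path: str) -> list[str]:
    """Return the components of `path` that contain non-ASCII characters.

    Hypothetical helper for illustration; it only inspects the path string
    and never touches the file system.
    """
    return [part for part in PurePath(path).parts if not part.isascii()]
```

Calling it on the absolute path, for example `non_ascii_parts(os.path.abspath("中文.pdf"))` after `import os`, would show whether only the file name or also a parent directory (such as the user folder) contains Chinese characters.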

@PeterStaar-IBM
Contributor

@x1y9 If you can provide us with an example to reproduce, we can fix it. Otherwise, we need to close this issue. We did try renaming the file, but no error was seen.
