bug/ImportError: cannot import name 'open_filename' from 'pdfminer.utils' #3801

Closed
Antony-M1 opened this issue Nov 27, 2024 · 1 comment
Labels: bug (Something isn't working)

Comments


Describe the bug
After installing unstructured and pdfminer with the commands below, I get the following error.

!pip install -q unstructured==0.16.8
!pip install -q pdfminer==20191125

Code

from unstructured.partition.pdf import partition_pdf
from unstructured.staging.base import elements_to_json

Error

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-44-899c36516ea4> in <cell line: 8>()
      6 from langchain.text_splitter import PythonCodeTextSplitter
      7 
----> 8 from unstructured.partition.pdf import partition_pdf
      9 from unstructured.staging.base import elements_to_json
     10 

/usr/local/lib/python3.10/dist-packages/unstructured/partition/pdf.py in <module>
     34 )
     35 from pdfminer.pdftypes import PDFObjRef
---> 36 from pdfminer.utils import open_filename
     37 from PIL import Image as PILImage
     38 

ImportError: cannot import name 'open_filename' from 'pdfminer.utils' (/usr/local/lib/python3.10/dist-packages/pdfminer/utils.py)

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
Antony-M1 added the bug label Nov 27, 2024
Collaborator

scanny commented Dec 15, 2024

@Antony-M1 Unstructured uses pdfminer.six, not pdfminer, and the current version is 20231228.

To install unstructured, follow the installation instructions on the repo home page. In your case, probably something like:

$ pip install "unstructured[pdf]"
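
To check which of the two distributions is actually present in an environment, something like the following sketch may help. It only uses the standard library's importlib.metadata; the helper names installed_version and diagnose are made up for illustration and are not part of unstructured or pdfminer.six.

```python
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def diagnose():
    """Report which pdfminer distributions are installed and whether
    pdfminer.utils exposes open_filename (the name unstructured imports)."""
    report = {
        "pdfminer": installed_version("pdfminer"),
        "pdfminer.six": installed_version("pdfminer.six"),
        "has_open_filename": False,
    }
    # Both distributions provide an importable package named "pdfminer",
    # so check the module that is actually found on the import path.
    if util.find_spec("pdfminer") is not None:
        try:
            from pdfminer import utils
            report["has_open_filename"] = hasattr(utils, "open_filename")
        except ImportError:
            pass
    return report


if __name__ == "__main__":
    print(diagnose())
```

If the report shows a version for pdfminer, or has_open_filename is False, the environment most likely has the wrong package, and removing pdfminer before reinstalling with the command above is the probable fix.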

scanny closed this as completed Dec 15, 2024