feat: replace PyMuPDF by pdf2image, for license compatibility #818

Closed
wants to merge 2 commits into from

charlesmindee
Collaborator

@charlesmindee charlesmindee commented Feb 15, 2022

This PR replaces PyMuPDF with pdf2image, which is released under the MIT license.
This library converts a PDF (from either a path or a byte stream) into a list of images.
We lose the text extraction feature for source PDFs, but since we have not used it so far, nothing changes in practice.
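
For illustration, here is a minimal sketch of the path-or-stream conversion described above. It is not the code from this PR: `read_pdf` and its `dpi` default are hypothetical names chosen for the example, and only `convert_from_path` and `convert_from_bytes` come from pdf2image itself.

```python
from pathlib import Path
from typing import Any, List, Union


def read_pdf(file: Union[str, Path, bytes], dpi: int = 144) -> List[Any]:
    """Render each page of a PDF as a PIL image.

    Hypothetical helper for illustration only, not docTR's actual API.
    """
    if not isinstance(file, (str, Path, bytes)):
        raise TypeError(f"unsupported input type: {type(file).__name__}")

    # Imported lazily: pdf2image is a thin wrapper around poppler's
    # command-line tools, which must be installed separately on the system.
    from pdf2image import convert_from_bytes, convert_from_path

    if isinstance(file, bytes):
        # In-memory PDF content (a byte stream).
        return convert_from_bytes(file, dpi=dpi)
    # PDF stored on disk.
    return convert_from_path(str(file), dpi=dpi)
```

Both branches return one PIL image per page, so callers can treat a path and a byte stream the same way.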

Any feedback is welcome!
closes #486
closes #113

@charlesmindee charlesmindee added the labels module: io (Related to doctr.io), type: breaking change (Introducing a breaking change), and ext: demo (Related to demo folder) Feb 15, 2022
@charlesmindee charlesmindee added this to the 0.5.1 milestone Feb 15, 2022
@charlesmindee charlesmindee self-assigned this Feb 15, 2022
@felixdittrich92
Contributor

@charlesmindee 👋
Have you compared it with pdfminer.six, which is also under the MIT license but would not lose anything in this case? 😄

@charlesmindee
Collaborator Author

Closing this since it requires poppler to be installed.

@fg-mindee fg-mindee deleted the pdf2image branch March 18, 2022 15:34

Successfully merging this pull request may close these issues.

Consider removing PyMuPDF for dependency that is not AGPL licensed
[conda] Unable to make a conda build