[FPubD] Searchable PDFs #7

Open
sk33wiff opened this issue Dec 23, 2016 · 6 comments

Comments

sk33wiff commented Dec 23, 2016

Hi,

I was wondering if there is any way to add search support to the generated PDFs.

This one is a good example:

https://issuu.com/guani/docs/designpatternscard

When downloaded through issuu.com, the PDF is searchable, but not when it's downloaded with the tool.

Congrats on the project!
Cheers

@robsonsmartins
Owner

Hi,
Thank you for using my tool!

I appreciate your suggestion. In the case you mention, the publication's author uploaded an original PDF that already contains text (or was OCR'd), created, for example, in Adobe Acrobat Pro. The author also enabled normal download of the publication on issuu.com.

My difficulty is finding an open source, pure JavaScript OCR library, because the PDF generation library I use has no OCR capability. My tool generates the PDF from the source images (JPG) that issuu.com makes available to anyone, for every publication hosted on the site.

If you know of any open source, pure JavaScript OCR library (one that works on static JPG images), please let me know.

For now, to convert a non-searchable PDF into a searchable one, use the OCR feature in Adobe Acrobat Pro.

@sk33wiff
Author

I see, thanks for the explanation.
Regarding libraries, I recently came across these:

I've created a CodePen as a Tesseract demo: http://codepen.io/anon/pen/pNmEMm
It currently uses a data URI for the image due to cross-domain limitations.

Ocrad demo: https://github.com/kdzwinel/JS-OCR-demo
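
The data URI workaround mentioned for the Tesseract demo can be sketched roughly as follows. This is a minimal illustration, assuming Node.js (for `Buffer`) and that the page image bytes have already been fetched; `toDataUri` is a hypothetical helper name, not part of the tool or of any OCR library.

```javascript
// Hypothetical helper: embed raw image bytes in a data URI so an
// in-browser OCR library can read the image without a cross-origin request.
// Assumes Node.js, where Buffer is available as a global.
function toDataUri(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Example with the first three bytes of a JPEG file (FF D8 FF).
const uri = toDataUri([0xff, 0xd8, 0xff], 'image/jpeg');
console.log(uri); // data:image/jpeg;base64,/9j/
```

The resulting string can be passed wherever the OCR library accepts an image source, at the cost of roughly one third more data than the raw bytes.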

Not sure how realistic it is to use them.
Cheers

marcelocecin commented Jan 6, 2017

@robsonsmartins
Owner

Thanks, @marcelocecin
This URL pattern provides searchable PDFs for every page of a publication.
I'm looking for a free/open source JavaScript library to merge many PDF files into one.
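
One possible way to do the merge step, sketched under assumptions: the open source pdf-lib package is one JavaScript library that can copy pages between documents. The thread does not say which library was eventually chosen, so this is an illustration only. `mergePdfs` is a hypothetical function name, and the `PDFDocument` class is passed in as a parameter (for example from `require('pdf-lib')`) so the sketch itself has no hard dependency.

```javascript
// Sketch: merge several PDF byte arrays into a single document.
// `PDFDocument` is assumed to be the class exported by pdf-lib;
// it is injected by the caller rather than imported here.
async function mergePdfs(PDFDocument, pdfByteArrays) {
  const merged = await PDFDocument.create();
  for (const bytes of pdfByteArrays) {
    const source = await PDFDocument.load(bytes);
    // copyPages returns page copies that belong to the target document.
    const pages = await merged.copyPages(source, source.getPageIndices());
    for (const page of pages) {
      merged.addPage(page); // append in the original page order
    }
  }
  return merged.save(); // in pdf-lib this resolves to a Uint8Array
}
```

Usage would look something like `mergePdfs(PDFDocument, pageBuffers)`, where `pageBuffers` holds the per-page PDFs in reading order. Text layers in the source pages should carry over when pages are copied this way, but that is worth verifying on a real publication.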

@robsonsmartins
Owner

Thanks a lot! Now I just need some free time to work on this...

@robsonsmartins robsonsmartins changed the title Searchable PDFs [FPubD] Searchable PDFs Jul 2, 2024

3 participants