Overview

Finds a Joplin notebook page. Converts a PDF to images. Runs optical-character recognition on the images and adds it as alternative text. Adds the images to the page in Joplin and saves.
Result: Searchable printout of a PDF added to the page.

How to use it?

Install Tesseract on your system.
- (Guide with Ubuntu: https://techviewleo.com/how-to-install-tesseract-ocr-on-ubuntu/)
Make sure you have installed the Python requirements as found in requirements.txt.
- It is recommended to use a virtual environment but not necessary.
Make sure Joplin Web Clipper is enabled and started in the client.
- Go to Tools -> Options -> Web Clipper and enable it.
Export your API key to the environment variable JOPLIN_API_TOKEN
- Alternative: Provide it through CLI with --api-token
Run ./noteUpload.py --notebooks "parent" "child" --note-name "title" --pdf "/path/to/pdf"
- Concrete example: ./noteUpload.py --notebooks "Embedded Real Time Systems" "Week 2" --note-name "SystemC" --pdf L4_SystemC_part3.pdf
  - Assumes notebook structure: Embedded Real-Time Systems -> Week 2 -> SystemC. Copies printout of "L4_SystemC_part3.pdf" to it.

Known issues:

Currently, the script does not support:

Changing server URL/port. Always assumes localhost:41184
Create the note structure if it doesn't exist.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
README.md		README.md
joplinFindNote.py		joplinFindNote.py
joplinOcrImages.py		joplinOcrImages.py
joplinPdf2Images.py		joplinPdf2Images.py
noteUpload.py		noteUpload.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

How to use it?

Known issues:

About

Releases

Packages

Languages

mortenhaahr/joplin-pdf-ocr-upload

Folders and files

Latest commit

History

Repository files navigation

Overview

How to use it?

Known issues:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages