Extract text from images

Goals

Read PDF, JPEG, and PNG files
Maintain formatting
Write output to a unique .html file
Run in terminal
Minimize packages to install
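
A minimal sketch of how the terminal and unique-output goals could fit together, using only the standard library. The names `unique_html_path`, `render_html`, and `main`, and the `--out-dir` flag, are hypothetical. Wrapping the text in `<pre>` is only one assumed way to keep formatting. The recognition step is left as a placeholder.

```python
import argparse
import html
from pathlib import Path


def unique_html_path(source: Path, out_dir: Path) -> Path:
    """Return an .html path in out_dir that does not exist yet."""
    candidate = out_dir / f"{source.stem}.html"
    counter = 1
    # Append _1, _2, ... until the name is free, so earlier runs are kept.
    while candidate.exists():
        candidate = out_dir / f"{source.stem}_{counter}.html"
        counter += 1
    return candidate


def render_html(text: str, title: str) -> str:
    """Wrap extracted text in <pre> so line breaks and spacing survive."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><pre>{html.escape(text)}</pre></body></html>\n"
    )


def main(argv=None) -> Path:
    """Parse terminal arguments, write the HTML file, and return its path."""
    parser = argparse.ArgumentParser(description="Extract text from an image or PDF.")
    parser.add_argument("source", type=Path, help="PDF, JPEG, or PNG file")
    parser.add_argument("--out-dir", type=Path, default=Path("."))
    args = parser.parse_args(argv)

    text = ""  # placeholder (assumption): the recognition step would fill this in
    out_path = unique_html_path(args.source, args.out_dir)
    out_path.write_text(render_html(text, args.source.name), encoding="utf-8")
    return out_path
```

To run this from a terminal, call `main()` under an `if __name__ == "__main__":` guard in a script and pass the input file as the first argument.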

Layout

pdf2image and PIL to read files
~~pytesseract to read characters~~
- Two niche installs are too many
Train a TensorFlow model to read characters instead
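
The file-reading step at the top of this list might look roughly like the sketch below. `input_kind` and `load_pages` are hypothetical helper names. `convert_from_path` is pdf2image's function for rendering PDF pages to PIL images, and it needs Poppler installed on the system. The imports are deferred so that plain image input works without pdf2image.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def input_kind(path: Path) -> str:
    """Classify a file as 'pdf' or 'image' by its extension."""
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    raise ValueError(f"Unsupported file type: {path.suffix!r}")


def load_pages(path: Path) -> list:
    """Return a list of PIL images: one per PDF page, or one for an image file."""
    if input_kind(path) == "pdf":
        # Imported lazily so image-only use does not require pdf2image/Poppler.
        from pdf2image import convert_from_path

        return convert_from_path(str(path))

    from PIL import Image

    with Image.open(path) as img:
        # convert() returns a new in-memory image, so the file can be closed.
        return [img.convert("RGB")]
```

Whatever model ends up reading the characters can then take the list from `load_pages` without caring whether the input was a PDF or a single image.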