
ocr-docker is a small, Flask-powered web app that extracts text from images and PDF documents using OCR.


shinchao88/ocr-docker

 
 


OCR-Docker

Extract text from images & PDF files

OCR-Docker is an easy-to-use system, built with Python and Flask, that extracts text from images and PDF files in multiple languages.

Features

  • Extract text from images (PNG, JPG, TIFF).
  • Extract text from PDF files (single or multiple pages).

Components and Frameworks used in OCR-Docker

The OCR (Optical Character Recognition) feature is free thanks to tesseract-ocr, an open-source OCR project.
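To give a feel for what tesseract-ocr does underneath, here is a minimal sketch that calls the `tesseract` command-line tool from Python and returns the recognized text. It is only an illustration: it assumes the `tesseract` binary and the requested language data are installed, and the helper names `tesseract_command` and `extract_text` are hypothetical. It is not necessarily how OCR-Docker invokes tesseract internally.

```python
import shutil
import subprocess
from typing import List, Sequence


def tesseract_command(image_path: str, languages: Sequence[str] = ("eng",)) -> List[str]:
    """Build a tesseract CLI call that prints the recognized text to stdout.

    Several languages are combined with '+', e.g. 'eng+deu'.
    """
    return ["tesseract", image_path, "stdout", "-l", "+".join(languages)]


def extract_text(image_path: str, languages: Sequence[str] = ("eng",)) -> str:
    """Run tesseract on one image and return the text it recognizes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

For example, `extract_text("scan.png", ["eng", "deu"])` would run tesseract with English and German enabled, provided both language packs are present.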

Installation

docker-compose from hub

version: "3.7"
services:
  ocr:
    image: techblog/ocr-docker:latest
    ports:
      - "8080:8080"
    container_name: tts-stt
    labels:
      - "com.ouroboros.enable=true"
    networks:
      - default
    restart: unless-stopped

Now run docker-compose up -d to pull and start the container. Then open your browser and navigate to the container's IP address on port 8080; you should see the following screen.

[Screenshot: OCR]
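If you would rather confirm from a script that the container is answering, something like the following sketch may help. It only assumes the 8080 port mapping from the compose file above; the helper names `service_url` and `is_up` are hypothetical, and the host you pass in depends on where Docker is running.

```python
from http.client import HTTPException
from urllib.error import URLError
from urllib.request import urlopen


def service_url(host: str, port: int = 8080) -> str:
    """Build the base URL of the web UI; 8080 matches the compose port mapping."""
    return f"http://{host}:{port}/"


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

Calling `is_up(service_url("localhost"))` on the Docker host should return `True` once the container has finished starting.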

