OCR-Docker

Extract text from images and PDF files

OCR-Docker is an easy-to-use system, built with Python and Flask, that extracts text from images and PDF files in multiple languages.

Features

  • Extract text from images (PNG, JPG, TIFF).
  • Extract text from PDF files (single or multiple pages).

Components and Frameworks used in OCR-Docker

The OCR (Optical Character Recognition) feature is provided by tesseract-ocr, a free and open-source OCR engine.
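To give a sense of what the underlying engine does, here is a minimal sketch that calls the `tesseract` command-line tool directly from Python. It is not taken from OCR-Docker's code; the helper names `build_tesseract_command` and `extract_text` and the file name `scan.png` are hypothetical, and it assumes `tesseract` and the requested language data are installed on the machine running it.

```python
import subprocess


def build_tesseract_command(image_path: str, languages: list[str]) -> list[str]:
    """Build a tesseract CLI call that prints recognized text to stdout.

    Hypothetical helper for illustration; not part of OCR-Docker.
    """
    # Tesseract accepts several languages joined with "+", e.g. "eng+deu".
    lang_arg = "+".join(languages) if languages else "eng"
    # Passing "stdout" as the output base makes tesseract write to stdout.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def extract_text(image_path: str, languages: list[str]) -> str:
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

For example, `extract_text("scan.png", ["eng", "deu"])` would run the engine with both English and German language data, provided both are installed.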

Installation

docker-compose from Docker Hub

version: "3.7"
services:
  ocr:
    image: techblog/ocr-docker:latest
    ports:
      - "8080:8080"
    container_name: tts-stt
    labels:
      - "com.ouroboros.enable=true"
    networks:
      - default
    restart: unless-stopped

Now run docker-compose up -d to pull and start the container. Then open your browser and navigate to the container's IP address on port 8080. You should see the following screen.
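If the page does not load, a quick script can confirm whether anything is answering on the published port. The sketch below uses only the Python standard library. It assumes the container is published on `localhost:8080` as in the compose file above, and `is_reachable` is a hypothetical helper name. It only checks that an HTTP server responds at the root URL and makes no assumption about OCR-Docker's routes.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url` with any status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-HTTP reply.
        return False


if __name__ == "__main__":
    # Assumption: port 8080 is mapped to the host as in the compose file.
    url = "http://localhost:8080/"
    print(url, "is reachable" if is_reachable(url) else "is not reachable")
```

If the check fails, `docker-compose ps` and `docker-compose logs` are the usual next steps for seeing whether the container started.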

[Screenshot: OCR]