OCR (Optical Character Recognition) App using ColPali implementation of the new Byaldi library + Huggingface transformers for Qwen2-VL.

Prof-chaos-5/OCR-APP

OCR-APP

OCR (Optical Character Recognition) app using the ColPali implementation from the new Byaldi library and Hugging Face Transformers for Qwen2-VL. Made by Prasoon Sharma.

This project is an OCR app built with Gradio and powered by deep learning models such as Qwen2-VL. Users can upload an image, extract text in both Hindi and English, and optionally highlight keywords in the extracted text.

Features

  • Upload an image and extract text in multiple languages (supports Hindi and English).
  • Optional keyword search: Highlights specific keywords in the extracted text.
  • Uses OpenCV for image preprocessing.
  • Intuitive web-based UI with Gradio for easy interaction.
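The keyword highlighting feature could be sketched roughly as follows. This is an illustration only, not the app's actual code: the function name `highlight_keywords` and the choice of wrapping matches in an HTML `<mark>` tag are assumptions, and it uses only the Python standard library.

```python
import html
import re


def highlight_keywords(text, keywords, tag="mark"):
    """Wrap each case-insensitive keyword match in an HTML tag.

    Hypothetical helper: the name and the <mark> markup are assumptions,
    not taken from the app's source.
    """
    # Drop empty keywords; match longer terms first when terms overlap.
    terms = sorted({k.strip() for k in keywords if k.strip()}, key=len, reverse=True)
    if not terms:
        return html.escape(text)

    # A single capturing group makes re.split return alternating
    # [non-match, match, non-match, ...] pieces.
    pattern = re.compile("(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE)
    pieces = pattern.split(text)

    out = []
    for i, piece in enumerate(pieces):
        safe = html.escape(piece)  # escape OCR text so it cannot inject markup
        out.append(f"<{tag}>{safe}</{tag}>" if i % 2 else safe)
    return "".join(out)


if __name__ == "__main__":
    sample = "Invoice total: 450 rupees. कुल राशि: 450 रुपये"
    print(highlight_keywords(sample, ["total", "राशि"]))
```

Escaping each piece separately keeps characters such as `<` in the recognized text from being interpreted as markup, while the highlight tags themselves stay intact.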

Setup

Requirements

  • Python 3.7+
  • PyTorch
  • OpenCV
  • Gradio
  • Transformers (Hugging Face)

Install dependencies using the following command:

pip install -r requirements.txt
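After installing, a quick check like the one below can confirm that the listed packages are importable. It is a minimal sketch: the helper `missing_dependencies` is hypothetical, and the import names (`torch`, `cv2`, `gradio`, `transformers`) are the usual module names for these packages rather than something stated in this repository.

```python
import importlib.util

# Display name -> import name. The import names are assumed, based on how
# these packages are normally imported.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "Gradio": "gradio",
    "Transformers": "transformers",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```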

How to run the app

Demo and Screenshots

