OCR-APP

An OCR (Optical Character Recognition) app that uses the ColPali implementation from the new Byaldi library together with Hugging Face Transformers for Qwen2-VL. Made by Prasoon Sharma.

This project is an Optical Character Recognition (OCR) app built using Gradio and powered by deep learning models such as Qwen2-VL. The app allows users to upload an image, extract text in both Hindi and English, and optionally highlight keywords in the extracted text.

Features

Upload an image and extract text in multiple languages (supports Hindi and English).
Optional keyword search: Highlights specific keywords in the extracted text.
Uses OpenCV for image preprocessing.
Intuitive web-based UI with Gradio for easy interaction.
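
The keyword highlighting feature can be illustrated with a small standard-library sketch. This is not necessarily how OCR_App.py does it: the function name highlight_keywords and the choice of HTML <mark> tags are assumptions made for the example. It matches keywords case-insensitively and escapes the OCR text so it can be shown safely in an HTML component.

```python
import html
import re


def highlight_keywords(text, keywords):
    """Return `text` as HTML with every keyword match wrapped in <mark> tags.

    Hypothetical helper for illustration; matching is case-insensitive and
    works on any Unicode text, including Hindi (Devanagari).
    """
    # Ignore empty keywords; try longer terms first so they win over prefixes.
    terms = sorted({k.strip() for k in keywords if k and k.strip()},
                   key=len, reverse=True)
    if not terms:
        return html.escape(text)

    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)

    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches, then wrap the match itself.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

For example, highlight_keywords("Hello World", ["world"]) wraps "World" in a <mark> element while leaving the rest of the text unchanged. Matching on the raw text and escaping each piece afterwards avoids accidentally highlighting parts of HTML entities such as "&lt;".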

Setup

Requirements

Python 3.7+
PyTorch
OpenCV
Gradio
Transformers (Hugging Face)

Install dependencies using the following command:

pip install -r requirements.txt
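
After installing, a quick check like the following can confirm that the listed requirements are importable. It is only a sketch: the import names (torch, cv2, gradio, transformers) are the usual ones for these packages, and the helper names missing_modules and python_ok are made up for this example.

```python
import importlib.util
import sys

# Usual import names for the requirements listed above. The import name
# differs from the package name for PyTorch (torch) and OpenCV (cv2).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV": "cv2",
    "Gradio": "gradio",
    "Transformers": "transformers",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be located."""
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]


def python_ok(minimum=(3, 7)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.7 or newer is required.")
    missing = missing_modules()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check uses importlib.util.find_spec, which locates a module without importing it, so it runs quickly even though PyTorch and Transformers are slow to import.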

How to run the app

Run the Gradio app on Hugging Face
Run on Google Colab
Run the Python file (OCR_App.py) locally
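
For the local option, the wiring between an OCR function and the Gradio UI might look roughly like the sketch below. The actual OCR_App.py may be structured differently: make_handler, build_demo, and the extract_text callable (any function that takes an image and returns a string, for example a wrapper around Qwen2-VL) are hypothetical names used only for illustration.

```python
def make_handler(extract_text):
    """Build a Gradio callback from an OCR function (image -> str).

    `extract_text` is a placeholder for the real model call.
    """
    def handler(image, keyword_csv=""):
        if image is None:
            return "", "No image uploaded."
        text = extract_text(image)
        # Keywords are entered as a comma-separated list; blanks are ignored.
        keywords = [k.strip() for k in (keyword_csv or "").split(",") if k.strip()]
        found = [k for k in keywords if k.lower() in text.lower()]
        summary = ", ".join(found) if found else "No keywords found."
        return text, summary

    return handler


def build_demo(extract_text):
    """Create the Gradio interface; call .launch() on the result to serve it."""
    # Imported here so the handler above can be tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(extract_text),
        inputs=[
            gr.Image(type="pil", label="Image"),
            gr.Textbox(label="Keywords (comma-separated, optional)"),
        ],
        outputs=[
            gr.Textbox(label="Extracted text"),
            gr.Textbox(label="Keywords found"),
        ],
        title="OCR-APP",
    )
```

Calling build_demo(my_ocr_function).launch() starts a local web server and prints the URL to open in a browser. Keeping the model call behind a plain function makes it possible to test the keyword logic with a stub instead of loading the model.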

Demo and Screenshots

-Demo and Screenshots

