🛠️ Modular component design: pull in only the features you need. 🚀 Easy to extend and flexible to use, just like playing with building blocks!
Docr is a modular component-based toolkit for document analysis and processing. It's designed with flexibility and extensibility in mind, making it easy to expand and use various document processing functionalities as needed.
- 📄 Layout Analysis
- 🔢 Formula Detection and Recognition
- 📝 Optical Character Recognition (OCR)
- 📊 Table Structure Recognition
- 📚 Reading Order Analysis
- 🖼️ Image Processing Utilities

Requirements:
- Python 3.10 or higher
- Poetry (for dependency management)
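
The two requirements above can be checked before installing. The following is a minimal sketch, not part of Docr: `check_prerequisites` is a hypothetical helper name, and it assumes Poetry is installed as an executable named `poetry` on your `PATH`.

```python
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration; it is not shipped with Docr.
    """
    problems = []
    # Docr requires Python 3.10 or higher.
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {version_info[0]}.{version_info[1]}"
        )
    # Poetry must be discoverable on PATH for `poetry install` to work.
    if which("poetry") is None:
        problems.append("Poetry not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the running interpreter and the current `PATH`; the parameters exist only so the check can be exercised with substitute values.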
- Clone the repository:
    git clone https://github.com/yjmm10/docr.git
    cd docr
    git clone https://huggingface.co/liferecords/Telos.git docr/models
- Install dependencies:
    poetry install -v
Here's a quick example of how to use Docr for OCR:
from docr import OCR
import cv2
# Initialize the OCR model
ocr_model = OCR()
# Read an image
image = cv2.imread("path/to/your/image.png")
# Perform OCR
result = ocr_model(image)
print(result)
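
The quick example handles a single image. To process a whole folder, the same call can be wrapped in a loop. This is only a sketch under assumptions: `ocr_folder` is a hypothetical helper, not part of Docr, and it assumes the model is a callable that takes a decoded image, as in the example above. It also skips files that fail to decode, since `cv2.imread` returns `None` in that case instead of raising.

```python
from pathlib import Path


def ocr_folder(folder, model, read_image=None, pattern="*.png"):
    """Run `model` on every file in `folder` matching `pattern`.

    Hypothetical helper for illustration. Returns a dict mapping each
    file name to the model's result; undecodable files are skipped.
    """
    if read_image is None:
        # Default to OpenCV, as in the quick example.
        import cv2

        read_image = cv2.imread

    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        image = read_image(str(path))
        if image is None:  # cv2.imread returns None when decoding fails
            continue
        results[path.name] = model(image)
    return results
```

With Docr installed, usage would look like `results = ocr_folder("path/to/images", OCR())`. The `read_image` parameter is there so a different loader can be swapped in.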
Docr comes with a Streamlit-based web UI for easy demonstration of its capabilities:
- Run the demo:
    streamlit run webui/demo.py
- Open your browser and navigate to the provided URL (usually http://localhost:8501).
- Upload an image and select the model you want to use for processing.
Docr also provides a FastAPI-based API service for integration into other applications:
- Start the API server:
    uvicorn api.docr_api:app --host 0.0.0.0 --port 8000
- The API documentation will be available at http://localhost:8000/docs
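
This README does not list the API's endpoints, so the sketch below discovers them instead of guessing. It assumes the server runs at the address above and that the app keeps FastAPI's default schema location, `/openapi.json`. The names `list_routes` and `fetch_schema` are hypothetical helpers, not part of Docr.

```python
import json
from urllib.request import urlopen


def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url="http://localhost:8000"):
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)
```

With the server running, `list_routes(fetch_schema())` returns the available method and path pairs, which you can then look up in the interactive documentation for request and response formats.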
For details on development, see the development guide. It covers setting up your IDE for working with Docr, including configuration for the src layout.
We welcome contributions! Please see our Contributing Guidelines for more details.
Docr is released under the MIT License. See the LICENSE file for more details.
For any questions or feedback, please contact the project maintainer: liferecords [email protected]