Welcome to the PaulBornOCR project! This document provides comprehensive installation and setup instructions to help you get started with the project.
Author: Dustin Brunner
The purpose of this document is to guide users through the necessary prerequisites, setup instructions, and usage of the `paul_born_ocr.py` script. This project utilizes Optical Character Recognition (OCR) to automate data entry processes, streamlining your workflow.
- Prerequisites
- Setup Instructions
- Using the `paul_born_ocr.py` Script
- Using the `auto.py` Script
- Important Notes
- Troubleshooting
Before proceeding with the setup, ensure that you have the following files in a folder named `PaulBornOCR` located in your Downloads directory:
- `environment.yml`: This file contains the dependencies required for the project.
- `utils.py`: This script contains utility functions used by the `paul_born_ocr.py` and `auto.py` scripts.
- `paul_born_ocr.py`: This script speeds up Paul Born data entry tasks in the Data Shot software using Optical Character Recognition (OCR).
- `auto.py`: This script can be used to define custom keyboard shortcuts for automating data entry tasks in the Data Shot software.
- `imgs/`: This directory contains images of the screen elements used for detecting their coordinates.
You can download the `PaulBornOCR` folder from the GitHub repository (green <> Code button in the top-right -> Download ZIP) or from the Born-Moser, Paul directory on the Google Drive of the entomological collection.
PyCharm is an Integrated Development Environment (IDE) that makes it easier to manage your Python projects.
- Download URL: PyCharm Community Edition (Important: scroll down to the Community Edition; do not download the paid Professional Edition)
- Run the installer.
- Follow the on-screen instructions to complete the installation.
- If any additional configuration is required, select the default options.
Tesseract is an open-source OCR engine that `pytesseract` relies on. You need to install it and make sure it is available in the default path.
- Download the Tesseract installer from the official GitHub page:
- Download URL: Tesseract OCR Installer
- Run the installer and select the default installation options. Tesseract should be installed in the following directory:
C:\Users\<YourUsername>\AppData\Local\Programs\Tesseract-OCR\tesseract.exe
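To confirm that Tesseract can be found after installation, a small check along the following lines may help. It is a minimal sketch using only the standard library, not part of the project scripts; `find_tesseract` is a hypothetical helper name, and the fallback location assumes the default per-user install directory given above.

```python
import os
import shutil
from pathlib import Path


def find_tesseract(extra_candidates=()):
    """Return the path to the Tesseract executable, or None if not found.

    Hypothetical helper for illustration; not part of the project scripts.
    """
    # First choice: whatever "tesseract" resolves to on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return Path(on_path)

    # Fallback: explicit candidates, then the per-user install location
    # named in this guide (LOCALAPPDATA is only set on Windows).
    candidates = [Path(c) for c in extra_candidates]
    local_app_data = os.environ.get("LOCALAPPDATA")
    if local_app_data:
        candidates.append(
            Path(local_app_data) / "Programs" / "Tesseract-OCR" / "tesseract.exe"
        )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_tesseract() or "Tesseract not found")
```

If the executable is found at a location that is not on PATH, `pytesseract` can be pointed at it by assigning the path to `pytesseract.pytesseract.tesseract_cmd`. Whether the project scripts already do this is not stated here.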
Miniconda is a lightweight version of Anaconda that lets you manage Python environments and packages.
- Download Miniconda from the official website:
- Download URL: Miniconda Download Page (Important: If prompted, choose "Skip registration" then scroll down to the Miniconda Installers section and download the Windows version)
- Run the installer and select the default installation options. Miniconda should be installed in the following directory:
C:\Users\<YourUsername>\AppData\Local\miniconda3
- Once installed, open the Anaconda Prompt (miniconda3) from the Windows Start Menu and verify the installation by typing `conda --version`. You should see the conda version number displayed in the console.
- Open the Anaconda Prompt (miniconda3) from the Windows Start Menu.
- Navigate to the project directory containing the `environment.yml` file: `cd C:\Users\<YourUsername>\Downloads\PaulBornOCR`
- Create the environment using the following command: `conda env create -f environment.yml`. This will create a Conda environment named `data_entry_shortcuts` with the necessary dependencies.
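To double-check that the environment was created, the output of `conda env list --json` can be inspected. The sketch below is one possible way to do that from Python; `env_names` and `conda_env_exists` are hypothetical helper names, and the check assumes the environment's folder name equals its name, which is how conda normally lays out named environments.

```python
import json
import shutil
import subprocess
from pathlib import PureWindowsPath


def env_names(conda_json):
    """Extract environment names from `conda env list --json` output."""
    paths = json.loads(conda_json).get("envs", [])
    # PureWindowsPath splits on both "\\" and "/", so either style works.
    return {PureWindowsPath(p).name for p in paths}


def conda_env_exists(name):
    """Return True if conda reports an environment whose folder is `name`."""
    conda = shutil.which("conda")
    if conda is None:
        return False  # conda is not on PATH in this shell
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return False
    return name in env_names(result.stdout)


if __name__ == "__main__":
    print(conda_env_exists("data_entry_shortcuts"))
```

Run it from the Anaconda Prompt so that `conda` is on PATH; in other shells the check may report `False` even though the environment exists.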
- Open PyCharm and Access Settings: Launch PyCharm and open your project. Press Ctrl + Alt + S (Windows/Linux) or Cmd + , (macOS) to open Settings. Alternatively, click the gear icon in the bottom-right corner.
- Navigate to Python Interpreter: In Settings, select Project: PaulBornOCR from the left sidebar, then click on Python Interpreter. Click Add Interpreter in the top-right corner.
- Select Conda Environment:
  - In the Add Python Interpreter window, choose Conda Environment and select Existing environment.
  - Click the folder icon to browse for the Conda executable, navigating to `C:\Users\<YourUsername>\AppData\Local\miniconda3\Scripts\conda.exe` (replace `<YourUsername>` with your actual username).
  - PyCharm will automatically load your Conda environments. Wait for the list to appear.
- Select and Apply Changes: From the loaded environments, select `data_entry_shortcuts`, then click OK. Ensure it is selected in the Python Interpreter settings, then click Apply and OK to save changes.
- Verify Configuration: Check the bottom-right corner of PyCharm to confirm that the selected interpreter is active.
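As an additional check, a short script run from PyCharm can report which interpreter is actually in use. This is a sketch only; `running_in_expected_env` is a hypothetical helper, and it assumes that the environment folder carries the environment's name, as is usual for named Conda environments.

```python
import sys
from pathlib import PureWindowsPath

EXPECTED_ENV = "data_entry_shortcuts"


def env_name_from_prefix(prefix):
    """Return the last folder of an interpreter prefix (the env name)."""
    # PureWindowsPath accepts both "\\" and "/" as separators.
    return PureWindowsPath(prefix).name


def running_in_expected_env(prefix=None):
    """True if the given (or current) interpreter prefix matches EXPECTED_ENV."""
    return env_name_from_prefix(prefix or sys.prefix) == EXPECTED_ENV


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("In expected environment:", running_in_expected_env())
```

If the second line prints `False`, the run configuration is probably still using a different interpreter than the one selected in Settings.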
- Open the Data Shot software and load a specimen for data entry.
- Open the `paul_born_ocr.py` script in PyCharm by navigating to the project directory and double-clicking the file.
- Run the script by clicking the green play button in the top-right corner of the PyCharm window or using the Shift + F10 shortcut.
- Verify that the script correctly detected the screen elements and is ready to automate the data entry process.
- If there are any issues, ensure that the Data Shot software is open on your primary monitor (the one where the Windows login screen appears) and that all necessary screen elements are visible.
- You can now use the keyboard shortcuts provided by the script to navigate and enter data in the Data Shot software.
- To stop the script, press the red square stop button in the PyCharm window or close the PyCharm window.
The script provides several keyboard shortcuts for quickly navigating and entering data in the Data Shot software:
- Alt + 1: Move to the next specimen and zoom in on the pin labels.
- Alt + 2: Move to the previous specimen and zoom in on the pin labels.
- Middle Mouse Click: Automatically recognize the number on the Paul Born collection number label and copy it to the clipboard. (Ensure that the mouse is centered on the number before clicking.)
- § (Section Key): Fill in the collection number from the clipboard and set the collection to "Born-Moser, Paul."
- Alt + §: Show an input dialog for manual collection number entry if automatic recognition did not work.
- Alt + Q: Automatically detect the position of the collection number on the screen. (This feature works about 50% of the time and is likely slower than moving the mouse to the collection number and clicking the middle mouse button.)
- When running the OCR, verify that the correct data is captured, especially for numbers and labels, as OCR accuracy can vary depending on the clarity of the screen content.
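One way to catch obvious OCR misreads before they reach Data Shot is to sanity-check the recognized text. The sketch below is not taken from `paul_born_ocr.py`; `clean_collection_number` is a hypothetical helper, and it assumes that a valid collection number consists of digits only.

```python
import re


def clean_collection_number(raw_text):
    """Strip whitespace from OCR output and accept it only if all digits.

    Returns the cleaned digit string, or None if anything else remains
    (for example a letter "O" misread in place of a zero).
    """
    compact = re.sub(r"\s+", "", raw_text)
    if re.fullmatch(r"[0-9]+", compact):
        return compact
    return None


if __name__ == "__main__":
    for sample in (" 12 345\n", "12O45", ""):
        print(repr(sample), "->", clean_collection_number(sample))
```

A `None` result would be a natural point to fall back to manual entry (Alt + §) instead of pasting a doubtful value.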
The `auto.py` script can be used to define custom keyboard shortcuts for automating data entry tasks. You can modify the script to include additional shortcuts or customize the existing ones to suit your workflow.
You can activate and deactivate shortcuts to your liking by commenting or uncommenting the respective lines in the script (add or remove `#` at the beginning of the line to comment or uncomment it).
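The effect of commenting a line can be illustrated with a toy example. The layout below is purely hypothetical and does not reproduce the contents of `auto.py`; the function names and the `SHORTCUTS` dictionary are invented for illustration.

```python
def next_specimen():
    """Placeholder action (hypothetical)."""
    return "next specimen"


def previous_specimen():
    """Placeholder action (hypothetical)."""
    return "previous specimen"


# Each active line maps a key combination to an action. A line starting
# with "#" is ignored by Python, so that shortcut is never defined.
SHORTCUTS = {
    "alt+1": next_specimen,
    # "alt+2": previous_specimen,  # deactivated: remove the leading "#" to enable
}

if __name__ == "__main__":
    for keys, action in SHORTCUTS.items():
        print(keys, "->", action())
```

Changes of this kind only take effect after the script is stopped and started again.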
- Open the `auto.py` script in PyCharm by navigating to the project directory and double-clicking the file.
- Run the script by clicking the green play button in the top-right corner of the PyCharm window or using the Shift + F10 shortcut.
- The script will run in the background and listen for the defined keyboard shortcuts.
- Use the defined shortcuts to automate data entry tasks in the Data Shot software.
- To stop the script, press the red square stop button in the PyCharm window or close the PyCharm window.
If you encounter issues while setting up or running the `PaulBornOCR` project, consider the following solutions:
- PyCharm Fails to Detect Conda Environment:
  - Ensure that you have correctly installed Miniconda and that the path to `conda.exe` is accurate. Double-check the directory: `C:\Users\<YourUsername>\AppData\Local\miniconda3\Scripts\conda.exe`.
  - Restart PyCharm after installation to refresh the environment list.
- Tesseract Installation Problems:
  - If the Tesseract OCR engine is not found, verify that Tesseract is installed in the specified directory and that its path is included in your system's environment variables.
  - You can add the Tesseract installation path to your PATH variable by following these steps:
    - Right-click on This PC or My Computer and select Properties.
    - Click on Advanced system settings and then Environment Variables.
    - In the System variables section, find and select the Path variable, then click Edit.
    - Add the Tesseract installation path: `C:\Users\<YourUsername>\AppData\Local\Programs\Tesseract-OCR\`.
- Script Throws Errors or Fails to Run:
  - Ensure that all dependencies in the `environment.yml` file are installed correctly. You can try recreating the environment by running `conda env remove -n data_entry_shortcuts` followed by `conda env create -f environment.yml`.
  - Ensure that Data Shot is open on your primary monitor (the one where the Windows login screen appears). The script relies on screen coordinates and can fail if Data Shot is opened on a different monitor. You can change the primary monitor in your Windows display settings.
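For the Tesseract PATH steps above, it can be useful to confirm that the directory really appears in PATH as seen by Python. The following is a minimal standard-library sketch; `is_on_path` is a hypothetical helper name, and the directory passed in should be adjusted to your own installation path.

```python
import os


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in PATH.

    Entries are normalized, so a trailing slash (and, on Windows,
    letter case) does not affect the comparison.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def normalize(entry):
        # Drop surrounding whitespace and quotes, then normalize the path.
        return os.path.normcase(os.path.normpath(entry.strip().strip('"')))

    target = normalize(str(directory))
    return any(
        normalize(entry) == target
        for entry in path_value.split(os.pathsep)
        if entry.strip()
    )


if __name__ == "__main__":
    # Assumption: the default per-user location used in this guide.
    local_app_data = os.environ.get("LOCALAPPDATA", "")
    tesseract_dir = os.path.join(local_app_data, "Programs", "Tesseract-OCR")
    print(tesseract_dir, "on PATH:", is_on_path(tesseract_dir))
```

Changes to environment variables only reach programs started afterwards, so PyCharm typically needs to be restarted before the new PATH entry is visible to the scripts.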
If you need more detailed instructions on any of the steps or encounter a unique issue not covered here, consider using ChatGPT. You can ask specific questions about your setup, error messages, or any part of the process that is difficult to understand. ChatGPT can provide tailored guidance and troubleshooting tips to help you resolve your issues.