This repository provides three functions. The first converts DICOM images to PNG format. The second runs OCR on EKG data in a specific format. The third crops the part of each image that contains the raw EKG data.
Python 3.9.7
- Create an empty directory at the path you want.
- Make sure Virtualenv is already installed. If it is not installed in your Python environment yet, you can install it with pip.
pip3 install virtualenv
- Open a terminal or command-line tool and change the working directory to the new one. Then use Virtualenv to create a new venv.
Windows:
cd your\path
virtualenv --python=/path/to/your/python venv
macOS/Linux:
cd your/path
virtualenv --python=/path/to/your/python venv
- Activate the virtual environment.
Windows:
.\venv\Scripts\activate.bat
macOS/Linux:
source ./venv/bin/activate
If the virtual environment was activated successfully, your command prompt will change as shown below.
Windows:
(venv) C:\>
macOS/Linux:
(venv) $
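The prompt prefix is the usual sign that activation worked. As a second check from inside Python, a small sketch like the one below can help. It is not part of this repository, and the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix. venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python command you plan to use for the scripts, so that the check applies to the right interpreter.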
- Clone this repository into the directory.
git clone https://github.com/feather0611/EKG_OCR.git
- Use pip to install all the required modules listed in requirements.txt.
pip install -r requirements.txt
- transfer.py automatically renames the EKG DICOM files provided by VGHTC (臺中榮總, Taichung Veterans General Hospital) and places them in a specified directory. It also saves a PNG copy of each file at a path you choose. Edit lines 18-21 of this file to set the paths for the source files, the extracted DICOM files, and the converted PNG files. For example:
# Path to the DICOM source
source = './300EKG/0001/'
# Path to the output destination (stores the PNG files)
dist = './prod300/dist/'
# Path to store the renamed DICOM files
origin = './prod300/origin/'
- Then run it.
python transfer.py
You don't need to merge separate source directories into one beforehand. Just change the source path and run the script once per directory. Each run adds its results to the destination path.
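When the data is split across many subdirectories, listing them first makes it easier to see which values the source path still has to take. The sketch below is only an illustration and is not part of transfer.py. The function name list_source_dirs is hypothetical, and it assumes every immediate subdirectory of the root is one source directory.

```python
from pathlib import Path


def list_source_dirs(root):
    """Return each immediate subdirectory of root as a path string ending in '/'."""
    # Files directly under root are skipped; only directories are candidates.
    return [
        child.as_posix() + "/"
        for child in sorted(Path(root).iterdir())
        if child.is_dir()
    ]


# Example usage (hypothetical layout):
#     for source in list_source_dirs("./300EKG"):
#         print(source)
```

Each printed path can then be set as the source in transfer.py before the next run.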
- You now have the PNG files needed for OCR. Edit lines 26-27 of main.py to set the directory where you stored the PNG images and the directory where the cropped raw EKG images should be saved. For example:
img_path = './prod300/dist/'
raw_path = './prod300/raw/'
- Then run this script.
python main.py
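If main.py cannot find its inputs, checking the two configured paths is a reasonable first step. The helper below is a minimal sketch under stated assumptions, not code from this repository. The name prepare_paths is hypothetical, and whether main.py creates the raw output directory by itself is not known here.

```python
from pathlib import Path


def prepare_paths(img_path, raw_path):
    """Check the PNG directory, create the output directory, count PNG files."""
    img_dir = Path(img_path)
    if not img_dir.is_dir():
        raise FileNotFoundError(f"PNG directory not found: {img_dir}")
    # Create the output directory (and any missing parents) if needed.
    Path(raw_path).mkdir(parents=True, exist_ok=True)
    # Match the .png extension case-insensitively.
    return sum(1 for p in img_dir.iterdir() if p.suffix.lower() == ".png")


# Example usage with the paths from this README:
#     count = prepare_paths('./prod300/dist/', './prod300/raw/')
#     print(count, "PNG files ready for OCR")
```

A count of zero usually means img_path points at the wrong directory or transfer.py has not been run yet.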
- If everything works, you will get two files, out.csv and err.csv, along with the PNG files that contain the raw EKG data. The records in out.csv are mostly correct, although minor errors may still occur. The records in err.csv are those with obvious problems.
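To get a quick sense of how many records ended up in each file, the rows can be counted with the standard csv module. This is a sketch only. The column layout of the two files is not described here, so the code just counts non-empty rows (including a header row, if there is one). The function names count_rows and summarize are hypothetical, and UTF-8 encoding is an assumption.

```python
import csv


def count_rows(csv_path):
    """Count non-empty rows in a CSV file (any header row is included)."""
    # Assumption: the file is UTF-8 encoded.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return sum(1 for row in csv.reader(handle) if row)


def summarize(out_path, err_path):
    """Return row counts for both files and the share of flagged rows."""
    ok = count_rows(out_path)
    flagged = count_rows(err_path)
    total = ok + flagged
    share = flagged / total if total else 0.0
    return {"ok": ok, "flagged": flagged, "flagged_share": share}


# Example usage:
#     print(summarize("out.csv", "err.csv"))
```

A high flagged share suggests that the input images differ from the format the OCR step expects.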