YoloGraph converts images of handwritten diagrams into a digital format. See YoloGraph_Final_Report.pdf
for more details.
- Accurate diagram node detection
- Accurate text recognition
- Accurate arrow key point extrapolation
- Quick inference time
- Create an environment with Python 3.10
- Install the packages listed in requirements.txt
- Run yolog
pip install -r requirements.txt
./yolog -l=INFO preprocess
yolog is short for YoloGraph.
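Because the setup steps pin Python 3.10, a quick check before installing can catch a mismatched interpreter early. The following is a minimal sketch and not part of the repository; the helper name is_supported_python is hypothetical.

```python
import sys

REQUIRED = (3, 10)  # version named in the setup steps above


def is_supported_python(version_info=None, required=REQUIRED):
    """Return True when the interpreter's major.minor matches `required`."""
    # Hypothetical helper for illustration; not part of yolog.
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    if not is_supported_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Expected Python 3.10, found {found}")
```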
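To create the FCA/FCB node detection dataset, clone the yolov5 repository, and train a model, run the commands below. The yolov5 package dependencies were already installed with requirements.txt above. Before training, download the pre-trained yolov5s.pt model from the yolov5 GitHub repository and place it in pretrained_models/, where the training command expects it.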
python yolo_init.py
cd yolov5
python train.py --img 640 --epochs 100 --data ../yolo_dataset.yaml --weights ../pretrained_models/yolov5s.pt
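If the training run needs to be repeated with different settings, the command above can be assembled from Python. This is a sketch under assumptions: the function names build_yolov5_train_command and run_training are hypothetical, the defaults mirror the command above, and only the four flags shown there are used.

```python
import subprocess
import sys


def build_yolov5_train_command(
    img=640,
    epochs=100,
    data="../yolo_dataset.yaml",
    weights="../pretrained_models/yolov5s.pt",
):
    """Assemble the yolov5 train.py invocation as an argument list."""
    # Hypothetical helper; defaults copy the command shown in this README.
    return [
        sys.executable, "train.py",
        "--img", str(img),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,
    ]


def run_training(cwd="yolov5", **overrides):
    """Run train.py inside the cloned yolov5 directory (assumed to exist)."""
    command = build_yolov5_train_command(**overrides)
    # check=True raises CalledProcessError if training exits with an error.
    return subprocess.run(command, cwd=cwd, check=True)
```

For example, calling run_training(epochs=50) from the repository root would start a shorter run with the same dataset and weights. The sketch uses the current interpreter (sys.executable) so that the run stays inside the active environment.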
Download the text_data.zip file from the Google Drive and extract it to the text_data/ directory. Download the pre-trained case-sensitive TRBA model. Then run the commands below to create the text_dataset_lmdb, clone the deep-text-recognition-benchmark repository, and fine-tune the model.
python text_init.py
cd deep-text-recognition-benchmark
python train.py --train_data ../train_dataset_lmdb/train/ --valid_data ../train_dataset_lmdb/test/ --saved_model ../pretrained_models/TPS-ResNet-BiLSTM-Attn-case-sensitive.pth --FT --select_data / --batch_ratio 1 --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --workers 0 --num_iter 300 --valInterval 5 --sensitive
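The LMDB dataset consumed by the training command is built from images paired with their transcriptions. In the deep-text-recognition-benchmark repository, the LMDB creation script reads a ground-truth file with one tab-separated image path and label per line. Whether text_init.py follows exactly this route is an assumption; the sketch below only illustrates that file layout, and the helper names write_gt_file and read_gt_file are hypothetical.

```python
from pathlib import Path


def write_gt_file(samples, gt_path):
    """Write (image_path, label) pairs, one tab-separated pair per line."""
    # Hypothetical helper; text_init.py may prepare its data differently.
    lines = []
    for image_path, label in samples:
        # A tab or newline inside a field would corrupt the line format.
        if any(ch in image_path or ch in label for ch in "\t\n"):
            raise ValueError(f"tab or newline in sample: {image_path!r}")
        lines.append(f"{image_path}\t{label}")
    Path(gt_path).write_text(
        "".join(line + "\n" for line in lines), encoding="utf-8"
    )
    return len(lines)


def read_gt_file(gt_path):
    """Read the file back into (image_path, label) tuples."""
    pairs = []
    for line in Path(gt_path).read_text(encoding="utf-8").splitlines():
        if not line:
            continue  # skip blank lines
        image_path, label = line.split("\t", 1)
        pairs.append((image_path, label))
    return pairs
```

Reading the file back after writing is a cheap way to confirm that labels containing spaces survive intact before the dataset is converted.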
To see some sample results, run the command below from the deep-text-recognition-benchmark/ directory, with TRBA_best_accuracy.pth as the recently trained model and sample images in the demo_image/ directory:
python demo.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --image_folder demo_image/ --saved_model ../models/TRBA_best_accuracy.pth --sensitive
See ExampleDiagramDecoding.ipynb for how to decode a diagram. The code used in this notebook comes from the text directory as well as the decode_diagrams.py file. The pretrained model files (the node detection model and the text recognition model) can be downloaded from the Google Drive.
For a list of references, please see our report.