Text Perceptron Spotter

This repository contains the implementation of the paper Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting (AAAI 2020).

Preparing Dataset

Original images can be downloaded from: Total-Text, ICDAR2013, ICDAR2015, ICDAR2017_MLT.

The formatted training datalist and test datalist can be found in demo/text_spotting/datalist/.

Train On Your Own Dataset

1. Download the pre-trained model, which was trained on SynthText and COCO-Text.

2. Modify the paths (ann_file, img_prefix, work_dir, etc.) in the config files.

3. Modify the paths in the training script and run the following commands:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/
bash dist_train.sh
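As a rough illustration of step 2, the path-related entries in an mmdetection-style config usually look like the sketch below. The key names ann_file, img_prefix and work_dir come from the steps above, and load_from is the standard mmdetection key for initial weights. Every path and file name is a placeholder, and the exact nesting in this repository's config files may differ. The helper missing_paths is hypothetical and only serves to check the paths before launching training.

```python
import os.path as osp

# All paths below are placeholders; replace them with your own locations.
data_root = '/path/to/datasets'
datalist_dir = '/path/to/datalist'

data = dict(
    train=dict(
        # 'train_datalist.json' is a hypothetical file name.
        ann_file=osp.join(datalist_dir, 'train_datalist.json'),
        img_prefix=data_root,
    ),
    test=dict(
        # 'test_datalist.json' is a hypothetical file name.
        ann_file=osp.join(datalist_dir, 'test_datalist.json'),
        img_prefix=data_root,
    ),
)

# Directory where checkpoints and logs are written.
work_dir = '/path/to/work_dir'

# Pre-trained weights downloaded in step 1 (placeholder file name).
load_from = '/path/to/pretrained_model.pth'


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not osp.exists(p)]


if __name__ == '__main__':
    to_check = [data['train']['ann_file'], data['train']['img_prefix'], load_from]
    for path in missing_paths(to_check):
        print('Missing:', path)
```

Running the sketch as a script prints every configured path that does not exist yet, which is a cheap way to catch a typo before a distributed job starts.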

Note: Online validation is provided. To disable it and save training time, add the --no-validate option to the startup script.

Train From Scratch

If you want to reproduce the model's performance from scratch, please follow these steps:

1. End-to-end pre-training on SynthText and COCO-Text. See demo/text_spotting/text_perceptron_spot/configs/tp_r50_e2e_pretrain.py for more details.

2. Fine-tune the model on the mixed real dataset (ICDAR2013, ICDAR2015, ICDAR2017-MLT and Total-Text). See demo/text_spotting/text_perceptron_spot/configs/tp_r50_e2e_finetune_ic13.py for more details.

Note: Online validation is provided. To disable it and save training time, add the --no-validate option to the startup script.

Offline Inference and Evaluation

We provide a demo of forward inference and evaluation. You can modify the parameters (iou_constraint, lexicon_type, etc.) in the testing script and start testing:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/tools/
bash test_ic13.sh

The offline evaluation tool can be found in davarocr/demo/text_spotting/evaluation/.
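To give an intuition for what iou_constraint controls, here is a minimal sketch of a matching rule for end-to-end evaluation. It is not the code of the evaluation tool. It assumes axis-aligned boxes (the real tool handles arbitrary-shaped text and most likely works on polygons), a threshold of 0.5, and case-insensitive transcription matching. The function names box_iou and is_match are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, pred_text, gt_box, gt_text, iou_constraint=0.5):
    """A prediction counts as correct only if it overlaps the ground truth
    enough AND the transcription agrees (case-insensitive here, which is
    an assumption of this sketch)."""
    overlaps = box_iou(pred_box, gt_box) >= iou_constraint
    same_text = pred_text.lower() == gt_text.lower()
    return overlaps and same_text


if __name__ == '__main__':
    print(is_match((0, 0, 10, 10), 'Hello', (0, 0, 10, 8), 'hello'))
```

Raising iou_constraint makes the localization requirement stricter, so fewer predictions are counted as matches even when the recognized text is right.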

Visualization

We provide a script to visualize the intermediate outputs of the model. You can modify the paths (test_dataset, config_file, etc.) in the script and generate the visualization results:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/tools/
python vis.py

Some visualization results are shown:

./vis/seg.png ./vis/text.png

Trained Model Download

All of the models are re-implemented and trained on top of the open-source framework mmdetection.

Results on various datasets and download links for the trained models:

| Pipeline | Pretrained-Dataset | Links |
| --- | --- | --- |
| tp_r50_fpn+conv6+bilstm+attention | SynthText, COCO-Text | cfg, pth (Access Code: L6k9) |

| Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End (General) | End-to-End (Weak) | End-to-End (Strong) | Word Spotting (General) | Word Spotting (Weak) | Word Spotting (Strong) | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ICDAR2013 (Reported) | ResNet-50-3stages-enlarge | SynthText | - | L-1440 | 85.8 | 90.7 | 91.4 | 88.5 | 94.0 | 94.9 | - |
| ICDAR2013 | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1440 | 87.4 | 90.6 | 91.2 | 90.9 | 93.8 | 94.2 | cfg, pth (Access Code: 5btM) |
| ICDAR2015 (Reported) | ResNet-50-3stages-enlarge | SynthText | - | L-2000 | 65.1 | 76.6 | 80.5 | 67.9 | 79.4 | 84.1 | - |
| ICDAR2015 | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-2000 | 70.3 | 77.0 | 80.0 | 70.8 | 79.8 | 83.2 | cfg, pth (Access Code: 5btM) |

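| Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End (None) | End-to-End (Full) | Word Spotting (None) | Word Spotting (Full) | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total-Text (Reported) | ResNet-50 | SynthText | - | L-1350 | - | - | 69.7 | 78.3 | - |
| Total-Text | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1350 | 70.7 | 77.3 | 73.9 | 81.8 | cfg, pth (Access Code: 5btM) |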

Citation:

@inproceedings{qiao2020text,
  title={Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting},
  author={Qiao, Liang and Tang, Sanli and Cheng, Zhanzhan and Xu, Yunlu and Niu, Yi and Pu, Shiliang and Wu, Fei},
  booktitle={Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI)},
  pages={11899-11907},
  year={2020}
}

License

This project is released under the Apache 2.0 license.

Contact

If you have any suggestions or problems, please feel free to contact the author at [email protected].