This repo covers document recognition and analysis, the research area addressed by ICDAR (the International Conference on Document Analysis and Recognition).
The original dataset can be downloaded from https://lampsrv02.umiacs.umd.edu/projdb/project.php?id=72 . The dataset used here contains documents from the tobacco industry, labeled with one of 10 classes:
0: Ad, 1: Email, 2: Form, 3: Letter, 4: Memo, 5: News, 6: Note, 7: Report, 8: Resume, 9: Scientific
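
The class list above can be written down as a small lookup table, which is handy when turning model outputs back into readable labels. This is only a minimal sketch: the names `CLASS_NAMES`, `index_to_label`, and `label_to_index` are hypothetical and do not come from this repo's code.

```python
# Class names in index order, copied from the list above.
# CLASS_NAMES and the helpers below are illustrative names, not part of the repo.
CLASS_NAMES = [
    "Ad", "Email", "Form", "Letter", "Memo",
    "News", "Note", "Report", "Resume", "Scientific",
]

# Reverse mapping from class name to integer index.
LABEL_TO_INDEX = {name: idx for idx, name in enumerate(CLASS_NAMES)}


def index_to_label(idx: int) -> str:
    """Return the class name for an integer class index (0-9)."""
    if not 0 <= idx < len(CLASS_NAMES):
        raise ValueError(f"class index out of range: {idx}")
    return CLASS_NAMES[idx]


def label_to_index(name: str) -> int:
    """Return the integer class index for a class name."""
    try:
        return LABEL_TO_INDEX[name]
    except KeyError:
        raise ValueError(f"unknown class name: {name}") from None
```

For example, `label_to_index("Memo")` gives the index to use as a training target, and `index_to_label` maps a predicted index back to its name.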
The following AlexNet architecture is used: