Leopardi Dataset

@vittoriopippi vittoriopippi released this 11 Feb 12:44
· 11 commits to main since this release

To encourage research on handwritten text recognition (HTR) systems that can work on historical documents even without large training datasets, we introduce a new dataset: a small collection of early 19th-century letters written in Italian by Giacomo Leopardi.

The letters are preserved at the Estense Library in Modena, and their high-resolution scans are also available through its Digital Library. Specifically, 168 pages contain text in Giacomo Leopardi’s handwriting, including both letter bodies and envelope fronts.

@inproceedings{cascianelli2021learning,
  title={Learning to read L’Infinito: handwritten text recognition with synthetic training data},
  author={Cascianelli, Silvia and Cornia, Marcella and Baraldi, Lorenzo and Piazzi, Maria Ludovica and Schiuma, Rosiana and Cucchiara, Rita},
  booktitle={Computer Analysis of Images and Patterns: 19th International Conference, CAIP 2021, Virtual Event, September 28--30, 2021, Proceedings, Part II},
  pages={340--350},
  year={2021},
  organization={Springer}
}