Abstract
This paper describes the system prepared at Brno University of Technology for the ICDAR 2021 Competition on Historical Document Classification, the experiments leading to its design, and the main findings. The tasks include script and font classification, document origin localization, and dating. We combine patch-level and line-level approaches, where the line-level system uses an existing, publicly available page layout analysis engine. In both systems, neural networks provide local predictions that are combined into page-level decisions, and the results of the two systems are fused using linear or log-linear interpolation. We propose loss functions suitable for weakly supervised classification problems in which multiple possible labels are provided, as well as loss functions suitable for interval regression in the dating task. The line-level system significantly improves results in script and font classification and in the dating task. The full system achieved 98.48%, 88.84%, and 79.69% accuracy in the font, script, and location classification tasks, respectively. In the dating task, it achieved a mean absolute error of 21.91 years. Our system achieved the best results in all tasks and became the overall winner of the competition.
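The fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how local predictions are aggregated or how the interpolation weight is chosen, so the averaging in `page_level`, the default `weight=0.5`, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
import math


def page_level(local_probs):
    """Average local (patch or line) class probabilities into one page-level vector.

    Averaging is an assumed aggregation rule; the abstract only says that
    local predictions are combined into page-level decisions.
    """
    n = len(local_probs)
    return [sum(column) / n for column in zip(*local_probs)]


def linear_fusion(p_patch, p_line, weight=0.5):
    """Linear interpolation: weighted arithmetic mean of two probability vectors."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_patch, p_line)]


def log_linear_fusion(p_patch, p_line, weight=0.5, eps=1e-12):
    """Log-linear interpolation: weighted geometric mean, renormalized to sum to one."""
    scores = [
        math.exp(weight * math.log(a + eps) + (1.0 - weight) * math.log(b + eps))
        for a, b in zip(p_patch, p_line)
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical page with two classes: three patch predictions, two line predictions.
patch_page = page_level([[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]])
line_page = page_level([[0.4, 0.6], [0.6, 0.4]])
fused_linear = linear_fusion(patch_page, line_page)
fused_log_linear = log_linear_fusion(patch_page, line_page)
```

Log-linear interpolation penalizes a class more strongly when either system assigns it a low probability, whereas linear interpolation lets one confident system dominate; which behaves better is an empirical question.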
Keywords
- Historical document classification
- Script and font classification
- Document origin localization
- Document dating
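The two kinds of loss functions named in the abstract can be sketched as follows. The abstract does not give their exact form, so both definitions are assumptions: the classification loss takes the negative log of the total probability assigned to any acceptable label, and the interval loss is zero inside the annotated date range and grows linearly outside it. The function names are hypothetical.

```python
import math


def multi_label_nll(probs, allowed, eps=1e-12):
    """Assumed weakly supervised loss: -log of the probability mass on allowed labels.

    probs   -- predicted class probabilities for one sample
    allowed -- indices of the labels that are all considered acceptable
    """
    mass = sum(probs[i] for i in allowed)
    return -math.log(mass + eps)


def interval_l1(pred_year, start, end):
    """Assumed interval regression loss: zero inside [start, end],
    otherwise the absolute distance to the nearest interval bound."""
    if pred_year < start:
        return float(start - pred_year)
    if pred_year > end:
        return float(pred_year - end)
    return 0.0


# Hypothetical examples.
loss_cls = multi_label_nll([0.5, 0.3, 0.2], allowed=[0, 1])  # mass 0.8 on allowed labels
loss_inside = interval_l1(1500, 1490, 1510)   # prediction within the interval
loss_outside = interval_l1(1480, 1490, 1510)  # prediction 10 years too early
```

Under these assumptions, the model is not penalized for choosing among the acceptable labels or for any date inside the annotated interval, which matches the weak supervision described in the abstract.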
Notes
- 1. The splits are publicly available at https://pero.fit.vutbr.cz/hdc_dataset.
Acknowledgement
This work has been supported by the Ministry of Culture Czech Republic in the NAKI II project PERO (DG18P02OVV055) and by the Czech National Science Foundation (GACR) project “NEUREM3” No. 19-26934X.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Kišš, M., Kohút, J., Beneš, K., Hradiš, M. (2022). Importance of Textlines in Historical Document Classification. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_11
DOI: https://doi.org/10.1007/978-3-031-06555-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06554-5
Online ISBN: 978-3-031-06555-2
eBook Packages: Computer Science, Computer Science (R0)
Published in cooperation with http://www.iapr.org/