Skip to main content

Importance of Textlines in Historical Document Classification

Part of the Lecture Notes in Computer Science book series (LNCS,volume 13237)

Abstract

This paper describes a system prepared at Brno University of Technology for ICDAR 2021 Competition on Historical Document Classification, experiments leading to its design, and the main findings. The solved tasks include script and font classification, document origin localization, and dating. We combined patch-level and line-level approaches, where the line-level system utilizes an existing, publicly available page layout analysis engine. In both systems, neural networks provide local predictions which are combined into page-level decisions, and the results of both systems are fused using linear or log-linear interpolation. We propose loss functions suitable for weakly supervised classification problem where multiple possible labels are provided, and we propose loss functions suitable for interval regression in the dating task. The line-level system significantly improves results in script and font classification and in the dating task. The full system achieved 98.48%, 88.84%, and 79.69% accuracy in the font, script, and location classification tasks respectively. In the dating task, our system achieved a mean absolute error of 21.91 years. Our system achieved the best results in all tasks and became the overall winner of the competition.

Keywords

  • Historical document classification
  • Script and font classification
  • Document origin localization
  • Document dating

This is a preview of subscription content, access via your institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-031-06555-2_11
  • Chapter length: 13 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   89.00
Price excludes VAT (USA)
  • ISBN: 978-3-031-06555-2
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   119.99
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
Fig. 7.

Notes

  1. 1.

    The splits are publicly available at https://pero.fit.vutbr.cz/hdc_dataset.

References

  1. Cheikhrouhou, A., Kessentini, Y., Kanoun, S.: Multi-task learning for simultaneous script identification and keyword spotting in document images. Pattern Recogn. 113, 107832 (2021). https://doi.org/10.1016/j.patcog.2021.107832, https://www.sciencedirect.com/science/article/pii/S0031320321000194

  2. Christlein, V., Spranger, L., Seuret, M., Nicolaou, A., Král, P., Maier, A.: Deep generalized max pooling. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1090–1096, September 2019. https://doi.org/10.1109/ICDAR.2019.00177, iSSN 2379-2140

  3. Cloppet, F., Eglin, V., Helias-Baron, M., Kieu, C., Vincent, N., Stutzmann, D.: ICDAR2017 competition on the classification of medieval handwritings in latin script. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 1371–1376, November 2017. https://doi.org/10.1109/ICDAR.2017.224, iSSN 2379-2140

  4. Cloppet, F., Églin, V., Kieu, V.C., Stutzmann, D., Vincent, N.: ICFHR2016 competition on the classification of medieval handwritings in latin script. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 590–595, October 2016. https://doi.org/10.1109/ICFHR.2016.0113, iSSN 2167-6445

  5. Kodym, O., Hradiš, M.: page layout analysis system for unconstrained historic documents. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12822, pp. 492–506. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86331-9_32

    CrossRef  Google Scholar 

  6. Seuret, M., et al.: ICDAR 2021 competition on historical document classification. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12824, pp. 618–634. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86337-1_41

    CrossRef  Google Scholar 

  7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015, Conference Track Proceedings (2015). http://arxiv.org/abs/1409.1556

  8. Tensmeyer, C., Saunders, D., Martinez, T.: Convolutional neural networks for font classification. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 985–990, November 2017. https://doi.org/10.1109/ICDAR.2017.164, iSSN: 2379-2140

  9. Xie, S., Girshick, R., Dollar, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1492–1500 (2017). https://openaccess.thecvf.com/content_cvpr_2017/html/Xie_Aggregated_Residual_Transformations_CVPR_2017_paper.html

Download references

Acknowledgement

This work has been supported by the Ministry of Culture Czech Republic in NAKI II project PERO (DG18P02OVV055) and by Czech National Science Foundation (GACR) project “NEUREM3” No. 19-26934X.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Martin Kišš .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Verify currency and authenticity via CrossMark

Cite this paper

Kišš, M., Kohút, J., Beneš, K., Hradiš, M. (2022). Importance of Textlines in Historical Document Classification. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science, vol 13237. Springer, Cham. https://doi.org/10.1007/978-3-031-06555-2_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-06555-2_11

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06554-5

  • Online ISBN: 978-3-031-06555-2

  • eBook Packages: Computer ScienceComputer Science (R0)