Abstract
Digitized archives contain and preserve the knowledge of generations of scholars in millions of documents. The size of these archives calls for automatic analysis since a manual analysis by specialists is often too expensive. In this paper, we focus on the analysis of handwriting in scanned documents from the art-historic archive of the Wildenstein Plattner Institute. Since the archive consists of documents written in several languages and lacks annotated training data for the creation of recognition models, we propose the task of handwriting classification as a new step for a handwriting OCR pipeline. We propose a handwriting classification model that labels extracted text fragments, e.g., numbers, dates, or words, based on their visual structure. Such a classification supports historians by highlighting documents that contain a specific class of text without the need to read the entire content. To this end, we develop and compare several deep learning-based models for text classification. In extensive experiments, we show the advantages and disadvantages of our proposed approach and discuss possible usage scenarios on a real-world dataset.
C. Bartz and H. Rätz—Equal contribution.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46(3), 175–185 (1992). https://doi.org/10.1080/00031305.1992.10475879
Bhunia, A.K., Das, A., Bhunia, A.K., Kishore, P.S.R., Roy, P.P.: Handwriting recognition in low-resource scripts using adversarial learning. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4767–4776 (2019)
Bluche, T., Ney, H., Kermorvant, C.: Tandem HMM with convolutional neural network for handwritten word recognition. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2390–2394, May 2013. https://doi.org/10.1109/ICASSP.2013.6638083, iSSN: 2379-190X
Brümmer, N., du Preez, J.: Application-independent evaluation of speaker detection. Comput. Speech Lang. 20(2), 230–275 (2006). https://doi.org/10.1016/j.csl.2005.08.001
Carbune, V., et al.: Fast multi-language LSTM-based online handwriting recognition. Int. J. Doc. Anal. Recogn. (IJDAR) 23, 89–102 (2020)
Fischer, A., Frinken, V., Fornés, A., Bunke, H.: Transcription alignment of Latin manuscripts using hidden Markov models. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, HIP 2011, pp. 29–36. Association for Computing Machinery, New York, September 2011. https://doi.org/10.1145/2037342.2037348
Fischer, A., Keller, A., Frinken, V., Bunke, H.: Lexicon-free handwritten word spotting using character HMMs. Pattern Recogn. Lett. 33(7), 934–942 (2012). https://doi.org/10.1016/j.patrec.2011.09.009
Giménez, A., Khoury, I., Andrés-Ferrer, J., Juan, A.: Handwriting word recognition using windowed Bernoulli HMMs. Pattern Recogn. Lett. 35, 149–156 (2014). https://doi.org/10.1016/j.patrec.2012.09.002
He, K., Gkioxari, G., Dollar, P., Girshick, R.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2961–2969 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 1501–1510 (2017)
Kang, L., Riba, P., Wang, Y., Rusiñol, M., Fornés, A., Villegas, M.: GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images. arXiv:2003.02567 [cs], March 2020, arXiv: 2003.02567
Kang, L., Toledo, J.I., Riba, P., Villegas, M., Fornés, A., Rusiñol, M.: Convolve, attend and spell: an attention-based sequence-to-sequence model for handwritten word recognition. In: Brox, T., Bruhn, A., Fritz, M. (eds.) GCPR 2018. LNCS, vol. 11269, pp. 459–472. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12939-2_32
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May, 2015, Conference Track Proceedings (2015)
Krishnan, P., Jawahar, C.V.: Generating Synthetic Data for Text Recognition. arXiv:1608.04224 [cs], August 2016, arXiv: 1608.04224
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, Inc. (2012)
Liwicki, M., Bunke, H.: IAM-OnDB - an on-line English sentence database acquired from handwritten text on a whiteboard. In: Eighth International Conference on Document Analysis and Recognition (ICDAR 2005), vol. 2, pp. 956–961, August 2005. https://doi.org/10.1109/ICDAR.2005.132, iSSN: 2379-2140
Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982). https://doi.org/10.1109/TIT.1982.1056489
Mandal, R., Roy, P.P., Pal, U., Blumenstein, M.: Multi-lingual date field extraction for automatic document retrieval by machine. Inf. Sci. 314, 277–292 (2015). https://doi.org/10.1016/j.ins.2014.08.037
Marti, U.V., Bunke, H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recogn. 5(1), 39–46 (2002). https://doi.org/10.1007/s100320200071
Pearson, K.: LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2(11), 559–572 (1901), publisher: Taylor & Francis
Plamondon, R., Srihari, S.: Online and off-line handwriting recognition: a comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell. 22(1), 63–84 (2000). https://doi.org/10.1109/34.824821
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 815–823 (2015)
Tappert, C., Suen, C., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990). https://doi.org/10.1109/34.57669
Tokui, S., Oono, K., Hido, S., Clayton, J.: Chainer: a next-generation open source framework for deep learning. In: Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Twenty-ninth Annual Conference on Neural Information Processing Systems (NIPS) (2015)
Ul-Hasan, A., Shafait, F., Breuel, T.: High-Performance OCR for Printed English and Fraktur using LSTM. Networks (2013). https://doi.org/10.1109/ICDAR.2013.140
Wigington, C., Stewart, S., Davis, B., Barrett, B., Price, B., Cohen, S.: Data augmentation for recognition of handwritten words and lines using a CNN-LSTM network. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, pp. 639–645, November 2017. https://doi.org/10.1109/ICDAR.2017.110, iSSN: 2379-2140
Wigington, C., Tensmeyer, C., Davis, B., Barrett, W., Price, B., Cohen, S.: Start, follow, read: end-to-end full-page handwriting recognition. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11210, pp. 372–388. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01231-1_23
Acknowledgments
We thank the Wildenstein Plattner Institute for providing us with access to their art-historical archive and their expertise.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bartz, C., Rätz, H., Meinel, C. (2021). Handwriting Classification for the Analysis of Art-Historical Documents. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science(), vol 12663. Springer, Cham. https://doi.org/10.1007/978-3-030-68796-0_40
Download citation
DOI: https://doi.org/10.1007/978-3-030-68796-0_40
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68795-3
Online ISBN: 978-3-030-68796-0
eBook Packages: Computer ScienceComputer Science (R0)