Abstract
In this paper, we present a hybrid language model for recognizing handwritten historical documents with a structured syntactic layout. Within a hidden Markov model-based recognition framework, a word-based grammar with a closed dictionary is extended with a character-sequence recognition method. This makes it possible to recognize out-of-dictionary words in controlled parts of the text while keeping the closed-vocabulary restriction for the remaining parts. Although this is work in progress, we report an improvement in character error rate.
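The idea of mixing closed-vocabulary and open character-level parts can be illustrated with a minimal sketch. It is not the HMM-based decoder described in the abstract: it assumes a recognizer has already produced one raw character string per grammar slot, and it simply snaps closed slots to the nearest dictionary word while leaving open slots untouched. The names `decode_hybrid`, `edit_distance`, and `character_error_rate`, as well as the toy slots and tokens, are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def decode_hybrid(raw_tokens: List[str],
                  slots: List[Optional[Set[str]]]) -> List[str]:
    """Apply a toy hybrid grammar to raw per-slot character hypotheses.

    slots[i] is a set of allowed words (closed-vocabulary slot) or
    None (open slot where any character sequence is accepted).
    """
    output = []
    for token, vocab in zip(raw_tokens, slots):
        if vocab is None:
            # Open slot: keep the character-level hypothesis as is,
            # so out-of-dictionary words (e.g. names) can survive.
            output.append(token)
        else:
            # Closed slot: force the nearest in-dictionary word.
            output.append(min(sorted(vocab),
                              key=lambda w: edit_distance(token, w)))
    return output


def character_error_rate(hypothesis: str, reference: str) -> float:
    """Edit distance normalized by the reference length."""
    return edit_distance(hypothesis, reference) / len(reference)


# Hypothetical example: two closed slots followed by one open slot.
slots = [{"son", "daughter"}, {"of"}, None]
raw = ["sonn", "ot", "Bartomeu"]
reference = "son of Bartomeu"

decoded = decode_hybrid(raw, slots)
print(decoded)
print(character_error_rate(" ".join(raw), reference))
print(character_error_rate(" ".join(decoded), reference))
```

In this toy setting the closed slots correct the two character errors in the raw hypothesis, while the open slot lets a name that is absent from the dictionary pass through unchanged. In an integrated system the vocabulary restriction would instead act during decoding rather than as a post-processing step.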
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cirera, N., Fornés, A., Frinken, V., Lladós, J. (2013). Hybrid Grammar Language Model for Handwritten Historical Documents Recognition. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_13
DOI: https://doi.org/10.1007/978-3-642-38628-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2
eBook Packages: Computer Science, Computer Science (R0)