Abstract
This article proposes a handwriting recognition model whose complexity does not depend on the lexicon size. As an alternative to lexicon-driven decoding, it relies on a lexicon verification process that can handle millions of words without any time-consuming decoding stage. This lexicon verification is embedded in a cascade framework that uses complementary LSTM RNN classifiers. An original and very efficient method is proposed to obtain hundreds of complementary LSTM RNNs, called a cohort, from a single training run. The proposed approach achieves new state-of-the-art performance on the Rimes and IAM datasets, and reaches 90% accuracy on the Rimes dataset with a very large lexicon of 3 million words. The last contribution extends the ideas of cohort and lexicon verification to a ROVER combination for handwritten line recognition, and achieves state-of-the-art results on the Rimes dataset.
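The cascade with lexicon verification summarized above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not spell out: each classifier is modeled as a hypothetical callable returning a lexicon-free character string, the lexicon is held in a hash set so that a membership test does not scan the word list, and an input is rejected when no classifier produces a word in the lexicon. The names `Recognizer` and `cascade_with_lexicon_verification` are illustrative and do not come from the article.

```python
from typing import Callable, Collection, Iterable, Optional

# Hypothetical type: a recognizer maps a word image (any object here)
# to a lexicon-free character-string hypothesis.
Recognizer = Callable[[object], str]


def cascade_with_lexicon_verification(
    image: object,
    recognizers: Iterable[Recognizer],
    lexicon: Collection[str],
) -> Optional[str]:
    """Return the first hypothesis found in the lexicon, or None (reject).

    Assumption: a hypothesis is accepted as soon as it is a lexicon word.
    With a set or frozenset lexicon, each check is a hash lookup, so the
    cost of verification does not grow with the number of lexicon words.
    """
    for recognize in recognizers:
        hypothesis = recognize(image)
        if hypothesis in lexicon:
            return hypothesis  # verified: stop the cascade early
    return None  # no classifier produced a lexicon word


if __name__ == "__main__":
    # Stand-in recognizers; real ones would be trained LSTM RNN classifiers.
    toy_lexicon = frozenset({"bonjour", "je", "merci"})
    toy_recognizers = [lambda img: "bonjonr", lambda img: "bonjour"]
    print(cascade_with_lexicon_verification(None, toy_recognizers, toy_lexicon))
```

In this sketch the second stand-in recognizer is only consulted because the first hypothesis fails verification, which is the property that makes a large pool of complementary classifiers useful in a cascade.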
Notes
Namely on certain uppercase characters, especially the letter “j” in the words “je” or “j’”, where many ground-truth errors occur in the dataset.
Evaluated on an Intel Core i7-3740QM CPU.
Cite this article
Stuner, B., Chatelain, C. & Paquet, T. Handwriting recognition using cohort of LSTM and lexicon verification with extremely large lexicon. Multimed Tools Appl 79, 34407–34427 (2020). https://doi.org/10.1007/s11042-020-09198-6
DOI: https://doi.org/10.1007/s11042-020-09198-6