Abstract
In this paper, we propose a query-by-string word spotting system able to extract arbitrary keywords from handwritten documents, taking both segmentation and recognition decisions at the line level. The system relies on the combination of an HMM line model made of keyword and non-keyword (filler) models with a deep neural network that estimates the state-dependent observation probabilities. Experiments are carried out on the RIMES database, an unconstrained handwritten document database that is used for benchmarking different handwriting recognition tasks. The obtained results show the superiority of the proposed framework over the classical GMM–HMM and standard HMM hybrid architectures.
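To make the role of the neural network concrete, the sketch below shows one common way a hybrid HMM system turns network outputs into observation scores: state posteriors are divided by state priors to give scaled likelihoods. The abstract does not say that this exact normalization is used here, so the function name `scaled_log_likelihoods`, the `floor` parameter, and the sample values are illustrative assumptions.

```python
import math


def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Convert per-state posteriors P(s|o) into scaled log-likelihoods.

    By Bayes' rule, p(o|s) = P(s|o) * p(o) / P(s). For a given frame,
    p(o) is the same for every state, so log P(s|o) - log P(s) can stand
    in for log p(o|s) during Viterbi decoding.

    Assumption: this is the generic hybrid recipe, not a detail taken
    from the paper. `floor` (hypothetical) avoids log(0).
    """
    return [
        math.log(max(post, floor)) - math.log(max(prior, floor))
        for post, prior in zip(posteriors, priors)
    ]


# Hypothetical values for a single frame and three HMM states.
frame_posteriors = [0.5, 0.25, 0.25]
state_priors = [0.25, 0.25, 0.5]
frame_scores = scaled_log_likelihoods(frame_posteriors, state_priors)
```

A state that the network favours more than its prior would suggest receives a positive score, and a state that is less likely than its prior receives a negative one.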
Notes
An extension of this approach has been proposed in [6], where the HMM is replaced by a combination of an LSTM neural network with a CTC algorithm.
Note that a Restricted Boltzmann Machine (RBM) can also be used; the resulting networks are then called Deep Belief Networks [33].
A vertical scaling procedure is applied to normalize the height of the text line image to \(h=54\).
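As a rough illustration of this normalization step, the sketch below rescales a grayscale line image to 54 rows. The note only gives the target height, so nearest-neighbour sampling, scaling the width by the same factor, and the function name `normalize_height` are all assumptions.

```python
def normalize_height(image, target_height=54):
    """Rescale a grayscale image (a list of pixel rows) to target_height rows.

    Assumptions, not taken from the source: nearest-neighbour sampling is
    used for brevity, and the width is scaled by the same factor so that
    the aspect ratio of the text line is preserved.
    """
    src_height = len(image)
    src_width = len(image[0])
    scale = target_height / src_height
    target_width = max(1, round(src_width * scale))

    resized = []
    for y in range(target_height):
        # Map each output row back to the closest source row.
        src_y = min(src_height - 1, int(y / scale))
        row = [
            image[src_y][min(src_width - 1, int(x / scale))]
            for x in range(target_width)
        ]
        resized.append(row)
    return resized


# Hypothetical 108 x 200 text line image with synthetic pixel values.
line_image = [[(r + c) % 256 for c in range(200)] for r in range(108)]
normalized = normalize_height(line_image)
```

In practice an image library with proper interpolation would likely give smoother results; the pure-Python loop only keeps the example self-contained.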
An alternative to this strategy is to use a small \(\varepsilon\) together with a maximum number of iterations, as is the case in [45].
References
Cao H, Govindaraju V (2007) Template-free word spotting in low-quality manuscripts. In: ICDAR, pp 392–396
Choisy C (2007) Dynamic handwritten keyword spotting based on the NSHP-HMM. In: Proceedings of the ninth ICDAR, vol 1, pp 242–246
Rodríguez-Serrano JA, Perronnin F, Lladós J (2009) A similarity measure between vector sequences with application to handwritten word image retrieval. In: CVPR, pp 1722–1729
Rodríguez-Serrano JA, Perronnin F (2009) Handwritten word-spotting using hidden Markov models and universal vocabularies. Pattern Recognit pp 2106–2116
Chatelain C, Heutte L, Paquet T (2008) Recognition-based vs syntax-directed models for numerical field extraction in handwritten documents. In: ICFHR, Montreal, p 6
Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34(2):211–224
Fischer A, Keller A, Frinken V, Bunke H (2012) Lexicon-free handwritten word spotting using character HMMs. Pattern Recognit Lett 33(7):934–942
Paquet T, Heutte L, Koch G, Chatelain C (2012) A categorization system for handwritten documents. Int J Doc Anal Recognit 15(4):315–330
Vinciarelli A, Bengio S, Bunke H (2004) Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720
Rath T, Manmatha R (2003) Features for word spotting in historical manuscripts. In: ICDAR, pp 218–222
Adamek T, Connor N, Smeaton A (2007) Word matching using single closed contours for indexing handwritten historical documents. Int J Doc Anal Recognit 9(2):153–165
Rusiñol M, Aldavert D, Toledo R, Lladós J (2011) Browsing heterogeneous document collections by a segmentation-free word spotting method. In: ICDAR, pp 63–67
Thomas S, Chatelain C, Heutte L, Paquet T (2010) An information extraction model for unconstrained handwritten documents. In: ICPR, Istanbul, p 4
Fischer A, Keller A, Frinken V, Bunke H (2010) HMM-based word spotting in handwritten documents using subword models. In: ICPR, pp 3416–3419
Woodland P, Povey D (2002) Large scale discriminative training of hidden Markov models for speech recognition. Comput Speech Lang 16(1):25–47
Do TMT, Artières T (2009) Maximum margin training of Gaussian HMMs for handwriting recognition. In: ICDAR, pp 976–980
Keshet J, Grangier D, Bengio S (2009) Discriminative keyword spotting. Speech Commun 51:317–329
Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Interspeech, pp 504–507
Huang BQ, Du CJ, Zhang YB, Kechadi MT (2006) A hybrid HMM-SVM method for online handwriting symbol recognition. In: IEEE international conference on intelligent systems design and applications, pp 887–891
España-Boquera S, Castro-Bleda MJ, Gorbe-Moya J, Zamora-Martínez F (2011) Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Trans Pattern Anal Mach Intell 33(4):767–779
Graves A, Liwicki M, Fernandez S, Bertolami R, Bunke H, Schmidhuber J (2009) A novel connectionist system for unconstrained handwriting recognition. IEEE Trans Pattern Anal Mach Intell 31(5):855–868
Marukatat S, Artières T, Gallinari P, Dorizzi B (2001) Sentence recognition through hybrid neuro-markovian modeling. In: ICDAR, pp 731–735
Grosicki E, El-Abed H (2009) ICDAR 2009 handwriting recognition competition. In: ICDAR, pp 1398–1402
El-Yacoubi MA, Gilloux M, Bertille J-M (2002) A statistical approach for phrase location and recognition within a text line: an application to street name recognition. IEEE Trans Pattern Anal Mach Intell 24(2):172–188
Rabiner LR (1990) A tutorial on hidden Markov models and selected applications in speech recognition. In: Readings in speech recognition, pp 267–296
Kessentini Y, Paquet T, Benhamadou A (2010) Off-line handwritten word recognition using multi-stream hidden Markov models. Pattern Recognit Lett 31:60–70
Bengio Y, LeCun Y, Nohl C, Burges C (1995) LeRec: a NN/HMM hybrid for on-line handwriting recognition. Neural Comput 7:1289–1303
Knerr S, Augustin E (1998) A neural network-hidden Markov model hybrid for cursive word recognition. ICPR 2:1518–1520
Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Speech transcription, workshop, pp 504–507
Mohamed A, Dahl G, Hinton G (2012) Acoustic modeling using deep belief networks. IEEE Trans Audio Speech Lang Process 20(1):14–22
Dahl GE, Yu D, Deng L, Acero A (2012) Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Trans Audio Speech Lang Process 20(1):30–42
Ortmanns S, Ney H, Aubert X (1997) A word graph algorithm for large vocabulary continuous speech recognition. Comput Speech Lang 11(1):43–72
Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
Ciresan DC, Meier U, Gambardella LM, Schmidhuber J (2011) Convolutional neural network committees for handwritten character classification. In: ICDAR, pp 1135–1139
Niu X, Suen C (2012) A novel hybrid CNN SVM classifier for recognizing handwritten digits. Pattern Recognit 45(4):1318–1325
Lee H, Pham P, Largman Y, Ng A (2009) Unsupervised feature learning for audio classification using convolutional deep belief networks. In: NIPS, pp 1096–1104
Le Q, Zou W, Yeung S, Ng A (2011) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: CVPR, pp 3361–3368
Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: NIPS, pp 153–160
Ranzato M, Boureau Y-L, LeCun Y (2007) Sparse feature learning for deep belief networks. In: NIPS
Schenk J, Rigoll G (2006) Novel hybrid NN/HMM modelling techniques for on-line handwriting recognition. In: IWFHR
Dreuw P, Doetsch P, Plahl C, Ney H (2011) Hierarchical hybrid MLP/HMM or rather MLP features for a discriminatively trained Gaussian HMM: a comparison for offline handwriting recognition. In: IEEE international conference on image processing
Kimura F, Tsuruoka S, Miyake Y, Shridhar M (1994) A lexicon directed algorithm for recognition of unconstrained handwritten words. IEICE Trans Inf Syst E77-D(7):785–793
Al-Hajj R, Mokbel C, Likforman-Sulem L (2007) Combination of HMM-based classifiers for the recognition of Arabic handwritten words. In: ICDAR, pp 959–963
Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D, Bengio Y (2010) Theano: a CPU and GPU math expression compiler
Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127
Larochelle H, Bengio Y, Louradour J, Lamblin P (2009) Exploring strategies for training deep neural networks. J Mach Learn Res 10:1–40
About this article
Cite this article
Thomas, S., Chatelain, C., Heutte, L. et al. A deep HMM model for multiple keywords spotting in handwritten documents. Pattern Anal Applic 18, 1003–1015 (2015). https://doi.org/10.1007/s10044-014-0433-3
DOI: https://doi.org/10.1007/s10044-014-0433-3