A deep HMM model for multiple keywords spotting in handwritten documents

  • Industrial and Commercial Application
  • Pattern Analysis and Applications

Abstract

In this paper, we propose a query-by-string word spotting system able to extract arbitrary keywords from handwritten documents, taking both segmentation and recognition decisions at the line level. The system combines an HMM line model, made of keyword and non-keyword (filler) models, with a deep neural network that estimates the state-dependent observation probabilities. Experiments are carried out on the RIMES database, an unconstrained handwritten document database used for benchmarking different handwriting recognition tasks. The results show the superiority of the proposed framework over the classical GMM–HMM and standard HMM hybrid architectures.
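
As an illustration of the hybrid idea described in the abstract, the sketch below decodes a toy keyword/filler line model with Viterbi, using network posteriors converted to scaled likelihoods. It is not the authors' implementation: the three-state layout, the transition values, the posterior matrix, and the posterior-over-prior conversion (a common convention in hybrid HMM systems) are all assumptions made for the example.

```python
import numpy as np

def scaled_log_likelihoods(posteriors, priors, eps=1e-10):
    """Turn state posteriors p(s|x_t) into scaled log-likelihoods.

    By Bayes' rule p(x_t|s) is proportional to p(s|x_t) / p(s);
    the dropped factor p(x_t) is the same for every state at frame t.
    """
    return np.log(posteriors + eps) - np.log(priors + eps)

def viterbi(log_obs, log_trans, log_init):
    """Return the most likely state path for a (T, S) score matrix."""
    T, S = log_obs.shape
    delta = log_init + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # scores[i, j]: move i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_obs[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical toy line model: state 0 = filler, states 1-2 = a
# two-state left-to-right keyword model that returns to the filler.
half = np.log(0.5)
log_trans = np.array([
    [half, half, -np.inf],   # filler -> filler | keyword start
    [-np.inf, half, half],   # keyword state 1 -> itself | state 2
    [half, -np.inf, half],   # keyword state 2 -> filler | itself
])
log_init = np.array([half, half, -np.inf])

# Made-up network outputs for six frames (rows sum to 1).
posteriors = np.array([
    [0.8, 0.1, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
])
priors = np.full(3, 1.0 / 3.0)  # assumed uniform state priors

path = viterbi(scaled_log_likelihoods(posteriors, priors), log_trans, log_init)
keyword_frames = [t for t, s in enumerate(path) if s != 0]
print(path, keyword_frames)
```

Because the decoded path assigns every frame to either the filler or a keyword state, segmentation (where the keyword lies) and recognition (which model explains the frames) come out of the same pass, which is the line-level decision the abstract refers to.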

Notes

  1. An extension of this approach has been proposed in [6], where the HMM is replaced by an LSTM neural network combined with a CTC algorithm.

  2. Note that Restricted Boltzmann Machines (RBMs) can also be used; the resulting networks are then called Deep Belief Networks [33].

  3. A vertical scaling procedure is applied to normalize the height of the text line image to \(h=54\).

  4. An alternative to this strategy is to use a small \(\varepsilon\) together with a maximum number of iterations, as is the case in [45].
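
The height normalization mentioned in note 3 can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: it assumes a 2-D grayscale array, linear interpolation along the vertical axis, and an unchanged width, none of which the note specifies. The function name `normalize_height` is hypothetical.

```python
import numpy as np

def normalize_height(image, target_h=54):
    """Rescale a 2-D line image to target_h rows.

    Each column is resampled by linear interpolation along the
    vertical axis; the width is left unchanged (an assumption).
    """
    image = np.asarray(image, dtype=float)
    src_h, width = image.shape
    src_rows = np.arange(src_h)
    dst_rows = np.linspace(0, src_h - 1, target_h)
    out = np.empty((target_h, width))
    for col in range(width):
        out[:, col] = np.interp(dst_rows, src_rows, image[:, col])
    return out

# Synthetic 80 x 300 "line image" standing in for real data.
line = np.random.default_rng(0).random((80, 300))
normalized = normalize_height(line)
print(normalized.shape)
```

A fixed height gives every sliding-window frame the same number of rows, so feature vectors extracted from different text lines have the same dimension.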

References

  1. Cao H, Govindaraju V (2007) Template-free word spotting in low-quality manuscripts. In: ICDAR, pp 392–396

  2. Choisy C (2007) Dynamic handwritten keyword spotting based on the NSHP-HMM. In: Proceedings of the ninth ICDAR, vol 1, pp 242–246

  3. Rodríguez-Serrano JA, Perronnin F, Lladós J (2009) A similarity measure between vector sequences with application to handwritten word image retrieval. In: CVPR, pp 1722–1729

  4. Rodríguez-Serrano JA, Perronnin F (2009) Handwritten word-spotting using hidden Markov models and universal vocabularies. Pattern Recognit pp 2106–2116

  5. Chatelain C, Heutte L, Paquet T (2008) Recognition-based vs syntax-directed models for numerical field extraction in handwritten documents. In: ICFHR, Montreal, p 6

  6. Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34(2):211–224

  7. Fischer A, Keller A, Frinken V, Bunke H (2012) Lexicon-free handwritten word spotting using character HMMs. Pattern Recognit Lett 33(7):934–942

  8. Paquet T, Heutte L, Koch G, Chatelain C (2012) A categorization system for handwritten documents. Int J Doc Anal Recognit 15(4):315–330

  9. Vinciarelli A, Bengio S, Bunke H (2004) Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720

  10. Rath T, Manmatha R (2003) Features for word spotting in historical manuscripts. In: ICDAR, pp 218–222

  11. Adamek T, Connor N, Smeaton A (2007) Word matching using single closed contours for indexing handwritten historical documents. Int J Doc Anal Recognit 9(2):153–165

  12. Rusinol M, Aldavert D, Toledo R, Lladós J (2011) Browsing heterogeneous document collections by a segmentation-free word spotting method. In: ICDAR, pp 63–67

  13. Thomas S, Chatelain C, Heutte L, Paquet T (2010) An information extraction model for unconstrained handwritten documents. In: ICPR, Istanbul, p 4

  14. Fischer A, Keller A, Frinken V, Bunke H (2010) HMM-based word spotting in handwritten documents using subword models. In: ICPR, pp 3416–3419

  15. Woodland P, Povey D (2002) Large scale discriminative training of hidden Markov models for speech recognition. Comput Speech Lang 16(1):25–47

  16. Do TMT, Artières T (2009) Maximum margin training of Gaussian HMMs for handwriting recognition. In: ICDAR, pp 976–980

  17. Keshet J, Grangier D, Bengio S (2009) Discriminative keyword spotting. Speech Commun 51:317–329

  18. Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Interspeech pp 504–507

  19. Huang BQ, Du CJ, Zhang YB, Kechadi MT (2006) A hybrid HMM-SVM method for online handwriting symbol recognition. In: IEEE international conference on intelligent systems design and applications, pp 887–891

  20. Boquera S, Bleda M, Gorbe-Moya J, Zamora-Martínez F (2011) Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Trans Pattern Anal Mach Intell 33(4):767–779

  21. Graves A, Liwicki M, Fernandez S, Bertolami R, Bunke H, Schmidhuber J (2009) A novel connectionist system for unconstrained handwriting recognition. IEEE Trans Pattern Anal Mach Intell 31(5):855–868

  22. Marukatat S, Artières T, Gallinari P, Dorizzi B (2001) Sentence recognition through hybrid neuro-markovian modeling. In: ICDAR, pp 731–735

  23. Grosicki E, El-Abed H (2009) ICDAR 2009 handwriting recognition competition. In: ICDAR, pp 1398–1402

  24. El-Yacoubi MA, Gilloux M, Bertille J-M (2002) A statistical approach for phrase location and recognition within a text line: an application to street name recognition. IEEE Trans Pattern Anal Mach Intell 24(2):172–188

  25. Rabiner LR (1990) A tutorial on hidden Markov models and selected applications in speech recognition. In: Readings in speech recognition, pp 267–296

  26. Kessentini Y, Paquet T, Benhamadou A (2010) Off-line handwritten word recognition using multi-stream hidden Markov models. Pattern Recognit Lett 31:60–70

  27. Bengio Y, LeCun Y, Nohl C, Burges C (1995) LeRec: a NN/HMM hybrid for on-line handwriting recognition. Neural Comput 7:1289–1303

  28. Knerr S, Augustin E (1998) A neural network-hidden Markov model hybrid for cursive word recognition. ICPR 2:1518–1520

  29. Ganapathiraju A, Hamaker J, Picone J (2000) Hybrid SVM/HMM architectures for speech recognition. In: Speech transcription, workshop, pp 504–507

  30. Mohamed A, Dahl G, Hinton G (2012) Acoustic modeling using deep belief networks. IEEE Trans Audio Speech Lang Process 20(1):14–22

  31. Dahl GE, Yu D, Deng L, Acero A (2012) Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Trans Audio Speech Lang Process 20(1):30–42

  32. Ortmanns S, Ney H, Aubert X (1997) A word graph algorithm for large vocabulary continuous speech recognition. Comput Speech Lang 11(1):43–72

  33. Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  34. Ciresan DC, Meier U, Gambardella LM, Schmidhuber J (2011) Convolutional neural network committees for handwritten character classification. In: ICDAR, pp 1135–1139

  35. Niu X, Suen C (2012) A novel hybrid CNN SVM classifier for recognizing handwritten digits. Pattern Recognit 45(4):1318–1325

  36. Lee H, Pham P, Largman Y, Ng A (2009) Unsupervised feature learning for audio classification using convolutional deep belief networks. In: NIPS, pp 1096–1104

  37. Le Q, Zou W, Yeung S, Ng A (2011) Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In: CVPR, pp 3361–3368

  38. Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: NIPS, pp 153–160

  39. Ranzato M, Boureau Y-L, LeCun Y (2007) Sparse feature learning for deep belief networks. In: NIPS

  40. Schenk J, Rigoll G (2006) Novel hybrid NN/HMM modelling techniques for on-line handwriting recognition. In: IWFHR

  41. Dreuw P, Doetsch P, Plahl C, Ney H (2011) Hierarchical hybrid MLP/HMM or rather MLP features for a discriminatively trained Gaussian HMM: a comparison for offline handwriting recognition. In: IEEE international conference on image processing

  42. Kimura F, Tsuruoka S, Miyake Y, Shridhar M (1994) A lexicon directed algorithm for recognition of unconstrained handwritten words. IEICE Trans Inf Syst E77-D(7):785–793

  43. Al-Hajj R, Mokbel C, Likforman-Sulem L (2007) Combination of HMM-based classifiers for the recognition of Arabic handwritten words. In: ICDAR, pp 959–963

  44. Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D, Bengio Y (2010) Theano: a CPU and GPU math expression compiler

  45. Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127

  46. Larochelle H, Bengio Y, Louradour J, Lamblin P (2009) Exploring strategies for training deep neural networks. J Mach Learn Res 10:1–40

Author information

Corresponding author

Correspondence to Clément Chatelain.

About this article

Cite this article

Thomas, S., Chatelain, C., Heutte, L. et al. A deep HMM model for multiple keywords spotting in handwritten documents. Pattern Anal Applic 18, 1003–1015 (2015). https://doi.org/10.1007/s10044-014-0433-3

  • DOI: https://doi.org/10.1007/s10044-014-0433-3
