Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

We present a novel confidence- and margin-based discriminative training approach for adapting the models of a hidden Markov model (HMM)-based handwriting recognition system to different handwriting styles and their variations. Most current approaches use maximum-likelihood (ML) trained HMM systems and adapt their models to different writing styles through writer adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria is used to train writer-independent handwriting models. For model adaptation during decoding, we propose unsupervised confidence-based discriminative training at the word and frame level within a two-pass decoding process. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is reduced by 33% relative to an ML-trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is reduced by 25% relative.
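
The two-pass idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the first decoding pass yields an N-best list of (word, log-score) pairs per sample and that a confidence can be approximated by normalising those scores into posteriors; the paper itself may compute confidences differently, for example from lattices. The function names, the threshold of 0.9, and all numbers are hypothetical.

```python
import math


def posterior_confidences(log_scores, scale=1.0):
    """Normalise competing hypothesis log-scores into posteriors (softmax).

    Assumption: an N-best approximation of the word posterior.
    """
    scaled = [scale * s for s in log_scores]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_for_adaptation(nbest_lists, threshold=0.9):
    """Keep the first-pass best hypothesis of a sample only if it is confident.

    nbest_lists maps a sample id to a list of (word, log_score) pairs.
    The returned triples could serve as unsupervised labels for adaptation
    before a second decoding pass (hypothetical pipeline).
    """
    selected = []
    for sample_id, hyps in nbest_lists.items():
        conf = posterior_confidences([score for _, score in hyps])
        best = max(range(len(hyps)), key=lambda i: conf[i])
        if conf[best] >= threshold:
            selected.append((sample_id, hyps[best][0], conf[best]))
    return selected


def relative_reduction(baseline_wer, new_wer):
    """Relative error rate reduction, e.g. 0.33 for '33% relative'."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical first-pass output: sample "a" is unambiguous, "b" is not.
    nbest = {
        "a": [("w1", -1.0), ("w2", -5.0)],
        "b": [("w1", -2.0), ("w3", -2.1)],
    }
    print(select_for_adaptation(nbest))
    # Hypothetical error rates, not figures from the paper.
    print(relative_reduction(30.0, 20.1))
```

Thresholding on confidence trades coverage for label quality: a higher threshold keeps fewer samples for adaptation but makes it less likely that recognition errors from the first pass are reinforced.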

References

  1. Anastasakos, T., Balakrishnan, S.V.: The use of confidence measures in unsupervised adaptation of speech recognizers. In: International Conference on Spoken Language Processing, Sydney, Australia (1998)

  2. Bazzi, I., Schwartz, R., Makhoul, J.: An omnifont open-vocabulary OCR system for English and Arabic. IEEE Trans. Pattern Anal. Mach. Intell. 21(6), 495–504 (1999)

  3. Bertolami, R., Bunke, H.: Hidden Markov model-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41(11), 3452–3460 (2008)

  4. Bertolami, R., Bunke, H.: HMM-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41, 3452–3460 (2008)

  5. Biem, A.: Minimum classification error training for online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(7), 1041–1051 (2006)

  6. Dreuw, P., Heigold, G., Ney, H.: Confidence-based discriminative training for model adaptation in offline Arabic handwriting recognition. In: International Conference on Document Analysis and Recognition, pp. 596–600. Barcelona, Spain, July 2009

  7. Dreuw, P., Jonas, S., Ney, H.: White-space models for offline Arabic handwriting recognition. In: International Conference on Pattern Recognition, Tampa, Florida, USA, December 2008

  8. Dreuw, P., Rybach, D., Gollan, C., Ney, H.: Writer adaptive training and writing variant model refinement for offline Arabic handwriting recognition. In: ICDAR, Barcelona, Spain, July 2009

  9. El Abed, H., Märgner, V.: Improvement of Arabic handwriting recognition systems: combination and/or reject? In: Document Recognition and Retrieval XVI, volume 7247 of SPIE, San Jose, CA, USA, January 2009

  10. Espana-Boquera, S., Castro-Bleda, M., Gorbe-Moya, J., Zamora-Martinez, F.: Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Trans. Pattern Anal. Mach. Intell. PP(99), pre-print (2010)

  11. Fink, G.A., Plötz, T.: Unsupervised estimation of writing style models for improved unconstrained off-line handwriting recognition. In: International Workshop on Frontiers in Handwriting Recognition, La Baule, France, October 2006

  12. Gollan, C., Bacchiani, M.: Confidence scores for acoustic model adaptation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 4289–4292. Las Vegas, NV, USA, April 2008

  13. Graves, A., Liwicki, M., Fernandez, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(5), 855–868 (2009)

  14. Heigold, G., Schlüter, R., Ney, H.: On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields. In: INTERSPEECH, Antwerp, Belgium, August 2007

  15. Heigold, G.: A log-linear discriminative modeling framework for speech recognition. PhD thesis, RWTH Aachen University, Aachen, Germany, June 2010

  16. Heigold, G., Deselaers, T., Schlüter, R., Ney, H.: Modified MMI/MPE: a direct evaluation of the margin in speech recognition. In: ICML, pp. 384–391. Helsinki, Finland, July 2008

  17. Heigold, G., Dreuw, P., Hahn, S., Schlüter, R., Ney, H.: Margin-based discriminative training for string recognition. IEEE J. Sel. Top. Signal Process. Stat. Learn. Methods Speech Lang. Process. 4(6), 917–925 (2010)

  18. Jebara, T.: Discriminative, generative, and imitative learning. PhD thesis, Massachusetts Institute of Technology (2002)

  19. Jonas, S.: Improved modeling in handwriting recognition. Master’s thesis, Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany, June 2009

  20. Juan, A., Toselli, A.H., Domenech, J., González, J., Salvador, I., Vidal, E., Casacuberta, F.: Integrated handwriting recognition and interpretation via finite-state models. Int. J. Pattern Recognit. Artif. Intell. 2004, 519–539 (2001)

  21. Kemp, T., Schaaf, T.: Estimating confidence using word lattices. In: European Conference on Speech Communication and Technology, Rhodes, Greece (1997)

  22. Kneser, R., Ney, H.: Improved backing-off for m-gram language modeling. In: IEEE ICASSP, vol. 1, pp. 49–52. Detroit, MI (1995)

  23. Lorigo, L.M., Govindaraju, V.: Offline Arabic handwriting recognition: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 28(5), 712–724 (2006)

  24. Märgner, V., El Abed, H.: ICDAR 2009 Arabic handwriting recognition competition. In: ICDAR, pp. 1383–1387, Barcelona, Spain, July 2009

  25. Märgner, V., Pechwitz, M., El Abed, H.: ICDAR 2005 Arabic handwriting recognition competition. In: ICDAR, vol. 1, pp. 70–74. Seoul, Korea, August 2005

  26. Märgner, V., El Abed, H.: ICFHR 2010 Arabic handwriting recognition competition. In: ICFHR, November 2010

  27. Marti, U.-V., Bunke, H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recognit. 5(1), 39–46 (2002)

  28. Natarajan, P., Saleem, S., Prasad, R., MacRostie, E., Subramanian, K.: Arabic and Chinese Handwriting Recognition, volume 4768/2008 of LNCS, chapter Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach, pp. 231–250. Springer, Berlin/Heidelberg (2008)

  29. Nopsuwanchai, R., Povey, D.: Discriminative training for HMM-based offline handwritten character recognition. In: ICDAR, pp. 114–118 (2003)

  30. Nopsuwanchai, R., Biem, A., Clocksin, W.F.: Maximization of mutual information for offline Thai handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(8), 1347–1351 (2006)

  31. Padmanabhan, M., Saon, G., Zweig, G.: Lattice-based unsupervised MLLR for speaker adaptation. In: ISCA ITRW Automatic Speech Recognition: Challenges for the Millenium, Paris, France (2000)

  32. Pechwitz, M., Snoussi Maddouri, S., Märgner, V., Ellouze, N., Amiri, H.: IFN/ENIT-database of handwritten Arabic words. In: Colloque International Francophone sur l’Ecrit et le Document (CIFED), Hammamet, Tunis, October 2002

  33. Pitz, M., Wessel, F., Ney, H.: Improved MLLR speaker adaptation using confidence measures for conversational speech recognition. In: International Conference on Spoken Language Processing, Beijing, China (2000)

  34. Povey, D.: Discriminative training for large vocabulary speech recognition. PhD thesis, Cambridge, England (2004)

  35. Povey, D., Kanevsky, D., Kingsbury, B., Ramabhadran, B., Saon, G., Visweswariah, K.: Boosted MMI for model and feature-space discriminative training. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, NV, USA, April 2008

  36. Povey, D., Woodland, P.C.: Minimum phone error and I-smoothing for improved discriminative training. In: International Conference on Acoustics, Speech, and Signal Processing, volume 1, Orlando, FL (2002)

  37. Romero, V., Alabau, V., Benedi, J.M.: Combination of n-grams and stochastic context-free grammars in an offline handwritten recognition system. Lect. Notes Comput. Sci. 4477, 467–474 (2007)

  38. Schambach, M.-P., Rottland, J., Alary, T.: How to convert a Latin handwriting recognition system to Arabic. In: ICFHR (2008)

  39. Schlüter, R., Müller, B., Wessel, F., Ney, H.: Interdependence of language models and discriminative training. In: IEEE Automatic Speech Recognition and Understanding Workshop, vol. 1, pp. 119–122. Keystone, CO, December 1999

  40. Schlüter, R.: Investigations on Discriminative Training Criteria. PhD thesis, RWTH Aachen University, Aachen, Germany, September 2000

  41. Zhang, J., Jin, R., Yang, Y., Hauptmann, A.G.: Modified logistic regression: an approximation to SVM and its applications in large-scale text categorization. In: ICML, August 2003

Author information

Corresponding author

Correspondence to Philippe Dreuw.

Additional information

Extended version of ICDAR 2009 work presented in [6].

About this article

Cite this article

Dreuw, P., Heigold, G. & Ney, H. Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition. IJDAR 14, 273–288 (2011). https://doi.org/10.1007/s10032-011-0160-x

  • DOI: https://doi.org/10.1007/s10032-011-0160-x
