Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition

Dreuw, Philippe; Heigold, Georg; Ney, Hermann

doi:10.1007/s10032-011-0160-x

Original Paper
Published: 05 April 2011

Volume 14, pages 273–288 (2011)

International Journal on Document Analysis and Recognition (IJDAR)

Philippe Dreuw¹,
Georg Heigold¹ &
Hermann Ney¹

Abstract

We present a novel confidence- and margin-based discriminative training approach for adapting the models of a hidden Markov model (HMM)-based handwriting recognition system to different handwriting styles and their variations. Most current approaches are maximum-likelihood (ML)-trained HMM systems that try to adapt their models to different writing styles using writer-adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria is used to train writer-independent handwriting models. For model adaptation during decoding, unsupervised confidence-based discriminative training on the word and frame level within a two-pass decoding process is proposed. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is decreased by 33% relative to an ML-trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is decreased by 25% relative.
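
The abstract names the criteria but gives no formulas. As a rough illustration only, the sketch below uses the textbook form of the MMI objective (the log-posterior of the reference among competing hypotheses), a margin term that penalises each hypothesis in proportion to its accuracy, and a word-level confidence threshold for picking first-pass hypotheses in a two-pass setup. The function names, the toy scores, the example words, and the threshold are all assumptions made for this illustration; they are not taken from the article, and the article's actual formulation may differ.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_posterior(scores, target, accuracy=None, margin=0.0):
    """Log-posterior of `target` among the competing hypotheses.

    scores:   hypothesis -> joint log score, log p(X|W) + log p(W)
    accuracy: hypothesis -> accuracy against the reference (margin variant only)
    margin:   0.0 gives the plain MMI posterior; larger values penalise
              hypotheses in proportion to their accuracy
    """
    adjusted = {
        hyp: score - margin * (accuracy[hyp] if accuracy else 0.0)
        for hyp, score in scores.items()
    }
    return adjusted[target] - log_sum_exp(list(adjusted.values()))


def select_confident(first_pass, threshold):
    """Keep first-pass hypotheses whose posterior confidence reaches `threshold`."""
    kept = []
    for scores, hypothesis in first_pass:
        confidence = math.exp(log_posterior(scores, hypothesis))
        if confidence >= threshold:
            kept.append((scores, hypothesis, confidence))
    return kept


# Toy example: three competing word hypotheses with made-up scores.
scores = {"kitab": math.log(0.6), "katib": math.log(0.3), "kutub": math.log(0.1)}
accuracy = {"kitab": 1.0, "katib": 0.0, "kutub": 0.0}

plain = math.exp(log_posterior(scores, "kitab"))
with_margin = math.exp(log_posterior(scores, "kitab", accuracy, margin=1.0))

# Hypothetical second sample where the first pass is undecided.
ambiguous = {"qalam": math.log(0.5), "qalb": math.log(0.5)}
kept = select_confident([(scores, "kitab"), (ambiguous, "qalam")], threshold=0.55)
```

In this toy setting the margin lowers the posterior of the correct word, so training has to separate it from its competitors by more than plain MMI would require. The confidence filter drops the undecided sample, so only the reliable first-pass hypothesis would be used as a pseudo-label for adaptation.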

References

Anastasakos, T., Balakrishnan, S.V.: The use of confidence measures in unsupervised adaptation of speech recognizers. In: International Conference on Spoken Language Processing, Sydney, Australia (1998)
Bazzi, I., Schwartz, R., Makhoul, J.: An omnifont open-vocabulary OCR system for English and Arabic. IEEE TPAMI 21(6), 495–504 (1999)
Bertolami, R., Bunke, H.: Hidden Markov model-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41(11), 3452–3460 (2008)
Bertolami, R., Bunke, H.: HMM-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41, 3452–3460 (2008)
Biem, A.: Minimum classification error training for online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(7), 1041–1051 (2006)
Dreuw, P., Heigold, G., Ney, H.: Confidence-based discriminative training for model adaptation in offline Arabic handwriting recognition. In: International Conference on Document Analysis and Recognition, pp. 596–600. Barcelona, Spain, July 2009
Dreuw, P., Jonas, S., Ney, H.: White-space models for offline Arabic handwriting recognition. In: International Conference on Pattern Recognition, Tampa, Florida, USA, December 2008
Dreuw, P., Rybach, D., Gollan, C., Ney, H.: Writer adaptive training and writing variant model refinement for offline Arabic handwriting recognition. In: ICDAR, Barcelona, Spain, July 2009
El Abed, H., Märgner, V.: Improvement of Arabic handwriting recognition systems: combination and/or reject? In: Document Recognition and Retrieval XVI, volume 7247 of SPIE, San Jose, CA, USA, January 2009
Espana-Boquera, S., Castro-Bleda, M., Gorbe-Moya, J., Zamora-Martinez, F.: Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE TPAMI, PP(99):pre-print (2010)
Fink, G.A., Plötz, T.: Unsupervised estimation of writing style models for improved unconstrained off-line handwriting recognition. In: International Workshop on Frontiers in Handwriting Recognition, La Baule, France, October 2006
Gollan, C., Bacchiani, M.: Confidence scores for acoustic model adaptation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 4289–4292. Las Vegas, NV, USA, April 2008
Graves, A., Liwicki, M., Fernandez, S., Bertolami, R., Bunke, H., Schmidhuber, J.: A novel connectionist system for unconstrained handwriting recognition. IEEE TPAMI 31(5), 855–868 (2009)
Heigold, G., Schlüter, R., Ney, H.: On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields. In: INTERSPEECH, Antwerp, Belgium, August 2007
Heigold, G.: A log-linear discriminative modeling framework for speech recognition. PhD thesis, RWTH Aachen University, Aachen, Germany, June 2010
Heigold, G., Deselaers, T., Schlüter, R., Ney, H.: Modified MMI/MPE: a direct evaluation of the margin in speech recognition. In: ICML, pp. 384–391. Helsinki, Finland, July 2008
Heigold, G., Dreuw, P., Hahn, S., Schlüter, R., Ney, H.: Margin-based discriminative training for string recognition. J. Sel. Top. Signal Process. Stat. Learn. Methods Speech Lang. Process. 4(6), 917–925 (2010)
Jebara, T.: Discriminative, generative, and imitative learning. PhD thesis, Massachusetts Institute of Technology (2002)
Jonas, S.: Improved modeling in handwriting recognition. Master’s thesis, Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany, June 2009
Juan, A., Toselli, A.H., Domnech, J., González, J., Salvador, I., Vidal, E., Casacuberta, F.: Integrated handwriting recognition and interpretation via finite-state models. Int. J. Pattern Recognit. Artif. Intell. 2004, 519–539 (2001)
Kemp, T., Schaaf, T.: Estimating confidence using word lattices. In: European Conference on Speech Communication and Technology, Rhodes, Greece (1997)
Kneser, R., Ney, H.: Improved backing-off for m-gram language modeling. In: IEEE ICASSP, vol. 1, pp. 49–52. Detroit, MI (1995)
Lorigo, L.M., Govindaraju, V.: Offline Arabic handwriting recognition: a survey. IEEE TPAMI 28(5), 712–724 (2006)
Märgner, V., El Abed, H.: ICDAR 2009 Arabic handwriting recognition competition. In: ICDAR, pp. 1383–1387, Barcelona, Spain, July 2009
Märgner, V., Pechwitz, M., El Abed, H.: ICDAR 2005 Arabic handwriting recognition competition. In: ICDAR, vol. 1, pp. 70–74. Seoul, Korea, August 2005
Märgner, V., El Abed, H.: ICFHR 2010 Arabic handwriting recognition competition. In: ICFHR, November 2010
Marti, U.-V., Bunke, H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recognit. 5(1), 39–46 (2002)
Natarajan, P., Saleem, S., Prasad, R., MacRostie, E., Subramanian, K.: Arabic and Chinese Handwriting Recognition, volume 4768/2008 of LNCS, chapter Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach, pp. 231–250. Springer Berlin / Heidelberg, 2008
Nopsuwanchai, R., Povey, D.: Discriminative training for HMM-based offline handwritten character recognition. In: ICDAR, pp. 114–118 (2003)
Nopsuwanchai, R., Biem, A., Clocksin, W.F.: Maximization of mutual information for offline Thai handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(8), 1347–1351 (2006)
Padmanabhan, M., Saon, G., Zweig, G.: Lattice-based unsupervised MLLR for speaker adaptation. In: ISCA ITRW Automatic Speech Recognition: Challenges for the Millenium, Paris, France (2000)
Pechwitz, M., Snoussi Maddouri, S., Märgner, V., Ellouze, N., Amiri, H.: IFN/ENIT-database of handwritten Arabic words. In: Colloque International Francophone sur l’Ecrit et le Document (CIFED), Hammamet, Tunis, October 2002
Pitz, M., Wessel, F., Ney, H.: Improved MLLR speaker adaptation using confidence measures for conversational speech recognition. In: International Conference on Spoken Language Processing, Beijing, China (2000)
Povey, D.: Discriminative training for large vocabulary speech recognition. PhD thesis, Cambridge, England (2004)
Povey, D., Kanevsky, D., Kingsbury, B., Ramabhadran, B., Saon, G., Visweswariah, K.: Boosted MMI for model and feature-space discriminative training. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, NV, USA, April 2008
Povey, D., Woodland, P.C.: Minimum phone error and I-smoothing for improved discriminative training. In: International Conference on Acoustics, Speech, and Signal Processing, volume 1, Orlando, FL (2002)
Romero, V., Alabau, V., Benedi, J.M.: Combination of n-grams and stochastic context-free grammars in an offline handwritten recognition system. Lect. Notes Comput. Sci. 4477, 467–474 (2007)
Schambach, M.-P., Rottland, J., Alary, T.: How to convert a Latin handwriting recognition system to Arabic. In: ICFHR (2008)
Schlüter, R., Müller, B., Wessel, F., Ney, H.: Interdependence of language models and discriminative training. In: IEEE Automatic Speech Recognition and Understanding Workshop volume 1, pp. 119–122. Keystone, CO, December 1999
Schlüter, R.: Investigations on Discriminative Training Criteria. PhD thesis, RWTH Aachen University, Aachen, Germany, September 2000
Zhang, J., Jin, R., Yang, Y., Hauptmann, A.G.: Modified logistic regression: an approximation to SVM and its applications in large-scale text categorization. In: ICML, August 2003

Author information

Authors and Affiliations

RWTH Aachen University, Human Language Technology and Pattern Recognition, Ahornstr 55, 52056, Aachen, Germany
Philippe Dreuw, Georg Heigold & Hermann Ney

Corresponding author

Correspondence to Philippe Dreuw.

Additional information

Extended version of ICDAR 2009 work presented in [6].

About this article

Cite this article

Dreuw, P., Heigold, G. & Ney, H. Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition. IJDAR 14, 273–288 (2011). https://doi.org/10.1007/s10032-011-0160-x

Received: 16 February 2010
Revised: 03 October 2010
Accepted: 17 March 2011
Published: 05 April 2011
Issue Date: September 2011
DOI: https://doi.org/10.1007/s10032-011-0160-x
