Abstract
We present a novel confidence- and margin-based discriminative training approach for adapting the models of a hidden Markov model (HMM)-based handwriting recognition system to different handwriting styles and their variations. Most current approaches are maximum-likelihood (ML) trained HMM systems that adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the maximum mutual information (MMI) and minimum phone error (MPE) criteria is used to train writer-independent handwriting models. For model adaptation during decoding, we propose unsupervised confidence-based discriminative training at the word and frame level within a two-pass decoding process. The proposed methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IFN/ENIT Arabic handwriting database, where the word error rate is reduced by 33% relative to an ML-trained baseline system. On the large-vocabulary line recognition task of the IAM English handwriting database, the word error rate is reduced by 25% relative.
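As a rough illustration of two ideas in the abstract, the sketch below filters first-pass hypotheses by a confidence threshold before they would be used as unsupervised adaptation data, and computes a relative word error rate reduction. It is a minimal sketch under stated assumptions, not the system described in the article: the names `Hypothesis`, `select_for_adaptation`, and `relative_reduction`, the threshold of 0.5, and all numeric values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """A first-pass recognition result (hypothetical structure)."""
    word: str
    confidence: float  # assumed to be a posterior-like score in [0, 1]


def select_for_adaptation(hyps: List[Hypothesis],
                          threshold: float = 0.5) -> List[Hypothesis]:
    """Keep only hypotheses whose confidence reaches the threshold.

    In a two-pass setup, the kept hypotheses would act as unsupervised
    labels for adapting the model before the second decoding pass.
    The threshold value here is an assumption for illustration.
    """
    return [h for h in hyps if h.confidence >= threshold]


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative error reduction: (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up first-pass output; not data from the article.
    first_pass = [
        Hypothesis("alpha", 0.9),
        Hypothesis("beta", 0.3),
        Hypothesis("gamma", 0.5),
    ]
    kept = select_for_adaptation(first_pass, threshold=0.5)
    print([h.word for h in kept])

    # Made-up error rates chosen only to show how "relative" is computed.
    print(round(relative_reduction(30.0, 20.1), 2))
```

The same thresholding could be applied per frame instead of per word by giving each frame its own confidence score; the abstract mentions both levels, but the exact scoring used in the article is not reproduced here.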
References
Anastasakos, T., Balakrishnan, S.V.: The use of confidence measures in unsupervised adaptation of speech recognizers. In: International Conference on Spoken Language Processing, Sydney, Australia (1998)
Bazzi I., Schwartz R., Makhoul J.: An omnifont open-vocabulary OCR system for English and Arabic. IEEE TPAMI 21(6), 495–504 (1999)
Bertolami R., Bunke H.: Hidden Markov model-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41(11), 3452–3460 (2008)
Bertolami R., Bunke H.: HMM-based ensemble methods for offline handwritten text line recognition. Pattern Recognit. 41, 3452–3460 (2008)
Biem A.: Minimum classification error training for online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(7), 1041–1051 (2006)
Dreuw, P., Heigold, G., Ney, H.: Confidence-based discriminative training for model adaptation in offline Arabic handwriting recognition. In: International Conference on Document Analysis and Recognition, pp. 596–600. Barcelona, Spain, July 2009
Dreuw, P., Jonas, S., Ney, H.: White-space models for offline Arabic handwriting recognition. In: International Conference on Pattern Recognition, Tampa, Florida, USA, December 2008
Dreuw, P., Rybach, D., Gollan, C., Ney, H.: Writer adaptive training and writing variant model refinement for offline Arabic handwriting recognition. In: ICDAR, Barcelona, Spain, July 2009
El Abed, H., Märgner, V.: Improvement of Arabic handwriting recognition systems: combination and/or reject? In: Document Recognition and Retrieval XVI, volume 7247 of SPIE, San Jose, CA, USA, January 2009
Espana-Boquera, S., Castro-Bleda, M., Gorbe-Moya, J., Zamora-Martinez, F.: Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE TPAMI, PP(99):pre-print (2010)
Fink, G.A., Plötz, T.: Unsupervised estimation of writing style models for improved unconstrained off-line handwriting recognition. In: International Workshop on Frontiers in Handwriting Recognition, La Baule, France, October 2006
Gollan, C., Bacchiani, M.: Confidence scores for acoustic model adaptation. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 4289–4292. Las Vegas, NV, USA, April 2008
Graves A., Liwicki M., Fernandez S., Bertolami R., Bunke H., Schmidhuber J.: A novel connectionist system for unconstrained handwriting recognition. IEEE TPAMI 31(5), 855–868 (2009)
Heigold, G., Schlüter, R., Ney, H.: On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields. In: INTERSPEECH, Antwerp, Belgium, August 2007
Heigold, G.: A log-linear discriminative modeling framework for speech recognition. PhD thesis, RWTH Aachen University, Aachen, Germany, June 2010
Heigold, G., Deselaers, T., Schlüter, R., Ney, H.: Modified MMI/MPE: a direct evaluation of the margin in speech recognition. In: ICML, pp. 384–391. Helsinki, Finland, July 2008
Heigold G., Dreuw P., Hahn S., Schlüter R., Ney H.: Margin-based discriminative training for string recognition. J. Sel. Top. Signal Process. Stat. Learn. Methods Speech Lang. Process. 4(6), 917–925 (2010)
Jebara, T.: Discriminative, generative, and imitative learning. PhD thesis, Massachusetts Institute of Technology (2002)
Jonas, S.: Improved modeling in handwriting recognition. Master’s thesis, Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany, June 2009
Juan A., Toselli A.H., Doménech J., González J., Salvador I., Vidal E., Casacuberta F.: Integrated handwriting recognition and interpretation via finite-state models. Int. J. Pattern Recognit. Artif. Intell. 2004, 519–539 (2001)
Kemp, T., Schaaf, T.: Estimating confidence using word lattices. In: European Conference on Speech Communication and Technology, Rhodes, Greece (1997)
Kneser, R., Ney, H.: Improved backing-off for m-gram language modeling. In: IEEE ICASSP, vol. 1, pp. 49–52. Detroit, MI (1995)
Lorigo L.M., Govindaraju V.: Offline Arabic handwriting recognition: a survey. IEEE TPAMI 28(5), 712–724 (2006)
Märgner, V., El Abed, H.: ICDAR 2009 Arabic handwriting recognition competition. In: ICDAR, pp. 1383–1387, Barcelona, Spain, July 2009
Märgner, V., Pechwitz, M., El Abed, H.: ICDAR 2005 Arabic handwriting recognition competition. In: ICDAR, vol. 1, pp. 70–74. Seoul, Korea, August 2005
Märgner, V., El Abed, H.: ICFHR 2010 Arabic handwriting recognition competition. In: ICFHR, November 2010
Marti U.-V., Bunke H.: The IAM-database: an English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recognit. 5(1), 39–46 (2002)
Natarajan, P., Saleem, S., Prasad, R., MacRostie, E., Subramanian, K.: Arabic and Chinese Handwriting Recognition, volume 4768/2008 of LNCS, chapter Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach, pp. 231–250. Springer Berlin / Heidelberg, 2008
Nopsuwanchai, R., Povey, D.: Discriminative training for HMM-based offline handwritten character recognition. In: ICDAR, pp. 114–118 (2003)
Nopsuwanchai R., Biem A., Clocksin W.F.: Maximization of mutual information for offline Thai handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(8), 1347–1351 (2006)
Padmanabhan, M., Saon, G., Zweig, G.: Lattice-based unsupervised MLLR for speaker adaptation. In: ISCA ITRW Automatic Speech Recognition: Challenges for the Millenium, Paris, France (2000)
Pechwitz, M., Snoussi Maddouri, S., Märgner, V., Ellouze, N., Amiri, H.: IFN/ENIT-database of handwritten Arabic words. In: Colloque International Francophone sur l’Ecrit et le Document (CIFED), Hammamet, Tunis, October 2002
Pitz, M., Wessel, F., Ney, H.: Improved MLLR speaker adaptation using confidence measures for conversational speech recognition. In: International Conference on Spoken Language Processing, Beijing, China (2000)
Povey, D.: Discriminative training for large vocabulary speech recognition. PhD thesis, Cambridge, England (2004)
Povey, D., Kanevsky, D., Kingsbury, B., Ramabhadran, B., Saon, G., Visweswariah, K.: Boosted MMI for model and feature-space discriminative training. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, NV, USA, April 2008
Povey, D., Woodland, P.C.: Minimum phone error and I-smoothing for improved discriminative training. In: International Conference on Acoustics, Speech, and Signal Processing, volume 1, Orlando, FL (2002)
Romero V., Alabau V., Benedi J.M.: Combination of n-grams and stochastic context-free grammars in an offline handwritten recognition system. Lect. Notes Comput. Sci. 4477, 467–474 (2007)
Schambach, M.-P., Rottland, J., Alary, T.: How to convert a Latin handwriting recognition system to Arabic. In: ICFHR (2008)
Schlüter, R., Müller, B., Wessel, F., Ney, H.: Interdependence of language models and discriminative training. In: IEEE Automatic Speech Recognition and Understanding Workshop, volume 1, pp. 119–122. Keystone, CO, December 1999
Schlüter, R.: Investigations on Discriminative Training Criteria. PhD thesis, RWTH Aachen University, Aachen, Germany, September 2000
Zhang, J., Jin, R., Yang, Y., Hauptmann, A.G.: Modified logistic regression: an approximation to SVM and its applications in large-scale text categorization. In: ICML, August 2003
Additional information
Extended version of ICDAR 2009 work presented in [6].
Cite this article
Dreuw, P., Heigold, G. & Ney, H. Confidence- and margin-based MMI/MPE discriminative training for off-line handwriting recognition. IJDAR 14, 273–288 (2011). https://doi.org/10.1007/s10032-011-0160-x
DOI: https://doi.org/10.1007/s10032-011-0160-x