Pattern Recognition and Image Analysis

Volume 18, Issue 2, pp 207–215

Developing pattern recognition systems based on Markov models: The ESMERALDA framework

  • G. A. Fink
  • T. Plötz
Plenary Papers


In this paper we describe ESMERALDA, an integrated Environment for Statistical Model Estimation and Recognition on Arbitrary Linear Data Arrays. It is a framework for building statistical recognizers that operate on sequential data such as speech, handwriting, or biological sequences. ESMERALDA primarily supports continuous-density Hidden Markov Models (HMMs) of different topologies and with user-definable internal structure. The framework also supports Markov chain models (realized as statistical n-gram models) for long-term sequential restrictions and Gaussian mixture models (GMMs) for general classification tasks. ESMERALDA is used by several academic and industrial institutions and has been applied successfully to a number of challenging recognition problems in automatic speech recognition, offline handwriting recognition, and protein sequence analysis. The software is open source and available under the terms of the LGPL.
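As a rough illustration of the kind of model the abstract refers to, the following minimal sketch scores an observation sequence under a continuous-density HMM using the forward algorithm in the log domain. It is not ESMERALDA code and does not reflect its API: the function names, the two-state left-to-right topology, and all parameter values are assumptions chosen for illustration, and each state uses a single one-dimensional Gaussian rather than a full mixture.

```python
import math

NEG_INF = -math.inf


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:  # all terms have zero probability
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, log_start, log_trans, means, variances):
    """Forward algorithm: log p(obs) for an HMM with Gaussian emissions."""
    n = len(log_start)
    # Initialization: start probability times emission of the first frame.
    alpha = [log_start[j] + log_gauss(obs[0], means[j], variances[j])
             for j in range(n)]
    # Recursion: sum over predecessor states, then emit the current frame.
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n)])
            + log_gauss(x, means[j], variances[j])
            for j in range(n)
        ]
    # Termination: sum over all final states.
    return logsumexp(alpha)


# Hypothetical two-state left-to-right model (illustrative values only).
log_start = [0.0, NEG_INF]                      # always start in state 0
log_trans = [[math.log(0.5), math.log(0.5)],    # state 0: stay or advance
             [NEG_INF, 0.0]]                    # state 1: stay only
means = [0.0, 3.0]
variances = [1.0, 1.0]

rising = forward_log_likelihood([0.0, 0.0, 3.0, 3.0],
                                log_start, log_trans, means, variances)
falling = forward_log_likelihood([3.0, 3.0, 0.0, 0.0],
                                 log_start, log_trans, means, variances)
```

Under these assumed parameters the sequence that matches the left-to-right order of the state means (`rising`) receives a higher log-likelihood than the reversed one (`falling`), which is the basic property a recognizer exploits when comparing competing models.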


Hidden Markov Model; Speech Recognition; Automatic Speech Recognition; Markov Chain Model; Handwriting Recognition
These keywords were added by machine and not by the authors.





Copyright information

© Pleiades Publishing, Ltd. 2008

Authors and Affiliations

  1. Intelligent Systems Group, Robotics Research Institute, Dortmund University of Technology, Dortmund, Germany
