A Variable Initialization Approach to the EM Algorithm for Better Estimation of the Parameters of Hidden Markov Model Based Acoustic Modeling of Speech Signals

  • Md. Shamsul Huda
  • Ranadhir Ghosh
  • John Yearwood
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4065)


The traditional method for estimating the parameters of Hidden Markov Model (HMM) based acoustic models of speech uses the Expectation-Maximization (EM) algorithm. The EM algorithm is sensitive to the initial values of the HMM parameters and is likely to terminate at a local maximum of the likelihood function, resulting in a suboptimal HMM estimate and lower recognition accuracy. In this paper, to obtain a better HMM estimate and higher recognition accuracy, several candidate HMMs are created by applying EM to multiple initial models. The candidate HMM with the highest value of the likelihood function is chosen as the best HMM. The initial models are created by varying the maximum frame number in the segmentation step of the HMM initialization process, and a binary search is applied while creating them. The proposed method has been tested on the TIMIT database. Experimental results show that our approach obtains improved likelihood values and improved recognition accuracy.
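
The selection step described in the abstract (run EM from several initial models, then keep the candidate with the highest likelihood) can be illustrated with a minimal sketch. This is not the authors' implementation: to stay short and self-contained it substitutes a one-dimensional Gaussian mixture for the HMM, and it simply enumerates a fixed list of initial means instead of deriving initial models from the maximum frame number with a binary search, since the abstract does not give those details. All function names (`gaussian_pdf`, `log_likelihood`, `run_em`, `select_best`) and the toy data are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0) on underflow
    return total


def run_em(data, init_means, n_iter=50):
    """Run EM from one initial model (equal weights, unit variances)."""
    k = len(init_means)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [1.0] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
    return {
        "weights": weights,
        "means": means,
        "variances": variances,
        "loglik": log_likelihood(data, weights, means, variances),
    }


def select_best(data, candidate_inits):
    """Apply EM to every initial model and keep the highest-likelihood result."""
    candidates = [run_em(data, init) for init in candidate_inits]
    best = max(candidates, key=lambda c: c["loglik"])
    return best, candidates


if __name__ == "__main__":
    random.seed(0)
    # Toy data: two clusters, standing in for acoustic feature frames.
    data = [random.gauss(0.0, 1.0) for _ in range(100)]
    data += [random.gauss(5.0, 1.0) for _ in range(100)]
    inits = [(0.0, 0.1), (-1.0, 6.0), (2.0, 3.0), (4.0, 5.0)]
    best, candidates = select_best(data, inits)
    for init, cand in zip(inits, candidates):
        print(init, round(cand["loglik"], 3))
    print("best means:", best["means"])
```

Because each EM run never decreases the likelihood from its own starting point, taking the maximum over several starts can only match or improve on any single run. The cost is one full EM run per candidate, which is presumably why the paper searches the initialization parameter rather than trying every value.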


Hidden Markov Model, Speech Signal, Gaussian Mixture Model, Recognition Accuracy, Binary Search





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Md. Shamsul Huda (1)
  • Ranadhir Ghosh (1)
  • John Yearwood (1)
  1. School of Information Technology and Mathematical Science, University of Ballarat, Ballarat, Australia
