Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data

  • Jan Vaněk
  • Lukáš Machlica
  • Josef V. Psutka
  • Josef Psutka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8113)


Parameters of a multivariate Gaussian Mixture Model are usually estimated with a criterion (e.g. Maximum Likelihood) that focuses mostly on the training data. Consequently, testing data that were not seen during training may cause problems. Moreover, numerical instabilities can occur, e.g. for low-occupied Gaussians, especially when working with full-covariance matrices in high-dimensional spaces. Another question concerns the number of Gaussians to be trained for a specific data set. The approach proposed in this paper addresses all of these issues. It assumes that the training and testing data were generated from the same source distribution. The key idea is to use a criterion based on the source distribution rather than on the training data itself. It is shown how to modify the estimation procedure to fit the source distribution better, even though that distribution is unknown, and a new estimation algorithm for diagonal as well as full-covariance matrices is subsequently derived and tested.
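
The numerical instability mentioned for low-occupied Gaussians can be illustrated with a minimal sketch. It is not the estimation algorithm derived in the paper, which the abstract does not spell out. It only shows that a Maximum Likelihood full-covariance estimate becomes singular when a component sees fewer frames than there are dimensions, and that a generic shrinkage toward the diagonal restores positive definiteness. The helper names, the synthetic data, and the weight `alpha = 0.1` are assumptions made for illustration.

```python
import numpy as np

def ml_covariance(x):
    """Maximum-likelihood (biased) covariance of the rows of x."""
    centered = x - x.mean(axis=0)
    return centered.T @ centered / x.shape[0]

def shrink_to_diagonal(cov, alpha):
    """Blend a full covariance with its own diagonal, alpha in [0, 1].

    Generic shrinkage used here only as an illustration; it is not the
    enhancement proposed in the paper.
    """
    return (1.0 - alpha) * cov + alpha * np.diag(np.diag(cov))

rng = np.random.default_rng(0)
dim, n_frames = 20, 10                      # fewer frames than dimensions
x = rng.standard_normal((n_frames, dim))    # synthetic stand-in for features

cov = ml_covariance(x)
rank = np.linalg.matrix_rank(cov)           # cannot exceed n_frames - 1
min_eig_raw = np.linalg.eigvalsh(cov).min() # numerically zero: singular

shrunk = shrink_to_diagonal(cov, alpha=0.1) # alpha is a hypothetical choice
min_eig_shrunk = np.linalg.eigvalsh(shrunk).min()
chol = np.linalg.cholesky(shrunk)           # succeeds only if positive definite

print(f"rank of ML estimate: {rank} of {dim}")
print(f"smallest eigenvalue, raw: {min_eig_raw:.2e}, shrunk: {min_eig_shrunk:.2e}")
```

The raw estimate cannot be inverted to evaluate a Gaussian log-likelihood, while the shrunk matrix admits a Cholesky factorization. How strongly to regularize is exactly the kind of question a criterion based on the source distribution is meant to answer.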


Gaussian Mixture Models, Full Covariance, Full Covariance Matrix, Regularization, Automatic Speech Recognition





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Jan Vaněk (1)
  • Lukáš Machlica (1)
  • Josef V. Psutka (1)
  • Josef Psutka (1)
  1. Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia in Pilsen, Pilsen, Czech Republic
