A Direct Criterion Minimization Based fMLLR via Gradient Descend

  • Jan Vaněk
  • Zbyněk Zajíc
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8082)


Adaptation techniques are necessary in automatic speech recognizers to improve recognition accuracy. Linear transformation methods (MLLR or fMLLR) are the most popular choice when only limited adaptation data are available. fMLLR is a feature-space transformation, which is an advantage over MLLR, which transforms the entire acoustic model. Classical fMLLR estimation maximizes the likelihood criterion using statistics of individual Gaussian components. We propose an approach that takes into account the overall likelihood of an HMM state: it estimates the transformation by optimizing the ML criterion of the HMM directly with a gradient descent algorithm.
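The idea of optimizing the likelihood directly can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance GMM standing in for one HMM state, an affine feature transform y = A x + b, and plain gradient steps with backtracking on the total log-likelihood, which includes the log|det A| Jacobian term. The function names, step size, and iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def gmm_loglik_and_grad(X, A, b, weights, means, variances):
    """Total log-likelihood of y = A x + b under a diagonal GMM,
    plus its gradients with respect to A and b."""
    T, d = X.shape
    Y = X @ A.T + b                                   # transformed features, (T, d)
    diff = Y[:, None, :] - means[None, :, :]          # (T, M, d)
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))   # (T, M)
    # log-sum-exp over components gives the mixture log-likelihood per frame
    mx = log_comp.max(axis=1, keepdims=True)
    log_mix = mx[:, 0] + np.log(np.exp(log_comp - mx).sum(axis=1))
    gamma = np.exp(log_comp - log_mix[:, None])       # component posteriors
    _, logdet = np.linalg.slogdet(A)
    loglik = T * logdet + log_mix.sum()
    # derivative of the mixture log-likelihood w.r.t. each transformed frame
    G = np.sum(gamma[:, :, None] * (-diff / variances), axis=1)  # (T, d)
    grad_A = T * np.linalg.inv(A).T + G.T @ X
    grad_b = G.sum(axis=0)
    return loglik, grad_A, grad_b

def estimate_fmllr(X, weights, means, variances, steps=200, lr=0.1):
    """Hypothetical estimator: gradient steps on the log-likelihood
    (equivalently, descent on its negative) with backtracking."""
    T, d = X.shape
    A, b = np.eye(d), np.zeros(d)
    ll, gA, gb = gmm_loglik_and_grad(X, A, b, weights, means, variances)
    for _ in range(steps):
        step = lr
        while step > 1e-12:
            A_new = A + step * gA / T
            b_new = b + step * gb / T
            # keep det(A) positive so the log-determinant stays defined
            if np.linalg.slogdet(A_new)[0] > 0:
                ll_new, gA_new, gb_new = gmm_loglik_and_grad(
                    X, A_new, b_new, weights, means, variances)
                if ll_new > ll:
                    A, b, ll, gA, gb = A_new, b_new, ll_new, gA_new, gb_new
                    break
            step *= 0.5
        else:
            break  # no improving step found; treat as converged
    return A, b, ll
```

Because a step is accepted only when it raises the log-likelihood, the criterion is non-decreasing across iterations. The keywords mention a Hessian matrix, which suggests the paper uses second-order information; this sketch deliberately uses only first-order steps.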


Keywords: ASR, adaptation, fMLLR, gradient descent, Hessian matrix





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jan Vaněk (1)
  • Zbyněk Zajíc (1)
  1. Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia in Pilsen, Pilsen, Czech Republic
