Mixed environment compensation based on maximum a posteriori estimation for robust speech recognition



Noise robustness is a fundamental problem for speech recognition systems in real environments. This paper presents a mixed environment compensation technique that combines a feature compensation algorithm with an acoustic model compensation algorithm. The goal is to obtain a finely compensated static acoustic model and dynamically compensated speech, so that the modified speech sequence closely matches the modified acoustic model. Experimental results show a significant performance improvement.


Robust speech recognition, Mixed environment compensation, MAP, EM



Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  1. Beijing University of Posts and Telecommunications, Beijing, China
