Noise Robust Features Based on MVA Post-processing

  • Conference paper

Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 456)

Abstract

In this paper we present an effective technique for improving the performance of automatic speech recognition (ASR) systems. The technique, called MVA, consists of mean subtraction, variance normalization, and temporal auto-regressive moving-average (ARMA) filtering. We apply MVA as a post-processing stage to Mel-frequency cepstral coefficient (MFCC) features and to relative spectral perceptual linear prediction (RASTA-PLP) features.

We evaluate the MVA post-processing scheme on the Aurora 2 database in the presence of various additive noises (subway, babble, car, exhibition hall, restaurant, street, airport, train station). Experimental results show that the method substantially improves recognition accuracy in the clean-training case. We complete the study by comparing MFCC and RASTA-PLP features after MVA post-processing.
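
The three MVA steps named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `mva`, the parameter `order`, the per-utterance statistics, and the particular ARMA form (averaging the previous `order` filtered frames with the current and next `order` normalized frames, leaving edge frames unfiltered) are assumptions taken from the general MVA literature rather than from this abstract.

```python
import math


def mva(features, order=2):
    """Apply mean subtraction, variance normalization and ARMA smoothing.

    features: list of T frames, each a list of D floats (e.g. MFCC vectors).
    order:    assumed ARMA order M (hypothetical default, not from the paper).
    Returns a new list of T frames; the input is not modified.
    """
    num_frames = len(features)
    num_dims = len(features[0])
    out = [[0.0] * num_dims for _ in range(num_frames)]

    for d in range(num_dims):
        track = [frame[d] for frame in features]

        # Step 1: mean subtraction over the utterance.
        mean = sum(track) / num_frames
        centered = [v - mean for v in track]

        # Step 2: variance normalization (guard against constant tracks).
        var = sum(v * v for v in centered) / num_frames
        std = math.sqrt(var) if var > 0 else 1.0
        norm = [v / std for v in centered]

        # Step 3: ARMA smoothing. Each interior frame becomes the average of
        # the previous `order` filtered values and the current plus next
        # `order` normalized values. Edge frames are left as normalized.
        smooth = list(norm)
        for t in range(order, num_frames - order):
            past = sum(smooth[t - order:t])
            future = sum(norm[t:t + order + 1])
            smooth[t] = (past + future) / (2 * order + 1)

        for t in range(num_frames):
            out[t][d] = smooth[t]

    return out
```

With `order=0` the smoothing step is the identity, so the function reduces to plain mean and variance normalization; larger orders give stronger temporal smoothing of each feature trajectory.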

Corresponding author

Correspondence to Mohamed Cherif Amara Korba.

Copyright information

© 2015 IFIP International Federation for Information Processing

About this paper

Cite this paper

Korba, M.C.A., Messadeg, D., Bourouba, H., Djemili, R. (2015). Noise Robust Features Based on MVA Post-processing. In: Amine, A., Bellatreche, L., Elberrichi, Z., Neuhold, E., Wrembel, R. (eds) Computer Science and Its Applications. CIIA 2015. IFIP Advances in Information and Communication Technology, vol 456. Springer, Cham. https://doi.org/10.1007/978-3-319-19578-0_13

  • DOI: https://doi.org/10.1007/978-3-319-19578-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19577-3

  • Online ISBN: 978-3-319-19578-0

  • eBook Packages: Computer Science, Computer Science (R0)