Speaker Invariant and Noise Robust Speech Recognition Using Enhanced Auditory and VTL Based Features

  • S. D. Umarani
  • R. S. D. Wahidabanu
  • P. Raviram
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 199)


This paper focuses on the design and implementation of a noise-resilient, speaker-independent speech recognition system for isolated words. Auditory transform (AT) based features called Cochlear Filter Cepstral Coefficients (CFCCs) are used for feature extraction. Their robustness against noise and against variation in vocal tract length (VTL) is enhanced by applying a wavelet-based denoising algorithm and the invariant-integration method, respectively. The resulting features are called enhanced CFCC Invariant-Integration Features (ECFCCIIFs). A feature-finding neural network (FFNN) is used as the classifier for recognizing the isolated words. The results are compared with those obtained using standard CFCC features. Under both matched and mismatched conditions, the ECFCCIIFs maintain a high recognition rate at low signal-to-noise ratios (SNRs) and also perform more effectively at high SNRs.
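The invariant-integration step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact feature definition, so the code below assumes one common form from the invariant-integration literature: a monomial of subband values averaged over a window of shifts along the channel axis, on the premise that a VTL change approximately translates the spectral pattern across channels. The function name, its parameters, and the toy frames are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def invariant_integration_feature(
    frame: Sequence[float],
    channels: Sequence[int],
    exponents: Sequence[int],
    window: int,
) -> float:
    """Average a monomial of subband values over shifts along the channel axis.

    Assumed form (not taken from the paper): for each shift w in
    [-window, window], multiply frame[k + w] ** l over the chosen
    (channel k, exponent l) pairs, then average over the valid shifts.
    """
    n = len(frame)
    total = 0.0
    count = 0
    for shift in range(-window, window + 1):
        idx = [k + shift for k in channels]
        if min(idx) < 0 or max(idx) >= n:
            continue  # sketch assumption: skip shifts that leave the filterbank
        term = 1.0
        for i, p in zip(idx, exponents):
            term *= frame[i] ** p
        total += term
        count += 1
    if count == 0:
        raise ValueError("no valid shift for the given channels and window")
    return total / count


# Toy single-frame spectra: the same bump, moved by one channel,
# standing in for a small VTL-induced translation.
original = [0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]

feat_original = invariant_integration_feature(original, [4, 5], [1, 1], window=3)
feat_shifted = invariant_integration_feature(shifted, [4, 5], [1, 1], window=3)
```

In this toy case the two feature values coincide because the integration window covers the bump in both frames. With real filterbank outputs the invariance is only approximate and depends on the window width and on how closely VTL changes resemble pure translations.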


Denoising, Invariant integration, CFCC, FFNN, SNR, Auditory, VTL





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • S. D. Umarani (1)
  • R. S. D. Wahidabanu (1)
  • P. Raviram (2)
  1. Government College of Engineering, Salem, India
  2. Department of CSE, Mahendra Engineering College, Tiruchengode, India
