Speaker Invariant and Noise Robust Speech Recognition Using Enhanced Auditory and VTL Based Features
This paper presents the design and implementation of a noise-resilient, speaker-independent speech recognition system for isolated words. Auditory transform (AT) based features, called Cochlear Filter Cepstral Coefficients (CFCCs), are used for feature extraction. Their robustness against noise is enhanced by a wavelet-based denoising algorithm, and their robustness against variation in vocal tract length (VTL) is enhanced by the invariant-integration method. The resulting features are called Enhanced CFCC Invariant-Integration Features (ECFCCIIFs). A feature-finding neural network (FFNN) is used as the classifier for isolated word recognition. Results are compared with those obtained using standard CFCC features. Under both matched and mismatched conditions, the ECFCCIIFs maintain a high recognition rate at low signal-to-noise ratios (SNRs), and they also perform more effectively at high SNRs.
Keywords: Denoising, Invariant integration, CFCC, FFNN, SNR, Auditory, VTL
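The abstract does not specify which wavelet denoising algorithm is applied. As a rough illustration of the general idea only, the sketch below uses a one-level Haar transform with soft thresholding and the universal threshold; the wavelet, the threshold rule, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math
import statistics

def haar_dwt(x):
    """One-level Haar transform; returns (approximation, detail) coefficients."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    s = math.sqrt(2.0)
    approx = [(a + b) / s for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) / s for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt; interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def soft_threshold(c, t):
    """Shrink coefficient c toward zero by t; values within [-t, t] become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def wavelet_denoise(x):
    """Soft-threshold the detail coefficients (assumed scheme, not the paper's)."""
    approx, detail = haar_dwt(x)
    # Robust noise estimate from the median absolute detail coefficient.
    sigma = statistics.median(abs(d) for d in detail) / 0.6745
    # Universal threshold: sigma * sqrt(2 ln N).
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    return haar_idwt(approx, [soft_threshold(d, t) for d in detail])
```

A practical front end would likely use several decomposition levels and a smoother wavelet; a single Haar level is used here only to keep the sketch short and dependency-free.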