Speech Recognition Based on Feature Extraction with Variable Rate Frequency Sampling
Most feature extraction techniques begin with a Discrete Fourier Transform (DFT) of consecutive, short, overlapping windows. The spectral resolution of the DFT representation is uniform and is given by Δf = 2π/N, where N is the length of the window. The present paper investigates the use of non-uniform rate frequency sampling, varying as a function of the spectral characteristics of each frame, in the context of Automatic Speech Recognition. We are motivated by the non-uniform spectral sensitivity of human hearing and by the need for a feature extraction technique that auto-focuses on the most reliable parts of the spectrum in noisy conditions.
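To make the contrast between uniform and non-uniform frequency sampling concrete, the following is a minimal sketch, not the method of the paper. It evaluates the spectrum of one windowed frame on the uniform DFT grid (spacing 2π/N) and on a denser grid concentrated around a region of interest. The helper names (`dtft_at`, `uniform_grid`, `hamming`), the test tone, and the rule used to place the dense grid are all assumptions made for illustration; the abstract does not specify how the sampling rate is adapted to each frame.

```python
import cmath
import math


def dtft_at(frame, omegas):
    """Evaluate the spectrum of a finite frame at arbitrary
    angular frequencies (radians per sample)."""
    return [
        sum(x * cmath.exp(-1j * w * n) for n, x in enumerate(frame))
        for w in omegas
    ]


def uniform_grid(n_points):
    """Frequencies of the N-point DFT bins: uniform spacing 2*pi/N."""
    return [2 * math.pi * k / n_points for k in range(n_points)]


def hamming(n_points):
    """Hamming window of length n_points."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
        for n in range(n_points)
    ]


N = 64
step = 2 * math.pi / N   # uniform DFT resolution
w0 = 10.5 * step         # assumed test tone, halfway between bins 10 and 11

window = hamming(N)
frame = [window[n] * math.cos(w0 * n) for n in range(N)]

# Uniform sampling: every bin is 2*pi/N apart, regardless of the frame content.
uniform = uniform_grid(N)
uniform_mag = [abs(v) for v in dtft_at(frame, uniform)]

# Hypothetical non-uniform sampling: 17 points spaced step/8 apart,
# covering one bin width on either side of w0.
dense = [w0 + (k - 8) * step / 8 for k in range(17)]
dense_mag = [abs(v) for v in dtft_at(frame, dense)]

print("peak on uniform grid:", max(uniform_mag))
print("peak on dense grid:  ", max(dense_mag))
```

Because the tone falls between two DFT bins, the uniform grid samples the spectral peak on its slopes, while the denser grid samples it near its maximum. A scheme that chooses where to sample densely from the characteristics of each frame would need its own selection rule, which this sketch does not attempt to supply.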
Keywords: Speech Recognition, Automatic Speech Recognition, Feature Extraction Technique, Word Recognition Accuracy, Noisy Situation