Warped and Warped-Twice MVDR Spectral Estimation With and Without Filterbanks

  • Matthias Wölfel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4299)


This paper describes a novel extension to warped minimum variance distortionless response (MVDR) spectral estimation that makes it possible to steer the resolution of the spectral envelope estimate toward lower or higher frequencies while keeping the overall resolution of the estimate and the frequency axis fixed. The effect is achieved by introducing a second bilinear transformation into warped MVDR spectral estimation, applied in the frequency domain (the first bilinear transformation is applied in the time domain), together with a compensation step that adjusts for the pre-emphasis introduced by both bilinear transformations. In the feature extraction stage of an automatic speech recognition system, the extension makes it possible to emphasize classification-relevant characteristics of the speech features and to suppress classification-irrelevant ones, depending on the characteristics of the signal being analyzed.

We compared the novel extension with warped MVDR and with traditional Mel-frequency cepstral coefficients (MFCC) on the development and evaluation data of the Rich Transcription 2005 Spring Meeting Recognition Evaluation lecture meeting task. The results are promising, and we plan to use the described warped and warped-twice front-end settings in the upcoming NIST evaluation.
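
As a rough illustration of how a single bilinear transformation shifts spectral resolution along the frequency axis, the sketch below evaluates the standard phase response of a first-order all-pass filter. It is not the warped-twice MVDR estimator from the paper and does not include the compensation step; the function names and the warp factor value 0.4 are hypothetical choices made only for this example.

```python
import math

def bilinear_warp(omega, alpha):
    """Warped frequency for a first-order all-pass (bilinear) transformation.

    omega: normalized angular frequency in radians, 0 <= omega <= pi.
    alpha: warp factor, -1 < alpha < 1 (alpha = 0 leaves the axis unchanged).
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))

def local_stretch(omega, alpha):
    """Derivative of the warped frequency with respect to omega.

    Values above 1 mean the region is stretched (finer resolution there),
    values below 1 mean it is compressed (coarser resolution there).
    """
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(omega))

if __name__ == "__main__":
    alpha = 0.4  # illustrative value only, not taken from the paper
    for k in range(5):
        omega = k * math.pi / 4.0
        print(f"omega={omega:.3f}  warped={bilinear_warp(omega, alpha):.3f}  "
              f"stretch={local_stretch(omega, alpha):.3f}")
```

With a positive warp factor the low-frequency region is stretched and the high-frequency region is compressed; a negative warp factor does the opposite. In both cases the end points 0 and π stay fixed.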


Speech Recognition; Model Order; High Frequency Region; Warp Factor; Frequency Axis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Matthias Wölfel, Institut für Theoretische Informatik, Universität Karlsruhe (TH), Karlsruhe, Germany
