International Journal of Speech Technology, Volume 3, Issue 1, pp 35–49

Formant and Pitch Detection Using Time-Frequency Distribution

  • Wanda W. Zhao
  • Tokunbo Ogunfunmi
Article

Abstract

The Wigner-Ville distribution of a multi-component signal has a unique structure. Based on this structure, a formant and pitch estimation method for speech signals is introduced. Formants and pitch estimated with this method are more accurate, have better resolution, and are easier to recognize than those estimated by other methods. A one pitch-period segment is adequate for formant estimation while a minimal two pitch-period segment is needed for both pitch and formant detection with one step. Experimental results are provided to demonstrate the performance of this method, and comparisons with other methods are provided.
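
As a rough illustration of the kind of structure a Wigner-Ville distribution shows for a multi-component signal, the sketch below computes a discrete Wigner-Ville distribution of two complex tones. It is not the authors' estimation method, which the abstract does not spell out. The function name `wigner_ville`, the sampling rate, the segment length, and the tone frequencies are all assumptions chosen for the example, and the input is assumed to be analytic (complex) already. Under these assumptions, each tone produces a positive ridge at its own frequency, while a cross term appears midway between them and changes sign over time at the difference frequency.

```python
import cmath
import math


def wigner_ville(z):
    """Discrete Wigner-Ville distribution of an analytic (complex) signal z.

    Hypothetical helper for illustration. Returns a list of rows W[t][k],
    where t is the time index and bin k corresponds to k * fs / (2 * n) Hz.
    """
    n = len(z)
    W = []
    for t in range(n):
        # Largest lag that keeps both t + tau and t - tau inside the segment.
        tau_max = min(t, n - 1 - t)
        row = []
        for k in range(n):
            acc = 0.0
            for tau in range(-tau_max, tau_max + 1):
                # Instantaneous autocorrelation z[t + tau] * conj(z[t - tau]).
                r = z[t + tau] * z[t - tau].conjugate()
                # Fourier transform over the lag variable; the sum is real
                # because r is conjugate-symmetric in tau.
                acc += (r * cmath.exp(-2j * math.pi * k * tau / n)).real
            row.append(acc)
        W.append(row)
    return W


# Assumed test signal: two complex tones (stand-ins for two signal components).
fs, n = 8000.0, 64
f1, f2 = 500.0, 1500.0
z = [cmath.exp(2j * math.pi * f1 * i / fs) + cmath.exp(2j * math.pi * f2 * i / fs)
     for i in range(n)]

W = wigner_ville(z)

bin_hz = fs / (2 * n)                      # frequency spacing of the bins
k1 = round(f1 / bin_hz)                    # bin of the first auto term
k2 = round(f2 / bin_hz)                    # bin of the second auto term
kc = round((f1 + f2) / (2 * bin_hz))       # bin of the cross term (midpoint)

for t in (32, 36):
    print(t, round(W[t][k1], 1), round(W[t][kc], 1), round(W[t][k2], 1))
```

The triple loop is deliberately plain and costs O(n^3); for real speech segments one would normally replace the inner sum with an FFT over the lag variable and first convert the real waveform to its analytic form.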

speech signals, formant and pitch estimation, time-frequency distribution

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Wanda W. Zhao (1)
  • Tokunbo Ogunfunmi (1)

  1. Department of Electrical Engineering, Santa Clara University, Santa Clara, USA