Using Dynamic Time Warping of T0 Contours in the Evaluation of Cycle-to-Cycle Pitch Detection Algorithms

  • Carlos Ferrer
  • Diana Torres
  • María E. Hernández-Díaz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5197)

Abstract

This paper addresses the comparison of Pitch Detection Algorithms that work on a cycle-to-cycle basis. It describes an alignment problem between detected and reference pitch contours and proposes a Dynamic Time Warping procedure to correct it. The method is evaluated using hand-marked real signals and three well-known Pitch Detection Algorithms. The results demonstrate that shifts occur in practice and that the proposed Dynamic Time Warping procedure is useful for correcting them.
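To make the alignment idea concrete, the following is a minimal sketch of textbook Dynamic Time Warping applied to two sequences of cycle durations (T0 values). It is not the procedure from the paper, whose details the abstract does not give. The function name `dtw_align`, the absolute-difference local cost, and the three-way step pattern are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def dtw_align(
    reference: Sequence[float], detected: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two T0 contours with classic DTW (hypothetical helper).

    Returns the accumulated cost and the warping path as a list of
    (reference_index, detected_index) pairs.
    """
    n, m = len(reference), len(detected)
    if n == 0 or m == 0:
        raise ValueError("both contours must contain at least one cycle")

    inf = float("inf")
    # cost[i][j] holds the best accumulated cost of aligning the first
    # i reference cycles with the first j detected cycles.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference of cycle durations.
            local = abs(reference[i - 1] - detected[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # match both cycles
                cost[i - 1][j],      # reference advances alone
                cost[i][j - 1],      # detected advances alone
            )

    # Backtrack from the end to recover the warping path,
    # preferring the diagonal step when costs tie.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return cost[n][m], path
```

For example, if a detector reports one spurious extra cycle at the start of a contour, a direct index-by-index comparison would penalise every later cycle, whereas the warping path maps the extra cycle onto its neighbour and leaves the remaining cycles matched one to one.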

Keywords

Dynamic time warping; pitch determination; waveform matching

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Carlos Ferrer (1)
  • Diana Torres (1)
  • María E. Hernández-Díaz (1)
  1. Center for Studies on Electronics and Information Technologies, Central University of Las Villas, Santa Clara, Cuba