Automatic Classification of Regular vs. Irregular Phonation Types

  • Tamás Bőhm
  • Zoltán Both
  • Géza Németh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5933)

Abstract

Irregular phonation (also called creaky voice, glottalization, and laryngealization) may serve various communicative functions in speech. The automatic classification of phonation type as regular or irregular can therefore have a number of applications in speech technology. In this paper, we propose a classifier that extracts six acoustic cues from vowels and then labels each vowel as regular or irregular by means of a support vector machine. We integrated cues from earlier phonation type classifiers and improved their performance in five out of the six cases. The classifier with the improved cue set produced a 98.85% hit rate and a 3.47% false alarm rate on a subset of the TIMIT corpus.
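
As an illustration of the two reported figures, the following minimal sketch computes a hit rate and a false alarm rate from reference labels and classifier output. It assumes that "irregular" is treated as the positive class, which the abstract does not state explicitly; the function name and the toy labels are hypothetical and are not taken from the paper or from TIMIT.

```python
def hit_and_false_alarm_rates(true_labels, predicted_labels, positive="irregular"):
    """Return (hit_rate, false_alarm_rate) for a two-class labelling.

    Assumption: `positive` names the class counted as a detection target.
    hit rate         = hits / (hits + misses)
    false alarm rate = false alarms / (false alarms + correct rejections)
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    hits = misses = false_alarms = correct_rejections = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                hits += 1
            else:
                misses += 1
        else:
            if guess == positive:
                false_alarms += 1
            else:
                correct_rejections += 1

    positives = hits + misses
    negatives = false_alarms + correct_rejections
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must occur in true_labels")
    return hits / positives, false_alarms / negatives


# Toy data (made up for illustration): 4 irregular and 6 regular vowels.
truth = ["irregular"] * 4 + ["regular"] * 6
guess = ["irregular", "irregular", "irregular", "regular",  # one miss
         "irregular", "regular", "regular", "regular",      # one false alarm
         "regular", "regular"]

hit_rate, fa_rate = hit_and_false_alarm_rates(truth, guess)
print(f"hit rate: {hit_rate:.2f}, false alarm rate: {fa_rate:.2f}")
```

On this toy input, three of the four irregular vowels are detected and one of the six regular vowels is mislabelled, so the two rates are 3/4 and 1/6.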

Keywords

Irregular phonation, creaky voice, glottalization, laryngealization, phonation type, voice quality, support vector machine

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Tamás Bőhm (1)
  • Zoltán Both (1)
  • Géza Németh (1)
  1. Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary