Automatic Classification of Regular vs. Irregular Phonation Types
Irregular phonation (also called creaky voice, glottalization and laryngealization) may have various communicative functions in speech. Thus the automatic classification of phonation type into regular and irregular can have a number of applications in speech technology. In this paper, we propose such a classifier that extracts six acoustic cues from vowels and then labels them as regular or irregular by means of a support vector machine. We integrated cues from earlier phonation type classifiers and improved their performance in five out of the six cases. The classifier with the improved cue set produced a 98.85% hit rate and a 3.47% false alarm rate on a subset of the TIMIT corpus.
Keywords: Irregular phonation, creaky voice, glottalization, laryngealization, phonation type, voice quality, support vector machine
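The abstract reports performance as a hit rate and a false alarm rate. As a minimal sketch of how these two figures are conventionally computed from a binary labelling, the snippet below counts outcomes over vowel labels. It assumes that "irregular" is treated as the positive class, which the abstract does not state. The function name `hit_and_false_alarm_rates` and the toy labels are hypothetical illustrations; they are not the authors' code or data.

```python
def hit_and_false_alarm_rates(true_labels, predicted_labels, positive="irregular"):
    """Return (hit_rate, false_alarm_rate) for a binary labelling.

    hit rate         = hits / (hits + misses)
    false alarm rate = false alarms / (false alarms + correct rejections)

    Assumption: `positive` names the class being detected (here "irregular").
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    hits = misses = false_alarms = correct_rejections = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                hits += 1
            else:
                misses += 1
        else:
            if guess == positive:
                false_alarms += 1
            else:
                correct_rejections += 1

    if hits + misses == 0 or false_alarms + correct_rejections == 0:
        raise ValueError("both classes must occur in true_labels")

    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    return hit_rate, false_alarm_rate


# Toy labels for eight vowels (made up for illustration only).
toy_truth = ["irregular", "irregular", "regular", "regular",
             "regular", "irregular", "regular", "regular"]
toy_predicted = ["irregular", "regular", "regular", "irregular",
                 "regular", "irregular", "regular", "regular"]

toy_hit_rate, toy_false_alarm_rate = hit_and_false_alarm_rates(toy_truth, toy_predicted)
```

In this toy case two of the three irregular vowels are detected and one of the five regular vowels is mislabelled, so the hit rate is 2/3 and the false alarm rate is 1/5. In a full pipeline the predicted labels would come from the support vector machine applied to the six acoustic cues.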