Abstract
The recognition of laryngeal pathology through analysis of the voice is investigated. The fundamental frequency and the first three formants are considered. The recognition strategy is based on comparison with normal ranges calculated from 200 ordinary voices, grouped into ten age classes ranging from 20 to 70 years, for males and females. A total of 220 test voices are studied, divided into four groups: normal voices, functional dysphonia, nodules and recurrent nerve palsy. Each subject is scored against his or her own normal range. Parameters (or items) are calculated on the Interactive Laboratory System workstation. The vocalic material consists of 11 vowels taken from a sentence. Results are given as the number of values falling outside the normal ranges. Statistical analysis considers both the ability of each parameter and the error rates in pathology recognition. Pathology recognition shows the following error percentages: 23% for dysphonia, 14% for nodules and 33% for recurrent nerve palsy. The parameters are not equally effective at characterising voice pathology: formants appear to perform better than the fundamental frequency.
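The scoring idea described in the abstract, counting how many measured values fall outside a subject's normal range, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the normal ranges were derived, so the mean ± k standard deviations rule, the function names `normal_range` and `count_out_of_range`, and the parameter labels `F0` to `F3` are all assumptions introduced here.

```python
from statistics import mean, stdev


def normal_range(values, k=2.0):
    """Return (low, high) for one parameter in one age/sex class.

    ASSUMPTION: the range is mean +/- k sample standard deviations.
    The abstract does not state the rule actually used.
    """
    m = mean(values)
    s = stdev(values)
    return (m - k * s, m + k * s)


def count_out_of_range(measurements, ranges):
    """Count measured values lying outside their normal range.

    measurements: list of dicts, one per vowel, mapping a parameter
                  label (hypothetical: "F0", "F1", "F2", "F3") to a value.
    ranges:       dict mapping each parameter label to (low, high) for
                  the subject's own age/sex class.
    """
    count = 0
    for vowel in measurements:
        for param, value in vowel.items():
            low, high = ranges[param]
            if value < low or value > high:
                count += 1
    return count
```

A subject's score is then the count returned for his or her vowels, using the ranges of the matching age and sex class. Any cut-off on that count for deciding that a voice is pathological would be a further assumption and is deliberately left out.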
Cite this article
Perrin, E., Berger-Vachon, C., Kauffmann, I. et al. Acoustical recognition of laryngeal pathology using the fundamental frequency and the first three formants of vowels. Med. Biol. Eng. Comput. 35, 361–368 (1997). https://doi.org/10.1007/BF02534091
DOI: https://doi.org/10.1007/BF02534091