Abstract
In this study, two tools developed for the parametric extraction and acoustic analysis of voice samples are compared. The main goal of the paper is to contrast the results obtained using the classical Multi Dimensional Voice Program (MDVP) with those obtained using the novel WPCVox. The aim of this comparison was to identify differences and similarities in the parameters extracted by both systems, so that measurements can be compared and data transferred between the two tools. The study was carried out in two stages. In the first, a wide sample of healthy voices from Spanish-speaking adults of both genders was used to compare directly the results given by MDVP with those obtained with WPCVox. In the second, a sample of 200 speakers (53 normal and 173 pathological) taken from a commercially available database of voice disorders was used to demonstrate the usefulness of WPCVox for the acoustic analysis and characterization of normal and pathological voices. The results show that WPCVox provides very reliable measurements that are very similar to those obtained using MDVP, with a very similar capability to discriminate between normal and pathological voices.
Notes
Not all patients underwent an endoscopy, only those judged to present a vocal disorder in a previous psychoacoustic evaluation of their voice. In total, 200 patients underwent these exploration techniques.
The comparison is more reliable when statistical significance is evaluated at the 95% level than at the 99% level.
The normative values in the MDVP manual were obtained from English speakers during sustained phonation of the vowel ‘ah’.
Acknowledgments
This research was partially carried out under grants TIC2003-08956-C02-00 from the Ministry of Education of Spain and AL06-EX-PID-033 from the Universidad Politécnica de Madrid, Spain.
Cite this article
Godino-Llorente, J.I., Osma-Ruiz, V., Sáenz-Lechón, N. et al. Acoustic analysis of voice using WPCVox: a comparative study with Multi Dimensional Voice Program. Eur Arch Otorhinolaryngol 265, 465–476 (2008). https://doi.org/10.1007/s00405-007-0467-x
DOI: https://doi.org/10.1007/s00405-007-0467-x