Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening

Uloza, Virgilijus; Padervinskis, Evaldas; Vegiene, Aurelija; Pribuisiene, Ruta; Saferis, Viktoras; Vaiciukynas, Evaldas; Gelzinis, Adas; Verikas, Antanas

doi:10.1007/s00405-015-3708-4

Laryngology
Published: 11 July 2015

Volume 272, pages 3391–3399, (2015)
European Archives of Oto-Rhino-Laryngology

Virgilijus Uloza¹,
Evaldas Padervinskis¹,
Aurelija Vegiene¹,
Ruta Pribuisiene¹,
Viktoras Saferis²,
Evaldas Vaiciukynas³,
Adas Gelzinis³ &
Antanas Verikas³,⁴

Abstract

The objective of this study was to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and to investigate the utility of SP voice recordings for voice screening. Voice samples of the sustained vowel /a/ from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: an oral AKG Perception 220 microphone and the microphone of a Samsung Galaxy Note3 SP. Fundamental frequency, jitter, shimmer, normalized noise energy (NNE), signal-to-noise ratio and harmonic-to-noise ratio were measured from the acoustic voice signals using Dr. Speech software. The discriminant analysis-based Correct Classification Rate (CCR) and the Random Forest Classifier (RFC)-based Equal Error Rate (EER) were used to evaluate how well the acoustic voice parameters separate normal and pathological voice classes. The Lithuanian version of the Glottal Function Index (LT_GFI) questionnaire was used for self-assessment of the severity of the voice disorder. The correlations between acoustic voice parameters obtained with the two types of microphones were statistically significant and strong (r = 0.73–1.0) across all measurements. When classifying into normal/pathological voice classes, Oral-NNE yielded a CCR of 73.7 %, and the pair of SP-NNE and SP-shimmer parameters yielded a CCR of 79.5 %. Fusion of the results obtained from SP voice recordings and GFI data provided a CCR of 84.60 %, and the RFC yielded an EER of 7.9 %. In conclusion, measurements of acoustic voice parameters using the SP microphone were shown to be reliable in clinical settings, demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, which validates the suitability of the SP microphone signal for automatic voice analysis and screening.
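
For readers unfamiliar with the two summary metrics named in the abstract, the following is a minimal sketch of how a CCR and an EER can be computed from classifier outputs. It is not the authors' pipeline: the study used discriminant analysis and a random forest, whereas the labels, scores, the 0.5 decision threshold and both function names below are hypothetical and chosen only to illustrate the definitions.

```python
def correct_classification_rate(labels, predictions):
    """Fraction of cases whose predicted class matches the true class."""
    hits = sum(1 for y, p in zip(labels, predictions) if y == p)
    return hits / len(labels)


def equal_error_rate(labels, scores):
    """Approximate EER by sweeping thresholds over the observed scores.

    labels: 1 = pathological, 0 = normal (assumed coding).
    scores: higher means "more likely pathological" (assumed convention).
    Returns the mean of the two error rates at the threshold where
    they are closest to each other.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best_gap, eer = None, None
    for t in sorted(set(scores)):
        # Pathological voices scored below the threshold are missed.
        miss_rate = sum(s < t for s in pos) / len(pos)
        # Normal voices scored at or above the threshold are false alarms.
        false_alarm_rate = sum(s >= t for s in neg) / len(neg)
        gap = abs(miss_rate - false_alarm_rate)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (miss_rate + false_alarm_rate) / 2
    return eer


# Made-up toy data, NOT taken from the study.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.6, 0.3, 0.4, 0.7, 0.8, 0.9]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical cut-off

ccr = correct_classification_rate(labels, predictions)
eer = equal_error_rate(labels, scores)
print(f"CCR = {ccr:.2%}, EER = {eer:.2%}")
```

The CCR depends on one fixed decision threshold, while the EER summarizes the trade-off between missed pathological voices and false alarms at the operating point where the two error rates balance, which is why the abstract can report both for the same data.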

Acknowledgments

This study was supported by grant VP1-3.1-ŠMM-10-V-02-030 from the Ministry of Education and Science of the Republic of Lithuania.

Author information

Authors and Affiliations

Department of Otolaryngology, Lithuanian University of Health Sciences, Eiveniu 2, 50009, Kaunas, Lithuania
Virgilijus Uloza, Evaldas Padervinskis, Aurelija Vegiene & Ruta Pribuisiene
Department of Physics, Mathematics and Biophysics, Lithuanian University of Health Sciences, Kaunas, Lithuania
Viktoras Saferis
Department of Electric Power Systems, Kaunas University of Technology, Kaunas, Lithuania
Evaldas Vaiciukynas, Adas Gelzinis & Antanas Verikas
Department of Intelligent Systems, Halmstad University, Halmstad, Sweden
Antanas Verikas

Corresponding author

Correspondence to Virgilijus Uloza.

Ethics declarations

Conflict of interest

No conflicts of interest to declare.

About this article

Cite this article

Uloza, V., Padervinskis, E., Vegiene, A. et al. Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening. Eur Arch Otorhinolaryngol 272, 3391–3399 (2015). https://doi.org/10.1007/s00405-015-3708-4

Received: 20 April 2015
Accepted: 30 June 2015
Published: 11 July 2015
Issue Date: November 2015
DOI: https://doi.org/10.1007/s00405-015-3708-4
