Improving Automatic Detection of Obstructive Sleep Apnea Through Nonlinear Analysis of Sustained Speech
We present a novel approach to detecting severe obstructive sleep apnea (OSA) from patients’ voices, introducing nonlinear measures to describe the dynamics of sustained speech. Nonlinear features were combined with state-of-the-art speech recognition systems that apply statistical modeling techniques (Gaussian mixture models, GMMs) to a cepstral parameterization (MFCC) of both continuous and sustained speech. Tests were performed on a database of speech recordings from both severe OSA and control speakers. Combining MFCC-GMM and nonlinear features gave a 10 % relative reduction in classification error for sustained speech, and fusing nonlinear features with both sustained and continuous MFCC-GMM gave a 33 % relative reduction. Accuracy reached 88.5 %, which supports using the system for early OSA detection. Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients’ voices, which should be found in continuous speech.
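The relative reductions in classification error quoted above are ratios against a baseline error rate, not differences in percentage points. The following is a minimal sketch of that arithmetic. The function names and the baseline error of 20 % are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction by which new_error improves on baseline_error.

    Both arguments are error rates expressed as fractions (0.20 means 20 %).
    """
    if not 0.0 < baseline_error <= 1.0:
        raise ValueError("baseline_error must be in (0, 1]")
    if not 0.0 <= new_error <= 1.0:
        raise ValueError("new_error must be in [0, 1]")
    return (baseline_error - new_error) / baseline_error


def error_after_reduction(baseline_error: float, relative_reduction: float) -> float:
    """Return the error rate left after applying a relative reduction."""
    return baseline_error * (1.0 - relative_reduction)


if __name__ == "__main__":
    # Hypothetical baseline: a classifier that misclassifies 20 % of speakers.
    baseline = 0.20
    for reduction in (0.10, 0.33):
        new_error = error_after_reduction(baseline, reduction)
        print(
            f"relative reduction {reduction:.0%}: "
            f"error {baseline:.1%} -> {new_error:.1%}, "
            f"accuracy {1 - baseline:.1%} -> {1 - new_error:.1%}"
        )
```

Under this hypothetical baseline, a 10 % relative reduction removes two percentage points of error, so the same relative figure corresponds to different absolute gains depending on the starting error rate.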
Keywords: Obstructive sleep apnea (OSA); Continuous speech; Sustained speech; Gaussian mixture models (GMMs); Nonlinear analysis; Speech dynamics; Classification and regression tree (CART)
The activities described in this paper were funded by the Spanish Ministry of Science and Innovation as part of the TEC2009-14719-C02-02 (PriorSpeech) project. The corresponding author also acknowledges the support of the Universidad Politécnica de Madrid full-time PhD scholarship program. Finally, the authors would like to thank Athanasios Tsanas, Max Little, and Professor J. I. Godino Llorente for their comments and suggestions.