Cognitive Computation, Volume 5, Issue 4, pp 458–472

Improving Automatic Detection of Obstructive Sleep Apnea Through Nonlinear Analysis of Sustained Speech

  • José Luis Blanco
  • Luis A. Hernández
  • Rubén Fernández
  • Daniel Ramos


We present a novel approach for detecting severe obstructive sleep apnea (OSA) from patients’ voices, introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10 % relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33 % when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5 %, allowing the system to be used in early OSA detection. Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients’ voices, which should be found in continuous speech.
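The MFCC-GMM classification mentioned in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs and a decision based on the average per-frame log-likelihood ratio between an OSA model and a control model; this is a common setup for GMM-based speaker classification, not necessarily the exact configuration used by the authors. All model parameters, feature values, and function names below are hypothetical and chosen only for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def llr_score(frames, gmm_osa, gmm_control):
    """Average per-frame log-likelihood ratio; positive values favour the OSA model."""
    total = sum(
        gmm_log_likelihood(f, gmm_osa) - gmm_log_likelihood(f, gmm_control)
        for f in frames
    )
    return total / len(frames)


# Hypothetical two-component models over 2-dimensional features
# (real MFCC vectors would have many more dimensions).
GMM_OSA = [(0.6, [1.0, 0.5], [1.0, 1.0]), (0.4, [2.0, 1.5], [0.5, 0.5])]
GMM_CONTROL = [(0.5, [-1.0, -0.5], [1.0, 1.0]), (0.5, [-2.0, 0.0], [0.5, 0.5])]

# Made-up feature frames standing in for one speaker's recording.
frames = [[1.2, 0.4], [1.8, 1.1], [0.9, 0.7]]
score = llr_score(frames, GMM_OSA, GMM_CONTROL)
label = "OSA" if score > 0.0 else "control"
```

In a sketch like this, the zero threshold is an assumption; in practice the decision threshold would be tuned on held-out data, and scores from several subsystems (for example sustained and continuous speech) could be combined before thresholding.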


Keywords: Obstructive sleep apnea (OSA); Continuous speech; Sustained speech; Gaussian mixture models (GMMs); Nonlinear analysis; Speech dynamics; Classification and regression tree (CART)



The activities described in this paper were funded by the Spanish Ministry of Science and Innovation as part of the TEC2009-14719-C02-02 (PriorSpeech) project. The corresponding author also acknowledges the support of the Universidad Politécnica de Madrid full-time PhD scholarship program. Finally, the authors would like to thank Athanasios Tsanas, Max Little and Professor J. I. Godino Llorente for their comments and suggestions.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • José Luis Blanco (1)
  • Luis A. Hernández (1)
  • Rubén Fernández (1)
  • Daniel Ramos (2)
  1. Signal Processing Applications Group, Universidad Politécnica de Madrid, ETSI de Telecomunicación, Madrid, Spain
  2. Biometric Recognition Group (ATVS), Universidad Autónoma de Madrid, Escuela Politécnica Superior, Madrid, Spain
