Pathological speech signal analysis and classification using empirical mode decomposition

Abstract

Automated classification of normal and pathological speech signals can provide an objective and accurate mechanism for pathological speech diagnosis, and is an active area of research. A large part of this research is based on analysis of acoustic measures extracted from sustained vowels. However, sustained vowels do not reflect real-world attributes of voice as effectively as continuous speech, which captures important attributes such as rapid voice onset and termination, changes in voice frequency and amplitude, and sudden discontinuities in speech. This paper presents a methodology based on empirical mode decomposition (EMD) for classification of continuous normal and pathological speech signals obtained from a well-known database. EMD is used to decompose randomly chosen portions of speech signals into intrinsic mode functions, which are then analyzed to extract meaningful temporal and spectral features, including true instantaneous features that can capture discriminative information hidden at local time scales. A total of six features are extracted, and a linear classifier is applied to the resulting feature vector to classify continuous speech portions obtained from a database consisting of 51 normal and 161 pathological speakers. A classification accuracy of 95.7% is obtained, demonstrating the effectiveness of the methodology.
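
As an illustration of the decomposition step described above, the following is a minimal sketch of EMD sifting followed by an instantaneous-frequency estimate from the analytic signal. It is not the authors' implementation: the abstract does not name the six features or the classifier settings, so neither is reproduced here. The function names, the fixed number of sifting iterations, the limit on the number of intrinsic mode functions, and the two-tone test signal are all assumptions made for the example. In particular, the envelopes use linear interpolation to keep the sketch dependent on NumPy alone, whereas standard EMD uses cubic splines.

```python
import numpy as np

def local_extrema(x):
    """Return indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly subtracting the envelope mean.

    Assumption: a fixed iteration count replaces a formal stopping criterion,
    and envelopes are linear (np.interp) rather than cubic splines.
    """
    h = x.copy()
    n = np.arange(len(x))
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = np.interp(n, maxima, h[maxima])
        lower = np.interp(n, minima, h[minima])
        h = h - 0.5 * (upper + lower)
    return h

def emd(x, max_imfs=6, n_sift=10):
    """Decompose x into IMFs plus a residue, so that x = sum(IMFs) + residue."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few oscillations left to extract another IMF
        imf = sift(residue, n_sift)
        imfs.append(imf)
        residue = residue - imf
    return np.array(imfs), residue

def analytic_signal(x):
    """Analytic signal via the FFT: zero negative frequencies, double positive ones."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def instantaneous_frequency(imf, fs):
    """Instantaneous frequency in Hz from the unwrapped analytic-signal phase."""
    phase = np.unwrap(np.angle(analytic_signal(imf)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Hypothetical test signal: a 50 Hz tone riding on a 5 Hz tone, sampled at 1 kHz.
fs = 1000.0
t = np.arange(1000) / fs
signal = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 5 * t)

imfs, residue = emd(signal)
first_imf_freq = np.median(instantaneous_frequency(imfs[0], fs))
print("IMFs extracted:", imfs.shape[0])
print("Median instantaneous frequency of first IMF (Hz):", first_imf_freq)
```

In a pipeline like the one the abstract outlines, statistics of such instantaneous quantities, computed per intrinsic mode function, would be candidate entries for a feature vector passed to a linear classifier. The specific choice of statistics here is an assumption and not taken from the paper.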

Author information

Correspondence to Muhammad Kaleem.

About this article

Cite this article

Kaleem, M., Ghoraani, B., Guergachi, A. et al. Pathological speech signal analysis and classification using empirical mode decomposition. Med Biol Eng Comput 51, 811–821 (2013) doi:10.1007/s11517-013-1051-8

Keywords

  • Empirical mode decomposition
  • Speech signal analysis
  • Feature extraction
  • Pathological speech classification