Continuous Speech Classification Systems for Voice Pathologies Identification

  • Hugo Cordeiro
  • Carlos Meneses
  • José Fonseca
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 450)


Identifying voice pathologies with speech processing methods can serve as a preliminary diagnostic tool. This study compares the performance of two speech tasks, the sustained vowel /a/ and continuous speech, in systems that identify voice pathologies. The systems discriminate among three classes: two different sets of pathologies and healthy subjects. The signals are represented by MFCC (Mel Frequency Cepstral Coefficients) features, which are fed to SVM (Support Vector Machine) and GMM (Gaussian Mixture Model) classifiers. For continuous speech, the GMM system reaches an accuracy of 74% and the SVM system 72%. For the sustained vowel /a/, the GMM and the SVM achieve 66% and 69% respectively, lower than with continuous speech.
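The likelihood-based decision rule behind a GMM classifier of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each class is modeled by a single diagonal-covariance Gaussian (the one-component special case of a GMM), the feature frames are synthetic stand-ins for MFCC vectors, and the class labels, the 12-dimensional feature size, and the helper names are hypothetical.

```python
import math
import random


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        # Floor the variance so the log-likelihood stays finite.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def frame_log_likelihood(frame, model):
    """Log-density of one frame under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def classify(frames, models):
    """Pick the class whose model gives the highest total log-likelihood."""
    scores = {
        label: sum(frame_log_likelihood(f, model) for f in frames)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


def synthetic_frames(rng, center, n_frames, dim=12):
    """Synthetic stand-in for MFCC frames (assumption: not real speech data)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)

# Hypothetical labels for the three classes: healthy plus two pathology sets.
class_centers = {"healthy": 0.0, "pathology_set_1": 3.0, "pathology_set_2": -3.0}

# Train one model per class on that class's frames.
models = {
    label: fit_diag_gaussian(synthetic_frames(rng, center, 200))
    for label, center in class_centers.items()
}

# Score an unseen recording against every class model.
test_recording = synthetic_frames(rng, 3.0, 50)
predicted = classify(test_recording, models)
print(predicted)
```

A full GMM would replace each single Gaussian with a weighted mixture trained by expectation-maximization; the decision rule of summing frame log-likelihoods per class and choosing the maximum stays the same.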


Voice pathologies identification; Continuous speech; Gaussian mixture models; Support vector machines



Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  1. Department of Electrical Engineering, Faculty of Sciences and Technology of the New University of Lisbon, Caparica, Portugal
  2. Department of Electronics and Telecommunications and Computers, High Institute of Engineering of Lisbon, Lisbon, Portugal
