Voice Pathology Detection Using Artificial Neural Networks and Support Vector Machines Powered by a Multicriteria Optimization Algorithm

  • Henry Jhoán Areiza-Laverde
  • Andrés Eduardo Castro-Ospina
  • Diego Hernán Peluffo-Ordóñez
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 915)


Computer-aided diagnosis (CAD) systems have enhanced the performance of conventional medical diagnosis procedures in different scenarios. In particular, in the context of voice pathology detection, machine learning algorithms have proved to be a promising and suitable alternative. This work implements two well-known classification algorithms, artificial neural networks (ANN) and support vector machines (SVM), optimized by the particle swarm optimization (PSO) algorithm, to classify voice signals as healthy or pathological. Three different configurations of the Saarbrücken voice database (SVD) are used. The results show the effect of using balanced and unbalanced versions of this dataset, as well as the usefulness of the considered optimization algorithm for improving the final performance. The performance of the proposed approach is also comparable to that of state-of-the-art methods.
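As an illustration of how PSO can drive a hyperparameter search of the kind described above, the following is a minimal sketch of a basic global-best PSO in plain Python. It is not the authors' implementation: the swarm size, inertia and acceleration coefficients, the search bounds, and the function `toy_validation_error` (a smooth stand-in for the validation error of a classifier, here parameterized by two hypothetical values `log_c` and `log_gamma`) are all assumptions made for the example.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # Global best is the best of the personal bests.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal best
                # + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for a classifier's validation error.

    Assumed for illustration only: a quadratic bowl whose minimum sits at
    log_c = 1.0, log_gamma = -2.0.
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Search bounds are assumptions: log_c in [-3, 3], log_gamma in [-5, 1].
best_params, best_error = pso_minimize(
    toy_validation_error, [(-3.0, 3.0), (-5.0, 1.0)])
```

In a real setting, the toy objective would be replaced by a function that trains the ANN or SVM with the candidate parameters and returns a cross-validated error; the search loop itself would stay the same.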


Keywords: Voice pathology, Computer-aided diagnosis, Optimization, Classification



This work was partially supported by grants from the Programa Nacional de Jóvenes Investigadores e Innovadores, COLCIENCIAS, Announcement 775 of 2017, and by the Instituto Tecnológico Metropolitano, Medellín, Colombia.

The authors also especially thank the SDAS Research Group for its support.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Grupo de Investigación Automática, Electrónica y Ciencias Computacionales, Instituto Tecnológico Metropolitano, Medellín, Colombia
  2. SDAS Research Group, Yachay Tech, Urcuquí, Ecuador
