Abstract
Parkinson’s disease (PD) is a neurodegenerative disorder that affects millions of people. PD is usually diagnosed through a series of empirical tests and sometimes through invasive methods. Distinguishing people with Parkinsonism (PWP) from healthy people using speech signals may lead to innovative, noninvasive PD diagnosis. In this study, we developed a machine learning system that classifies PWP from their speech signals. The system employs four feature selection algorithms, six classifiers, and two validation methods, and it reports the accuracy, sensitivity, specificity, and Matthews correlation coefficient of the results, along with the execution times of the algorithms. All algorithms, classifiers, validation methods, and evaluation metrics used are briefly reviewed in the article. The main contribution of this study is a comprehensive machine learning system for classifying PWP, tested on a PD dataset consisting of multiple types of speech signals. Applying feature selection methods greatly increased classification accuracy. The most significant and discriminative speech features were identified, and their importance is explained and evaluated from a medical perspective.
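As a minimal illustration of the four evaluation metrics named above, the sketch below computes them from the counts of a binary confusion matrix using their standard textbook definitions. It is not the authors' implementation: the function name `binary_metrics` and the example counts are hypothetical, and treating PWP as the positive class is an assumption.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and MCC from confusion counts.

    Assumption: the positive class is PWP and the negative class is healthy.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient, in the range [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; these are not results from the study.
example = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when the two classes are of unequal size.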
About this article
Cite this article
Cantürk, İ., Karabiber, F. A Machine Learning System for the Diagnosis of Parkinson’s Disease from Speech Signals and Its Application to Multiple Speech Signal Types. Arab J Sci Eng 41, 5049–5059 (2016). https://doi.org/10.1007/s13369-016-2206-3
DOI: https://doi.org/10.1007/s13369-016-2206-3