An Analysis of Automated Parkinson’s Diagnosis Using Voice: Methodology and Future Directions

  • Timothy J. Wroge
  • Reza Hosseini Ghomi


The rapid growth of machine learning and artificial intelligence research over the last decade, together with expanding computing capabilities, has made accurate applications accessible in everyday life, to the point where we now see massive adoption by individuals (smart devices) and industries (retail, marketing, real estate, etc.). Healthcare tends to lag in adopting new technology because of several barriers, including regulation, reimbursement, and medical security. One area that has grown outside of healthcare, and where parallel work on medical applications is now under way, is voice computing. In recent years, voice recognition technology has achieved significant milestones, offering users highly accurate performance and leading to widespread adoption: over half of mobile device users now engage with voice assistants regularly. In healthcare, a number of voice computing companies have emerged, including NeuroLex Laboratories and Lyssn, which aim to use machine learning to process the voice signal and provide a clinical measure for patients and providers. Our work here applies machine learning and deep learning tools to a dataset of voice recordings from several thousand subjects, including a cohort with Parkinson’s disease. Our goal is to demonstrate the feasibility and performance of voice computing for detecting the Parkinson’s disease phenotype using only voice. This may enable future use of voice as a digital biomarker for Parkinson’s disease, with benefits including improved access to screening and diagnosis, symptom tracking, decreased cost, and increased accuracy of diagnosis. Our results show that voice data can be used to diagnose Parkinson’s disease accurately. The methodology demonstrated here can be extended to diagnose any illness that physiologically affects the vocal tract, using voice as a digital biomarker.


Parkinson’s disease diagnosis; Machine learning; Audio analysis; Clinical support tools



We would like to acknowledge Yasin Özkanca, Cenk Demiroglu, Dong Si and David C. Atkins for their help in preparing and reviewing the experiments described in this research. Yasin Özkanca and Cenk Demiroglu helped with the feature extraction and the generation of the mRMR algorithms. Dong Si and David C. Atkins provided input on the original study and feedback on the content of this work.

Data was contributed by users of the Parkinson mPower mobile application as part of the mPower study developed by Sage Bionetworks and described in Synapse [25].

Disclosure Statement: At the time of manuscript preparation, Dr. Hosseini Ghomi was an employee of NeuroLex Laboratories and owned stock in the company.


  1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al. (2016). Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (pp. 265–283).
  2. Wood, M. (2017). Introducing Gluon: a new library for machine learning from AWS and Microsoft: Introducing Gluon. Amazon Web Services.
  3. Giannakopoulos, T. (2015). pyAudioAnalysis: An open-source python library for audio signal analysis. PLoS One, 10(12), e0144610.
  4. Bedi, G., Carrillo, F., Cecchi, G. A., Slezak, D. F., Sigman, M., Mota, N. B., et al. (2015). Automated analysis of free speech predicts psychosis onset in high-risk youths. npj Schizophrenia, 1(1). Article number: 15030.
  5. Pestian, J. P., Sorter, M., Connolly, B., Bretonnel Cohen, K., McCullumsmith, C., Gee, J. T., et al. (2017). A machine learning approach to identifying the thought markers of suicidal subjects: A prospective multicenter trial. Suicide and Life-Threatening Behavior, 47(1), 112–121.
  6. Khodabakhsh, A., Yesil, F., Guner, E., & Demiroglu, C. (2015). Evaluation of linguistic and prosodic features for detection of Alzheimer’s disease in Turkish conversational speech. EURASIP Journal on Audio, Speech, and Music Processing, 2015, 9.
  7. Human voiceome project 2019.
  8. Tysnes, O.-B., & Storstein, A. (2017). Epidemiology of Parkinson’s disease. Journal of Neural Transmission, 124(8), 901–905.
  9.
  10. Savitt, J. M., Dawson, V. L., & Dawson, T. M. (2006). Diagnosis and treatment of Parkinson disease: Molecules to medicine. Journal of Clinical Investigation, 116(7), 1744–1754.
  11. Goetz, C. G., Tilley, B. C., Shaftman, S. R., Stebbins, G. T., Fahn, S., Martinez-Martin, P., et al. (2008). Movement disorder society-sponsored revision of the unified Parkinson’s disease rating scale (MDS-UPDRS): Scale presentation and clinimetric testing results. Movement Disorders, 23(15), 2129–2170.
  12. Hoehn, M. M., & Yahr, M. D. (1967). Parkinsonism: Onset, progression and mortality. Neurology, 17(5), 427–442.
  13. Magrinelli, F., Picelli, A., Tocco, P., Federico, A., Roncari, L., Smania, N., et al. (2016). Pathophysiology of motor dysfunction in Parkinson’s disease as the rationale for drug treatment and rehabilitation. Parkinson’s Disease, 2016, 9832839.
  14. Uitti, R. J., Baba, Y., Wszolek, Z. K., & Putzke, D. J. (2005). Defining the Parkinson’s disease phenotype: Initial symptoms and baseline characteristics in a clinical cohort. Parkinsonism & Related Disorders, 11(3), 139–145.
  15. Asgari, M., & Shafran, I. (2010). Extracting cues from speech for predicting severity of Parkinson’s disease. In 2010 IEEE International Workshop on Machine Learning for Signal Processing (pp. 462–467).
  16. Bernheimer, H., Birkmayer, W., Hornykiewicz, O., Jellinger, K., & Seitelberger, F. (1973). Brain dopamine and the syndromes of Parkinson and Huntington: Clinical, morphological and neurochemical correlations. Journal of the Neurological Sciences, 20(4), 415–455.
  17. Harel, B., Cannizzaro, M., & Snyder, P. J. (2004). Variability in fundamental frequency during speech in prodromal and incipient Parkinson’s disease: A longitudinal case study. Brain and Cognition, 56(1), 24–29.
  18. Harel, B. T., Cannizzaro, M. S., Cohen, H., Reilly, N., & Snyder, P. J. (2004). Acoustic characteristics of Parkinsonian speech: A potential biomarker of early disease progression and treatment. Journal of Neurolinguistics, 17(6), 439–453.
  19. Garcia, A. M., Carrillo, F., Orozco-Arroyave, J. R., Trujillo, N., Vargas Bonilla, J. F., Fittipaldi, S., et al. (2016). How language flows when movements don’t: An automated analysis of spontaneous discourse in Parkinson’s disease. Brain and Language, 162, 19–28.
  20. Tsanas, A., Little, M. A., McSharry, P. E., Spielman, J., & Ramig, L. O. (2012). Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease. IEEE Transactions on Biomedical Engineering, 59(5), 1264–1271.
  21. Tsanas, A., Little, M. A., & Ramig, L. O. (2010). Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests. IEEE Transactions on Biomedical Engineering, 57(4), p. 10.
  22. Khan, T. (2014). Running-speech MFCC are better markers of Parkinsonian speech deficits than vowel phonation and diadochokinetic.
  23. Arora, S., Venkataraman, V., Zhan, A., Donohue, S., Biglan, K. M., Dorsey, E. R., et al. (2015). Detecting and monitoring the symptoms of Parkinson’s disease using smartphones: A pilot study. Parkinsonism & Related Disorders, 21(6), 650–653.
  24. Zhan, A., Mohan, S., Tarolli, C., Schneider, R. B., Adams, J. L., Sharma, S., et al. (2018). Using smartphones and machine learning to quantify Parkinson disease severity: The mobile Parkinson disease score. JAMA Neurology, 75(7), 876–880.
  25. Bot, B. M., Suver, C., Neto, E. C., Kellen, M., Klein, A., Bare, C., et al. (2016). The mPower study, Parkinson disease mobile data collected using ResearchKit. Scientific Data, 3, 160011.
  26. Rizzo, G., Copetti, M., Arcuti, S., Martino, D., Fontana, A., & Logroscino, G. (2016). Accuracy of clinical diagnosis of Parkinson disease: A systematic review and meta-analysis. Neurology, 86(6), 566–576.
  27. ITU-T. (2011). Objective measurement of active speech level. Recommendation P.56. International Telecommunications Union.
  28. Brookes, M. (1997). VOICEBOX: A speech processing toolbox for MATLAB. Software library, Imperial College, London, 1997–2018.
  29. Eyben, F., Scherer, K. R., Schuller, B. W., Sundberg, J., André, E., Busso, C., et al. (2016). The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing. IEEE Transactions on Affective Computing, 7(2), 190–202.
  30. Schuller, B., Steidl, S., Batliner, A., Vinciarelli, A., Scherer, K., Ringeval, F., et al. (2013). The INTERSPEECH 2013 computational paralinguistics challenge: Social signals, conflict, emotion, autism. In Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association, Lyon.
  31. Zheng, F., Zhang, G., & Song, Z. (2001). Comparison of different implementations of MFCC. Journal of Computer Science and Technology, 16(6), 582–589.
  32. Valstar, M., Schuller, B., Smith, K., Eyben, F., Jiang, B., Bilakhia, S., et al. (2013). AVEC 2013: The continuous audio/visual emotion and depression recognition challenge. In Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge (pp. 3–10). New York: ACM.
  33. Eyben, F., Wöllmer, M., & Schuller, B. (2010). Opensmile: the munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM International Conference on Multimedia (pp. 1459–1462). New York: ACM.
  34. Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226–1238.
  35. Özkanca, Y., Demiroglu, C., Besirli, A., & Celik, S. (2018). Multi-lingual depression-level assessment from conversational speech using acoustic and text features. Proceedings of the INTERSPEECH 2018 (pp. 3398–3402).
  36. Zhang, Y., Ding, C., & Li, T. (2008). Gene selection algorithm by combining ReliefF and mRMR. BMC Genomics, 9(2), S27.
  37. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
  38. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
  39. Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
  40. Breiman, L. (2017). Classification and regression trees. Abingdon: Routledge.
  41. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
  42. Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42.
  43. Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29, 1189–1232.
  44. Suykens, J. A., & Vandewalle, J. (1999). Least squares support vector machine classifiers. Neural Processing Letters, 9(3), 293–300.
  45. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
  46. Hsu, C.-W., Chang, C.-C., Lin, C.-J., et al. (2003). A Practical Guide to Support Vector Classification.
  47. Franklin, J. (2005). The elements of statistical learning: Data mining, inference and prediction. The Mathematical Intelligencer, 27(2), 83–85.
  48. Gers, F. A., Schmidhuber, J., & Cummins, F. (1999). Learning to Forget: Continual Prediction with LSTM.
  49. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097–1105).
  50. Rosenblatt, F. (1961). Principles of neurodynamics. Perceptrons and the theory of brain mechanisms. Tech. rep., Cornell Aeronautical Lab Inc., Buffalo, NY.
  51. Pedamonti, D. (2018). Comparison of nonlinear activation functions for deep neural networks on MNIST classification task. Preprint. arXiv:1804.02763.
  52. Ruder, S. (2016). An overview of gradient descent optimization algorithms. Preprint. arXiv:1609.04747.
  53. Ng, A. Y. (2004). Feature selection, L1 vs. L2 regularization, and rotational invariance. In Proceedings of the Twenty-First International Conference on Machine Learning (p. 78). New York: ACM.
  54. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al. (2016). Tensorflow: A system for large-scale machine learning. In OSDI (Vol. 16, pp. 265–283).
  55. Chollet, F., et al. (2015). Keras.
  56. Adler, C. H., Beach, T. G., Hentz, J. G., Shill, H. A., Caviness, J. N., Driver-Dunckley, E., et al. (2014). Low clinical diagnostic accuracy of early vs advanced Parkinson disease. Neurology, 83, 406–412.
  57. Schrag, A., Ben-Shlomo, Y., & Quinn, N. (2002). How valid is the clinical diagnosis of Parkinson’s disease in the community? Journal of Neurology, Neurosurgery, and Psychiatry, 73(5), 529–534.
  58. Pittman, B., Hosseini Ghomi, R., & Si, D. (2018). Parkinson’s disease classification of mPower walking activity participants. In IEEE Engineering in Medicine and Biology Conference.
  59. Zhang, L., Chen, X., Vakil, A., Byott, A., & Ghomi, R. H. (2019). DigiVoice: Voice biomarker featurization and analysis pipeline. Preprint. arXiv:1906.07222.
  60. Schwoebel, J. (2019). Introduction to Voice Computing in Python. Scotts Valley: CreateSpace Independent Publishing Platform. Google-Books-ID.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Engineering, University of Pittsburgh, Pittsburgh, USA
  2. NeuroLex Laboratories, Seattle, USA
