Abstract
In recent years, the literature has reported successful results in the automatic detection of Parkinson’s disease (PD) from voice and speech, especially for patients in medium or late stages of the disorder. By contrast, the prediction of UPDRS scores, which are used to assess the severity of the disorder or the efficacy of treatments, has mostly performed poorly. These results could be explained by the need for more complex machine learning models than in the detection case, and by the lack of databases large enough to train artificial intelligence models properly. To analyse possible solutions to these problems, this work explores the potential of Deep Neural Network (DNN) and Convolutional Neural Network (CNN) models, along with transfer learning approaches, for the automatic prediction of UPDRS scores. Experiments are carried out using feature engineering and feature learning methodologies. For feature engineering, a series of well-known features used to characterise vocal conditions is employed to train a DNN. The feature learning approach transforms the input speech using modulation spectra to train a CNN, again within a transfer learning framework. For transfer learning, the networks are trained using voice signals from patients in databases of organic and functional voice pathologies, following a network architecture that has recently proven successful for voice quality assessment using the GRB scale. The approach also combines feature learning and feature engineering using a multimodal strategy. The fine-tuning of the last layers in the second network is carried out using two databases of PD patients. The results provide insights into the potential of deep learning, together with transfer learning strategies, for the prediction of UPDRS scores in parkinsonian speech.
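The idea of fine-tuning only the last layers of a pretrained network can be illustrated with a minimal NumPy sketch. It is not the chapter's architecture: the single frozen layer stands in for layers pretrained on another voice database, the data are synthetic, and every name (`fine_tune_last_layer`, `w_frozen`, `w_out`) as well as the layer sizes, learning rate and score range are assumptions made for illustration only.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def fine_tune_last_layer(features, scores, w_frozen, w_out, b_out,
                         lr=0.01, epochs=200):
    """Fit only the output layer by gradient descent on the mean squared error.

    w_frozen is a hypothetical stand-in for pretrained layers; it is never updated.
    """
    hidden = relu(features @ w_frozen)  # fixed representation, computed once
    n = len(scores)
    losses = []
    for _ in range(epochs):
        err = hidden @ w_out + b_out - scores
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the MSE with respect to the output layer only.
        w_out = w_out - lr * (2.0 / n) * (hidden.T @ err)
        b_out = b_out - lr * (2.0 / n) * err.sum()
    return w_out, b_out, losses


rng = np.random.default_rng(0)
features = rng.normal(size=(64, 8))          # synthetic stand-in for acoustic features
scores = rng.uniform(0.0, 100.0, size=64)    # synthetic regression targets, not real UPDRS data
w_frozen = 0.3 * rng.normal(size=(8, 16))    # assumed "pretrained" weights
w_frozen_before = w_frozen.copy()

w_out, b_out, losses = fine_tune_last_layer(
    features, scores, w_frozen, np.zeros(16), 0.0
)
print(f"MSE before: {losses[0]:.1f}, after: {losses[-1]:.1f}")
```

Keeping the early layers fixed means that only a small number of parameters has to be estimated from the target data, which is the usual motivation for this strategy when the target database is small.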
Acknowledgements
This work was supported by the Universidad de Antioquia, Medellín, Colombia, and the Ministry of Economy and Competitiveness of Spain under grant DPI2017-83405-R1.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Arias-Londoño, J.D., Gómez-García, J.A. (2020). Predicting UPDRS Scores in Parkinson’s Disease Using Voice Signals: A Deep Learning/Transfer-Learning-Based Approach. In: Godino-Llorente, J.I. (eds) Automatic Assessment of Parkinsonian Speech. AAPS 2019. Communications in Computer and Information Science, vol 1295. Springer, Cham. https://doi.org/10.1007/978-3-030-65654-6_6
DOI: https://doi.org/10.1007/978-3-030-65654-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65653-9
Online ISBN: 978-3-030-65654-6
eBook Packages: Computer Science (R0)