Language Independent Detection Possibilities of Depression by Speech

Kiss, Gábor; Tulics, Miklós Gábriel; Sztahó, Dávid; Esposito, Anna; Vicsi, Klára

doi:10.1007/978-3-319-28109-4_11

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 48)

Abstract

In this study, acoustic-phonetic and statistical analyses of continuous speech were performed to find parameters of depressed speech that differ significantly from those of a healthy reference group. Read speech was recorded in Hungarian and Italian from both healthy people and patients diagnosed with different degrees of depression. The statistical examination showed that many parameters in the speech of depressed people differ significantly from those of the healthy reference group. Moreover, most of these parameters behave similarly in another language, Italian. These parameters were then used as input to classifiers that separate healthy from depressed speech. Two classification methods were compared: a Support Vector Machine (SVM) and a two-layer feed-forward neural network (NN). The two methods gave the same result when trained and tested on Hungarian speech (classification accuracy was 75% for both SVM and NN). When trained on Hungarian and tested on Italian healthy and depressed speech, both classifiers reached 77% accuracy.
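
The cross-language evaluation described in the abstract (fit a classifier on speakers of one language, score it on speakers of another) can be sketched as follows. This is a minimal illustration, not the authors' method: a nearest-centroid classifier stands in for the SVM and NN so that the example needs only the Python standard library, and the function names and the two-dimensional feature values are invented for illustration. They are not data or results from the study.

```python
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def fit_nearest_centroid(features, labels):
    """Return a dict mapping each class label to its centroid."""
    classes = sorted(set(labels))
    return {
        c: centroid([x for x, y in zip(features, labels) if y == c])
        for c in classes
    }


def predict(model, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, model[c]))
    return min(model, key=dist)


def accuracy(model, features, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, x) == y for x, y in zip(features, labels))
    return hits / len(labels)


# Invented toy vectors standing in for acoustic-phonetic parameters.
# "train" plays the role of the training language, "test" the other one.
train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.1]]
train_y = ["healthy", "healthy", "depressed", "depressed"]
test_x = [[0.1, 0.2], [0.9, 1.0], [0.3, 0.1], [1.1, 0.8]]
test_y = ["healthy", "depressed", "healthy", "depressed"]

# Fit on one language only, then score on the other without refitting.
model = fit_nearest_centroid(train_x, train_y)
cross_language_accuracy = accuracy(model, test_x, test_y)
print(cross_language_accuracy)
```

The point of the sketch is the protocol: nothing from the test language is used while fitting, so the score reflects how well the parameters transfer across languages.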

Acknowledgments

The authors would like to thank the European Space Agency project Psychological Status Monitoring by Computerised Analysis of Language phenomena (COALA) (AO-11-Concordia).

Author information

Authors and Affiliations

Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
Gábor Kiss, Miklós Gábriel Tulics, Dávid Sztahó & Klára Vicsi
Department of Psychology and IIASS, Seconda Università di Napoli, Caserta, Italy
Anna Esposito

Corresponding author

Correspondence to Gábor Kiss.

Editor information

Editors and Affiliations

Department of Psychology, Seconda Università di Napoli and IIASS, Caserta, Italy
Anna Esposito
(Pompeu Fabra University), Escola Superior Politècnica Tecnocampus, Mataró, Spain
Marcos Faundez-Zanuy
sezione di Napoli Osservatorio, Istituto Nazionale di Geofisica e Vulcan, Napoli, Italy
Antonietta M. Esposito
Department of Psychology, Seconda Università di Napoli and IIASS, Caserta, Italy
Gennaro Cordasco
Boulevard Dolez, University of Mons, TCTS Lab.31, Mons, Belgium
Thomas Drugman
Data and Signal Processing Research Grou, University of Vic, Vic, Spain
Jordi Solé-Casals
NeuroLab, Università degli Studi "Mediterranea" di, Reggio Calabria, Italy
Francesco Carlo Morabito

About this chapter

Cite this chapter

Kiss, G., Tulics, M.G., Sztahó, D., Esposito, A., Vicsi, K. (2016). Language Independent Detection Possibilities of Depression by Speech. In: Esposito, A., et al. Recent Advances in Nonlinear Speech Processing. Smart Innovation, Systems and Technologies, vol 48. Springer, Cham. https://doi.org/10.1007/978-3-319-28109-4_11

DOI: https://doi.org/10.1007/978-3-319-28109-4_11
Published: 23 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28107-0
Online ISBN: 978-3-319-28109-4
eBook Packages: Engineering, Engineering (R0)