Language Independent Detection Possibilities of Depression by Speech

  • Chapter
Recent Advances in Nonlinear Speech Processing

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 48)


This study used acoustic-phonetic analysis of continuous speech, together with statistical analysis, to find speech parameters that differ significantly between depressed speakers and a healthy reference group. Read speech was collected in Hungarian and Italian from healthy people and from patients diagnosed with different degrees of depression. Statistical examination showed that many parameters in the speech of depressed people differ significantly from those of the healthy reference group, and that most of these parameters behave similarly in other languages, such as Italian. These parameters were then used as classifier inputs to separate healthy from depressed speech. Two classification methods were compared: a Support Vector Machine (SVM) and a two-layer feed-forward neural network (NN). When trained and tested on Hungarian, the two methods performed identically, each reaching 75 % classification accuracy. When trained on Hungarian and tested on Italian healthy and depressed speech, both classifiers reached 77 % accuracy.
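The cross-language protocol described above (fit on one language, evaluate on another) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the SVM and NN used in the study, the two features per speaker are unnamed placeholders, and every number is invented toy data rather than a measurement or result from the chapter. The point it shows is that the normalisation statistics and the model are estimated on the training language only, then applied unchanged to the test language.

```python
from statistics import mean, stdev

def fit_scaler(rows):
    """Per-feature mean and standard deviation, estimated on training data only."""
    cols = list(zip(*rows))
    return [mean(c) for c in cols], [stdev(c) or 1.0 for c in cols]

def scale(rows, scaler):
    """Z-score each feature using previously fitted statistics."""
    means, stds = scaler
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def fit_centroids(rows, labels):
    """Nearest-centroid stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(c) for c in zip(*members)]
    return centroids

def predict(rows, centroids):
    """Assign each row to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(r, centroids[lab])) for r in rows]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

# Invented toy data: two hypothetical speech parameters per speaker.
# "train" plays the role of the Hungarian set, "test" the Italian set.
train_x = [[5.0, 0.20], [5.2, 0.22], [4.8, 0.18],
           [3.9, 0.35], [4.1, 0.33], [3.7, 0.38]]
train_y = ["healthy"] * 3 + ["depressed"] * 3
test_x = [[5.1, 0.21], [4.9, 0.24], [4.0, 0.36], [3.8, 0.31]]
test_y = ["healthy", "healthy", "depressed", "depressed"]

scaler = fit_scaler(train_x)                       # statistics from training language only
model = fit_centroids(scale(train_x, scaler), train_y)
pred = predict(scale(test_x, scaler), model)       # test language reuses the same scaler
acc = accuracy(pred, test_y)
print(f"cross-language accuracy on toy data: {acc:.2f}")
```

Reusing the training-language scaler on the test language keeps the evaluation honest: no statistic from the test corpus leaks into the model. Swapping in an actual SVM or neural network would change only `fit_centroids` and `predict`.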



The authors would like to thank the European Space Agency COALA project: Psychological Status Monitoring by Computerised Analysis of Language phenomena (AO-11-Concordia).

Author information

Corresponding author

Correspondence to Gábor Kiss.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Kiss, G., Tulics, M.G., Sztahó, D., Esposito, A., Vicsi, K. (2016). Language Independent Detection Possibilities of Depression by Speech. In: Esposito, A., et al. Recent Advances in Nonlinear Speech Processing. Smart Innovation, Systems and Technologies, vol 48. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28107-0

  • Online ISBN: 978-3-319-28109-4

  • eBook Packages: Engineering (R0)