
D-ResNet-PVKELM: deep neural network and paragraph vector based kernel extreme machine learning model for multimodal depression analysis

Published in Multimedia Tools and Applications

Abstract

Depression severely affects people's physical and mental health. It arises from changes in mood, loss of interest, and stress, and can lead to self-harm and suicide; analyzing depression is therefore important for reducing suicidal acts. In recent years, automatic depression assessment has advanced with computer vision technology. Several models have been investigated for depression analysis, but they are limited to video and audio data. In this paper, a hybrid Artificial Intelligence (AI) based multimodal depression analysis approach is proposed, in which the severity of depression is estimated from multimodal data comprising video, audio, and text descriptors. First, the proposed approach estimates the Patient Health Questionnaire (PHQ) depression scale using a hybrid Residual Network based Deep Neural Network (D-ResNet), which computes the PHQ-8 score from video and audio features. Then, a Paragraph Vector based Kernel Extreme Learning Machine (PV-KELM) is developed to infer the mental and physical states of individuals related to the psychoanalytic features of depression; it recognizes the absence or presence of the measured psychoanalytic symptoms. Finally, the PHQ-8 score estimated by the D-ResNet and the psychoanalytic symptoms extracted by the PV-KELM are fed together into an ensemble classifier comprising a Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT), which classifies whether the individual is depressed or not. The proposed approach is implemented in Python, and experiments are carried out on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) depression dataset. The proposed approach obtains an accuracy of 0.89, precision of 0.86, recall of 0.86, F-measure of 0.86, RMSE of 0.373, MAE of 0.35, JSD of 0.355, and contextual similarity of 0.689. Comparison with state-of-the-art approaches shows the efficiency of the proposed approach.
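The abstract names the three fusion-stage classifiers but not the fusion rule. Below is a minimal, self-contained sketch of that final ensemble step, assuming majority (hard) voting via scikit-learn's VotingClassifier and synthetic placeholder features; the D-ResNet and PV-KELM front ends, and the toy PHQ-8 cutoff of 10 for the depressed label, are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the final fusion stage described in the abstract:
# the PHQ-8 score estimated by D-ResNet and the binary psychoanalytic
# symptom indicators produced by PV-KELM are concatenated and fed to an
# ensemble of SVM, Naive Bayes, and Decision Tree classifiers.
# All feature values below are synthetic placeholders.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical fused descriptors for 200 interview sessions:
#   column 0     -> PHQ-8 score estimated by D-ResNet (range 0..24)
#   columns 1..8 -> absence/presence of 8 psychoanalytic symptoms (PV-KELM)
phq8 = rng.uniform(0, 24, size=(200, 1))
symptoms = rng.integers(0, 2, size=(200, 8))
X = np.hstack([phq8, symptoms])
y = (phq8.ravel() >= 10).astype(int)  # toy label: PHQ-8 >= 10 -> depressed

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Majority-vote ensemble over the three base classifiers named in the abstract.
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf")),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(max_depth=5)),
    ],
    voting="hard",
)
ensemble.fit(X_tr, y_tr)
print("depressed / not-depressed accuracy: %.3f" % ensemble.score(X_te, y_te))
```

Hard voting yields a single depressed/not-depressed decision from the three base classifiers; soft (probability-averaged) voting would be the natural alternative if the base models expose calibrated probabilities.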


Data availability

Data sharing is not applicable to this article.


Author information


Contributions

All authors contributed equally to this work.

Corresponding author

Correspondence to Swasthika Jain T J.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

All the authors involved have agreed to participate in this submitted article.

Consent to publish

All the authors involved in this manuscript give full consent for publication of this submitted article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

T J, S.J., Jacob, I.J. & Mandava, A.K. D-ResNet-PVKELM: deep neural network and paragraph vector based kernel extreme machine learning model for multimodal depression analysis. Multimed Tools Appl 82, 25973–26004 (2023). https://doi.org/10.1007/s11042-023-14351-y


  • DOI: https://doi.org/10.1007/s11042-023-14351-y
