D-ResNet-PVKELM: deep neural network and paragraph vector based kernel extreme machine learning model for multimodal depression analysis

T J, Swasthika Jain; Jacob, I. Jeena; Mandava, Ajay Kumar

doi:10.1007/s11042-023-14351-y

Published: 11 January 2023

Volume 82, pages 25973–26004, (2023)

Multimedia Tools and Applications

Swasthika Jain T J¹, I. Jeena Jacob¹ & Ajay Kumar Mandava²

Abstract

Depression heavily affects people's physical and mental health. It is associated with changes in mood, loss of interest, and stress, and it can lead to self-harm and suicide, so analyzing depression is important for reducing suicidal acts. In recent years, automatic depression assessment has been developed using computer vision techniques. Several models have been investigated for depression analysis, but they are limited to video and audio data. This paper proposes a hybrid Artificial Intelligence (AI) based multi-modal depression analysis approach that estimates the severity of depression from video, audio, and text descriptors. First, a hybrid Residual Network based Deep Neural Network (D-ResNet) estimates the Patient Health Questionnaire (PHQ) depression scale by computing the PHQ-8 score from video and audio features. Then, a Paragraph Vector Kernel Extreme Learning Machine (PV-KELM) is developed to infer the mental and physical states of individuals related to the psychoanalytic features of depression; it recognizes the presence or absence of the measured psychoanalytic symptoms. Finally, the PHQ-8 score estimated by D-ResNet and the psychoanalytic symptoms recognized by PV-KELM are fed together into an ensemble classifier. The ensemble uses three classifiers, namely Support Vector Machine (SVM), Naive Bayes (NB), and Decision Tree (DT), to classify whether the individual is depressed or not. The proposed approach is implemented in Python, and the experiments are carried out on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) interview depression dataset. With the proposed approach, the accuracy, precision, recall, F-measure, RMSE, MAE, JSD, and contextual similarity obtained are 0.89, 0.86, 0.86, 0.86, 0.373, 0.35, 0.355, and 0.689, respectively. The proposed approach is compared with state-of-the-art approaches, and the results demonstrate its effectiveness.
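The final fusion and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes that the three base classifiers are combined by simple majority voting, which the abstract does not specify, and it uses small hypothetical label and score lists rather than DAIC-WOZ data. The function names (`majority_vote`, `classification_metrics`, `regression_errors`) are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def majority_vote(votes):
    """Return the label chosen by most base classifiers for one subject."""
    return Counter(votes).most_common(1)[0][0]


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def regression_errors(true_scores, predicted_scores):
    """RMSE and MAE between true and predicted scores (e.g. PHQ-8)."""
    errors = [p - t for t, p in zip(true_scores, predicted_scores)]
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


# Hypothetical per-subject decisions (1 = depressed, 0 = not depressed)
# standing in for the outputs of the SVM, NB and DT classifiers.
svm_pred = [1, 0, 1, 1, 0, 0]
nb_pred = [1, 0, 0, 1, 0, 1]
dt_pred = [0, 0, 1, 1, 1, 0]
y_true = [1, 0, 1, 0, 0, 0]

# Assumed fusion rule: majority vote across the three classifiers.
fused = [majority_vote(v) for v in zip(svm_pred, nb_pred, dt_pred)]
metrics = classification_metrics(y_true, fused)

# Hypothetical true and predicted PHQ-8 scores for three subjects.
rmse, mae = regression_errors([4, 12, 7], [6, 10, 7])
```

With three binary voters a tie cannot occur, which is one reason an odd number of base classifiers is convenient for this kind of fusion. A weighted or probability-based vote would be an equally plausible reading of the abstract.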

Data availability

Data sharing is not applicable to this article.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, GITAM School of Technology, Bangalore Campus, Bengaluru, India
Swasthika Jain T J & I. Jeena Jacob
Department of Electrical, Electronics & Communication Engineering, GITAM School of Technology, Bangalore Campus, Bengaluru, India
Ajay Kumar Mandava

Contributions

All authors contributed equally to this work.

Corresponding author

Correspondence to Swasthika Jain T J.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

All the authors involved have agreed to participate in this submitted article.

Consent to publish

All the authors involved in this manuscript give full consent for publication of this submitted article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

T J, S.J., Jacob, I.J. & Mandava, A.K. D-ResNet-PVKELM: deep neural network and paragraph vector based kernel extreme machine learning model for multimodal depression analysis. Multimed Tools Appl 82, 25973–26004 (2023). https://doi.org/10.1007/s11042-023-14351-y

Received: 21 June 2022
Revised: 09 November 2022
Accepted: 02 January 2023
Published: 11 January 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11042-023-14351-y
