
Abstract

Depression has typically been diagnosed through self-reports or professional interviews, although these methods often miss significant behavioral signals. People with depression may not express their feelings accurately, which can make it hard for psychologists to diagnose them correctly. We believe that paying attention to how people speak and behave can help identify depression more reliably. In real-life settings, psychologists draw on several cues, such as how someone talks, their body language, and changes in their emotions during conversation. To detect signs of depression more accurately, we present MANOBAL, a system that analyzes voice, text, and facial expressions. We use the DAIC-WoZ dataset, requested from the University of Southern California (USC), to train and evaluate the multimodal depression detection model. Deep learning models are challenged by such complex data, so MANOBAL adopts a multimodal approach: it combines features from audio recordings, transcribed text, and facial expressions to predict both the presence and the severity of depression. This fusion has two advantages. First, it can compensate for unreliable data in one modality (such as voice) by drawing on another (text or facial expressions). Second, it can give more weight to the more dependable data sources, which improves accuracy. Small datasets make it difficult to evaluate the accuracy of fusion models, but MANOBAL mitigates this by exploiting the DAIC-WoZ dataset's transfer characteristics and increasing the number of training labels. The initial results are encouraging, with a root mean square error of 0.168 for predicting depression severity. Experiments show the effectiveness of combining modalities: high-level features based on Mel Frequency Cepstral Coefficients (MFCC) provide useful information about depression, and adding further audio characteristics and facial action units increases accuracy by 10% and 20%, respectively.
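The abstract does not give implementation details, so the following is only a minimal sketch of the weighted late-fusion idea described above, assuming PyTorch and hypothetical feature dimensions for MFCC-based audio statistics, transcript embeddings, and facial action unit intensities. The encoder sizes, the softmax-weighted fusion, and all names below are illustrative assumptions, not the authors' MANOBAL architecture.

import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Small MLP mapping one modality's feature vector to a shared embedding."""
    def __init__(self, in_dim, hidden_dim=64, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, out_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class LateFusionRegressor(nn.Module):
    """Weighted late fusion of audio, text, and facial embeddings into a severity score."""
    def __init__(self, audio_dim=40, text_dim=300, face_dim=20, emb_dim=32):
        super().__init__()
        self.audio_enc = ModalityEncoder(audio_dim, out_dim=emb_dim)  # e.g. MFCC summary statistics
        self.text_enc = ModalityEncoder(text_dim, out_dim=emb_dim)    # e.g. transcript embedding
        self.face_enc = ModalityEncoder(face_dim, out_dim=emb_dim)    # e.g. facial action units
        # Learnable scalar weights let the model favour the more dependable modality.
        self.modality_logits = nn.Parameter(torch.zeros(3))
        self.head = nn.Linear(emb_dim, 1)  # regress a depression-severity score

    def forward(self, audio, text, face):
        embs = torch.stack(
            [self.audio_enc(audio), self.text_enc(text), self.face_enc(face)], dim=1
        )                                                     # (batch, 3, emb_dim)
        weights = torch.softmax(self.modality_logits, dim=0)  # one weight per modality
        fused = (weights.view(1, 3, 1) * embs).sum(dim=1)     # weighted sum over modalities
        return self.head(fused).squeeze(-1)

if __name__ == "__main__":
    model = LateFusionRegressor()
    # Dummy batch standing in for per-session interview features (dimensions are assumptions).
    audio = torch.randn(8, 40)   # hypothetical MFCC-based features
    text = torch.randn(8, 300)   # hypothetical transcript embeddings
    face = torch.randn(8, 20)    # hypothetical facial action unit intensities
    target = torch.rand(8)       # normalised severity labels in [0, 1]
    pred = model(audio, text, face)
    rmse = torch.sqrt(nn.functional.mse_loss(pred, target))
    print("RMSE on dummy data:", rmse.item())

Because the fusion weights are learned, a noisy modality (for example, a poor-quality audio channel) can be down-weighted while text and facial features carry more of the prediction, which is the behaviour the abstract attributes to the fusion step.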





Data availability

Not applicable.


Funding

This work was not funded.

Author information


Contributions

IJ, BT, and AS carried out the review work. BKR and IJ wrote the proposed work. All authors reviewed the manuscript.

Corresponding author

Correspondence to Bipin Kumar Rai.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest associated with this study.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Rai, B.K., Jain, I., Tiwari, B. et al. Multimodal mental state analysis. Health Serv Outcomes Res Method (2024). https://doi.org/10.1007/s10742-024-00329-2


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s10742-024-00329-2
