Abstract
Mental stress is a significant risk factor for several maladies and can negatively affect a person’s quality of life, including their work and personal relationships. Traditional methods of detecting mental stress through interviews and questionnaires may not capture individuals’ instantaneous emotional responses. In this study, experience sampling was used to analyze the participants’ immediate affective responses, which provides a more comprehensive and dynamic understanding of their experiences. The WorkStress3D dataset was compiled from information gathered from 20 participants across three distinct modalities. During an average of one week, 175 h of data containing physiological signals such as BVP, EDA, and body temperature, as well as facial expressions and auditory data, were collected from a single subject. We present a novel fusion model that uses double-early fusion approaches to combine data from multiple modalities. The model achieved an F1 score of 0.94 with a loss of 0.18, indicating that it can accurately identify and classify varying degrees of stress. Furthermore, we investigate transfer learning techniques as a way to improve the efficacy of our stress detection system. Transfer learning did not outperform the fusion model: it reached an accuracy of 0.93 and a loss of 0.17, illustrating the difficulty of adapting pre-trained models to the task of stress analysis. These results emphasize the significance of multi-modal fusion in stress detection and the importance of selecting the most suitable model architecture for the given task. The proposed fusion model shows potential for accurate and robust classification of stress. This research contributes to the field of stress analysis and to the development of effective models for stress detection.
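As an illustration of what fusing modalities at the feature level can look like, the following is a minimal sketch of generic early fusion, in which the feature vectors of each modality are standardized and then concatenated per sample before being passed to a classifier. It is not the authors' double-early fusion architecture, which the abstract does not specify. The function names (`zscore`, `standardize`, `early_fusion`) and the toy feature values are hypothetical and chosen only for this example.

```python
from statistics import mean, pstdev


def zscore(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def standardize(features):
    """Standardize every column of a samples-by-features matrix."""
    columns = [zscore(list(col)) for col in zip(*features)]
    return [list(row) for row in zip(*columns)]


def early_fusion(*modalities):
    """Concatenate the per-sample feature vectors of all modalities.

    Each modality is a list of rows, one row per sample. Samples are
    assumed to be time-aligned across modalities already.
    """
    n_samples = len(modalities[0])
    if any(len(m) != n_samples for m in modalities):
        raise ValueError("all modalities must have the same number of samples")
    scaled = [standardize(m) for m in modalities]
    return [sum((m[i] for m in scaled), []) for i in range(n_samples)]


# Hypothetical toy features for three aligned samples (illustrative values only).
physio = [[0.8, 3.1, 33.9], [1.2, 4.0, 34.2], [0.9, 2.7, 34.0]]  # BVP, EDA, temperature
face = [[0.1, 0.7], [0.6, 0.2], [0.3, 0.5]]                      # facial expression features
audio = [[120.0], [180.0], [140.0]]                              # one audio feature

fused = early_fusion(physio, face, audio)
print(len(fused), len(fused[0]))
```

Standardizing each modality before concatenation keeps features with large numeric ranges (for example, an audio feature in the hundreds) from dominating features with small ranges. A late-fusion design would instead train one model per modality and combine their predictions.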
Data availability
The WorkStress3D dataset generated during and/or analyzed during the current study is available in the Mendeley repository https://data.mendeley.com/datasets/t93xcwm75r/5.
Acknowledgements
This work was supported by the Scientific Research Projects Coordination Unit of Istanbul Kultur University with project number: IKU-BAP2012. Within the scope of the study, data were collected with the permission of the ethics committee of Istanbul Kultur University, with the decision dated 20.05.2020 and numbered 2020.29.
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Dogan, G., Akbulut, F.P. Multi-modal fusion learning through biosignal, audio, and visual content for detection of mental stress. Neural Comput & Applic 35, 24435–24454 (2023). https://doi.org/10.1007/s00521-023-09036-4
DOI: https://doi.org/10.1007/s00521-023-09036-4