Stress Level Detection in Continuous Speech Using CNNs and a Hybrid Attention Layer

  • Conference paper
International Conference on Innovative Computing and Communications (ICICC 2023)

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 731))

Abstract

This paper targets stress detection by analyzing audio signals of human speech. Deep learning is used to model stress levels, together with an analysis of the Mel spectrograms of the audio signals. A hybrid attention model is used to obtain the final classification. The dataset used is DAIC-WOZ, which contains continuous speech recordings of conversations between a patient and a virtual assistant controlled by a human counselor from another room. The best result obtained was 78.7% accuracy on the classification of stress levels.
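As an illustration of the Mel spectrogram analysis mentioned in the abstract, the following is a minimal NumPy sketch of a log-Mel spectrogram computation. It is not the authors' pipeline: the abstract gives no preprocessing details, so the sample rate, FFT size, hop length, number of Mel bands, the HTK-style Mel formula, and every function name here are assumptions chosen for illustration, and a synthetic tone stands in for DAIC-WOZ audio.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Convert Hz to mel using the HTK-style formula (an assumed choice)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    mel_edges = np.linspace(float(hz_to_mel(0.0)), float(hz_to_mel(sr / 2.0)), n_mels + 2)
    edges = mel_to_hz(mel_edges)
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        # Triangle: minimum of the two slopes, clipped at zero outside [lo, hi].
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return a log-mel spectrogram in dB, shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2         # (n_frames, n_fft // 2 + 1)
    mel_power = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    # Floor the power before the log to avoid log(0) on silent frames.
    return 10.0 * np.log10(np.maximum(mel_power, 1e-10))


# Synthetic stand-in for a speech clip: one second of a 1 kHz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 1000.0 * t)
spec = log_mel_spectrogram(tone, sr)
print(spec.shape)  # (40, 61)
```

A two-dimensional array of this kind (Mel bands by time frames) is the sort of image-like input a CNN can consume; how the paper segments the recordings and feeds them to the attention layer is not stated in the abstract.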

Author information

Corresponding author

Correspondence to R. Subramani.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Subramani, R., Suresh, K., Donald, A.C., Sivaselvan, K. (2024). Stress Level Detection in Continuous Speech Using CNNs and a Hybrid Attention Layer. In: Hassanien, A.E., Castillo, O., Anand, S., Jaiswal, A. (eds) International Conference on Innovative Computing and Communications. ICICC 2023. Lecture Notes in Networks and Systems, vol 731. Springer, Singapore. https://doi.org/10.1007/978-981-99-4071-4_29

  • DOI: https://doi.org/10.1007/978-981-99-4071-4_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4070-7

  • Online ISBN: 978-981-99-4071-4

  • eBook Packages: Engineering, Engineering (R0)
