Depression and Suicide Prediction Using Natural Language Processing and Machine Learning

  • Conference paper
Cyber Security, Privacy and Networking

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 370)

Abstract

Depression has long been one of the most prominent mental health concerns worldwide. In the worst case, a person suffering from depression may resort to drastic measures such as suicide. According to the World Health Organization, depression and anxiety affect one out of every five people worldwide, costing trillions of dollars each year. The COVID-19 pandemic has worsened the situation alarmingly, as more people suffer from depression. It has become more essential than ever to maintain mental health profiles and to anticipate unfortunate events. Given early detection and identification of its causes, depression can be prevented and treated at an early stage and at low cost. With advances in machine learning and deep learning models, it has become possible to identify such behaviour through social interactions such as posts, tweets, and comments. This paper aims to detect, from a user's social media tweets, behaviour that indicates whether the person is suffering from depression and suicidal tendencies. It proposes a classifier that combines a hybrid preprocessing technique based on Natural Language Processing (NLP) with machine learning techniques, and identifies such traits in a person's tweets with an accuracy of 75%.
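The abstract does not spell out the preprocessing steps or the features fed to the classifier, so the following is only a minimal illustrative sketch of what tweet preprocessing followed by term weighting can look like. The stopword list, the function names `preprocess` and `tfidf`, the example tweets, and the choice of TF-IDF weighting are all assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"a", "an", "the", "i", "to", "and", "of", "is", "it", "my", "so"}


def preprocess(tweet: str) -> list[str]:
    """Lowercase a tweet, strip URLs, @mentions and '#', then drop stopwords."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the symbol
    tokens = re.findall(r"[a-z']+", text)      # keep alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each token by term frequency times a smoothed inverse document frequency."""
    n = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append({
            tok: (c / total) * (math.log((1 + n) / (1 + df[tok])) + 1)
            for tok, c in counts.items()
        })
    return vectors


# Made-up example tweets, not taken from any dataset.
tweets = [
    "I feel so hopeless and tired of everything http://example.com",
    "@friend Great day at the beach! #sunny",
]
tokens = [preprocess(t) for t in tweets]
vectors = tfidf(tokens)
```

The resulting sparse vectors could then be passed to any standard classifier; which classifier the paper uses, and how its hybrid preprocessing differs from this sketch, is not stated in the abstract.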

Author information

Corresponding author

Correspondence to Harnain Kour.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kour, H., Gupta, M.K. (2022). Depression and Suicide Prediction Using Natural Language Processing and Machine Learning. In: Agrawal, D.P., Nedjah, N., Gupta, B.B., Martinez Perez, G. (eds) Cyber Security, Privacy and Networking. Lecture Notes in Networks and Systems, vol 370. Springer, Singapore. https://doi.org/10.1007/978-981-16-8664-1_11

  • DOI: https://doi.org/10.1007/978-981-16-8664-1_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-8663-4

  • Online ISBN: 978-981-16-8664-1

  • eBook Packages: Engineering, Engineering (R0)
