Abstract
Depression has long been one of the most prominent mental health concerns worldwide. In the worst case, depression may lead a person to drastic measures such as suicide. According to the World Health Organization, depression and anxiety affect one out of every five people worldwide, costing trillions of dollars each year. During the COVID-19 pandemic, the situation has worsened alarmingly as more people suffer from depression. It has become more essential than ever to maintain mental health profiles of people and to predict unfortunate events. Depression can be prevented and treated at a very early stage and at low cost, given early detection and identification of its causes. With advancements in machine learning and deep learning models, it has become possible to identify such behaviour through social interactions such as posts, tweets, and comments. This paper aims to detect user behaviour that indicates whether a person is suffering from depression and suicidal tendencies, based on the user's tweets. The work proposes a classifier that combines a hybrid preprocessing technique based on Natural Language Processing (NLP) with machine learning techniques, achieving an accuracy of 75% in identifying such traits from a person's tweets.
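The abstract does not spell out the preprocessing steps or the features fed to the classifier. As a rough illustration only, the sketch below shows one common way tweet text could be cleaned and turned into weighted term features before classification. The stopword list, the cleaning rules, the function names, and the choice of TF-IDF with a smoothed IDF are all assumptions made for this example, not the authors' confirmed method.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "i", "is", "to", "and", "of", "my", "so"}


def preprocess(tweet):
    """Lowercase a tweet, drop URLs and @mentions, keep hashtag words, remove stopwords."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the word behind a hashtag
    tokens = re.findall(r"[a-z']+", text)      # alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = relative term frequency * (ln((1 + N) / (1 + df)) + 1),
    a smoothed IDF chosen for this sketch.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Made-up example tweets, not taken from any dataset.
tweets = [
    "I feel SO alone today @friend https://t.co/x #depressed",
    "Feel great after the run!",
]
tokenized = [preprocess(t) for t in tweets]
features = tfidf(tokenized)
```

The resulting per-tweet weight dictionaries could then be passed to any standard classifier. Terms that appear in fewer tweets receive a higher weight than terms shared by many tweets, which is the usual motivation for TF-IDF over raw counts.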
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kour, H., Gupta, M.K. (2022). Depression and Suicide Prediction Using Natural Language Processing and Machine Learning. In: Agrawal, D.P., Nedjah, N., Gupta, B.B., Martinez Perez, G. (eds) Cyber Security, Privacy and Networking. Lecture Notes in Networks and Systems, vol 370. Springer, Singapore. https://doi.org/10.1007/978-981-16-8664-1_11
DOI: https://doi.org/10.1007/978-981-16-8664-1_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8663-4
Online ISBN: 978-981-16-8664-1
eBook Packages: Engineering, Engineering (R0)