Abstract
Suicide is a serious issue around the world and is a leading cause of death in the US. In the past 20 years, the suicide rate has risen by 35%. With the rapid development of information technology, more and more people have begun to use social media to share their inner feelings, which allows social media data to be widely used for research on suicide risk assessment. However, not all social media posts are suicide related. Previous research addressed this problem with a post-level attention mechanism, but such a mechanism may fail to find the relevant suicide-related posts. The problem becomes more serious with feature-based post embeddings, since each post is converted into a single vector that serves as the model input, so word-level information is lost during training. In this paper, we address this problem by introducing a novel word-level model that includes a post-selection layer. First, we utilize a suicide keyword dictionary to identify risky posts that may be missed by the post-level attention mechanism. We then convert the words in the risky posts into word embeddings and use self-attention to generate post embeddings for the risky posts. Finally, we pass the post embeddings to a multilayer perceptron to classify the suicide risk. We also demonstrate that the FScore used in previous studies can be reduced to a function of accuracy, and therefore does not reflect model performance on imbalanced datasets. For this reason, we additionally adopt the macro F1 score as an evaluation metric. Experimental results show that our model not only outperforms previous studies in FScore, but also achieves a nearly 4% improvement in macro F1 score compared to previous studies.
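As a rough illustration of two ideas in the abstract, the sketch below shows (a) a dictionary-based filter that keeps only posts containing a suicide-related keyword, and (b) why accuracy alone can look good on an imbalanced dataset while macro F1 does not. It is a minimal sketch under stated assumptions, not the authors' implementation: the keyword list, the function names, and the toy labels are all hypothetical, and the simple substring matching stands in for whatever matching rule the paper actually uses.

```python
# Hypothetical keyword list for illustration only; the paper's actual
# suicide keyword dictionary is not reproduced here.
SUICIDE_KEYWORDS = {"suicide", "kill myself", "end my life", "hopeless"}


def select_risky_posts(posts, keywords=SUICIDE_KEYWORDS):
    """Keep posts that contain at least one keyword (case-insensitive substring match)."""
    return [p for p in posts if any(k in p.lower() for k in keywords)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so rare classes count equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    posts = [
        "I feel hopeless and want to end my life",
        "Great pizza tonight",
        "Thinking about SUICIDE again",
    ]
    print(select_risky_posts(posts))

    # Toy imbalanced labels: 8 low-risk users (0) and 2 high-risk users (1).
    # A classifier that always predicts the majority class misses every
    # high-risk user.
    y_true = [0] * 8 + [1] * 2
    y_pred = [0] * 10
    print("accuracy:", accuracy(y_true, y_pred))
    print("macro F1:", macro_f1(y_true, y_pred))
```

In the toy example the always-majority classifier is correct for 8 of 10 users, yet its macro F1 is much lower because the F1 of the minority class is zero. This is the kind of gap that motivates reporting macro F1 alongside an accuracy-like score.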
Data availability
The datasets generated during the current study are available from the corresponding author on reasonable request.
Notes
We also experimented with the sum and the attention results of the two vectors, but empirically found that concatenation performed better.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Tsai, Y.S., Chen, A.L.P. Suicide risk assessment using word-level model with dictionary-based risky posts selection. Multimed Tools Appl 83, 21435–21454 (2024). https://doi.org/10.1007/s11042-023-16361-2
DOI: https://doi.org/10.1007/s11042-023-16361-2