Lightme: analysing language in internet support groups for mental health

Abstract

Background

Assisting moderators to triage harmful posts in Internet Support Groups is important for ensuring the safe use of these groups. Automated text classification methods that analyse the language of online forum posts are a promising solution.

Methods

Natural Language Processing and Machine Learning techniques were used to build a post triage classifier using a dataset from the Reachout.com mental health forum for young people.

Results

Compared with the state of the art, a solution based mainly on features from lexical resources achieved the best classification performance for crisis posts (52%), the most severe class. Six salient linguistic characteristics were found when analysing the crisis posts: (1) posts expressing hopelessness, (2) short posts expressing concise negative emotional responses, (3) long posts expressing variations of emotions, (4) posts expressing dissatisfaction with available health services, (5) posts utilising storytelling, and (6) posts expressing users seeking advice from peers during a crisis.

Conclusion

It is possible to build a competitive triage classifier using features derived only from the textual content of the post. Further research is needed to translate our quantitative and qualitative findings into features, as this may improve overall performance.

Notes

  1. In machine learning, a hyper-parameter is a parameter whose value is set before the learning process begins, whereas the values of the other parameters are derived through learning.

  2. Grid search is a way of choosing the best hyper-parameters, and consists of exhaustively searching through a subset of the hyper-parameter space of a learning algorithm.

  3. The C hyper-parameter is the regularization value, which controls how much importance is given to misclassification. The larger the value, the fewer misclassified training examples are tolerated.

  4. A loss function, or cost function, measures how well a prediction model predicts the expected outcome.

  5. TF-IDF (term frequency-inverse document frequency) is the number of times a word appears in a document, weighted down by how many documents in the collection contain that word, so that words common to most documents count for less.
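
The notes above can be tied together in a small sketch. This is not the authors' implementation: it is a standard-library-only illustration, and the toy posts, the function names (`fit_idf`, `tf_idf`, `train_linear_svm`, `grid_search`), the learning rate, and the grid values are all assumptions made for the example. It computes TF-IDF features, trains a linear classifier by minimising a hinge loss with a C-weighted penalty, and picks C and the number of epochs by grid search.

```python
import itertools
import math
from collections import Counter


def fit_idf(docs):
    """Inverse document frequency, log(N / df), for every term seen in docs."""
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(len(docs) / n) for term, n in df.items()}


def tf_idf(doc, idf):
    """Term count times IDF; terms unseen when fitting the IDF are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}


def score(w, feats):
    """Raw linear model output for one sparse feature dict."""
    return sum(w.get(t, 0.0) * v for t, v in feats.items())


def hinge_loss(label, raw_score):
    """Loss function: zero once the example is beyond the margin."""
    return max(0.0, 1.0 - label * raw_score)


def train_linear_svm(examples, C, epochs=50, lr=0.1):
    """Subgradient descent on 0.5 * ||w||^2 + C * sum(hinge losses)."""
    w = {}
    for _ in range(epochs):
        for feats, label in examples:
            violated = hinge_loss(label, score(w, feats)) > 0.0
            # The regularisation term shrinks every weight slightly ...
            for term in w:
                w[term] *= 1.0 - lr / len(examples)
            # ... and a margin violation pulls w towards the example, scaled by C.
            if violated:
                for term, value in feats.items():
                    w[term] = w.get(term, 0.0) + lr * C * label * value
    return w


def grid_search(grid, evaluate):
    """Evaluate every combination of hyper-parameter values; return the best."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((evaluate(**params), params))
    return max(results, key=lambda item: item[0])


# Hypothetical toy posts (not from the ReachOut data); +1 = crisis, -1 = other.
posts = [
    ("i feel hopeless and alone".split(), 1),
    ("nothing helps i feel hopeless".split(), 1),
    ("i feel great today".split(), -1),
    ("thanks everyone i feel better".split(), -1),
]
held_out = [
    ("so hopeless tonight".split(), 1),
    ("feeling better thanks".split(), -1),
]

idf = fit_idf([tokens for tokens, _ in posts])
train = [(tf_idf(tokens, idf), label) for tokens, label in posts]
dev = [(tf_idf(tokens, idf), label) for tokens, label in held_out]


def dev_accuracy(C, epochs):
    """Hyper-parameters are fixed here, before any weights are learned."""
    w = train_linear_svm(train, C=C, epochs=epochs)
    hits = sum((score(w, feats) > 0) == (label > 0) for feats, label in dev)
    return hits / len(dev)


best_score, best_params = grid_search(
    {"C": [0.01, 0.1, 1.0, 10.0], "epochs": [10, 50]}, dev_accuracy
)
print(best_params, best_score)
```

In this toy collection the words "i" and "feel" occur in every training post, so their IDF is zero and they contribute nothing to the features, which is the down-weighting that note 5 describes. A real system would select hyper-parameters with cross-validation on far more data than two held-out posts.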

Author information

Corresponding author

Correspondence to Gabriela Ferraro.

Ethics declarations

Conflict of interest

No competing financial interests exist.

About this article

Cite this article

Ferraro, G., Loo Gee, B., Ji, S. et al. Lightme: analysing language in internet support groups for mental health. Health Inf Sci Syst 8, 34 (2020). https://doi.org/10.1007/s13755-020-00115-7
