Machine Learning for Identifying Emotional Expression in Text: Improving the Accuracy of Established Methods

  • Erin O’Carroll Bantum
  • Noémie Elhadad
  • Jason E. Owen
  • Shaodian Zhang
  • Mitch Golant
  • Joanne Buzaglo
  • Joanne Stephen
  • Janine Giese-Davis


Expression of emotion has been linked to numerous critical and beneficial aspects of human functioning. Accurately capturing emotional expression in text is increasingly relevant as people spend more time online. The Linguistic Inquiry and Word Count (LIWC) is a commonly used program for identifying many constructs, including emotional expression. In an earlier study by Bantum and Owen (Psychol. Assess. 21:79–88, 2009), LIWC was shown to have good sensitivity yet poor positive predictive value. The goal of the current study was to create an automated machine learning technique that mimics manual coding. The sample included online support groups, cancer discussion boards, and transcripts from an expressive writing study, which together yielded 39,367 sentence-level coding decisions. Across the entire sample, the machine learning approach outperformed LIWC in all categories except sensitivity for negative emotion (LIWC sensitivity = 0.85; machine learning sensitivity = 0.41), although LIWC does not take into consideration prosocial emotions such as affection, interest, and validation. When the prosocial emotions were removed, LIWC performed significantly better than the machine learning approach (p < .0001). The sample overrepresented examples of emotion that fit into the overarching category of positive emotion. Further work is needed to create more effective machine learning features for codes that are thought to be emotionally important but were not well represented in the sample (e.g., frustration, contempt, and belligerence), and machine learning could be a fruitful method for continued exploration.
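For readers less familiar with the two metrics compared above, the following is a minimal sketch of how sensitivity and positive predictive value could be computed from sentence-level decisions, treating manual coding as the reference standard. The function name and the toy labels are hypothetical and purely illustrative; they are not code or data from the study.

```python
def sensitivity_and_ppv(manual, automated):
    """Return (sensitivity, PPV) for binary sentence-level codes.

    manual    -- reference codes from human coders (1 = emotion present)
    automated -- codes from an automated method (1 = emotion flagged)
    """
    if len(manual) != len(automated):
        raise ValueError("both label lists must have the same length")

    pairs = list(zip(manual, automated))
    true_pos = sum(1 for m, a in pairs if m == 1 and a == 1)
    false_neg = sum(1 for m, a in pairs if m == 1 and a == 0)
    false_pos = sum(1 for m, a in pairs if m == 0 and a == 1)

    # Sensitivity: share of manually coded emotion sentences that were flagged.
    coded = true_pos + false_neg
    sensitivity = true_pos / coded if coded else float("nan")

    # PPV: share of flagged sentences that the manual coders also coded.
    flagged = true_pos + false_pos
    ppv = true_pos / flagged if flagged else float("nan")
    return sensitivity, ppv


# Toy, made-up labels for eight sentences (NOT study data).
manual_codes = [1, 1, 1, 0, 0, 0, 0, 1]
automated_codes = [1, 1, 1, 1, 1, 0, 0, 0]
sens, ppv = sensitivity_and_ppv(manual_codes, automated_codes)
```

A method that flags many sentences tends toward high sensitivity but lower positive predictive value, which is the pattern the earlier study reported for LIWC.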


Keywords: Emotional expression; Machine learning; Cancer; Linguistic analysis



This research was supported by a grant from the National Cancer Institute (NCI): 1R21CA143642-01. We would like to acknowledge the participants in the original studies as well as the following research assistants for their hard work on the manual emotion coding for the current study: Daria Ebneter, Cody Hua, Jacob Hwang, Curtis Lee, Marissa McMahon, and Josh Pelekai. We would also like to thank Dr. Ian Pagano for his help with the analysis.


  1. Alm, C. O., Roth, D., & Sproat, R. (2005). Emotions from text: machine learning for text-based emotion prediction. Proceedings of the Conference on Human Language Technology and Empirical Methods for Natural Language Processing, pp. 579–586.
  2. Bantum, E. O., & Owen, J. E. (2009). Evaluating the validity of computerized content analysis programs for identification of emotional expression in cancer narratives. Psychological Assessment, 21, 79–88.
  3. Brody, S., & Elhadad, N. (2010a). Detecting salient aspects in online reviews of health providers. AMIA Annual Symposium, pp. 202–206.
  4. Brody, S., & Elhadad, N. (2010b). An unsupervised aspect-sentiment model for online reviews. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 804–812). Association for Computational Linguistics.
  5. Coviello, L., Sohn, Y., Kramer, A. D., Marlow, C., Franceschetti, M., Christakis, N. A., & Fowler, J. H. (2014). Detecting emotional contagion in massive social networks. PLoS ONE, 9(3), e90315.
  6. Ekman, P. (1999). Basic emotions. In T. Dalgleish & T. Power (Eds.), The handbook of cognition and emotion (pp. 45–60). Chichester: John Wiley.
  7. Elhadad, N., Zhang, S., Driscoll, P., & Brody, S. (2014). Characterizing the sublanguage of online breast cancer forums medication, symptoms, and emoticons. AMIA Fall Symposium, pp. 516–525.
  8. Epley, N., & Kruger, J. (2005). When what you type isn’t what they read: the perseverance of stereotypes and expectancies over email. Journal of Experimental Social Psychology, 41, 414–422.
  9. Giese-Davis, J., Piemme, K. A., Dillon, C., & Twirbutt, S. (2005). Macro-variables in affective expression in women with breast cancer participating in support groups. In J. A. Harrigan, R. Rosenthal, & K. R. Scherer (Eds.), Nonverbal behavior in the affective sciences: a handbook of research methods (pp. 399–445). Oxford: Oxford University Press.
  10. Gottman, J. M. (1994). What predicts divorce? The relationship between marital processes and marital outcomes. Hillsdale, NJ: Lawrence Erlbaum Associates.
  11. Honkalampi, K., Hintikka, J., Tanskanen, A., Lehtonen, J., & Viinamaki, H. (2000). Depression is strongly associated with alexithymia in the general population. Journal of Psychosomatic Research, 48, 99–104.
  12. Jha, M., & Elhadad, N. (2010). Cancer stage prediction based on patient online discourse. In Proceedings of the 2010 Workshop on Biomedical Natural Language Processing (pp. 64–71). Association for Computational Linguistics.
  13. Kemeny, M. E., Foltz, C., Cavanagh, J. F., Cullen, M., Giese-Davis, J., Jennings, P., Rosenberg, E. L., Gillath, O., Shaver, P. R., Wallace, B. A., & Ekman, P. (2012). Contemplative/emotion training reduces negative emotional behavior and promotes prosocial responses. Emotion, 12(2), 338–350.
  14. Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111, 8788–8790.
  15. Lewallen, A. C., Owen, J. E., Bantum, E. O., & Stanton, A. L. (2014). How language affects peer responsiveness in an online cancer support group: implications for treatment design and facilitation. Psycho-Oncology, 23, 766–772.
  16. Liess, A., Simon, W., Yutsis, M., Owen, J. E., Piemme, K. A., Golant, M., & Giese-Davis, J. (2008). Detecting emotional expression in face-to-face and online breast cancer support groups. Journal of Consulting and Clinical Psychology, 76, 517–523.
  17. Lumley, M. A., Ovies, T., Stettner, L., Wehmer, F., & Lakey, B. (1996). Alexithymia, social support, and health problems. Journal of Psychosomatic Research, 41, 519–530.
  18. Madhyastha, T. M., Hamaker, E. L., & Gottman, J. M. (2011). Investigating spousal influence using moment-to-moment affect data from marital conflict. Journal of Family Psychology, 22(2), 292–300.
  19. Mattila, A. K., Saarni, S. I., Salminen, J. K., Huhtala, H., Sintonen, H., & Joukamaa, M. (2009). Alexithymia and health-related quality of life in the general population. Psychosomatics, 50, 59–68.
  20. Neviarouskaya, A., Prendinger, H., & Ishizuka, M. (2010). Recognition of fine-grained emotions from text: an approach based on the compositionality principle. In T. Nishida et al. (Eds.), Modeling machine emotions for realizing intelligence (pp. 179–207). Berlin: Springer.
  21. Nygaard, L. C., & Lunders, E. R. (2002). Resolution of lexical ambiguity by emotional tone of voice. Memory & Cognition, 30, 583–593.
  22. Owen, J. E., Hanson, E., Preddy, D. A., & Bantum, E. O. (2011). Linguistically-tailored video feedback increases total and positive emotional expression in a structured writing task. Computers in Human Behavior, 27(2), 874–882.
  23. Owen, J. E., Bantum, E. O., Criswell, K., Bazzo, J., Gorlick, A., & Stanton, A. (2013). Representativeness of two sampling procedures for an internet intervention targeting cancer related distress: a comparison of convenience and registry samples. Journal of Behavioral Medicine, [Epub ahead of print].
  24. Owen, J. E., Bantum, E. O., Gorlick, A., & Stanton, A. L. (2014). Engagement with a social-networking intervention for cancer-related distress. Annals of Behavioral Medicine, September, 11, 1–11.
  25. Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.
  26. Pennebaker, J. W. (1997). Writing about emotional experiences as a therapeutic process. Psychological Science, 8, 162–166.
  27. Stanton, A. L., Danoff-Burg, S., Cameron, C. L., Bishop, M., Collins, C. A., Kirk, S. B., Sworowski, L. A., & Twillman, R. (2000). Emotionally expressive coping predicts psychological and physical adjustment to breast cancer. Journal of Consulting and Clinical Psychology, 68, 875–882.
  28. Tamagawa, R., Li, Y., Gravity, T., Piemme, K., DiMiceli, S., Collie, K., & Giese-Davis, J. (2015). Deconstructing therapeutic mechanisms in cancer support groups: do we express more emotion when we tell stories or talk directly to each other? Journal of Behavioral Medicine, 38(1), 171–182.
  29. Tausczik, Y., Faasse, K., Pennebaker, J. W., & Petrie, K. J. (2012). Public anxiety and information-seeking following the H1N1 outbreak: Blogs, newspaper articles, and Wikipedia visits. Health Communication, 27, 179–185.
  30. Van Kleef, G. A., van den Berg, H., & Heerdink, M. W. (2015). The persuasive power of emotions: effects of emotional expressions on attitude formation and change. Journal of Applied Psychology, 100, 1124–1142.
  31. The Stanford Natural Language Processing Group. (2006). Stanford Named Entity Recognizer (NER). Retrieved from:

Copyright information

© Springer International Publishing 2017

Authors and Affiliations

  • Erin O’Carroll Bantum (1)
  • Noémie Elhadad (2)
  • Jason E. Owen (3)
  • Shaodian Zhang (2)
  • Mitch Golant (4)
  • Joanne Buzaglo (4)
  • Joanne Stephen (5)
  • Janine Giese-Davis (6)

  1. Cancer Prevention & Control Program, University of Hawaii Cancer Center, Honolulu, USA
  2. Biomedical Informatics, Columbia University, New York, USA
  3. Dissemination & Training Division, VA Palo Alto Health Care System, Livermore, USA
  4. Cancer Support Community, Washington, USA
  5. Alberta Health Services, Calgary, Canada
  6. Cumming School of Medicine, Department of Oncology, University of Calgary, Calgary, Canada
