What Goes Around Comes Around: Learning Sentiments in Online Medical Forums


Online health-related discussions have been shown to significantly influence the attitudes and behavioral intentions of their participants. Although empirical evidence strongly supports the importance of emotions in health-related online discussions, few studies have examined the relationship between subjective language and online discussions of personal health. In this work, we study sentiments expressed on online medical forums. Each post is classified into one of five categories: three that we identify as sentimental (encouragement, gratitude, confusion) and two as neutral (facts, endorsement). A total of 1438 messages were annotated manually by two annotators with strong inter-annotator agreement (Fleiss kappa = 0.737 when the posts were annotated in the context of the discussion and Fleiss kappa = 0.763 when the posts were annotated as individual entities). Using a multi-class machine learning classification approach, we assess the feasibility of automated recognition of the five sentiment categories. In addition to the predominant sentiments expressed in individual posts, we analyze transitions between sentiments in online discussions.
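To make the idea of sentiment transitions concrete, the following minimal sketch counts how often a post with one label is directly followed by a post with another label, and normalizes those counts into conditional frequencies. It is an illustration only: the function names, the toy threads, and the choice to count adjacent-post (first-order) transitions are assumptions, not the procedure reported in the article.

```python
from collections import Counter


def count_transitions(discussions):
    """Count how often a post labelled `a` is directly followed by one labelled `b`.

    `discussions` is a list of threads; each thread is a list of category
    labels in posting order (an assumed representation).
    """
    counts = Counter()
    for labels in discussions:
        # Pair every post with the post that immediately follows it.
        for current, following in zip(labels, labels[1:]):
            counts[(current, following)] += 1
    return counts


def transition_probabilities(counts):
    """Normalise transition counts into P(next label | current label)."""
    totals = Counter()
    for (current, _), n in counts.items():
        totals[current] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical toy threads using the five categories named in the abstract.
threads = [
    ["confusion", "facts", "encouragement", "gratitude"],
    ["facts", "facts", "endorsement"],
    ["confusion", "encouragement", "gratitude"],
]

counts = count_transitions(threads)
probs = transition_probabilities(counts)

for (current, following), p in sorted(probs.items()):
    print(f"{current} -> {following}: {p:.2f}")
```

In this toy data, every "encouragement" post is followed by "gratitude", which is the kind of pattern a transition analysis is meant to surface; on real annotated threads the same table would be computed over the full corpus.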




  8. We harvested the data in July 2012.

  9. All examples preserve original spelling and grammar.




Author information



Corresponding author

Correspondence to Victoria Bobicev.


Cite this article

Bobicev, V., Sokolova, M. & Oakes, M. What Goes Around Comes Around: Learning Sentiments in Online Medical Forums. Cogn Comput 7, 609–621 (2015). https://doi.org/10.1007/s12559-015-9327-y



Keywords

  • Natural language processing
  • Sentiment analysis
  • Machine learning
  • Discourse analysis
  • Sentiment transitions