Cognitive Computation, Volume 7, Issue 5, pp 609–621

What Goes Around Comes Around: Learning Sentiments in Online Medical Forums

  • Victoria Bobicev
  • Marina Sokolova
  • Michael Oakes


It has been shown that online health-related discussions significantly influence the attitudes and behavioral intentions of the discussion participants. Although empirical evidence strongly supports the importance of emotions in health-related online discussions, there are few studies of the relationship between subjective language and online discussions of personal health. In this work, we study sentiments expressed on online medical forums. Individual posts are classified into one of five categories. We identified three categories as sentimental (encouragement, gratitude, confusion) and two categories as neutral (facts, endorsement). A total of 1438 messages were annotated manually by two annotators with strong inter-annotator agreement (Fleiss kappa = 0.737 when the posts were annotated in the context of the discussion and Fleiss kappa = 0.763 when the posts were annotated as individual entities). Using a machine learning multi-class classification approach, we assess the feasibility of automated recognition of the five sentiment categories. As well as considering the predominant sentiments expressed in individual posts, we analyze transitions between sentiments in online discussions.
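The agreement figures above are Fleiss kappa values. As a rough illustration of how such a score is obtained, here is a minimal sketch of the standard Fleiss kappa formula in Python. The function name `fleiss_kappa`, the input layout (one list of labels per post, one label per annotator) and the toy labels are assumptions made for this example; they are not the authors' code or data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for a list of items, each rated by the same number of annotators.

    `ratings` is a list of label lists, e.g. [["facts", "facts"], ...].
    This input layout is an assumption of the sketch, not the paper's format.
    """
    if not ratings:
        raise ValueError("at least one rated item is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(labels) != n_raters for labels in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    observed = 0.0
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        observed += agreeing / (n_raters * (n_raters - 1))
    observed /= len(ratings)

    # Chance agreement from the overall category proportions.
    total = len(ratings) * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Only one category was ever used; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): four posts, two annotators each.
toy = [
    ["facts", "facts"],
    ["gratitude", "gratitude"],
    ["confusion", "facts"],
    ["facts", "facts"],
]
print(round(fleiss_kappa(toy), 3))  # 0.529
```

On the toy data, three of four posts are labeled identically, but because "facts" dominates, chance agreement is high and kappa comes out well below the raw agreement rate of 0.75.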


Keywords: Natural language processing; Sentiment analysis; Machine learning; Discourse analysis; Sentiment transitions



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Victoria Bobicev (1)
  • Marina Sokolova (2)
  • Michael Oakes (3)

  1. Technical University of Moldova, Chişinău, Republic of Moldova
  2. Institute for Big Data Analytics, University of Ottawa, Ottawa, Canada
  3. Research Group in Computational Linguistics, University of Wolverhampton, Wolverhampton, UK
