MedFact: Towards Improving Veracity of Medical Information in Social Media Using Applied Machine Learning

  • Hamman Samuel
  • Osmar Zaïane
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10832)


Since the advent of Web 2.0 and social media, anyone with an Internet connection can create content online, including uncertain or fake information, a problem that has attracted significant attention recently. In this study, we address the challenge of uncertain online health information by automating systematic approaches borrowed from evidence-based medicine. Our proposed algorithm, MedFact, recommends trusted medical information within health-related social media discussions and empowers online users to make informed decisions about the credibility of online health information. MedFact automatically extracts relevant keywords from online discussions and queries trusted medical literature, with the aim of embedding related factual information into the discussion. Our retrieval model takes into account layperson terminology and the hierarchy of evidence. MedFact is therefore a departure from the consensus-based approaches to determining credibility that are popular in social media, such as the “wisdom of the crowd”, binary “Like” votes, and ratings. In place of these subjective metrics, MedFact introduces objective ones. We also present preliminary work towards a granular veracity score that uses supervised machine learning to compare statements in uncertain social media text with statements in trusted medical text. We evaluate our proposed algorithm on various data sets from existing health social media involving both patient and medic discussions, with promising results and suggestions for ongoing improvements and future research.
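
The retrieval steps named in the abstract (keyword extraction, mapping of layperson terms, and ranking by hierarchy of evidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary table, the evidence levels, the stopword list, and the function names below are all hypothetical, and a plain word-frequency count stands in for whatever keyword extraction method MedFact actually uses.

```python
import re
from collections import Counter

# Hypothetical layperson-to-medical vocabulary (illustrative entries only).
LAY_TO_MEDICAL = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

# Hypothetical, simplified evidence levels; a higher number means stronger evidence.
EVIDENCE_RANK = {
    "systematic review": 3,
    "randomized controlled trial": 2,
    "case report": 1,
}

# Small illustrative stopword list, not a complete one.
STOPWORDS = {"the", "had", "last", "year", "does", "help", "after", "and", "for"}


def normalize(text):
    """Lowercase the text and replace known layperson phrases with medical terms."""
    text = text.lower()
    for lay, medical in LAY_TO_MEDICAL.items():
        text = text.replace(lay, medical)
    return text


def extract_keywords(text, top_k=3):
    """Return the top_k most frequent non-stopword tokens (a stand-in extractor)."""
    tokens = re.findall(r"[a-z]+", normalize(text))
    counts = Counter(t for t in tokens if len(t) > 2 and t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def rank_articles(keywords, articles):
    """Keep articles whose title shares a keyword; sort by overlap, then evidence level."""
    scored = []
    for article in articles:
        title_tokens = set(re.findall(r"[a-z]+", article["title"].lower()))
        overlap = len(title_tokens & set(keywords))
        if overlap:
            level = EVIDENCE_RANK.get(article["type"], 0)
            scored.append((overlap, level, article["title"]))
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [title for _, _, title in scored]


# Made-up post and article records, for illustration only.
post = "I had a heart attack last year. Does aspirin help after a heart attack?"
articles = [
    {"title": "Aspirin after myocardial infarction: a case report", "type": "case report"},
    {"title": "Aspirin for secondary prevention of myocardial infarction", "type": "systematic review"},
    {"title": "Exercise and sleep quality", "type": "randomized controlled trial"},
]
keywords = extract_keywords(post)
ranked = rank_articles(keywords, articles)
```

In this toy example both aspirin articles match the post equally well on keywords, so the evidence level breaks the tie and the systematic review is listed before the case report; the unrelated article is dropped.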



We thank the Alberta Machine Intelligence Institute (Amii) for funding this research.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computing Science, University of Alberta, Edmonton, Canada
