Detecting False Information in Medical and Healthcare Domains: A Text Mining Approach

  • Jiexun Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11924)


In recent years, a large volume of false medical and healthcare information has emerged and spread over the Internet, posing a serious risk to public health and safety. This study investigates the problem by analyzing data collected from two fact-checking websites: 416 medical claims from one and 1,692 healthcare-related statements from the other. Topic analysis reveals the frequent words and common topics in these online claims. Furthermore, using text-mining and machine-learning techniques, the study builds prediction models for detecting false information, and these models show promising performance. Several textual and source features are identified as good indicators of whether information in the medical and healthcare domains is true or false.
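The abstract does not say which features or classifiers the study uses, so the following is only a minimal sketch of the general idea of text-based false-claim detection, not the author's method. It assumes a bag-of-words representation and a multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. The function names and the six toy claims are hypothetical, invented purely for illustration, and are not drawn from the paper's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a claim and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        counts = word_counts[label]
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(counts.values()) + len(vocabulary)
        score = math.log(n / total)
        for token in tokenize(text):
            if token in vocabulary:  # words never seen in training are skipped
                score += math.log((counts[token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy claims, invented for illustration only.
TOY_CLAIMS = [
    ("miracle herb cures cancer overnight", "false"),
    ("secret pill melts fat without exercise", "false"),
    ("doctors hide miracle cure from patients", "false"),
    ("vaccines reduce the risk of measles", "true"),
    ("regular exercise lowers blood pressure", "true"),
    ("hand washing reduces the spread of infection", "true"),
]

model = train_naive_bayes(TOY_CLAIMS)
print(predict(model, "miracle pill cures everything"))
print(predict(model, "exercise reduces blood pressure"))
```

A classifier of this kind uses textual features only. The source features mentioned in the abstract would need additional inputs, such as who made or published the claim, which this sketch does not model.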


Keywords: False information; Medical; Healthcare; Text mining



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Western Washington University, Bellingham, USA
