Mining Medication-Effect Relations from Twitter Data Using Pre-trained Transformer Language Model

  • Conference paper
Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021)

Abstract

Pharmacovigilance aims to promote the safe use of pharmaceutical products by continuously assessing the safety of marketed medications. An active area of this effort is the use of social media such as Twitter as an alternative data source for gathering patient-reported experiences with medication use. Published work has focused on identifying expressions of adverse effects in social media data while giving little attention to the relationship between a mentioned medication and any mentioned effect expressions. In this study, we investigated the discovery of medication-effect relations from Twitter text using BERT, a transformer-based language model, with fine-tuning. Our results on a corpus of 9,516 annotated tweets show that the overall performance of our method is superior to that of the four baseline approaches studied. The outcome of this work may help automate and accelerate the discovery of potentially unreported medication effects from patient-reported experiences documented in the large volume of social media data.
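As a rough illustration of the task framing, one common way to pose relation discovery as a classification problem is to mark the two candidate mentions in the text before handing it to a fine-tuned classifier. The sketch below shows only that preprocessing step in plain Python. The marker tokens `[MED]` and `[EFF]`, the helper names, and the example tweet are all hypothetical and are not taken from the paper, which may prepare its inputs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def find_span(text: str, mention: str) -> Span:
    """Locate the first occurrence of a mention (hypothetical helper)."""
    start = text.index(mention)  # raises ValueError if the mention is absent
    return Span(start, start + len(mention))


def mark_entities(text: str, medication: Span, effect: Span) -> str:
    """Wrap the medication and effect mentions in illustrative marker tokens."""
    # Order the two spans by position so the text can be rebuilt left to right.
    (a, tag_a), (b, tag_b) = sorted(
        [(medication, "MED"), (effect, "EFF")], key=lambda pair: pair[0].start
    )
    if a.end > b.start:
        raise ValueError("medication and effect spans must not overlap")
    return (
        text[:a.start]
        + f"[{tag_a}] " + text[a.start:a.end] + f" [/{tag_a}]"
        + text[a.end:b.start]
        + f"[{tag_b}] " + text[b.start:b.end] + f" [/{tag_b}]"
        + text[b.end:]
    )


# Made-up example tweet, for illustration only.
tweet = "Humira gave me a terrible headache today"
marked = mark_entities(tweet, find_span(tweet, "Humira"), find_span(tweet, "headache"))
print(marked)
```

In a setup like this, the marked string would then be tokenized and passed to a sequence classifier that predicts whether a relation holds between the two mentions. That step is omitted here because it depends on a specific model and library.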

Notes

  1. https://www.nlm.nih.gov/research/umls/META3_current_relations.html

  2. https://github.com/medeffects/tweet_corpora

Acknowledgement

The authors wish to thank the anonymous reviewers for their critiques and constructive comments, which improved this manuscript.

Author information

Corresponding author

Correspondence to Keyuan Jiang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Jiang, K., Zhang, D., Bernard, G.R. (2021). Mining Medication-Effect Relations from Twitter Data Using Pre-trained Transformer Language Model. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_35

  • DOI: https://doi.org/10.1007/978-3-030-93733-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93732-4

  • Online ISBN: 978-3-030-93733-1

  • eBook Packages: Computer Science, Computer Science (R0)
