Abstract
Pharmacovigilance aims to promote the safe use of pharmaceutical products by continuously assessing the safety of marketed medications. An active area of this endeavor is the use of social media such as Twitter as an alternative data source for gathering patient-reported experiences with medication use. Published work has focused on identifying expressions of adverse effects in social media data while giving little attention to the relationship between a mentioned medication and any mentioned effect expressions. In this study, we investigated the discovery of medication-effect relations from Twitter text using BERT, a transformer-based language model, with fine-tuning. Our results on a corpus of 9,516 annotated tweets show that the overall performance of our method is superior to that of the 4 baseline approaches studied. The outcome of this work may help automate and accelerate the discovery of potentially unreported medication effects from patient-reported experiences documented in the vast amount of social media data.
Acknowledgement
The authors wish to thank the anonymous reviewers for their critiques and constructive comments, which improved this manuscript.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, K., Zhang, D., Bernard, G.R. (2021). Mining Medication-Effect Relations from Twitter Data Using Pre-trained Transformer Language Model. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1525. Springer, Cham. https://doi.org/10.1007/978-3-030-93733-1_35
DOI: https://doi.org/10.1007/978-3-030-93733-1_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93732-4
Online ISBN: 978-3-030-93733-1
eBook Packages: Computer Science, Computer Science (R0)