In the last decade, ‘big data’ has become a buzzword in several industrial sectors, including but not limited to telephony, finance and healthcare. Despite its popularity, it is not always clear exactly what the term refers to. In healthcare, it primarily refers to the vast and growing volumes of computerized medical information available in the form of electronic health records, administrative or health claims data, disease and drug monitoring registries and so on. This kind of data is generally collected routinely during administrative processes and clinical practice by different healthcare professionals, from doctors recording their patients’ medical history, drug prescriptions or medical claims to pharmacists registering dispensed prescriptions. For a long time, this data accumulated without its value being fully recognized and leveraged. Today, big data has an important place in healthcare, including in pharmacovigilance. Its expanding role in pharmacovigilance includes the detection, substantiation and validation of drug or vaccine safety signals, and new sources of information such as social media are increasingly being considered. The aim of the present paper is to discuss the uses of big data for post-marketing drug safety assessment.
No sources of funding were used to assist in the preparation of this study.
Conflict of interest
Janet Sultana has no conflicts of interest that are directly related to the contents of this study. Andrew Bate has no conflicts of interest that are directly related to the contents of this study. He is a full-time employee of Pfizer and holds stock and stock options with Pfizer. Gianluca Trifirò has no conflicts of interest that are directly related to the contents of this study. He is the scientific coordinator of a Master’s degree course which has received unconditional funding from Celgene, Amgen, ABC International Pharma, Shire Pharmaceuticals, Mediolanum Pharmaceuticals, Hospira, Allergan, MSD, AstraZeneca, Roche, Alfa Wassermann, Otsuka, Teva Pharmaceuticals, Bristol-Myers Squibb, and Daiichi Pharmaceuticals.
Cite this article
Trifirò, G., Sultana, J. & Bate, A. From Big Data to Smart Data for Pharmacovigilance: The Role of Healthcare Databases and Other Emerging Sources. Drug Saf 41, 143–149 (2018). https://doi.org/10.1007/s40264-017-0592-4