Study of serious adverse drug reactions using FDA-approved drug labeling and MedDRA
Adverse Drug Reactions (ADRs) are of great public health concern. FDA-approved drug labeling summarizes the ADRs of a drug product mainly in three sections, i.e., Boxed Warning (BW), Warnings and Precautions (WP), and Adverse Reactions (AR), where the severity of ADRs is intended to decrease in the order of BW > WP > AR. Several reported studies have extracted ADRs from labeling documents, but most, if not all, did not discriminate the severity of the ADRs by labeling section. Such a practice could overstate or understate the impact of certain ADRs on public health. In this study, we applied the Medical Dictionary for Regulatory Activities (MedDRA) to drug labeling and systematically analyzed and compared the ADRs from the three labeling sections, with a specific emphasis on serious ADRs presented in BW, which are of greatest drug safety concern.
This study investigated New Drug Application (NDA) labeling documents for 1164 single-ingredient drugs using Oracle Text search to extract MedDRA terms. We found that only a small portion of MedDRA Preferred Terms (PTs), 3819 out of 21,920 or 17.42%, were observed in the whole set of documents. In detail, 466/3819 (12.0%) PTs were in BW, 2023/3819 (53.0%) were in WP, and 2961/3819 (77.5%) were in AR sections. We also found that the 20 most frequently occurring BW PTs overlapped more with WP sections than with AR sections. Within the MedDRA System Organ Class levels, serious ADRs (sADRs) from BW were prevalent in Nervous system disorders and Vascular disorders. A Hierarchical Cluster Analysis (HCA) revealed that drugs within the same therapeutic category shared similar ADR patterns in BW (e.g., the nervous system drug class is highly associated with drug abuse terms such as dependence, substance abuse, and respiratory depression).
This study demonstrated that combining MedDRA standard terminologies with data mining techniques facilitated computer-aided ADR analysis of drug labeling. We also highlighted the importance of labeling sections that differ in seriousness and application in drug safety. Using sADRs primarily related to BW sections, we illustrated a prototype approach for computer-aided ADR monitoring and studies which can be applied to other public health documents.
Keywords: Adverse drug reactions; Data mining; MedDRA; Drug labeling; Boxed Warning; Structured product labeling; Standard terminology; Drug safety
Abbreviations
ADR: Adverse drug reaction
HLGT: High level group term
HLT: High level term
LLT: Low level term
MedDRA: Medical dictionary for regulatory activities
NDA: New drug application
NLP: Natural language processing
sADR: Serious adverse drug reaction
SOC: System organ class
SPL: Structured product labeling
SQL: Structured query language
UNII: Unique ingredient identifier
WP: Warnings and precautions
Adverse Drug Reactions (ADRs) are harmful events related to the use of a drug product. A serious Adverse Drug Reaction (sADR) is defined as any event or reaction that results in death, a life-threatening adverse event, inpatient hospitalization or prolongation of existing hospitalization, a persistent or significant incapacity or substantial disruption of the ability to conduct normal life functions, or a congenital anomaly or birth defect [1, 2]. In the U.S., sADRs contribute to over 100,000 deaths per year and have been one of the leading causes of mortality over the past several decades, and thus pose a significant public health concern [1, 3, 4, 5, 6, 7]. sADRs such as liver failure and fatal arrhythmia can lead to a drug being withdrawn from the market when the risks outweigh the benefits [8, 9, 10, 11].
FDA-approved drug labeling is defined by the Code of Federal Regulations (21CFR201.57) and contains 17 distinct sections. Each section provides specific information such as drug safety (e.g., Drug Interactions and Contraindications), efficacy (e.g., Indications & Usage and Dosage & Administration), patient information (e.g., Patient Counseling Information), target populations (e.g., Use in Specific Populations), and clinical and nonclinical data (e.g., Clinical Pharmacology and Nonclinical Toxicology). To promote the safe use of drug products and protect public health, ADR information is collected from clinical trials and post-marketing surveillance data and summarized in FDA-approved drug labeling. Boxed Warning (BW), Warnings and Precautions (WP), and Adverse Reactions (AR) are three sections that focus on ADRs.
Even though these three sections all involve ADRs, each has a different level of severity and coverage. BW describes “serious warnings, particularly those that lead to death or serious injury,” while WP describes “clinically significant adverse reactions,” and AR describes the “overall adverse reaction profile of the drug”. Consequently, ADRs mentioned in BW are the most serious, whereas WP and AR contain both serious and less-serious ADRs. Each of these three sections contains pertinent information on adverse reactions that is valuable and critical for health professionals in promoting the safe use of the drug product. However, treating the three ADR related sections equally could lead to an inadequate assessment of ADR severity, and in turn to misinterpretation or unintended harmful events. Therefore, it is important to consider the different levels of severity associated with labeling sections when studying ADRs.
The Medical Dictionary for Regulatory Activities (MedDRA) [15, 16, 17, 18] is the standard medical terminology developed by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH), and is used worldwide to facilitate the sharing of regulatory information for medical products. MedDRA is mandated in Europe and Japan for safety reports, and has been used for coding adverse events in the FDA’s Adverse Event Reporting System (FAERS). MedDRA is widely applied in analyzing adverse event report data [21, 22, 23, 24] and in mining public health data (e.g., Medline, WebMD, and Web of Science databases) for potential safety concerns [25, 26, 27, 28]. One of the key features of MedDRA is its five-level hierarchical structure. The basic Low Level Terms (LLTs) are the most granular terms and can be used to encode adverse events (AEs) or ADRs. LLTs often include common and well-known terms that patients, those reporting ADRs, and some healthcare providers frequently use. Synonymous and quasi-synonymous LLTs are grouped under a Preferred Term (PT), the level that many healthcare providers and researchers tend to use. Moving up the hierarchy, clinically related PTs are grouped under High Level Terms (HLTs), related HLTs are grouped under High Level Group Terms (HLGTs), and HLGTs are grouped under System Organ Classes (SOCs). This network of linked terms provides a method to standardize the language used and allows for accurate analysis of reported ADRs.
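The five-level hierarchy can be pictured as a chain of parent links from an LLT up to its SOC. The following is a minimal Python sketch under stated assumptions: the dictionary `PARENT` is a hand-written toy covering a single branch (it is not an extract of any MedDRA release, and the links shown are illustrative), and `roll_up` is a hypothetical helper name.

```python
# Toy parent links for one branch of the hierarchy.
# Illustrative only: not an extract of an actual MedDRA release.
PARENT = {
    ("LLT", "Heart attack"): ("PT", "Myocardial infarction"),
    ("PT", "Myocardial infarction"): ("HLT", "Ischaemic coronary artery disorders"),
    ("HLT", "Ischaemic coronary artery disorders"): ("HLGT", "Coronary artery disorders"),
    ("HLGT", "Coronary artery disorders"): ("SOC", "Cardiac disorders"),
}


def roll_up(level, term):
    """Return the chain of (level, term) pairs from the given term up to the top."""
    chain = [(level, term)]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


path = roll_up("LLT", "Heart attack")
levels = [lvl for lvl, _ in path]
for lvl, name in path:
    print(f"{lvl:>4}: {name}")
```

Storing each term together with its level keeps the lookup unambiguous when the same wording appears at more than one level of the hierarchy.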
Studies have successfully implemented the use of MedDRA terminology to code and investigate ADRs in a variety of documents. For example, a study conducted by Thiessard et al. applied MedDRA terminology to study over 190,000 ADR reports in the French spontaneous reporting system between the years 1986–2001 and discovered that ADRs related to skin and subcutaneous tissue disorders and nervous system disorders were the most frequently reported. de Langen et al. used MedDRA to code and compare ADRs self-reported by patients and those reported by healthcare professionals, to evaluate the intrinsic value of patient self-reporting, and found differences in the categories of seriousness (e.g., life-threatening and death related ADRs).
MedDRA has also been used to analyze ADRs in FDA drug labeling [29, 30]. For example, the Side Effect Resource Database (SIDER) applied MedDRA terminology to extract ADR information from drug labeling [30, 31, 32]. In our previous research, we have applied MedDRA to drug labeling to assess the utility of ADRs in drug repurposing. However, most research on drug labeling, if not all, does not discriminate the severity of an ADR according to different labeling sections (e.g., BW, WP, and AR). Therefore, such studies might not provide an adequate assessment of drug toxicity and severity, potentially undermining the utility of drug labeling.
To demonstrate the utility of FDA-approved drug labeling for the study of ADRs, we compared the results from the three sections with a specific focus on sADRs presented in BW. Our results demonstrate that computer-aided ADR analysis, combining the standardized terminology of MedDRA with data mining techniques, allowed us to characterize the frequency, severity, and pattern of ADRs in drug labeling documents. This approach provides a prototype for the study of ADRs in other public health documents.
ADR analysis based on different drug labeling sections
Table: Occurrence of MedDRA terms in three ADR related labeling sections
Columns: ADR Section Name; # Low Level Terms*; # Preferred Terms*
Rows: Boxed Warning (BW); Warnings and Precautions (WP); Adverse Reactions (AR); Whole Labeling Document
These results further support our view that simply treating these three ADR sections equally could lead to the misinterpretation and potential underestimation of the most important sADRs. In the subsequent analysis, we therefore focused on sADRs by investigating the PTs in the BW section.
Drug induced organ toxicity
Of note, SOC General disorders and administration site conditions (Genrl) involved the highest number of drugs (197) and had 41 unique PTs, such as Death, Pain, and Perforation. SOC Nervous system disorders (Nerv) involved the second highest number of drugs (123) and contained the most PTs (58 unique PTs). SOCs Cardiac disorders (Card), Vascular disorders (Vasc), and Blood and lymphatic system disorders (Blood) involved a relatively higher number of drugs and a significantly higher number of PTs compared to SOCs Endocrine disorders (Endo), Eye disorders (Eye), and Ear and labyrinth disorders (Ear).
Hierarchical cluster analysis reveals PT patterns across drug classes
The same drug classes shared similar PT patterns
By applying HCA, we were able to investigate whether drugs under the same ATC therapeutic categories share similar PT patterns. HCA results revealed several clusters: (a) L01 (antineoplastic agents) and L04 (immunomodulating agents) shared a diverse PT profile with 39 PTs; L01 involved 75 of the total 129 PTs. (b) J05 (antivirals for systemic use) drugs were highly enriched with PTs like Hepatitis and HIV infection. (c) Nervous system ATC groups (N) were enriched with drug abuse related PTs like substance abuse, dependence, and completed suicide. (d) PTs such as coma, respiratory depression, and sedation co-occurred in the BW of Nervous system drugs. (e) PTs such as myocardial infarction and ulcer were shared between S01 (ophthalmologicals), D11 (other dermatological preparations), M01 (anti-inflammatory and antirheumatic products), and M02 (topical products for joint and muscular pain), which all include NSAIDs that can increase the risk of serious gastrointestinal adverse reactions.
Analysis was conducted on ADRs which were extracted from BW, WP, and AR sections using MedDRA terminology and Oracle Text search. We first conducted a comparative analysis of three ADR sections of drug labeling (i.e., BW, WP and AR). Next, we applied pattern recognition and statistical methods to analyze sADRs from BW across MedDRA SOCs and therapeutic classes to gain an understanding of the sADRs underpinning drug safety. Our study has shown that MedDRA hierarchical structure facilitates the novel use of drug labeling documents for the analysis of sADRs. In addition, data mining by combining MedDRA and drug class information revealed patterns of sADRs within and across ATC drug classes.
The number of MedDRA PTs occurring in each section increased in the order of BW < WP < AR, while the severity of the ADRs decreases in the opposite order (BW > WP > AR). We compared the top 20 most frequently occurring MedDRA PTs among BW, WP, and AR. The six PTs (Death, Pregnancy, Depression, Hemorrhage, Cardiac Failure, Infection) that overlapped between BW and WP are more serious ADRs than the eight PTs (Nausea, Pain, Vomiting, Diarrhea, Hypersensitivity, Pyrexia, Infection, and Hypertension) that were highly present in both WP and AR. We noticed that only one PT (Infection) out of the top 20 PTs was present across all three sections, indicating that virus infection could lead to diverse side effects of drug use.
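The top-N comparison described above amounts to ranking PTs by the number of labeling sections in which they occur and intersecting the resulting sets. The sketch below illustrates this in Python with invented toy data (not data from this study); `top_terms` is a hypothetical helper, and `n=2` is used instead of 20 only to keep the example small.

```python
from collections import Counter


def top_terms(pt_lists, n=20):
    """Return the n PTs present in the most sections (ties broken alphabetically)."""
    # set(pts) ensures a PT is counted at most once per labeling section.
    counts = Counter(pt for pts in pt_lists for pt in set(pts))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {pt for pt, _ in ranked[:n]}


# Hypothetical PT lists, one inner list per drug labeling, for each section.
bw = [["Death", "Infection"], ["Death", "Hemorrhage"], ["Infection"]]
wp = [["Infection", "Nausea"], ["Infection", "Death"], ["Nausea"]]
ar = [["Nausea", "Pain"], ["Nausea", "Infection"], ["Pain", "Infection"]]

top_bw, top_wp, top_ar = (top_terms(section, n=2) for section in (bw, wp, ar))
shared_bw_wp = top_bw & top_wp           # PTs in the top lists of both BW and WP
shared_all = top_bw & top_wp & top_ar    # PTs in the top lists of all three sections
print(sorted(shared_bw_wp), sorted(shared_all))
```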
Analysis results showed that a PT occurring in different sections may carry a different frequency and weight. For example, PT Myocardial infarction occurred in 34 of 367 (9.26%) BW sections and in 193 of 1148 (16.8%) WP sections, indicating that the usage frequency of Myocardial infarction is similar in the two labeling sections, mainly because sADRs like myocardial infarction are described in both BW and WP. On the other hand, PT Hypersensitivity showed a different rate among the sections, as it occurred in only 11 of 367 (3.00%) BW sections but in 360 of 1148 (31.36%) WP sections. The more frequent appearance of Hypersensitivity in WP than in BW indicates that the seriousness of Hypersensitivity varies from drug to drug. Thus, both the frequency and the seriousness of an ADR need to be taken into consideration when evaluating ADR risks.
Most, if not all, previous ADR studies using drug labeling with MedDRA [30, 32] focused on ADRs from the entire drug labeling with no discrimination in the severity of the same ADRs appearing in different sections. Such an approach does not fully take advantage of the drug labeling information. For example, SIDER is a well-established resource containing information on marketed medicines and recorded ADRs, which is mainly extracted from public documents and drug labeling. The available information includes ADR frequency, drug and ADR classifications, drug indication, and other relevant information. However, the SIDER database does not discriminate ADRs of one section from another, which could lead to a false representation of ADRs. The separation of ADRs by sections is of great importance when discriminating the seriousness level of ADRs for drug safety monitoring and evaluation [14, 35], as shown in this study.
HCA revealed that drugs in the same class are likely to have similar PT (i.e., ADR) patterns. Drugs from sub-therapeutic categories N01 and N02 (e.g., opioids) in the Nervous system class (N) were more related to PTs such as substance abuse, dependence (including LLT addiction), and respiratory depression (Additional file 3). These findings are consistent with our understanding that the opioid crisis is highly related to addiction. The opioid epidemic is one of the most pressing public health concerns in the U.S. and is a top priority for the FDA. For drugs that are known to have potentially serious risks, the FDA has enhanced labeling by incorporating the Risk Evaluation and Mitigation Strategy (REMS) program to provide oversight for the continued safe use of those drugs. One N01 drug (fentanyl) and two N02 drugs (buprenorphine, oxycodone) are opioids under REMS (Fig. 3, cluster c). Another N01 drug involved in cluster c (sodium oxybate) is also under REMS.
In this study, we applied MedDRA terms to extract ADRs in drug labeling, an area that has not been well investigated. Drug labeling documents are in free text, making it difficult to extract information and conduct ADR analysis. Using MedDRA terminology to standardize ADR terms helps to enhance the analytical ability of text mining. This method can be deployed in pharmacovigilance by mining free text observational data for adverse drug events to assist drug safety surveillance. In addition to MedDRA, there are other biomedical terminologies, dictionaries, and coding systems (e.g., SNOMED-CT and ICD-9) that have been developed for public healthcare information dissemination. However, SNOMED-CT hierarchies are not limited to a tractable number of levels (i.e., they can have more than 10 levels), which creates hurdles for translational and regulatory applications. The MedDRA hierarchy, with five clearly defined levels, simplifies mapping and coding practices and facilitates communication with ADR reporting systems like the FDA Adverse Event Reporting System (FAERS). Of note, MedDRA is used as the adverse event reporting terminology by many drug regulatory authorities and the pharmaceutical industry worldwide but is not required for FDA-approved drug labeling. MedDRA PTs can be used to describe medical events and medication errors that are AEs or ADRs.
To evaluate Oracle Text search performance on MedDRA terms extracted from the Boxed Warning drugs, we compared our results with a dataset of ADRs manually extracted from 200 drug labeling documents, published in Scientific Data in 2018 (used as a gold-standard dataset). Specifically, our study and the publication had 30 BW drugs in common. We calculated the recall and precision for each drug (Additional file 4). On average per drug, the recall score for PTs was 0.93 by Oracle Text search; 26 of the 30 (86.7%) drugs yielded 1.0 recall. Four of the 30 drugs had false-negative PTs (total of 3 different PTs). These differences arose from PTs that experts identified during manual coding of the reference dataset (using human interpretation) but that Oracle Text search was unable to match because the words did not appear in that exact order in the labeling text (for details see Additional file 4). For example, Oracle did not recognize the term “suicidal behavior” when it occurred in the text as “suicidal thinking and behavior.” The average precision was low, 0.46, indicative of a high number of false positives, which were mostly attributed to the occurrence of a smaller term within a larger term (e.g., myocardial infarction contains the PT infarction), which makes it difficult for Oracle Text to report one larger PT rather than two PTs.
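For reference, per-drug precision and recall and their averages can be computed as in the minimal Python sketch below. The PT sets are invented for illustration (they are not taken from Additional file 4), and `precision_recall` is a hypothetical helper name.

```python
from statistics import mean


def precision_recall(predicted, reference):
    """Precision and recall of predicted PTs against manually coded reference PTs."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical per-drug results: (PTs found by text search, PTs coded by experts).
drugs = {
    # "Infarction" is a false positive nested inside "Myocardial infarction";
    # "Suicidal behavior" is a false negative missed by exact phrase matching.
    "drug_a": ({"Myocardial infarction", "Infarction"},
               {"Myocardial infarction", "Suicidal behavior"}),
    "drug_b": ({"Death"}, {"Death"}),
}

scores = {name: precision_recall(pred, ref) for name, (pred, ref) in drugs.items()}
avg_precision = mean(p for p, _ in scores.values())
avg_recall = mean(r for _, r in scores.values())
print(scores, avg_precision, avg_recall)
```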
Further caution should be exercised for the following reasons. First, drug labeling documents are not mandated to be MedDRA coded, and some ADRs in drug labeling are worded differently from the terms in MedDRA, which could cause the Oracle Text query to fail to identify them. Second, MedDRA has terms beyond ADRs for regulatory reporting purposes. Third, stop words and multiple-meaning words may pose an additional limitation. The Oracle Text query was built with basic NLP (Natural Language Processing) techniques including stop word removal, stemming, and tokenization. The default stop words used during Oracle Text indexing and mapping of the MedDRA dictionary did present a problem. For example, Hepatitis A contained the stop word ‘A’ and Hepatitis D contained the stop word ‘D’. Thus, all labeling that contained “Hepatitis *” was identified as a positive hit regardless of whether it was A, D, or another stop word (Fig. 4). Lastly, issues with multiple-meaning words were also identified during this study. For example, drug labeling might contain the word “fall” as in “fall in hemoglobin,” meaning a decreased blood hemoglobin level. Therefore, the accurate coding for this situation should be LLT “Hemoglobin decreased,” not LLT “fall,” which refers to a person “falling down.”
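The stop word issue can be illustrated with a toy normalizer. This Python sketch does not reproduce Oracle Text's actual indexing or query behavior; the stop word list and the `normalize` function are assumptions made for illustration. It only shows why two distinct terms become indistinguishable once single-letter stop words are dropped.

```python
# Assumed toy stop word list; this is NOT Oracle Text's default stoplist.
STOP_WORDS = {"a", "d", "in", "of", "the"}


def normalize(term):
    """Lowercase, split on whitespace, and drop stop words."""
    return tuple(tok for tok in term.lower().split() if tok not in STOP_WORDS)


hep_a = normalize("Hepatitis A")
hep_d = normalize("Hepatitis D")
hep_b = normalize("Hepatitis B")

# After stop word removal, "Hepatitis A" and "Hepatitis D" collapse to the same key,
# while "Hepatitis B" stays distinct because "b" is not in the stop word list.
print(hep_a, hep_d, hep_b)
```

One possible mitigation, under the same assumptions, is to use a custom stop word list that excludes single letters carrying clinical meaning.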
Overall, relatively high recall and low precision were observed using Oracle Text search compared to the manually MedDRA-coded gold standard, which indicates that automatic computer programs could help identify and narrow down ADR terms to reduce labor-intensive manual coding. However, manual validation is essential to reduce false negatives and false positives. In addition, further refinement of the Oracle Text search (e.g., with advanced NLP), based on an understanding of the MedDRA standard and drug labeling text documents, is warranted.
This study demonstrated that combining MedDRA standard terminologies with data mining techniques facilitated computer-aided ADR analysis of drug labeling. This study also highlighted the importance of discrimination of the same ADRs which appear in different labeling sections. We specifically focused on serious ADRs primarily presented in BW as a proof-of-concept for the study of ADRs and the same approach should be equally applicable to other public health documents. It is worthwhile to point out that the proposed approach can be developed with consideration of other labeling sections, such as Indications and Usage, Drug Interactions, Contraindications, and Clinical Studies, to extract valuable safety and efficacy related information from drug labeling documents and even other public health documents (e.g., Electronic Health Records).
Materials and methods
Drug labeling documents
Drug labeling documents used in this study are in the Structured Product Labeling (SPL) format. SPL is a document markup standard approved by Health Level Seven International (HL7), mandated by the FDA since 2005, as a standard XML format used to guide manufacturers on how to report and share drug product information. A wealth of material associated with a drug is included in the SPL (e.g., text, tables, safety and use information, active ingredients, package inserts, packaging type), and is required for all human drug products, including over-the-counter and biologic drug products. The FDA’s Center for Drug Evaluation and Research manages SPL submissions and approvals for US marketed drug products. In SPL documents, each labeling section title is coded by Logical Observation Identifiers Names and Codes (LOINC), which is a set of universal codes used to identify or exchange medical information. For example, the LOINC code for BW is 34066-1, and the LOINC code for WP is 43685-7. We used LOINC to parse the three ADR related sections (BW, WP, AR) from the XML-based SPL file.
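Selecting sections by LOINC code can be sketched with Python's standard XML parser. The snippet below is a simplified illustration and not the pipeline used in this study: the embedded XML is a tiny stand-in for a real SPL file, `extract_adr_sections` is a hypothetical helper, and the AR code 34084-4 is not stated in the text above (it is included on the assumption that it is the LOINC code for the Adverse Reactions section and should be verified before use).

```python
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}  # namespace used by HL7 v3 / SPL documents
SECTION_CODES = {
    "34066-1": "BW",
    "43685-7": "WP",
    "34084-4": "AR",  # assumption: not given in the text, verify against LOINC
}

# Tiny stand-in for an SPL file; real documents are much larger and more nested.
SPL_SAMPLE = """<document xmlns="urn:hl7-org:v3">
  <component><structuredBody>
    <component><section>
      <code code="34066-1" codeSystem="2.16.840.1.113883.6.1"/>
      <text><paragraph>WARNING: RESPIRATORY DEPRESSION</paragraph></text>
    </section></component>
    <component><section>
      <code code="34067-9" codeSystem="2.16.840.1.113883.6.1"/>
      <text><paragraph>Indicated for pain.</paragraph></text>
    </section></component>
  </structuredBody></component>
</document>"""


def extract_adr_sections(spl_xml):
    """Return {section label: plain text} for sections with a LOINC code of interest."""
    root = ET.fromstring(spl_xml)
    found = {}
    for section in root.iter("{urn:hl7-org:v3}section"):
        code = section.find("hl7:code", NS)
        label = SECTION_CODES.get(code.get("code")) if code is not None else None
        if label:
            # Flatten all nested text and collapse runs of whitespace.
            found[label] = " ".join("".join(section.itertext()).split())
    return found


sections = extract_adr_sections(SPL_SAMPLE)
print(sections)
```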
The FDALabel database (https://www.fda.gov/scienceresearch/bioinformaticstools/ucm289739.htm) was used to collect the drug labeling documents for this study. FDALabel is developed and maintained by the FDA as a web-based application that allows access to the most up-to-date drug labeling data, aiding their use in regulatory science, drug development, and scientific research. In its latest version, FDALabel allows the easy querying of drug information based on labeling sections (e.g., BW, WP, and AR). SPL documents are the source of FDALabel; they are archived by the FDA and can be downloaded from DailyMed. The current version of the FDALabel database (3/20/2017) has 94,657 SPLs, which include human prescription drugs, biological products, and over-the-counter (OTC) drugs.
FDA-approved NDA drug list
In the current version of FDALabel, 34,681 of the 94,657 SPLs are of human prescription drug labeling (hereafter called “drug labeling”). Of note, one prescription drug can have multiple SPLs due to the differences in regulatory applications, dosage forms, routes of administration, manufacturers, etc. For this study, duplicates of SPLs with the same Unique Ingredient Identifier (UNII) were removed and only the most recent effective SPL of the UNII drug was used. The drug list used in this study was selected using the following sequential criteria: (I) human prescription drug; (II) New Drug Application (NDA) drug; (III) single active ingredient UNII; (IV) most recent SPL of the same UNII of a drug. Finally, 1164 unique drug SPLs were extracted. The detailed drug list is provided in Additional file 5.
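The sequential selection criteria can be expressed as a simple filter followed by a "keep the latest per UNII" step. The Python sketch below uses invented records; the field names (`rx`, `app_type`, `uniis`, `effective`) and the UNII placeholders are hypothetical and do not reflect the FDALabel schema.

```python
from datetime import date

# Hypothetical SPL records; field names and values are illustrative only.
spls = [
    {"spl_id": "s1", "rx": True, "app_type": "NDA", "uniis": ["U1"], "effective": date(2015, 3, 1)},
    {"spl_id": "s2", "rx": True, "app_type": "NDA", "uniis": ["U1"], "effective": date(2016, 7, 15)},
    {"spl_id": "s3", "rx": True, "app_type": "ANDA", "uniis": ["U2"], "effective": date(2016, 1, 1)},
    {"spl_id": "s4", "rx": True, "app_type": "NDA", "uniis": ["U3", "U4"], "effective": date(2016, 5, 5)},
    {"spl_id": "s5", "rx": False, "app_type": "NDA", "uniis": ["U5"], "effective": date(2016, 2, 2)},
]


def select_drug_list(records):
    """Apply criteria (I)-(IV): prescription, NDA, single UNII, most recent SPL per UNII."""
    latest = {}
    for rec in records:
        if not (rec["rx"] and rec["app_type"] == "NDA" and len(rec["uniis"]) == 1):
            continue
        unii = rec["uniis"][0]
        if unii not in latest or rec["effective"] > latest[unii]["effective"]:
            latest[unii] = rec
    return latest


selected = select_drug_list(spls)
print({unii: rec["spl_id"] for unii, rec in selected.items()})
```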
Extracting MedDRA standardized terms for ADR study using Oracle Text search
In this study, MedDRA version 19.0 was used, which has a total of 75,818 LLTs, 21,920 PTs, 1732 HLTs, 335 HLGTs, and 27 SOCs. MedDRA has anatomical, physiological, and etiological SOCs. AEs or ADRs coded by MedDRA LLTs are classified per MedDRA’s predefined hierarchy and can be aggregated using SOCs. Of the 27 SOCs, 22 are “disorder” SOCs with PTs that are highly related to ADRs, such as Cardiac disorders and Psychiatric disorders. We removed the 5 SOCs that were not ADR specific: Injury, poisoning and procedural complications (Inj&P), Investigations (Inv), Social circumstances (SocCi), Surgical and medical procedures (Surg), and Product issues (Prod).
We extracted ADRs in drug labeling with LLTs through an Oracle Text querying strategy and then linked the LLTs to their corresponding PTs for frequency counting. We counted each PT only once per section per labeling, regardless of how many times the PT, or its subordinate LLTs, occurred within the specific labeling section. Although PTs can be linked to multiple SOCs, for our SOC level analysis, only the primary SOC was considered.
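The counting rule (each PT counted once per section per labeling, however many of its LLTs match) can be sketched in Python as follows. The LLT-to-PT map and the matched terms are toy assumptions, not study data, and `pts_in_section` is a hypothetical helper.

```python
from collections import Counter

# Toy LLT -> PT map (illustrative; a real map would come from the MedDRA files).
LLT_TO_PT = {
    "heart attack": "Myocardial infarction",
    "myocardial infarction": "Myocardial infarction",
    "addiction": "Dependence",
}


def pts_in_section(matched_llts):
    """Map matched LLTs to PTs; the set removes repeats within one section."""
    return {LLT_TO_PT[llt] for llt in matched_llts if llt in LLT_TO_PT}


# Hypothetical LLT hits in the BW section of three labeling documents.
bw_hits = {
    "drug_a": ["heart attack", "myocardial infarction", "heart attack"],
    "drug_b": ["addiction"],
    "drug_c": ["myocardial infarction", "addiction"],
}

# Number of labeling documents whose BW section contains each PT.
pt_counts = Counter(pt for hits in bw_hits.values() for pt in pts_in_section(hits))
print(pt_counts)
```

In this toy case, the three LLT hits in `drug_a` contribute a single count to Myocardial infarction.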
The MedDRA term extraction process was conducted using Oracle Text search. First, the full text sections of the labeling SPLs, as XML, were parsed into the Oracle database based on LOINC. The text index was built using basic NLP procedures in the Oracle database, including stop word removal, stemming, and pattern matching [42, 43]. Then, the processed text information was indexed and extracted using MedDRA LLTs and mapped to PTs. Specifically, the LLTs and PTs were extracted for each drug labeling document from the three ADR related sections (i.e., BW, WP, and AR) as well as from the whole document using structured query language (SQL). The resulting drug-PT matrix was used for further data analysis.
Fisher’s exact test of SOC significance
Fisher’s exact test was performed per individual SOC, comparing the number of PTs that occurred in BW drugs belonging to the SOC to the total number of PTs occurring in that SOC for the FDA-approved NDA drug list. Since multiple SOCs were tested, Bonferroni correction (p < 0.002) was further considered in determining whether SOCs had significantly enriched Boxed Warnings (Additional file 2).
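As a worked illustration of the test, the Python sketch below computes a two-sided Fisher's exact p-value for a single 2x2 table using only the standard library and compares it with a Bonferroni-adjusted threshold. The table counts are invented, the layout of the table (rows: BW versus other; columns: in the SOC versus not) is an assumption about how the comparison could be arranged, and the choice of 0.05 divided by 22 tests is likewise an assumption that happens to round to the reported 0.002.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts for one SOC (not study data).
p_value = fisher_exact_two_sided(8, 2, 1, 5)

n_tests = 22                 # assumption: one test per "disorder" SOC
alpha = 0.05 / n_tests       # Bonferroni-adjusted threshold, about 0.0023
significant = p_value < alpha
print(round(p_value, 5), round(alpha, 5), significant)
```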
Anatomical therapeutic chemical (ATC) codes
The Anatomical Therapeutic Chemical (ATC) classification system classifies drugs by organ or system of involvement, as well as by chemical, therapeutic, and pharmacological properties. In this study, drugs were categorized into 54 ATC classes at the therapeutic/pharmacological level (the second level in the ATC hierarchy). Details can be found in Additional file 6. If a drug had multiple ATC codes, all ATCs were counted separately. ATC information for the 1164 drugs was retrieved from the DrugBank database. We first mapped drugs via the active ingredient and then mapped the remaining drugs via their active moiety UNIIs. In this way, 989 drug-ATC relationships were identified and used to group the drugs into ATC classes.
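Because the second ATC level corresponds to the first three characters of a full ATC code, grouping drugs by class reduces to a prefix operation. The Python sketch below is illustrative: the drug-to-code assignments are examples that should be checked against the current ATC index, and `atc_level2` is a hypothetical helper.

```python
from collections import defaultdict


def atc_level2(code):
    """Return the second-level ATC group, e.g. 'N02AA05' -> 'N02'."""
    return code[:3].upper()


# Example drug-to-ATC assignments (illustrative; verify against the ATC index).
drug_atc = {
    "fentanyl": ["N01AH01", "N02AB03"],
    "oxycodone": ["N02AA05"],
    "diclofenac": ["M01AB05", "M02AA15", "S01BC03"],
}

groups = defaultdict(set)
for drug, codes in drug_atc.items():
    for code in codes:
        # A drug with several ATC codes is counted once in each of its groups.
        groups[atc_level2(code)].add(drug)

print({group: sorted(members) for group, members in sorted(groups.items())})
```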
Hierarchical clustering analysis
A two-way Hierarchical Cluster Analysis (HCA) is an unsupervised learning approach primarily used for pattern discovery. In this analysis, HCA was used to investigate the grouping of ADRs (along with associated PTs) for BW drugs (i.e., drugs with a BW) in terms of their similarities across drug classes (ATC). Log2 transformations of PT frequencies were performed before conducting the HCA. Extracted PT data and ATC group data were organized into a data matrix where each row represented a single MedDRA PT, and each column represented an ATC second-level group. The frequency of each PT is the number of drugs in one ATC group that contained this PT in the labeling.
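The PT-by-ATC frequency matrix described above can be assembled as in this minimal Python sketch. The input data are invented, and the `+ 1` offset inside the logarithm is an assumption made here so that zero counts remain defined; the text does not state how zero frequencies were handled.

```python
import math

# Hypothetical BW data: PTs per drug and second-level ATC groups per drug.
drug_pts = {
    "drug_a": {"Dependence", "Respiratory depression"},
    "drug_b": {"Dependence"},
    "drug_c": {"Hepatitis"},
}
drug_atc = {"drug_a": ["N02"], "drug_b": ["N02", "N01"], "drug_c": ["J05"]}

pts = sorted({pt for terms in drug_pts.values() for pt in terms})       # matrix rows
atcs = sorted({atc for groups in drug_atc.values() for atc in groups})  # matrix columns

# counts[pt][atc] = number of drugs in the ATC group whose BW contains the PT.
counts = {pt: {atc: 0 for atc in atcs} for pt in pts}
for drug, terms in drug_pts.items():
    for atc in drug_atc[drug]:
        for pt in terms:
            counts[pt][atc] += 1

# log2(count + 1) maps a zero count to zero; the +1 offset is an assumption.
log_matrix = [[math.log2(counts[pt][atc] + 1) for atc in atcs] for pt in pts]
for pt, row in zip(pts, log_matrix):
    print(pt, [round(value, 2) for value in row])
```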
Some ATC groups have multiple drugs, such as antineoplastic agents (L01), psycholeptics (N05), and psychoanaleptics (N06). However, some ATC groups only contain one BW drug, such as antifungals for dermatological use (D01) and pituitary and hypothalamic hormones and analogues (H01). To reduce possible data noise from low frequency values, we compiled a preprocessed data matrix containing only ATC groups with at least 5 drugs, which were then further explored by cluster analysis. Similarly, only PTs with a total count of at least 5 across all drugs were included in the cluster analysis. Overall, for the final analysis, 129 out of 460 PTs and 25 out of 54 ATCs were used to compile a preprocessed data matrix (Additional file 7), which was analyzed by cluster analysis using the heatmap.1 function in R (version 3.2.1).
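The two frequency thresholds can be applied as a simple filter before clustering. The Python sketch below uses invented counts; `filter_matrix` is a hypothetical helper, and computing each PT's total over all ATC columns before the ATC filter is applied is an assumption about the order of the two steps.

```python
def filter_matrix(counts, drugs_per_atc, min_drugs=5, min_pt_total=5):
    """Keep ATC columns with >= min_drugs drugs and PT rows with total >= min_pt_total."""
    kept_atcs = [atc for atc, n in drugs_per_atc.items() if n >= min_drugs]
    filtered = {}
    for pt, row in counts.items():
        # Assumption: the PT total is taken over all ATC groups, before column filtering.
        if sum(row.values()) >= min_pt_total:
            filtered[pt] = {atc: row.get(atc, 0) for atc in kept_atcs}
    return filtered


# Hypothetical inputs: number of BW drugs per ATC group and PT-by-ATC counts.
drugs_per_atc = {"L01": 40, "N02": 12, "D01": 1}
counts = {
    "Dependence": {"L01": 0, "N02": 9, "D01": 0},
    "Coma": {"L01": 1, "N02": 2, "D01": 0},
    "Hepatitis": {"L01": 4, "N02": 0, "D01": 1},
}

filtered = filter_matrix(counts, drugs_per_atc)
print(filtered)
```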
DM and JY are grateful for the support of this project in part by an appointment to the Internship/Research Participation Program at the National Center for Toxicological Research, U.S. FDA, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the FDA.
The FDALabel project was supported by FDA/CDER and FDA/NCTR funding.
Availability of data and materials
Data used during the current study are available on request. FDA-approved drug labeling can be retrieved from the FDALabel database, which can be accessed at https://nctr-crs.fda.gov/fdalabel/ui/search
The views presented in this article do not necessarily reflect the current or future opinion or policy of the U.S. Food and Drug Administration. Any mention of commercial products is for clarification and not intended as an endorsement.
About this supplement
This article has been published as part of BMC Bioinformatics Volume 20 Supplement 2, 2019: Proceedings of the 15th Annual MCBIOS Conference. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-20-supplement-2
HF, WT and LW conceived and designed this study. LW, ZL, SH, GZ, JY, WG and HF performed data analysis. LW, TI, ZL, AZW, ST, JX, DM, WT and HF wrote the manuscript. All authors have agreed on all the contents and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 2. Code of Federal Regulations Title 21 (21CFR) 312.32 [https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?fr=312.32], accessed on 03/2018.
- 3. D'arcy P, Griffin J. Thalidomide revisited. Adverse Drug React Toxicol Rev. 1993;13(2):65–76.
- 12. Code of Federal Regulations Title 21 (21CFR) 201.57 [https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfCFR/CFRSearch.cfm?fr=201.57], accessed on 03/2018.
- 16. Mozzicato P. MedDRA - an overview of the medical dictionary for regulatory activities. Pharmaceutical Medicine. 2009;23(2):65–75.
- 17. Mozzicato P. MedDRA - past and future. Regulatory Affairs J Pharma. 2006:797–805.
- 18. Harrison J, Zhao-Wong A. Working with MedDRA to improve data standards. Good Clinical Practice Journal. 2006.
- 19. Tabor E. Cobert’s manual of drug safety and Pharmacovigilance. Drug Information Journal. 2012;46(1):140–0.
- 22. de Langen J, van Hunsel F, Passier A, de Jong-van den Berg L, van Grootheest K. Adverse drug reaction reporting by patients in the Netherlands three years of experience. Drug Saf. 2008;31(6):515–24.
- 25. Nikfarjam A, Sarker A, O’Connor K, Ginn R, Gonzalez G. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. J Am Med Inform Assoc. 2015:ocu041.
- 27. Segura-Bedmar I, De La Peña S, Martínez P. Extracting drug indications and adverse drug reactions from Spanish health social media. Proceedings of BioNLP. 2014:98–106.
- 28. Ji X, Chun SA, Cappellari P, Geller J. Linking and using social media data for enhancing public health analytics. J Inf Sci. 2017;43(2):221–45.
- 34. Chen M, Zhang J, Wang Y, Liu Z, Kelly R, Zhou G, Fang H, Borlak J, Tong W. The liver toxicity knowledge base: a systems approach to a complex end point. Clinical Pharmacology & Therapeutics. 2013;93(5):409–12.
- 37. Blendon RJ, Benson JM. The public and the opioid-abuse epidemic. N Engl J Med. 2018.
- 38. Risk Evaluation and Mitigation Strategies (REMS) https://www.fda.gov/Drugs/DrugSafety/REMS/default.htm, accessed on 03/2018.
- 42. Dixon P. Basics of oracle text retrieval. IEEE Data Eng Bull. 2001;24(4):11–4.
- 43. Murthy R, Banerjee S. Xml schemas in Oracle XML DB. In: Proceedings of the 29th international conference on Very large data bases-Volume 29: 2003. VLDB Endowment: 1009–1018.
- 44. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006;34(suppl 1):D668–72.
- 45. Halkidi M, Batistakis Y, Vazirgiannis M. On clustering validation techniques. J Intell Inf Syst. 2001;17(2):107–45.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.