Key Points

It is possible to identify and compare adverse events of statins mentioned on social media to those identified in other sources.

Reports of adverse events of statins on social media generally follow a similar pattern to reports from other sources.

Social media provides information on which adverse events of statins are mentioned the most by patients.

1 Introduction

Statins are used to prevent coronary artery disease (CAD) events in both patients with and without a history of coronary artery disease. However, there has been controversy regarding their benefit–harm balance, especially in individuals who do not have a history of CAD [1]. Part of the issue stems from concern over poor reporting of the adverse events of statins. Regulatory data has been lacking because reporting is voluntary and the number of patients taking medications is difficult to ascertain; therefore, the prevalence of a reported event is unknown [2,3,4].

Adverse events are an important consideration in healthcare decision making. When different treatment options are available, adverse events can often be the deciding factor—as in some cases there may be modest benefit, and adverse events can lead to further treatments at significant cost [5,6,7,8]. It is not only those adverse events that are defined as serious that are important (such as those leading to hospitalization, disability and death) but also those that are bothersome or interfere with an individual’s quality of life. The importance of all adverse events should not be understated given that many interventions are given for prevention or for a small potential benefit. Researchers characterizing the benefits and harms of medications therefore should be comprehensive in order to obtain the most complete picture of the adverse events profile for any intervention. Sources currently used to identify adverse events have many limitations [9]. In particular, poor reporting or underreporting is apparent in pharmacovigilance data, clinical trials and other types of studies [2, 10,11,12,13]. A large source of timely data that is currently underused is social media [14, 15]. These data could help supplement data from other sources [15, 16], providing a different perspective.

Social media has previously been found to be a feasible source for identifying posts of adverse events [15]. It has been estimated that 0.2% of posts on generic social media platforms (such as Twitter) to 8% of all posts in patient forums are adverse event reports [17]. However, few studies have evaluated the value of adverse event reports in social media compared with other sources such as regulatory data, drug information databases (DIDs) used when making prescribing decisions and other traditional sources such as research studies in the form of systematic reviews or randomized controlled trials (RCTs) [17]. In part this may be because of the difficulty of comparing such disparate data formats. For example, a study of adalimumab presented a comparison of Twitter data with regulatory data, DIDs and systematic reviews and concluded that Twitter adverse events were in moderate agreement with known events, and some bothersome events were found in Twitter that were not noted elsewhere [11]. However, this study was an example using only one injectable biologic to treat diseases with notable symptoms and the generalizability of the findings to other drugs is unknown. Statins, on the other hand, are daily oral medications used for both secondary and primary prevention and many patients may feel perfectly healthy with no obvious manifestations of disease. Statin therapy may cause irritating or uncomfortable new symptoms in these patients, and this may affect their adherence and compliance to the treatment regimen, which generally requires taking tablets every day for the rest of their life. So, the pattern of adverse events and patient reporting is potentially very different to that of a drug such as adalimumab.

We therefore aimed to address this gap in the research by using social media to identify adverse events of statins and then comparing the results with other sources. Statins were chosen as they are widely used, with atorvastatin and simvastatin in the top 10 drugs by prescription numbers in the US, and they are widely tweeted about [3]. Expanding on our prior effort to uncover the value of social media for pharmacovigilance by using another case study intervention (statins), the contributions of this paper include (1) a detailed comparison of social media with traditional information sources including regulatory data, DIDs and systematic reviews, (2) a qualitative assessment of the generalizability of the methodology proposed in [8] to a different medication class, and (3) an annotated corpus of tweets that mention an adverse effect of statins.

2 Methods

We collected data on the adverse events of statins from Twitter, the US FDA Adverse Event Reporting System (FAERS), the UK Medicines and Healthcare products Regulatory Agency (MHRA), DIDs (Facts and Comparisons® and Clinical Pharmacology®) and systematic reviews.

2.1 Tweet Collection

Twitter posts were collected from June 2013 to August 2018 from the University of Pennsylvania HLP Twitter drug database [18]. This database collects tweets via the public application programming interface (API) (https://dev.twitter.com/streaming/public). We utilized medication names and their variants [19] as keywords. The methods used to generate variants of the drug names are outlined elsewhere [19, 20]. We collected tweets in English and excluded retweets. Furthermore, we removed duplicated tweets that had different tweet IDs but the same text (using conditional formatting in Excel and a manual review of tweets with matching first 50 characters). We searched for eight statin medications. Five are licensed in the UK (atorvastatin, fluvastatin, pravastatin, rosuvastatin and simvastatin); two additional medications, lovastatin and pitavastatin, are licensed in the US; and one (cerivastatin) has been withdrawn.
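The de-duplication described above was carried out in Excel with manual review. As a rough illustration only, the same logic could be sketched in Python. The `RT @` retweet test, the case and whitespace normalisation, the function names and the sample tweets are all assumptions made for this sketch and are not taken from the study.

```python
def is_retweet(text):
    # Assumption: a retweet is recognisable by a leading "RT @" marker.
    return text.lstrip().startswith("RT @")


def normalise(text):
    # Assumption: ignore case and collapse whitespace before comparing text.
    return " ".join(text.lower().split())


def deduplicate(tweets, prefix_len=50):
    """Drop retweets and exact-text duplicates, then group the remaining
    tweets that share their first `prefix_len` characters for manual review."""
    seen = set()
    unique = []
    for tweet in tweets:
        if is_retweet(tweet["text"]):
            continue
        key = normalise(tweet["text"])
        if key in seen:  # same text under a different tweet ID
            continue
        seen.add(key)
        unique.append(tweet)

    by_prefix = {}
    for tweet in unique:
        prefix = normalise(tweet["text"])[:prefix_len]
        by_prefix.setdefault(prefix, []).append(tweet)
    review = [group for group in by_prefix.values() if len(group) > 1]
    return unique, review


# Hypothetical tweets, invented for illustration.
base = "Started simvastatin last month and my legs ache every single night now"
sample = [
    {"id": 1, "text": base},
    {"id": 2, "text": base},                     # duplicate text, new ID
    {"id": 3, "text": "RT @someone: statins are great"},
    {"id": 4, "text": base + ", anyone else?"},  # same first 50 characters
    {"id": 5, "text": "atorvastatin gave me weird dreams"},
]
unique, review = deduplicate(sample)
```

In this sketch the near-duplicates are only flagged, not removed, which mirrors the manual review step: a human decides whether tweets with a shared opening are genuinely the same post.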

All tweets were manually annotated as perceived adverse events, or some other categories, and the adverse event span was extracted. Instances in which more than one potential attributing drug and adverse event were stated were rare and this may in part be attributable to the short nature of the Twitter posts. In these cases we did, however, use manual annotation and relied upon the author’s statement and the annotator’s interpretation. The adverse event terms were manually normalized to MedDRA® Lowest Level Terms (LLT) and Preferred Term (PT) codes (such as myalgia, malaise or hypersensitivity). To facilitate comparison with other sources, the PT codes were then assigned to one of 27 MedDRA® broader categories of biological systems codes (primary System Organ Class (SOC) codes, such as hepatobiliary disorders, cardiac disorders or eye disorders). Examples are given in Table 1. In addition, we recorded fatalities in the ‘General disorders and administration site conditions’ category.

Table 1 Examples of tweets expressing adverse events of statins

2.2 Drug Regulatory Data

Data were collected from FAERS from June 2013 to October 2018. FAERS quarterly reports matching the timeframe of tweet collection were downloaded from 2013-Q3 to 2018-Q3 (FAERS website). In the DRUG file, ‘prod_ai’ was used to select reports for lovastatin, simvastatin, pravastatin, fluvastatin, atorvastatin, rosuvastatin and pitavastatin. From this list, combination drugs (e.g. amlodipine besylate/atorvastatin besylate) were eliminated. Multiple salts of the drug were aggregated into one drug category (e.g. atorvastatin calcium, atorvastatin sodium). The ‘role_cod’ field was used to select only cases in which one of the indicated statins was the ‘primary suspect’. Reports identified in the DRUG file were mapped to demographics in the DEMO file and to reactions, coded as MedDRA® PTs, in the REAC file using PRIMARYID as the common key. The latest update of each case report was utilized. PTs were examined hierarchically, including the SOC categories, to allow for comparisons across Twitter, FAERS, MHRA, DIDs and systematic reviews.
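The FAERS selection steps can be illustrated with a minimal Python sketch over rows already read from the DRUG, DEMO and REAC files. It is not the authors' pipeline. It assumes that `PS` is the `role_cod` value for a primary suspect, that combination products contain a backslash or slash separator in `prod_ai`, that salts follow the statin name as a second word, and that the latest update of a case is the row with the highest `caseversion`. All sample rows are invented.

```python
STATINS = {"LOVASTATIN", "SIMVASTATIN", "PRAVASTATIN", "FLUVASTATIN",
           "ATORVASTATIN", "ROSUVASTATIN", "PITAVASTATIN"}


def statin_of(prod_ai):
    """Return the base statin name, or None for non-statins and combinations."""
    name = prod_ai.strip().upper()
    if "\\" in name or "/" in name:  # assumed combination-product separators
        return None
    first = name.split()[0] if name else ""
    return first if first in STATINS else None  # "ATORVASTATIN CALCIUM" -> "ATORVASTATIN"


def select_reports(drug_rows, demo_rows, reac_rows):
    """Return (statin, PT) pairs for primary-suspect statin reports,
    keeping only the latest version of each case."""
    latest = {}
    for row in demo_rows:
        current = latest.get(row["caseid"])
        if current is None or int(row["caseversion"]) > int(current["caseversion"]):
            latest[row["caseid"]] = row
    latest_ids = {row["primaryid"] for row in latest.values()}

    suspects = {}
    for row in drug_rows:
        statin = statin_of(row["prod_ai"])
        if statin and row["role_cod"] == "PS" and row["primaryid"] in latest_ids:
            suspects[row["primaryid"]] = statin

    return [(suspects[r["primaryid"]], r["pt"])
            for r in reac_rows if r["primaryid"] in suspects]


# Hypothetical rows, invented for illustration.
demo = [
    {"primaryid": "1001", "caseid": "100", "caseversion": "1"},
    {"primaryid": "1002", "caseid": "100", "caseversion": "2"},
    {"primaryid": "2001", "caseid": "200", "caseversion": "1"},
    {"primaryid": "3001", "caseid": "300", "caseversion": "1"},
]
drug = [
    {"primaryid": "1001", "prod_ai": "ATORVASTATIN CALCIUM", "role_cod": "PS"},
    {"primaryid": "1002", "prod_ai": "ATORVASTATIN CALCIUM", "role_cod": "PS"},
    {"primaryid": "2001", "prod_ai": "AMLODIPINE BESYLATE\\ATORVASTATIN CALCIUM", "role_cod": "PS"},
    {"primaryid": "3001", "prod_ai": "SIMVASTATIN", "role_cod": "C"},
]
reac = [
    {"primaryid": "1001", "pt": "Myalgia"},
    {"primaryid": "1002", "pt": "Myalgia"},
    {"primaryid": "1002", "pt": "Fatigue"},
    {"primaryid": "2001", "pt": "Dizziness"},
    {"primaryid": "3001", "pt": "Rhabdomyolysis"},
]
pairs = select_reports(drug, demo, reac)
```

Here the superseded version of case 100, the combination product and the concomitant simvastatin report are all excluded, leaving only the reactions from the latest primary-suspect atorvastatin report.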

Adverse event reporting data were also collected from the MHRA from 2013 to 2018 for each of the drugs. There were no entries for pitavastatin, no eligible entries for lovastatin (only data for 2011) and limited data for cerivastatin (2013–2015). MedDRA® Primary SOC broad category data were collated for each drug and aggregated.

2.3 Drug Information Databases (DIDs)

Statin class events were collected from Facts and Comparisons® Class and related monographs for HMG CoA reductase inhibitors, and Clinical Pharmacology® drug class overview for HMG CoA reductase inhibitors. Sources to populate the databases include product monographs and primary literature. Facts and Comparisons® and Clinical Pharmacology® were selected because they report adverse reactions by individual drug and by drug class. Adverse reactions were reported by systems. For example, gastrointestinal system included, but was not limited to, abdominal pain, abdominal distress, constipation and diarrhoea. Neuromuscular and skeletal system included, among other events, arthralgia, arthritis, muscle spasm, myalgia and myopathy. An event rate for each adverse event was estimated by aggregating the range of occurrence for each event; events were then classified into systems.

For each adverse event system, we collected the frequency for each MedDRA® PT code and calculated the rank order for each coded adverse event and the percentage of occurrence. To provide category comparisons between sources, adverse event categories were combined into larger categories analogous to MedDRA® SOCs. For example, MedDRA® ‘cardiac disorders’ were combined with ‘vascular disorders’ to be comparable to the DID clinical category of ‘cardiovascular’ events. To allow for appropriate comparison, the MedDRA® category ‘endocrine disorders’ was combined with MedDRA® ‘metabolism and nutrition disorders’ because diabetes mellitus is contained in ‘metabolism and nutrition disorders’ in MedDRA® but in the endocrine disorders category of DIDs. Allergic reactions are contained in the MedDRA® category ‘immune system disorders’.
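The category merging described above amounts to a lookup from MedDRA® SOC names to combined labels followed by a sum. A minimal Python sketch is given below. Only the two merges named in this paragraph are included, the combined labels are chosen for illustration, and the counts are invented.

```python
from collections import Counter

# SOC -> combined category; only the two merges described in the text.
COMBINED = {
    "Cardiac disorders": "Cardiovascular",
    "Vascular disorders": "Cardiovascular",
    "Endocrine disorders": "Endocrine and metabolic",
    "Metabolism and nutrition disorders": "Endocrine and metabolic",
}


def combine_categories(soc_counts):
    """Sum report counts after mapping each SOC to its combined category.
    SOCs without an entry in COMBINED are kept unchanged."""
    combined = Counter()
    for soc, count in soc_counts.items():
        combined[COMBINED.get(soc, soc)] += count
    return dict(combined)


# Hypothetical counts, invented for illustration.
example_counts = {
    "Cardiac disorders": 3,
    "Vascular disorders": 2,
    "Endocrine disorders": 1,
    "Metabolism and nutrition disorders": 4,
    "Eye disorders": 5,
}
merged = combine_categories(example_counts)
```

Keeping the mapping in one table makes it easy to apply the same merges to every source before comparing them.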

2.4 Systematic Review Identification

A large number of systematic reviews have been carried out on statins; however, we were concerned only with those containing an adequate amount of data specifically on the adverse events of statins. We conducted a search for systematic reviews with a focus on the adverse events of statins on 22 January 2020 on Epistemonikos. The search included generic and trade names for statins as well as terms for adverse events (Box 1). There were no date or language limits applied to the searches and the publication type selected was ‘Systematic Review’.

3 Data Analysis

3.1 Comparison Metrics

We took a multistep approach, as previously used [11]. The first step involved comparing the categories of adverse events mentioned. The second step involved calculating the rank ordering of adverse event frequencies in order to assess agreement. This approach uses data from the treatment arm only of the systematic reviews, with no information taken from the control arm; this makes it more comparable to the data collected from tweets and regulatory data, where a control arm is not available. The third step involved calculating the frequency of adverse events relative to one another.

3.2 Range of Adverse Event Mentions

We compared the named events in regulatory data, DIDs and systematic reviews with those mentioned in Twitter to ascertain whether some adverse events are recorded in one source but not the other. We also compared the primary SOC categories mentioned in one or more sources.

3.3 Frequency and Frequency Ranking

Frequencies were calculated as the absolute percentages of all reports from one source in a given SOC category. For each source, we ranked the SOC categories in order of most common to least common. We could not aggregate the systematic reviews with this metric so we presented the ranks for individual reviews.
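As an illustration of this step, the percentage and rank of each SOC category within one source could be computed as in the sketch below. The counts are invented, and the alphabetical tie-breaking rule is an assumption, since the handling of ties is not stated.

```python
def soc_frequencies(counts):
    """Percentage of all reports from one source falling in each SOC category."""
    total = sum(counts.values())
    return {soc: 100 * n / total for soc, n in counts.items()}


def soc_ranks(counts):
    """Rank categories from most common (1) to least common.
    Assumption: ties are broken alphabetically by category name."""
    ordered = sorted(counts, key=lambda soc: (-counts[soc], soc))
    return {soc: position + 1 for position, soc in enumerate(ordered)}


# Hypothetical report counts for a single source, invented for illustration.
source_counts = {
    "Musculoskeletal and connective tissue disorders": 50,
    "General disorders and administration site conditions": 30,
    "Nervous system disorders": 20,
}
percentages = soc_frequencies(source_counts)
ranking = soc_ranks(source_counts)
```

Because each percentage uses the source's own total as the denominator, sources with very different report volumes can still be compared by rank.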

3.4 Relative Frequency Comparison

To compare the relative magnitude of differences between the adverse event categories in Twitter, FAERS, MHRA and DIDs, we computed the relative frequencies of adverse event categories as demonstrated previously [9]. Dermatologic disorders (skin and subcutaneous tissue disorders) were selected as the index comparator with a value of ‘1’ because skin conditions may be detectable by patients and tweetable as well as readily detected by medical care providing an equal opportunity for reporting by any source. To calculate a relative adverse event category, the percentage reporting that specific event was divided by the percentage reporting a dermatologic event. For example, the frequency of cardiovascular events in FAERS was 3.11% and dermatologic events were 4.99%, resulting in a relative frequency of 0.62 for cardiovascular events.
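The relative frequency calculation is a single division by the index category. The short Python sketch below reproduces the FAERS worked example from the text, using only the two percentages quoted there; the function name and the combined label `Cardiovascular` are chosen for illustration.

```python
INDEX = "Skin and subcutaneous tissue disorders"


def relative_frequencies(percentages, index=INDEX):
    """Divide each category's percentage by the index category's percentage,
    so that the index category itself has a value of 1."""
    base = percentages[index]
    if base == 0:
        raise ValueError("index category has no reports; ratio is undefined")
    return {category: value / base for category, value in percentages.items()}


# The two FAERS percentages quoted in the text.
faers = {"Cardiovascular": 3.11, INDEX: 4.99}
faers_relative = relative_frequencies(faers)
cardiovascular_ratio = round(faers_relative["Cardiovascular"], 2)  # 0.62
```

A value above 1 means the category is reported more often than dermatologic events in that source, and a value below 1 means it is reported less often.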

4 Results

4.1 Tweet Collection

The search of our database returned 16,338 statin tweets, of which 12,649 were unique from 9116 users. We removed from analysis 748 tweets not related to a statin (typos, spelling mistakes or another use of the statin name), 31 tweets that were computer-generated posts (such as bots) and 18 that were mainly in a non-English language. This left 11,852 posts for analysis from which 401 AEs were extracted in total. After removing duplicate mentions of the same AE expressed by the same user in different tweets, 356 AEs remained. There are 166 unique LLT terms that mapped to 119 unique PT terms. These mapped to 19 of the 27 primary SOC codes.

4.1.1 Tweets on Specific Statins

The pattern in the proportion of mentions of specific statin adverse events approximately matches the proportion of total mentions for each specific statin (Fig. 1). Thus, those drugs most mentioned on social media were also those drugs with the most adverse events mentioned.

Fig. 1 Mentions of specific statins and statin ADRs in Twitter and prescription rates in the US and UK. *https://openprescribing.net/, **https://clincalc.com/DrugStats/Top200Drugs.aspx

Prescription rates show that atorvastatin and simvastatin dominate the markets in the US and England (Fig. 1). The prescription rates of statins in the US and UK, however, do not appear to be in line with the mentions on Twitter. For example, simvastatin is the second most commonly prescribed statin yet had relatively few mentions on Twitter. On the other hand, rosuvastatin was the second most commonly mentioned statin on Twitter but has relatively few prescriptions (Fig. 1). The tweet rate, therefore, may be only partially explained by the number of people taking each statin and may also be related to events such as rosuvastatin becoming available as a generic in 2016, which statin a celebrity is prescribed, a news item or even a TV commercial.

4.2 Drug Regulatory Data

From FAERS, a total of 45,447 reports of lovastatin, simvastatin, pravastatin, fluvastatin, atorvastatin, rosuvastatin and pitavastatin were identified from 2013 to 2018, including 2360 unique PTs. The 11 most frequent PTs were myalgia (2189, 4.82%), rhabdomyolysis (1154, 2.54%), drug interaction (1013, 2.23%), fatigue (867, 1.91%), muscle spasm (739, 1.63%), pain in extremity (710, 1.56%), arthralgia (683, 1.50%), asthenia (644, 1.42%), muscular weakness (607, 1.34%), acute kidney injury (524, 1.15%) and dizziness (518, 1.14%). Of the 2360 PTs, 18 PTs occurred at 1% or greater. Of the 18, five were related to muscle pain and weakness. Frequencies and ranks are displayed in Table 3.

Table 3 Adverse event categories in Twitter, MHRA, FAERS, DIDs and systematic reviews

From the MHRA database we identified 10,415 adverse drug reports from 2013 to 2018. The majority were from atorvastatin (4962) and simvastatin (4149). These reports were allocated to the appropriate SOC categories (Table 3).

4.3 Drug Information Databases (DIDs)

The frequencies of adverse events for the drug class ‘statins’ in DIDs are reported as percentages and illustrated in Fig. 2 and Table 3 alongside the other data sources with ranks. Gastrointestinal disorders are the most frequently reported event in DIDs, while in FAERS, MHRA and Twitter they are ranked fourth most frequent. Musculoskeletal and connective tissue disorders are the second most frequent in DIDs and the most frequent in FAERS, MHRA and Twitter. In the general disorders and administration-site conditions category of MedDRA®, the most frequent events are drug interactions, fatigue, asthenia and drug ineffective (data not shown). Drug interactions are not quantified in DIDs, although a dichotomous drug interaction report can be obtained. The next two are not listed individually in the DIDs, but are likely contained within other categories; for example, ‘fatigue’ may be captured in muscle fatigue. The final sub-category in general disorders, drug ineffective, is not reportable in DIDs; thus, the second most frequent event in other data sources is not reportable in DIDs. Across all data sources, nervous system disorders ranked third. For the top four categories, Twitter, FAERS and MHRA were in agreement with one another, and DIDs agreed on three of the four.

Fig. 2 Reports in Twitter, MHRA, FAERS and DIDs by MedDRA Primary SOC category as a percentage of all reports from that source. +MedDRA metabolism and nutrition disorders (includes diabetes) combined with endocrine disorders to make endocrine and metabolic disorders in DIDs. ++MedDRA renal and urinary disorders and reproductive system disorders combined to make DID category genitourinary disorders. +++MedDRA vascular disorders combined with cardiac disorders to make DID cardiovascular disorders

4.4 Relative Frequency Comparison of Twitter, FDA Adverse Event Reporting System (FAERS), Medicines and Healthcare products Regulatory Agency (MHRA) and DIDs

Calculating a rank for each adverse event that was relative to the index category ‘skin and subcutaneous tissue disorders’ for that source allowed us to compare the relative magnitude of an adverse event across all data sources. In other words, was it higher across all sources than the index, or was it reported inconsistently? For example, the most frequent event, ‘musculoskeletal and connective tissue disorders’, was mentioned 21.7 times the index in Twitter, 3.82 times the index in FAERS, 3.61 times the index in MHRA and 5.96 times the index in DIDs. Although none of the sources match the magnitude of Twitter, the relative magnitude reported across all was consistently higher than the index (Fig. 3).

Fig. 3 Relative frequencies of system categories by adverse event data source. Categories are based on MedDRA system organ class categories. Index is defined as dermatologic category (y axis = 1). Categories reported relatively more frequently in Twitter than Index are above 1. For DIDs, adverse event categories were combined to aggregate categories analogous to MedDRA SOCs: blood and lymphatic system disorders + neoplasms = hematologic; cardiac disorders + vascular disorders = cardiovascular; endocrine disorders + metabolism and nutrition (combined to include diabetes in all sources); immune system contains allergic reactions including anaphylaxis

With respect to ‘general disorders and administration-site conditions’ and ‘nervous system disorders’, they were both relatively ranked above the index in all sources; however, again Twitter was much higher.

4.5 Systematic Reviews

The search for systematic reviews retrieved 331 hits. After sifting the titles and abstracts we ordered the full-text articles of 48 papers. On evaluation of the full-text articles, 23 systematic reviews passed our inclusion criteria (Appendix 1, see electronic supplementary material). Those excluded were either not systematic reviews, in a non-English language, or compared combination therapy rather than statin monotherapy.

Seventeen systematic reviews presented the data in terms of rates of adverse events, most commonly in the treatment and control arm. Three systematic reviews presented odds ratios, risk ratios or risk difference only; one review presented the results as events/patient-years and another two reviews merely mentioned the specific adverse events with no numerical data. All but one systematic review limited their inclusion criteria to clinical trials.

4.5.1 Range of Adverse Event Mentions

4.5.1.1 Specific Named Adverse Events

Most adverse events were mentioned in both the systematic reviews and Twitter. However, ‘cancer’ was mentioned only in the systematic reviews, and a few adverse events were mentioned only in Twitter. Of these, the events mentioned in at least five tweets were ‘hypersensitivity’, ‘muscle atrophy’, ‘arthralgia’, ‘muscular weakness’, ‘abnormal dreams’, ‘memory impairment’, ‘disability’, ‘mental impairment’, ‘dementia’, ‘memory loss’ and ‘flatulence’.

4.5.2 Absolute Frequency and Frequency Ranking

4.5.2.1 Specific Named Adverse Events

We were only able to obtain the rank order of adverse events from 15 systematic reviews for the statin class. The top ranking adverse effect was myalgia in six reviews, hepatic dysfunction in three reviews, cancer in two reviews and death, myopathy, kidney problems or gastrointestinal issues were each the most common in one review. In Twitter, myalgia also ranked first as the most common adverse effect mentioned. However, cancer was not mentioned on social media, hepatic dysfunction ranked 89th (although hepatic enzyme increase was 6th), death ranked 29th, myopathy 28th, kidney disease 96th and gastrointestinal tract infection 80th.

We were able to conduct a more detailed analysis using the rates from Roberts 2007 (Fig. 4), as this review presented rates for the treatment and control, reported by organ group. With this review we summed the adverse events from the RCTs in each category to calculate the absolute percentage difference for statin versus control and then presented the rank order of attributable frequency. This demonstrates that when analysis is conducted with inclusion of a control group then rank order of adverse event categories can change. However, ‘musculoskeletal and connective tissue disorders’ remains high.
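The attributable-frequency calculation can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: it assumes that events and participants are summed across trials within each organ group before the percentage difference is taken, and every number in it is invented rather than taken from Roberts 2007.

```python
def risk_differences(trial_rows):
    """Pool events and participants per category across trials, then return
    the absolute percentage difference (statin minus control)."""
    totals = {}
    for row in trial_rows:
        pooled = totals.setdefault(row["category"], [0, 0, 0, 0])
        pooled[0] += row["statin_events"]
        pooled[1] += row["statin_n"]
        pooled[2] += row["control_events"]
        pooled[3] += row["control_n"]
    return {
        category: 100 * (s_events / s_n - c_events / c_n)
        for category, (s_events, s_n, c_events, c_n) in totals.items()
    }


def rank_by_difference(differences):
    """Order categories from largest to smallest attributable frequency."""
    return sorted(differences, key=lambda category: -differences[category])


# Hypothetical trial data, invented for illustration.
trials = [
    {"category": "Musculoskeletal", "statin_events": 10, "statin_n": 100,
     "control_events": 5, "control_n": 100},
    {"category": "Musculoskeletal", "statin_events": 20, "statin_n": 100,
     "control_events": 5, "control_n": 100},
    {"category": "Gastrointestinal", "statin_events": 4, "statin_n": 200,
     "control_events": 6, "control_n": 200},
]
differences = risk_differences(trials)
order = rank_by_difference(differences)
```

A category with many events in both arms can drop in rank once the control arm is subtracted, which is the effect described above.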

Fig. 4 Risk difference for adverse drug reactions in those taking statins versus control in randomised controlled trials

4.6 MedDRA® Primary System Organ Class (SOC) Code Comparisons among Twitter, FAERS, MHRA, DIDs and Systematic Reviews

With regard to the MedDRA® SOC codes, Twitter data did not contain any adverse events as a primary SOC code for eight categories: ‘Vascular disorders’, ‘Blood and lymphatic system disorders’, ‘Congenital, familial and genetic disorders’, ‘Endocrine disorders’, ‘Neoplasms benign, malignant and unspecified (including cysts and polyps)’, ‘Pregnancy, puerperium and perinatal conditions’, ‘Product issues’ and ‘Surgical and medical procedures’. These categories all had frequencies of < 1.5% in FAERS and MHRA and so this may be a product of the lower counts overall of adverse events in Twitter.

Regulatory data had reports of adverse events for all 27 primary SOC codes, whilst DIDs data were not available for 10 codes: ‘General disorders and administration site conditions’, ‘Psychiatric disorders’, ‘Investigations’, ‘Ear and labyrinth disorders’, ‘Social circumstances’, ‘Congenital, familial and genetic disorders’, ‘Neoplasms benign, malignant and unspecified (including cysts and polyps)’, ‘Pregnancy, puerperium and perinatal conditions’, ‘Product issues’ and ‘Surgical and medical procedures’.

For some of these categories not in DIDs there was a reasonable explanation. For example, the ‘General disorders and administration site conditions’ category is not reportable in DIDs as the most frequent events in this category are listed by MedDRA® as ‘drug interactions’, ‘fatigue’, ‘asthenia’ and ‘drug ineffective’. ‘Drug interactions’ are not quantified in DIDs, although a dichotomous drug interaction report can be obtained. ‘Fatigue’ and ‘asthenia’ are not listed individually in the DIDs, but are likely contained within other categories. For example, ‘fatigue’ may be captured in muscle fatigue. The final sub-category in general disorders, ‘drug ineffective’, is not reportable in DIDs; thus, the second most frequent event in other data sources is not reportable in DIDs. Ear and labyrinth disorders may include dizziness as an inner ear disorder, while dizziness is reported as a central nervous system disorder in DIDs.

The systematic reviews tended to focus on adverse events specified within their protocol or inclusion criteria and thus did not have the range of adverse events reported elsewhere. Within the 17 systematic reviews there were 12 categories of MedDRA® adverse events that were not mentioned: ‘Immune system disorders’, ‘Cardiac disorders’, ‘Injury, poisoning and procedural complications’, ‘Social circumstances’, ‘Vascular disorders’, ‘Blood and lymphatic system disorders’, ‘Congenital, familial and genetic disorders’, ‘Endocrine disorders’, ‘Infections and infestations’, ‘Pregnancy, puerperium and perinatal conditions’, ‘Product issues’ and ‘Surgical and medical procedures’.

4.6.1 MedDRA® Primary SOC Codes

The sequence of the top three categories (‘Musculoskeletal and connective tissue disorders’, ‘General disorders and administration site conditions’ and ‘Nervous system disorders’) is identical in Twitter, MHRA and FAERS, and a similar ranking is seen throughout.

The order of the categories of adverse events in DIDs had a similar overall pattern but 11 of the 27 primary SOC codes were not reported and ‘Gastrointestinal disorders’ were by far the most common. In addition, six primary SOC codes were combined into three. ‘Metabolism and nutrition disorders’ (includes diabetes) was combined with ‘Endocrine disorders’, ‘Renal and urinary disorders’ with ‘Reproductive system disorders’ and ‘Vascular disorders’ with ‘Cardiac disorders’. This was performed because conditions such as diabetes are listed in some sources as an endocrine disorder, but are a metabolism and nutrition disorder in MedDRA®. Cardiac conditions are frequently listed as cardiovascular in medical literature but are separated in MedDRA® as cardiac conditions and also vascular conditions. Medical literature also reports genitourinary conditions, while MedDRA reports the two separately.

NB: categories based on MedDRA® System Organ Class Categories. Dermatologic category was selected as Index (designated at y axis = 1) because conditions may be detected by patients and clinicians, and are readily tweetable (equal opportunity for reporting). Categories reported relatively less frequently in Twitter than Index are below 1 (above 0). Categories reported relatively more frequently in Twitter than Index are above 1.

For DIDs, the following adverse event categories were combined to aggregate categories analogous to MedDRA® SOCs: blood and lymphatic system disorders + neoplasms, cardiac disorders + vascular disorders (combined to include cardiovascular conditions such as hypertension) and endocrine disorders + metabolism and nutrition (combined to include diabetes in all sources).

DID data are pooled from multiple studies and may include post-marketing reports.

*Immune system contains allergic reactions including anaphylaxis.

5 Discussion

Comparing data collected from social media, regulatory data or DIDs with that obtained from systematic reviews is challenging [11]. Adverse event reports are voluntary on the part of healthcare professionals and patients. Additionally, the exact number of medication users is not known so it is difficult to estimate prevalence of an event in the complete population. Events reported in systematic reviews may be reported as rates, proportions or measures of risk such as odds ratios—all of which are difficult to compare head to head. Misspellings add another layer of complexity. We set out to harmonize reports since Twitter users state adverse events in lay language that may not match formal sources.

This comparative study emphasises the similarities and differences between data available on the adverse events of statins from Twitter, regulatory data from the US and the UK, clinical drug information databases and systematic reviews. Whilst there is a large degree of similarity, particularly between Twitter and regulatory data, we found that patients are far more likely to complain about musculoskeletal symptoms such as muscle spasms, muscle pain and muscle fatigue on social media than in any other data source. Other modalities appear to capture very different items that are of interest to healthcare professionals who report to FAERS or MHRA. Trials that are included in systematic reviews tend to reflect the interests of researchers or trial investigators as to what adverse events are recorded and reported. The more serious adverse events of death and cancer were presented in more detail and relatively more frequently in systematic reviews.

Previous studies have found that the adverse events mentioned in social media are also documented elsewhere (such as regulatory data or published trials) [17] and this is what we found in the current study. However, we looked beyond which adverse events are mentioned and looked at the ranking of the adverse events reported. Some other studies also compare the number or ranking of adverse events from social media and other data sources [21,22,23,24,25] and demonstrate agreement at the SOC level [21]. Although a tendency for higher frequency of adverse events to be recorded in social media than consumer-reported regulatory data has been noted previously [21], in the current study we did not restrict our regulatory data to those reported by consumers only and thus we found more reports from regulatory data than social media.

A previous study that examined the consistency between the literature and social media identified a largely similar pattern with adverse events when compared by SOC, but not when detailed specific adverse events were studied [26]. The largest disparity seen was between social media and RCTs, which is much the same in the current study.

Systematic reviews have previously been criticized in relation to harms reporting [27]. It has become well known that systematic reviews tend to focus on serious adverse events, and that uncommon adverse events or those deemed unrelated to the intervention are underreported. This may, in part, be explained by the reporting in RCTs on which many systematic reviews are based. We found also that systematic reviews were selective in the adverse events they studied.

Similar to systematic reviews, the DIDs are compiled from primary literature. These databases are used by clinicians, frequently at the point of care. Advantages include the high level of evidence, the immediate availability of information and drug class comparisons when available. DID information may be limited by the primary literature that has been evaluated and reported, and by the (lack of) timeliness of updates. We found that the DIDs were in relative agreement with Twitter in patient sentiment categories such as gastrointestinal complaints (dyspepsia), central nervous system complaints (headache) and neuromuscular complaints (myalgia).

The differences we have seen may reflect patient perceptions or indicate signals of medication-related problems. For instance, social media posts may be highly correlated with press articles and controversy surrounding the adverse events of statins. To investigate if results were magnified by awareness, it may be worth assessing a drug with new safety concerns identified after approval. Activities in Twitter before and after the safety communication can then be compared to explore the possible influence of awareness. Our study, however, goes beyond just identifying new adverse events by highlighting which adverse events are most discussed by social media users, and which may therefore be the most influential in patients’ decision whether to start or to continue taking their medication. For instance, concerns about myalgia and other symptomatic adverse events can strongly influence patient adherence to daily lifelong tablets.

The full value of social media, specifically Twitter in this study, in mining medication-related adverse events is still being explored. Several studies have reviewed and made recommendations for the use of social media in pharmacovigilance, and reasonably conclude that it is a fledgling process [15, 16, 26, 28,29,30]. The results of this study add to our similar study of Humira (adalimumab), which found moderate agreement between the pharmacovigilance sources, although different methods of reporting made interpretation and comparison challenging [11]. This study found that plausible adverse events in Twitter were substantiated in other sources, and when the size of the Twitter sample is taken into account, the level of agreement is striking. It may therefore be time to scale up case studies of pharmacovigilance on Twitter to assess the generalizability of our findings. Our study presented a more robust picture of adverse events by examining multiple sources that included spontaneous reporting systems, clinical trials, observational trials, drug information sources and social media. While challenges remain when trying to utilize adverse event data from social media, methods to extract posts are maturing. Patient perspectives not available in other sources may add to existing pharmacovigilance systems, and further work could include methods to operationalize a complementary system.

In terms of study limitations, comparing categories between sources requires aggregating granular categories into larger, less precise categories. The larger categories may, however, provide a signal of interest where frequently mentioned or reported categories may be cause to initiate a more specific investigation. The larger categories may also provide perspective on events important to patients.

6 Conclusion

Twitter, regulatory data from the US and the UK, clinical drug information databases and systematic reviews all provide a partial picture of the statin adverse event profile. In particular, Twitter, FAERS and MHRA are strikingly similar in the rank order of adverse events reported. However, Twitter reports a higher relative frequency of neuromuscular, general and nervous system adverse events. Systematic reviews and DIDs, on the other hand, may provide a more complete picture of serious adverse events such as cancer and cardiovascular events. Social media may provide information about medications that supports what is known and also potentially useful information not readily available from traditional sources including adverse events of most concern to patients.