Introduction

Medication-related problems can lead to significant harm and impact on quality of life. In Australia, around 2% to 3% of all hospital admissions can be attributed to medicine-related problems while around 10% of patients visiting a medical practitioner have experienced an adverse medication event in the last 6 months [1]. Efforts to improve medication safety focus on both identification and quantification of previously unrecognised adverse events of medicines and the clinical risk factors for those events, as well as identification of inappropriate medicine use in practice. Knowledge about the safety of medicines and use of medicines in high-risk populations can help to inform interventions to improve the use of medicines.

In pharmacoepidemiology, the goal of medication safety studies is to generate knowledge about the risks associated with medicine use and factors that may modify that risk. Large-scale health care administrative databases are a convenient source of information to generate this knowledge as they contain individual patient-level data on health services claimed for many millions of patients. These data are routinely collected in many countries with universal health systems and by large private health insurers. Medication safety assessments using these data can be made by linking exposure data, i.e. medicine dispensing data, to outcome data, i.e. hospital admission data; however, this is not always possible, practical or timely, particularly if hospital services and pharmaceutical services are subsidised by different payers and require external linkages to bring the datasets together. Regardless of the ability to link to other data sources, individual patient-level dispensing datasets are available in many countries [2]. These data contain information on the medication dispensed, date of supply, quantity supplied and dose. For example, Australia maintains a national dispensing dataset of medicines subsidised under its Pharmaceutical Benefits Scheme (PBS), with the collection beginning in 1990 and patient linked since 2002 [3]. The Nordic countries all maintain nationwide prescription datasets [4], as do New Zealand [5], Scotland [6], Ireland [7] and the Asian countries Korea, Taiwan and Japan [8, 9], among others.

When only patient-level dispensing data are available, there are two key opportunities for medication safety assessment: medicine utilisation studies and medication safety assessments. Medicine utilisation studies have long been used in pharmacoepidemiology to identify issues with medicine use at the population level that may be indicative of known safety issues such as use in contraindicated populations, use outside of indication (off-label) or subsidy restriction, inappropriate treatment sequences, inappropriate dosing or prevalence of interacting medicines. Medicine utilisation studies can help to identify the potential over-use or under-use of medicines, particularly where they can be compared with estimates of disease prevalence or incidence for the population under study. These studies are an effective strategy to enhance medication safety as they can identify opportunity for intervention and promotion of quality use of medicines. For example, Polluzi et al. [10] used dispensing data across 13 different European countries to determine the extent of use of antihistamine medicines that were associated with safety signals identified through spontaneous reports. Where use is considered to be high, this can then prompt risk minimisation activities by regulators and clinicians. In Australia, the Australian Government Department of Veterans’ Affairs (DVA) funds an ongoing health promotion-based program, Veterans’ Medicines Advice and Therapeutic Education Services (Veterans’ MATES), which implements interventions to improve use of medicines in the Veteran community [11]. Veterans’ MATES interventions are developed based on issues identified in drug utilisation reviews of medication dispensing data and practice change after implementation of the intervention is evaluated [12, 13].

While medicine utilisation studies in large-scale medication dispensing datasets are frequent, fewer studies have associated medicine use with adverse outcomes using only these data. A review of studies utilising the Australian PBS data [3] identified 50 studies that used individual level dispensing data. The majority of these studies (27 studies) examined trends in utilisation and impacts of interventions while only 9 correlated medicine use with outcomes. Similarly, a review of the use of claims data in the Nordic countries [4] including 515 studies from Denmark, Finland, Iceland, Norway and Sweden found that medicine utilisation studies accounted for 44% of all studies with the remainder investigating the effectiveness or safety of medicines; however, many of these linked dispensing data to other datasets to identify outcomes. In Scotland, dispensing of medicines publicly funded by the National Health Service is captured in the Prescribing Information System. These data have been used predominantly for drug utilisation studies [6].

There are two key barriers to generating evidence of medication safety when only dispensing data are available. The first is the lack of coded diagnosis data to identify adverse events and the second is the lack of data on potential confounders. To address these issues, medication safety assessments can exploit the fact that symptoms of adverse events of medicines can be treated with other medicines, and can implement self-controlled designs that eliminate the need to numerically control for confounding.

The aim of this article is to provide an overview of studies that have investigated issues of medication safety using only medication dispensing datasets. We investigate the outcomes assessed by these studies and the medication ‘proxies’ used to define these outcomes. We also examine the methodologies that have been used to generate evidence of medication safety with a particular focus on adjustment for unmeasured confounding. Lastly, we discuss the potential for ongoing work in this area.

Medicine Safety Studies in Dispensing Data

One of the barriers to medicine safety assessment in dispensing data is the lack of diagnosis data to identify adverse events. In order to use dispensing data for safety assessment, one can utilise the fact that symptoms of adverse effects of medicines can be treated in primary care and do not always result in hospitalisation. When the symptoms of an adverse event are misinterpreted as the development of a new unrelated condition and treated with another medicine, this is sometimes referred to in the literature as a prescribing cascade [14]. Examples include dry cough associated with angiotensin-converting enzyme (ACE) inhibitors, which may be managed by antitussive medication, and urinary incontinence associated with cholinesterase inhibitor use for dementia, which may be managed with oxybutynin. Prescribing cascades can be utilised in safety assessments of medicines by examining the rate of initiation of ‘proxy’ medicines that can be used to treat adverse event symptoms, after initiation of the index medicine.
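As a minimal sketch of how initiation of a 'proxy' medicine after an index medicine might be flagged in dispensing records, the following Python example treats a patient's first recorded dispensing of a medicine as its initiation. The record layout, the medicine labels, the function names and the 365-day window are all illustrative assumptions; a real study would also need to decide how to handle patients whose history before the start of the dataset is unknown.

```python
from datetime import date, timedelta


def first_dispensings(records):
    """Return {(patient_id, medicine): earliest dispensing date}.

    Assumption: the first dispensing seen in the data is treated as initiation.
    """
    first = {}
    for patient_id, medicine, supplied in records:
        key = (patient_id, medicine)
        if key not in first or supplied < first[key]:
            first[key] = supplied
    return first


def possible_cascades(records, index_medicine, proxy_medicine, window_days=365):
    """List patients who start the proxy medicine after, and within
    window_days of, starting the index medicine (hypothetical window)."""
    first = first_dispensings(records)
    window = timedelta(days=window_days)
    flagged = []
    for (patient_id, medicine), index_start in first.items():
        if medicine != index_medicine:
            continue
        proxy_start = first.get((patient_id, proxy_medicine))
        if proxy_start is not None and index_start < proxy_start <= index_start + window:
            flagged.append(patient_id)
    return sorted(flagged)


# Hypothetical dispensing records: (patient, medicine, date of supply).
records = [
    ("p1", "ace_inhibitor", date(2020, 1, 10)),
    ("p1", "antitussive", date(2020, 3, 2)),
    ("p2", "antitussive", date(2019, 11, 5)),
    ("p2", "ace_inhibitor", date(2020, 2, 1)),
    ("p3", "ace_inhibitor", date(2020, 5, 20)),
]
print(possible_cascades(records, "ace_inhibitor", "antitussive"))
```

Only patient p1 is flagged here, because p2 started the antitussive before the ACE inhibitor and p3 never received an antitussive. A count of flagged patients alone does not indicate an association; it is the raw material for the comparative designs discussed in this article.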

In addition to the lack of diagnosis data for medication safety studies in dispensing data, another limitation is the lack of available information on potential confounders. The advent of self-controlled designs that inherently control for confounders that do not vary over time by making within-person comparisons has meant that safety studies can be implemented in dispensing data without requiring information on measured confounders. Self-controlled study designs include the case-crossover [15], the self-controlled case series [16] and sequence symmetry analysis [17]. All of these designs share a similar characteristic, which is that the patient’s risk of experiencing an adverse event after exposure is compared with another time in that patient’s history (or their future) when they were not exposed. This comparison to oneself rather than to another individual eliminates the need to numerically adjust for patient characteristics that do not vary over time such as gender, disease severity or frailty, provided the follow-up time is short.

Sequence Symmetry Analysis

Of the class of self-controlled designs, by far the most frequently used in medication dispensing data has been the sequence symmetry analysis (SSA) [17, 18•]. SSA aims to identify a pattern in medicine use that suggests a medicine indicative of treatment of an adverse event is initiated more often after exposure than prior to exposure. The statistic of interest is the sequence ratio, which has been described as the ratio of the rate of events in exposed individuals compared to a similar unexposed population in the period of time before the exposure [19]. We identified 31 published SSA studies that used initiation of a medicine as an indicator of an adverse event. All published SSA studies and the indicator medicines that were used are summarised in Table 1. Examples include initiation of antidepressants as an indicator of depression [17, 42, 43], initiation of antitussives as an indicator of dry cough [49, 51] and initiation of glaucoma eye drops as an indicator for glaucoma [53]. Initially, the method was used to explore specific hypotheses; however, of the studies published in the last 5 years, nearly half used a hypothesis-generating approach either to determine which medicines were associated with a specified adverse event (e.g. urinary incontinence [26], lower urinary tract infection [27], erectile dysfunction [31] and heart failure [33••]) or to determine potential adverse events associated with a specific medicine (e.g. adverse events associated with novel oral anticoagulants (NOACs) [56••, 57••], or cholinesterase inhibitors [55]). In the studies investigating novel oral anticoagulants, the aim of the analysis was to use SSA as a pharmacovigilance tool to generate safety signals for the newly marketed medicines. In addition to bleeding or stroke events associated with NOACs, which have been well studied in the literature, these studies identified potential safety signals including constipation, depression and nausea [56••, 57••].

Table 1 Tabular summary of published studies using the prescription sequence symmetry analysis design to investigate issues of medication safety using only dispensing data

Self-Controlled Case Series and Case-Crossover

Very few studies were identified in the literature that used either the self-controlled case series (SCCS) or the case-crossover design where the outcome was identified through dispensing data only. A systematic review of applications of either SCCS or case-crossover (CCO) [61] identified no studies that used only dispensing data to define outcomes. One SCCS study included in the review [62] used prescriptions to identify depression associated with discontinuation of long-term use of glucocorticoids; however, hospitalisation data was also used in a composite outcome. Another review of case-crossover methodologies in medication safety and effectiveness studies [63] identified 70 empirical applications of the method, but included only one study that used medicines dispensing data only to define the outcome. This study [64] examined the risk of flare of inflammatory bowel disease, defined as initiation of a corticosteroid, associated with antibiotic use.
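To illustrate how a case-crossover comparison could be set up when both exposure and outcome come from dispensing records, the following Python sketch classifies each case by whether the exposure medicine was dispensed in a hazard window immediately before the outcome date and in an earlier control window of the same length. The odds ratio is estimated as the ratio of discordant cases, which is the standard estimator for one hazard and one control window per person. The window lengths, function names and example dates are assumptions made for illustration and are not taken from the cited studies.

```python
from datetime import date, timedelta


def exposed(dispensing_dates, start, end):
    """True if any dispensing falls in the half-open interval [start, end)."""
    return any(start <= d < end for d in dispensing_dates)


def case_crossover_or(cases, hazard_days=30, lag_days=90):
    """cases: list of (outcome_date, [exposure dispensing dates]).

    The hazard window is the hazard_days before the outcome; the control
    window has the same length and ends lag_days before the outcome.
    Returns (hazard_only, control_only, odds_ratio).
    """
    if lag_days < hazard_days:
        raise ValueError("control window would overlap the hazard window")
    hazard_only = 0
    control_only = 0
    for outcome_date, exposure_dates in cases:
        hazard_start = outcome_date - timedelta(days=hazard_days)
        control_end = outcome_date - timedelta(days=lag_days)
        control_start = control_end - timedelta(days=hazard_days)
        in_hazard = exposed(exposure_dates, hazard_start, outcome_date)
        in_control = exposed(exposure_dates, control_start, control_end)
        if in_hazard and not in_control:
            hazard_only += 1
        elif in_control and not in_hazard:
            control_only += 1
        # Concordant cases (both or neither window exposed) carry no information.
    if control_only == 0:
        raise ValueError("no cases exposed in the control window only")
    return hazard_only, control_only, hazard_only / control_only


outcome = date(2021, 6, 30)  # hypothetical date the outcome medicine was started
cases = [
    (outcome, [date(2021, 6, 15)]),                     # hazard window only
    (outcome, [date(2021, 6, 1)]),                      # hazard window only
    (outcome, [date(2021, 3, 15)]),                     # control window only
    (outcome, [date(2021, 3, 20), date(2021, 6, 10)]),  # both windows
    (outcome, []),                                      # never exposed
]
print(case_crossover_or(cases))
```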

Discussion

In this review, we have identified that medication safety studies using dispensing data only are possible; however, a ‘proxy’ for the adverse event is required, as is an appropriate study design that limits the need to adjust for confounding. The advantage of using initiation of medicines as a proxy for adverse events is that safety signals may be able to be identified more rapidly than waiting for spontaneous reports to be lodged. Additionally, dispensing data are often more timely than other types of health claims data. For example, dispensing data in Australia are made available to researchers with a 3-month delay while hospitalisation data can often be delayed by a year or more. Investigating dispensing data for adverse events has the potential to identify less serious events that may not require hospitalisation such as nausea and vomiting. These symptoms are often frequent and can have a detrimental impact on patients’ quality of life, medication adherence and persistence. We found that symmetry analysis, a technique that analyses patterns in treatment initiations to identify safety signals, has been used widely in dispensing data. This technique has been used both as a tool for directed inquiry and for hypothesis generation. Due to its ease of implementation and minimal data requirement, the method has been used as a tool in the development of a rapid post-market surveillance system across the Asia-Pacific region as part of the Asian Pharmacoepidemiology Network (AsPEN) [22, 30•, 35•, 39•]. SSA has been shown to have moderate sensitivity (61%) and high specificity (93%) for detecting known adverse drug reactions [65] and a recent review concluded that it is a promising method for signal detection in administrative health databases [66•].

To enhance the validity of medication safety studies in dispensing data, further research is required to determine the sensitivity and specificity of the medication ‘proxy’ used for identification of adverse events. Table 1 shows that variations exist in the medicines used to define similar outcomes and similar medicines are used to define different outcomes. Since many dispensing systems do not collect reason for prescription, it is often difficult to determine indications for medicines that can be used to treat multiple conditions and this will impact on medication safety assessments. For example, warfarin can be initiated to treat deep vein thrombosis (DVT) or atrial fibrillation. For some medicines, the pattern of medication use post initiation can help to distinguish indication, for example in the case of warfarin, shorter-term treatment may indicate treatment for DVT while longer-term treatment may indicate atrial fibrillation. Safety signals generated by SSA, when using dispensing data only, have been compared to safety signals generated when hospital data are available. For example, a study by Wahab et al. [33] used SSA to examine the association between medicine use and heart failure using both initiation of frusemide as a ‘proxy’ for heart failure and hospital admission for heart failure. Of 397 medicines in which heart failure was not listed in the product information, signals were generated for 12 medicines using heart failure hospitalisation as the outcome and with 9 medicines using frusemide as the ‘proxy’ for outcome. While there were differences in the medicines identified, prostaglandin eye drops were identified in both analyses. A limitation of utilising dispensing data for medicine safety assessment is that there may be no specific pharmacological treatment for some adverse events or the initial medicine may be discontinued when an adverse event has occurred rather than being treated with another medicine. 
Additionally, adverse events of medicines may be managed by treatments not recorded in claims data such as over-the-counter treatments or treatments managed within a health practitioner or emergency department visit, for example an Epi-pen administered for anaphylaxis.
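Where a reference standard is available for a subset of patients, for example a hospital diagnosis in linked data, the sensitivity and specificity of a medication ‘proxy’ could be estimated along the following lines. This Python sketch assumes one pair of true/false flags per patient; the function name and the example flags are hypothetical and do not represent results from any study.

```python
def proxy_accuracy(pairs):
    """Estimate sensitivity and specificity of a medication proxy.

    pairs: iterable of (proxy_flag, reference_flag) booleans per patient,
    where reference_flag is the assumed reference standard (e.g. a coded
    diagnosis in linked data).
    """
    tp = fp = fn = tn = 0
    for proxy_flag, reference_flag in pairs:
        if reference_flag and proxy_flag:
            tp += 1
        elif reference_flag and not proxy_flag:
            fn += 1
        elif not reference_flag and proxy_flag:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need patients with and without the reference condition")
    return {
        "sensitivity": tp / (tp + fn),  # proxy-positive among true cases
        "specificity": tn / (tn + fp),  # proxy-negative among non-cases
    }


# Hypothetical flags: 4 patients with the condition, 5 without.
pairs = (
    [(True, True)] * 3 + [(False, True)] * 1
    + [(False, False)] * 4 + [(True, False)] * 1
)
print(proxy_accuracy(pairs))
```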

Future directions for developing the potential for medication safety studies utilising dispensing data only should concentrate on identifying valid and standardised ‘proxies’ for particular adverse events. Comorbidity scores that use algorithms of medication dispensing to identify clinical conditions may be one avenue to help in this pursuit. Pharmacy-based measures of comorbidity that identify particular diseases based on patterns of medication use have been in existence since the 1990s. The first of these was the Chronic Disease Score (CDS) [67], consisting of 17 comorbidity categories where medicines were used to identify the presence or absence of the category. A version for the US veteran population, that could be calculated from routine dispensing claims data and was known as the Rx-Risk-V index, was developed in 2003; it consisted of 45 categories of comorbidity [68]. Since then, an updated Rx-Risk index [69] and other comorbidity scores that utilise dispensing data including the Drug Derived Complexity Index (DDCI) [70] and the Medicines Comorbidity Index [71] have all been validated as predictors of mortality in dispensing data and may be useful for identifying particular conditions as indicators of adverse outcomes. Additionally, these scores have the potential to be used to adjust for confounding in cohort studies utilising only dispensing data.
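The general idea behind pharmacy-based comorbidity measures can be sketched as a lookup from dispensed medicine codes to condition categories. The Python example below uses a small, purely illustrative mapping based on Anatomical Therapeutic Chemical (ATC) code prefixes; it is not the published mapping of the CDS, Rx-Risk or any other validated index, and the category names and function names are assumptions.

```python
# Illustrative ATC-prefix mapping only; NOT a validated comorbidity index.
ILLUSTRATIVE_CATEGORIES = {
    "A10": "diabetes",                     # drugs used in diabetes
    "C10": "hyperlipidaemia",              # lipid modifying agents
    "N06A": "depression",                  # antidepressants
    "R03": "obstructive airway disease",   # drugs for obstructive airway diseases
}


def comorbidity_categories(atc_codes):
    """Return the set of illustrative categories indicated by a patient's
    dispensed ATC codes."""
    found = set()
    for code in atc_codes:
        for prefix, category in ILLUSTRATIVE_CATEGORIES.items():
            if code.upper().startswith(prefix):
                found.add(category)
    return found


def category_count(atc_codes):
    """Unweighted count of categories; published indices may apply weights."""
    return len(comorbidity_categories(atc_codes))


# Hypothetical patient: metformin, atorvastatin, sertraline and ramipril.
patient_codes = ["A10BA02", "C10AA05", "N06AB06", "C09AA05"]
print(sorted(comorbidity_categories(patient_codes)), category_count(patient_codes))
```

In this toy example ramipril (C09AA05) matches no category because the illustrative mapping omits it, which shows how sensitive such scores are to the mapping chosen.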

While self-controlled designs are advantageous when data on potential confounders are not captured, as is the case with dispensing only data, they do have important assumptions that must be met and these can be restrictive to their use in these data. For example, the SCCS assumes that outcome events are rare or independent of each other, which may not be met where medicines are used for chronic conditions. Medicines that can be used intermittently such as antibiotics (indicative of infection) or intermittent pain medications may be more suitable as outcomes for SCCS. The assumption that consecutive events are independent is not applicable for CCO or SSA as only the first outcome is analysed in these studies. Another assumption of self-controlled designs is that the occurrence of the outcome event should not affect the likelihood of subsequent exposure. This assumption is required for both the SCCS and SSA but is not applicable to the CCO as the CCO design only examines time before an outcome event. In dispensing data, there may be many situations in which this assumption will be violated. For example, if the outcome medicine is contraindicated after initiation of another medicine, studies will be biased towards the null while if the outcome medicine is usually initiated sequentially after another medicine, studies will be biased away from the null. The last assumption of self-controlled designs is that the occurrence of the outcome should not censor the observation period. Again, the CCO is robust to this assumption as only time prior to outcome is analysed. For the SCCS and SSA, this assumption is unlikely to be violated except where death is used as the outcome of interest. For the SCCS, a modified analytic technique has been developed to overcome this problem [72]; however, the SSA cannot be performed at all as clearly there will never be a sequence of events in which medication is initiated after death. 
Lastly, self-controlled designs using medications as a proxy for outcome will suffer from misclassification bias due to uncertainty around the onset date of an insidious outcome, e.g. depression, diabetes or cancer [61], particularly if the decision to treat with a medicine occurs well after a condition is recognised and diagnosed, e.g. diabetes, which may initially be managed through diet.

One of the biggest threats to the use of self-controlled designs for studying medication safety in dispensing data is the assumption of no time-varying confounding. When analysing medication treatment patterns, time-varying confounding can be present either when specific treatments are more likely to be initiated in a particular order as patients age or due to underlying trends in medicine utilisation. As patients age, certain medicines may increase in likelihood of prescription in a particular sequence. For example, anti-dementia medicines will be more likely to be initiated at an older age and therefore after other preventative medicines such as statins. Exposure time trends due to marketing campaigns, new medicines entering the market or removal of medicines from the market can also affect the likelihood of specific treatment orders and can result in spurious relationships in the absence of a causal association. For example, if a medicine indicative of an adverse event is increasingly used over time at the population level, then it is more likely that it will be initiated after the medicine of interest just by chance. All of the self-controlled designs have techniques that can be employed to control for underlying trends in medicine use over time. In SSA, a null sequence ratio can be calculated which estimates the sequence ratio that would be expected due to the underlying trends in medicines in the absence of an association [17, 52]. The effects of age or calendar time on exposure trends can be modelled directly in the SCCS design [73]. In case-crossover studies, a case-time-control design [74] or case-case-time-control design [75] can be implemented. In these designs, the odds ratio from the CCO analysis is divided by the odds ratio for exposure in the risk periods in matched ‘controls’, i.e. those that have not experienced the outcome event at the same calendar time as the case.
While these techniques are available, a comparison of the self-controlled designs [76•] found that results of CCO and SSA methods were robust to exposure time trends while SCCS had residual bias with long-term time trends in both exposure and outcome events.
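The null sequence ratio for SSA can be sketched as follows. This Python example follows one formulation commonly described for the method, in which the expected ordering is derived from population-level counts of initiations per calendar period, and the crude sequence ratio is then divided by this null ratio. The exact formula used in [17, 52] should be checked against those sources; the period binning, function names and counts below are assumptions for illustration.

```python
def null_sequence_ratio(index_starts, proxy_starts, window):
    """Sequence ratio expected from utilisation trends alone.

    index_starts[m], proxy_starts[m]: number of patients initiating the index
    and indicator medicine in calendar period m (e.g. month).
    window: number of periods either side of an index start to consider.
    """
    if len(index_starts) != len(proxy_starts):
        raise ValueError("both series must cover the same periods")
    after = 0   # weighted count of proxy starts following index starts
    before = 0  # weighted count of proxy starts preceding index starts
    for m, n_index in enumerate(index_starts):
        after += n_index * sum(proxy_starts[m + 1 : m + 1 + window])
        before += n_index * sum(proxy_starts[max(0, m - window) : m])
    if before == 0:
        raise ValueError("no proxy starts precede any index start")
    # Equivalent to p / (1 - p) with p = after / (after + before).
    return after / before


def adjusted_sequence_ratio(crude_ratio, null_ratio):
    """Crude sequence ratio corrected for the trend-driven null ratio."""
    return crude_ratio / null_ratio


# Hypothetical counts: stable index use, rising use of the indicator medicine.
index_starts = [10, 10, 10, 10]
rising_proxy = [1, 2, 3, 4]
flat_proxy = [2, 2, 2, 2]
print(null_sequence_ratio(index_starts, rising_proxy, window=1))
print(null_sequence_ratio(index_starts, flat_proxy, window=1))
```

In this toy example the rising trend in the indicator medicine alone would produce a null ratio of 1.5, whereas flat utilisation gives 1.0, so a crude ratio of 1.8 would be adjusted to about 1.2 under the rising trend.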

Dispensing datasets often contain information related to patient demographics including data such as date of birth, gender and date of death. When these data are available, it is possible to undertake medication safety research using only dispensing data in which the outcome of interest is death. For example, the CCO method has been used to determine the risk of death associated with use of selective cyclooxygenase-2 inhibitors and nonselective nonsteroidal anti-inflammatory drugs after acute myocardial infarction [77] and the SCCS has been used to investigate the association between bupropion for smoking cessation and death [78].

While many of the studies included in this review used the initiation of a medicine as a proxy for the development of an adverse event, other studies have utilised longer-term patterns in dispensing data to indicate safety issues. For example, Joshi et al. [79] utilised changes in drug treatment to infer disease progression or treatment failure in cancer. Another study used patterns of accumulation of cardiovascular diseases, indicated by medications used to treat them, to investigate the effectiveness of modifiable disease progression in statin initiators [80].

The Future of Medication Safety Studies in Dispensing Data: Hypothesis-Free or Purposeful Inquiry?

As has been the trend in many areas, data-mining approaches have been used to interrogate dispensing data. Hallas et al. [60••] published a hypothesis-free approach to signal detection using prescription sequence symmetry analysis in which every medicine was tested against every other medicine. The results of this experiment showed that while many signals were generated, over half represented already known drug reactions, common treatment pathways or simply good clinical practice. This highlights a limitation of hypothesis-free screening, which is the generation of a large volume of non-informative associations that require further clinical review. To address this problem, a study by Hoang et al. [59••] used a supervised machine learning approach to predict potential adverse drug reactions (ADRs) in which models were trained on positive and negative control associations and used domain knowledge databases (e.g. DrugBank and SIDER). Hoang et al. linked each drug in the dispensing data with the Structured Indications from DrugBank via the medicine ATC code. The indications were then linked to hierarchies in the Medical Dictionary for Regulatory Activities (MedDRA). Supervised ADR classifiers were then used to predict whether sequences of medicine use were potential ADRs given the domain knowledge. The gradient boosting classifier was found to have improved performance over SSA, improving sensitivity by 21% without any loss of specificity. While hypothesis-free approaches to safety signal detection that machine learning allows are likely to enhance detection of medication safety issues, the probabilities produced may be more difficult for clinicians and regulators to interpret and do not provide quantification of the extent of the harm.
Machine learning techniques may have a place with dispensing data by ranking the probabilities of signals, thus enabling triage for directed enquiries using other methods, such as cohort studies in linked data in which hospitalisation or diagnosis data are available.
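A hypothesis-free screen of the kind described above, in which every medicine is tested against every other medicine, can be sketched by computing a crude sequence ratio for each pair and ranking the results for clinical review. The Python example below is a simplified illustration only: it does not reproduce the cited studies' methods, applies no trend adjustment or machine learning, and its data layout, medicine labels and 365-day window are assumptions.

```python
from datetime import date
from itertools import combinations


def screen_all_pairs(first_starts, window_days=365):
    """Crude sequence ratio for every pair of medicines.

    first_starts: {patient_id: {medicine: first dispensing date}}.
    Returns a list of (medicine_a, medicine_b, ratio) sorted by descending
    ratio, where ratio = (# patients starting a first) / (# starting b first).
    """
    counts = {}  # (a, b) -> [a_first, b_first], with a < b alphabetically
    for starts in first_starts.values():
        for a, b in combinations(sorted(starts), 2):
            gap = (starts[b] - starts[a]).days
            if gap == 0 or abs(gap) > window_days:
                continue  # unordered or too far apart to count
            pair = counts.setdefault((a, b), [0, 0])
            pair[0 if gap > 0 else 1] += 1
    results = []
    for (a, b), (a_first, b_first) in counts.items():
        if a_first and b_first:  # both orders needed for a finite, non-zero ratio
            results.append((a, b, a_first / b_first))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical first-dispensing dates for five patients.
first_starts = {
    "p1": {"drug_x": date(2020, 1, 1), "drug_y": date(2020, 3, 1)},
    "p2": {"drug_x": date(2020, 2, 1), "drug_y": date(2020, 6, 1)},
    "p3": {"drug_y": date(2020, 1, 5), "drug_x": date(2020, 4, 1)},
    "p4": {"drug_x": date(2020, 1, 1), "drug_z": date(2020, 2, 1)},
    "p5": {"drug_z": date(2020, 1, 1), "drug_x": date(2020, 3, 1)},
}
for row in screen_all_pairs(first_starts):
    print(row)
```

Because the number of pairs grows quadratically with the number of medicines, a screen like this will generate many associations, which is why the triage and filtering approaches discussed in this section matter.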

Conclusion

Medication safety assessment in dispensing data has the potential to provide timely evidence to complement spontaneous reports, particularly as medicines enter the market. To enhance the validity of medication safety studies, research should focus on validating patterns of medication dispensing as indicators of adverse event occurrence. Self-controlled designs are likely to be the most appropriate approach to generating this evidence as they eliminate the need for confounding adjustment; however, their application may be limited by their strict assumptions and by their potential for bias from time-varying confounding caused by trends in medicine utilisation. While machine learning approaches are likely to be of value in the exploration of safety signals in dispensing data, research should investigate the ability of these techniques to control for confounding. Incorporating electronic domain knowledge bases has the potential to help train machine learning algorithms as well as aid in filtering the large volumes of spurious signals likely to be generated by these analyses.

While generation of evidence of previously unknown safety issues with medicine use is of value, medicine utilisation studies in dispensing data should not be overlooked as a powerful tool to identify patterns of use that may be indicative of already known safety concerns. This evidence can be effective in informing strategies to promote more appropriate prescribing so that harms are avoided.