Key Points for Decision Makers

In many chronic diseases, adherence (taking medication as prescribed) is associated with healthcare costs: inpatient costs are often lower but pharmacy costs and total costs are often higher in more adherent patients.

The association of adherence and costs depends on the type of chronic disease and the severity of the disease.

Methodological difficulties when measuring adherence with health insurance data should be considered carefully.

1 Introduction

For chronically ill patients, it has been claimed that avoiding disruptions in medication is especially important because disruptions can lead to an exacerbation or to the development of secondary diseases [1]. In this context, adherence is defined as “the extent to which a person’s behavior – taking medication, following a diet, and/or executing lifestyle changes, corresponds with agreed recommendations from a health care provider” [2]. Adherence to medication can be divided into three components: initiation, implementation, and discontinuation [3]. Initiation and discontinuation are the dates on which the patient takes the first dose and stops the medication, respectively. Implementation is the period in between. In this paper, we understand adherence as the “extent to which a patient’s actual dosing corresponds to the prescribed dosing regimen” during implementation [3].

The World Health Organization estimates non-adherence to long-term therapies in developed countries at around 50% [2]. Similar results can be found in more recent studies about chronic diseases such as diabetes or cardiovascular diseases [4, 5], and it can be assumed that a longer lifespan, an increasing prevalence of chronic diseases, and multimorbidity aggravate this situation [6]. Non-adherence has health effects, such as increased hospitalization and mortality, as well as economic effects [7, 8]. The economic effects in Europe are estimated at around 125 billion Euro annually [9].

The costs of non-adherence are classified into direct and indirect costs. Direct costs are mainly those that occur during medical treatment, whereas indirect costs include social costs, for example productivity loss. The most important sub-categories of direct costs are outpatient, inpatient, and pharmacy costs [10].

A systematic review by Cutler et al. [11] shows that non-adherence is associated with healthcare costs in general. Nevertheless, there are important differences in the size and direction of the effect. Differences by cost sub-categories were discussed by Iuga and McGuire [12]. Increasing adherence increases pharmacy costs owing to the extra expenditure for medication. However, it is assumed to prevent disease progression and further treatment, especially hospitalization, and hence to reduce inpatient and outpatient costs. The relationship between adherence and costs varies between diseases because the proportion of cost categories and the consequences of non-adherence are disease specific. Hence, both different cost categories and diseases should be analyzed in detail.

According to the review by Cutler et al. [11], the majority of studies come from the USA and a few from Europe; to our knowledge, no study yet exists from Germany, which is the second largest European country by number of inhabitants. Most of these studies use claims data. Some data sources are Medicaid [13], Medicare [14], or Veterans Affairs insurance [15], which disproportionately cover low-income households, the elderly, or male individuals. This makes generalization to other populations difficult. Moreover, generalization to other countries is limited because of country-specific healthcare systems and cost structures. A more international view can give further insight into the adherence–costs relationship.

Current adherence research is characterized by a wide variety of methodological approaches, which makes results hardly comparable [11]. There is a wide range of definitions of adherence and costs, of their operationalizations, and of modeling strategies [3, 10, 16]. Fortunately, several guidelines on conducting and reporting adherence research have been developed in recent years [17,18,19].

Sociodemographic variables, health status, and health behavior are common confounders of the relationship between adherence and costs [7, 13, 14]. Difficulty in controlling for health behavior is a major disadvantage of claims data because they usually contain little information about general health behavior. This raises the risk of healthy adherer bias, which occurs when adherent patients tend toward generally healthier behavior and consequently have lower healthcare costs [20].

Stuart et al. [21] argue that measuring costs and adherence in the same time period can lead to misinterpretation. Major adverse health events and hospitalization increase costs, might result in the initiation of drug therapy, and may influence adherence. Roebuck et al. [13] discussed the use of a time lag between adherence and costs to avoid this potential reverse causality. However, with few exceptions [22,23,24], this approach is not very common in adherence studies. Furthermore, none of the studies compared both approaches in the same sample, and there is so far no systematic research on the length of the time lag or its potential dependency on the investigated disease.

Methods used to analyze the relationship between adherence and costs are often limited to simple regression models (linear regression or generalized linear regression), sometimes without controlling for potential confounders [11, 25]. More advanced analyses such as subgroup analyses or non-linear models are rarely used [14, 26].

We used German claims data of several chronic diseases in our study. We want to quantify the relationship between adherence during implementation and total healthcare costs including four sub-categories (pharmacy, outpatient, inpatient, and other costs) in these populations. We want to find out how this relationship depends on the time period between measuring adherence and costs, on possible subgroups, and whether the effect of adherence is linear.

2 Methods

2.1 Data

We had access to claims data for the years 2007–16, provided by several German statutory health insurances with more than 3.5 million insured persons. The database contains demographic information such as sex and age; inpatient and outpatient diagnoses coded according to the German modification of the International Classification of Diseases, 10th Revision; filled prescription drugs by date, package size, Anatomical Therapeutic Chemical classification code, and defined daily dose (DDD) according to the World Health Organization Collaborating Centre for Drug Statistics Methodology [27]; information about participation in disease management programs for six different diseases (asthma, breast cancer, chronic obstructive pulmonary disease, type 1 diabetes, type 2 diabetes, and coronary heart disease); and charges for outpatient care, inpatient care, drugs, and six other cost categories.

2.2 Study Population

We extracted 4.5 years of data between July 2011 and December 2015 because 2012–15 were the latest fully available years from the database. The baseline year was 2012 (t0), and 2013–15 (t1–t3) were follow-up years. We defined nine cohorts of patients with at least one International Classification of Diseases, 10th Revision diagnosis of the following chronic diseases within each observational year out of all insured persons with year-round coverage: type 1 diabetes (T1D: E10), type 2 diabetes (T2D: E11), hypertension (I10), congestive heart failure (CHF: I110, I130, I50), coronary heart disease (CHD: I20, I21, I22, I23, I24, I25), hyperlipidemia (E78), asthma (J45), chronic obstructive pulmonary disease (COPD: J44), and inflammatory bowel disease (IBD: K50, K51). Patients with more than one chronic disease were part of multiple cohorts.

Patients with excess costs (top 5% of total charges in each cohort) in baseline year t0 were excluded to restrict the sample to a study population in which costs had not yet escalated. Furthermore, we excluded patients without any data or fills of corresponding prescription drugs in baseline year t0. See Table 1 of the Electronic Supplementary Material (ESM) for the definition of diseases and drugs used in this study.
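The exclusion of the top 5% of baseline charges within a cohort can be illustrated with a minimal sketch. The function name `exclude_top_cost_patients`, the dictionary layout, and the rounding down of the number of excluded patients are assumptions made for illustration only; they are not taken from the study's code.

```python
def exclude_top_cost_patients(baseline_costs, top_share=0.05):
    """Drop the patients with the highest baseline (t0) total charges.

    baseline_costs: dict mapping a patient id to total charges in t0
    for ONE cohort (hypothetical layout).
    top_share: share of patients to exclude (5% in the text).
    """
    # Rank patient ids from highest to lowest baseline charges.
    ranked = sorted(baseline_costs, key=baseline_costs.get, reverse=True)
    # Assumption: the number of excluded patients is rounded down.
    n_excluded = int(len(ranked) * top_share)
    excluded = set(ranked[:n_excluded])
    return {pid: cost for pid, cost in baseline_costs.items()
            if pid not in excluded}
```

Because the text defines the threshold per cohort, such a function would be applied to each disease cohort separately.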

2.3 Definition of Variables

Outcome variables were annual total costs of each year t0–t3 as well as four sub-categories: pharmacy costs, outpatient costs, inpatient costs, and other costs. The latter is a category for all remaining costs such as for curative means and aids, physical therapy, and ambulance service. We did not distinguish between disease-specific and non-disease-specific costs. We used charges as a proxy for costs; charges include the actual amount paid by the insurance but do not consider a possible copayment by the patient. All prices were converted to 2015 prices by multiplying by the annual inflation of the healthcare sector as stated by the German Federal Statistical Office [28].
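The conversion to 2015 prices can be sketched as compounding annual inflation factors. This is one plausible reading of the description above, not a confirmed implementation: the function name `to_2015_prices`, the convention that the rate stored for year y describes the price change from y − 1 to y, and any rates passed in are assumptions.

```python
def to_2015_prices(amount, cost_year, annual_inflation, target_year=2015):
    """Express an amount from `cost_year` in prices of `target_year`.

    annual_inflation: dict mapping a year to the healthcare-sector
    inflation rate of that year as a fraction (e.g. 0.02 for 2%).
    Assumed convention: the rate of year y is the change from y-1 to y.
    """
    factor = 1.0
    # Compound the rates of every year after the cost year.
    for year in range(cost_year + 1, target_year + 1):
        factor *= 1.0 + annual_inflation[year]
    return amount * factor
```

Costs that already refer to 2015 are returned unchanged because the loop is empty in that case.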

We operationalized adherence in baseline year t0 through the proportion of days covered (PDC) by any diagnosis-specific medication based on the list in Table 1 of the ESM. The PDC was calculated as follows: covered days were defined as days with at least one dose of any diagnosis-specific drug available to the patient. Drugs were distinguished by Anatomical Therapeutic Chemical codes [27]. We assumed a drug to be available from the date of the prescription fill for the number of days obtained by dividing the total package size by the DDD [27]. We also considered the medication supply from the last 6 months of the previous year on a pro rata basis if it could be considered partially available at the beginning of the observational year (pre-supply). In addition, we assumed a drug to be available during hospitalization if the patient was treated with the same drug within 3 months before or after the hospital stay. Finally, we calculated the PDC (in percent) by dividing the number of covered days by the number of days between the first covered day and the last day of the year.
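The PDC calculation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: the function name `pdc`, the tuple layout of a fill, rounding the days of supply down, and treating pre-supply as the days of a fill from the previous half-year that spill into the observational year are all assumptions, and the hospitalization rule is omitted.

```python
import math
from datetime import date, timedelta


def pdc(fills, year):
    """Proportion of days covered (in percent) for one patient and year.

    fills: iterable of (fill_date, package_amount, ddd) tuples for any
    diagnosis-specific drug; package_amount / ddd is the assumed number
    of days of supply (hypothetical layout).
    Returns None if no day of the year is covered.
    """
    year_start, year_end = date(year, 1, 1), date(year, 12, 31)
    presupply_start = date(year - 1, 7, 1)  # last 6 months of previous year
    covered = set()
    for fill_date, package_amount, ddd in fills:
        if fill_date < presupply_start or fill_date > year_end:
            continue
        days_supply = math.floor(package_amount / ddd)  # assumption: round down
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            # Only days inside the observational year count as covered;
            # a set avoids double counting overlapping supplies.
            if year_start <= day <= year_end:
                covered.add(day)
    if not covered:
        return None
    # Denominator: first covered day through the last day of the year.
    observed_days = (year_end - min(covered)).days + 1
    return 100.0 * len(covered) / observed_days
```

Using a set of covered days means that overlapping fills of different drugs count each day once, which matches the definition of a covered day as a day with at least one diagnosis-specific drug available.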

We extracted the following covariates based on the data of baseline year t0: we used age and sex as sociodemographic variables. We calculated the Charlson’s Comorbidity Index in its International Classification of Diseases, 10th Revision, German Modification version with updated weights [29,30,31] and initial total costs to represent the general health status. We created a two- or three-level severity variable based on treatment guidelines and prescription drug fills to distinguish by severity of the chronic diseases within a population (Tables 2–4 of the ESM for detailed definitions). Because health behavior is an important confounder but is not directly available in claims data, we used participation in any disease management program and influenza vaccination as a proxy similar to Stuart et al. [14] and Brookhart et al. [32].

2.4 Statistical Analysis

We developed three categories of models, which we applied separately for every population and all cost outcomes (total costs, pharmacy costs, outpatient costs, inpatient costs, and other costs). A simplified overview of all models is visualized in Fig. 1. Our main model (M1) was a linear regression model with mean costs of follow-up years t1–t3 as the outcome, the baseline PDC as the main predictor, and the covariates sex, age, Charlson’s Comorbidity Index, initial costs, disease management program participation, and vaccination. We also added an interaction term of PDC and severity to model different PDC effects of severity subgroups.
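As a sketch, the main model M1 could be written in the following form. The notation is illustrative and not taken from the source: \bar{C}_i denotes the mean costs of patient i over t1–t3, \mathrm{sev}_i the severity level with reference level s_0, and \mathbf{x}_i the covariate vector (sex, age, Charlson’s Comorbidity Index, initial costs, disease management program participation, vaccination). Including a severity main effect alongside the interaction is an assumption.

```latex
% Illustrative form of M1; symbols are assumptions, not the authors' notation.
\[
\bar{C}_i = \beta_0 + \beta_1\,\mathrm{PDC}_i
  + \sum_{s \neq s_0} \gamma_s\,\mathbf{1}[\mathrm{sev}_i = s]
  + \sum_{s \neq s_0} \delta_s\,\mathrm{PDC}_i\,\mathbf{1}[\mathrm{sev}_i = s]
  + \boldsymbol{\theta}^{\top}\mathbf{x}_i + \varepsilon_i
\]
```

Under this form, the PDC effect per percentage point is \beta_1 in the reference severity subgroup and \beta_1 + \delta_s in subgroup s.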

Fig. 1 Overview of models. Follow-up years in which the outcome was measured are highlighted in green. B baseline, PDC proportion of days covered

We further adapted our models to analyze the effects of the PDC on the costs of different follow-up years: instead of mean costs, we used repeated measurements of annual costs (t1–t3) as the outcome, and a three-fold interaction term of PDC, severity, and either a continuous (M2a) or a categorical (M2b) time variable. In these models, we used cluster-robust standard errors (Huber–White standard errors) to account for repeated measurements. A model with costs of baseline year t0 as the outcome (M2c) was calculated separately; initial costs could not be included as a covariate in this model because they are the same as, or similar to, the outcome.

We developed a third category of models (M3) to test for a potential non-linear relationship between PDC and costs: We modeled costs of every year (t0–t3) separately within every severity subgroup using multiple fractional polynomials. Again, initial costs were only used as a covariate when the outcome was not measured in the same year. Linear and non-linear models were compared by the Bayesian Information Criterion.
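The comparison of a linear and a non-linear model by the Bayesian Information Criterion can be sketched as follows. The formula is the standard BIC for a Gaussian regression model up to an additive constant; the function names, the argument layout, and the use of residual sums of squares as inputs are assumptions for illustration, and the default threshold of 6 is the value reported in the Results.

```python
import math


def gaussian_bic(rss, n_obs, n_params):
    """BIC of a Gaussian regression model, up to an additive constant.

    rss: residual sum of squares, n_obs: number of observations,
    n_params: number of estimated parameters.
    """
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


def prefers_nonlinear(rss_linear, k_linear, rss_nonlinear, k_nonlinear,
                      n_obs, threshold=6.0):
    """True if the non-linear model improves the BIC by more than `threshold`."""
    delta = (gaussian_bic(rss_linear, n_obs, k_linear)
             - gaussian_bic(rss_nonlinear, n_obs, k_nonlinear))
    return delta > threshold
```

Because the additive constant is the same for both models fitted to the same data, it cancels in the difference.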

All statistical analysis was performed in R version 3.6.2 [33]. Hypothesis testing was performed at exploratory two-sided 5% levels of significance.

3 Results

The defined claims data contain 2,644,212 patients with at least 1 year of year-round coverage between 2012 and 2015. Our final cohorts include 15,463 patients with T1D, 91,544 patients with T2D, 402,898 patients with hypertension, 54,015 patients with CHF, 102,326 patients with CHD, 127,247 patients with hyperlipidemia, 87,883 patients with asthma, 53,817 patients with COPD, and 6747 patients with IBD. See Fig. 2 for a flow chart of included and excluded patients per disease and year for model M1.

Fig. 2 Flowchart of cohorts. CHD coronary heart disease, CHF congestive heart failure, COPD chronic obstructive pulmonary disease, IBD inflammatory bowel disease, T1D type 1 diabetes, T2D type 2 diabetes

The median PDC of T1D, T2D, hypertension, CHF, and CHD cohorts ranges from 83 to 99% at baseline. Their PDC distribution is highly left skewed with 41–67% of patients having a PDC higher than 90%, whereas the median PDC of hyperlipidemia, asthma, COPD, and IBD cohorts ranges from 28 to 67%. Apart from IBD, their distribution is bimodal, with only 12–30% of patients having a PDC higher than 90%. Median total costs in t0 are lowest in patients with asthma (1136 Euro) and highest in patients with CHF (4558 Euro). All cost categories are unimodal and right skewed apart from inpatient costs, for which we measured no costs at all in 46–79% of patient-years, resulting in a bimodal distribution. These and further descriptive statistics are given in Table 1.

Table 1 Descriptive summary statistics of cohorts of nine chronic diseases: median (interquartile range) for continuous and absolute (relative) frequencies for categorical variables

The mean proportion of pharmacy costs ranges from 29 to 51%, inpatient costs from 10 to 24%, outpatient costs from 21 to 46%, and other costs from 10 to 18% (Fig. 3). In our main M1 regression models, we analyzed the linear relationship between the PDC and the 3-year mean costs of follow-up years t1–t3 in severity subgroups, controlling for sociodemographic variables, general health status, and health behavior. See Fig. 4 for a heat map of the effects and Table 5 of the ESM for more details. Total costs increased with the PDC in 21 out of 25 models, by 0.32 to 32.57 Euro on average per year and PDC percentage point. Of the four diagnosis-severity subgroups with decreasing costs (mild hypertension, mild CHD, medium hyperlipidemia, and severe IBD), only one reaches statistical significance. The sign and size of the effect differed between cost sub-categories. The PDC effect on average outpatient costs ranged from −2.19 to 1.43 Euro. Thus, it was relatively weak considering the median outpatient costs of 549–911 Euro. While pharmacy costs increased in all models by 1.19–39.47 Euro per PDC percentage point, inpatient costs decreased in 12 out of 25 models, with effects between −36.44 and −0.46 Euro on average. Inpatient costs increased significantly in only three diagnosis-severity subgroups (severe asthma, COPD, and CHF).

Fig. 3 Mean proportion of cost sub-categories by disease. CHD coronary heart disease, CHF congestive heart failure, COPD chronic obstructive pulmonary disease, IBD inflammatory bowel disease, T1D type 1 diabetes, T2D type 2 diabetes

Fig. 4 Results of main model M1: estimated average effect per proportion of days covered-%-point per year on costs in Euro (statistically significant values are printed in bold). CHD coronary heart disease, CHF congestive heart failure, COPD chronic obstructive pulmonary disease, IBD inflammatory bowel disease, T1D type 1 diabetes, T2D type 2 diabetes

We further analyzed the effect of the PDC on the costs of different years in our M2 models (detailed results in Table 6 of the ESM). In M2a models with a continuous time variable, there were significant differences between follow-up years t1–t3 in 6 of 100 subgroups (total costs in severe COPD; outpatient costs in medium T1D and severe COPD; inpatient costs in mild hypertension, severe asthma, and severe COPD). In M2b models with a more flexible categorical time variable, similar subgroups had significant differences between follow-up years t1–t3 (total costs in severe COPD; outpatient costs in medium CHD and severe COPD; inpatient costs in mild hypertension and severe COPD). The effect of the PDC on costs in M2c models, in which PDC and costs were both measured in baseline year t0, differed significantly from the effects in M2b models in 13 of 100 subgroups. We relaxed the assumption of a linear relationship between PDC and costs in M3 models (detailed results in Table 7 of the ESM). The Bayesian Information Criterion difference was higher than 6 in 86 of 400 models, indicating a strong improvement in fit of the non-linear model over the linear model in these cases [34].

4 Discussion

We analyzed German claims data of nine chronic diseases and up to three nested severity subgroups, controlling for sociodemographic variables, general health status, and health behavior in a large data set. Overall, we found that total costs increased with the PDC, with a weak association with outpatient costs, increasing pharmacy costs, and frequently decreasing inpatient costs. There were major differences by diagnosis and severity but only minor differences between years, provided costs were not measured in the same year as the PDC.

In general, the direction of effects in the cost categories is similar to other studies and, as expected, differs between cost categories. While outpatient costs are relatively low and the effects of adherence on them are small, inpatient costs are higher and negative effects of adherence on inpatient costs were observed. The latter can be explained by prevented or less severe hospitalizations. The estimated effect on pharmacy costs, which include the extra expenditure for the filled doses, was positive, although higher than we expected compared with the usual costs of disease-specific drugs. This might be explained by improper control for confounding by multimorbidity and the correlation of adherence across different conditions. The overall positive estimated effect of adherence on total costs was unexpected, at least for some diseases, and differs from most other studies [11]. This might partly be explained by differences in the cost structure in our data from the German healthcare system compared with the mainly US data used in most other studies [10]. When the proportion of inpatient costs is lower, the decrease in inpatient costs due to higher adherence has a smaller effect on total costs and might be outweighed by the increase in pharmacy costs.

We concluded that the effect of adherence on costs is relatively stable over time because differences between the effects in the follow-up years t1–t3 were small. Moreover, this stability over time did not differ markedly between diagnoses. Hence, our main model with mean costs of follow-up years t1–t3 seems appropriate. This model is also less complex because it has neither repeated measurements nor a three-fold interaction term. In contrast, there were considerable differences between the PDC effects on costs in base year t0 and in follow-up years t1–t3. We expect base-year t0 models to be biased by concurrent measurement of adherence and costs and their ambiguous association, although the differences can also be explained by differences in model specification. We used a time lag between adherence and costs in our main model, as discussed by Roebuck et al. [13], to avoid this potential reverse causality. Overall, our results support this approach and show that the length of the time lag is not important, provided adherence and costs are not measured in the same period. However, because our follow-up was limited to 3 years, this might differ for medium- to long-term effects. Moreover, if we assume adherence to be stable over time, the PDC of different years will be correlated, and a time series analysis with yearly measured adherence would give more insight into the longitudinal effect of adherence on costs.

We further examined the functional form of the adherence–costs relationship. We allowed non-linear effects in an explorative sensitivity study. Non-linear relationships, such as polynomial or logarithmic functions, showed a better model fit in about one fifth of our models. This was mainly the case in patients with asthma, severe conditions, and pharmacy costs. We decided to continue assuming linear relationships because of the comparability and consistency of results although this might be worth exploring in future studies.

Using claims data to measure adherence has several limitations. First, claims data do not contain information about real intake, only prescription drug fills. Therefore, we assume that filled drugs are also taken and that there is no unrecorded source of drugs. In general, this is a common approach and is considered to be reliable [35,36,37]. Second, German claims data do not contain information about prescribed doses. Instead, the DDD is used to calculate the duration of availability of a filled package. The DDD is a general recommendation and can differ from the individually prescribed dose according to the clinical decision of the healthcare provider. If the prescribed dose is higher than the DDD, a package lasts for a shorter time than assumed and adherence is overestimated, and vice versa. We observed a high proportion of perfect adherence in our data and suspect that some of these values were overestimated. Third, these data were not collected for research purposes and do not contain all relevant confounders. For example, little information about general health behavior is available, although it has been discussed as an important confounder of the adherence–costs relationship [20]. Using proxy variables, although it might not be sufficient, is a common approach to counteract this data limitation [14, 32]. Failing to control for all relevant confounders can lead to a biased effect estimate, which might be the case in this study. However, claims data contain detailed information about drug fills, different diagnoses, and costs. Compared with survey studies, the number of observations is generally higher and observing data over several years is easier because death and a change of insurance are the only major reasons for drop-outs. In Germany, a change of insurance is quite uncommon, especially in the older population with chronic diseases [38]. Statutory insurances cover almost 90% of the total population owing to the compulsory health insurance in Germany [39]. Therefore, these data are highly generalizable to the general population of Germany. They are also relevant for other healthcare systems, especially because the population is less restricted to special subgroups than commonly analyzed populations such as Medicaid, Medicare, and Veterans Affairs data from the USA, which focus on low-income households, the elderly, or male individuals. The exclusion of patients with very high costs at baseline from our study population limits generalizability to a population in which costs have not yet escalated. The definition of the study population, time frames, variables, and model specification were informed by subject knowledge and the nature of the available data, which has implications for the estimand of the effect of adherence. Other definitions and specifications may lead to different findings, which underlines the importance of hypothesis-based research for greater reliability and interpretability.

There are several approaches for measuring adherence, and common terms such as the medication possession ratio or PDC are often used inconsistently, as Raebel et al. [16] pointed out. We decided to use the PDC to measure adherence, although it is less commonly used, because in contrast to the medication possession ratio, oversupply does not lead to values higher than the theoretical maximum of 100%. We considered three different sources of drugs (actual fills during the observational year, pre-supply from the previous year, and supply during hospitalization) to avoid underestimation of available doses. Other sources are negligible because of integrated prescription drug coverage within the German health insurance. In many studies, the originally continuous adherence variable is dichotomized [11]. Patients with a PDC or a medication possession ratio higher than 80% are usually defined as adherent, or multiple categories are used. Only a minority of adherence studies use continuous adherence measures [14, 40] or compare them to categorized adherence measures [24]. Although the 80% cut-point was initially arbitrary, it has been shown to be associated with clinical outcomes such as lower hospitalization [41]. However, Roebuck et al. [24] found other cut-points to be more appropriate and warn that the effect of adherence might be masked otherwise. In addition, Tueller et al. [42] criticized dichotomization of adherence because it leads to loss of information and can introduce bias. We decided to use the continuous PDC without dichotomization, which is also in line with the general statistical literature about the dichotomization of continuous variables [43, 44].

We introduced severity subgroups based on treatment guidelines to distinguish the estimated relationship of adherence to costs between them. These differences demonstrate the importance of considering severity of the disease in adherence research. However, given the available claims data, severity subgroups were defined by drug fills, which could lead to a spurious correlation of severity and PDC, especially if a different number of unique medications is related to each severity subgroup. Medication characteristics (such as side effects or route of administration) are also likely to differ and influence PDC. These limitations should be considered when interpreting our results.

Healthcare costs are usually right skewed because most patients have no or very low costs. Some authors argue that it is necessary to log-transform the costs or to use a gamma distribution in a generalized linear model to address the skewness of the outcome [14, 15]. However, the estimation of model parameters and of their covariance is unbiased in large samples, so effect estimation and hypothesis testing are valid even without transformation of the data [45, 46]. For this reason, and to facilitate interpretation, we decided not to transform our cost variables.

An advantage of this study is the vast variation of model specification within the same study to explore the robustness of our findings. Different assumptions about time dependency and non-linearity can be compared. An advanced approach is leaving the area of general effects for a whole population and instead focusing on a subgroup analysis. Using severity subgroups can be seen as a first step, but more advanced methods are currently under development and will be able to automatically detect subgroups with different effects or even patient individual effects.

5 Conclusions

In our study of patients with chronic diseases, treatment adherence was associated with total healthcare costs. The strength of this association depended on the type of chronic disease and the severity of the disease. Our results for total costs differed from those of most other studies, although similar associations were found in cost subcategories. From a methodological perspective, it seems important not to measure adherence and costs in the same time period. In some situations, the assumption of a non-linear relationship might be appropriate. Further methodological research on adherence and its relationship to healthcare costs is needed.