Type 2 diabetes mellitus (T2DM) is a progressive metabolic disorder that often requires treatment intensification as the condition progresses, and insulin therapy may be prescribed for patients who fail to achieve glycemic control with oral diabetes medications [1]. While treatment with antidiabetes medications, including insulin, has been linked with improved outcomes and health-related quality of life [2,3,4,5,6,7,8,9,10,11], a key factor in treatment success is adherence to prescribed therapy [12, 13]. That is, good adherence leads to good outcomes. However, insulin adherence is difficult to measure. Dosing varies widely from patient to patient [2], and insulin fills are generally recorded in pharmacy claims as a 30-day supply, while the amount of insulin dispensed often lasts longer than 30 days [1, 14]. Therefore, an accurate measure of insulin adherence would need to factor in the individual variation in dosing and the actual days of supply.

Traditional methods of measuring medication adherence fall short when applied to insulin. Objective adherence measures, such as the medication possession ratio (MPR) or proportion of days covered (PDC), are calculated from pharmacy claims data (or health authority data in Europe) and rely on prescription fill dates, the number of days of medication supplied, and the quantity of drugs dispensed to estimate adherence [8, 14, 15]. They are widely used due to the ease of calculation and availability of pharmacy claims data. Asking patients directly about their insulin adherence could potentially be a more accurate method of assessment, but this approach relies on patient recall and candid responses, and such data are rarely available.

Researchers have attempted to account for shortcomings in the calculation of insulin MPR due to differences in insulin package size by creating an adjustment factor that is applied to the traditional MPR [8, 14,15,16,17]. However, an extensive review of studies reporting various measures of adherence, including patient self-report, traditional MPR, adjusted MPR (which takes into account differences in package sizes), and PDC, concluded that these methods are insufficient to capture adherence to insulin accurately [1].

As has been shown by prior research, the individualized dosing system of insulin may impact the overall utilization of insulin, and more information on individual dosing could be vital to understand adherence to insulin. In this study, we used patients’ self-reported insulin utilization data that had been merged with their pharmacy claims data to develop an improved insulin adherence measure.

To do so, we evaluated basal insulin adherence using existing claims measures and developed a novel insulin adherence measure that incorporated the self-reported total daily insulin dose information as reported by a sample of patients with T2DM. Subsequently, we developed a predictive model to estimate basal insulin adherence—according to the measure incorporating self-report data—but using claims-based predictors only. We then applied this predictive model to a larger validation sample of patients with T2DM who were using basal insulin to determine adherence in this broader population. For the predictive model of insulin adherence, we purposefully restricted the list of independent factors to claims-derived measures to make the model applicable to the more readily available claims datasets. In this way, our model-based insulin adherence measure could be determined using claims data alone without the need of patients’ self-reported data.


Study Design and Data Source

A cross-sectional survey of T2DM basal insulin users was conducted in which patients were asked a number of questions about their basal insulin utilization practices during the 12-month period prior to and including the date of the survey [18,19,20]. Survey methodology details, including the development of the patient questionnaire, have been previously published [20]. After the survey phase was completed, patients’ survey data were linked via the survey patient ID with their administrative claims data for the same 12-month pre-survey period (see Appendix S1 in the electronic supplementary materials).

The HealthCore Integrated Research Database® (HIRD), a large administrative claims database that contains geographically diverse, longitudinal medical and pharmacy claims from 14 US health plans, was used as the primary source of data for this analysis. The HIRD was used as a sampling frame to identify the survey-eligible patient population. After the survey phase was completed, the HIRD was then used to extract survey respondents’ administrative claims data for the 12-month pre-survey period, which were then linked using the patient ID with their survey data. Finally, the HIRD was used to identify an external validation database of patients with T2DM who were basal insulin users in which MPR was calculated using our model compared with other more traditional definitions of MPR. This observational study used only de-identified patient data in full compliance with relevant provisions of the Health Insurance Portability and Accountability Act. Institutional Review Board approval was not required.

Patient Population

Model Development Sample

The model development sample consisted of patients who completed the patient survey and answered all questions pertaining to their insulin utilization for the 12-month period prior to and including the survey date, and had continuous administrative claims data for the same 12-month pre-survey period.

Initially, the administrative claims data in the HIRD were used to identify survey-eligible patients. Survey patients (age 18 years and older) were currently active members of a commercial health plan. All eligible patients had at least one medical claim with an International Classification of Diseases, 9th/10th Revision, Clinical Modification (ICD-9/10-CM) diagnosis code for T2DM (ICD-9-CM diagnosis codes 250.x0 or 250.x2; ICD-10-CM diagnosis codes starting with E11) at any time during the sample identification period (April 1, 2015 through May 31, 2016). Eligible patients also had at least three fills of basal insulin, with at least one fill in the most recent 3 months, and no medical claims for type 1 diabetes (ICD-9-CM diagnosis codes 250.x3; ICD-10-CM diagnosis codes starting with E10) during the sample identification period.
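The claims-based eligibility rules above can be expressed as a simple filter. The following Python sketch is illustrative only and is not the code used in the study: the function names and input layout are hypothetical, and the "most recent 3 months" rule is approximated here as the last 90 days of the identification window, which is an assumption.

```python
import re
from datetime import date

# ICD patterns transcribed from the text; all helper names are hypothetical.
T2DM_ICD9 = re.compile(r"^250\.\d[02]$")  # 250.x0 or 250.x2
T1DM_ICD9 = re.compile(r"^250\.\d3$")     # 250.x3, as listed in the text


def is_t2dm(code: str) -> bool:
    return bool(T2DM_ICD9.match(code)) or code.startswith("E11")


def is_t1dm(code: str) -> bool:
    return bool(T1DM_ICD9.match(code)) or code.startswith("E10")


def is_survey_eligible(age, dx_codes, basal_fill_dates, window_end):
    """Apply the stated claims criteria to one patient's records.

    dx_codes and basal_fill_dates are assumed to be already restricted
    to the sample identification period.
    """
    has_t2dm = any(is_t2dm(c) for c in dx_codes)
    has_t1dm = any(is_t1dm(c) for c in dx_codes)
    enough_fills = len(basal_fill_dates) >= 3
    # Assumption: "most recent 3 months" ~ last 90 days of the window.
    recent_fill = any(0 <= (window_end - d).days <= 90 for d in basal_fill_dates)
    return age >= 18 and has_t2dm and not has_t1dm and enough_fills and recent_fill


# Hypothetical patient records for illustration.
window_end = date(2016, 5, 31)
fills = [date(2015, 9, 1), date(2016, 1, 5), date(2016, 4, 20)]
eligible = is_survey_eligible(57, ["250.02", "401.9"], fills, window_end)
```

Membership in an active commercial plan, which the text also requires, is not modeled in this sketch.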

Prior to the administration of our survey, survey-eligible patients who responded to the pre-notification recruitment email or letter gave verbal or electronic consent to participate and confirmed their current insurance status, diagnosis of T2DM, and current use of basal insulin. The survey ran from October 1 through December 31, 2016. The survey protocol and all survey-related materials were reviewed and approved by a central institutional review board.

Patients’ survey data were linked with their claims data for a period of 12 months prior to the survey administration date to capture healthcare resource utilization, healthcare costs, diabetes medication utilization, and other characteristics.

External Validation Sample

The external validation sample was identified from administrative claims data in the HIRD and consisted of adults (age 18 years or older) with at least one medical claim with an ICD-9/10-CM code for T2DM and at least three pharmacy fills of basal insulin between January 1, 2007 and December 31, 2016, meeting the same inclusion criteria as the patients included in the model development sample.

Study Measures

Basal insulin utilization data were determined from responses to questions asked in the patient survey and from pharmacy claims data (see Appendix S2 in the electronic supplementary materials). The four adherence measures used were traditional claims-based MPR, adjusted claims-based MPR, survey-based self-reported MPR, and hybrid MPR (see Appendix S3 in the electronic supplementary materials). For the predictive model, we considered the hybrid MPR adherence measure to be a reference standard for the study as the hybrid MPR incorporated the patient’s self-reported insulin dose and accounted for the actual number of insulin units dispensed over the study period, thus estimating the extent to which patients were able to consume their dispensed medication dose for the time period based on their individual dose.

For all MPR measures, we calculated MPR as the sum of basal insulin days supplied divided by 365 (see Appendix S3 in the electronic supplementary materials). Adherence status was constructed as a binary variable, using an MPR threshold of 0.8 to define adherence to basal insulin [i.e., adherent (MPR ≥ 0.8) vs. non-adherent (MPR < 0.8)].
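As a rough illustration of the MPR arithmetic, the Python sketch below computes a traditional MPR from claims-recorded days of supply and a hybrid-style MPR in which days of supply are re-derived from units dispensed and the self-reported daily dose. The exact formulas are given in Appendix S3, which is not reproduced here, so the hybrid calculation is an assumption based on the description in the text; the function names and patient values are hypothetical.

```python
def mpr(days_supplied_per_fill, period_days=365):
    """MPR as described in the text: total days supplied divided by 365."""
    return sum(days_supplied_per_fill) / period_days


def hybrid_days_supplied(units_dispensed_per_fill, self_reported_daily_units):
    """Assumed hybrid construction: how long each fill lasts at the
    patient's own reported daily dose."""
    return [units / self_reported_daily_units for units in units_dispensed_per_fill]


def is_adherent(mpr_value, threshold=0.8):
    """Binary adherence status using the MPR >= 0.8 cut-point."""
    return mpr_value >= threshold


# Hypothetical patient: 8 fills, each recorded as a 30-day supply of
# 3000 units, with a self-reported dose of 60 units/day.
traditional = mpr([30] * 8)                          # 240 / 365
hybrid = mpr(hybrid_days_supplied([3000] * 8, 60))   # 400 / 365
```

In this made-up case the patient is non-adherent by the traditional measure but adherent by the hybrid-style measure, because each fill lasts 50 days at the reported dose. The sketch does not cap MPR at 1.0; whether the study applied a cap is not stated in the text.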

Statistical Analysis

Descriptive statistics were reported for the outcome measures. Means and standard deviations (SD) were reported for continuous variables; frequencies and percentages were reported for categorical variables. Comparisons of different measures of insulin utilization, including insulin dose, total days of insulin supply, and adherence to insulin based on MPR, for claims-based and survey-based self-reported measures were reported. Statistical comparisons between patients classified as adherent and non-adherent based on hybrid MPR were performed using t tests and Wilcoxon rank-sum tests for continuous variables and χ2 tests for categorical variables, as appropriate. Statistical analyses were performed using SAS Enterprise 7 (SAS Institute, Cary, NC, USA) and RStudio statistical software. The R glmnet package was used for implementing the LASSO analysis [21]. An a priori α-level of P < 0.05 was considered significant.

Predictive Model: Adherence to Insulin Using Hybrid MPR

While the hybrid MPR can overcome limitations of typical claims-based measures by incorporating self-reported total daily insulin dose, it is not easy to implement this method as it takes self-reported daily dose from individual patients to construct. Such data are difficult or costly to obtain in most cases. However, with claims data, it is possible to use the hybrid MPR adherence model if the predictive model restricts covariates to the variables in the claims data. We used only claims-based covariates to predict hybrid MPR. Variables used to predict adherence included claims-determined demographic characteristics, including age and gender; clinical conditions, such as the presence of any diabetes complications during the 12 months prior to survey administration; concomitant medications, including oral antidiabetic medications, antihypertensive medications, statins, short-acting insulin, and non-insulin antidiabetes injectables, during the baseline period; and insulin utilization-related characteristics, including days of insulin supply, number of insulin fills during the baseline period, gap days between insulin fills, and adherence to insulin based on traditional MPR ≥ 0.8 as well as adjusted MPR. Other claims-determined characteristics were also included, such as at least one HbA1c test and the number of office visits during the 12 months prior to the survey administration date.

We utilized the least absolute shrinkage and selection operator (LASSO) logistic regression model technique to predict binary adherence to insulin. LASSO regression not only selects predictive covariates but also shrinks the coefficients of poorly associated variables to zero, eliminating them from the prediction model. Essentially, LASSO gives a parsimonious model with a refined set of predictors. We implemented LASSO logistic regression with tenfold cross-validation. The cross-validation technique randomly selects training data to derive the model and then applies the regression estimates to the held-out test data to evaluate the model. It is widely used to help avoid overfitting and to support internal validity [22]. The total sample was divided into ten subsamples. Nine of the subsamples were used to develop a model and the remaining subsample was used to test the model. This process was then repeated by re-sampling the patients, more than 1000 times in total. Finally, the results from each iteration were averaged over the samples to build the model by identifying covariates and estimating their coefficients, and to test the model by estimating performance characteristics.
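The re-sampling scheme described above can be pictured with a short Python sketch of repeated tenfold splitting. It covers only the partitioning step, not the LASSO fit, which the study ran with the R glmnet package. The function name, the seed, and the small number of repeats are illustrative assumptions.

```python
import random


def repeated_kfold(n_patients, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_patients))
        rng.shuffle(idx)                        # new random fold assignment per repeat
        folds = [idx[i::k] for i in range(k)]   # k nearly equal subsamples
        for held_out in range(k):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds)
                         if f != held_out for i in fold]
            yield train_idx, test_idx


# 296 patients as in the model development sample; 3 repeats for brevity
# (the text describes more than 1000 repetitions).
splits = list(repeated_kfold(n_patients=296, k=10, repeats=3))
```

Within each repeat, every patient appears in exactly one test fold, so each model is evaluated on data it was not fitted to.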

We calculated the performance statistics of the model, including sensitivity, specificity, positive predictive value, and negative predictive value. We compared the performance of the model-derived estimates with the traditional claims-determined MPR and adjusted claims-determined MPR measures for predicting basal insulin adherence using the hybrid MPR measure.
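For reference, the four performance statistics follow directly from the two-by-two table of a candidate measure against the reference standard. The Python sketch below uses the standard definitions with "adherent" as the positive class; the function name and example data are hypothetical, and it assumes both classes occur in the reference and in the prediction so that no denominator is zero.

```python
def performance(reference, predicted):
    """Sensitivity, specificity, PPV, and NPV of `predicted` against
    `reference`; both are equal-length lists of booleans (True = adherent)."""
    pairs = list(zip(reference, predicted))
    tp = sum(r and p for r, p in pairs)
    tn = sum((not r) and (not p) for r, p in pairs)
    fp = sum((not r) and p for r, p in pairs)
    fn = sum(r and (not p) for r, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical example: 8 patients, reference standard vs. a candidate measure.
reference = [True, True, True, True, False, False, False, False]
candidate = [True, True, True, False, True, False, False, False]
stats = performance(reference, candidate)
```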


Demographic and Clinical Characteristics

Of 400 survey participants, 296 (74%) met all inclusion criteria and had no missing values for all survey and claims variables used in the analysis. The mean age (SD) was 56.9 (9.3) years and 51% were female; 76% used insulin pens and 24% used vials and syringes. The most recent mean HbA1c level reported by patients was 7.8% (1.8%).

Insulin Usage Patterns

The current total daily insulin dose was lower for the patients’ self-reported estimate than the claims-based estimate [mean 57.7 (SD 38.3) vs. 77.9 (71.8)]. However, the number of 30-day refills for the self-reported estimate was higher than the claims-based estimate [mean 12.0 (SD 6.3) vs. 8.3 (3.8), respectively; Table 1]. Hybrid basal insulin days supply [median 282 (interquartile range (IQR) 171)] and self-reported [median 360 (IQR 120)] basal insulin days supply were higher than the claims-based days supply [median 259 (IQR 106)]. The median MPR score [0.99 (IQR 0.33)] based on patients’ self-report was highest, followed by the adjusted claims-determined MPR [0.98 (IQR 0.34)], hybrid MPR [0.87 (IQR 0.48)], and the traditional claims-determined MPR [0.80 (IQR 0.27)].

Table 1 Comparison of claims-based and survey-based basal insulin usage measures

Based on the MPR adherence cut-point of ≥ 0.80, 76% were adherent to basal insulin using the survey-based MPR; 71% using adjusted claims-based MPR; 56% using hybrid MPR; and 50% using traditional MPR (Fig. 1).

Fig. 1

Comparison of patients’ self-reported and claims-based basal insulin adherence measures. Adherence to basal insulin based on MPR ≥ 0.80. BI basal insulin, CB claims-based, MPR medication possession ratio.

Characteristics of Patients by Hybrid MPR-Based Insulin Adherence

The mean claims-derived days supply of basal insulin was significantly higher among patients who were adherent than those who were non-adherent to basal insulin (Table 2). Oral antidiabetic agents, antihypertensive agents, and statins were the most commonly used concomitant medications among patients adherent and non-adherent to basal insulin, in relatively similar proportions. However, a lower proportion of patients who were adherent to basal insulin used the injectable antidiabetic glucagon-like peptide-1 agonist compared with patients who were non-adherent to basal insulin.

Table 2 Claims-based characteristics by adherence to insulin among patients with T2DM using basal insulin

Almost every patient had at least one diabetic complication, with neuropathy being the most common complication. Patients who were adherent to basal insulin had a marginally higher proportion of retinopathy (22.3% vs. 14.6%) and lower proportion of cardiovascular disease (21.7% vs. 28.5%) compared with patients who were non-adherent to basal insulin. However, the differences were not statistically significant.

The mean Adapted Diabetes Complication Severity Index as well as mean Quan–Charlson Comorbidity Index scores were relatively similar among adherent and non-adherent patients. Furthermore, nearly similar proportions of adherent and non-adherent patients had at least one inpatient hospitalization and emergency department visit.

Predictors of Basal Insulin Adherence Using Hybrid MPR

An increase in total claims-based days of insulin supply, the presence of retinopathy, and classification as adherent to insulin based on adjusted claims-determined MPR were positively associated with insulin adherence. In contrast, older age and the use of non-insulin injectable diabetes medications were negatively associated with insulin adherence (Table 3).

Table 3 Comparison of estimates from simple, stepwise logistic, and LASSO regression models

We found similar factors predictive of adherence to basal insulin using logistic and step-wise regression models compared with the LASSO model, confirming the robustness of our findings.

LASSO Model vs. Traditional Adherence Measures

As the threshold of predictive probability increases, sensitivity decreases and specificity increases. To balance sensitivity and specificity, we identified three candidate thresholds: (1) a threshold based on Youden’s Index (Y-index), calculated as sensitivity + specificity − 1, which ranges from 0 to 1; the highest observed value, where the sum of sensitivity and specificity is maximized, corresponds to the optimum predictive probability threshold; (2) a threshold chosen to maximize specificity, correctly identifying more non-adherent patients (true negatives) while keeping sensitivity greater than a coin flip (P > 0.5; referred to as SPECMAX); and (3) a threshold chosen to maximize sensitivity, correctly identifying more adherent patients (true positives) while keeping specificity greater than a coin flip (P > 0.5; referred to as SENSMAX; see Appendix S4 in the electronic supplementary materials).
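The three threshold rules can be sketched in Python as a search over a grid of candidate cut-points. This is one reading of the description above, not the procedure documented in Appendix S4: the function names, the grid, and the example probabilities are all hypothetical, and ties are resolved by taking the first candidate encountered.

```python
def sens_spec(labels, probs, threshold):
    """Sensitivity and specificity when probs >= threshold is classed as adherent."""
    pairs = list(zip(labels, probs))
    tp = sum(y and p >= threshold for y, p in pairs)
    fn = sum(y and p < threshold for y, p in pairs)
    tn = sum((not y) and p < threshold for y, p in pairs)
    fp = sum((not y) and p >= threshold for y, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def pick_thresholds(labels, probs, grid):
    rows = [(t, *sens_spec(labels, probs, t)) for t in grid]
    # Youden's index: maximize sensitivity + specificity - 1.
    youden = max(rows, key=lambda r: r[1] + r[2] - 1)
    # SPECMAX: highest specificity among thresholds with sensitivity > 0.5.
    specmax = max((r for r in rows if r[1] > 0.5), key=lambda r: r[2])
    # SENSMAX: highest sensitivity among thresholds with specificity > 0.5.
    sensmax = max((r for r in rows if r[2] > 0.5), key=lambda r: r[1])
    return {"youden": youden[0], "specmax": specmax[0], "sensmax": sensmax[0]}


# Hypothetical predicted probabilities: 10 adherent, then 10 non-adherent patients.
labels = [True] * 10 + [False] * 10
probs = [0.35, 0.45, 0.52, 0.58, 0.62, 0.68, 0.72, 0.80, 0.88, 0.95,
         0.05, 0.12, 0.20, 0.28, 0.32, 0.38, 0.42, 0.48, 0.55, 0.65]
chosen = pick_thresholds(labels, probs, grid=[0.3, 0.4, 0.5, 0.6, 0.7])
```

With these made-up data the Youden threshold falls between the other two: SENSMAX sits at a lower cut-point, trading specificity for sensitivity, and SPECMAX at a higher one.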

With respect to sensitivity, adherence defined using SPECMAX had the lowest value (52%), followed by adherence using the traditional claims-determined MPR (63%), adjusted claims-determined MPR (75%), Y-index (75%), and SENSMAX (81%). With respect to specificity, SENSMAX had the lowest value (53%), followed by adjusted claims-determined MPR (62%), Y-index (64%), traditional claims-determined MPR (66%), and SPECMAX (79%). Compared with existing methods, the Y-index-based threshold demonstrated a high level of both specificity and sensitivity (Table 4).

Table 4 Comparison of LASSO model performance with traditional measures used to estimate basal insulin adherence

Application of LASSO Model-Based Adherence Equation

For the larger claims-only validation sample, we identified 23,391 patients with T2DM using basal insulin. Adherence was calculated using the LASSO model-based and traditional claims-based adherence measures. We found a similar hierarchy of adherence levels with the various measures of adherence. For example, 28% of patients were classified as adherent to basal insulin using the traditional MPR threshold; 46% using the Y-Index-based threshold; and 55% using the adjusted MPR-based threshold (Fig. 2).

Fig. 2

Adherence levels using different methods to identify insulin adherence in the larger validation cohort. MPR medication possession ratio, P-SENMAX threshold yielding maximum sensitivity based on predictive model, SPECMAX threshold yielding maximum specificity based on predictive model, Trad traditional, Y-Index Youden’s Index.

We also examined the agreement between different measures of adherence using κ-statistics (see Appendix S5 in the electronic supplementary materials). We found a strong agreement between adherence based on the adjusted claims-determined MPR and the SENSMAX threshold (κ = 0.93); and the adjusted claims-determined MPR and the Y-Index based threshold (κ = 0.83).
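As a reminder of what a κ-statistic measures, the Python sketch below computes Cohen's κ for two binary adherence classifications of the same patients. The text does not state which κ variant was used, so Cohen's κ is an assumption here, and the function name and example data are hypothetical.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary (0/1) classifications of the same patients."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    p_a = sum(a) / n                                   # share adherent by measure a
    p_b = sum(b) / n                                   # share adherent by measure b
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)       # agreement expected by chance
    return (observed - expected) / (1 - expected)


# Hypothetical example: two measures disagree on one of eight patients.
measure_a = [1, 1, 1, 1, 0, 0, 0, 0]
measure_b = [1, 1, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(measure_a, measure_b)
```

κ corrects raw agreement for the agreement expected by chance, which is why two measures that agree on seven of eight patients here score 0.75 rather than 0.875.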


To the best of our knowledge, this is the first study using two different sources of information for the same patient sample—survey-based patient self-reported data and their merged claims-based data—to calculate adherence to insulin. Insulin dose varies widely among patients, and dispensed quantities of insulin often last longer than 30 days. Traditional and adjusted MPR are calculated from pharmacy claims, and previous studies have shown the disadvantages of traditional MPR as it cannot correctly account for actual insulin usage [1, 2, 14]. Although the adjusted MPR accounts for differences in insulin package sizes, traditional and adjusted MPRs do not account for wastage or stockpiling of insulin, and no measure ensures insulin dispensed was used as prescribed. While the survey-based self-reported MPR can better describe insulin dose and days supply, it relies on patients’ recall, which may be inaccurate. No one method for measuring insulin adherence is accurate [1], so blending patient-reported and claims-based data may produce a better estimate of insulin adherence.

The mean total daily basal insulin dose reported by patients in the survey was lower than the estimated claims-based current insulin dose. That is, dispensed basal insulin could last for a longer period than indicated by the claims-based days of insulin supply. This finding confirms that, relative to their claims-reported days of supply, patients with T2DM using basal insulin received more than the required dose, mainly because of differing package sizes. This is also consistent with the actual insulin fill interval. Our findings suggest that traditional MPR calculations may underestimate basal insulin adherence mainly due to differences in patient-reported and claims-derived total daily dose.

Our novel method to estimate adherence using both patients’ self-reported and administrative claims data (hybrid MPR) showed greater adherence to basal insulin compared with traditional claims-based measures among patients with T2DM using basal insulin. The adjusted claims-determined MPR takes into account the population average to estimate the gap of insulin supply over time, and treats that gap as if insulin was being filled later due to excess pending insulin, which cannot be true in every case. Some patients may truly have a gap in insulin supply, making them temporarily non-adherent. Therefore, adjusting for gap, i.e., multiplying the average gap ratio by the traditional claims-determined MPR for each patient, could overestimate true adherence. Our study findings support this hypothesis and suggest that the traditional claims-determined MPR may underestimate true adherence as it does not account for insulin package size adjustment, while adjusted claims-determined MPR may overestimate adherence as it applies an adjustment factor to truly non-adherent patients. Therefore, it is likely that the true adherence lies somewhere between the estimates from the traditional and adjusted claims-determined MPR measures.

Furthermore, there was a strong agreement between defining adherence to basal insulin using adjusted claims-determined MPR and a model-based estimate or hybrid MPR. Although most patients who were classified as being adherent using the hybrid MPR measure were also classified as adherent based on the adjusted claims-determined MPR, there was still a proportion of patients who were classified as non-adherent based on hybrid MPR but were classified as adherent using the adjusted claims-determined MPR.

Our study has some limitations. The patient self-reported adherence measures were derived to mimic claims-based adherence measures. Such measures were based on the utilization or supply of insulin rather than on the consumption of insulin. Supplied insulin may not have been used by patients according to their physician’s recommendations. Moreover, because of the cross-sectional nature of the survey, we were unable to examine self-reported consumption and administration of insulin over time relative to the reported insulin dose. We also noted a tendency for patients to titrate their insulin dose based on their feelings and daily activities; a patient may therefore have used a lower or higher amount of insulin than the prescribed dose on given days. Such fluctuations among patients could lead to over- or underestimation of insulin adherence. Additional studies should be conducted to document the influence of such fluctuations by measuring adherence using daily diaries or electronic monitoring devices. Lastly, claims may have contained undetected errors in coding. All participants in this study were members of a large US commercial health plan. The results may not be generalizable to patients with government-provided insurance or who are uninsured, or to those living outside the United States.


Calculating insulin adherence is complex and there is no “gold standard.” Individualized insulin regimens complicate the ability of claims-based measures to capture adherence accurately. We aimed to develop a novel method to measure insulin adherence by integrating two sources of data—patient-reported and claims-determined—to address some of the inherent limitations associated with traditional claims-based and patient-reported adherence measures. By incorporating information from the patients on how they use insulin, this study attempted to enhance our understanding of basal insulin adherence. When compared against the claims-based adherence measures, the patient-reported mean total daily basal insulin dose was lower than that estimated from claims, suggesting traditional claims-determined MPR calculations may underestimate basal insulin adherence mainly due to differences in patient-reported and claims-derived total daily dose. To aid in the screening of patients potentially at risk of poor adherence to insulin, we have developed a model of insulin adherence based on patients’ self-reported mean daily dose. Requiring only administrative claims data, our model can help identify and screen patients, possibly non-adherent to insulin, for more targeted engagement. Further research into insulin adherence, including replication of our model for further validation, is desired especially since there is a growing pool of injectable antidiabetics for which adherence is similarly difficult to measure.