Introduction

Relative to non-Hispanic white patients, non-white patients in the US have a disproportionately elevated risk of type 2 diabetes mellitus and of type 2 diabetes mellitus-related morbidity and mortality [1,2,3,4]. Non-white patients have also been shown to exhibit poorer metabolic control than white patients [5, 6]. Analyses of efficacy and tolerability of pharmacologic interventions, including those used to treat type 2 diabetes mellitus, are based on clinical trials that often include predominantly white populations, in part because of challenges associated with recruitment and retention of racially/ethnically diverse study populations [7, 8]. Despite the underrepresentation of non-white patients, there is some evidence that treatment efficacy and tolerability differ by race for some antihyperglycemic medications [9, 10]. Indicators of tolerability, such as hypoglycemia incidence, have been reported to be elevated in African American patients relative to non-Hispanic white patients [10]. This observation has been attributed to self-administered type 2 diabetes mellitus treatments and may be due to differences in adherence, lower use of or access to HbA1c testing, as well as environmental and lifestyle factors.

Sub-optimally controlled type 2 diabetes mellitus may require use of more intensive treatment options, possibly including combination therapies and/or injectable antihyperglycemic medication. These more intensive therapies can lead to increased treatment complexity and reduced tolerability. Evidence from randomized controlled trials has shown that exenatide once weekly (EQW), as a later line therapy, may have some clinical advantages such as convenient dosing, weight loss, low incidence of hypoglycemia, improved blood pressure, and an improved lipid profile. However, the degree to which these advantages lead to improved effectiveness and tolerability in customary clinical care and among different races is less known.

While prior studies have observed differential type 2 diabetes mellitus characteristics and type 2 diabetes mellitus treatment responses by race, studies with real-world data have not investigated race-specific differences in the tolerability or effectiveness of EQW. In this study, we evaluate the tolerability and effectiveness of EQW among type 2 diabetes patients of different races in routine patient care in the US, with additional results for basal insulin (BI) provided for context.

Methods

Data Source

The study population was drawn from Optum’s electronic health records (EHR) database. The EHR database is updated on a quarterly basis and integrates records from many medical groups and hospitals. The contributing medical groups represent a geographically diverse patient population within the USA. At the time of this study, the EHR database included over 25,000 physicians and over 25 million patients, forming a broad patient-level database of healthcare encounters in ordinary clinical practice. The EHR captures clinical, operational, and financial information that physicians record at the time of care. This information includes diagnoses, procedures, medications (prescribed and administered), clinical measures (biometric and laboratory values), and clinical notes [e.g., physician, pathology, and radiology notes as well as notes indicating hypoglycemia and gastrointestinal (GI) symptoms].

The data access and conduct of this study conformed to applicable Health Insurance Portability and Accountability Act (HIPAA) policies. Approval of the study protocol and a waiver of informed consent were obtained from a central institutional review board.

Study Design and Population

Injectable-naive patients with type 2 diabetes mellitus who initiated either EQW or BI were identified within the EHR data between January 2012 and January 2015, with follow-up through March 2015. The baseline period was defined as the 183 days (6 months) prior to the date of study drug initiation (index date). From the EHR data, patients with type 2 diabetes mellitus were identified using International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, and initiations of EQW or BI (i.e., insulin glargine or insulin detemir) were identified in the EHR data capturing prescribed medications using National Drug Codes (NDC) and drug descriptions. The study population was limited to those who were at least 18 years old, had at least 6 months of available baseline data (defined as documentation of an outpatient medical visit 6 or more months prior to study drug initiation), and had at least one diagnosis of type 2 diabetes mellitus (ICD-9-CM: 250.X0 or 250.X2) prior to and including the date of the study drug initiation. Patients with a prior diagnosis of type 1 diabetes (ICD-9-CM: 250.X1 or 250.X3) or gestational diabetes, or with evidence of prior dispensings of a glucagon-like peptide 1 receptor agonist or insulin, were excluded from the study population. Using the prescription data, the date of cohort entry was identified as the date of first prescription of EQW or BI meeting the study eligibility criteria.
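The eligibility rules above can be illustrated with a minimal sketch. The ICD-9-CM patterns (250.X0/250.X2 and 250.X1/250.X3), the age threshold, and the 183-day baseline come from the text; the function and argument names are hypothetical, and gestational diabetes and prior injectable use are passed in as flags because the text does not list the codes used to identify them.

```python
import re

# ICD-9-CM patterns taken from the text; "X" is any fourth digit.
TYPE2_PATTERN = re.compile(r"250\.\d[02]")  # 250.X0 or 250.X2
TYPE1_PATTERN = re.compile(r"250\.\d[13]")  # 250.X1 or 250.X3


def is_type2_code(code: str) -> bool:
    """True if the code is a type 2 diabetes code (250.X0 or 250.X2)."""
    return TYPE2_PATTERN.fullmatch(code.strip()) is not None


def is_type1_code(code: str) -> bool:
    """True if the code is a type 1 diabetes code (250.X1 or 250.X3)."""
    return TYPE1_PATTERN.fullmatch(code.strip()) is not None


def is_eligible(age_years, baseline_days, diagnosis_codes,
                prior_gestational_diabetes, prior_glp1_or_insulin):
    """Apply the inclusion and exclusion criteria to one patient.

    diagnosis_codes: ICD-9-CM codes recorded on or before the index date.
    The two boolean flags are hypothetical inputs assumed to be derived
    elsewhere.
    """
    if age_years < 18 or baseline_days < 183:
        return False
    if not any(is_type2_code(c) for c in diagnosis_codes):
        return False
    if any(is_type1_code(c) for c in diagnosis_codes):
        return False
    return not (prior_gestational_diabetes or prior_glp1_or_insulin)
```

In this sketch a patient with both type 1 and type 2 codes is excluded, which is one reading of the exclusion rule.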

Propensity Score Analysis and Matching

All analyses were conducted with the use of SAS version 9.2. Propensity matching was implemented to achieve balance between EQW and BI with respect to a large number of characteristics [11, 12]. The propensity score model incorporated dozens of predictors, identified in EHR data from the baseline period, of the use of EQW over BI. These variables included a wide range of demographic, medical, and health care utilization information that may have been associated with the decision to begin one or the other treatment. Clinical observations (i.e., body weight, body mass index, systolic and diastolic blood pressure) were selected from the EHR observation table, and laboratory values [i.e., hemoglobin A1c (HbA1c), serum creatinine, urine albumin/creatinine ratio (ACR), total cholesterol (TC), high density lipoprotein (HDL) cholesterol, low density lipoprotein (LDL) cholesterol, and triglycerides (TG)] were selected from the EHR laboratory table. For clinical observations and laboratory values, the last value recorded in the baseline period was selected to represent status at initiation of therapy; if no value was observed during baseline, the value was multiply imputed (5 imputations) using fully conditional specification (FCS) methods [13]. Estimated glomerular filtration rate (eGFR) was calculated from serum creatinine, sex and race variables using the Chronic Kidney Disease Epidemiology Collaboration Equation [14]. Hypoglycemia during the baseline period was identified using an algorithm that incorporated both diagnostic codes and natural language processing for clinical notes [15, 16].
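As an illustration of the eGFR calculation, the sketch below assumes that the cited equation is the 2009 CKD-EPI creatinine equation, which takes serum creatinine (mg/dL), age, sex, and race as inputs. The function name and argument names are hypothetical, and the study's actual implementation in SAS is not shown in the text.

```python
def ckd_epi_2009(scr_mg_dl: float, age_years: float,
                 female: bool, black: bool) -> float:
    """Estimated GFR (mL/min/1.73 m^2), assuming the 2009 CKD-EPI
    creatinine equation."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scale
    alpha = -0.329 if female else -0.411    # exponent below the scale
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr
```

For example, `ckd_epi_2009(1.1, 60, female=False, black=False)` returns the estimate for a 60-year-old non-black man with a serum creatinine of 1.1 mg/dL.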

Propensity scores were estimated by a logistic regression model with treatment (EQW versus BI) as the dependent variable and potential predictors of therapy as the independent variables. The propensity score model included a priori-specified confounding variables and stratification variables so the populations would remain balanced when conducting stratified analyses under the assumption that there was no important interaction between stratification variables and other covariates. Clinically important variables were identified using univariate c-statistics and were forced into the propensity score model. Other covariates were allowed to enter the model using a stepwise selection based on a univariate p-value for entry (p < 0.2) and a multivariate p-value for remaining in the model (p < 0.3). The final propensity score model included several groups of predictors: demographics (age, sex, race, US region of residency, calendar year of initiation); health history (diagnosis indicating the presence or history of smoking, alcohol abuse, obesity, any complication of diabetes as a composite and specifically neuropathy); healthcare utilization [presence of ER visit, hospitalization, foot examination, hospitalization days, number of medical care sites, number of 3-digit diagnosis codes, number of procedures, number of medications, number of antidiabetic medications, and class of antidiabetic medications (e.g., metformin, TZD, SGLT2, DPP-4)]; empirically selected diagnostic codes, procedure codes, and other medications that had different distributions of occurrence between EQW and BI initiators; and the potential stratification variables [age, race, weight, HbA1c, body mass index (BMI), blood pressure, eGFR, and hypoglycemia].
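The idea of a propensity score as a predicted probability of EQW initiation can be sketched as follows. This is not the study's SAS model: it fits a plain logistic regression by gradient ascent on a hypothetical covariate list, omits the stepwise selection step, and uses illustrative names and tuning values throughout.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.5, iterations=5000):
    """Fit logistic regression by gradient ascent on the log-likelihood.

    rows: list of covariate lists; labels: 1 for EQW, 0 for BI.
    Returns weights with the intercept first.
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(iterations):
        gradient = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            error = y - p
            gradient[0] += error
            for j, v in enumerate(x):
                gradient[j + 1] += error * v
        weights = [w + learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


def propensity_score(weights, x) -> float:
    """Predicted probability of initiating EQW rather than BI."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
```

In practice the covariates would be standardized and the model fitted with an established statistical package; the hand-written optimizer here only keeps the example self-contained.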

Each EQW initiator was matched to up to two BI initiators using a greedy matching algorithm. Once an EQW initiator was matched with two BI initiators, the members of the matched set were removed from subsequent matching [17, 18]. Cohorts balanced in this manner may be directly compared for outcomes since covariate imbalance is addressed by matching.
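A greedy 1:2 match of the kind described above might look like the following sketch. The processing order, the caliper, and all names are assumptions made for illustration; the text does not state the caliper or distance rule used in the study.

```python
def greedy_match(treated, controls, ratio=2, caliper=0.05):
    """Greedy nearest-neighbour matching on the propensity score.

    treated, controls: dicts mapping patient id -> propensity score.
    Each treated patient takes up to `ratio` of the closest unmatched
    controls within `caliper` (a hypothetical value); matched controls
    are removed from the pool before the next treated patient is handled.
    Returns a dict mapping treated id -> list of matched control ids.
    """
    available = dict(controls)
    matches = {}
    for t_id, t_score in treated.items():
        ranked = sorted(available, key=lambda c: abs(available[c] - t_score))
        chosen = [c for c in ranked[:ratio]
                  if abs(available[c] - t_score) <= caliper]
        for c in chosen:
            del available[c]
        if chosen:
            matches[t_id] = chosen
    return matches
```

Because the algorithm never revisits an earlier match, the result depends on the order in which EQW initiators are processed; that is the trade-off a greedy approach makes for speed relative to optimal matching.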

Outcome Definitions

Effectiveness

Primary outcomes of treatment effectiveness included changes from baseline in HbA1c, body weight, and BMI. For these analyses, the clinical and laboratory measures were extracted from the EHR data and summarized in standard intervals over the first year following initiation of treatment. HbA1c, weight, and BMI were summarized in quarterly intervals (3-month intervals). For these clinical observations and laboratory measures, the interval value was taken as the mean of values occurring within an interval. If no values were observed, then values were multiply imputed (5 imputations) using the FCS method. Parameter estimates and associated variance (standard errors) were determined within imputed data sets and pooled (averaged) into a single set of statistics (SAS PROC MIANALYZE) that reflects the uncertainty in parameter estimates across all imputed values.
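The pooling step can be sketched with Rubin's rules, which is the combination method that SAS PROC MIANALYZE applies: the pooled estimate is the average across imputations, and the pooled variance adds the between-imputation variance to the average within-imputation variance. The function name and the example inputs are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine per-imputation estimates and standard errors.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)
```

With five imputations, as used here, the between-imputation variance is inflated by a factor of 1.2 before being added to the within-imputation variance.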

Tolerability

The primary outcomes used to assess the tolerability of the given treatments were the occurrence of hypoglycemia and gastrointestinal symptoms (nausea, vomiting, diarrhea, and constipation) in follow-up. Algorithms were developed to identify events of hypoglycemia and gastrointestinal symptoms, separately using both diagnostic codes and natural language processing of clinical notes.

Using ICD-9 codes within administrative diagnostic fields, we identified hypoglycemia using a modification of the algorithm described by Ginde et al. (2008) [19]; we identified gastrointestinal symptoms using the following ICD-9 codes: 536.2, persistent vomiting; 787.01, nausea and vomiting; 787.02, nausea alone; 787.03, vomiting alone; 787.91, diarrhea; 564.5, functional diarrhea; and 564.0X, constipation. If multiple events were observed on the same day in an outpatient setting or within 7 days of a continuous inpatient stay, events were collapsed into a single event.

The natural language processing (NLP) algorithm processes the syntax of clinical notes to identify sentiment terms (denial, affirmation) associated with the event, enabling a determination of whether the patient/provider is denying or affirming the occurrence of an event. If multiple events were observed on the same day in an outpatient setting or within 7 days of a continuous inpatient stay, events were collapsed into a single event.

We included events identified by either the ICD-9 algorithm or the NLP algorithm. If the ICD-9 algorithm and NLP algorithm identified an event on the same day in an outpatient setting or within 7 days during a continuous inpatient stay, events were collapsed into a single event.
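One way to read the collapsing rule is sketched below. It pools events from the two sources, treats any events on the same calendar day as one, and treats inpatient events within 7 days of the first retained inpatient event as one. How a continuous inpatient stay was delimited in the study is not stated, so the 7-day window anchored on the first event is an assumption, as are all names.

```python
from datetime import date


def collapse_events(icd_events, nlp_events, inpatient_window_days=7):
    """Merge ICD-9 and NLP events and collapse duplicates.

    Each event is a (event_date, setting) tuple, where setting is
    "outpatient" or "inpatient". Returns the retained events in date order.
    """
    retained = []
    for event_date, setting in sorted(set(icd_events) | set(nlp_events)):
        if retained:
            last_date, last_setting = retained[-1]
            gap = (event_date - last_date).days
            same_day = gap == 0
            same_stay = (setting == "inpatient"
                         and last_setting == "inpatient"
                         and gap <= inpatient_window_days)
            if same_day or same_stay:
                continue  # counted as part of the previous event
        retained.append((event_date, setting))
    return retained
```

A usage example with hypothetical dates: an ICD-9 code and an NLP hit on the same outpatient day yield one event, and two inpatient records three days apart also yield one event.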

Analysis Plan

Patient race was identified from the EHR data and categorized as non-Hispanic white, African American/other black, or other race/ethnicity. Patients with unknown race were excluded from the analysis. All analyses described below were performed within race strata. Race differences were examined indirectly by comparing the difference between EQW and BI across strata. Each patient was followed from initiation of new therapy (date of cohort entry) until the earliest of the following: occurrence of a new outcome or event (separately for each event), disenrollment in the EHR system, or end of the study follow-up period. Analyses were conducted in an intent-to-treat fashion, with patients remaining within a single cohort (EQW or BI) throughout follow-up. The number of patients and person-time of observation were used to calculate the proportion and rates of occurrence of an outcome, respectively. The distributions of follow-up time were tabulated for each cohort.

Measurement of Change

For HbA1c, weight, and BMI, change was calculated as the percent difference between the measured value in the baseline period and measurements taken in each quarter of the first year of follow-up. Distributions of changes across each measure were summarized by providing the summary statistic (mean, mean percent differences) or the frequency of measures that were collapsed into a categorical metric. For mean percent differences, we provide the estimate and its 95% confidence interval (CI).
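The change measure can be written out as a short sketch: the interval value is the mean of measurements within a quarter, and the change is expressed as a percentage of the baseline value. The function names and the example values are hypothetical.

```python
def quarter_value(measurements):
    """Interval value: the mean of all measurements within one quarter."""
    if not measurements:
        raise ValueError("no measurements in this interval; imputation is needed")
    return sum(measurements) / len(measurements)


def percent_change(baseline, follow_up):
    """Percent difference of a follow-up value relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (follow_up - baseline) / baseline


# Hypothetical patient: baseline HbA1c of 8.2% and two values in one quarter.
change = percent_change(8.2, quarter_value([7.6, 7.4]))
```

A negative result indicates a reduction from baseline, which matches the sign convention of the percent changes reported in the Results.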

Measurement of Occurrence

For hypoglycemia, nausea, vomiting, diarrhea, and constipation, we evaluated the occurrence of each event separately. Additionally, we created combined metrics of any nausea or vomiting, any diarrhea or constipation, and any GI symptoms. We calculated the frequency of events and the proportion of each cohort that experienced the event during follow-up. We calculated incidence rates of each event (and 95% CI), using person-time censored at first event during follow-up. Cohorts were compared using a relative rate (RR) estimate and its 95% CI.
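A crude (unadjusted) version of these calculations is sketched below. It uses a Wald interval on the log scale, which is a standard large-sample approximation and is assumed here for illustration; the adjusted estimates reported in this study came from regression models instead. All names and the example counts are hypothetical.

```python
import math


def incidence_rate(events: int, person_years: float) -> float:
    """Events per person-year, with person-time censored at first event."""
    return events / person_years


def rate_ratio_ci(events_a, py_a, events_b, py_b, z=1.96):
    """Relative rate of cohort A versus cohort B with a Wald 95% CI.

    The standard error of log(RR) is approximated by
    sqrt(1/events_a + 1/events_b), so both counts must be positive.
    """
    rr = incidence_rate(events_a, py_a) / incidence_rate(events_b, py_b)
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper
```

The interval widens quickly as event counts fall, which is why estimates in small strata are imprecise even when the point estimate is far from 1.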

Further Adjustment for Confounding

The propensity matching of EQW to BI initiators was conducted to achieve balance in observed covariates between the full cohorts of BI and EQW initiators, but this balance was not fully retained within strata of race. For this reason, measures of change and occurrence were further adjusted through multivariable modeling to improve the comparability of EQW and BI patients within strata of race. For measures of change and occurrence, mean values, percentages, and rates were obtained as the least squares estimates, by cohort and race, from models adjusted for covariates that were imbalanced at baseline (linear regression for means, logistic regression for percentages, and Poisson regression for rates).

This study protocol and practices were approved by both privacy and institutional review boards affiliated with Optum. The study complies with the Health Insurance Portability and Accountability Act (HIPAA) guidelines for the protection of patient confidentiality and guidelines for good pharmacoepidemiology practices (GPP).

This article does not contain any new studies with human or animal subjects performed by any of the authors.

Results

The propensity-matched cohorts included 2008 EQW initiators matched to 4016 BI initiators. Each cohort had an average follow-up time of 1.5 years. The analyses presented here are limited to the 1946 EQW initiators and 3903 BI initiators for whom race was observed. Patient characteristics are reported by race and drug cohort (Table 1). Among included patients, 84% were white, 8% were African American, and 8% were identified as “other.” The “other” category included patients identified as Hispanic (65%), Asian (20%), and multiple races (15%). As a result of the propensity score matching, no notable differences were observed between drug cohorts within strata of race. Differences between races, however, were observed. As compared to white patients, African American and “other” patients were younger and more likely to be female. Measures of BMI and blood lipids (with the exception of triglycerides) were lower in white patients relative to African American and “other” patients.

Table 1 Comparison of baseline characteristics between propensity score-matched cohorts of EQW and BI initiators, stratified by race: 1 January 2012 to 31 January 2015, Optum Electronic Health Record Research Database

Primary measures of effectiveness are reported by race and drug cohort (Table 2). In addition to propensity score matching, outcome measures were adjusted for within-race differences in age, gender, smoking status, baseline measures (HbA1c, BMI, HDL, LDL), sulfonylurea use, and number of comorbidities. With varying degrees of magnitude and statistical significance, the effectiveness measures were generally more favorable for EQW initiators relative to BI initiators. Among white patients, HbA1c decreased in the first 6 months, and the reduction was maintained through 1 year. The largest reduction in HbA1c (measured as a percent change from baseline) in the EQW group occurred 3–6 months after initiation (−7.97, 95% CI: −9.31, −6.63). EQW initiators lost weight through the 12-month follow-up period. By 9–12 months after EQW initiation, the percent change in weight was −1.73 (95% CI: −2.45, −1.02) for white patients, −1.11 (95% CI: −3.02, −0.81) for African American patients, and −1.43 (95% CI: −3.63, −0.77) for “other” patients. The reductions in HbA1c and weight were similar across the race groups. In the BI cohorts, HbA1c was reduced but weight was not. The proportion of patients achieving any weight loss in conjunction with maintaining/achieving HbA1c ≤7% (53 mmol/mol) was consistently higher among EQW initiators relative to BI initiators. At 9–12 months following initiation, percent change in BMI was most notable among EQW initiators [−1.77 (95% CI: −2.49, −1.04) among white patients, −1.19 (95% CI: −3.17, 0.79) among African American patients, and −1.69 (95% CI: −4.08, 0.70) among “other” patients]. There was no appreciable change in BMI among BI initiators, regardless of race.

Table 2 HbA1c and weight change among propensity score-matched cohorts of EQW or BI initiators (with no prior use of an injectable diabetic medication), stratified by race: 1 January 2012 to 31 January 2015, Optum Electronic Health Record Research Database

Unadjusted mean values of HbA1c, weight, and BMI were compared between races among EQW initiators (Fig. 1). Measures of weight were comparable between white and African American patients; however, African American patients tended to have higher BMIs. White patients had a lower HbA1c at baseline relative to African American and “other” patients. This difference persisted throughout follow-up. In spite of the variations of point estimates between races, the changes from baseline in HbA1c, weight, and BMI followed similar patterns.

Fig. 1 Measures of weight, HbA1c, and BMI by race among EQW initiators

Incidence rates of tolerability indicators are reported by race and drug cohort, and relative rates are reported for EQW versus BI within strata of race (Table 3). Within each race stratum, incidence rates and relative rates were adjusted for age, gender, smoking status, baseline measures (HbA1c, BMI, HDL, LDL), sulfonylurea use, and number of comorbidities. Among white patients, EQW initiators had an 18% lower rate of hypoglycemia relative to BI initiators (RR: 0.82; 95% CI: 0.66, 1.01). The association was somewhat more pronounced among African American patients, among whom EQW initiators had a 41% lower rate of hypoglycemia relative to BI initiators (RR: 0.59; 95% CI: 0.26, 1.34). While the magnitude of association between drug cohort and hypoglycemia differed, no formal tests were conducted to assess the significance of this difference. Associations between drug cohort and GI symptoms, both individually and as combined outcomes, were comparable between races. Among white and African American patients, rates of GI symptoms tended to be higher in EQW initiators relative to BI initiators. Owing in part to the limited sample size, the estimates for African American and “other” patients were less precise than those for white patients.

Table 3 Frequency of hypoglycemia and GI symptoms in propensity score-matched cohorts of EQW and BI initiators stratified by race: 1 January 2012 to 31 January 2015, Optum Electronic Health Record Research Database

Discussion

Observed measures of tolerability and effectiveness were generally more favorable for EQW relative to BI, and this finding was comparable between races. Though no formal tests of significance were conducted for between-race comparisons, EQW appeared to be more strongly associated with reduced hypoglycemia incidence in African American patients relative to white patients. The stronger negative association in African American patients was predominantly driven by an elevated incidence rate of hypoglycemia among African American BI initiators. This is consistent with prior research identifying increased hypoglycemia risk among African Americans treated with insulins [10].

The propensity score matching implemented in this study balances observed confounders between EQW and BI cohorts. While the distribution of race was balanced across treatment groups through propensity score matching, further multivariable adjustments were made to improve covariate balance across treatment within race strata. The multivariable adjustment beyond propensity matching had a minimal impact on the observed measures. No adjustment or standardization was implemented to attain a balance of covariates between race groups. As such, apparent differences in tolerability or effectiveness between races should not be interpreted as a direct effect of race, but may be due to differential type 2 diabetes mellitus characteristics present at baseline. Because this cohort study captures the real-world differences in risk factors within race, these comparisons remain informative for the assessment of expected effectiveness and tolerability of EQW by race in a real-world setting. That is, covariates that are associated with both race and tolerability or effectiveness should be viewed as mediators explaining the association rather than confounders of the association.

The measures of effectiveness and tolerability observed in this observational study of real-world data are generally comparable with or slightly lower than findings from clinical trials. In a randomized trial of EQW versus insulin glargine with 1.6 years of follow-up [20], the EQW group had an absolute HbA1c decline from baseline to follow-up of −1.2%. While the average baseline HbA1c was similar in the trial (8.3%) and in our study cohorts (8.2% for white EQW patients and 8.6% for African American EQW patients), the decline seen in the trial population was slightly greater than the absolute declines seen in the first year of this cohort study, −0.62% and −0.55% for white and African American patients, respectively. The weight loss experienced by EQW users in the trial was similar to findings within our observational cohort study. In the trial population, EQW users lost 2.1 kg of weight compared to 1.9 kg for white patients and 1.0 kg for African American patients in the cohort study. Comparisons between the trial and our cohort should consider that the trial included a longer follow-up time for changes in weight and HbA1c (84 weeks) as compared to our cohort study (52 weeks). Both studies showed that hypoglycemia occurred less often among EQW versus BI users, yet the detection of hypoglycemia in trial participants was much greater than in our study. Specifically, among trial participants, 24% of EQW users and 54% of insulin glargine users experienced mild hypoglycemia, yet in the cohort study, accounting for all hypoglycemia events in the EHR regardless of severity, we found that among all race groups combined only 7.4% of EQW and 9.2% of BI initiators experienced any hypoglycemia in follow-up. Similarly, in the trial nausea occurred more frequently among EQW users (12%) than among insulin glargine users (6%) within 1.6 years of follow-up, compared with 15% of EQW initiators and 13% of BI initiators in our cohort study.

This study was based on an analysis of EHR data. This real-world data resource reflects patient distributions and medical interactions captured from a geographically diverse set of hospitals and medical groups. While EHR data are valuable for examination of clinical health care outcomes and treatment patterns, databases derived from EHR (such as the one used for this study) have certain inherent limitations because the data are collected for the purpose of clinical patient management, not research. First, these data represent the intent of the prescriber through the written prescription for a medication and do not indicate that a medication was filled, consumed, or taken as prescribed. This does not alter the interpretation of the results from a clinical perspective (i.e., outcomes associated with providing a prescription) since the non-adherence leading to this form of misclassification is likely to be representative of patient populations in a real-world setting. From a patient perspective (i.e., outcomes associated with taking the medication), this form of non-adherence could lead to an underestimation of the incidence of tolerability indicators (hypoglycemia and GI symptoms) and bias the observed effectiveness (HbA1c control and weight change) toward no difference. Second, clinical variables are missing for some individuals because of variation in care practices and potentially other factors. Additionally, health care encounters with medical providers who do not contract with Optum’s EHR services would not be observed. Finally, the observed patient characteristics within this EHR database included a greater proportion of patients identified as white relative to national estimates.

To facilitate the use of EHR data for the assessment of measures of effectiveness and tolerability, a multiple imputation method was implemented to estimate values within standard intervals of follow-up. Multiple imputation is founded on the assumption that unobserved variables are missing at random (i.e., missingness is random after conditioning on observed covariates). This assumption is more broadly applicable than the assumption that missingness is completely at random (i.e., missingness is independent from any covariate, observed or unobserved), which would mean the missingness could generally be ignored in analysis. While multiple imputation reduces the potential for bias, it is possible that patients with observed values are systematically different from those with unobserved values in unmeasured ways.

Conclusion

This retrospective cohort study within a US EHR database assessed real-world measures of effectiveness and tolerability of EQW relative to BI within strata of race. Regardless of race, patients initiating EQW appeared to have better weight control, larger reductions in HbA1c, and reduced risk of hypoglycemia relative to BI. Risks of GI symptoms were modestly elevated among EQW initiators, though only the observed difference for nausea in white patients appears to exclude a chance finding. Measures of EQW effectiveness and tolerability were generally comparable between races.