Background

Diabetes has become one of the most common and expensive medical conditions amongst older adults. Nearly 25% of all adults aged 60 and older in the United States have diabetes [1], and this group accounts for over half of the more than $100 billion in healthcare expenditures attributable to this disease [2]. Reducing the disease and economic burden of diabetes has been a long-standing goal in health policy.

Carefully designed analyses of secondary datasets, including Medicare claims data, can contribute to diabetes health services research. The strength of this design is critically dependent on being able to identify diabetic patients in datasets. Utilization-based algorithms are frequently used to identify chronic disease cohorts in national Medicare data and other administrative datasets. However, concordance between disease status assessed by these algorithms and patient self-reports can be low, raising concern about the bias associated with research designs relying on non-clinical data [3, 4].

A diabetes diagnosis requires clinical measures of Hemoglobin A1c (the gold standard since 2010), a measure of blood sugar control over the past 60–90 days, or fasting blood sugar [5]. However, clinical measures of diabetes status are rarely available for health services research, and most researchers rely on more subjective measures, such as self-reports or claims-based diagnoses, to identify the diabetic population. Both measures have strengths and limitations.

Self-reports are commonly used to assess diabetes prevalence in survey data when biological measures are unavailable [6–9]. Self-reports are believed to underestimate diabetes prevalence because they depend on a patient seeing a doctor to receive a clinical diagnosis and correctly reporting when asked. The standard approach in surveys with biomarkers is to take all self-reports as true cases and add in the people who self-report “no” but test above the diagnostic threshold. The National Health and Nutrition Examination Survey (NHANES) is the most common data source used for studying the population of undiagnosed diabetics. The latest published NHANES estimates for the 65 and older population for years 2003–2006 indicate a diabetes rate of 21.1% based on the sum of self-reported diabetics (17.7%) and self-reported non-diabetics with an A1c reading greater than or equal to 6.5 (3.5%) [10].
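The survey-with-biomarker approach described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used by NHANES or any particular study: the `Respondent` type and function names are hypothetical, the cutoff is applied as A1c ≥ 6.5, and the percentages are unweighted, whereas published survey estimates apply sampling weights.

```python
from dataclasses import dataclass

A1C_THRESHOLD = 6.5  # diagnostic cutoff (percent) cited in the text


@dataclass
class Respondent:
    # Hypothetical record layout, for illustration only.
    self_reported_diabetes: bool  # answered "yes" to the survey question
    a1c: float                    # measured Hemoglobin A1c, in percent


def classify(r: Respondent) -> str:
    """Return 'diagnosed', 'undiagnosed', or 'non-diabetic'."""
    if r.self_reported_diabetes:
        return "diagnosed"        # every self-report is taken as a true case
    if r.a1c >= A1C_THRESHOLD:
        return "undiagnosed"      # said "no" but tests at or above the cutoff
    return "non-diabetic"


def prevalence(sample: list[Respondent]) -> dict[str, float]:
    """Unweighted percentages by group, plus the combined total."""
    counts = {"diagnosed": 0, "undiagnosed": 0, "non-diabetic": 0}
    for r in sample:
        counts[classify(r)] += 1
    shares = {k: 100.0 * v / len(sample) for k, v in counts.items()}
    # Total prevalence is the sum of diagnosed and undiagnosed cases.
    shares["total"] = shares["diagnosed"] + shares["undiagnosed"]
    return shares
```

Note that a self-reported diabetic with a low A1c still counts as a case under this rule, which is why well-controlled diabetics are not lost from the estimate.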

Claims-based diagnoses may be obtained from patient billing records, hospital discharge abstracts, and physician data and are usually based on algorithms that may or may not match actual diagnoses found in medical records [11]. The Centers for Medicare and Medicaid Services’ Chronic Condition Warehouse (CCW) algorithm is the primary diabetes algorithm used in Medicare claims-based research because its indicator is included in the CMS research files, notably the Beneficiary Annual Summary File, and many users are therefore likely to adopt it. The CCW algorithm defines a diabetic as someone who has had at least one inpatient, skilled nursing or home health visit or at least two outpatient visits with a diabetes-related ICD-9 code during a two year period [12]. This definition was observed to be adequately sensitive (≥70%) and reliable (kappa ≥ 0.80) [13]. Other studies of administrative claims-based measures of diabetes status have yielded varying sensitivities ranging from 64% to 87% when validated against different benchmarks, including laboratory data, medical records, and self-reports [4, 14–16].
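The decision rule in the CCW definition above can be sketched as follows. This is an illustrative simplification, not the official CCW specification: the `Claim` layout and setting labels are hypothetical, and the sketch assumes that only ICD-9 codes beginning with "250" count as diabetes-related, whereas the actual code list may be broader.

```python
from dataclasses import dataclass
from datetime import date

# Settings where a single qualifying claim is enough, per the definition above.
SINGLE_CLAIM_SETTINGS = {"inpatient", "skilled_nursing", "home_health"}


@dataclass
class Claim:
    # Hypothetical claim record, for illustration only.
    service_date: date
    setting: str            # e.g. "inpatient", "outpatient"
    icd9_codes: list[str]   # diagnosis codes listed on the claim


def is_diabetes_code(code: str) -> bool:
    # ASSUMPTION: only ICD-9 250.xx is treated as diabetes-related here.
    return code.startswith("250")


def ccw_diabetes_flag(claims: list[Claim], start: date, end: date) -> bool:
    """Apply the one-institutional-or-two-outpatient rule within [start, end]."""
    outpatient_hits = 0
    for c in claims:
        if not (start <= c.service_date <= end):
            continue  # outside the two-year reference period
        if not any(is_diabetes_code(code) for code in c.icd9_codes):
            continue  # no diabetes-related diagnosis on this claim
        if c.setting in SINGLE_CLAIM_SETTINGS:
            return True  # one inpatient, skilled nursing or home health claim suffices
        if c.setting == "outpatient":
            outpatient_hits += 1
    return outpatient_hits >= 2
```

Because the rule keys on coded encounters rather than on a confirmed diagnosis, a patient who is repeatedly screened or monitored could be flagged without meeting clinical criteria.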

Investigating the quality of survey self-report and claims-based diabetes measures is important because data users rely on these measures in health services research to produce population-based estimates of the diabetes population. These estimates often have important policy implications and any inaccuracies can lead to incorrect inferences. For example, overestimation of the diabetes population may lead to misleading conclusions about the quality of diabetes care as the false positives may be seen as not receiving adequate amounts of care. Conversely, underestimation of the diabetes population may lead to conclusions that understate the prevalence of the disease and the economic burden of the disease.

This paper provides some insight as to the accuracy of self-report and CCW claims-based diabetes measures and their implications for health services research. In it, we use nationally representative survey data linked to Medicare claims and measured Hemoglobin A1c levels from an in-person blood draw to compare the accuracy of the self-report and CCW measures relative to the A1c reading. Furthermore, we examine discrepancies between the two measures and compare commonly used healthcare utilization outcomes across each of the diabetes definitions.

Methods

We utilize the 2006 wave of the Health and Retirement Study (HRS) [17]. The HRS is a nationally representative longitudinal study of Americans over the age of 50. In 2006, HRS began collecting several physical measurements, including biologic specimens: saliva and dried blood spots. Here we focus on the collection of blood spots and measured Hemoglobin A1c, which was obtained for 83 percent of the biomarker subsample who consented to the blood draw, to validate and understand discrepancies in self-reported and claims-based diabetes status.

Medicare-eligible HRS respondents were asked for permission to link their Medicare records from the Centers for Medicare and Medicaid Services (CMS). HRS data and records from the 2006 Medicare Beneficiary Annual Summary Files were linked for 88 percent of respondents who consented to the linkage. Consent rates were higher for younger, wealthier, Hispanic, and non-white respondents. Diabetes prevalence is higher amongst minorities and the younger-old [2], suggesting that the self-selected subsample represents persons who are more likely to be diabetic.

Three measures of diabetes status were utilized: survey self-reports (“Has a doctor ever told you that you have diabetes or high blood sugar?”), indication of diabetes based on the Chronic Condition Warehouse (CCW) algorithm, and a Hemoglobin A1c score of 6.5 or higher [5].

Four healthcare utilization outcomes were extracted from the Medicare claims and used to assess differences between the diabetes definitions. The first three outcomes (total Medicare reimbursement, number of office visits, and number of hospitalizations in 2006) relate to general healthcare usage. The last outcome, the number of A1c tests ordered between years 2002–2005, is used as an indicator of quality of diabetes care, a commonly used measure in quality of care studies [18–21].

We restrict the sample to persons aged 65 years and older who were Medicare-eligible at the time of interview (n = 11,354). The sample was further restricted to individuals randomly selected for biomarker collection (n = 5,784), of which 3,820 completed the blood draw and received a valid A1c score. Of these cases, Medicare records were linked to 3,389 respondents. Beneficiaries who were not continuously enrolled in fee-for-service Medicare were excluded (n = 1,185) along with potential users of Veterans Affairs (VA) services to ensure that all health care utilization was reflected in the claims (n = 176). The resulting analytic sample consisted of 2,028 respondents.
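The final exclusion steps can be checked with simple arithmetic. The short sketch below uses only the counts reported in the paragraph above; the variable names are illustrative.

```python
# Counts reported in the text: respondents with a valid A1c score
# and linked Medicare records, followed by the two exclusions.
linked_with_valid_a1c = 3389
exclusions = {
    "not continuously enrolled in fee-for-service Medicare": 1185,
    "potential users of VA services": 176,
}

remaining = linked_with_valid_a1c
for reason, n in exclusions.items():
    remaining -= n
    print(f"after excluding {reason} (n = {n:,}): {remaining:,}")

analytic_sample = remaining  # matches the reported analytic sample of 2,028
```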

Concordance between the self-report and claims data was assessed using t-tests and two-way tables. Mean A1c levels and utilization outcomes are reported for the concordant and discordant cases. All analyses were performed using appropriate survey procedures in SAS 9.1.3 inside a secure data enclave at the Institute for Social Research in Ann Arbor, Michigan. This study was exempt from Institutional Review Board oversight.

Results

First, to assess the comparability of our analytic sample, we compare diabetes estimates from the linked HRS-Medicare-A1c sample with those from three national samples: the 2005–2006 National Health and Nutrition Examination Survey (NHANES), the linked HRS-A1c only sample, and the linked HRS-Medicare only sample. All samples are restricted to the Medicare-eligible population aged 65 and older. Table 1 shows that the percentage of self-reported diabetics (21.1; 95% CI: 19.2-23.2), clinical diabetics with A1c reading ≥ 6.5 (12.6; 95% CI: 10.7-14.4), and undiagnosed diabetics (4.4; 95% CI: 3.2-5.6) in the linked HRS-Medicare-A1c sample are statistically comparable to the corresponding estimates obtained from the other three samples, suggesting consistency across the sources and minimal bias in using the linked HRS-Medicare-A1c sample.

Table 1 Prevalence of Diabetes amongst Adults ages 65 and older in Three National Samples

The CCW claims-based algorithm yields a significantly higher percentage of diabetes compared to self-reports (Claims: 27.3; SRs: 21.2; p < 0.05). However, the overall percentage of diabetes (based on the sum of self-reported and undiagnosed diabetics) obtained from the survey data (25.6) is statistically indiscernible from the claims-based estimate (27.3; 95% CI: 25.3-29.2).

The next analysis examines the two-way agreement between the self-report and the claims-based diabetes definitions (Table 2). About 19.5 percent (or n = 406) of respondents are classified as diabetic based on both self-report and claims data and 71.1 percent (or n = 1,426) are classified as non-diabetics in both data sources. The remaining 9.3 percent of the sample (n = 196) yield a conflicting result between the two measures. Most of the discordant cases (82.7 percent; n = 162, 7.7 percent of total sample) are flagged as diabetic in the claims data. Discordant claims-based diabetics tend to be older and self-report better health status compared to concordant diabetics (Table 3).
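As an illustration of how agreement in such a two-way table can be summarized, the sketch below computes the observed agreement and Cohen's kappa from the unweighted cell counts given above. The count of self-report-only cases is derived here as 196 − 162 and is not stated directly in the text. The resulting figures are unweighted and are not statistics reported in this study, which uses survey-weighted estimates.

```python
def cohen_kappa(both_yes: int, a_only: int, b_only: int, both_no: int) -> float:
    """Cohen's kappa for a 2x2 agreement table of two binary measures."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n   # share flagged by measure A
    b_yes = (both_yes + b_only) / n   # share flagged by measure B
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)  # chance agreement
    return (observed - expected) / (1 - expected)


# Unweighted counts from the text.
both_diabetic = 406
both_non_diabetic = 1426
discordant = 196
claims_only = 162
self_report_only = discordant - claims_only  # derived, not reported in the text

n_total = both_diabetic + both_non_diabetic + discordant
observed_agreement = (both_diabetic + both_non_diabetic) / n_total
kappa = cohen_kappa(both_diabetic, self_report_only, claims_only, both_non_diabetic)

print(f"n = {n_total}, observed agreement = {observed_agreement:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which matters here because most respondents are non-diabetic under either measure.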

Table 2 Cross-classification of self-report and claims-based diabetes status
Table 3 Percent of Medicare beneficiaries aged 65 and older, by Discordant (Claims) and Concordant Diabetes Classification Group and characteristics

Clinical classifications based on A1c readings are shown in Table 4. Not surprisingly, concordant diabetics (CDs) and concordant non-diabetics (CNDs) yield the highest and lowest percentage of clinical diabetes (A1c ≥ 6.5), respectively (CDs: 45.7; CNDs: 3.5), among the four possible agreement groups. Among the discordant cases, self-reported diabetics yield a significantly higher mean A1c reading (6.32 vs. 5.86), a higher percentage of clinical diabetes (30.5 vs. 12.4), and a lower percentage of A1c readings less than 6.0 (42.8 vs. 63.6) relative to the CCW-based diabetics (all comparisons yield p-value < 0.05).

Table 4 Hemoglobin A1c estimates by self-report and chronic condition warehouse diabetes status

Of the 4.4% of the sample who were undiagnosed diabetics (no self-reported diagnosis but an A1c score ≥ 6.5), only 29.9% (95% CI: 18.1-39.9) were flagged as diabetic in the claims data. Thus, the claims data do not differ from self-report primarily because of better classification of the undiagnosed cases.

The results suggest that self-reports produce a more accurate indication of a diabetes diagnosis (based on clinical data) relative to the CCW algorithm. Hence, an important question to ask is whether healthcare utilization profiles are distorted by using the CCW algorithm to identify the diabetic population.

Table 5 shows that the healthcare utilization outcomes among discordant claims-based diabetics are generally similar to those of concordant diabetics regardless of A1c levels. However, the number of A1c tests received was markedly lower for the discordant claims-based diabetics than for the concordant diabetics (p-value < 0.05). This pattern of use suggests that discordant claims-based diabetics do not receive diabetes-related care at the same rate as concordant diabetics but do receive an overall level of health services and have expenditures similar to diabetics. Regression-adjusted results (controlling for demographic characteristics) yielded the same conclusions.

Table 5 Hemoglobin A1c levels and utilization outcomes by self-report and chronic condition warehouse diabetes status

Discussion and conclusions

In this study, nationally representative survey data linked with biomarker data and Medicare claims were used to study the agreement between self-report and claims-based measures of diabetes. The claims-based CCW algorithm yielded a significantly higher rate of diabetics compared to self-reports. The biomarker data suggested that the higher rate of diabetics in the claims was due to false positives. False positives tended to be associated with high users of Medicare services with provider utilization profiles similar to those of concordant diabetics. Higher rates of diabetes prevalence in claims data may reflect intensive monitoring of pre-diabetic patients with elevated levels of other cardiovascular risk factors.

Our results raise potential concerns about attempts to use the current CCW diabetes indicator to identify diabetics in claims data for the purpose of assessing quality of care. Regions or accountable care organizations with high proportions of false positives may, appropriately, not provide ongoing diabetes maintenance care to these patients, and thus appear to provide lower quality of care to diabetic patients (e.g., lower A1c testing).

Our findings should be interpreted in the context of the following considerations. This study relies on a single measure of Hemoglobin A1c rather than repeated measures of fasting plasma glucose or glucose tolerance test results to validate the self-report and claims-based measures of diabetes. Furthermore, people whose diabetes is well controlled may have an A1c score below 6.5. We did not have access to Part D claims, though information about use of insulin or oral diabetes medications would have been another way to identify patients being treated for diabetes. However, the tradeoff would be a less representative sample, as not all Medicare beneficiaries are in a stand-alone Part D plan that would have research data. Nevertheless, efforts to refine and validate claims-based algorithms with medication information may help to improve the use of these measures for health services research.

Diabetes is an important and costly chronic condition. Researchers interested in studying treatment and healthcare utilization outcomes associated with diabetes face a variety of measures to identify this subgroup. We find strengths and weaknesses in using self-reports and claims-based measures to identify the diabetic population. For researchers fortunate enough to have both measures available, the choice of which measure to use should be driven by the goals of the analysis (e.g., estimating prevalence, assessing quality of care, profiling aggregate levels of utilization) and the sensitivity of the results to the choice of measure. However, if the CCW algorithm is the only measure available, then results should be interpreted with caution, particularly if those results pertain to diabetes prevalence and quality of care analyses.