INTRODUCTION

As evidence in healthcare has mounted in recent years, we have seen a growing interest in the development of performance measures to assess overuse. This is particularly true in preventive care, where most existing measures focus on underuse.1 Underuse measures have played an important role in increasing the uptake of preventive care.2 But as we continue to encourage prevention for patients who are likely to benefit, it is imperative that we also find ways of discouraging the use of unnecessary and potentially harmful services to improve quality of care, decrease waste, and improve access. Efforts such as the Choosing Wisely campaign have highlighted specific clinical areas where measurement of overuse may be warranted.3 Yet, we have little practical experience with measures of overuse and their broader effects on healthcare delivery.

A key challenge in implementing performance measures is avoiding unintended effects. For example, several recent studies have shown that implementation of underuse measures can unintentionally encourage overuse of care.4–7 Conversely, overuse measures have the potential to discourage utilization when care is actually indicated, promoting underuse. In light of such concerns, Mathias and Baker have recommended that the potential unintended effects of overuse measures be carefully considered during measure development, and that efforts be taken to mitigate and monitor for such effects.8 As a starting point, they suggest that overuse measures be based on strong evidence and that these measures demonstrate high specificity to avoid falsely identifying overuse, thereby minimizing the risk of promoting underuse.

The purpose of this study was to develop and test an electronic measure of screening colonoscopy overuse in the Veterans Affairs Health Care System (VA), and to estimate the extent of screening colonoscopy overuse within VA. Screening colonoscopy is an ideal candidate for an overuse measure for a number of reasons. First, appropriate use of colorectal cancer screening is supported by strong evidence.9 Second, screening colonoscopy is often overused in both VA and non-VA settings.10,11 Third, existing performance measures for colorectal cancer screening are focused exclusively on underuse of screening, and we have shown that implementation of these measures within VA is associated with overuse.12 Additionally, colonoscopy is an invasive test with a small but real risk of complications.13 Finally, because colonoscopy is a resource-limited service, overuse reduces access to necessary and appropriate colonoscopy.

It is challenging to identify inappropriate screening colonoscopies using electronic data for several reasons. First, administrative data have limited accuracy for determining whether a colonoscopy was performed for average-risk screening or for a non-screening indication (e.g., for diagnostic purposes or post-polypectomy surveillance).14,15 Second, the appropriate interval between colonoscopies is dependent on knowing the results of prior colonoscopies. For example, a patient with adenomas should have a repeat (surveillance) colonoscopy sooner than a patient with a completely normal colonoscopy16; however, detailed information about the type and number of polyps resected is not captured in administrative data.17 Third, many colonoscopies are appropriately repeated sooner than recommended by guidelines due to suboptimal bowel preparation, but this factor is also not captured in administrative data.18

For these reasons, working with operational partners in VA, we conducted a study with three main aims: (1) to develop a highly specific measure that can be used to identify overuse of screening colonoscopy from electronic data; (2) to assess how reliably this measure identifies facility-level overuse over time; and (3) to examine facility-level factors that may be associated with colonoscopy overuse.

METHODS

Overview

We performed a cross-sectional study using electronic data from the VA Health Care System for each year from 2011 to 2013. The primary analysis utilized data from the year 2013. The study proceeded in four sequential steps. First, we convened a workgroup of experts in colorectal cancer screening and performance measurement to specify an overuse measure for screening colonoscopy. Second, using the workgroup specifications, we developed an electronic measure of overuse. Third, we validated this measure in a subset of patients by comparing it to gold-standard manual record review. Finally, we used the validated electronic measure to quantify overuse of screening colonoscopy across VA. The study was approved by the Institutional Review Board of the VA Ann Arbor Healthcare System.

Measure Specification and Electronic Measure Construction

A performance measure of screening colonoscopy overuse was specified by an expert workgroup composed of VA experts in colorectal cancer screening (DF, PS, JD, KP) and performance measurement (EK, JF, LK). The workgroup process is described in detail in the Appendix. The final measure defined by the workgroup identified average-risk screening colonoscopies (comprising the denominator, with exclusions—Text Box 1A) that met one or more criteria for probable or possible overuse (comprising the numerator—Text Box 1B). The workgroup met three times over a 6-month period to define and review the measures and provide feedback on preliminary results of electronic data extraction.

Text Box 1A: Denominator and Exclusions for Screening Colonoscopy Overuse Measure

Denominator:

Patients who underwent colonoscopy over the selected time period (N = 248,284).

Exclusions:

Colonoscopy performed for a non-screening indication (N = 159,530):

(1) Colonoscopy performed for diagnostic, high-risk screening, or surveillance indication (N = 86,904).*

(2) Colonoscopy performed in a patient at increased risk for colorectal cancer (N = 72,212)**:

a. Personal history of adenomatous polyps (N = 68,039).

b. Personal history of colorectal cancer (N = 916).

c. Personal history of inflammatory bowel disease (N = 170).

d. Family history of colorectal cancer (N = 3087).

(3) Colonoscopy performed during hospitalization (N = 411).

(4) Colonoscopy performed in a patient who has undergone prior total abdominal colectomy (N = 3)**.

*Using approach previously developed and validated by Fisher and colleagues (Appendix)

**Using CPT and ICD-9 codes from FY00 to FY13 (Appendix)
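The counts in Text Box 1A can be checked with simple arithmetic. The short Python sketch below copies the counts from the text box; the variable names are illustrative only. The category counts sum exactly to the reported exclusion total, which suggests (as an inference, not something stated in the text box) that exclusions were applied so that each colonoscopy is counted in only one category.

```python
# Counts copied from Text Box 1A; variable names are illustrative.
total_colonoscopies = 248_284

exclusions = {
    "diagnostic_high_risk_or_surveillance": 86_904,
    "increased_risk_patient": 72_212,
    "during_hospitalization": 411,
    "prior_total_abdominal_colectomy": 3,
}

# Breakdown of the "increased risk" category (items a through d).
increased_risk_breakdown = {
    "adenomatous_polyps": 68_039,
    "colorectal_cancer": 916,
    "inflammatory_bowel_disease": 170,
    "family_history_colorectal_cancer": 3_087,
}

total_excluded = sum(exclusions.values())
increased_risk_total = sum(increased_risk_breakdown.values())

# Colonoscopies remaining after exclusions (average-risk screening).
screening_colonoscopies = total_colonoscopies - total_excluded

print(total_excluded, increased_risk_total, screening_colonoscopies)
```

The remainder equals the number of screening colonoscopies reported for FY13 in the Results.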

Text Box 1B: Numerator for Screening Colonoscopy Overuse Measure

Numerator:

“Probable” overuse of screening colonoscopy:

(1) Colonoscopy performed less than 9 years after complete colonoscopy.

(2) Colonoscopy performed in a patient < 40 or > 85 years of age.

(3) Colonoscopy performed in a patient with life expectancy < 6 months.

(4) Colonoscopy performed < 6 months after negative fecal occult blood test (FOBT).

“Possible” overuse of screening colonoscopy:

(1) Colonoscopy performed in a patient 40 to 49 years of age.

(2) Colonoscopy performed in a patient 76 to 85 years of age.

After the measure had been specified, we approximated measure elements from VA electronic data that were available from fiscal year (FY) 2000 to the present time in a Corporate Data Warehouse (CDW), including: (1) patient demographics; (2) diagnostic and procedure codes; (3) use and results of laboratory testing; and, (4) structured documentation of chronic and serious health conditions, including hospice enrollment. A detailed description of the methodology used to electronically approximate the measure denominator and numerator can be found in the Appendix.
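To make the numerator logic in Text Box 1B concrete, the following is a minimal Python sketch of how a single screening colonoscopy might be classified. It is not the study's implementation (which is described in the Appendix). The record fields, the function name, and the rule that "probable" takes precedence over "possible" when both apply are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreeningColonoscopy:
    """Hypothetical record; field names are illustrative, not CDW fields."""
    age: int
    years_since_complete_colonoscopy: Optional[float] = None
    life_expectancy_months: Optional[float] = None
    months_since_negative_fobt: Optional[float] = None


def classify_overuse(c: ScreeningColonoscopy) -> str:
    """Return 'probable', 'possible', or 'none' per Text Box 1B."""
    probable = (
        # (1) Repeat less than 9 years after a complete colonoscopy.
        (c.years_since_complete_colonoscopy is not None
         and c.years_since_complete_colonoscopy < 9)
        # (2) Patient younger than 40 or older than 85.
        or c.age < 40
        or c.age > 85
        # (3) Life expectancy under 6 months.
        or (c.life_expectancy_months is not None
            and c.life_expectancy_months < 6)
        # (4) Less than 6 months after a negative FOBT.
        or (c.months_since_negative_fobt is not None
            and c.months_since_negative_fobt < 6)
    )
    if probable:
        # Assumption: probable overuse takes precedence over possible.
        return "probable"
    if 40 <= c.age <= 49 or 76 <= c.age <= 85:
        return "possible"
    return "none"


print(classify_overuse(ScreeningColonoscopy(age=45)))
print(classify_overuse(ScreeningColonoscopy(age=60, months_since_negative_fobt=3)))
```

Missing values are treated here as "criterion not met", which favors specificity over sensitivity, in keeping with the stated design goal of the measure.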

Validation of the Electronic Measure Using Manual Record Review

We developed a standardized electronic health record (EHR) abstraction algorithm to identify measure elements in manual record review. This abstraction algorithm was piloted prior to use, and pilot data were used to modify the algorithm to enhance its ease of use and reliability. Once the abstraction algorithm was finalized, we selected a sample of 3000 Veterans who had a procedure code for colonoscopy (Appendix Table 4) between April 2011 and March 2012, using a stratified random sampling strategy based on predicted overuse at each facility. First, we used our candidate electronic measure to estimate the proportion of screening colonoscopies at each VA facility that met a definition for overuse in FY11. Next, we used these data to stratify facilities into three groups: bottom quartile (low expected overuse); middle two quartiles; and top quartile (high expected overuse). Finally, we randomly sampled a total of 3000 Veterans with electronic documentation of colonoscopy across these three groups (1000 from low overuse facilities, 1000 from average overuse facilities, and 1000 from high overuse facilities). Manual record review was performed by West Virginia Medical Institute, a professional chart abstraction group that performs large-scale, national chart reviews for VA performance measurement programs on an ongoing basis. Further detail about the manual record review process is provided in the Appendix.
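The stratified sampling strategy can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the facility rates and patient identifiers are invented, the function names are hypothetical, facilities exactly at a quartile cutpoint are assigned to the outer stratum, and quartile cutpoints are computed with the default method of Python's statistics.quantiles.

```python
import random
import statistics


def assign_strata(facility_rates):
    """Map each facility to 'low', 'middle', or 'high' predicted overuse."""
    q1, _median, q3 = statistics.quantiles(list(facility_rates.values()), n=4)
    strata = {}
    for facility, rate in facility_rates.items():
        if rate <= q1:
            strata[facility] = "low"      # bottom quartile
        elif rate >= q3:
            strata[facility] = "high"     # top quartile
        else:
            strata[facility] = "middle"   # middle two quartiles
    return strata


def sample_by_stratum(patients, strata, per_stratum, seed=0):
    """Draw a simple random sample of patients within each stratum."""
    rng = random.Random(seed)
    pools = {"low": [], "middle": [], "high": []}
    for patient_id, facility in patients:
        pools[strata[facility]].append(patient_id)
    return {
        name: rng.sample(pool, min(per_stratum, len(pool)))
        for name, pool in pools.items()
    }


# Hypothetical data: 8 facilities with 10 patients each.
rates = {"A": 0.10, "B": 0.15, "C": 0.20, "D": 0.25,
         "E": 0.30, "F": 0.35, "G": 0.40, "H": 0.45}
strata = assign_strata(rates)
patients = [(f"{fac}-{i}", fac) for fac in rates for i in range(10)]
sample = sample_by_stratum(patients, strata, per_stratum=5)
print(strata)
```

In the study, the equivalent of per_stratum was 1000, giving 3000 Veterans in total.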

Independent Variables

We obtained or constructed several potential facility-level predictors of screening overuse: (1) proportion of screen-eligible patients in FY13 who were “up to date” for screening according to current guidelines (assessed by chart review conducted as part of VA’s ongoing External Peer Review Program); (2) median number of days between positive fecal occult blood test (FOBT) and colonoscopy (“FOBT wait time”) (see below); (3) complexity score (a measure of facility complexity based on factors such as patient risk, academic affiliation, and patient volume);19 (4) academic affiliation (obtained from the VA Office of Academic Affiliations); (5) number of colonoscopies performed in FY13 (obtained from CDW); and, (6) proportion of colonoscopies “outsourced” to non-VA facilities in FY10 (also known as “fee basis” colonoscopies) (obtained from CDW). To quantify the FOBT wait time, we identified all positive FOBTs in FY12 (N = 55,494) and all colonoscopies performed within 12 months after positive FOBT (N = 22,728). We then calculated the median time in days from positive FOBT to colonoscopy for each facility. Data were also extracted from CDW on several patient-level variables, including age, gender, and Charlson comorbidity index.20
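As an illustration of the FOBT wait time calculation, the Python sketch below computes the median number of days from positive FOBT to colonoscopy for each facility. The record layout and function name are hypothetical, and "within 12 months" is approximated as 365 days, which is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_fobt_wait_days(records, max_days=365):
    """Median days from positive FOBT to colonoscopy, by facility.

    records: iterable of (facility, fobt_date, colonoscopy_date).
    Colonoscopies before the FOBT or more than max_days after it are ignored.
    """
    waits = defaultdict(list)
    for facility, fobt_date, colonoscopy_date in records:
        wait = (colonoscopy_date - fobt_date).days
        if 0 <= wait <= max_days:
            waits[facility].append(wait)
    return {facility: median(days) for facility, days in waits.items()}


# Hypothetical records for two facilities.
records = [
    ("X", date(2012, 1, 1), date(2012, 1, 31)),
    ("X", date(2012, 1, 1), date(2012, 3, 1)),
    ("X", date(2012, 1, 1), date(2012, 3, 31)),
    ("Y", date(2012, 5, 1), date(2012, 5, 11)),
    ("Y", date(2012, 5, 1), date(2012, 5, 21)),
    ("Y", date(2012, 5, 1), date(2013, 6, 5)),  # outside the 12-month window
]
wait_times = median_fobt_wait_days(records)
print(wait_times)
```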

Statistical Analysis

We compared assessments of overuse constructed from manual record review to those constructed from electronic data. Specifically, we examined diagnostic test characteristics (sensitivity and specificity) and simple agreement for probable overuse, possible overuse, and overall overuse. These test characteristics were examined for both national VA data and for each stratum of predicted overuse, as specified in our stratified sampling strategy. Additionally, underlying reasons for overuse were tabulated (corresponding to the individual elements comprising the numerator of our overuse measure). VA facilities are organized into 21 regional Veterans Integrated Service Networks (VISNs), each of which has substantial autonomy over budgetary decisions and approaches to delivery of care. We therefore examined variation at the facility level and also the VISN level.

Multilevel, multivariable logistic regression was performed to identify independent facility-level predictors of screening overuse. We also examined stability in overuse over time at the national, VISN, and facility level (from FY11 to FY13). Data were missing for less than 1 % of the sample for descriptive analysis and for approximately 15 % of predictor variables for multivariable analysis. These records were excluded from multivariable analysis. Analyses were performed using the Stata 12 statistical package (StataCorp, College Station, Texas).

RESULTS

Validation of Electronic Measure

We identified 3000 Veterans who had electronic documentation of a colonoscopy in FY11. Of these, 2976 (99 %) had documentation of a colonoscopy in manual record review; 2915 (97 %) had documentation of colonoscopy indication and were used for validation of the electronic measure. Of these 2915 colonoscopies, 2675 were identified as either non-screening (2171) or appropriate screening (504) in manual record review (i.e., did not meet a measure definition of overuse), and 240 met a measure definition of probable or possible overuse. As per our a priori study design, the electronic measure was highly specific, but not sensitive, for overuse compared to manual record review. Out of the 2675 colonoscopies that were non-screening or appropriate screening, 2585 (specificity = 97 %) were correctly identified as appropriate by the electronic measure. Out of the 240 colonoscopies that met a definition of overuse by manual record review, 47 (sensitivity = 20 %) were correctly identified as overuse by the measure. Low sensitivity of the electronic measure for overuse was largely due to low sensitivity of the electronic measure for screening indication (sensitivity = 36 %). Specificity and sensitivity for overuse were similar in high-performing facilities (i.e., those with low rates of overuse) (specificity = 97 %, sensitivity = 19 %) and in low-performing facilities (specificity = 95 %, sensitivity = 27 %).
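The test characteristics reported above follow directly from the validation counts, treating manual record review as the reference standard. A minimal Python sketch is shown below; the function and variable names are illustrative, and the study's own analyses were performed in Stata. Simple agreement is computed from the same counts for completeness and is a derived quantity, not a figure quoted in the text.

```python
def diagnostic_characteristics(tp, fn, tn, fp):
    """Sensitivity, specificity, and simple agreement from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    agreement = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, agreement


# Counts from the validation sample (reference standard: manual review).
overuse_by_review = 240          # met a definition of overuse
not_overuse_by_review = 2675     # non-screening or appropriate screening

true_positives = 47              # overuse correctly flagged by the measure
true_negatives = 2585            # appropriate correctly identified
false_negatives = overuse_by_review - true_positives
false_positives = not_overuse_by_review - true_negatives

sensitivity, specificity, agreement = diagnostic_characteristics(
    true_positives, false_negatives, true_negatives, false_positives
)
print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```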

Electronic Measurement of Overuse

Baseline characteristics of facilities and patients are shown in Tables 1 and 2, respectively. In FY13, 88,754 screening colonoscopies were identified across 122 VA facilities (Tables 1 and 2). Of these, 20,530 (23 %) met the definition for probable (17 %) or for possible (6 %) overuse (Table 3). The most common reasons for overuse were performance of colonoscopy less than 6 months after negative FOBT (35 %) and performance of colonoscopy less than 9 years after prior colonoscopy (31 %) (Table 3). Substantial and significant variation in colonoscopy overuse was noted between facilities and between VISNs, with a nearly eightfold difference between the maximum and minimum rates of overuse at the facility level and a nearly twofold difference at the VISN level (Figs. 1 and 2). Furthermore, overuse at the VISN and facility level was relatively stable over time. Between FY11 and FY13, all VISNs in the bottom quartile of performance (high overuse) remained in the same quartile. Similarly, all VISNs in the top quartile of performance (low overuse) remained there. Likewise, during this time period, 70 of 122 facilities remained in the same quartile of performance for all 3 years, and 103 of 122 facilities improved or worsened by no more than one quartile. Of the 27 facilities with high overuse, all remained in the bottom quartile for the entire 3-year period.
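The stability summary can be expressed as a small calculation over each facility's performance quartile in each year. The Python sketch below uses invented data and hypothetical names. It assumes that "improved or worsened by no more than one quartile" means that the range of a facility's quartiles across the 3 years is at most one, a definition that also counts facilities that did not move.

```python
def quartile_stability(quartiles_by_facility):
    """Count facilities that stayed put or moved at most one quartile.

    quartiles_by_facility maps facility -> tuple of quartiles (1-4),
    one entry per fiscal year.
    """
    same_quartile = sum(
        1 for q in quartiles_by_facility.values() if max(q) == min(q)
    )
    within_one = sum(
        1 for q in quartiles_by_facility.values() if max(q) - min(q) <= 1
    )
    return same_quartile, within_one


# Hypothetical quartiles for FY11, FY12, FY13.
example = {
    "F1": (1, 1, 1),
    "F2": (1, 2, 1),
    "F3": (4, 2, 3),
    "F4": (4, 4, 4),
}
same_quartile, within_one = quartile_stability(example)
print(same_quartile, within_one)
```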

Table 1. Baseline Characteristics of VA Facilities (N = 122)
Table 2. Baseline Characteristics of Patients (N = 88,754)
Table 3. Reasons for Overuse of Screening Colonoscopy Identified by Electronic Measure (N = 20,530)
Figure 1. Overuse of screening colonoscopy across 122 VA facilities (N = 88,754). Each marker represents a single VA facility, with error bars indicating 95 % confidence intervals (median overuse = 23 %, interquartile range = 18 % to 29 %).

Figure 2. Overuse of screening colonoscopy across 21 Veterans Integrated Service Networks (VISNs) (N = 88,754). Each marker represents a single VISN, with error bars indicating 95 % confidence intervals (median overuse = 21 %, interquartile range = 20 % to 28 %).

In multivariable analysis adjusting for patient characteristics (age, gender, and Charlson comorbidity index), we examined several facility-level variables as potential predictors of overuse, including: (1) proportion of screen-eligible patients who were “up to date” for screening according to current guidelines; (2) median number of days between positive fecal occult blood test (FOBT) and colonoscopy (“FOBT wait time”); (3) complexity score; (4) academic affiliation; (5) number of colonoscopies performed in FY13; and, (6) proportion of colonoscopies “outsourced” to non-VA facilities. None of these factors was independently associated with overuse of screening colonoscopy.

DISCUSSION

Colorectal cancer screening is a widely recommended preventive service that has traditionally been underused. But in large integrated healthcare systems like VA, systematic efforts to increase screening rates have been successful.21 In this context, we used an expert workgroup process to develop and test an electronic measure of screening colonoscopy overuse. Of nearly 90,000 screening colonoscopies performed in VA in FY13, 23 % met a consensus definition for overuse. Common reasons for overuse were colonoscopy being performed soon after negative FOBT (35 %), colonoscopy being repeated less than 9 years after prior negative screening colonoscopy (31 %), and colonoscopy being performed in a patient under 50 years of age (17 %).

While our electronic measure had low sensitivity (reflecting limitations in our ability to electronically ascertain screening indication), the measure had very high specificity. Thus, while the measure failed to identify all possible overuse, the likelihood of “false positive” categorization of overuse was low. Furthermore, rates of overuse were stable over time at the facility and VISN levels, suggesting that our electronic measure reliably reflects underlying organizational characteristics. Nonetheless, it is possible that some of the colonoscopies that the measure categorized as “overuse” were, in fact, appropriate. For example, some procedures could have been indicated based on signs or symptoms that were not adequately documented in the CDW, or based upon prior incomplete colonoscopy (i.e., failure to reach the cecum or achieve adequate bowel preparation). Additionally, colonoscopies performed in individuals under 50 years of age could be considered appropriate in African Americans, who may be at increased risk for colorectal cancer, though this recommendation is not included in USPSTF or VA screening guidelines on this topic. Despite these potential limitations, our electronic measure provides a tool that, coupled with existing underuse measures, could be used to assess and improve the appropriateness of colonoscopy utilization. In current practice, measures of underuse are frequently “unopposed” by corresponding measures of overuse. Such lack of balance can inadvertently promote overutilization. As meaningful use of electronic health records becomes more widespread, the ability to ascertain colonoscopy indication should improve, thereby improving the sensitivity of electronic measures of overuse. Along these lines, VA is working to develop a standardized, structured system for colonoscopy documentation, including procedure indication and findings.

Existing and proposed measures of colorectal cancer screening overuse have focused on two key areas: (1) screening colonoscopy performed less than 10 years after a negative colonoscopy (a CMS Physician Quality Reporting System [PQRS] measure adopted in 2013);22 and, (2) screening colonoscopy performed in an individual over 85 years of age (proposed by both gastroenterology professional societies and the National Committee for Quality Assurance).23 As our work demonstrates, these individual measures (which are components of our composite measure) have the potential to be operationalized electronically using administrative data with acceptable reliability. Our work also demonstrates that within VA, screening colonoscopy is rarely overused in the very elderly, with only 1 % of overuse being due to screening in those over age 85. Thus, focusing on this population is likely to have limited value in improving quality of care.

Our study has several strengths that should be highlighted. First, we successfully used an expert workgroup process to develop a high-specificity electronic measure of overuse, one that demonstrated stability over time across facilities and VISNs. Because this measure is electronically implementable and stable over time, it could be used to identify facilities with high rates of overuse and subsequently be used to monitor these facilities for performance improvement. At the same time, existing measures of colorectal cancer screening underuse could be monitored to ensure that efforts to reduce overuse are not being accompanied by underuse of screening. Second, we used data from the largest integrated healthcare system in the United States, analyzing data on nearly 90,000 patients per year across 122 facilities. The use of VA data allowed us to examine overuse in a healthcare system with a long-standing, comprehensive, and highly effective implementation of underuse measures of colorectal cancer screening. Because performance of colorectal cancer screening is now a standard measure in many healthcare systems, it is likely that similar patterns of overuse can be found in other high-performing healthcare systems. Moreover, our measure could be used to identify and monitor overuse in other healthcare systems. For example, in healthcare systems with greater standardization of electronic data (e.g., Kaiser Permanente), our measure is likely to be more accurate than in VA.

Our study also has limitations. Accurate coding of colonoscopy indication is critical to any effort to assess colonoscopy quality. Because VA electronic data have limited accuracy for colonoscopy screening indication, we were unable to develop an electronic measure that was highly sensitive and specific for overuse. A less specific but more sensitive measure could have been developed with existing data, potentially providing more accurate estimates of the absolute volume of screening colonoscopy overuse across the healthcare system, but such a measure could promote screening underuse in the context of performance measurement. It is also possible that coding practices differ systematically between VA sites, a phenomenon that could at least partly explain the variation noted across facilities. Prior work suggests that colonoscopy indication may be more accurately coded in fee-for-service Medicare.15 Some sites may also already have systems in place to minimize overuse (e.g., electronic assessment of prior colonoscopy or FOBT as part of the colonoscopy ordering template), another potential cause of variation. Future work will be needed to better understand the underlying causes of overuse. It is also important to note that Veterans may undergo procedures outside VA, meaning that our data (which did not capture use of non-VA care) may underestimate true rates of overuse among Veterans. In addition, some of our findings may be specific to the VA healthcare context. In non-VA settings, for instance, overuse of colonoscopy within 6 months of a negative FOBT (the most common reason for overuse in our study) is less likely, since FOBT is used less frequently for screening. Finally, our measure was relatively conservative in its use of health status and life expectancy, requiring a life expectancy of less than 6 months to meet a definition of overuse. While this strict definition maximizes specificity, it does so at the expense of sensitivity. For example, nearly 8 % of patients in our cohort had a Charlson comorbidity index ≥ 4, indicating poor health and limited life expectancy. In such patients, screening is unlikely to be of benefit and may even be harmful, indicating overuse.24 Yet, the vast majority of these patients have a life expectancy of more than 6 months, meaning that they would not have met our measure definition of overuse. The development of more clinically meaningful and patient-centered measures that take into account health status and patient preference is a topic of ongoing work.

In summary, our results suggest that screening colonoscopy is overused in the VA Health Care System, and that rates of overuse vary widely across VA facilities and VISNs. The electronic measure we developed and tested demonstrated high specificity and reliability for facility-level measurement, successfully stratifying facilities according to performance. The stability of facility performance over time supports the hypothesis that the measure is capturing true organizational characteristics and/or provider practices. In the future, this measure, coupled with the existing measure of colorectal cancer screening underuse, could be used to guide quality improvement in large health systems.