Choosing Wisely: Prevalence and Correlates of Low-Value Health Care Services in the United States
- Cite this article as:
- Colla, C.H., Morden, N.E., Sequist, T.D. et al. J GEN INTERN MED (2015) 30: 221. doi:10.1007/s11606-014-3070-z
Background
Specialty societies in the United States identified low-value tests and procedures that contribute to waste and poor health care quality via implementation of the American Board of Internal Medicine Foundation’s Choosing Wisely initiative.
Objective
To develop claims-based algorithms, to use them to estimate the prevalence of select Choosing Wisely services and to examine the demographic, health and health care system correlates of low-value care at a regional level.
Design
Using Medicare data from 2006 to 2011, we created claims-based algorithms to measure the prevalence of 11 Choosing Wisely-identified low-value services and examined geographic variation across hospital referral regions (HRRs). We created a composite low-value care score for each HRR and used linear regression to identify regional characteristics associated with more intense use of low-value services.
Participants
Fee-for-service Medicare beneficiaries over age 65.
Main Measures
Prevalence of selected Choosing Wisely low-value services.
Key Results
The national average annual prevalence of the selected Choosing Wisely low-value services ranged from 1.2% (upper urinary tract imaging in men with benign prostatic hyperplasia) to 46.5% (preoperative cardiac testing for low-risk, non-cardiac procedures). Prevalence across HRRs varied significantly. Regional characteristics associated with higher use of low-value services included greater overall per capita spending, a higher specialist to primary care ratio and higher proportion of minority beneficiaries.
Conclusions
Identifying and measuring low-value health services is a prerequisite for improving quality and eliminating waste. Our findings suggest that the delivery of wasteful and potentially harmful services may be a fruitful area for further research and policy intervention for HRRs with higher per-capita spending. These findings should inform action by physicians, health systems, policymakers, payers and consumer educators to improve the value of health care by targeting services and areas with greater use of potentially inappropriate care.
Keywords: Low-value care; Medicare; Geographic variation
Recent health policy initiatives prioritize improved organization and delivery of care to reduce fragmentation and prevent expensive complications of chronic illness or iatrogenic disease. This approach may miss an important opportunity to address quality concerns and rising health care spending: the overuse of low-value services. In 2012, the Institute of Medicine estimated that 30% ($750 billion) of annual health care spending is wasteful and that over half of this spending is on unnecessary services and inefficient care.1 Elimination of low-value services as a cost control strategy has much economic appeal because it would improve quality while reducing costs.
Society increasingly recognizes the importance of excess medical care, but it is difficult to pinpoint the services and populations that represent health care overuse. There is general agreement on the definition of overtreatment (treatment of indolent disease, aggressive treatment at the end-of-life) and overdiagnosis (diagnosis and treatment of disease that would not have affected the lives of patients), but consensus has not been sufficient to facilitate their identification in clinical practice.2 Identification of low-value and potentially harmful services is an essential first step in improving quality and reducing overuse. The second critical step is engaging physicians and patients in efforts to reduce use of these services. Together with physician specialty societies, the American Board of Internal Medicine (ABIM) Foundation launched the Choosing Wisely initiative in 2011 to advance both of these aims. Over 60 participating physician societies have now each identified five specialty-specific, low-value services whose avoidance would improve the efficiency of care through higher quality, reduced risks and lower costs.3
In this study, we developed claims-based algorithms to examine 11 services identified in one or more Choosing Wisely lists and estimated the prevalence of these services at the regional and national levels. We created a regional composite measure of overuse based on the prevalence of these 11 services and explored the demographic, health and health care system correlates of overuse at a regional level. Based on this information, we estimate the magnitude of the harm and wasteful spending attributable to each service. This information may aid decision makers in prioritizing areas for intervention and provide a baseline against which to test the impact of policies aimed at reducing use of low-value services.
We used 100% Medicare administrative claims data (2006–2011) to determine the prevalence of low-value services. We limited our analysis to fee-for-service beneficiaries enrolled in Medicare Parts A and B (inpatient and outpatient insurance); we also required enrollment in Part D (prescription insurance) for three measures of Choosing Wisely services related to prescription drugs (analyses employing Part D data were limited to a 40% sample). We used residential ZIP codes to assign each beneficiary to a Dartmouth Atlas of Health Care hospital referral region (HRR).
Choosing Wisely Measurement
Table 1. Measures developed to assess prevalence of services identified as low-value through Choosing Wisely

Low-value diagnostic services

Choosing Wisely recommendation: Don’t do imaging for low back pain when no red flags are present
Specialty societies: American Academy of Family Physicians, American College of Physicians, North American Spine Society
Event definition: Beneficiaries who received a low back x-ray, CT, or MRI within six weeks of incident low back pain diagnosis
Cohort: Beneficiaries with low back pain over age 65 without other imaging indication
Exclusions: Prior diagnosis of low back pain, trauma or neurological impairment within the previous 12 months; cancer at any point during the study period; “E” code (external causes of injury) or trauma diagnosis on the imaging event claim

Choosing Wisely recommendation: Don’t order upper-tract imaging for patients with benign prostatic hyperplasia (BPH)
Specialty societies: American Urological Association
Event definition: Beneficiaries who received an intravenous pyelogram or an abdominal CT, MRI, or ultrasound within 60 days of the index diagnosis
Cohort: Male beneficiaries diagnosed with BPH over age 65 without other indications for imaging
Exclusions: Cancer diagnosis at any point during the study period; other indications for imaging (e.g., chronic renal failure, nephritis, calculus of kidney and ureter, kidney stones, abdominal pain) within 60 days of index diagnosis

Choosing Wisely recommendation: Don’t order cardiac tests on low-risk, asymptomatic patients
Specialty societies: American Academy of Family Physicians, American College of Cardiology, American College of Physicians, American Society of Echocardiography, American Society of Nuclear Cardiology, Society of Cardiovascular Computed Tomography
Event definition: Beneficiaries who received a non-indicated cardiac test, including stress tests, echocardiograms, electrocardiograms and advanced cardiac imaging
Cohort: Low-risk beneficiaries ages 66–80
Exclusions: Indications of cardiac disease or other conditions that could indicate cardiac testing (e.g., HIV/AIDS, diabetes, peripheral vascular disease, pulmonary disease, cancer) or use of a prescription drug associated with the above conditions in a calendar year; enrollment in hospice; appropriate clinical indication on testing event claim

Choosing Wisely recommendation: Don’t screen women older than 65 years of age for cervical cancer who have had adequate prior screening and are not otherwise at high risk for cervical cancer
Specialty societies: American Academy of Family Physicians
Event definition: Beneficiaries who received a Pap test
Cohort: Female beneficiaries at low risk for cervical cancer over age 65
Exclusions: Gynecological cancers, HIV/AIDS, diethylstilbestrol use, HPV infection, or a previous abnormal Pap test during the study period

Choosing Wisely recommendation: Don’t routinely repeat dual-energy x-ray absorptiometry (DXA) scans more often than once every two years
Specialty societies: American College of Rheumatology
Event definition: DXA scans performed on female beneficiaries at low risk for fracture within 23 months of a previous scan
Cohort: DXA scans performed on female beneficiaries over age 66 at low risk for fracture
Exclusions: Fragility fracture or cancer diagnosis within 23 months of the index DXA scan

Choosing Wisely recommendation: Don’t perform preoperative cardiac tests for cataract surgeries
Specialty societies: American Academy of Ophthalmology, American College of Cardiology
Event definition: Beneficiaries who received a non-indicated cardiac test, including stress tests, echocardiograms, electrocardiograms and advanced cardiac imaging in the 30 days before cataract surgery
Cohort: Beneficiaries over age 65 undergoing cataract surgery
Exclusions: Appropriate clinical indication on testing event claim (e.g., palpitations) or admission in the 30 days before surgery

Choosing Wisely recommendation: Don’t perform preoperative cardiac tests for low-risk, non-cardiac surgeries
Specialty societies: American College of Cardiology, American College of Physicians, American College of Radiology, American College of Surgeons, American Society of Anesthesiologists, American Society of Echocardiography, American Society of Nuclear Cardiology, Society of Cardiovascular Computed Tomography, Society of General Internal Medicine, Society of Thoracic Surgeons, Society for Vascular Medicine
Event definition: Beneficiaries who received a non-indicated cardiac test, including stress tests, echocardiograms, electrocardiograms, CTs, MRIs or PETs within 30 days before low-risk surgery
Cohort: Beneficiaries over age 65 undergoing low-risk, non-cardiac surgery (e.g., breast surgery, transurethral resection of the prostate, corneal transplant, inguinal hernia repair, lithotripsy, arthroscopy, laparoscopic cholecystectomy)
Exclusions: Appropriate clinical indication on testing event claim (e.g., palpitations) or admission in the 30 days before surgery

Choosing Wisely recommendation: Don’t perform population based screening for 25-OH-Vitamin D deficiency
Specialty societies: American Society for Clinical Pathology, Endocrine Society, American Association of Clinical Endocrinologists
Event definition: Beneficiaries who received a test for vitamin D deficiency
Cohort: Low-risk beneficiaries over age 65
Exclusions: Beneficiaries with osteoporosis, fragility fracture, kidney disease or renal dialysis during the same calendar year

Choosing Wisely recommendation: Don’t use antipsychotics as first choice to treat behavioral and psychological symptoms of dementia
Specialty societies: American Medical Directors Association, American Geriatrics Society, American Psychiatric Association
Event definition: Beneficiaries who received one or more prescriptions for an antipsychotic following two observed dementia diagnoses
Cohort: Beneficiaries over age 65 with diagnosed dementia
Exclusions: Severe mental illness during the study period

Choosing Wisely recommendation: Don’t recommend percutaneous feeding tubes in patients with advanced dementia
Specialty societies: American Academy of Hospice and Palliative Medicine, American Geriatrics Society, American Medical Directors Association
Event definition: Beneficiaries with two observed dementia diagnoses residing in an institution who received a feeding tube
Cohort: Institutionalized beneficiaries over age 65 with diagnosed dementia

Choosing Wisely recommendation: Don’t use opioid or butalbital treatment for migraine, except as a last resort
Specialty societies: American Academy of Neurology
Event definition: Beneficiaries who filled an opioid or butalbital prescription within 21 days of the office visit with migraine diagnosis
Cohort: Beneficiaries over age 65 with a diagnosed migraine and no other indication for opioids
Exclusions: An “E” code, inpatient admission, back pain, abdominal pain, surgery, fracture, cancer or hospice enrollment within 60 days of index visit
We used a combination of International Classification of Diseases, Ninth Revision (ICD-9), and Current Procedural Terminology (CPT) codes to construct cohorts at risk for 11 Choosing Wisely services and to identify health service events highlighted by the Choosing Wisely recommendations (Online Appendix 2). We also used Medicare Part D prescription records, where applicable, for cohort inclusion/exclusion or to identify Choosing Wisely prescription service events. In all cases, we conservatively excluded beneficiaries not targeted by the Choosing Wisely recommendation. We limited our analysis to non-indicated tests and procedures, excluding services with claims diagnoses that suggest appropriate medical indication. We drew from measure definitions in the literature and conducted claims-based sensitivity analyses to optimize the measure construct when possible.4, 5, 6, 7, 8, 9, 10 For example, we studied the characteristics and follow-up events for those we deemed “low risk” for the cardiac screening measure. All measures not drawn from the literature were developed by a clinician; each was then reviewed by a second clinician. Disagreements were resolved via discussion. Although we used 2006–2011 data, some measures were limited to smaller windows to permit sufficient look-back periods within the data to identify, for example, prevalent disease states (e.g., long-standing back pain that would result in denominator exclusion for the back pain imaging measure). In Table 1, we describe the data, the time window for cohort qualification, event definitions, measure-specific cohorts, and cohort and event exclusions for each measure.
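To make the cohort, exclusion and event logic concrete, the following is a minimal sketch of how one such measure (imaging for low back pain) could be computed. It is not the study's algorithm: the `Beneficiary` record, its field names and the sample data are hypothetical, the 42-day and 365-day windows are taken from the Table 1 description, and exclusions recorded on the imaging claim itself (e.g., "E" codes) are not modeled.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class Beneficiary:
    """Hypothetical per-beneficiary summary derived from claims."""
    bene_id: str
    index_dx: date  # incident low back pain diagnosis
    imaging_dates: List[date] = field(default_factory=list)  # back x-ray, CT or MRI
    prior_exclusion_dx: List[date] = field(default_factory=list)  # back pain, trauma, neuro impairment
    has_cancer_dx: bool = False  # cancer at any point in the study period


def in_cohort(b: Beneficiary, lookback_days: int = 365) -> bool:
    """Denominator: keep beneficiaries with no exclusion diagnosis in the look-back window."""
    if b.has_cancer_dx:
        return False
    window_start = b.index_dx - timedelta(days=lookback_days)
    return not any(window_start <= d < b.index_dx for d in b.prior_exclusion_dx)


def has_low_value_event(b: Beneficiary, window_days: int = 42) -> bool:
    """Numerator: imaging within six weeks of the index diagnosis."""
    window_end = b.index_dx + timedelta(days=window_days)
    return any(b.index_dx <= d <= window_end for d in b.imaging_dates)


def prevalence(benes: List[Beneficiary]) -> float:
    """Share of the at-risk cohort with a low-value event."""
    cohort = [b for b in benes if in_cohort(b)]
    if not cohort:
        return 0.0
    return sum(has_low_value_event(b) for b in cohort) / len(cohort)


# Illustrative data only.
sample = [
    Beneficiary("A", date(2009, 3, 1), imaging_dates=[date(2009, 3, 10)]),
    Beneficiary("B", date(2009, 5, 1)),
    Beneficiary("C", date(2009, 6, 1), imaging_dates=[date(2009, 6, 5)],
                prior_exclusion_dx=[date(2009, 1, 15)]),  # excluded: prior diagnosis
    Beneficiary("D", date(2009, 7, 1), imaging_dates=[date(2009, 9, 1)]),  # imaging too late
]
rate = prevalence(sample)
```

In this toy cohort, beneficiary C is dropped from the denominator and only A counts as an event, so one of three at-risk beneficiaries received low-value imaging.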
Based on a conceptual framework for decisions regarding health care services, we created HRR-level covariates to include in an exploratory regression analysis.11 These HRR-level measures characterized population demographics, health and health care systems for each area based on Medicare, Behavioral Risk Factor Surveillance System (BRFSS), U.S. Census and American Community Survey data. Explanatory variables included the following: per-beneficiary Medicare spending (a measure of health care use intensity); physician group concentration (a measure of market competition); the ratio of specialists to primary care physicians; the age-, sex- and race-adjusted mortality rate and the percent of adults reporting fair or poor health (measures of health state); the percent of Medicare beneficiaries of black race; the percent of Medicare beneficiaries of Hispanic ethnicity; a Medicare effective care use score; the percent of HRR residents living in a rural area; and the percent of residents below 150% of the federal poverty limit.
We calculated an average annual prevalence in the at-risk population for each Choosing Wisely service, both nationally and at the HRR level, along with the coefficient of variation across HRRs. We estimated national spending associated with each service by multiplying observed average spending per low-value care event by the national number of low-value care events among fee-for-service Medicare beneficiaries. We constructed an overall composite measure of low-value care for each HRR, equal to the average of the 11 standardized rates (z scores or standard deviations from the mean, Cronbach’s alpha = 0.66). We examined geographic variation in the overall composite measure by dividing the HRRs into quintiles and mapping the results. We used ordinary least squares regression to determine the association of HRR-level characteristics with the composite low-value care scores (N = 306 HRRs).
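The composite score and the variation statistics described above can be sketched as follows. This is an illustration under stated assumptions rather than the study code: the HRR names and rates are invented, only three measures are shown instead of 11, and the population standard deviation is used for standardization because the text does not say which variant was applied.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical prevalence rates: one list per HRR, one entry per measure.
rates = {
    "HRR A": [0.20, 0.10, 0.40],
    "HRR B": [0.30, 0.20, 0.50],
    "HRR C": [0.40, 0.30, 0.60],
}
n_measures = 3

# Per-measure mean, standard deviation and coefficient of variation across HRRs.
columns = [[r[j] for r in rates.values()] for j in range(n_measures)]
col_mean = [mean(c) for c in columns]
col_sd = [pstdev(c) for c in columns]
coef_var = [sd / m for sd, m in zip(col_sd, col_mean)]

# Standardize each rate (z score), then average across measures for each HRR.
z = {
    hrr: [(x - m) / s for x, m, s in zip(r, col_mean, col_sd)]
    for hrr, r in rates.items()
}
composite = {hrr: mean(scores) for hrr, scores in z.items()}


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-measure lists (one value per HRR)."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    item_var = sum(pvariance(v) for v in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))


z_columns = [[z[hrr][j] for hrr in rates] for j in range(n_measures)]
alpha = cronbach_alpha(z_columns)
```

Because the three toy measures move together perfectly, alpha here is 1; the reported value of 0.66 reflects measures that are only moderately correlated across regions.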
Statistical analyses were performed using SAS and Stata software. The study was approved by the institutional review board at Dartmouth College. See Online Appendix 1 for further methodology details.
Table 2. Average annual prevalence of, variation in and spending associated with Choosing Wisely procedures and tests (N = 306 hospital referral regions)
Columns: Choosing Wisely services*; Affected population (millions); Minimum and maximum; Coefficient of variation; Estimate of waste ($millions)†
Section heading: Low-value diagnostic services
Rows: Back pain imaging; Cervical cancer screening; Preoperative cardiac testing (cataract surgery); Preoperative cardiac testing (non-cardiac surgery); Vitamin D screening; Antipsychotics in dementia patients; Feeding tubes in dementia patients; Opioids in migraine patients
HRRs named in the minimum and maximum column: Sioux Falls, SD; New Brunswick, NJ; White Plains, NY; Fort Lauderdale, FL; East Long Island, NY; Royal Oak, MI; Takoma Park, MD; Los Angeles, CA
The spending amount associated with each of the low-value services was a function of the prevalence of the service, the size of the affected population and the cost of test or treatment. Non-indicated use of antipsychotics in dementia patients had the highest amount of associated spending ($765.1 million), followed by non-indicated vitamin D screening ($198.6 million). Non-indicated imaging for benign prostatic hypertrophy and non-indicated preoperative cardiac testing for cataract surgery had the lowest levels of associated spending ($0.3 and $0.6 million, respectively).
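The arithmetic behind such an estimate can be written out in a few lines. The sketch below is illustrative only: the function name and all input values are hypothetical and are not figures from this study.

```python
def estimated_spending(affected_population: float,
                       prevalence: float,
                       mean_cost_per_event: float) -> float:
    """Spending = (population at risk x prevalence) x average cost per event."""
    events = affected_population * prevalence
    return events * mean_cost_per_event


# Hypothetical inputs: 2 million at-risk beneficiaries, 10% prevalence, $50 per event.
total = estimated_spending(2_000_000, 0.10, 50.0)
```

A rare but expensive service and a common but inexpensive one can therefore produce similar totals, which is why prevalence alone does not rank services by associated spending.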
Table 3. Multivariate linear regression of regional characteristics associated with Choosing Wisely service use (N = 306 hospital referral regions)
Columns: Characteristic; Mean for characteristic†; Coefficient (95% CI), α = 0.66
Regional health system characteristics
Physician group concentration: −0.008* (−0.148, −0.002)
Specialist/Primary care ratio: 0.343* (0.060, 0.626)
(row label missing): 0.069 (−0.026, 0.164)
Regional population characteristics
(row label missing): −0.096 (−0.281, 0.090)
Percent with poor or fair health: 0.026* (0.001, 0.052)
(row label missing): 0.018*** (0.011, 0.025)
(row label missing): 0.015*** (0.001, 0.023)
(row label missing): 0.003 (−0.001, 0.006)
(row label missing): −0.025** (−0.044, −0.006)
The Choosing Wisely initiative identified a set of low-value services via high-level expert opinion and consensus. We carefully constructed 11 claims-based algorithms to quantify and track utilization likely to represent overuse by relying on the recommendations from the Choosing Wisely program. Analysis of these services revealed substantial overuse and variation in overuse in the Medicare population by measure and geography. From both patient and societal perspectives, use of these services may have substantial health and economic implications. Some of the measured services represent treatments that may directly confer risk of harm (e.g., opioids in migraine patients), some may directly confer risk of harm and result in significant spending (e.g., antipsychotics in dementia patients) and others may indirectly confer risk of downstream harm by prompting additional testing and possibly resulting in false positive results (e.g., non-indicated preoperative cardiac testing). Our analysis provides an estimate of the opportunity for improving quality while reducing spending on these 11 services.
We found adjusted Medicare spending was positively associated with use of low-value services after controlling for regional health indicators. Many areas identified by others as having consistently high adjusted Medicare spending (e.g., McAllen, TX; Manhattan and Long Island, NY; Miami, FL; and Los Angeles, CA) also have high use of low-value services, indicating that at least some of their high spending results from wasteful services. The strong association between the proportion of racial and ethnic minority beneficiaries in the region and greater use of low-value services in these exploratory regressions raises questions. We suspect this association is not due to individual-level differences in treatment between racial and ethnic groups, but rather is an artifact of practice styles in regions where these population sub-groups live. Previous research has shown that where a patient lives can affect the level and quality of health care the patient receives independent of individual characteristics, and that overuse patterns do not differ by insurance type.12, 13, 14, 15
Recent evidence indicates provider organizations and regions with a higher proportion of primary care physicians have lower utilization and spending and better use of recommended preventive and chronic care.16 Moreover, workforce characteristics explain 42% of the state-level variation in Medicare spending per beneficiary.17 The magnitude of the association between specialist ratio and low-value care in our study echoes these results, but does not suggest an obvious policy intervention. It is unknown whether this observation reflects excess testing by specialists or by all types of physicians in regions with a higher relative concentration of specialists.
Overuse of other services not included in our analysis may display different patterns than the 11 services we measured. Our estimates of overuse, however, include generalist- and specialist-directed care, expensive and inexpensive tests and procedures, and a broad range of specialty society lists. The main limitation of this research is our reliance on administrative claims to identify and describe use of low-value services.18 Claims may not provide the clinical detail needed to definitively identify certain examples of low-value care. Claims may miss important patient history such as long-term, untreated back pain that contributes to clinical decision-making and justifies services that would appear in claims as low-value. Often the same service can be high- or low-value depending on the patient; if the cohort exclusions are not adequately detailed, the measure will represent utilization of the procedure and not overuse. While claims data are not ideal for measurement of patient risk or symptoms, we provide algorithms to represent each recommendation and believe they are conservative starting points to estimate the use of these services, associated spending, variation in spending and correlates of use. These algorithms are valuable for research and discussion; use of these algorithms for quality measurement or payment by payers will require validation by chart review. We do not expect the “right” rate for these claims-based measures to be zero, but the differences across geography suggest what is achievable. In research on claims-based measurement of cancer treatment quality, Earle et al. define the 10th percentile as the benchmark for health care systems to set as a goal.19 In their work evaluating the intensity of end-of-life cancer care, for example, this meant that hospitals would be providing appropriate-intensity care if less than 2% of patients started a new chemotherapy regimen in the last 30 days of life. 
We report the 25th percentile for each measure in Table 2 as a conservative initial benchmark for clinicians and health systems to work toward. This benchmark may have to change as the quality of care improves.
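A percentile benchmark of this kind can be derived from regional rates as in the sketch below. The HRR names and rates are invented for illustration, and the "inclusive" quantile method is an assumption, since the text does not state how the percentile was computed.

```python
from statistics import quantiles

# Hypothetical prevalence of one low-value service in five HRRs.
hrr_rates = {
    "HRR A": 0.10,
    "HRR B": 0.20,
    "HRR C": 0.30,
    "HRR D": 0.40,
    "HRR E": 0.50,
}

# First quartile cut point = 25th percentile across HRRs.
benchmark = quantiles(hrr_rates.values(), n=4, method="inclusive")[0]

# Regions whose use exceeds the benchmark are candidates for improvement efforts.
above_benchmark = sorted(h for h, r in hrr_rates.items() if r > benchmark)
```

Recomputing the benchmark on newer data would lower the target as regional rates fall, consistent with the expectation that the benchmark changes as care improves.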
Our analysis of the correlates of low-value care was exploratory and aimed at generating hypotheses. As an ecological correlation analysis, it was not based on individual patient- and provider-level modeling. Each low-value test or procedure likely has its own profile and is differentially affected by payment incentives, malpractice liability concerns, physician comfort with diagnostic uncertainty and patient demand for services, among other factors. Nonetheless, several of the patterns observed, including the association of higher spending and greater specialist supply with a greater provision of low-value care, are consistent with previous work and should serve as the basis for developing a conceptual framework for decision making around low-value care utilization.17,20
The measures developed for this study may help policymakers and payers focus attention on the forms of low-value care that are most harmful, prevalent or costly. Our conservative estimate of the spending for the low-value care services included in our analysis represents a small part of the overall cost problem and does not include other costs associated with the service or downstream costs, but is still an important starting point and opportunity for savings. Reduction in use of the services we measure will improve quality while lowering costs – changes that are hard to find in health care. Our measures also provide a baseline against which to test the impact of policies aimed at controlling costs and improving the efficiency of health care delivery, including, but not limited to, those that target low-value services directly.
The Choosing Wisely initiative has labeled services as low-value and has begun educating both patients and physicians through outreach material.21 Future work should examine the effects of the Choosing Wisely initiative and related programs on use of these services, as well as other reforms (such as accountable care organizations or value-based insurance design) that are intended to slow spending growth and reduce waste in health care. Identifying and eliminating low-value care is a critical component of health care reform and one in which careful measurement and targeting of policies will be essential to maximizing value and minimizing unintended harm.
The authors would like to thank Daniel Gottlieb and Rebecca Zaha for analytic support, as well as Brook Martin and Joan Teno for assistance with measure development.
This study was supported by grants from the Robert Wood Johnson Foundation’s Changes in Health Care Financing and Organization (HCFO) Initiative (#70729), the National Institute on Aging (P01 AG019783 and K23 AG035030), and The Commonwealth Fund (#20130339). These organizations had no role in the collection, analysis and interpretation of the data, or approval of the finished manuscript.
This work was previously presented in a policy roundtable discussion sponsored by the Robert Wood Johnson Foundation’s Changes in Health Care Financing and Organization (HCFO) Initiative on January 27, 2014.
Conflict of Interest
The authors declare that they do not have a conflict of interest.