Take-Home Points

• Increasing emphasis on value in health care has spurred the development of value-based and alternative payment models. Inherent in these models are choices around program scope; selecting absolute or relative performance targets; rewarding improvement, achievement, or both; and offering penalties, rewards, or both.

• We examined current Medicare value-based and alternative payment models on these elements of program design and found that while current models vary significantly across each parameter, there are scant prior data to inform which choice is best.

• Such variability in program design may represent an important opportunity to learn from existing models and create new ones in the future.

Introduction

Historic change in the way we pay for health care is underway. Public and private payers alike are increasingly moving toward alternative payment models (APMs) and value-based purchasing (VBP) models that link provider payment to the quality and/or costs of care delivered. The US Department of Health and Human Services has set a goal of tying 50% of Medicare reimbursements to APMs by 2018, with 90% of remaining fee-for-service payments tied to quality or value.1 These goals and other reforms, including the bipartisan Medicare and CHIP Reauthorization Act, or MACRA, which establishes widespread value-based physician payment and provides incentives for participation in APMs, are broadly spurring the development of new models. Emphasis on value in health care seems likely to persist, even given the recent change in administration.

In the creation of new payment models, policymakers face choices about program design: in particular, how to measure and reward quality and cost-savings. Alternative approaches differ fundamentally, and each has pros and cons.2 In this article, we examine four dimensions of program design central to program function (Table 1): first, the scope of a program’s measures; second, whether performance targets are absolute or relative; third, whether a program rewards improvement or achievement or both; fourth, whether incentives are framed as penalties or rewards.

Table 1 Elements of Program Design in Current Medicare Value-Based and Alternative Payment Programs

Using this framework, we examined seven Medicare programs (Table 1). These included three hospital programs—the Hospital Readmissions Reduction Program (HRRP),3 Hospital Value-Based Purchasing Program (HVBP),4 and Hospital-Acquired Conditions Reduction Program (HACRP)5—and three ambulatory programs—the Medicare Advantage (MA) Quality Star Rating program6 and the Physician Value-Based Payment Modifier (VM)7 and its successor, the Merit-Based Incentive Payment System (MIPS).8 We also examined one large voluntary APM, the Medicare Shared Savings Program (MSSP).9 The design choices made for these programs have been widely divergent, and thus a critical examination may offer insights to inform the design of new payment and delivery models in the future.

Program Scope (Broad vs. Narrow)

One basic program design decision is the number and diversity of measures on which providers will be evaluated and payment will depend. Table 1 demonstrates that VBP/APM programs in Medicare vary widely in this regard. For example, in the hospital setting, the HRRP focuses exclusively on risk-standardized excess readmissions and the HACRP on patient safety. In contrast, HVBP will include 21 measures in five domains in fiscal year (FY) 2017, including both quality and cost measures; the Physician VM program lets clinicians choose from over 200 quality measures and includes cost measures as well.

There are likely tradeoffs between a broad versus narrow scope for the performance measures included in a program. Programs that use a broader set of measures may spur providers to undertake more intensive systems-based approaches to overall quality improvement. On the other hand, targeted programs may be less administratively burdensome and could make critical areas for improvement especially salient.

There are few data to support either a broad or targeted approach in terms of impact on outcomes. Support for a broad approach may come from results of the United Kingdom Quality and Outcomes Framework, a financial incentive program for general practitioners aimed at measures of disease control and prevention (diabetes treatment, immunizations, etc.). Studies demonstrated that performance improved for most incented indicators, though many gains were modest.10 The broad-based HVBP program has been associated with improvements in processes of care, but has had little demonstrated impact on outcomes included in the program.11 On the other hand, the highly targeted HRRP, which focuses only on readmissions, has been associated with a significant drop in readmission rates that was largest at poorly performing hospitals and for targeted conditions.12–14

Another important issue is that of “teaching to the test”: when only a limited number of outcomes are measured, others—which may be equally important to patients and clinicians—may be neglected. In the UK program mentioned above, for example, improvement slowed for quality measures that were not specifically incented.15 Interestingly, in the same program, performance remained high for some incentivized quality indicators even after the indicators were retired,16 a pattern that has also been seen in an incentive program in US Veterans Affairs hospitals.17 This suggests that phasing a broad set of measures in and out, rather than choosing a small static set, may be a useful strategy.

The appropriate scope of measures for a given program will depend on its goals. As noted above, the HRRP and HACRP have a specific thematic focus, and thus a narrow set of measures is appropriate. In contrast, HVBP, MSSP, and VM are programs intended to change the way care is delivered across conditions, and thus a broader set is necessary. One strategy might be to have specific, targeted programs for the highest-priority conditions or issues and broad-based, frequently updated programs to improve care more generally.

Setting Targets Based on Absolute or Relative Performance

Another program design element is whether benchmarks are set on a relative or absolute basis—whether providers are graded on actual performance or “on a curve.” This element differs across current Medicare VBP programs and APMs. For example, under the HRRP, there is no target readmission rate—rather, whether or not a hospital is penalized depends on its performance relative to others in any given year. Similarly, for the HACRP, the hospitals in the highest quartile of adverse patient safety events and infection rates annually are penalized, regardless of the absolute performance those hospitals achieve. Conversely, for MSSP, performance targets are set based on the distribution of performance in the prior year, so participating providers know ahead of time that achieving a specific compliance rate will earn full points for that measure. In theory, all ACOs participating in the MSSP could earn perfect quality scores if all met the pre-set benchmarks.
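To make the distinction concrete, the two benchmarking approaches can be sketched in a few lines of Python. All rates, the quartile rule, and the fixed target below are invented for illustration and do not reflect CMS's actual methodology:

```python
import statistics

# Hypothetical adverse-event rates per 1,000 discharges (invented data).
rates = {"A": 2.1, "B": 3.4, "C": 5.0, "D": 6.8,
         "E": 1.7, "F": 4.2, "G": 7.5, "H": 3.0}

# Relative benchmarking (HACRP-style): the worst quartile is penalized
# every year, regardless of how good its absolute performance is.
cutoff = statistics.quantiles(rates.values(), n=4)[2]  # 75th percentile
relative_penalized = {h for h, r in rates.items() if r > cutoff}

# Absolute benchmarking: providers know the target in advance, and in
# principle every provider could avoid the penalty (target is invented).
TARGET = 7.0
absolute_penalized = {h for h, r in rates.items() if r > TARGET}

print(sorted(relative_penalized))  # a quarter of hospitals, by construction
print(sorted(absolute_penalized))  # could be empty if all meet the target
```

Note that under the relative rule, if every hospital halved its rate, the same quarter of hospitals would still be penalized; under the absolute rule, all penalties would disappear.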

There are advantages and disadvantages to each approach, but few data to support either. One consideration is how relative vs. absolute targets are perceived by clinicians and contribute to behavior change. Absolute benchmarks give providers specific targets to meet, which may be more meaningful to clinical leaders and front-line staff and may encourage collaboration across providers. Relative benchmarks, on the other hand, may feel more abstract and may discourage collaboration. One criticism of the HRRP has been that its relative benchmarks mean that even if all hospitals improve their readmission rates, the majority will still receive penalties; this may undermine clinicians’ sense that their improvement efforts are meaningful.

From the payer perspective, however, the tradeoffs between absolute and relative benchmarks are quite different. Relative performance assessment allows the payer to prospectively assure budget neutrality by ensuring that the number of “winners,” or at least their winnings, can balance losses by the “losers,” while absolute benchmarking offers much less financial certainty. Relative benchmarking may also be easier to implement because it allows the distribution of observed performance to determine rewards and penalties and does not require a long run of baseline data with which to set parameters for expected performance.
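The payer's budget-neutrality argument can be illustrated with a toy redistribution, assuming invented payment amounts, invented performance scores, and a hypothetical 1% withhold:

```python
# Invented base payments and performance scores for four providers.
payments = {"P1": 10_000_000, "P2": 8_000_000, "P3": 12_000_000, "P4": 9_000_000}
scores = {"P1": 40, "P2": 75, "P3": 55, "P4": 90}  # higher is better

# Relative assessment: bottom half are "losers," top half are "winners."
ranked = sorted(scores, key=scores.get)
losers, winners = ranked[: len(ranked) // 2], ranked[len(ranked) // 2:]

PENALTY_RATE = 0.01  # hypothetical 1% withhold from the losers
pool = sum(payments[p] * PENALTY_RATE for p in losers)

# Redistribute the pooled penalties to winners, proportional to base payments.
winner_base = sum(payments[p] for p in winners)
adjustments = {p: -payments[p] * PENALTY_RATE for p in losers}
adjustments.update({p: pool * payments[p] / winner_base for p in winners})

# Penalties exactly fund bonuses, so the payer's net outlay is zero.
print(round(sum(adjustments.values()), 6))
```

With absolute benchmarks, by contrast, the number of providers beating the target, and hence the total bonus outlay, is unknown until the performance year ends.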

Rewarding Improvement or Achievement

Whether to reward improvement or achievement is another important dimension of program design. Again, programs vary: the HRRP, HACRP, and VM programs do not explicitly reward improvement, whereas HVBP and MA do. The MIPS program has signaled it will reward improvement, though details on how this will be done have not been released. The MSSP focuses heavily on improvement: during their first 3 years in the program, ACOs in the MSSP are evaluated against their own historical spending rather than any external benchmark.

Whether a program rewards improvement or achievement can significantly impact which providers do well and poorly under the program. If providers are evaluated only on achievement, the highest-performing providers at baseline will likely do best.18 For example, under the achievement-only HRRP, the highest-performing hospitals at baseline were the most likely to avoid penalties. The lowest-performing hospitals actually improved more quickly over the first 3 years of the program, but many still received penalties in every program year because they started out far behind the best performers and did not fully catch up.13,14 Baseline low performers, on the other hand, may benefit the most from improvement opportunities. The purest form of rewarding improvement, evaluating providers against their own historical performance, may give baseline poor performers the best opportunity, assuming that there is “low-hanging fruit” that can be addressed. Early experience from the MSSP as well as the Pioneer ACO program suggests that the most expensive baseline providers were the most likely to save money, supporting this possibility.19,20 However, only rewarding improvement may also mean giving financial rewards to providers who have improved, but are nonetheless delivering suboptimal or even substandard care, or, on the other hand, failing to reward persistently excellent performers whose year-upon-year performance changes little.

Another related issue is that of risk adjustment. Because each hospital or clinician serves as its own comparison group, improvement-based comparisons depend far less than achievement-based comparisons on accurate risk adjustment to enable fair comparisons between peers. This may be of particular salience to providers that serve medically or socially complex populations, who have been shown previously to perform more poorly on many existing VBP programs, in part because of characteristics of the patients they serve.21–24

Transparency for consumers is also a key consideration in the achievement versus improvement debate. If providers are only judged on improvement, a patient viewing a hospital’s rating might not know whether a good score was based on high absolute performance or on poor performance with high improvement over time. Given prior evidence suggesting that public reports may influence consumer choice in meaningful ways,25,26 such considerations are key. Public reporting and financial rewards could be de-coupled to avoid this particular problem.

Some combination of rewarding achievement and improvement may be optimal in most cases, which is how many current programs, including HVBP, are constructed. This offers an incentive to organizations to participate even if initial performance is low, while also recognizing high absolute levels of achievement and acknowledging that continued improvement is relatively more difficult at high levels of performance. In addition, including some absolute measures would likely help consumers directly compare provider quality, increasing transparency and promoting consumer-driven care. Some data suggest that providers may prefer this dual approach as well27 and may respond better to mixed-strategy compensation models than single-strategy ones.28,29
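As a concrete sketch of the dual approach, HVBP-style measure scoring can be approximated as the better of an achievement score (against national thresholds) and an improvement score (against the provider's own baseline). The point scales and rates below are illustrative simplifications, not CMS's exact formulas:

```python
def measure_score(rate, benchmark, threshold, baseline):
    """Score one quality measure on a 0-10 point scale (illustrative)."""
    # Achievement: compare the provider's rate with national thresholds.
    if rate >= benchmark:
        achievement = 10.0
    elif rate > threshold:
        achievement = 9 * (rate - threshold) / (benchmark - threshold) + 0.5
    else:
        achievement = 0.0

    # Improvement: compare the provider's rate with its own baseline.
    if rate > baseline:
        improvement = 10 * (rate - baseline) / (benchmark - baseline) - 0.5
        improvement = max(0.0, min(10.0, improvement))
    else:
        improvement = 0.0

    # Taking the better of the two rewards both low-baseline providers
    # who improve a lot and steady high performers who improve little.
    return max(achievement, improvement)

# A much-improved low performer still earns points via improvement...
print(measure_score(rate=0.80, benchmark=0.95, threshold=0.85, baseline=0.60))
# ...while a steady high performer earns points via achievement.
print(measure_score(rate=0.94, benchmark=0.95, threshold=0.85, baseline=0.93))
```

An achievement-only scheme would give the first provider zero points despite substantial improvement; an improvement-only scheme would short-change the second despite near-benchmark performance.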

Framing Incentives as Penalties or Bonuses

Insights from behavioral economics suggest that how incentives are framed—in particular, whether they are framed as penalties or bonuses—can affect how providers respond. As in the other elements of program design, incentive framing in current Medicare programs varies. Two of the three hospital-based programs, HRRP and HACRP, are penalty-only; HVBP, VM, and the forthcoming MIPS program are penalty-or-bonus; the MSSP (Track 1, which includes more than 90% of MSSP participants) and MA quality star rating programs are largely bonus-only, though MA imposes non-financial penalties on poor performers.

There are pros and cons to the use of penalties versus bonuses. Prospect theory holds that more value is placed on losses than on equivalent gains (“loss aversion”),30 suggesting that penalties may provide a more powerful behavioral incentive than bonuses. A related concept is that “willingness to accept” is often significantly greater than “willingness to pay”: people require much more to give something up than they would be willing to pay to acquire it.31 While there are few data in this area directly related to payment models, one hospital pay-for-performance program applied prospect theory by sending an advance incentive payment to eligible providers, on the expectation that they would be more motivated to avoid losing the payment than to achieve an equivalent gain.32 Penalties may also be more economically efficient than bonuses, since bonus programs require paying additional money to high performers in order to incent change among low performers.33

On the other hand, bonus programs may be preferred by providers,27 and thus more likely to meet political acceptance, either because they are perceived as more fair or because of loss aversion—and the larger the penalty, the larger the concern.

Other innovative financial approaches that have been used to encourage physician behavior include lotteries,34 in which participants are offered a small chance at a large reward rather than a more certain small one. Lotteries have been trialed more extensively in the patient behavior literature,35,36 but have not been systematically studied in the performance improvement setting.

There is little empirical evidence to suggest whether bonuses or penalties are more effective at scale. Recent evidence demonstrating that the HRRP (penalty-only) has been associated with reductions in readmission rates,12 while HVBP (penalty-or-bonus) has had little impact on quality of care, patient experience, or mortality rates,11,37 might support the theory that penalties lead to a stronger behavioral response from providers, though there are many other differences (including scope, as noted above) between the two programs that make drawing a solid conclusion difficult.

A related question is the size of the incentive. Historically, bonus payments have been in the range of 1–5% of payments for hospitals and 5–10% for physicians, but we know of no evidence that clearly links the size of an incentive to the behavioral response. A large pay-for-performance program in UK hospitals that offered bonuses of up to 4% was associated with improvements in mortality,38 while the Hospital Quality Incentive Demonstration (1–2% bonus, 1–2% penalty)39 and HVBP (1–2% bonus, 1–2% penalty) programs in the US were not.11 This might suggest that larger incentives have larger effects on performance, but again there are other differences between these programs that preclude firm conclusions based on these examples alone.

Other Contextual Considerations

Variation across Medicare’s payment programs also reflects the different contextual background of the programs. For example, the HACRP was established as a single-issue, penalty-only program, presumably reflecting a perception among members of Congress that unacceptable lapses were resulting in complications and poor patient outcomes. In this case, a narrow focus on safety, with a penalty-only construct, met programmatic goals. On the other hand, the forthcoming MIPS program was established as a broad effort to incent providers in multiple areas—quality, costs, use of electronic health records, and practice improvement activities. Providing bonuses as well as penalties in MIPS was critical for widespread acceptance of such a sea change in physician payment and appropriately reflective of the fact that on many quality measures there may be a range of acceptable performance. Statutory frameworks differ as well, which impacts how programs are ultimately implemented: the legislation creating the HACRP has detailed statutory language around program design, while under MIPS, CMS was given a great deal of latitude through rulemaking in determining the specifics of measurement, bonuses and penalties, and implementation. Evaluating program design options therefore requires an understanding of not only the design elements themselves, but also of program genesis and intent.

Conclusions and Future Directions

As health care moves rapidly into an era dominated by APMs and VBP, program design is central, yet our current knowledge base is inadequate. Studying the effects of program design elements in existing and future federal programs will require complex data analysis to untangle which, if any, program design features maximize goal attainment, understanding that goals differ from program to program. The use of randomized trials of different payment models40 is another powerful tool that has historically been under-utilized in this area, but holds immense potential for creating the type of knowledge about clinician behavior that could help shape future policies.41 The Center for Medicare and Medicaid Innovation (CMMI) has recently launched models that include assignment to intervention and control groups, including the Million Hearts Model42 and the Home Health Value-Based Purchasing Model,43 and these will shed important light on strategies for payment reform.