INTRODUCTION

Physician practice behaviors may be influenced by health systems and payers’ use of physician compensation incentives.1,2,3 Physician compensation incentives can be based on performance on quality measures, patient experience scores, productivity, practice finances, and practice profiling (also called practice efficiency). Physician compensation incentives have received attention due to increasing use of commercial value-based purchasing (VBP) strategies (i.e., pay-for-performance and accountable care organizations)4, 5 and an intent by Medicare to link 90% of all payments to quality.6

Despite interest and appeal, the evidence on the relationship between physician compensation incentives and clinical quality is mixed. In ambulatory care, financial incentives may improve care processes,7 yet their effect on outcomes has not been clearly demonstrated.8, 9 Additionally, physician compensation incentives have the potential for unintended consequences: wasteful efforts to game the system;10 distraction from unincentivized activities;11 adverse effects on the physician-patient relationship;12 reduced physician autonomy;13 negative effects on workplace morale;14, 15 increased physician burnout;16, 17 and incentivizing low-value care, such as increased unnecessary antibiotic18,19,20 or opioid prescribing.21

A prior study of primary care physicians in the United States found no association between physician compensation incentives and high-value care in 2007-2008.22 A more recent study found no differences in quality between primary care physicians who were practice owners and those whose compensation was based on salary, productivity, or some combination.23 That study did not assess the association between individual compensation incentives and quality of care.

An updated assessment of physician compensation incentives and quality of care among primary care physicians in the United States can inform ongoing evaluation and adoption of value-based reimbursement strategies. We conducted a cross-sectional analysis of nationally representative primary care visits to assess the association between physician compensation incentives and quality of care, including high- and low-value care.

METHODS

Data Source

The current study was an analysis of data from the National Ambulatory Medical Care Survey (NAMCS), which is a nationally representative, weighted survey of ambulatory visits to non-federally employed office-based physician practices in the United States. The National Center for Health Statistics (NCHS), which conducts the NAMCS, uses stratified sampling to recruit physicians each year. For physicians who consent to participate, the NCHS collects practice and physician-level information and samples visits during a randomly selected 1 to 2 week period. Census field representatives and medical staff use an automated survey tool to collect visit information from medical charts. The NCHS assigns visit weights to each sampled visit to account for clustering, sampling probability, and non-response to derive national annual visit rates.24 Physician participation in the NAMCS from 2012-2016 ranged from 46% to 59%. The number of unweighted sampled visits among primary care physicians ranged from 76,330 in 2012 to 13,165 in 2016. The NCHS institutional review board approved the public use of the NAMCS data, including a waiver of informed consent of participating patients.

The NAMCS captures comprehensive patient and practice information for each visit across 17 medical specialties. Data collected includes top 3 reasons for the visit; 3 diagnosis codes for years 2012-2013 and 5 diagnosis codes for years 2014-2016 (ICD-9-CM for years 2012-2015 and ICD-10-CM for year 2016); sociodemographics; limited medical history for common medical problems; comprehensive information on the patient’s medication list (up to 10 medications from years 2012-2013 and up to 30 medications from years 2014-2016) and whether the medication was new or continued; select vital signs; tests ordered; services and referrals provided; and limited lab results. Physician-level information includes practice ownership, physician-employee status, electronic health record capabilities, and practice revenue structure.

One of the NAMCS survey items asked participants, “please indicate whether the practice explicitly considers the following in determining your compensation.” Responses to this question include performance on quality measures (“specific measures of quality, such as rates of preventive services for your patients”), patient experience scores (“results of satisfaction surveys from your own patients”), individual productivity (“factors that reflect your own productivity”), financial performance of the practice (“the overall financial performance of the practice”), and practice profiling (“results of practice profiling, that is, comparing your pattern of using medical resources with that of other physicians”). For this analysis, we recoded items left blank for this survey item as “No” (1.31% of weighted total).

Measures

We constructed and analyzed all variables at the visit level. We included visits to primary care physicians by adults older than 18 years. We defined primary care physicians as those practicing internal medicine, family medicine, or general practice. We excluded visits during which the patient was not seen by a physician.

We created 17 binary outcome measures based on commonly endorsed, evidence-based quality measures.25,26,27 The 17 individual measures fall into two categories: High-Value Care and Low-Value Care (Medical Overuse; Appendix 1).

We assessed performance on each measure as the number of patients who fulfilled the numerator criteria divided by the number of patients who were eligible under the denominator criteria of that measure. Each numerator criterion was binary at the visit level (i.e., met or not met during the visit). Each denominator was the number of visits that were eligible for a given measure. For example, in the antibiotics for upper respiratory infection (URI) & bronchitis measure, the numerator was the number of visits during which a new antibiotic was prescribed, and the denominator was the number of visits for URI or bronchitis.
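The numerator/denominator definition above can be illustrated with a short sketch. This is not the study's code (the analysis was run in Stata). The `Visit` record, its field names, and the example weights are hypothetical, and applying the survey visit weight to both numerator and denominator is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    weight: float   # hypothetical survey visit weight
    eligible: bool  # visit meets the measure's denominator criteria
    met: bool       # visit meets the measure's numerator criteria


def measure_performance(visits):
    """Weighted share of eligible visits at which the numerator was met.

    Returns None when no visit is eligible for the measure.
    """
    denominator = sum(v.weight for v in visits if v.eligible)
    if denominator == 0:
        return None
    numerator = sum(v.weight for v in visits if v.eligible and v.met)
    return numerator / denominator


# Illustrative data only: two eligible visits and one ineligible visit.
visits = [
    Visit(weight=1000.0, eligible=True, met=True),
    Visit(weight=3000.0, eligible=True, met=False),
    Visit(weight=2000.0, eligible=False, met=False),
]
rate = measure_performance(visits)
```

The ineligible visit contributes to neither the numerator nor the denominator, which mirrors how exclusions are applied equally to both.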

We tested the difference in the proportions for each outcome measure based on whether the visit was to a physician with or without a given compensation incentive. We independently tested 5 compensation incentives: Quality measures, patient experience, productivity, financial performance of the practice, and practice profiling. The referent group for each comparison was visits to physicians who did not have that incentive (e.g., visits with incentives for quality measures compared to no incentives for quality measures). Clinically relevant exclusions were applied equally to both the numerator and denominator for each measure.

We also created continuous, 0-1 composite measures of high- and low-value to summarize the visit-weighted mean proportion of eligible measures met.25 To account for visits which were eligible for multiple measures, we multiplied the NAMCS survey weight by the number of eligible measures for a given visit.
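A minimal sketch of the composite calculation described above, under stated assumptions: each visit is represented by a hypothetical tuple of survey weight, number of eligible measures, and number of those measures met. The function name and data are illustrative and not taken from the study.

```python
def composite_score(visits):
    """Visit-weighted mean proportion of eligible measures met.

    visits: iterable of (survey_weight, n_eligible, n_met) tuples.
    Each visit's weight is multiplied by its number of eligible measures,
    so visits eligible for more measures count proportionally more.
    Returns None if no visit is eligible for any measure.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for survey_weight, n_eligible, n_met in visits:
        if n_eligible == 0:
            continue  # visit is not eligible for any measure in the composite
        adjusted_weight = survey_weight * n_eligible
        weighted_sum += adjusted_weight * (n_met / n_eligible)
        total_weight += adjusted_weight
    return weighted_sum / total_weight if total_weight else None


# Illustrative data only.
example_visits = [(100.0, 2, 1), (300.0, 1, 1), (50.0, 0, 0)]
score = composite_score(example_visits)
```

With this weighting the composite reduces to the weighted count of measures met divided by the weighted count of eligible measures.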

We used existing variables in the NAMCS dataset as covariates: year, patient age, race/ethnicity, number of chronic diseases, patient’s insurance at the visit, physician specialty, region of practice, office location, and office type. We constructed a covariate for physician owner/employee status from the practice ownership and physician-employee status variables. We constructed a covariate for electronic health record capabilities based on Stage 2 Meaningful Use criteria,28 which became effective in 2014, and applied it across all study years. For missing data, the NCHS imputes values for age, sex, and race/ethnicity.24 For number of chronic diseases, we imputed missing data (1.18% of sample) by using the number of prescribed medications as a surrogate marker for chronic illness burden. Missing patient insurance information and physician-level variables are recorded as “unknown.” We recoded missing physician-level variables as “other” or “no” as appropriate for the variable in our analysis.

Statistical Analysis

Our outcome variables of interest were performance on each of the 17 individual outcome measures of overuse and clinical care quality and the 2 composite measures. Our independent variables were whether or not the visit was to a physician who received compensation based on each incentive: performance on measures of quality, patient experience scores, productivity, financial performance of the practice, or practice profiling.

We used sampling weights for each visit to create a multilevel model that accounts for clustering, sampling probability, and non-response bias.24 Following NCHS guidelines,29 we ensured that statistical tests were based on a sample of at least 30 patient records, that each estimate of the weighted data had a relative standard error (RSE) less than 30%, that the item nonresponse rate was <30%, and that all records in the data files were used in each analysis to ensure correct sample variance. We excluded several measures that did not meet these criteria (Appendix 2). We used STATA statistical software (version 16.0; College Station, TX) for all analyses. All regressions were weighted for complex survey weighting methods using STATA’s ‘svyset’ suite of commands. We used ‘svy, subpop(x): test’ to estimate our regression models.
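The reliability thresholds listed above can be expressed as a simple check. The sketch below is illustrative only. The function name and argument names are hypothetical, the RSE is computed as the standard error divided by the estimate, and the example inputs are invented rather than drawn from the study.

```python
def meets_reliability_criteria(n_records, estimate, standard_error,
                               item_nonresponse_rate):
    """Apply the three numeric thresholds described in the text.

    - at least 30 unweighted patient records
    - relative standard error (SE / estimate) below 30%
    - item nonresponse rate below 30%
    """
    if n_records < 30:
        return False
    if estimate == 0:
        return False  # RSE is undefined for a zero estimate
    relative_standard_error = standard_error / abs(estimate)
    return relative_standard_error < 0.30 and item_nonresponse_rate < 0.30


# Illustrative inputs only.
reliable = meets_reliability_criteria(1238, 0.25, 0.03, 0.05)
too_few_records = meets_reliability_criteria(25, 0.25, 0.03, 0.05)
imprecise = meets_reliability_criteria(100, 0.10, 0.04, 0.05)
```

A measure failing any one of the thresholds would be a candidate for exclusion under this reading of the guidelines.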

We used the chi-square test to assess whether visits to physicians with a given compensation incentive were associated with performance on the 17 individual quality measures. Performance on each quality measure was expressed as the proportion of patients who received appropriate care at the visit as defined by that measure. For each of the 17 individual quality measures, we estimated an unadjusted logistic regression model. We then estimated a multivariable logistic regression model for each individual measure that adjusted for covariates to account for patient and physician characteristics that could be responsible for differences in quality of care. We treated all variables as unordered categorical variables, except for age, which was a continuous variable. Due to the number of individual measures that were tested, we expected that 4 or 5 measures would be significant by chance alone.

For the composite measures, we calculated the weighted mean proportion of quality measures met at visits to physicians whose compensation was or was not determined by each compensation incentive. We used fractional logistic regression to examine the independent association between each compensation incentive and performance on our composite quality measures.25 Fractional logistic regression is a quasi-likelihood model used when the dependent variable – as with the proportions used in our analysis – is between 0 and 1.30 For both composite measures, we estimated an unadjusted regression model and a multivariable regression model with the same covariates described above to adjust for patient and physician characteristics. All p-values were two-tailed and we considered a value of <0.05 as significant.
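As an aside on interpretation, an unadjusted fractional logit model with a single binary predictor is saturated, so its fitted means equal the (weighted) group means and the odds ratio is simply the ratio of the odds of those two means. The sketch below illustrates this identity and is not the study's estimation code. The function names are hypothetical, and the inputs are the rounded high-value composite percentages reported in the Results for quality-measure incentives (43% without, 47% with).

```python
def odds(proportion):
    """Convert a proportion strictly between 0 and 1 to odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def unadjusted_odds_ratio(mean_without, mean_with):
    """Odds ratio implied by two group means in a saturated logit model."""
    return odds(mean_with) / odds(mean_without)


# Rounded composite means from the Results (without vs with the incentive).
or_quality = unadjusted_odds_ratio(0.43, 0.47)
```

With these rounded inputs the ratio is roughly 1.18, in the neighborhood of the reported unadjusted OR of 1.16. The small gap is plausibly due to rounding of the published percentages. This shortcut does not apply to the adjusted models, which require fitting the full regression.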

In a sensitivity analysis, we omitted physicians who answered the NAMCS compensation survey question with responses “item left blank,” “compensation unknown” (7.5%), and “refused to answer” (0.9%), and performed the same tests as above. In our initial analysis, we chose to code “compensation unknown” as “No” because incentives only work if physicians are aware of them.1, 31

Role of the Funding Source

The Society of General Internal Medicine had no role in the design, conduct, and reporting of this study.

RESULTS

From 2012 to 2016, there were 49,850 sampled visits, which represented 1.45 billion visits to primary care physicians in the United States. Quality measure performance was used as an incentive in 22% of visits, patient experience scores for 17% of visits, productivity for 57% of visits, financial performance of the practice for 63% of visits, and practice profiling for 12% of visits (Table 1).

Table 1 Patient and Physician Characteristics at Visits to Primary Care Physicians, 2012-2016 (n=1,452,525,992)

Patient sex, age, race/ethnicity, and patient insurance status were similar among visits to physicians with each compensation incentive. The various physician compensation incentives differed by region, urban versus rural location, physician ownership, practice revenue, and electronic health record capabilities. Incentives for quality measures were least common in the South, incentives for patient experience were most common in the West, and incentives for practice profiling were most common in the Northeast. Rural areas were more likely than urban areas to have incentives for productivity and financial performance of the practice. Physician owners were most likely to have incentives for financial performance of the practice.

In unadjusted models, physician compensation for quality measure performance was associated with better performance on our high-value composite measure (43% of eligible measures met for visits to physicians without compensation for quality measure performance, versus 47% of eligible measures met for visits to physicians with compensation for quality measure performance; odds ratio (OR), 1.16; 95% CI, 1.04 to 1.30; Appendix 3A).

In adjusted models (Table 2), there were no consistent associations between any compensation incentive and individual high-value or low-value care measures. Incentives for quality measures were not associated with an increase in high-value care or a decrease in low-value care on any individual measure, though there was an association with less use of angiotensin-converting enzyme (ACE) inhibitors (47% without an incentive versus 43% with an incentive; aOR, 0.74; 95% CI, 0.59 to 0.93). Incentives for patient experience, productivity, and financial performance were not associated with an increase in low-value care for any individual measure, except for an association between productivity incentives and increased opioid prescribing (12% without an incentive versus 19% with an incentive; aOR, 1.55; 95% CI, 1.03 to 2.32). Incentives for patient experience were also associated with less unnecessary cardiac screening (9.4% without an incentive versus 6% with an incentive; aOR, 0.63; 95% CI, 0.40 to 0.99). Incentives for practice profiling were associated with an increase in tobacco cessation counseling in smokers (19% without an incentive versus 26% with an incentive; aOR, 1.47; 95% CI, 1.00 to 2.14). None of the incentives were associated with performance on our high- or low-value composite measures.

Table 2 Adjusted Associations Between Visit-level Physician Compensation Incentives and Measures of Quality of Care

In our sensitivity analysis, practice profiling was associated with more high value care (42% of eligible measures met per visit without incentive versus 47% with incentive; aOR, 1.19; 95% CI, 1.00 – 1.40). The association of productivity incentives and increased opioid prescribing in our initial analysis was no longer statistically significant (13% vs 19%; aOR, 1.37; 95% CI, 0.88 – 2.14). Other findings from sensitivity analysis are annotated in Appendix 3A-E.

DISCUSSION

In a nationally representative survey of visits to primary care physicians in the United States, physician compensation incentives for quality measure performance, patient experience, productivity, financial performance of the practice, and practice profiling were not associated with quality of care. In multivariable modelling examining associations between physician compensation incentives and quality, 4 associations were significant. Only 1 significant association makes intuitive sense (productivity incentives “increasing” opioid prescriptions). With 5 physician compensation incentives evaluated across 19 quality measures (17 individual and 2 composite; 95 comparisons in total at a significance threshold of 0.05), we would expect 4 or 5 to be significant by chance alone. This suggests that individual financial incentives in physician compensation by themselves may not significantly influence quality of care.
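The chance-findings argument above rests on simple arithmetic, sketched below for illustration. The expected count follows directly from the numbers in the text (5 incentives, 19 measures, threshold 0.05). The binomial tail probability additionally assumes that the 95 tests are independent, which is a simplifying assumption that likely does not hold exactly here because the measures and incentives are correlated.

```python
from math import comb


def expected_false_positives(n_tests, alpha):
    """Expected number of significant results if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(at least k significant results) under independence (binomial model)."""
    prob_fewer_than_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - prob_fewer_than_k


n_tests = 5 * 19  # 5 incentives x 19 quality measures = 95 comparisons
expected = expected_false_positives(n_tests, 0.05)  # 95 x 0.05 = 4.75
p_four_or_more = prob_at_least(4, n_tests, 0.05)
```

An expected count of 4.75 is consistent with the statement that 4 or 5 significant associations would be anticipated by chance alone.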

Small physician compensation incentives are often ineffective at improving health care quality.8, 9, 32 From 2012-2016, less than 5% to 10%, on average, of primary care physicians’ compensation was based on quality, patient experience, and resource utilization.33,34,35 Thus, our findings could be consistent with the ineffectiveness of small incentives.

Larger compensation incentives may improve quality when used within VBP reimbursement strategies.9, 32 For example, the Fairview Health Services Accountable Care Organization (ACO), which based 40% of clinician compensation on performance incentives, resulted in improvement in quality among low performing clinicians. The United Kingdom’s Quality and Outcomes Framework (QOF), which used to base 25% to 30% of physician compensation on clinical performance, may have improved performance in care processes. Overall, improvements in quality resulting from physician compensation incentives within VBP strategies may exist for processes of care, and have not yet been clearly demonstrated for health outcomes.7

The potential improvements in quality from highly incentivized VBP strategies must be weighed against the unintended negative consequences that can arise from performance measurement. Theoretical examples of unintended consequences of performance measurement include neglect of unmeasured outcomes, ignoring unincentivized activities, adverse selection of patients which can increase health inequities, and erosion of trust.10 The evidence for these potential harms is mixed.7 For physicians, performance measurement can be a source of stress and burnout due to loss of autonomy, documentation burden, unreasonable expectations, and lack of control over measured outcomes.13, 15, 36,37,38 In some cases, poorly designed incentives can result in worse medical care,18,19,20,21 which was not shown in our study.

Interestingly, incentives in the QOF were later revised down to 15% due to their unintended consequences,39 so there appears to be an upper limit to how much of physicians’ compensation should be based on clinical performance. Due to the potential negative consequences of performance measurement, some have suggested that physician compensation incentives may at times be unnecessarily relied upon for motivating clinical improvement.40,41,42 When performance on quality measures is not tied to compensation, tracking quality measures can be valuable in other ways. Examples include population management, internal quality improvement goals, transparency for consumers, setting thresholds for ACO incentives, and balancing cost-containment measures in VBP models.

Another explanation for the lack of an association between compensation incentives for quality and improvements in care is improper implementation of performance measures. Measures must be clear and simple, and have appropriate size, benchmarks for performance, and measurement intervals.1 Measures must have capacity to change over time and should emphasize areas that most warrant performance improvement. Measures should have clinician buy-in, ideally with clinician involvement in measure selection and an evidence-based rationale. Incentives and measures should be congruent with clinician, institution, and community priorities.43 Measures must be attributed to and within the control of the appropriate clinician(s) and/or other stakeholders, and there must be adequate clinical and administrative support to achieve the measure. Measures that are prone to gaming should be avoided.36 In our sample, 22% of visits were to physicians with incentives for quality measure performance, 17% for patient experience, and 12% for practice efficiency. We were unable to ascertain the proportion of physicians at these visits who were practicing in a setting that appropriately implemented their performance-based compensation incentives.

Our results have limitations. First, causation cannot be inferred from this retrospective, serial cross-sectional study. Strictly speaking, we have only found a lack of association between the NAMCS physician compensation incentives and the NAMCS measures of quality. Second, we were unable to specify the degree, amount, or mechanism of each incentive. Third, we may have had low power to detect differences on some measures. For example, we may have had low power on individual measures for antibiotics prescriptions for URI (unweighted visits, n = 1,238) and statin use in patients with history of cerebrovascular accident (n = 1,114). Fourth, we did not analyze how incentives were combined or used simultaneously for physicians. Sampled physicians may have been influenced by 0 to 5 different compensation incentives which may have been larger and thus more impactful than other incentives. Additionally, the mix of compensation incentives for each physician is complicated, and could bias results toward the null. Fifth, our analysis did not correct for multiple testing, but we considered the possibility of chance in interpreting our findings. Sixth, our composite measures are skewed toward measures that can be commonly met in a given visit (e.g., inappropriate cancer screening in elderly patients), rather than measures that may be more clinically significant or impacted by performance incentives (e.g., statin use in patients with history of cerebrovascular accident or yearly hemoglobin A1c for patients with diabetes). Seventh, we created our measures for quality based on their availability in the NAMCS (e.g., weight loss counseling for obesity), rather than measures which would otherwise reflect the best or most comprehensive treatment plan for a given problem (e.g., preventing obesity). Eighth, self-report of physician incentives may result in measurement error. 
Our sensitivity analysis showed an association between practice profiling and more high-value care, which we attribute to third-party scrutiny of physician practice patterns that promotes higher quality care.

CONCLUSIONS

In a nationally representative, cross-sectional survey of primary care in the United States, we found no association between physician compensation incentives and quality of care. Physician compensation incentives can have unintended negative consequences. Prior to implementation, physician compensation incentives should be correctly sized and properly designed, and their benefits for quality should be clear.