In this parallel cluster RCT, we evaluated the effect of financial incentives on QMs in the treatment of patients with diabetes mellitus in primary care. We tested financial incentives targeted at the GP level in combination with feedback reports, in comparison to feedback reports only. Financial incentives did not have a significant effect on the primary outcomes: the proportion of patients receiving annual HbA1c measurements and the proportion of patients achieving the recommended blood pressure target level both remained stable. Some effects were observed on secondary outcomes.
Systematic reviews3,4 concluded that financial incentives targeting process QMs, which can directly be altered by providers, showed greater effects than financial incentives targeting clinical QMs, which can only be influenced indirectly. In contrast, in this study, we detected no significant effect of the financial incentives on the directly incentivized process and clinical QMs. However, in the case of the process QM of measuring HbA1c annually, for which rates of 80% were achieved, we suspect a ceiling effect that prevented further improvement. Measuring HbA1c is the most standard procedure in diabetes mellitus care, and other quality initiatives already taking place in the Swiss health care environment might have led to such high performance levels at baseline,23,24 notably the fee-for-service setting, which itself already gives an incentive for regular measurement. However, other European countries report achievement rates of over 90%,25 implying that higher rates are not beyond reach. The appearance of a ceiling effect for HbA1c measures is also supported by our models, which reveal smaller variability of random effects at the practice and GP level for HbA1c than for the other QMs. Furthermore, the presence of ceiling effects is supported by a comparison with the achievement rates of 2014, when lower achievement rates but higher variation were observed.26
Although we observed no effect on directly incentivized QMs, we observed some effects on non-incentivized process QMs. It is known that spill-over effects of financial incentives on non-incentivized QMs can occur, generally with smaller effect sizes than in incentivized QMs.27 Spill-over effects might indicate that GPs' awareness of the condition improved and that GPs adopted a more holistic approach to the disease.28 Hysong et al. also reported that financial incentives improved care documentation without necessarily improving the care provided. Despite these plausible mechanisms of spill-over effects, our finding should be treated with caution, as the effect on directly incentivized QMs was limited (presumably due to ceiling effects) and other unknown confounders might matter.
Our results show that the potential of financial incentives to increase proportions of patients fulfilling a QM might be limited, and that feedback reporting with educational aspects alone had no effect. Two explanations seem plausible. First, the promised incentive of 75 Swiss francs per percentage point improvement was rather low. With the assumed effect size, the amount achieved would have been slightly less than 1% of the annual fee-for-service income of a Swiss GP. Compared to the QOF, where GPs received financial incentives of up to 25% of their income,29 our incentive was most likely too low to have a major effect. Second, P4P initiatives and feedback reporting are complex, and behavioral economic principles and design aspects play an important role. Factors such as the amount of the incentive, the incentive frequency, and the level of payment influence the effectiveness of incentives.30,31,32 Feedback reports are found to be most effective when they are actionable and individualized and when specific goals are set.33,34
Strengths and Limitations of This Study
To our knowledge, this is the first cluster RCT testing a P4P strategy in diabetes care within Europe. In general, quality improvement studies in diabetes care are important because of the high prevalence and disease burden. We were able to implement a simple P4P intervention and to blind the control group. Additionally, we were able to compare the study results with another cohort not participating in the study. This study therefore closes a gap left open by the many observational studies. Compared to the nationwide, governmentally authorized intervention in the UK, we were only able to test our intervention with a comparably small number of GPs, although the sample was large enough to validly detect potential effects.
The major limitation of this study is its inherent risk of selection bias, as GPs participate on a voluntary basis in the FIRE project. GPs participating in the FIRE project account for 8% of GPs working in the German-speaking region of Switzerland,35 but they are most likely more engaged in research and quality of care than their counterparts not participating in the FIRE project.36 Additionally, the use of EMRs favors higher quality,37,38 and whereas EMRs are used by only around 70% of GPs in Switzerland,39 an EMR is a key requirement for participating in the FIRE project.
In the baseline analysis of this RCT, we conducted further investigations to disentangle the complex interplay of different factors influencing quality of care, also including patient characteristics.40 However, the influence of the available explanatory variables at the practice, GP, and patient level on performance was surprisingly small. Another potential explanatory variable, which was not systematically retrieved, is the availability of specifically trained chronic care nurses. To our knowledge, only one practice had a specially trained nurse, since in Switzerland it is highly uncommon for trained nurses to work in GP practices.
Further limitations arose from the underlying database. Such databases are known to be prone to missing data and data quality issues. These issues might not be apparent at baseline due to randomization. However, we cannot preclude that the increases in proportions of patients fulfilling process QMs are due to better data reporting, and not due to enhanced care delivery. This confounder might be of particular concern in the process QM of blood pressure measurement, as it is very likely that the blood pressure values, which need to be entered manually, are not exclusively reported in the intended field. However, cross-validation indicated high validity of the database.41 Lastly, limited knowledge was available about patient deaths, as only very few GPs code them appropriately to be visible for FIRE. To overcome this limitation, patients with no consultation within the last 12 months were excluded as lost to follow-up.