Growing Literature, Stagnant Science? Systematic Review, Meta-Regression and Cumulative Analysis of Audit and Feedback Interventions in Health Care
- Cite this article as:
- Ivers, N.M., Grimshaw, J.M., Jamtvedt, G. et al. J GEN INTERN MED (2014) 29: 1534. doi:10.1007/s11606-014-2913-y
This paper extends the findings of the Cochrane systematic review of audit and feedback on professional practice to explore the estimate of effect over time and examine whether new trials have added to knowledge regarding how to optimize the effectiveness of audit and feedback.
We searched the Cochrane Central Register of Controlled Trials, MEDLINE, and EMBASE for randomized trials of audit and feedback compared to usual care, with objectively measured outcomes assessing compliance with intended professional practice. Two reviewers independently screened articles and abstracted variables related to the intervention, the context, and trial methodology. The median absolute risk difference in compliance with intended professional practice was determined for each study, and adjusted for baseline performance. The effect size across studies was recalculated as studies were added to the cumulative analysis. Meta-regressions were conducted for studies published up to 2002, 2006, and 2010 in which characteristics of the intervention, the recipients, and trial risk of bias were tested as predictors of effect size.
Of the 140 randomized clinical trials (RCTs) included in the Cochrane review, 98 comparisons from 62 studies met the criteria for inclusion. The cumulative analysis indicated that the effect size became stable in 2003 after 51 comparisons from 30 trials. Cumulative meta-regressions suggested new trials are contributing little further information regarding the impact of common effect modifiers. Feedback appears most effective when: delivered by a supervisor or respected colleague; presented frequently; featuring both specific goals and action-plans; aiming to decrease the targeted behavior; baseline performance is lower; and recipients are non-physicians.
There is substantial evidence that audit and feedback can effectively improve quality of care, but little evidence of progress in the field. There are opportunity costs for patients, providers, and health care systems when investigators test quality improvement interventions that do not build upon, or contribute toward, extant knowledge.
KEY WORDS: audit and feedback; scientific progress; quality improvement; systematic review; cumulative analysis
Audit and feedback is widely used as a strategy to improve professional practice, either on its own or as a key component of multifaceted quality improvement (QI) interventions. Providing data regarding clinical performance may overcome health professionals’ limited abilities to accurately self-assess their performance.1 It is posited that when well-designed feedback demonstrates suboptimal performance for important and actionable targets, recipients are more likely to respond with efforts to improve quality of care.2
Findings from Cochrane Systematic Reviews and Meta-Analyses of Audit and Feedback Over Time
Year of review
2003 (search up to January 2001)
Forty-seven studies with dichotomous outcomes: 7 % (IQR: 2–11) median absolute increase in compliance with intended professional behaviors or processes
“Audit and feedback can be effective in improving professional practice. When it is effective, the effects are generally small to moderate. The absolute effects of audit and feedback are more likely to be larger when baseline adherence to recommended practice is low.”4
2006 (search up to January 2004)
Forty-nine studies with dichotomous outcomes: 5 % (IQR: 3–11) median absolute increase in compliance with intended professional behaviors or processes
“Audit and feedback can be effective in improving professional practice. The effects are generally small to moderate. The absolute effects are likely to be larger when baseline adherence to recommended practice is low and intensity of audit and feedback is high.”5
2012 (search up to December 2010)
Sixty-two studies with dichotomous outcomes: 4 % (IQR: 1–16) weighted median absolute increase in compliance with intended professional behaviors or processes
“Audit and feedback generally leads to small but potentially important improvements in professional practice. The effectiveness of audit and feedback seems to depend on baseline performance and how the feedback is provided. Future studies of audit and feedback should directly compare different ways of providing feedback.”3
In some instances, audit and feedback is highly effective; learning from such examples is necessary to optimize the effectiveness of the intervention across different contexts. The Cochrane review and associated re-analyses have found that the effectiveness of audit and feedback depends to some extent on how the intervention is designed and delivered, suggesting an opportunity to maximize the impact of this QI strategy on quality of care.3,7,8 However, there is evidence that many audit and feedback interventions are developed and tested without an explicit attempt to consider relevant theories or to build upon extant knowledge.9 Ideally, results of early studies would inform the design of future interventions, and through this process, cumulative knowledge would lead to more effective QI. Given the continuing human and financial capital invested in audit and feedback interventions in health care, it is important to examine whether newer trials of audit and feedback have contributed new knowledge to the field.
The purpose of this paper is to extend the results of the Cochrane review of audit and feedback to explore the evolution of evidence supporting this QI intervention over time. In particular, we examined whether effect estimates, and the precision around those estimates, changed over time. To do this, we undertook a cumulative analysis of trials by year of publication and conducted a series of meta-regressions to understand how the literature has developed with respect to determining factors that could explain why audit and feedback is more or less effective.
This is a secondary analysis of data from the previously published Cochrane systematic review of audit and feedback. Complete methodological details are available3 and are summarized below. Ethics approval was not required for this study.
Audit and feedback was defined as a “summary of clinical performance of health care over a specified period of time.” This secondary analysis only included RCTs that directly compared audit and feedback (either alone or as the core, essential feature of a multifaceted intervention) to usual care. Furthermore, only RCTs that evaluated effects on provider practice as a primary outcome were included. For ease of interpretation of the meta-regression and cumulative meta-analysis, we further limited studies to those that reported dichotomous outcomes (i.e., compliance with intended professional practice).
Information Sources, Search, and Study Selection
A search strategy sensitive for RCTs involving audit and feedback was applied in December 2010 to the Cochrane Central Register of Controlled Trials, MEDLINE, EMBASE, and CINAHL. As previously described,3 we developed a MEDLINE search strategy that identified 89 % of all MEDLINE indexed studies from the previous version of the review and then translated this strategy into the other databases using the appropriate controlled vocabulary as applicable. Search terms included: audit, benchmarking, feedback, utilization review, health care quality, etcetera, plus typical search terms to focus on RCTs. Two reviewers independently screened the titles, abstracts, and full texts to apply inclusion criteria.
Data Collection Process
Two reviewers independently abstracted data from included studies. Studies included in the previous version of the Cochrane review of audit and feedback were reassessed due to changes in the data abstraction form and methods. Discrepancies were resolved through discussion. For studies lacking extractable data or without baseline information, we contacted investigators via email. Risk of bias for the primary outcome(s) in each study was assessed according to the Cochrane Effective Practice and Organization of Care group criteria10 (sequence generation, allocation concealment, blinding, incomplete outcome data, selective reporting, baseline similarity, lack of contamination, and other). We assigned an overall assessment of the risk of bias for each study as high, moderate, or low, following the recommendations in the Cochrane Handbook.11 Studies with a high risk of bias in at least one domain that decreased the certainty of the effect size of the primary outcome were considered to have a high risk of bias. Conversely, when a study had low risk of bias for each domain, it was deemed low risk of bias overall. Other studies were considered to have unclear risk of bias.
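The overall risk-of-bias rule described above can be sketched as a small function. This is only an illustration: the domain identifiers and the string labels "high", "low", and "unclear" are assumptions introduced here, and the sketch treats any high-risk domain as decisive, whereas the reviewers judged whether a high-risk domain actually decreased certainty in the primary outcome.

```python
from typing import Mapping

# Domain names follow the list in the text; the identifiers are hypothetical.
DOMAINS = (
    "sequence_generation",
    "allocation_concealment",
    "blinding",
    "incomplete_outcome_data",
    "selective_reporting",
    "baseline_similarity",
    "contamination",
    "other",
)


def overall_risk_of_bias(domain_ratings: Mapping[str, str]) -> str:
    """Return 'high', 'low', or 'unclear' for one study.

    A missing domain is treated as 'unclear' (an assumption of this sketch).
    """
    ratings = [domain_ratings.get(domain, "unclear") for domain in DOMAINS]
    if any(rating == "high" for rating in ratings):
        return "high"      # at least one high-risk domain
    if all(rating == "low" for rating in ratings):
        return "low"       # every domain rated low risk
    return "unclear"       # everything else


all_low = {domain: "low" for domain in DOMAINS}
print(overall_risk_of_bias(all_low))                            # low
print(overall_risk_of_bias({**all_low, "blinding": "high"}))    # high
print(overall_risk_of_bias({**all_low, "blinding": "unclear"})) # unclear
```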
Measure of Treatment Effect
We only extracted results for the primary outcome. When the primary outcome was not specified, we used the variable described in the sample size calculation as the primary outcome. When the primary outcome was still unclear or when the manuscript described several primary process outcomes, we calculated the median value. We calculated the treatment effect as an adjusted risk difference (RD) by subtracting baseline differences from post-intervention differences. Thus, an adjusted RD of +10 % indicates that after accounting for baseline differences, health professionals receiving the intervention adhered to the desired practice 10 % more often than those not receiving the intervention.
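As a minimal sketch of this calculation, the adjusted RD is the post-intervention difference between arms minus the baseline difference between arms. The function names and the example percentages below are hypothetical and are not taken from any included trial.

```python
from statistics import median


def adjusted_risk_difference(base_int, base_ctrl, post_int, post_ctrl):
    """Adjusted RD in percentage points.

    All arguments are compliance percentages (0-100): baseline and
    post-intervention values for the intervention and control arms.
    """
    return (post_int - post_ctrl) - (base_int - base_ctrl)


def study_effect(adjusted_rds):
    """Median adjusted RD when a study reports several primary process outcomes."""
    return median(adjusted_rds)


# Hypothetical trial: intervention arm 45 % -> 65 %, control arm 40 % -> 50 %.
example = adjusted_risk_difference(
    base_int=45.0, base_ctrl=40.0, post_int=65.0, post_ctrl=50.0
)
print(example)  # 10.0, i.e. an adjusted RD of +10 %
```

Here the post-intervention difference is 15 points and the baseline difference is 5 points, so the adjusted RD is +10 %.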
Across multiple studies, we weighted the median effect by the number of health care providers. The ‘median of medians’ technique has been used in many similar reviews evaluating the effect of QI interventions on health professional performance,12 due to the frequency of unit-of-analysis errors in the literature and the great variety of clinical contexts covered in the studies. For the cumulative analysis, the median adjusted RD and interquartile range (IQR) were recalculated at each time point as studies were added. The meta-regression examined how the adjusted RD was related to explanatory variables, weighted according to study size (number of health care professionals). Unlike the meta-regression from the Cochrane review of audit and feedback,3 high risk of bias studies were included. The meta-regression also tested the following potential sources of heterogeneity to explain variation in the results of the included studies: format (verbal, written, both, unclear); source (supervisor or senior colleague, professional standards review organization or representative of employer/purchaser, investigators, unclear); frequency (weekly, monthly, less than monthly, one-time); instruction for improvement (explicit measurable target or specific goal but no action plan, action plan with suggestions or advice given to help participants improve but no goal/target, both, neither); direction of change required (increase current behavior, decrease current behavior, mix or unclear); recipient (physician, other health professional); and study risk of bias (high, unclear, low). Meta-regression was conducted for all published trials as of 2010, 2006 and 2002. Finally, we added year of publication as a continuous variable to the meta-regression of all studies as an additional approach to assess whether this variable accounted for a significant portion of the heterogeneity. We conducted a multivariable linear regression using main effects only.
Baseline compliance and year of publication were treated as continuous explanatory variables and the others as categorical. The analyses were conducted using the GLIMMIX procedure in SAS Version 9.2 (SAS Institute Inc. Cary, NC USA), accounting for the dependency between comparisons from the same trial.
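The two summary steps described above, a provider-weighted median across studies and a cumulative median with IQR recalculated as each publication year is added, can be sketched as follows. This is an illustration under stated assumptions, not the authors' SAS code: the weighted-median convention (smallest value at which cumulative weight reaches half the total), the quartile method, and all data values are choices made here. The cumulative summary is left unweighted.

```python
from statistics import median, quantiles


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight.

    One of several possible conventions; chosen here for simplicity.
    """
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("values and weights must be non-empty")


def cumulative_summary(comparisons):
    """Recalculate the unweighted median and IQR as each year's studies are added.

    comparisons: iterable of (publication_year, adjusted_rd) tuples.
    Returns rows of (year, n_comparisons, median, q1, q3).
    """
    comparisons = list(comparisons)
    rows = []
    for year in sorted({y for y, _ in comparisons}):
        values = [rd for y, rd in comparisons if y <= year]
        if len(values) >= 2:
            # The 'inclusive' quartile method is an assumption of this sketch.
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            q1 = q3 = values[0]
        rows.append((year, len(values), median(values), q1, q3))
    return rows


# Hypothetical comparisons: (publication year, adjusted RD in percentage points).
data = [(1990, 2.0), (1990, 10.0), (1995, 6.0), (2000, 4.0), (2000, 8.0)]
for row in cumulative_summary(data):
    print(row)

# Hypothetical provider counts used as weights.
print(weighted_median([2.0, 4.0, 10.0], [10, 10, 100]))  # 10.0
```

A summary that stops changing as later years are added is the kind of stability the cumulative analysis looks for.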
Characteristics of Studies
[Table: counts not recoverable. Surviving row and category labels: UK or Ireland; Australia or New Zealand; monthly or more; repeated less than monthly; unit of allocation; instructions for improvement; unit of analysis; nature of change required (increase current behavior, decrease current behavior, mix or unclear); risk of bias; number of arms in trial; targeted health professional; medical specialty (could be > 1).]
Factors Explaining Variability in Effectiveness of Feedback: Serial Meta-Regressions
[Table: the column headings and the estimated effect sizes* (no. studies) were not recoverable. For each characteristic of feedback, the surviving category labels are given in parentheses, followed by the p-values of the three serial meta-regressions in their original order.]
Format of feedback (both verbal and written): p = 0.386; p = 0.731; p = 0.729
Source of feedback (a supervisor or respected colleague; standards review org. or representative of employer): p = 0.006; p = 0.034; p = 0.300
Frequency of feedback (frequent, up to weekly; moderate, up to monthly; infrequent, less than monthly): p < 0.001; p < 0.001; p < 0.001
Instructions for improvement (explicit, measurable target, but no action plan; action plan, but no explicit target): p = 0.044; p = 0.068; p = 0.325
Nature of change required (increase current behavior; decrease current behavior; change behavior to similar alternative or unclear): p = 0.025; p = 0.028; p = 0.510
Profession of recipient (physician yes/no): p < 0.001; p < 0.001; p < 0.001
Risk of bias (yes, low risk of bias; no, high risk of bias): p = 0.375; p = 0.564; p = 0.281
Baseline performance (continuous variable): p < 0.001; p = 0.003; p = 0.021
Audit and feedback works; the median effect is small though still potentially important at the population level, and 27/98 comparisons (28 %) resulted in an improvement of at least 10 % in quality of care.3 Small differences in the results seen in these re-analyses compared to the results of the Cochrane review are due to the lack of weighting in the cumulative analysis and the inclusion of high risk of bias studies in the meta-regression. Nevertheless, the expected effect of an intervention comparing audit and feedback to usual care has changed very little over the last two decades. Furthermore, new trials have provided little new knowledge regarding key effect modifiers. Given the lack of equipoise, it may no longer be ethically appropriate to continue to direct human and financial resources toward trials comparing audit and feedback against usual care, especially for common conditions in common settings. At this point, the appropriate question is not, ‘can audit and feedback improve professional practice?’ but ‘how can the effect of audit and feedback interventions be optimized?’
Based on our analyses, feedback seems most effective when it: is delivered by a supervisor or respected colleague; is presented frequently; includes both specific goals and action-plans; aims to decrease the targeted behavior; focuses on a problem where there was larger scope for improvement; and when the recipients are non-physicians. Unfortunately, relatively few trials feature these components. Furthermore, our findings suggest that investigators are not building upon best practices. For example, despite evidence that repeated feedback is more effective, studies that evaluate interventions after only one cycle of feedback continue to be performed. Furthermore, of the 32 studies conducted after 2002 considered in this analysis, feedback was delivered by a supervisor or respected colleague only six times, and no studies included feedback with both explicit goals and action plans. As a result, even after 140 randomized trials of audit and feedback, it remains difficult to identify how to optimize audit and feedback.6 For instance, although a ‘supervisor or respected colleague’ appears to be the most effective source to deliver feedback, precise strategies to reliably identify and leverage such sources are not well known.13 In addition, while it is advisable for action plans to accompany feedback since the downside is minimal, the best way to operationalize this is unknown.7,14 It is noteworthy that explicit targets without action plans do not seem to be particularly helpful. To achieve performance targets, recipients of feedback benefit from correct solution information8 that can focus their attention on the targeted behavior(s).
Cumulative meta-analyses have previously been used to investigate whether future trials would be likely to change the conclusions regarding the effectiveness of QI or health services interventions.15,16 For audit and feedback, it is plausible that further studies comparing the intervention against control may be informative if they are conducted for settings, professional groups or behaviors not well targeted in the current review (although relatively few additional trials should be needed to confirm whether observed effects are broadly aligned with observed effects across the body of literature). We recognize the risks of cumulative meta-analysis with respect to multiple testing and escalating type I error.17 However, since the Cochrane review did not include a variance around the intervention effect, the figures showing the results of our cumulative analysis do not feature error bars as in the seminal examples of Lau et al.18 Additionally, the number of characteristics tested in the meta-regression was limited by statistical and pragmatic concerns. Variables were only chosen for abstraction if there was an a priori directional hypothesis and a belief that data would be available in published reports. Confidence in the results of the meta-regression is limited by reliance upon indirect comparisons and risk of ecological fallacy. In other words, relationships identified across studies through meta-regression may not reflect relationships evident within studies; this is also known as aggregation bias. Finally, as with any review, the limitations of the primary studies must be considered.
We acknowledge that many other potential variables, including the clinical topic and context, likely impact the effectiveness of the intervention.19,20 Amongst the 98 comparisons, there were 41 comparisons testing audit and feedback alone and 57 comparisons testing audit and feedback as the core, essential part of a multifaceted intervention. It is plausible that co-interventions may interact with the effect modifiers tested in the meta-regressions. A recent international meeting was conducted to identify high-yield research questions for understanding how to enhance the effectiveness of audit and feedback. Stakeholders suggested a need for more research to better understand how contextual and recipient characteristics moderate audit and feedback effectiveness, characteristics of the desired behavior change that make a good target for audit and feedback, and how the specific design of the audit and feedback intervention interacts with these factors.21
Given the importance of audit and feedback as a key component of many QI interventions, there is a need to identify opportunities to sequentially and systematically test various approaches to the design and development of audit and feedback. Researchers can continue to conduct uncoordinated trials of audit and feedback versus usual care and rely upon periodically conducted meta-regressions across studies to explore effect modifiers. But the results will be at risk of ecological fallacies, and as demonstrated here, this approach has resulted in minimal advances over time. Alternatively, researchers could achieve greater confidence in causal inference regarding more effective intervention design through a limited number of multi-arm trials with direct, head-to-head comparisons testing different approaches for designing and delivering audit and feedback. Another approach that could help advance cumulative knowledge regarding audit and feedback and other QI strategies would be to consider engineering-based methodological options that enable testing of multiple potential effect modifiers, such as theory-driven factorial and/or sequential adaptive trials.22 Future audit and feedback interventions should feature the aspects known to be associated with greater effectiveness and future trials should be powered to find relatively small effect sizes, especially in the case of head-to-head trials. This proposed shift in direction for QI trials parallels the movement to limit placebo-controlled trials of clinical interventions and to increase focus on comparative effectiveness research.23
The findings of this review suggest that QI trialists have failed to cumulatively learn from previous studies (or from systematic reviews). Rather, it would appear that the norm for those testing audit and feedback interventions is to ‘re-invent the wheel’, repeating rather than learning from and contributing to extant knowledge.24 As highlighted in the recent series on increasing value and reducing waste in research,25 the opportunity cost of continuing in the current manner is large for patients, providers, and health systems. A coordinated approach toward building upon previous literature and relevant theory to identify the key, active ingredients of interventions would help QI stakeholders achieve greater impact with their interventions and produce outcomes that are more generalizable.26,27 In particular, QI trialists could benefit from adapting the model of the Children’s Oncology Group, which has successfully shared resources to accelerate progress.28 At a minimum, for stakeholders involved in the funding and conduct of QI trials, this analysis emphasizes the need for trials of carefully planned interventions with explicitly justified components to ensure that the field of QI in healthcare can move forward.
This study was conceived by NMI and JMG. Analyses were conducted by JOJ. The first draft was written by NMI and revised with critical input from all authors. All authors approved the final version of the manuscript.
This study received no specific external funding. NMI is supported by research fellowships from the Canadian Institutes of Health Research and the Department of Family and Community Medicine, University of Toronto. JMG is supported by a Canada Research Chair in Knowledge Transfer and Uptake.
Conflict of Interest
The authors declare that they do not have a conflict of interest.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.