Introduction

Practice facilitation is an approach to implementing evidence-based best practices in primary care [1]. It involves bringing an external healthcare professional (often a trained nurse with management experience) into a practice to help primary care providers address the challenges associated with implementing evidence-based guidelines. Practice facilitators (also known as outreach facilitators, practice enhancement assistants, and practice coaches) are trained experts in initiating change in practices. They work with practices to identify areas for improvement, set care improvement goals, and provide tools and approaches to reach these goals. Facilitation has been found to lead to improvements in prevention, diabetes care, smoking cessation, and cancer care [2–4]. A recent meta-analysis demonstrated that primary care providers are more likely to adopt evidence-based guidelines when supported by a facilitator [5]. Furthermore, a cost-consequence analysis of a facilitation study focused on improving preventive care in primary care practices demonstrated a 40 % return on intervention investment [6].

Facilitation programs are being broadly implemented by practice-based research networks, health departments, professional associations, and health plans. As such, ongoing evaluation is needed to determine the effectiveness of practice facilitation in real-world settings.

The adoption, optimal intensity, and duration of practice facilitation remain uncertain, as does its effectiveness beyond single-disease-focused facilitation programs [1, 2, 5, 7–9].

We initiated the Improved Delivery of Cardiovascular Care (IDOCC) project, an outreach facilitation intervention, in 2007 to improve adherence to guidelines for the secondary prevention of heart disease, stroke, peripheral vascular disease, renal disease, and diabetes in primary care practices in a health region [10]. We hypothesized that facilitation would enable practices to improve overall adherence to cardiovascular care guidelines. This paper reports on our primary outcome of adherence to guidelines, measured as mean adherence to recommended care delivery in the practice based on patient-level data.

Methods

The IDOCC study protocol has been published elsewhere [10], so here, we provide a brief overview of the methods as per CONSORT [11].

Setting and participants

We conducted the study in a large health region in Eastern Ontario, Canada (16,000 km²), encompassing Ottawa and its surrounding rural communities. It is a culturally diverse region of 1.2 million people whose chronic disease burden and health outcomes are comparable to those of Ontario and the rest of Canada [12].

Practices were eligible to participate if they provided general primary care services and had been in operation for at least 2 years prior to the initiation of the intervention. We enlisted practices if at least one physician from that practice consented to participate. Physicians received no monetary compensation for participating but were eligible for continuing professional development credits with the College of Family Physicians of Canada.

Design

We used a stepped wedge cluster randomized trial design (see Fig. 1) in which the intervention was sequentially rolled out to participating primary care practices assigned by region in three steps. Each step provides data for both control and intervention periods, and data analysis proceeds by comparing time points across steps in control versus intervention periods.

Fig. 1 IDOCC stepped wedge study design. Shaded cells represent intervention periods; blank cells represent control periods.

We chose the stepped wedge design to (1) minimize the practical, logistical, and financial constraints associated with large-scale project implementation, (2) control for the effect of time, and (3) ensure that all practices in the project were eventually offered the intervention [13]. The Ottawa Hospital Research Ethics Board approved this trial (2007292-01H).

Randomization

We systematically divided the health region into nine geographic divisions using geographic information system mapping technology [14], stratifying the divisions by their location within the region (i.e., west, central, and east). Using computer-generated random numbers provided by an independent statistician, we randomly assigned the three divisions in each stratum to the three study steps. Each step thus comprised three divisions, one from each of the east, central, and west parts of the region, and each division had the same probability of beginning the program at any given step.
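
To make the allocation concrete, the following minimal sketch reproduces the stratified assignment logic (the division labels and random seed are illustrative placeholders; the trial used random numbers supplied by an independent statistician):

```python
import random

# Nine geographic divisions, three per stratum (west, central, east).
# Division labels are illustrative placeholders, not the trial's divisions.
strata = {
    "west":    ["W1", "W2", "W3"],
    "central": ["C1", "C2", "C3"],
    "east":    ["E1", "E2", "E3"],
}

random.seed(42)  # arbitrary seed so the sketch is reproducible
allocation = {1: [], 2: [], 3: []}  # steps I, II, III

# Within each stratum, shuffle the three divisions and deal one to each
# step, so every step receives one west, one central, and one east division.
for divisions in strata.values():
    for step, division in zip((1, 2, 3), random.sample(divisions, 3)):
        allocation[step].append(division)

for step, divisions in allocation.items():
    print(f"Step {step}: {divisions}")
```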

We developed a list of contact information for all physicians practicing in the geographic regions of interest, updated prior to recruitment at each step, drawing on a variety of physician listings (The College of Physicians and Surgeons of Ontario website [15], the public telephone directory, and a provincial directory of group practices) as well as direct contact with the practices. We used a modified Dillman approach to recruit practices, sending reminders and repeat mailings from the study team [16]. Recruitment continued in each step until we reached the desired sample. We obtained consent from all participating physicians.

Intervention

The intervention consisted of regular meetings with a facilitator over the study period. Practice facilitators intended to visit practices 13 times (i.e., one visit every 4 weeks) in year 1 (intensive phase) and then 4–6 times (i.e., one visit every 8–12 weeks) in year 2 (sustainability phase). Beginning with audit and feedback, consensus building, and goal setting, the facilitators supported practices in changing their behavior through the incorporation of the following chronic care model elements: (1) integrated evidence-based care guidelines and other decision support tools, (2) enhanced community linkages, (3) self-management support, and (4) delivery system redesign, such as introducing a practice population approach, recall systems, and disease-specific registries. Facilitators encouraged practices to implement small but continuous changes through the plan-do-study-act cycle, a common quality improvement tool [17]. Full details of the intervention have been published previously [10] and are available in Additional file 1.

Outcome

The primary outcome for the trial was a patient-level score reflecting adherence to guideline-recommended cardiovascular disease processes of care. The score represented the percentage of recommended process-of-care indicators for which the patient was eligible that were performed for the patient over a 1-year period. The indicators were (1) two blood pressure measures, (2) lipid profile, (3) waist circumference measure, (4) smoking status, (5) glycemic levels (two hemoglobin A1c measures for patients with diabetes or one fasting blood glucose for high-risk patients without diabetes), (6) kidney function (albumin-to-creatinine ratio or estimated glomerular filtration rate), (7) prescription of all eligible medications, and (8) referral to a smoking cessation program. For each step, we measured mean adherence for the baseline time points and for the intensive (year 1) and sustainability (year 2) phases.
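
To illustrate how such a score behaves, the sketch below computes the percentage of eligible indicators performed for a single patient; the indicator names and the None/True/False eligibility encoding are our assumptions, not the trial's abstraction variables.

```python
from typing import Optional

def adherence_score(indicators: dict[str, Optional[bool]]) -> Optional[float]:
    """Percentage of eligible indicators performed over the 1-year period.

    Encoding (assumed for illustration): None = patient not eligible for
    the indicator, True = performed, False = eligible but not performed.
    """
    eligible = {k: v for k, v in indicators.items() if v is not None}
    if not eligible:
        return None  # score undefined if the patient is eligible for nothing
    return 100.0 * sum(eligible.values()) / len(eligible)

# Hypothetical patient with diabetes: eligible for six of the eight
# indicators, four of which were performed -> score of 66.7 %.
patient = {
    "two_bp_measures": True,
    "lipid_profile": True,
    "waist_circumference": False,
    "smoking_status": True,
    "glycemic_levels": True,
    "kidney_function": False,
    "eligible_medications": None,         # no eligible medications
    "smoking_cessation_referral": None,   # non-smoker, not eligible
}
print(f"{adherence_score(patient):.1f} %")  # 66.7 %
```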

Data collection

The unit of intervention was the practice, but the unit of analysis and causal inference was the patient. Causal inference here refers to the fact that although the intervention aimed to produce a system-level change in the practice, this change was assessed at the patient level through review of medical charts from the practice. For each practice, we randomly selected 66 charts of patients who were over age 40 and (1) had cardiovascular disease including coronary artery disease, cerebrovascular disease (stroke and/or transient ischemic attack), or peripheral vascular disease; (2) had diabetes or chronic kidney disease; or (3) were at high risk of developing cardiovascular disease as defined by age (males ≥45, females ≥55), smoking status, hypertension, weight, or dyslipidemia. The same patients were followed over time to assess outcomes.
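
These selection criteria can be expressed as a simple predicate; the sketch below uses hypothetical field names rather than the trial's abstraction form.

```python
def chart_eligible(p: dict) -> bool:
    """Eligibility for chart selection, per the criteria above.

    `p` is a hypothetical patient record with boolean flags; field names
    are illustrative assumptions, not the trial's data dictionary.
    """
    if p["age"] <= 40:
        return False
    has_cvd = (p["coronary_artery_disease"] or p["cerebrovascular_disease"]
               or p["peripheral_vascular_disease"])
    has_dm_or_ckd = p["diabetes"] or p["chronic_kidney_disease"]
    high_risk_age = ((p["sex"] == "M" and p["age"] >= 45)
                     or (p["sex"] == "F" and p["age"] >= 55))
    high_risk = (high_risk_age or p["smoker"] or p["hypertension"]
                 or p["overweight"] or p["dyslipidemia"])
    return has_cvd or has_dm_or_ckd or high_risk
```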

We collected data for each time block in the stepped wedge design, as highlighted in Fig. 1. One year of baseline data from each practice was used for the audit and feedback stage of the facilitation intervention. In total, step I practices provided 3 years of data (1 baseline year, intensive year, and sustainability year), step II practices provided 4 years of data (2 baseline years, intensive year, and sustainability year), and step III practices provided 5 years of data (3 baseline years, intensive year, and sustainability year).
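
The exposure schedule implied by these data years can be laid out programmatically, as in the following sketch ("B", "Int", and "Sus" are our shorthand for baseline, intensive, and sustainability years):

```python
# Build each step's sequence of study years: step k contributes k baseline
# (control) years followed by the intensive and sustainability years.
roman = {1: "I", 2: "II", 3: "III"}
for step in (1, 2, 3):
    row = ["B"] * step + ["Int", "Sus"]
    print(f"Step {roman[step]:>3}: {row}")
# Step   I: ['B', 'Int', 'Sus']
# Step  II: ['B', 'B', 'Int', 'Sus']
# Step III: ['B', 'B', 'B', 'Int', 'Sus']
```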

Six trained nurses completed the chart abstraction. A four-part quality implementation and monitoring process was used to ensure consistent levels of data quality across chart abstractors. Full details of our chart abstraction methods are described elsewhere [18].

We collected practice-level data through practice surveys and by linking family physicians’ medical ID to their demographic and practice model information contained in the Institute for Clinical Evaluative Sciences’ Corporate Provider Database, which reflects characteristics as of March 31, 2008 [17, 19].

The chart abstractors, facilitators, and practices were blinded to the full details of which data were being collected for the primary outcome analysis, and chart abstractors were additionally blinded as to whether the practice they were auditing was in the control or intervention phase.

Sample size calculation

There are as yet no published sample size formulae for a stepped wedge trial with a cohort design. Using the sample size formula for a two-arm parallel design, we determined that 21 practices in each of the intervention and control arms, with an average of 66 patients per practice (the anticipated average number of eligible patients per practice based on pilot data), would detect an 8 % absolute difference in the mean adherence score using a two-sided test at the 5 % level of significance with 90 % power. This difference was deemed clinically relevant after conducting a series of simulations followed by discussions with a panel of family physicians and cardiologists. Our calculations assumed an intra-cluster correlation coefficient of 0.18 and a common standard deviation of 18 %, calculated using IDOCC pilot data on nearly 500 patients across seven practices. Our target sample size was 30 practices in each step of the stepped wedge design to account for 20 % potential attrition. Since the stepped wedge design includes repeated observation periods per practice and is therefore more powerful than a simple parallel design, this sample size estimate was considered conservative [20].
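
For transparency, the figures above can be reproduced with the standard two-arm formula inflated by the design effect for clustering; the following is a sketch under exactly those stated assumptions:

```python
import math
from statistics import NormalDist

alpha, power = 0.05, 0.90
delta = 8.0    # clinically relevant absolute difference in adherence (%)
sigma = 18.0   # common standard deviation (%), from IDOCC pilot data
icc = 0.18     # intra-cluster correlation coefficient
m = 66         # anticipated average eligible patients per practice

z = NormalDist()
z_alpha = z.inv_cdf(1 - alpha / 2)  # ~1.96
z_beta = z.inv_cdf(power)           # ~1.28

# Patients per arm under individual (unclustered) randomization
n_individual = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2  # ~106

# Inflate by the design effect to account for clustering by practice
design_effect = 1 + (m - 1) * icc              # ~12.7
practices_per_arm = n_individual * design_effect / m  # ~20.5

print(math.ceil(practices_per_arm))  # 21 practices per arm, as stated
```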

Data analysis

We generated descriptive statistics summarizing practice and patient characteristics for the entire sample and for each step separately. We used general linear mixed-effects regression modeling, as described by Hussey and Hughes, to assess the impact of the IDOCC program on the primary outcome [21]. We assessed the normality of the distribution of the primary outcome by visual inspection of histograms.

The model included fixed effects for time, treatment phase, and region, and a random effect for the practice. Regions were treated as fixed rather than random effects because they could not be considered a sample from some population; they reflected a systematic division of Eastern Ontario. In addition, a compound symmetric covariance structure was specified to account for the correlation of repeated measures on the same patient over time. We estimated the model using restricted maximum likelihood. The treatment variable was specified as a three-level categorical variable (baseline, intensive, sustainability).
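
In the notation of Hussey and Hughes [21], the specification above can be sketched as

$$Y_{ijk} = \mu + \beta_j + \theta_{p(ij)} + \gamma_{r(i)} + a_i + e_{ijk},$$

where $Y_{ijk}$ is the adherence score of patient $k$ in practice $i$ at time $j$; $\beta_j$ is the fixed effect of time $j$; $\theta_{p(ij)}$ is the fixed effect of the treatment phase of practice $i$ at time $j$ (baseline as reference, intensive, sustainability); $\gamma_{r(i)}$ is the fixed effect of the region of practice $i$; $a_i \sim N(0, \tau^2)$ is the random practice effect; and the residuals $e_{ijk}$ of the same patient are assumed to share a compound symmetric correlation structure over time. The symbols are our own labels, not taken from the trial protocol.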

We conducted an unadjusted analysis (model A) followed by two adjusted analyses: model B adjusted for patient characteristics (age, sex, median neighborhood income, number of cardiovascular-related diseases, number of cardiovascular-related risk factors, and rurality) [22, 23], while model C included three practice-level characteristics that have been shown to influence primary care practice adherence to guidelines (payment model, percentage of participating physicians who were female, and average number of years since graduation, as of 2010) [24, 25]. We carried out pairwise comparisons between mean adherence scores across the different treatment years (baseline, intensive year, sustainability year) at the Bonferroni-corrected 0.017 level to maintain the familywise error rate at 5 %. We calculated adjusted least squares mean adherence scores in each phase by setting continuous covariates equal to their mean values and using observed marginal distributions for categorical variables. Thus, least squares mean differences represent the estimated effect of the intervention on the quality of care for an “average” patient in an “average” practice. All analyses were conducted using SAS version 9.3 (SAS Institute Inc.).
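
As a loose illustration of model B's structure, the sketch below uses Python's statsmodels; the column names and input file are hypothetical, and it fits only the practice-level random intercept, omitting the patient-level compound symmetric residual structure used in the SAS analysis.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per patient-year; column names are illustrative assumptions.
df = pd.read_csv("idocc_analysis_file.csv")  # hypothetical file

# Model B (sketch): fixed effects for treatment phase (baseline as the
# reference level), time, and region, plus patient covariates, with a
# random intercept for each practice.
model = smf.mixedlm(
    "adherence ~ C(phase, Treatment('baseline')) + C(year) + C(region)"
    " + age + C(sex) + income + n_diseases + n_risk_factors + C(rurality)",
    data=df,
    groups="practice_id",  # random intercept per practice
)
result = model.fit(reml=True)  # restricted maximum likelihood, as in the trial
print(result.summary())
```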

Results

We approached all 533 primary care practices in the Champlain region to participate in the IDOCC program. Ninety-nine were ineligible because the practice was no longer in operation, the clinic was exclusively a walk-in practice, or the physician running the practice was planning to retire within the 2-year intervention time frame. Of the 434 eligible practices, 93 (21 %), comprising 194 physicians, agreed to participate. Nine practices withdrew from the study prior to the initiation of the facilitation intervention for various reasons, including not having the office space to accommodate a chart abstractor and an insufficient number of eligible high-risk patient charts, leaving 84 participating practices (182 physicians): 27, 30, and 27 in steps I, II, and III, respectively. The withdrawal of these nine practices resulted in a loss of only 12 physicians, suggesting that mainly small or single-physician practices chose to withdraw. The first facilitation visit was on April 14, 2008, and the final site visit was on March 27, 2012. The trial concluded as scheduled at the end of the intervention phase for step III. No practices withdrew after the start of the intervention.

Baseline characteristics of participating practices, providers, and patients are presented in Table 1. The majority of participating practices were fee-for-service (53.5 %) and located in urban areas (82.1 %).

Table 1 Breakdown of practice-, provider-, and patient-level characteristics at baseline by step

We collected baseline data from 5292 patient medical charts, of which 627 (11.8 %) were lost to follow-up, primarily due to patients dying or moving within the study period. Chart abstraction data quality assessments via re-abstractions demonstrated an overall inter-rater reliability kappa value of 0.91 and a percent agreement of 93.9 %.

No practices received the intended intensity of facilitator visits; only 32 practices had eight or more visits over the 2 years. On average, facilitators had 6.6 face-to-face visits with practices in year 1 (range: 2–11) and 2.5 visits in year 2 (range: 0–10). In addition to face-to-face visits, facilitators communicated with practices an average of 1.7 times by phone or email in year 1 and 1.6 times in year 2.

Figure 2 presents the observed mean adherence scores for each study step during the baseline, intensive, and sustainability years. There is little change in the mean adherence score pre- and post-intervention for all practices in all three steps. Additionally, the three baseline time points for step III appear to show a modest upward secular trend in mean adherence, highlighting the importance of controlling for the time trend in the subsequent regression analysis. Figure 3 presents the estimated effect of the IDOCC intervention as the adjusted mean adherence score averaged over all practices in the control, intensive, and sustainability conditions.

Fig. 2 Observed mean adherence score (%) over time in practices allocated to study steps, indicating year of implementation of the intervention.

Fig. 3 Estimated effect of the IDOCC intervention presented as adjusted mean adherence score (%) averaged over all practices in control, intensive, and sustainability conditions. The model assumes a single underlying secular trend across steps and estimates a shift in level as a result of the intervention. Solid lines represent time intervals with observed data in that condition; dashed lines represent time intervals with no observed data in that condition.

Using data from all three steps, the unadjusted (model A) least squares mean differences showed an absolute decrease from baseline of 2.1 % in the intensive phase (95 % CI: −3.1 to −1.1 %) and a decrease of 4.7 % in the sustainability phase (95 % CI: −6.2 to −3.2 %) (see Table 2). After adjustment for patient characteristics (model B), there was an absolute decrease in mean adherence from baseline of 1.9 % in the intensive phase (95 % CI: −2.9 to −0.9 %) and of 4.2 % in the sustainability phase (95 % CI: −5.7 to −2.6 %). Adjusting for provider factors (model C) had little impact on these estimates.

Table 2 Results from multivariable mixed-effect regression analyses of the effect of the IDOCC intervention

Discussion

In our pragmatic trial of practice facilitation, the intervention did not improve adherence to evidence-based guidelines for cardiovascular disease in primary care practices. These results contrast with findings from previously reported facilitation trials [5, 26–28] and highlight the complexities and challenges of scaling up research studies into sustainable programs. Suboptimal intensity of the intervention, a broad focus on multiple chronic conditions, and measurement challenges are all factors that may have contributed to the null results.

A meta-analysis of facilitation programs demonstrated that intensity, as measured by the number of facilitation visits, was associated with greater effect size [5]. The IDOCC intervention was designed with the intent of having the facilitators meet face-to-face with practices 13 times (i.e., one visit every 4 weeks) in year 1 (intensive phase) and 4–6 times in year 2 (sustainability phase) [29, 30]. We were unable to attain this “dose” with our practices, who, despite the best intentions of engaging and meeting with the facilitators, were unable to find the time for this frequency of meetings. Other facilitation studies have faced similar issues. For instance, facilitators in a group-randomized practice facilitation trial conducted in South Texas initially intended to make monthly facilitation visits over 12 months but, due to competing demands within participating practices (e.g., EMR implementation, staff turnover), were only able to make six to seven visits during the 1-year study period [31]. The difficulty of achieving the intensity of visits necessary to facilitate change suggests that practice facilitation may be less effective in complex, real-world settings than in controlled research studies. In our program, no incentives were provided apart from the opportunity to claim continuing professional development credits. Supporting practice change may require incentives to compensate providers for their time and/or policies designed to encourage a culture of quality improvement, such as mandatory reporting of quality metrics and accreditation of primary care practices. Financial incentives, for example, have been identified as having a modest positive impact on primary care processes associated with quality management of chronic disease [32].

Improving the quality of care delivery for people with multimorbidity is an area that requires more research. Although our study was grounded in the chronic care model, with targeted practice-level improvement activities across several dimensions of the model, it failed to improve care delivery. Previous facilitation studies primarily focused on single diseases such as diabetes and asthma [5, 33, 34], whereas our intervention attempted to improve adherence to guidelines for patients with a broad range of cardiovascular-related conditions (coronary artery disease, peripheral vascular disease, stroke/TIA, diabetes, and chronic kidney disease) and risk factors (hypertension, dyslipidemia, weight, and smoking). Implementing guidelines for people with multiple chronic conditions is complex. While physicians may be able to successfully focus on single diseases such as diabetes or asthma, improving adherence across multiple conditions is more challenging. A review of interventions for managing patients with multimorbidity found that programs targeting specific areas of concern for the patient (e.g., functional difficulties) were more effective than broader disease-oriented programs [35]. Likewise, recent studies (including ours) examining practice facilitation programs targeting multiple diseases have found only modest or nonsignificant improvements in patient outcomes [31, 33].

Our analyses identified statistically significant decreases in the mean adherence score; however, these changes likely reflect our study being overpowered rather than an impact of the program on clinical behavior. The decreases we detected were smaller than our pre-specified minimal clinically important difference and hence unlikely to have any clinical significance. We chose to measure adherence to all indicators using a mean adherence score because we felt it would reflect the reality of managing multiple chronic conditions simultaneously. We hypothesized that practice facilitation’s focus on practice-level transformation would lead to changes that would positively influence multiple processes of care and could be captured with a mean score. However, we were unable to demonstrate this. Measuring impact across multiple areas is challenging. Our adherence score was an unweighted mean in which all indicators were considered equally important, which makes the clinical importance of observed changes more difficult to interpret: a decrease of 2–4 % could represent deterioration in only one or two individual indicators, and if practices improved in some areas while deteriorating in others, these offsetting changes would not be detectable in the mean score. For these reasons, we are planning future analyses of the individual indicators of adherence.

Strengths and limitations

Our study had several strengths, including our population approach and implementation across a large geographic area. We had a large sample size that included a diverse range of practices. Our stepped wedge design addressed the need of our funders to eventually offer this practice improvement program to all participants. Our low rate of attrition speaks to the feasibility of establishing ongoing relationships with practices despite competing demands. Although practices that agreed to participate may not be representative of all practices across Eastern Ontario, they represent the kinds of practices expected to participate in quality improvement initiatives in a real-world setting. Moreover, in this randomized controlled trial, internal validity was of primary concern.

Limitations include the use of an index of adherence, which provides an overall picture of change in adherence patterns but may mask individual changes that occurred during the intervention. We had also considered assigning specific weights to each indicator to reflect its relative importance to patient health outcomes; unfortunately, there is no empirical evidence for assigning weighting factors to the indicators in this study. Although weighting is theoretically useful, many authors have acknowledged the difficulty of determining meaningful weights, and as such, few researchers use this approach [36]. In addition, due to practical and logistical constraints associated with implementing the intervention across a large geographic region, we were unable to randomize individual practices. As a result of randomizing at a regional level, there were some imbalances in participant characteristics across the three steps, as well as differences in the secular trend across steps. To address these limitations, we adjusted for observed differences in the analysis and examined models that included interactions with each step.

Conclusion

Our findings highlight the complexities of translating research evidence into effective, sustainable programs. Whether practices can commit the time required without policy support or incentives is unknown. Further research is needed to better understand what impact practice facilitation can have on eliciting change at this level once implemented more broadly.