The Effect of Intensive Implementation Support on Fidelity for Four Evidence-Based Psychosis Treatments: A Cluster Randomized Trial

Purpose Service providers need effective strategies to implement evidence-based practices (EBPs) with high fidelity. This study aimed to evaluate an intensive implementation support strategy to increase fidelity to EBP standards in treatment of patients with psychosis. Methods The study used a cluster randomized design with pairwise assignment of practices within each of 39 Norwegian mental health clinics. Each site chose two of four practices for implementation: physical health care, antipsychotic medication management, family psychoeducation, illness management and recovery. One practice was assigned to the experimental condition (toolkits, clinical training, implementation facilitation, data-based feedback) and the other to the control condition (manual only). The outcome measure was fidelity to the EBP, measured at baseline and after 6, 12, and 18 months, analyzed using linear mixed models and effect sizes. Results The increase in fidelity scores (within a range 1–5) from baseline to 18 months was significantly greater for experimental sites than for control sites for the combined four practices, with a mean difference in change of 0.86 (95% CI 0.21 to 1.50; p = 0.009). Effect sizes for increase in group difference of mean fidelity scores were 2.24 for illness management and recovery, 0.68 for physical health care, 0.71 for antipsychotic medication management, and 0.27 for family psychoeducation. Most improvements occurred during the first 12 months. Conclusions Intensive implementation strategies (toolkits, clinical training, implementation facilitation, data-based feedback) over 12 months can facilitate the implementation of EBPs for psychosis treatment. The approach may be more effective for some practices than for others. Supplementary Information The online version contains supplementary material available at 10.1007/s10488-021-01136-4.


Introduction
Evidence-based practices (EBPs) can improve treatment outcomes for patients with psychosis. However, services must adhere to EBP model principles, and such adherence is rare in daily clinical work (Bighelli et al., 2016; Weinmann et al., 2007). Researchers and policy experts have therefore proposed using fidelity scales to assess whether a practice is implemented according to the core principles and procedures defining the EBP. Although the ultimate aim of EBPs is to improve patients' health and quality of life, fidelity is a measurable, intermediate outcome of the implementation of EBPs (Proctor et al., 2011). Fidelity scales can guide implementation and assess quality, though few studies have measured fidelity for multiple EBPs over several points in time (McHugo et al., 2007).
Routine mental health service providers typically implement EBPs with variable quality because they lack implementation supports. Clinical researchers have therefore developed theories, models, and frameworks for implementation strategies (Damschroder et al., 2009; Nilsen, 2015; Proctor et al., 2009), including strategies for evidence-based psychosocial interventions for people with severe mental illness (Menear & Briand, 2014). Strategies generally entail engaging managers and clinicians, helping practitioners to understand the need for change, providing toolkits with a practice manual, conducting workshops to build enthusiasm and train practitioners, and offering longitudinal supervision and small group discussions based on feedback from fidelity assessments and other measurements. Experts recommend that implementation supports should be reasonably intensive, sensitive to context-specific conditions, and adjusted to the implementation phase (Menear & Briand, 2014). The Expert Recommendations for Implementing Change (ERIC) compilation lists 73 implementation strategies with definitions (Powell et al., 2015), but many of these strategies are rarely used (Perry et al., 2019). The US National Evidence-Based Practices Project, using a comprehensive but small set of implementation strategies, achieved a large increase in mean fidelity for five EBPs for severe mental illness across 53 sites (McHugo et al., 2007). Implementation strategies should reflect the aims and needs of the specific project, and strategies should be reported in sufficient detail to facilitate replication (Kirchner et al., 2020; Proctor et al., 2013). Research on specific implementation strategies in general health care is becoming common, but mental health services, including services providing EBPs for patients with psychosis, need such studies as well (Powell et al., 2019).
Implementation of EBPs in mental health services is needed to address the devastating impact of behavioral health disorders in the global community, and specific implementation strategies are needed to achieve this (Dixon & Patel, 2020).

Aims
The aim of the current cluster randomized trial was to evaluate the effectiveness of intensive support to implement EBPs for the treatment of patients with psychosis in routine public mental health services. We hypothesized that experimental sites receiving intensive implementation support would achieve higher fidelity than control sites receiving usual support.

Study Design and Sites
We used a cluster randomized trial to examine the effect of 18 months of intensive implementation support to mental health clinical units implementing EBPs for treatment of people with psychosis (ClinicalTrials.gov NCT03271242, registered 5 September 2017 after recruitment of the clinical units, but before completion of data collection and data analysis). Each clinical unit chose two of four core EBPs for implementation. Based on a pairwise randomization design, each site implemented one practice assigned to the experimental condition and the other practice assigned to the control condition.
Mental health clinics in six of the 19 Norwegian health trusts, serving 38% of the country's population in urban and rural areas, participated in the study. The primary unit of analysis was 39 clinical sites providing services to adults or adolescents with psychosis (26 community mental health centers with outpatient clinics, mobile teams, and local inpatient wards; ten inpatient departments for adults with psychosis; three departments for adolescents).
The manager of each clinical unit signed a written consent to participate in the study, including consent to randomization. The Regional Committee for Medical and Health Research Ethics in Southeastern Norway (Reg. No. REK 2015/2169) and the data protection officer for each health trust approved the study, which followed the principles in the Declaration of Helsinki.

Power Analysis
In the US National Evidence-Based Practice Project, the mean EBP fidelity increased from 2.28 (SD 0.95) at baseline to 3.76 (SD 0.78) at 12 months (personal communication from Gary Bond, Dartmouth Psychiatric Research Center, 2014). We assumed a similar mean increase in fidelity over 18 months for the experimental practices and no increase for control practices. Based on a two-tailed significance level of 5% and 90% power, we estimated that the overall hypothesis would be adequately powered with a minimum of eight sites in each arm for each practice. With 39 units as experimental sites for one practice and control sites for another, the study had sufficient power for analyses of differences for all practices combined and potentially adequate power for each of the four individual practices, assuming the average number of sites per arm for each practice was eight or nine.
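The figure of eight sites per arm can be approximately reproduced with a standard two-sample calculation. The sketch below is a rough check and not the authors' actual procedure, which the text does not specify: it assumes a normal approximation, a control-arm mean that stays at the baseline value, and a common standard deviation obtained by pooling the two reported SDs. The function name `sites_per_arm` is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sites_per_arm(delta, sd, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm for a two-sided, two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2


# Values reported in the text: fidelity 2.28 (SD 0.95) rising to 3.76 (SD 0.78).
delta = 3.76 - 2.28                       # assumed difference between arms
sd = sqrt((0.95 ** 2 + 0.78 ** 2) / 2)    # assumption: pool the two reported SDs

n = sites_per_arm(delta, sd)
print(f"required sites per arm: {ceil(n)} (unrounded {n:.2f})")
```

Under these assumptions the unrounded value falls between seven and eight, which rounds up to the eight sites per arm stated above. A t-distribution correction for such small samples would push the requirement slightly higher.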

Evidence-Based Practices for Implementation
The research group selected five EBPs for patients with psychosis that met several criteria: treatment with strong evidence and/or importance in the Norwegian national guidelines on treatment for people with psychosis (Helsedirektoratet, 2013), relevance for most patients with psychosis, and already partly established or with available training programs. In May 2015, in preparation for the current study, we conducted a survey among the clinical units in the participating health trusts on their preferences regarding each of these five practices. Four of the practices were preferred by the majority of the 26 responding units. Two were medical practices (physical health care, antipsychotic medication management) that all units were already providing without measurement of quality, and two were psychosocial practices (family psychoeducation, illness management and recovery) that were new to almost all units. Thus, the four practices were previously unavailable or not implemented to evidence-based standards. We eliminated the fifth practice (individual placement and support) from the study design because it was preferred by a minority of the clinical units. Table 1 shows a brief description or components of each of the four practices. Previous papers described the four practices in greater detail (Egeland et al., 2020; Joa et al., 2020; Ruud, 2020a, b).

Randomization
We assumed that choice would enhance motivation, following advice from the Medical Research Council in the UK for local adoption of complex interventions (Craig et al., 2008). In March 2016 all 39 clinical units received a detailed description of each of the four practices and chose the two practices they wanted to implement, accepting that the unit would be randomized to be an experimental site for one practice and a control site for the other. As shown at the top of Fig. 1, 26 units chose physical health care, 17 chose antipsychotic medication management, 14 chose family psychoeducation, and 21 chose illness management and recovery. For each clinical unit, we randomly assigned one of the chosen practices to the experimental condition (intensive implementation support) and the other to the control condition (minimal support). Thus, each clinical unit became an experimental site for one practice and a control site for the other practice. Stratified randomization achieved a balance between arms for each of the six possible pairs of two practices. Figure 1 shows a flow diagram of the randomization. Two research methodologists, blind to the identity of the 39 clinical units, conducted the randomization in April 2016. The four EBPs formed six pairs of EBPs (six different combinations of four EBPs chosen pairwise). We grouped all sites within each EBP pair and randomized them as a block to balance the number of sites assigned to each condition across blocks. We offered all sites the implementation support as planned and completed fidelity scores for all sites at four time points. We did not attempt to blind fidelity assessments.
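The pairwise block randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure the methodologists actually used: the function name `randomize_within_pairs`, the seed, the site identifiers, and the practice abbreviations (PHC, AMM, FPE, IMR) are all hypothetical, and balance within a block is obtained here by alternating the two possible assignments over a shuffled site order.

```python
import random
from collections import defaultdict


def randomize_within_pairs(site_choices, seed=2016):
    """Decide, for each site, which of its two chosen practices is experimental.

    Sites that chose the same pair of practices form one block. Within a
    block, the two possible assignments alternate over a shuffled site
    order, so the arms differ by at most one site per block.
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for site, pair in site_choices.items():
        blocks[tuple(sorted(pair))].append(site)

    assignment = {}
    for (first, second), sites in sorted(blocks.items()):
        rng.shuffle(sites)
        offset = rng.randrange(2)  # which assignment gets the extra site in odd blocks
        for position, site in enumerate(sites):
            if (position + offset) % 2 == 0:
                assignment[site] = {"experimental": first, "control": second}
            else:
                assignment[site] = {"experimental": second, "control": first}
    return assignment


# Hypothetical sites and choices, for illustration only.
example_choices = {
    "site_a": ("IMR", "PHC"),
    "site_b": ("PHC", "IMR"),
    "site_c": ("IMR", "PHC"),
    "site_d": ("FPE", "AMM"),
    "site_e": ("AMM", "FPE"),
}
example_assignment = randomize_within_pairs(example_choices)
for site, arms in sorted(example_assignment.items()):
    print(site, arms)
```

Every site ends up as an experimental site for one of its chosen practices and a control site for the other, and within each pair of practices the two arms stay as balanced as the block size allows.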

Intervention
As shown in Table 1, the intensive implementation support included four components: a toolkit for the practice, training for clinicians in the practice, implementation facilitation, and feedback from the fidelity assessments and from a questionnaire to clinicians on their experiences of the implementation process (Hartveit et al., 2019). The intervention period covered 18 months, from 1 September 2016 to 28 February 2018.
We distributed the printed toolkit at the start of the study to experimental sites. Experimental and control sites could access the toolkit on a website. The clinical training occurred during the first weeks of the intervention period. On average, nine to ten managers and clinicians from each site participated in the clinical workshops for their experimental practices. The average was four for family psychoeducation because a smaller number of clinicians provided the intervention. For the two psychosocial practices, trainers provided telephone supervision for 12 months after the clinical training.
Implementation facilitators visited each site every other week for 6 months and then monthly for 12 months. Each health trust recruited one to four part-time implementation facilitators to give implementation support to their participating clinical units. The facilitators were mostly mental health nurses with clinical experience working with patients with psychosis and experience with quality improvement, but they were not experts in any of the four EBPs. In two workshops preceding the start of the intervention period, an implementation expert trained the facilitators in implementation facilitation. During the 18 months of implementation, after an initial phase with lectures and exercises, the facilitators met with the implementation expert every 6-8 weeks for further training, discussion, and networking. The implementation facilitation followed the Consolidated Framework for Implementation Research, focusing on elements and stages in the implementation process, as described in Table 1 (Damschroder et al., 2009; Grol et al., 2013; Rafferty et al., 2012). The implementation facilitators' role was to help the sites use quality improvement procedures in the implementation of the EBP, as had been done in a large Dutch project on implementation of six EBPs for treatment of patients with psychosis (Harvey & Lynch, 2017; Van Duin et al., 2013).
Site leaders received feedback every 6 months for the experimental practice on fidelity and from an online questionnaire to clinicians on their experiences of the implementation process (Implementation Process Assessment Tool-IPAT) (Hartveit et al., 2019). The site leaders received no feedback for the control practice.

Outcome Measures
Table 1
… use advisory workgroups, use an implementation advisor): Facilitation of the implementation process and quality improvement strategies were offered by implementation facilitators as meetings on site every other week for six months and then monthly for 12 months. The facilitation model built on teaching and encouraging managers and clinicians to organize the implementation process, identify and overcome implementation barriers, plan and monitor phase-specific activities using Deming's circle and flow charts, collect data for feedback and monitoring, recognize contextual factors, tailor the implementation process, and build systems to sustain the implementation.
Feedback at baseline and after 6, 12, and 18 months (ERIC: audit and provide feedback): A written report with fidelity scores and comments for the experimental practice was sent to the site manager within a few weeks after each 6-month fidelity assessment. Scores were discussed with the site manager to correct any misunderstandings. Feedback on the results from an online questionnaire (IPAT) to clinicians on their experiences of the implementation process was sent to the site manager every 6 months for the experimental practice if five or more of the clinicians chosen by the manager had completed the questionnaire (Hartveit et al., 2019). The feedback contained diagrams of the answers to each question and comments to help the manager understand the staff's experience and how the manager could support the implementation process in the site.
Component available for the control sites
Written description of the practice: A written description of all four practices (one part of the toolkits) was sent to all clinical units as information before they chose which two practices they would implement.
The primary and only outcome measure was EBP fidelity, measured using fidelity scales for each of the four practices. Other researchers developed the Family Psychoeducation Fidelity Scale and the Illness Management and Recovery Fidelity Scale, and we reported psychometric properties for the scales elsewhere (Egeland et al., 2020; Joa et al., 2020). The current study investigators developed the Physical Health Care Fidelity Scale and the Antipsychotic Medication Management Fidelity Scale, reporting descriptions of the scales and their psychometric properties in earlier papers (Ruud, 2020a, b). The psychometrics of the four fidelity scales were good to excellent. All four fidelity scales followed the same format and scoring. Each scale comprised multiple items, each rated on a 5-point behaviourally anchored continuum: a rating of 5 indicated full adherence to practice guidelines, a rating of 1 represented substantial lack of model adherence, and ratings of 4, 3, and 2 represented gradations between these two extremes. We calculated total scale scores as the unweighted sum of item scores, divided by the number of items. By convention, a score of 4.0 or higher is considered adequate fidelity (McHugo et al., 2007).
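The scoring rule can be illustrated with a short sketch. It assumes that the total score is the unweighted mean of the item ratings, which keeps it on the 1 to 5 range used throughout the paper, and it applies the 4.0 threshold for adequate fidelity. The function names and the example ratings are hypothetical and do not come from any of the four scales.

```python
ADEQUATE_FIDELITY = 4.0  # conventional threshold cited in the text


def total_fidelity(item_ratings):
    """Unweighted mean of item ratings, each an integer from 1 to 5."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    if any(rating not in (1, 2, 3, 4, 5) for rating in item_ratings):
        raise ValueError("item ratings must be integers from 1 to 5")
    return sum(item_ratings) / len(item_ratings)


def is_adequate(score):
    """True when a total score meets the adequate-fidelity threshold."""
    return score >= ADEQUATE_FIDELITY


# Hypothetical ratings for one site on a six-item scale.
ratings = [5, 4, 4, 3, 5, 4]
score = total_fidelity(ratings)
print(f"total fidelity {score:.2f}, adequate: {is_adequate(score)}")
```

Because the total is a mean, scales with different numbers of items yield totals on the same 1 to 5 range.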

Procedures
Baseline fidelity assessment occurred in May-June 2016 after the randomization and before the start of the implementation intervention in September 2016. Subsequent fidelity assessments occurred at 6, 12, and 18 months, during March-April 2017, September-October 2017, and March-April 2018. Two trained assessors rated fidelity for the two practices being implemented in each clinical unit. Fidelity assessors conducted site visits in person, rated fidelity independently, and resolved discrepancies by consensus. The fidelity visits for family psychoeducation and illness management and recovery included interviews with managers and clinicians and inspection of written material. Fidelity visits for physical health care and antipsychotic medication management included interviews with managers and clinicians and inspection of written material, using subscales to rate documentation found in 10 randomly selected patient records.

Analyses
We described fidelity scores reporting means, confidence intervals, and distributions across all sites at baseline (before the start of the intervention) and at 18 months. We estimated linear mixed models to analyse the overall difference between experimental and control group fidelity over time. The models included fixed effects for time, modelled as second-order polynomial to account for possible non-linear effects, group, and the interaction between the two. Models included random intercepts for units as well as random slopes for time. We used an unstructured covariance at the unit level and AR(1)-type of covariance for within-unit correlations in time. A significant interaction term implied significant differences between the groups in overall trend. Post hoc analyses assessed within-group changes between two time points and between-group differences in changes. We analysed all practices together and each of the four practices separately. We conducted residual diagnostics by assessing the residuals graphically.
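One way to write the model described above, for unit $i$, practice $k$ within that unit, and assessment occasion $j$, is sketched below. The notation is an assumption introduced here for illustration: $g_{ik}$ is 1 for the experimental practice and 0 for the control practice, $t_j$ is time since baseline, $G$ is the unstructured covariance matrix of the unit-level random effects, and $\rho$ is the AR(1) correlation. The exact parameterization used in the SAS analysis may differ.

```latex
\begin{aligned}
y_{ikj} &= \beta_0 + \beta_1 t_j + \beta_2 t_j^{2} + \beta_3 g_{ik}
          + \beta_4\, g_{ik} t_j + \beta_5\, g_{ik} t_j^{2}
          + b_{0i} + b_{1i} t_j + \varepsilon_{ikj}, \\
(b_{0i},\, b_{1i})^{\top} &\sim \mathcal{N}(0,\, G), \qquad
\operatorname{Cov}(\varepsilon_{ikj},\, \varepsilon_{ikj'}) = \sigma^{2} \rho^{\,|j-j'|}
\end{aligned}
```

In this form, $\beta_4$ and $\beta_5$ are the interaction terms between group and the linear and quadratic time effects. A difference in overall trend between experimental and control practices shows up as these terms being nonzero.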
We reported the results of the main analyses as regression coefficients (RC), standard errors (SE), and p-values and illustrated them graphically. We presented post hoc analyses as mean within-group changes and mean differences in change between the groups with the corresponding 95% confidence intervals (CI) and p-values, and effect sizes (Cohen's d) for the mean differences for all time intervals (Cohen, 1992). We used SPSS for Windows version 26 for descriptive analyses and SAS version 9.4 for linear mixed model analyses.

Results
Table 2 shows the mean (CI) fidelity and distribution of fidelity scores of the four practices across all sites at baseline and at 18 months. The fidelity scores across all practices at baseline were poor. Only two (3%) of the 78 practices (39 sites with two practices each) were already implemented with adequate fidelity (4.0 or above) at baseline. One was family psychoeducation (experimental site), and one was illness management and recovery (control site). At 18 months, 13 experimental sites (33%) had reached the adequate fidelity score of 4.0 or more, compared to only two control sites (5%). Ten (77%) of the 13 experimental sites that reached an adequate fidelity score were implementing illness management and recovery. Table 3 shows the main results of the linear mixed models assessing the difference in fidelity over time between experimental and control groups, adjusted for cluster effect on unit level. The two last rows in the table show the results for the interaction between time and groups. Large values of the intraclass correlation coefficient at the unit level reflected large variation among sites for all practices. Combining the four practices, the overall increase in fidelity scores over time was significantly greater for experimental sites than for control sites.
Illness management and recovery, physical health care, and antipsychotic medication management also showed significantly greater increase in fidelity over time, while family psychoeducation did not. The greatest increase was for illness management and recovery. Figure 2 displays the differences and shows that the significant changes occurred mostly during the first 12 months. Table 4 shows the post hoc analyses of the changes in mean fidelity for all time intervals for the experimental and control groups and for the difference in change between the two groups. For the combined four practices, the difference between experimental and control sites in mean increase in fidelity score (within a range 1–5) over 18 months was 0.86 (95% CI 0.21 to 1.50), p = 0.009, with a corresponding effect size of 0.89 (95% CI 0.43 to 1.35). For illness management and recovery, the difference was 2.88 (1.89 to 3.87), p < 0.001, with a corresponding effect size of 2.24 (1.05 to 3.44). For physical health care, the difference was 0.30 (−0.04 to 0.63), p = 0.080, with a corresponding effect size of 0.68 (−0.09 to 1.46). For antipsychotic medication management, the difference was 0.22 (−0.12 to 0.57), p = 0.209, with a corresponding effect size of 0.71 (−0.37 to 1.70). As Table 4 shows, the latter two medical practices had a significant difference in increase with medium to large effect sizes during the first 12 months. For family psychoeducation, we detected no significant changes over time and only small effect sizes. None of the practices showed a significant difference in change from 12 to 18 months. Figure 2 illustrates the changes reported in Table 4.
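The effect sizes above are reported as Cohen's d. As a minimal sketch of that kind of calculation, the snippet below computes d as the difference in mean change between two groups divided by their pooled standard deviation. This is one common definition and may not match how the published values were derived from the mixed-model estimates. The change scores are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical baseline-to-18-month change scores, not data from the study.
experimental_change = [1.2, 0.4, 2.1, 0.9, 1.6]
control_change = [0.3, 0.1, 0.8, -0.2, 0.5]

d = cohens_d(experimental_change, control_change)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's (1992) conventions, values around 0.2, 0.5, and 0.8 are read as small, medium, and large effects, respectively.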

Discussion
This study demonstrated that intensive implementation support can facilitate significantly higher fidelity than usual procedures, supporting the study hypothesis. The effect was large for one of the four practices, medium to large for two practices, and absent for one practice. The significant changes occurred mostly during the first 6-12 months of intervention, and only one third of the experimental sites reached an adequate fidelity score of 4.0 after 18 months.
The parsimonious interpretation of our results is that intensive implementation supports can improve the fidelity of EBPs for patients with psychosis. However, the effects may vary for specific EBPs, which we consider below, and which has also been found in other studies of implementation.
Table 3 Results of linear mixed model assessing the difference in fidelity scores between intervention and control groups in time trend (control group is the reference group)
Although many studies have demonstrated increased fidelity over time for a variety of EBPs, few randomized trials have evaluated the effectiveness of a defined package of intensive implementation strategies to achieve this goal. The US National Evidence Based Practice Project previously found a strong increase in fidelity over time for five EBPs, including 55% of the sites reaching an adequate fidelity score after 24 months, but the US study lacked a control group for comparison (McHugo et al., 2007). A recent cluster randomized study on implementation support for integrated treatment of concurrent mental health and substance use disorders found a moderate effect for experimental sites compared to control sites on a waiting list (Assefa et al., 2019). A recent trial comparing the effect of three levels (combinations) of implementation support for cardiovascular treatment over 12 months in community clinics found no significant differences in effect among the three levels of implementation support, but some differences compared with non-study control clinics (Gold et al., 2019).
The current study showed marked differences in fidelity improvements among the four practices. Illness management and recovery showed a much larger effect of the implementation support than the other practices. Several factors may have contributed to this. The intervention is straightforward, primarily using a psychoeducational model. The baseline fidelity scores were low because sites were not previously using the model. The toolkit included a detailed manual, telephone supervision was given for 12 months, and many sites wanted to learn and use the practice. The large effect for the combined practices was to a large extent due to the effect for illness management and recovery.
The implementation supports for physical health care and antipsychotic medication management showed significant medium to large effects. These two interventions are complex, requiring considerable clinical judgment and shared decision-making, and both had higher baseline fidelity scores than the psychosocial practices because the medical practitioners were already providing these services. In addition, fidelity assessments using patient records may have made it more difficult to achieve high fidelity scores due to lack of documentation rather than lack of implementation. Nevertheless, these two practices still achieved significant effects over time. We have not found a comparable study on the effect of implementation support on fidelity to an evidence-based model of physical health care. Our medium effect of implementation support on antipsychotic medication management fidelity was similar to what was found in a study using another fidelity scale for medication management in the treatment of schizophrenia (Howard et al., 2009).
The implementation support for family psychoeducation showed no significant changes and only small effect sizes. The weak result may have occurred because of serious confounds: one of the seven experimental sites was already implementing the practice at baseline, two experimental sites decided not to implement the practice, and the total number of sites was small. Small numbers and poor compliance may have undermined the experiment for this practice.
The current study had several strengths: it was one of few randomized controlled trials assessing an intensive implementation support strategy for implementing EBPs for the treatment of patients with psychosis. In addition, it used random assignment to a clearly defined implementation approach supported by an extensive literature review, a representative sample of routine public mental health service units with limited additional resources, the inclusion of four core EBPs, implementation support over 18 months, and extensive efforts to measure fidelity with well validated scales.
Fig. 2 Changes and differences in fidelity scores between experimental sites and control sites from baseline to 18 months: mean, 95% CI and significance of difference at each time point (*p < 0.05, **p < 0.01)
Several limitations also warrant attention. The small sample lacked power to detect differences between groups for some practices, the EBPs may have differed in difficulty of implementation, and the fidelity scales may have been non-comparable (Egeland et al., 2020; Joa et al., 2020; Ruud, 2020a, b). In addition, two sites chose practices to implement that they were already implementing at adequate fidelity at baseline, precluding the possibility of significant improvement. Further, the design with pairwise randomization within each clinical unit may have resulted in treatment contamination within sites and influenced the implementation of the control practice. Finally, generalization from Norway, a high-income country with strong government support for mental health care, may be limited.

Conclusions
The study showed that intensive implementation support can improve the fidelity of EBPs in routine mental health services but with variability across practices. The effect was most apparent during the first 12 months. We recommend that future studies examine different components of implementation strategies.

Guidelines Followed
The study followed the CONSORT extension guidelines for cluster randomized trials, and the completed checklist for such studies is submitted together with the manuscript.