Over the last three decades, numerous EBPPs have been designed for teachers to support students in schools (Sanetti & Collier-Meek, 2019; Tanner-Smith et al., 2018). However, there is a significant implementation gap resulting in limited uptake and routine use of EBPPs (Lyon & Bruns, 2019), thus limiting their reach and overall impact. Staff development through didactic training is a core implementation strategy to address implementation gaps (Lyon et al., 2017; Owens et al., 2014). Pre-implementation didactic training, however, is often insufficient or suboptimal at promoting teachers’ EBPP adoption (i.e., initial uptake) and fidelity (i.e., adherence to core practices) in pursuit of achieving meaningful changes in youth outcomes (Joyce & Showers, 2002; Merle et al., 2022a, 2022b; Reinke et al., 2008). Only around 40% of teachers adopt a new practice with training alone (Sanetti et al., 2013), and among those who do adopt, fidelity deteriorates within 10 days of adoption (Reinke et al., 2008; Sanetti & Kratochwill, 2009). Many implementation efforts begin with the provision of training and do not incorporate the individual motivational factors linked to implementer behavior change (Beidas & Kendall, 2010). Because behavior change is necessary for successful implementation (Lyon et al., 2018), failing to address individual-level motivational and volitional factors may moderate implementation outcomes associated with training and follow-up consultation. Given this, our team developed an implementation strategy to complement EBPP training and consultation that targets individual-level mechanisms of behavior change.
In this pilot randomized controlled trial, we investigated the longitudinal effects of Beliefs and Attitudes for Successful Implementation in Schools for Teachers (BASIS-T) against an active comparison (AC) control condition on mechanisms of behavior change and implementation outcomes (i.e., adoption and fidelity) across one year within the context of training and consultation on a well-established universal EBPP.

Implementation Strategies to Support Teachers

In order to ensure that teachers successfully implement EBPPs, they need to be supported both prior to and during implementation (Merle et al., 2022a, 2022b). The discrete practices used to support teacher implementation, implementation strategies, represent the methods and techniques used to promote implementation outcomes (e.g., appropriateness, feasibility, adoption and fidelity; Powell et al., 2015). School-based implementation research has largely focused on pre-implementation training, and active-implementation coaching and consultation supports (Fallon et al., 2015; Noell et al., 2014; Solomon et al., 2012; Stormont et al., 2015). Traditional training and consultation models are cornerstone strategies designed to support implementation (Beidas et al., 2012), yet many implementers fail to successfully deliver an EBPP after receiving training and consultation, even when these supports are high quality (Nadeem et al., 2018). In fact, many practitioners are ambivalent to change or lack the self-efficacy to overcome perceived barriers to implementation, which influences adoption decisions (Dart et al., 2012). Moreover, early intervention adopters tend to stop using a new practice shortly after starting (e.g., within 4–6 weeks; Stirman et al., 2012).

In response to these implementation challenges, researchers have developed and tested the impact of specific consultative strategies, such as performance-based feedback (Collier-Meek et al., 2017), prompts/reminders (Fallon et al., 2018), and practice monitoring (Mouzakitis et al., 2015), on intervention fidelity. Sanetti and colleagues (2013) also utilized the Health Action Process Approach (HAPA) to develop an implementation planning strategy to boost implementation behaviors. However, these implementation strategies often assume that implementers are motivated to adopt and deliver the EBPP (Reinke et al., 2011). Additionally, follow-up implementation supports require additional resources, such as staff and time. Robust pre-implementation experiences may have the potential to yield better early adoption results, which could reduce the number of teachers needing intensive follow-up supports and thus reduce the resource burden for already resource-limited schools. Finally, although a variety of strategies have been developed and tested with teachers, a lack of strategy specificity in the literature has made it difficult to discern how and why these strategies achieve their effect (Merle et al., 2022a, 2022b).

Gaps in the Research

This study seeks to address multiple gaps in the literature. First, there are few pre-implementation strategies available for use in schools that are informed by behavior change theory. Therefore, there is a continued need for school-based implementation research that utilizes theory to develop implementation strategies (Larson et al., 2021; Lyon et al., 2019; Wolfenden et al., 2021). Because training alone results in only around 40% of participants implementing a new practice (Sanetti et al., 2013), there is a need to improve the techniques used to enhance training, such as ensuring that participants are motivated before training begins and have a plan to enact the practice before implementation begins. Increasing the yield of training will theoretically reduce the number of teachers needing intensive, ongoing support. There is also a limited understanding of precisely how implementation strategies achieve their effects, which has created a “black box” problem in implementation research (Lewis et al., 2018). Moreover, there are few studies that explicitly target the underlying mechanisms of behavior change within school-based implementation. This study seeks to build on previous work to uncover the behavior change mechanisms that are most important to support teacher EBPP implementation in schools.

Beliefs and Attitudes for Successful Implementation in Schools (BASIS)–for Teachers

BASIS was developed as an implementation strategy to complement the training and consultation provided by EBP purveyors. It is designed to target individual-level mechanisms of behavior change. BASIS is grounded in two well-established theories of adult behavior change—the Theory of Planned Behavior (TPB; Ajzen, 1985) and Health Action Process Approach (HAPA; Schwarzer, 2008, Schwarzer et al., 2011). Together, HAPA and TPB outline: (1) motivational factors (i.e., task self-efficacy, subjective norms, attitudes/outcome expectancies) that act as hypothesized determinants of behavioral intentions (defined as an individual’s commitment to exhibiting a particular set of behaviors; Ajzen & Manstead, 2007) and (2) volitional factors (Schwarzer et al., 2011) that involve mechanisms and supports that are essential to enable individuals to enact the behaviors they intend to perform (Hagger & Chatzisarantis, 2014).

As most pre-implementation didactic trainings for teachers occur in a large group, BASIS-T also uses a group-based format, which increases feasibility and reduces cost (Joyce & Showers, 2002). Theoretically, BASIS-T works by shifting teachers’ motivational mechanisms to implement by impacting attitudes, self-efficacy, and social norms during the pre-session; in the post-EBPP session, it is designed to increase volitional mechanisms and intentions to implement through group-based implementation planning (Ajzen, 1985; Schwarzer et al., 2011). Together, these sessions aim to promote early adoption and persistence towards reaching high fidelity. Moreover, to protect against deterioration of intervention fidelity, teachers receive a digitally delivered booster 15 days post-training. BASIS-T sits at the intersection between the preparation and initial implementation phases of the implementation process to promote greater numbers of teachers who initially adopt the EBPP in response to training and higher levels of fidelity among teachers in response to follow-up consultation (Aarons et al., 2011). BASIS was developed iteratively over time, originating from practice-based evidence, and it has been tested with different school-based professionals (Cook et al., 2015; Larson et al., 2021; Lyon et al., 2019). Detailed descriptions of BASIS-T content, structure, and specificity can be found in the Procedures section of this paper, Supplemental Files 1 and 3, and in Larson et al. (2021).

In the first pilot, we examined a preliminary version of BASIS using a pre-post, no control group design, with educators from 62 schools that bookended training in school-wide positive behavior intervention and supports (SW-PBIS). Pre-post-training surveys showed increased favorable attitudes toward EBPPs at post-intervention (Cohen’s d = 1.03; Cook et al., 2015). Attitudes, in turn, were associated with two measures of EBPP fidelity (Cohen’s d = 0.51; Cohen’s d = 0.67). Our team also developed a version of BASIS for school-based clinicians (BASIS-C), and the results of a small-scale randomized trial supported its feasibility, acceptability, appropriateness, and its preliminary impact (Lyon et al., 2019), encouraging support for the proposed trial.

Most recently, our research team evaluated the initial effects of BASIS-T on mechanisms of behavior change immediately following the post-training session for the current study (Larson et al., 2021). Results from pre-post-training revealed that, compared to the control group, the BASIS-T group had significantly higher self-efficacy (p < 0.001) and outcome expectancies (p < 0.05). Because this 2021 study only assessed the impact on mechanisms before implementation occurred, the present study builds on that work through a longitudinal evaluation of (a) whether BASIS-T maintains effects on mechanisms of behavior change over the course of a school year and (b) the impact of BASIS-T on teachers’ EBPP implementation outcomes (i.e., adoption and fidelity) and student behavioral outcomes in the context of purveyor-provided training and consultation.

Purpose of Study and Research Questions

The purpose of this study was to conduct a pilot randomized controlled trial using a hybrid type III approach (Landes et al., 2019) to examine the longitudinal effects (one school year) of BASIS-T over and above EBPP training and consultation on (a) theorized mechanisms of behavior change, (b) teacher adoption and fidelity, and (c) classroom-level behavioral outcomes. Specifically, BASIS-T was deployed as an adjunct to “gold standard” training and consultation in the GBG, a well-established universal prevention program with evidence from several randomized trials supporting its effects on short- and long-term child outcomes (e.g., Bowman-Perrott et al., 2016). The following research questions guided this study:

1. Relative to the AC group, do teachers in the BASIS-T group have higher scores post-training on the hypothesized proximal mechanisms of change: attitudes, perceived social norms, self-efficacy, and intentions to implement? Do these scores sustain over the course of one school year?
2. Relative to the AC group, are teachers in the BASIS-T condition more likely to adopt the EBPP? Do they implement the EBPP with greater fidelity and frequency?
3. Relative to the AC group, are teachers in the BASIS-T condition associated with more positive classroom-level behavioral outcomes as indicated by the modified Direct Behavioral Rating (DBR) scale?
4. For significant mechanisms, is there preliminary evidence that the mechanism is associated with the effects of BASIS-T and with classroom-level behavioral outcomes?

Method

This study employed a hybrid type III cluster randomized controlled trial. The hybrid type III approach places greater emphasis on examining the implementation outcomes associated with implementation strategies, while also observing the intervention outcomes secondarily (i.e., the SEB outcomes associated with receiving the GBG; Landes et al., 2019).

Participants

Teachers from nine schools from a school district in the Northern Midwest region of the United States were recruited to participate. The partnering district was interested in collaborating in this study to improve the delivery of classroom-based practices to prevent social, emotional and behavioral (SEB) problems interfering with learning. Teachers were recruited based on their interest in learning about evidence-based classroom practices to prevent and address student SEB problems that interfere with learning. The schools had an average enrollment of 422 students (range 323–520) and served a relatively racially (persons of color; M = 26%; min.: 44%, max.: 96%) and socioeconomically (free and reduced-priced lunch; M = 44%; min.: 14%, max.: 79%) diverse student population.

All nine elementary schools were actively implementing the universal level of school-wide positive behavior interventions and supports, although no data were available on how long it had been implemented or the extent to which it was being implemented with fidelity. Randomization to condition occurred at the school level to reduce contamination across participants and was stratified to balance school characteristics (see Procedures). Out of 129 eligible staff in the nine schools, 88 teachers consented to participate (see CONSORT diagram; Fig. 1). Of these, 82 (93.1%) teachers attended the EBPP training. Of those attending the training, 81 teachers completed both pre- and post-training surveys, for a 98.7% retention rate. The one teacher who did not complete the post-training survey was in the BASIS-T condition. Chi-square and t-test analyses uncovered no statistically significant differences between post-survey completers and non-completers on gender, race, grade(s) taught, or any of the outcome variables collected at baseline. Table 1 displays participant demographics for the complete sample and each condition.

Fig. 1 CONSORT diagram for study participation

Table 1 Participant demographics

The Good Behavior Game (GBG) Training and Consultation

The GBG is a universal EBPP that uses an interdependent group contingency in which all members of a group (i.e., team) have access to the same consequence, based on the behavior of the collective group (Barrish et al., 1969). The GBG encourages teacher use of social learning principles within a game-like context to reduce disruptive behavior and facilitate engagement. GBG has been evaluated for almost 50 years (e.g., Domitrovich et al., 2010) and endorsed as effective by numerous agencies (e.g., U.S. Center for Substance Abuse, NIDA, OJJDP), leading to its identification as a best practice. In the current study, all teachers participated in a standard, 1.5-day GBG training delivered by certified trainers, blinded to condition, after receiving BASIS-T or AC. GBG training included best practices for educational meetings: didactic content delivery, modeling, rehearsal activities, and performance-based feedback. Trainers also provided three follow-up consultation sessions across the academic year to embedded coaches in each school. The coaches served as a resource to teachers in their building who may need additional support to adopt and deliver GBG with fidelity.

Study Conditions

BASIS-T Condition Pre- and Post-Training Descriptions

Throughout BASIS-T, four components are embedded into a mixture of didactic information and interactive group-based activities, designed to target four theorized mechanisms of behavior change that lead to improved implementation outcomes (e.g., adoption and fidelity). See Supplemental File 1 for an in-depth description of BASIS-T theory of change, structure, components, and content as well as Larson et al. (2021). Supplemental File 3 displays the implementation strategy specifications in alignment with reporting guidelines (Proctor et al., 2013). The BASIS-T condition consisted of pre- and post-training sessions to bookend the GBG training. In this condition, teachers participated in a group-based, interactive, motivational session delivered by a member of the research team, which included a pre-training session prior to the GBG training (3 h). The pre-training experience involved a collection of group-based activities adhering to an elicit-provide-elicit structure, which included eliciting reflections from participants, providing information to stimulate participant thinking to set up reflection, and eliciting discussion to promote change talk among group participants. The facilitator also provided a space for educators to reflect on their professional values, discuss specific topics (e.g., addressing the access gap), share ideas, and problem-solve barriers to adopting new practices. The BASIS-T facilitator used group-based motivational interviewing (MI; Miller & Rollnick, 2012) techniques (i.e., open-ended questioning, reflective listening, ruler questions, pros/cons; Magill & Hallgren, 2019) to elicit change-talk and promote collective self-efficacy.

The post-training session occurred immediately after the final day of EBPP training (i.e., GBG in this study) and lasted around 60 min. During the tailored post-session, participants were first asked to respond to a single-item dichotomous measure capturing their intentions to implement the EBPP. Depending on their answer, teachers convened into two smaller groups. Those who indicated they had intentions to implement the GBG began implementation planning, and those who indicated they did not have intentions to implement or were ambivalent engaged in a further motivational experience that involved normalizing ambivalence via ruler questions and a decisional balance activity to explore pros and cons of changing or not by trying out small components of the EBPP.

Teachers also received a digitally delivered booster 15 days after the beginning of the school year (self-paced, e-course experience developed in Articulate 360) that provided a tailored experience based on (a) whether they had initiated EBPP delivery or not, (b) whether they had intentions to begin using GBG, and (c) whether they had no intention of using GBG.

The BASIS-T trainer (Author 2) was a doctoral level licensed school psychologist, trained in school-based MI (Frey et al., 2017), with extensive expertise and years of experience supporting educators to prevent and address student SEB problems. Moreover, the project also included a consultant who is a trainer with the Motivational Interviewing Network of Trainers (MINT; motivationalinterviewing.org) and has a history of conducting MI research (Hartzler et al., 2007).

Active Comparison Control Group

Teachers randomly assigned to the AC group received a 3-h pre-training session prior to GBG training, designed to control for dose and delivery of information. The AC facilitator defined, described, and advocated for EBPP implementation in schools and used an educational approach that emphasized didactic delivery of content with opportunities for teachers to reflect on the information that was shared. Teachers in the AC condition also participated in a 1-h post-training session, which involved reviewing and discussing the importance of EBP implementation and reviewing the definition and dimensions of fidelity. Finally, teachers received an e-booster 15 days after the start of the school year that was not tailored and revisited core concepts of the GBG. The trainer for the AC condition was carefully selected as a comparable match to the BASIS-T trainer in skill in delivery of training content, demographics (e.g., age, race, gender), qualifications (PhD level), and training style.

Procedures

Recruitment and Randomization

Institutional Review Board approval was obtained from the university human subjects committee and the school district research department. Recruitment procedures began through communications with district leadership regarding the needs within elementary schools and GBG. This led to conversations with elementary principals regarding the nature of the project and providing opportunities to ask questions. Interested principals met with their teaching staff to identify teachers who indicated an interest in receiving free training in an EBPP. Participating schools were randomly assigned to the BASIS-T condition or the AC condition after being paired via a Euclidean distance nearest neighbor matching analysis using variables related to enrollment size, percent of students receiving free and reduced priced meals, and percent of students of color. Each school had a single best matching school, with one exception: that school was grouped with its second and third closest matching schools, which were themselves a match, to form a three-way match with the smallest overall Euclidean distance. Within these pairs and the triple match, we randomly assigned schools to one of the two conditions. Post-assignment, there were no statistically significant differences on any of the matching variables, no differences on several other student variables (i.e., percent of English language learners, qualified for special education, homeless), and no differences on percent of teachers with an advanced degree. Teachers were contacted via email to obtain consent and to provide a link to the pre-training survey via Qualtrics. Online pre- and post-training surveys were collected from the AC condition and BASIS-T condition. Participants had one week prior to the pre-training session and one week after the post-training session to complete surveys. Eighty percent of participants completed post-surveys within 2 days of the post-training session.
Teachers received $140 for participating in training and $50 for each wave of data collection.
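The matching-then-randomizing logic described above can be illustrated with a minimal sketch. It assumes z-scored matching variables and a greedy closest-pair rule; the source does not specify the matching algorithm, and the three-way match used for the odd school is not reproduced here. The school labels and values are hypothetical and are not study data.

```python
import math
import random
from statistics import mean, pstdev

# Hypothetical school-level data (illustrative values, NOT from the study):
# (enrollment, % free/reduced-price meals, % students of color)
schools = {
    "A": (330, 20.0, 45.0),
    "B": (345, 22.0, 48.0),
    "C": (420, 45.0, 70.0),
    "D": (430, 47.0, 72.0),
    "E": (510, 75.0, 95.0),
    "F": (500, 78.0, 93.0),
}

def standardize(data):
    """Z-score each variable so no single scale dominates the distance."""
    cols = list(zip(*data.values()))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return {
        name: tuple((v - m) / s for v, m, s in zip(values, mus, sds))
        for name, values in data.items()
    }

def greedy_pairs(z):
    """Repeatedly pair the two unmatched schools with the smallest
    Euclidean distance (an assumed rule, for illustration only)."""
    unmatched = set(z)
    pairs = []
    while len(unmatched) >= 2:
        a, b = min(
            ((p, q) for p in unmatched for q in unmatched if p < q),
            key=lambda pq: math.dist(z[pq[0]], z[pq[1]]),
        )
        pairs.append((a, b))
        unmatched -= {a, b}
    return pairs

def assign(pairs, seed=0):
    """Within each matched pair, randomly send one school to each condition."""
    rng = random.Random(seed)
    assignment = {}
    for a, b in pairs:
        first, second = rng.sample([a, b], 2)
        assignment[first] = "BASIS-T"
        assignment[second] = "AC"
    return assignment

pairs = greedy_pairs(standardize(schools))
assignment = assign(pairs)
```

Standardizing first matters because enrollment is on a much larger numeric scale than the two percentage variables; without it, enrollment alone would drive the distances.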

Condition Delivery Procedures

Teachers in both conditions received GBG training and pre- and post-training experiences at separate locations on the same day. The GBG training was provided by a third-party purveyor of the GBG. Teachers were combined by condition across schools for the 2-day training. Teachers in the BASIS-T condition (n = 42) were exposed to the following sequence of training: Day 1: BASIS-T pre-training session, GBG training part 1; Day 2: GBG training part 2, BASIS-T post-training session. Teachers in the AC condition (n = 39) were exposed to the same sequence as the BASIS-T group, only the pre- and post-training sessions involved AC activities.

Measurement Procedures

Demographics were collected at pre-training. Full-scale mechanism data were collected via Research Electronic Data Capture (REDCap) at pre- and post-training (August, 2019). To capture the impact of BASIS-T over time, teachers also received REDCap surveys each month following post-training between September 2019 and February 2020, and one final administration in April, 2020. Data collection was interrupted due to COVID-related school closures in March, 2020. For each separate section throughout the survey, teachers received specific prompts that provided guidance for responding (e.g., “Please select the statement that most closely reflects your implementation of the good behavior game”; “Think back over this past month. On average, how many days during a school week did you play the actual game part of the good behavior game?”).

Measures

A detailed description of study measures, including reliabilities in the current sample, is provided in Supplemental File 2. We selected measures that allowed us to capture various mechanisms of behavior change that align with behavior change theory (i.e., attitudes toward EBP, ownership/role, outcome expectancies, self-efficacy, social norms, intentions to implement; Ajzen, 1985; Ajzen & Manstead, 2007; Schwarzer et al., 2011), implementation outcomes (i.e., adoption, degree of implementation, GBG fidelity, BASIS-T procedural reliability), and student behavioral outcomes. Regarding mechanisms, we selected existing measures with strong psychometric properties. For mechanisms where no measures existed, we constructed them based on guidelines for developing psychometrically sound measures of TPB constructs (Francis et al., 2004). The survey items for adoption, fidelity, and degree of implementation were developed based on prominent implementation outcome and fidelity literature as well as existing questionnaires (Carroll et al., 2007; Perepletchikova & Kazdin, 2005; Proctor et al., 2011; Sanetti & Kratochwill, 2009). Finally, student behavioral outcomes were measured via teacher-reported class-wide Direct Behavior Rating (DBR; Sims et al., 2021). To reduce participant response burden, a subset of 2–6 items for each mechanism scale was administered at the monthly surveys (Fricker et al., 2014). The items selected had the highest shared variance with all items in the scale based on their squared multiple correlations at pre- and post-training, and all reduced subscales correlated 0.89 or above with the full scales. The measurement reliabilities, correlations between full and partial item measures, and the timepoints at which each were captured are displayed in Table 2.
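For readers unfamiliar with the item-selection statistic, the squared multiple correlation (SMC) of an item is conventionally the proportion of its variance explained by the remaining items in the scale. Assuming the conventional definition was used here (the source does not give the formula), it can be computed from the inverse of the item correlation matrix R:

```latex
\mathrm{SMC}_j \;=\; R^2_{j \cdot \text{(other items)}} \;=\; 1 - \frac{1}{\left[\mathbf{R}^{-1}\right]_{jj}}
```

Under this definition, the items retained for the monthly surveys would be those with the largest values of SMC_j within each scale.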

Table 2 Measure reliabilities, correlations, and time of collection

Mechanisms

To measure attitudes toward evidence-based practice, we used eleven items from the school-adapted version of the Evidence-Based Practice Attitudes Scale (EBPAS; Cook et al., 2018). Items were endorsed on a 5-point Likert-type scale ranging from “Not at All” to “Very great Extent” (subscale α range = 0.75–0.91).

The outcome expectancies measure assessed the degree to which teachers believed evidence-based classroom management practices would result in positive outcomes (e.g., “Evidence-based classroom management practices offer significant potential to improve outcomes for students”). Outcome Expectancies were endorsed on a 7-point Likert-type scale ranging from “Completely Disagree” to “Completely Agree” (α = 0.89–0.94).

The ownership measure assessed teachers’ perceived ownership of, or role in, managing behavior and teaching appropriate behavior in the classroom (e.g., “Student behavior is the responsibility of parents—not teachers”). This construct was measured on a 7-point Likert-type scale ranging from “Completely Disagree” to “Completely Agree” (α = 0.71–0.82).

The modified subjective norms measure is an 8-item measure used to capture two types of EBPP implementation-related subjective norms: injunctive (what a social group would approve of) and descriptive (how a social group actually behaves). Injunctive norms are the perception of what ought to be or what the social group would approve of (e.g., “Others who I respect expect me to adopt and implement evidence-based practices that promote students’ SEB functioning”). Injunctive norm items (4 items, α = 0.71–0.90) are rated on a seven-point scale (-3 = “I should”, 0 = “Neutral”, 3 = “I Should Not”). Descriptive norms describe perceptions of how the social group actually does things (e.g., “Practitioners like me find the time to implement evidence-based practices”). Descriptive norm items (3 items, α = 0.69–0.84) were rated on a seven-point scale (-3 = “Strongly agree”, 0 = “Neutral”, 3 = “Strongly disagree”).

Two measures of perceived behavioral control (i.e., self-efficacy) were used. The General self-efficacy measure (α = 0.92) asked about teachers’ general confidence in using evidence-based practices (e.g., “When I try really hard, I am able to overcome barriers to implement evidence-based practices that help me manage classroom behavior problems”). The Task self-efficacy measure (α = 0.89–0.90) captured tasks associated with positive classroom behavior management generally (e.g., “I am confident I can routinely provide neutral redirection, correction, and/or performance-specific feedback to increase my students’ awareness of and engagement in classroom rules and expectations”). Items were rated on a 7-point Likert-type scale ranging from “Completely Disagree” to “Completely Agree.”

The Intentions to Use measure was used to capture teachers’ intentions to implement specific practices associated with positive classroom behavior management (e.g., “I have every intention of implementing the following practice: Positive reinforcement system is set up which allows students to access rewarding experiences when following classroom rules and expectations regarding behavior”). The scale includes 9 items (four general items, α = 0.98; five GBG-specific items, α = 0.74–0.88) and is rated on a 7-point Likert-type scale ranging from “Completely Disagree” to “Completely Agree”.

Implementation Outcomes

Adoption, fidelity, and degree of implementation were also measured via REDCap. We measured whether the teacher began using the GBG (adoption; 1 = yes, 0 = no), and whether the intervention components or practices were delivered as intended and with the correct dose and frequency (fidelity). Teachers who endorsed implementing the GBG completed a self-rated 14-item measure about the GBG practices they performed. Items were on a 3-point scale (1 = Done consistently and appropriately, 2 = Done inconsistently, 3 = Not done). Some items were more general to good classroom management (e.g., “Identify privileges, activities, and rewards to be earned”), and some items were more GBG-specific (e.g., “Announce and celebrate winning teams only”). An exploratory factor analysis using maximum likelihood extraction and rotating factor loadings using Varimax with Kaiser normalization identified two factors, with items loading in perfect alignment with those hypothesized: General fidelity (6 items) and GBG-specific fidelity (8 items). Therefore, we selected the GBG-specific fidelity items to analyze and report on for this manuscript. Teachers who did not implement at that timepoint had every GBG-specific item replaced with a 3 (“not done”).
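The scoring rule for non-implementers can be made concrete with a short sketch. The function names are hypothetical, and averaging the eight items into a single score is an assumption made for illustration; the source states only that non-implementers had every GBG-specific item replaced with a 3.

```python
from statistics import mean

NOT_DONE = 3      # scale anchor: 3 = "Not done" (1 = done consistently and appropriately)
N_GBG_ITEMS = 8   # number of GBG-specific fidelity items

def gbg_fidelity_items(adopted, ratings=None):
    """Return the 8 GBG-specific item ratings for one teacher at one timepoint.

    Teachers not implementing at that timepoint have every item set to 3
    ("Not done"), following the rule described in the text.
    """
    if not adopted:
        return [NOT_DONE] * N_GBG_ITEMS
    if ratings is None or len(ratings) != N_GBG_ITEMS:
        raise ValueError("implementers need exactly 8 item ratings")
    if any(r not in (1, 2, 3) for r in ratings):
        raise ValueError("ratings must be 1, 2, or 3")
    return list(ratings)

def mean_fidelity(adopted, ratings=None):
    """Mean item rating; lower values indicate higher fidelity.

    Averaging is an illustrative assumption, not the authors' stated scoring.
    """
    return mean(gbg_fidelity_items(adopted, ratings))
```

For example, `mean_fidelity(False)` yields the floor value of 3, while a teacher rating six items as 1 and two items as 2 would score 1.25.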

Degree of implementation was measured every other month, with teachers responding on a 5-point scale, with lower scores indicating greater likelihood of implementation: 1 = Currently implementing at least some parts of GBG, 2 = Started implementing GBG but stopped entirely, 3 = Have not started implementing GBG but intend to, 4 = Have not started implementing GBG and uncertain if want to, 5 = Have not started implementing GBG and do not intend to.

Student Behavioral Outcomes

We used DBR to measure student outcomes. Teachers were provided with operational definitions and examples of the three target behaviors in question (i.e., on-task, prosocial, and disruptive behaviors). They were asked to reflect on a typical day in their classroom and rate the proportion of time that students were on-task and engaging in prosocial behaviors as well as the number of disruptive behaviors that their students engaged in (Sims et al., 2021).

BASIS-T Procedural Fidelity Assessment

Delivery of BASIS-T was recorded and Authors 1 and 4 independently rated procedural fidelity to capture the degree to which the core features of BASIS-T were delivered as planned (adherence to content, group-based motivational interviewing, and treatment differentiation). Results indicated that the three BASIS-T components were delivered as planned and are reported more extensively in Larson et al. (2021).

Data Analytic Approach

Demographics and descriptive variables were analyzed using means, standard deviations, and group proportions. Crosstabulations with chi-square tests were used to test condition differences in the proportions of teachers who adopted the GBG immediately post-training and who adopted the GBG at any point. T-tests compared conditions on student behavior outcomes. Spaghetti plots were created for each mechanism and implementation outcome variable to visually assess the participant and condition time trend trajectories (Swihart et al., 2010). Mixed effects modeling was the primary method of testing longitudinal research questions (Magezi, 2015). We applied random effect terms to adjust standard errors due to nesting by timepoint within teacher within school for longitudinal variables. For each dependent variable (implementation status, GBG fidelity, frequency) an initial model was computed that included covariates for condition and time, followed by models that added the interaction of condition and time, quadratic time, and an interaction of quadratic time by condition (Nagle, 2018). In most models, time was centered at pre-test. Time was centered at the first post-training monthly survey for implementation status, GBG fidelity, and delivery amounts, because that was the first time that participants were surveyed about implementation. Model fit was used as the criterion for research question significance testing (Nagle, 2018). We only report here on the best fitting models, selected via the chi-square, Akaike Information Criterion, and Bayesian Information Criterion, with significance and smaller values indicating better model fit (Nagle, 2018). Adoption was analyzed as a two-level mixed effects model, without time, using a Bernoulli distribution link function. Due to the relatively small sample size, restricted maximum likelihood estimation was applied to obtain model coefficients; as appropriate, full maximum likelihood was used to obtain model fit statistics.
Nesting was found within the implementation status variable (ICC = 0.19). All other intraclass correlations among mechanism and fidelity variables were less than 0.05 with nonsignificant deviance tests, and therefore random effects were fixed in these models to permit convergence.
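The model-comparison logic described above can be sketched in a few lines. This is illustrative only: the log-likelihoods, parameter counts, and number of observations below are hypothetical values, not results from this study, and the closed-form p-values cover only the 1 and 2 degree-of-freedom cases. The sketch shows how a likelihood-ratio chi-square, AIC, and BIC might be computed for two nested models fit by full maximum likelihood, such as a linear-time model and one that adds quadratic time and its interaction with condition.

```python
import math

def aic(loglik, n_params):
    """Akaike Information Criterion; smaller values indicate better fit."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; smaller values indicate better fit."""
    return n_params * math.log(n_obs) - 2 * loglik

def chi2_upper_tail(stat, df):
    """Upper-tail chi-square probability (closed forms for df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only supports df = 1 or df = 2")

# Hypothetical full-ML fits (assumed numbers, for illustration only).
ll_linear, k_linear = -412.30, 6        # condition, time, condition x time
ll_quadratic, k_quadratic = -409.64, 8  # adds time^2 and time^2 x condition
n_obs = 300                             # hypothetical number of observations

# Likelihood-ratio test between the nested models.
lrt_stat = 2 * (ll_quadratic - ll_linear)
lrt_df = k_quadratic - k_linear
lrt_p = chi2_upper_tail(lrt_stat, lrt_df)

print(f"LRT chi-square({lrt_df}) = {lrt_stat:.2f}, p = {lrt_p:.3f}")
print(f"AIC: linear = {aic(ll_linear, k_linear):.2f}, "
      f"quadratic = {aic(ll_quadratic, k_quadratic):.2f}")
print(f"BIC: linear = {bic(ll_linear, k_linear, n_obs):.2f}, "
      f"quadratic = {bic(ll_quadratic, k_quadratic, n_obs):.2f}")
```

With these made-up inputs the likelihood-ratio test is marginal, AIC prefers the quadratic model, and BIC prefers the linear model, which illustrates why several fit indices are usually inspected together.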

For the exploratory tests of the basic elements of mediation analyses, we computed Pearson’s correlations and point-biserial correlations among task self-efficacy, implementation status immediately post-training, implementation status at any time during follow-up, student on-task behavior, student prosocial behavior, and student disruptive behavior (van Kesteren & Oberski, 2019). Because of the pilot, exploratory nature of the study, we did not adjust for the familywise error rate (Feise, 2002).
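As a brief illustration of the correlations named above, a point-biserial correlation is numerically the same as a Pearson correlation in which one variable is coded 0/1. The sketch below uses hypothetical data (an adoption indicator and task self-efficacy scores invented for the example) and hypothetical function names; it is not the analysis code used in the study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(binary, scores):
    """Point-biserial correlation from group means and the population SD."""
    n = len(scores)
    group1 = [s for b, s in zip(binary, scores) if b == 1]
    group0 = [s for b, s in zip(binary, scores) if b == 0]
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    grand_mean = sum(scores) / n
    sd_pop = math.sqrt(sum((s - grand_mean) ** 2 for s in scores) / n)
    return (mean1 - mean0) / sd_pop * math.sqrt(len(group1) * len(group0) / n ** 2)

# Hypothetical data: 1 = adopted immediately post-training, 0 = did not.
adopted = [1, 1, 0, 1, 0, 0, 1, 1]
task_self_efficacy = [4.2, 3.8, 3.1, 4.5, 2.9, 3.6, 4.0, 3.3]

r_pearson = pearson_r(adopted, task_self_efficacy)
r_pb = point_biserial(adopted, task_self_efficacy)
print(f"Pearson r = {r_pearson:.4f}, point-biserial r = {r_pb:.4f}")
```

The two functions return the same value up to floating-point error, so either form can be used when one variable is dichotomous.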

Effect Sizes

Because pilot studies are underpowered by definition, our study aims to avoid Type II errors (false negatives) in preparation for a larger trial. Therefore, we report unstandardized effect sizes and interpret marginal findings in the results below (Lee et al., 2014), and we caution readers against drawing strong inferences. Reporting effect sizes is crucial for interpreting applied research results and for determining practical significance within intervention science (Kelley & Preacher, 2012). However, clear guidelines for reporting effect sizes in multilevel models have not been provided (Peng et al., 2013). Multilevel models produce non-standardized effect size estimates. While non-standardized effect sizes make it difficult to compare the magnitude of effects across outcomes, they do provide information about the direction of the effect. Thus, in this study, for both non-significant and significant findings, we report on the direction of the beta estimates provided through multilevel modeling (Table 3). We report Cohen’s d effect sizes for student behavior outcome data (Cohen, 1988).
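For readers less familiar with the standardized effect size mentioned above, the following is a minimal sketch of Cohen’s d for two independent groups. It assumes the pooled-standard-deviation form, which the text does not explicitly specify, and uses hypothetical ratings and hypothetical function names. It also shows the standard link between d and the independent-samples t statistic.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    """Unbiased sample variance (n - 1 in the denominator)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * sample_variance(group_a)
                  + (n_b - 1) * sample_variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def t_from_d(d, n_a, n_b):
    """Equal-variance t statistic implied by d and the two group sizes."""
    return d / math.sqrt(1 / n_a + 1 / n_b)

# Hypothetical classroom behavior ratings (invented for illustration).
basis_t_ratings = [7.5, 8.0, 6.5, 9.0, 7.0]
ac_ratings = [6.0, 7.5, 6.5, 7.0, 5.5]

d = cohens_d(basis_t_ratings, ac_ratings)
t = t_from_d(d, len(basis_t_ratings), len(ac_ratings))
print(f"Cohen's d = {d:.3f}, implied t = {t:.3f}")
```

Because d is expressed in standard deviation units, it can be compared across outcomes measured on different scales, unlike the unstandardized coefficients from the multilevel models.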

Table 3 Longitudinal Mixed Effects Models Predicting Change Over Time by Condition

Power Analyses

For our primary research question (RQ 1), this pilot trial is powered to detect a minimum detectable standardized effect size of 0.73 to 0.76. This assumes stratified block randomization at the school level, grade-level cluster Intra-Class Correlation ranging from 0 to 0.15, nine schools with nine teachers per school, and school-level effect size variability of 0.01. As is the norm for pilot studies (Leon et al., 2011), this study is underpowered when considering the complexity of the full analytic model and the multiple paths/tests. To be powered to detect a more moderate effect (d = 0.5), the trial would need to include approximately 15 schools with nine teachers each.

Missing Data

A missing data analysis was completed to determine whether survey completion was balanced between conditions. We calculated the proportion of participants who completed each monthly survey by condition. We also sought to determine whether baseline mechanism scores correlated with whether participants responded to surveys. Missing data were deleted listwise after selecting mechanisms of interest in order to retain as much data as possible.

Results

Baseline mean and standard deviation scores for all mechanism variables are displayed in Table 1. The conditions were highly similar on all variables. Model coefficients are displayed in Table 3. Results from the missing data analysis indicated that the conditions had equal amounts of missing data. On average, participants in both conditions completed 72% of monthly surveys. Across the five monthly surveys, 75% of participants completed each monthly survey on average (range: 64–87%). Correlations between survey completion and mechanism score were weak (M = 0.010, range = − 0.0095 to 0.141). Because this pilot study was exploratory, we focus our results on both statistically and marginally significant findings.

RQ1. Overall, model findings suggested that there was one significant effect (task self-efficacy) and one marginal effect (attitudes toward EBP) between the two conditions on dependent variables representing mechanisms of behavioral change. Task self-efficacy increased significantly from pre- to post-training for teachers in the BASIS-T condition, while it deteriorated for teachers in the AC condition during this timeframe (Time Est = − 0.307, p = 0.003; Time × Condition Est = 0.289, p = 0.033; Time² Est = 0.036, p = 0.047; Time² × Condition Est = − 0.055, p = 0.025; model chi-square = 5.32, p = 0.069) (Fig. 2). Task self-efficacy was the only behavior change mechanism variable that had a significant quadratic time trend; therefore, the results below for the remaining mechanisms describe linear time trends. General self-efficacy scores decreased over time for both conditions (Est = − 0.338, p < 0.001), and there was no difference between conditions across time. Total scores on the EBPAS significantly deteriorated over time (Est = − 0.067, p < 0.001), and differences by condition over time approached significance in favor of BASIS-T (Est = 0.032, p = 0.084). Both groups had outcome expectancy scores that significantly deteriorated over time (Est = − 0.084, p = 0.003), and there were no differences by condition. The BASIS-T condition had a significantly lower ownership/role score on the pre-test (Est = − 0.472, p = 0.028), and there were no differences in change over time overall or by condition. Both groups had subjective norms scores that significantly deteriorated over time (Est = − 0.063, p = 0.009), and there were no differences by condition over time. On average, both conditions demonstrated a reduction in specific intentions scores over time (Est = − 0.168, p < 0.001), and there were no differences between conditions at post-training or in rate of change over time on intentions to implement. When examining effect size estimates for each of the DVs representing mechanisms of behavior change, results indicated that all estimates were in the hypothesized direction favoring the BASIS-T condition.

Fig. 2 Task self-efficacy estimated change over time by month. Note: Time point ‘0’ indicates the immediate post-training score

RQ2. Our analyses found an immediate effect of BASIS-T on self-reported adoption of GBG. Crosstabulations with chi-square tests found a significant difference between conditions in the percentage of participants who adopted the GBG immediately post-training (AC n = 12, 40.0%; BASIS-T n = 28, 73.7%; χ²(1) = 7.853, p = 0.005). However, by the end of the academic year there were minimal differences between the groups in the percentage of participants who ever adopted any of the GBG practices (AC n = 26, 74.3%; BASIS-T n = 33, 82.5%; χ²(1) = 0.751, p = 0.386). This was consistent with mixed effects modeling, which found that degree of implementation was significantly better for the BASIS-T condition immediately after training (Est = − 0.812, p = 0.012; lower scores indicate greater implementation; see Fig. 3). Both conditions’ degree of implementation improved over time (Time Est = − 0.438, p < 0.001), and a significant and positive quadratic time term indicated flattening over time for both groups (Est = 0.114, p = 0.003). On average, the BASIS-T group showed less improvement over time (from follow-up 1 to 5) than the AC group (Est = 0.134, p = 0.043), given its higher proportion of early adopters immediately post-training.
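As a quick arithmetic check, the Pearson chi-square for a 2 × 2 table can be recomputed from the adoption counts reported above. The sketch assumes no continuity correction and infers the group denominators from the reported percentages (12 of 30 and 28 of 38 immediately post-training; 26 of 35 and 33 of 40 by year end); those denominators are our inference rather than figures stated in the text, and the function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, the chi-square upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: AC, BASIS-T. Columns: adopted, did not adopt.
# Non-adopter counts are inferred from the reported percentages (assumption).
immediate_stat, immediate_p = chi_square_2x2(12, 18, 28, 10)
ever_stat, ever_p = chi_square_2x2(26, 9, 33, 7)

print(f"Immediate adoption: chi-square(1) = {immediate_stat:.3f}, p = {immediate_p:.3f}")
print(f"Ever adopted:       chi-square(1) = {ever_stat:.3f}, p = {ever_p:.3f}")
```

Under those assumptions the sketch yields statistics of about 7.85 and 0.75, in line with the values reported above.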

Fig. 3 Degree of implementation estimated change over time by month. Note: Lower scores indicate greater implementation

There were no differences in self-reported intervention fidelity for GBG-specific practices immediately post-training, and both conditions’ fidelity improved over time (Est = 0.162, p = 0.029). There was a marginal trend in favor of BASIS-T such that intervention fidelity remained steady for BASIS-T while worsening for the AC condition (Est = − 0.191, p = 0.057). For teachers delivering the GBG, there were no differences between conditions in the number of times the game was played per day at the first timepoint post-training, no changes over time, and no differences between conditions in change over time. There was a borderline finding in favor of the BASIS-T condition, indicating that this condition may have played the game about one day per week more often than the AC condition at the first timepoint post-training (Est = 1.197, p = 0.097). There were no other differences in the number of times per week the game was played.

RQ3. Conditions did not statistically differ on DBR scores, though effect sizes were in favor of the BASIS-T condition for all subscales, including student on-task behaviors (Cohen’s d = − 0.368 [95% CI: − 0.836, 0.102], t(70) = − 1.55, p = 0.125), student prosocial behaviors (Cohen’s d = − 0.274 [95% CI: − 0.740, 0.194], t(70) = − 1.16, p = 0.252), and student disruptive behaviors (Cohen’s d = 0.380, t(70) = 1.60, p = 0.114). Again, these findings should be interpreted with caution given that they were not significant at the α = 0.05 level.

RQ4. Table 4 displays exploratory analyses of the associations among the mechanism, implementation, and student outcome variables. Exploratory analyses indicated a possible small association between task self-efficacy immediately post-training and whether the GBG was implemented immediately post-training (r = 0.222, p = 0.069), and whether the GBG was ever implemented (r = 0.213, p = 0.081). There were statistically significant associations between task self-efficacy and student on-task behavior (r = 0.283, p = 0.022), and between task self-efficacy and student prosocial behavior (r = 0.263, p = 0.034), but no association between task self-efficacy and student disruptive behavior (r =  − 0.079, p = 0.529). Implementing the GBG immediately post training was associated with ever implementing the GBG (r = 0.581, p < 0.001), student on-task behavior (r = 0.259, p = 0.037), possibly associated with student prosocial behavior (r = 0.207, p = 0.098), but not associated with student disruptive behavior (r = 0.006, p = 0.962). Having ever implemented the GBG was not associated with student on-task behavior (r = − 0.037, p = 0.758), student pro-social behavior (r = 0.164, p = 0.169), or student disruptive behavior (r = 0.163, p = 0.173).

Table 4 Correlations among hypothesized and significant mechanisms and student outcomes

Discussion

The uptake and high-fidelity delivery of universal EBPPs enhances the public health impact of prevention science in schools. Cornerstone didactic training is a key implementation strategy used by schools to facilitate EBPP uptake and use (Joyce & Showers, 2002; Owens et al., 2014). However, because training alone tends to yield disappointing outcomes, we set out to develop a theory-based strategy to boost the effects of training on teacher uptake and use of universal EBPPs, and to analyze the underlying mechanisms of action to understand how and why the effects occurred. This study used a hybrid type III, cluster randomized pilot trial design to examine the effects of BASIS-T on teacher implementation outcomes and to observe student classroom behavior. Findings revealed that receiving BASIS-T was associated with an immediate adoption rate for a class-wide EBP roughly 34 percentage points higher than in the comparison group (73.7% vs. 40.0%). We also discuss the nuanced effects of BASIS-T on behavioral mechanisms and provide directions for future research and practice.

Two principles guided our interpretation of the findings. First, we grounded our interpretation in the fact that this was a pilot trial. Pilot trials are crucial for translational research processes because they detect potential signals that could inform planning of a larger, more rigorous, and adequately powered study (Eldridge et al., 2016). Consequently, we discuss both significant and marginal findings below, as they elucidate potential signals of effects to inform research questions, measurement approach, and analytic models for a larger trial. Second, BASIS-T was evaluated in the context of a universal intervention. In a review of multiple meta-analyses of universal prevention programs, Tanner-Smith and colleagues (2018) found that, across universal prevention programs targeting school-aged youth, median average effect sizes fell between Cohen’s d = 0.07 and 0.16, with smaller effects on behavior. They concluded that the statistical benchmarks outlined by Cohen (1988) for small (0.20), medium (0.50), and large (0.80) effects are inappropriate for interpreting findings from universal interventions. Thus, effects on student outcomes were interpreted in the context of the estimates from Tanner-Smith et al. (2018).

Results from the current study contribute to a nuanced understanding of BASIS-T effects. First, BASIS-T teachers had significantly higher task self-efficacy and marginally higher attitudes towards EBPPs post-training than AC teachers. Task self-efficacy differs from general self-efficacy in that it relates to confidence in one’s ability to perform specific behaviors (e.g., specific practices that constitute the GBG) rather than general confidence. These findings align with other research indicating that attitudes and self-efficacy impact behavior change (Sheeran et al., 2016). Second, teachers in the BASIS-T condition reported significantly greater early adoption compared to teachers in the AC condition following training: 74% of teachers in the BASIS-T condition indicated they were early adopters of the GBG practices compared to only 40% in the AC condition. Across both conditions, exploratory analyses indicated that teachers with higher task self-efficacy were marginally more likely to be early adopters, and that early adoption correlated positively with student on-task behavior and marginally with student prosocial behavior, but not with student disruptive behavior (Table 4). Conversely, having ever adopted was not associated with student outcomes. Third, we found that teachers’ ratings deteriorated across all mechanisms of behavior change over the academic year, suggesting that as time goes on, teachers are likely to develop less favorable attitudes, lower social norms, and weaker self-efficacy related to EBPP implementation. Although the BASIS-T group was initially associated with significant differences in outcome expectancies (Larson et al., 2021), these deteriorated over time. Fourth, regarding self-reported intervention fidelity, we found a marginal trend in favor of BASIS-T, indicating that fidelity remained relatively steady for BASIS-T while weakening for the AC condition. There was also a marginal finding in favor of the BASIS-T condition indicating that teachers in this condition reported playing the game about one day per week more than teachers in the AC condition post-training. Last, student behavioral outcomes (on-task, prosocial, disruptive) favored the BASIS-T condition, with effect sizes of d = 0.27–0.38, which fall above the median average effect range for universal prevention programs on various outcomes. These findings, although not statistically significant, are consistent with the argument that earlier adoption of evidence-based behavioral interventions corresponds with improved behavioral outcomes. However, given the nonsignificant results in this study, additional research directly or indirectly associating higher fidelity with student outcomes is needed.

Implications for BASIS Theory of Change

This is the third study analyzing the effects of the BASIS strategy, which enables us to interpret our findings in the context of the results from previous studies. It appears that BASIS has an effect on initial adoption of EBPPs as well as on specific proximal mechanisms of behavior change, including task self-efficacy (p < 0.05) and potentially outcome expectancies and attitudes toward EBPs (p = 0.08) (this study; Cook et al., 2015; Larson et al., 2021). In turn, increased task self-efficacy appears to increase the early yield of training on practitioner adoption, as significantly more practitioners who receive BASIS initiate implementation right away. However, in response to consultation, practitioners in the AC group are likely to catch up with those in the treatment group. While the proportion of practitioners adopting an EBPP in the AC condition becomes more similar to that of the BASIS-T condition over the course of the year, the sooner in the year practitioners adopt the EBPP, the greater the likelihood of improved student outcomes. Teachers’ attitudes toward an EBP are also highly dependent on the specific practices within the selected EBP as well as the process of selection (i.e., whether teachers’ viewpoints were considered when choosing the EBP). Favorable attitudes have been shown to impact teachers’ decisions to adopt a given practice (Sheeran et al., 2016).

BASIS may have a small, or even negligible, effect on sustained intervention fidelity, although adequately powered studies with rigorous observation of fidelity are needed. BASIS is designed to act on antecedent mechanisms of behavior change by engaging practitioners in a group-based motivational experience prior to and immediately after EBPP training. Fidelity, on the other hand, involves behavioral persistence after initial adoption has occurred, which is exemplified by practitioners’ effort to continuously improve their delivery of an EBPP with fidelity. Fidelity is largely the focus of implementation strategies like consultation, fidelity audits, and protected time for reflection and planning. The findings from the BASIS studies suggest that researchers need to design more precise implementation strategies that are tailored and designed to effect changes in specific implementation outcomes of interest throughout the implementation process. For example, strategies designed to promote adoption (i.e., initiation or starting to implement) may function differently than those aiming to enhance sustained fidelity. Implementation strategies are infrequently designed with such precision in mind, potentially attenuating the observed effects of some implementation efforts (Lewis et al., 2018). To address this, future iterations of BASIS will attempt to study and impact the extent to which teachers engage in existing, ongoing supports to promote sustained behavior change.

The findings from this study suggest the need for multi-faceted implementation approaches that combine and sequence different strategies to promote successful adoption and persistence towards high fidelity. The implementation literature is mixed with regard to whether multi-faceted or bundled strategies are more effective than single component, discrete strategies (Merle et al., 2022a, 2022b; Squires et al., 2014). Regardless, we concur with others who have articulated how the effects of multi-faceted strategies can be increased through precisely targeting mechanisms of behavior change (Lewis et al., 2018), and through careful sequencing of strategies at critical junctures of the implementation process (Kilbourne et al., 2014). For example, engaging practitioners in motivational experiences prior to training may prime them to be more engaged and responsive to training. In turn, volitional implementation strategies situated immediately post training, such as implementation planning, are well-positioned to help motivated practitioners initiate the adoption of EBPPs. Next, post-training consultation can be deployed once practitioners have had a chance to begin implementing an EBPP to support them to persist towards high fidelity, and make fidelity-consistent adaptations to fit the context.

Implications for Research and Practice

BASIS-T serves as a novel implementation strategy targeting motivational processes linked with individual behavior change in a group-based context that is designed to complement training and consultation. This study elucidates the importance of attending to task self-efficacy as a mechanism of behavior change related to adoption for prevention programming. It also calls for a closer look at determinants of sustainability, i.e., what happens over the course of a school year or beyond that leads to the deterioration in behavior change mechanisms among implementers. Post-training implementation strategies should attend to this phenomenon and buffer implementers against experiences that weaken factors related to behavior change. Coaching grounded in motivational interviewing offers a useful approach to sustaining the motivational factors related to behavior change once implementers experience the daily demands of their jobs (Frey et al., 2020).

Limitations and Future Directions

There are limitations of this study that warrant discussion and pinpoint directions for future research. First, this was a pilot trial with a relatively small sample of schools and teachers. Therefore, marginally significant predictors were explored to inform future iterations of BASIS as well as the theory of change, and they should be interpreted with caution. Further, randomization occurred at the school level, which limits statistical power. This decision, however, was made to increase feasibility and reduce the risk of contamination because one mechanism of BASIS is social norms. Although we explored the association between significant mechanisms (i.e., task self-efficacy) and adoption, we had limited power to perform more robust mediation and moderation analyses. Future studies may consider teacher-level randomization with contamination prevention measures.

Another limitation of this work was the reliance on self-reported implementation outcome data. Self-reported fidelity tends to be overestimated relative to permanent product review and direct observation (McKenna et al., 2014). Therefore, future research testing the BASIS strategy should incorporate a more robust metric of implementation outcomes, such as observational coding of implementer behavior or permanent product review.

BASIS-T had an effect on immediate adoption and revealed potential mechanisms impacting this change. However, to ensure the sustainability of change efforts, future strategies that uncover the mechanisms that protect against implementer fade and regression toward the mean are warranted, as there is scant research in this area (Merle et al., 2022b). In this study, we attempted to protect against implementer deterioration in fidelity by including a timely booster experience roughly two weeks into the school year, which aligns with other research documenting this phenomenon (Noell et al., 2005). Future research should directly target determinants of successful implementation across the implementation process to ensure maintenance of effects. For instance, future iterations of BASIS intend to capture the degree to which teachers engage in ongoing supports, such as coaching and consultation. These data were not collected in this trial, which is a limitation.

Research findings from the various iterations of BASIS continually reveal limited variability in intentions to implement at the individual level, which may be due to social desirability or to the way in which behavioral intention items are constructed, resulting in participants endorsing responses on the higher end of the scale. Developing measures to capture the full range of variability between participants in their status on behavior change is critical to explain variance in these constructs as well as predict variability in implementation outcomes (e.g., adoption and fidelity). Implementation science is still nascent, and researchers need to continually focus on establishing a common set of instruments to capture behavior change constructs related to the adoption and delivery of EBPPs that are validated for school contexts. One avenue explored by Moullin and colleagues (2018) involves incorporating Rasch measurement theory when developing innovation-specific intentions, which offers advantages over classical factor analysis, such as being able to assess the appropriateness of response options.

Our implementation strategy focused on individual-level mechanisms of behavior change. Results of the pilot study provide generalizable knowledge that self-efficacy is an important mechanism for initiating a new behavior. However, a variety of inner- and outer-setting factors interact to determine the sustained use of innovations in schools (Owens et al., 2014). The fit of the innovation, the implementation climate, perceptions of support from colleagues and leaders, external policy and pressures to fulfill other obligations, and myriad other factors are known to contribute to the successful implementation of a new practice, and these factors must be identified and targeted in tandem with individual-level strategies (Flottorp et al., 2013). Typically, best practice in implementation science includes a carefully planned, dynamic collaborative process between stakeholders and implementation experts who assist in the selection of an innovative practice based on need (Wolfenden et al., 2021), and the utilization of a variety of strategies targeting the various socioecological levels of an organization at specific time points to achieve implementation success (Aarons et al., 2011; Fixsen et al., 2005). BASIS-T was developed to increase the yield of professional development in schools as a pre-implementation strategy. Since single implementation strategies likely achieve only a slice of the whole effect, future studies and real-world implementation efforts need to look at the efficacy of BASIS within a multi-level, blended approach during active implementation and sustainment phases to capitalize on the early gains and achieve implementation success.

There is enough preliminary evidence supporting BASIS that there is now a need for a larger efficacy trial enabling analysis of for whom, under what conditions, and how or why BASIS works. For instance, it is unknown whether participation in BASIS increases engagement in other evidence-based implementation strategies (e.g., performance feedback), which are important determinants of sustained fidelity (Stormont et al., 2015). Future research on BASIS should look at more proximal process outcomes, such as whether BASIS leads to greater engagement in the EBP training provided by the purveyor group as well as embedded consultation within the school setting. This study was also significantly limited by interruptions caused by the COVID-19 pandemic, which prevented the EBPP purveyor from collecting observational fidelity data for the later data points during the school year and ultimately limited the amount of data we were able to collect and analyze. Findings related to self-reported fidelity may not hold for “gold standard” observed fidelity (Sanetti & Kratochwill, 2009, 2011).

Conclusion

Successful implementation of EBPPs boils down to behavior change, and motivation is a critical antecedent mechanism of behavior change. Among the behavior change constructs, we believe findings from this study and other studies suggest that task self-efficacy in particular serves as an important antecedent mechanism of behavior change that drives initial adoption. Our work on BASIS has continually been in pursuit of a mechanistic understanding of what, how, and why change occurs, for whom, and under which circumstances. We urge researchers to carefully target and monitor early adoption and intervention fidelity and to consider the specific strategies that operate on each of these important implementation outcomes. This study’s findings advance understanding of the type of implementation strategies that complement pre-implementation training and post-training consultation in schools. Although no single strategy guarantees implementation success, we encourage future research that continues to develop and test implementation strategies that clearly specify their change mechanisms at various levels and time points during implementation.