An extensive literature describes the strengths of randomized controlled trials (RCTs), often considered the “gold standard” of research evidence, for testing mental health interventions. A key advantage of RCTs is that random assignment minimizes the influence of confounders, allowing causal inferences to be drawn (Essock et al., 2003). RCTs are commonly distinguished along a continuum from efficacy to effectiveness (Flay et al., 2005). Efficacy trials, which tend to be explanatory and precede effectiveness trials, test whether interventions work under controlled experimental conditions designed to maximize internal validity (Weisz et al., 1995). For example, an efficacy trial might test a mental health intervention delivered by highly trained and supervised staff or graduate students in a university-based research clinic. Effectiveness trials are RCTs carried out under routine, usual care conditions, such as testing an intervention delivered by clinicians in community mental health clinics (CMHCs). Effectiveness trials are designed to balance internal validity against external validity, the applicability of findings to wider settings and populations (Flay et al., 2005; Kessler et al., 2009). To address the research-to-practice gap in mental health services and advance public health, the National Institute of Mental Health (NIMH) has made effectiveness research a strong funding priority, and there have been increasing calls for research assessing intervention effectiveness in “real world” settings to increase generalizability and public health impact (NIMH, 2021).

An important consideration across the efficacy-effectiveness spectrum is whether study participants are representative of the larger population. An intervention shown to be efficacious or effective can only claim to be so for groups similar to the population from which participants were drawn (Flay et al., 2005). Study samples in efficacy and effectiveness trials differ in several notable ways. Efficacy trials tend to have stricter inclusion criteria and enroll more homogeneous participants in terms of demographic and clinical characteristics (Weisz et al., 1995). To support the generalizability of study findings, effectiveness trials typically have broader inclusion criteria and enroll more heterogeneous samples (e.g., higher symptom severity, greater comorbidity) (Huebschmann et al., 2019; Singal et al., 2014; Tansella et al., 2006). Although it is generally assumed that effectiveness trial samples are more representative of the general population than efficacy trial samples, prior work has focused primarily on describing differences in patient samples (Persons & Silberschatz, 1998; Rengerink et al., 2017). To our knowledge, the question of sample representativeness in effectiveness research has not yet been examined in clinician samples.

In the Exploration, Preparation, Implementation, Sustainment (EPIS) framework, inner context factors, such as organizational and individual clinician characteristics, are important determinants of implementation success (Aarons et al., 2011). Characterizing how clinicians who participate in effectiveness studies compare to the broader population of community mental health clinicians, for whom intervention delivery is intended, is important for judging the generalizability of treatment and implementation outcomes (e.g., the acceptability of evidence-based practices (EBPs) among practicing clinicians). For example, clinicians who participate in effectiveness trials may be more motivated to learn new EBPs, more closely supervised, and more likely to pursue research opportunities than the broader population of community clinicians (McLeod et al., 2019). They may also have more specialized training consistent with EBP implementation and more favorable attitudes towards EBPs, which could translate into selection bias in research trials. These factors may differentiate clinicians who participate in effectiveness trials from nationally representative clinician samples, which may limit our understanding of the research-to-practice gap in mental health services.

Comparing clinicians participating in effectiveness trials to nationally representative survey samples of mental health clinicians on attitudes towards EBPs is one way to address the question of representativeness. The Theory of Planned Behavior (Ajzen, 1991), which predicts a person’s intentions to engage in a specific behavior, emphasizes the importance of attitudes in behavior change. Clinician attitudes towards EBPs have been widely studied as a clinician-level predictor of implementation, with mixed findings. Some studies have reported that clinicians with favorable attitudes towards EBPs are more likely to adopt them (Ashcraft et al., 2011; Lewis & Simons, 2011; Nakamura et al., 2011), while others have shown no association between attitudes and EBP use (Bearman et al., 2013; Higa-McMillan et al., 2015). Despite these mixed findings, theoretical and empirical work underscores the importance of attitudes in EBP adoption. As such, the representativeness of clinician attitudes towards EBPs warrants further investigation.

In this study, we compare demographic and professional characteristics, attitudes toward EBPs, and attitudes towards measurement-based care (MBC) among clinicians participating in a National Institute of Mental Health (NIMH) funded effectiveness trial (Community Study of Outcome Monitoring for Emotional Disorders in Teens; COMET; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018) to two national U.S. survey samples. National survey samples typically target the most prevalent providers of mental health services with the goal of obtaining representative samples of clinicians, and thus can serve as a useful reference for sample generalizability and cross-sample comparison. The survey samples were those used to establish the psychometric properties of three clinician attitude measures: the Evidence-Based Practice Attitude Scale (EBPAS; Aarons, 2004; Aarons et al., 2010), the Attitudes Toward Standardized Assessment Scales–Monitoring and Feedback (ASA–MF; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018), and the Monitoring and Feedback Attitudes Scale (MFA; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018). We hypothesized that COMET clinicians would have more positive attitudes towards EBPs and MBC compared to the national clinician samples.

Methods

Participants

A total of 176 clinicians were recruited to participate in the Community Study of Outcome Monitoring for Emotional Disorders in Teens (COMET), a multi-site, randomized controlled trial testing the effectiveness of two EBPs for adolescent anxiety and depression in community mental health centers. COMET compared three conditions: (1) treatment as usual; (2) treatment as usual plus MBC (using the Youth Outcome Questionnaire; YOQ; Burlingame et al., 2005); and (3) the Unified Protocol for Transdiagnostic Treatment of Emotional Disorders in Adolescents (UP-A; Ehrenreich et al., 2008), a cognitive behavioral psychotherapy model, plus MBC (Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018). Clinicians were drawn from 19 community mental health centers in the Northeastern and Southeastern United States and were employed at least part-time. Each agency had an average of 9.26 clinicians in the trial (SD = 8.4; range 3–31). COMET clinicians were on average 35.5 years old (SD = 10.1, range 23–65) and 85.8% (N = 151) were female. 46.6% (N = 82) of the sample identified as Caucasian, 16.5% (N = 29) as African American, 32.9% (N = 57) as Hispanic, 0.6% (N = 1) as Asian American, and 3.4% (N = 6) as another race. Participant demographics are shown in Table 1.

Table 1 Comparisons between demographic and professional characteristics of Community Study of Outcome Monitoring for Emotional Disorders in Teens (COMET) and national U.S. survey samples

Aarons et al. (2010) participants (N = 1089) were recruited from 100 mental health clinics in 75 cities across 26 states. Jensen-Doss, Ehrenreich-May, et al. (2018) and Jensen-Doss, Haimes, et al. (2018) participants (N = 504) were mental health clinicians recruited through three professional organizations (American Mental Health Counselors Association; American Association for Marriage and Family Therapy; National Association of Social Workers) that provided mailing lists of random, representative samples of their memberships. Additional details on recruitment for the national survey samples can be found in Aarons et al. (2010), Jensen-Doss, Ehrenreich-May, et al. (2018), and Jensen-Doss, Haimes, et al. (2018).

Measures

The Clinician Background Questionnaire is a brief questionnaire designed to capture clinician demographic and professional background information (theoretical orientation, degree, licensure status, years of professional experience, professional field).

The Evidence-Based Practice Attitude Scale (EBPAS; Aarons, 2004) is a 15-item measure of mental health clinicians’ general attitudes towards EBP adoption. Clinicians rate on a 5-point scale the extent to which they agree with each item, from 0 (“not at all”) to 4 (“a very great extent”). Responses are averaged into a total score and four subscales: Requirements (3 items; e.g., the extent to which a provider would adopt an EBP if it were required by an agency, supervisor, or state), Appeal (4 items; e.g., the extent to which a provider would adopt an EBP if it were intuitively appealing), Openness (4 items; e.g., the extent to which a provider is generally open to trying new interventions and would be willing to use more structured or manualized interventions), and Divergence (4 items; e.g., the extent to which the provider perceives EBPs as not clinically useful and less important than clinical experience). Higher scores indicate more positive attitudes towards EBPs, with the exception of the Divergence subscale, where lower scores indicate more positive attitudes. For the total score, Divergence items are reverse scored prior to computing the mean. The EBPAS has demonstrated a stable factor structure that has been replicated across multiple studies and has psychometric support for its use in clinician samples (Aarons, 2004; Aarons et al., 2007). The EBPAS has been translated into more than 18 languages and used in multiple settings, including mental health, substance use disorder treatment, and other medical settings. For this study, the EBPAS validation sample was used as a comparison; it included a U.S. nationally representative sample of 1089 mental health service providers from 100 clinics across 26 states (see Aarons et al., 2010 for recruitment procedures; Table 1). Cronbach’s alpha for the subscales ranged from .67 to .91 (αTotal = .74) in the Aarons et al. (2010) sample, and from .61 to .92 in the current study (αRequirements = .92; αAppeal = .74; αOpenness = .81; αDivergence = .61; αTotal = .85).
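To make these scoring rules concrete, below is a minimal R sketch (R being the software used for this study’s analyses; see Data Analysis). The item column names are hypothetical placeholders rather than the instrument’s official variable names, and computing the total as the mean of the four subscale scores with Divergence reversed is one plausible reading of the scoring description above, not the published scoring syntax.

```r
# Minimal EBPAS scoring sketch. Item names req_1..req_3, app_1..app_4,
# open_1..open_4, div_1..div_4 are hypothetical placeholders for the
# 15 items, each rated 0 ("not at all") to 4 ("a very great extent").
score_ebpas <- function(items) {
  req  <- paste0("req_",  1:3)   # Requirements (3 items)
  app  <- paste0("app_",  1:4)   # Appeal (4 items)
  open <- paste0("open_", 1:4)   # Openness (4 items)
  div  <- paste0("div_",  1:4)   # Divergence (4 items)
  scores <- data.frame(
    requirements = rowMeans(items[req]),
    appeal       = rowMeans(items[app]),
    openness     = rowMeans(items[open]),
    divergence   = rowMeans(items[div])  # lower = more positive attitudes
  )
  # For the total, Divergence items are reverse scored (4 - response on the
  # 0-4 scale) before averaging the four subscale means.
  scores$total <- rowMeans(cbind(scores$requirements, scores$appeal,
                                 scores$openness, rowMeans(4 - items[div])))
  scores
}

# Demo with one simulated respondent (random responses, illustration only)
set.seed(42)
resp <- as.data.frame(matrix(sample(0:4, 15, replace = TRUE), nrow = 1))
names(resp) <- c(paste0("req_", 1:3), paste0("app_", 1:4),
                 paste0("open_", 1:4), paste0("div_", 1:4))
score_ebpas(resp)
```

In practice, published scoring syntax for the EBPAS should be preferred where available.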

The Monitoring and Feedback Attitudes Scale (MFA; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018) is a 14-item measure of clinician attitudes towards monitoring and feedback. Clinicians rate on a 5-point scale how much they agree with each item, from 1 (“Strongly Disagree”) to 5 (“Strongly Agree”). Responses are averaged into two subscales: Benefits (10 items; e.g., the utility of monitoring and feedback for supervision and collaboration with clients) and Harm (4 items; e.g., whether negative feedback might harm the alliance or be misused by clinic administrators). The initial MFA validation sample consisted of 504 mental health providers recruited through the mailing lists of professional organizations (see Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018 for recruitment procedures). Internal consistency for both subscales was good (αBenefit = .87; αHarm = .87). In the current study, the MFA subscales also demonstrated good internal consistency (αBenefit = .87; αHarm = .78).

The Attitudes Toward Standardized Assessment Scales–Monitoring and Feedback (ASA–MF; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018) is an 18-item instrument that assesses clinician attitudes towards standardized instruments, focusing specifically on standardized progress measures and their utility for clinical decision making. The ASA–MF provides participants with definitions of routine progress monitoring and standardized measures and asks them to indicate how much they agree with each statement on a scale from 1 (“Strongly Disagree”) to 5 (“Strongly Agree”). Responses are averaged into three subscales: Clinical Utility (8 items; e.g., MBC can provide useful clinical information; α = .85), Treatment Planning (5 items; e.g., MBC can be used to guide treatment decisions; α = .85), and Practicality (5 items; e.g., practical concerns like time or paperwork that may get in the way of using MBC; α = .81). The ASA–MF and MFA were developed using the same sample of 504 mental health providers. In the current study, internal consistency for the ASA–MF subscales ranged from .77 to .82 (αClinical Utility = .77; αTreatment Planning = .80; αPracticality = .82).

Procedures

CMHC clinicians were recruited from 19 clinics (6 in South Florida and 13 in New England). Interested clinicians were referred by agency leaders, provided informed consent, and completed measures about their attitudes towards EBPs and MBC prior to being randomized or trained. Once these measures were completed, clinicians were randomly assigned to one of the three treatment conditions. Clinicians received training and consultation in the UP-A and/or YOQ and treated COMET cases as part of their regular caseload.

Data Analysis

Rates of missing data were low (0–1.1%). Data were missing completely at random according to Little’s MCAR test (χ2 = 17.58, df = 12, p = .13) (Little, 1988); thus, listwise deletion was used. Chi-square tests were used to compare COMET clinicians and the national survey samples on background characteristics; for the χ2 calculations, some education and professional discipline categories were collapsed (see Table 1). Independent samples t-tests were used to compare samples on the EBPAS, MFA, and ASA–MF scale scores. The Benjamini–Hochberg false discovery rate procedure (Benjamini & Hochberg, 1995) was used to correct for multiple comparisons (n = 20 analyses). Cohen’s d values of .20, .50, and .80 were interpreted as small, medium, and large effect sizes, respectively. Multiple regression models were used to compare the samples on scale scores after controlling for demographic and professional characteristics, given that these variables may influence attitude scores. Analyses were conducted using R Version 4.0.3.
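As a rough sketch of this pipeline, the following R code runs the main steps on simulated stand-in data. Group sizes match the reported samples, but all scores, covariate names, and the 19 placeholder p-values are arbitrary illustrations; naniar::mcar_test() is one available implementation of Little’s (1988) test, and the exact covariate set used in the study’s regression models may differ.

```r
# Illustrative analysis pipeline on simulated stand-in data; values are
# arbitrary and do not reproduce study results.
set.seed(1)
n_comet <- 176; n_survey <- 1089
dat <- data.frame(
  sample      = factor(rep(c("COMET", "survey"), times = c(n_comet, n_survey))),
  ebpas_total = c(rnorm(n_comet, 3.0, 0.5), rnorm(n_survey, 2.7, 0.5)),
  age         = c(rnorm(n_comet, 35.5, 10), rnorm(n_survey, 45, 10)),
  female      = rbinom(n_comet + n_survey, 1, 0.8)
)
dat$ebpas_total[sample(nrow(dat), 12)] <- NA  # sprinkle in missingness

# Little's MCAR test would precede deletion (e.g., naniar::mcar_test(dat));
# with MCAR supported, listwise deletion of incomplete cases follows.
dat_cc <- na.omit(dat)

# Independent samples t-test; t.test() in R defaults to the Welch version,
# consistent with the fractional degrees of freedom reported in Results.
tt <- t.test(ebpas_total ~ sample, data = dat_cc)

# Cohen's d from group summaries, using the pooled standard deviation
cohens_d <- function(m1, m2, s1, s2, n1, n2) {
  sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
  (m1 - m2) / sp
}

# Benjamini-Hochberg FDR correction across the full set of 20 comparisons
raw_p <- c(tt$p.value, runif(19))  # 19 placeholder p-values for illustration
adj_p <- p.adjust(raw_p, method = "BH")

# Regression comparing samples on scale scores, adjusting for covariates
fit <- lm(ebpas_total ~ sample + age + female, data = dat_cc)
summary(fit)
```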

Results

Comparisons between the COMET and national survey samples are presented in Table 1. Demographically, compared to the national survey sample clinicians, the COMET clinicians were significantly younger (Aarons et al., 2010: t(266.57) = 3.88, p < .01; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018: t(351.68) = 22.47, p < .01), included a higher proportion of women (Aarons et al., 2010: χ2(1) = 8.18, p = .004; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018: χ2(1) = 9.66, p = .002), and were more ethnically diverse (Aarons et al., 2010: χ2(4) = 96.56, p < .01; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018: χ2(4) = 155.53, p < .01). Professionally, a higher proportion of COMET clinicians had a Master’s-level education (Aarons et al., 2010: χ2(3) = 39.14, p < .01; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018: χ2(2) = 41.87, p < .01) and a training background in psychology rather than other disciplines (Aarons et al., 2010: χ2(1) = 54.33, p < .01; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018: χ2(1) = 34.68, p < .01).

Sample scale scores are shown in Table 2. Compared to clinicians in the Aarons et al. (2010) sample, COMET clinicians had more positive EBP attitudes, as measured by the EBPAS Total Score (t(227.55) = −8.15, p < .0001, d = −.68 [95% CI −.85, −.52]). Analyses of the EBPAS subscales indicated that COMET clinicians reported greater willingness to adopt an EBP if it was required by their agency (t(233.7) = −7.24, p < .0001, d = −.59 [95% CI −.75, −.43]) or if it was intuitively appealing (t(237.35) = −3.18, p = .0016, d = −.25 [95% CI −.41, −.09]). COMET clinicians also reported greater openness to new interventions (t(237.35) = −3.18, p = .0016, d = −.31 [95% CI −.47, −.15]) and were less likely to perceive research-based interventions as divergent from usual clinical practice (t(256.97) = 8.71, p < .0001, d = .63 [95% CI .47, .80]) compared to the Aarons et al. (2010) sample.

Table 2 Scale score comparisons between Community Study of Outcome Monitoring for Emotional Disorders in Teens (COMET) and national U.S. survey samples

Similarly, COMET clinicians reported more positive views of standardized progress measures than participants in the Jensen-Doss, Ehrenreich-May, et al. (2018) and Jensen-Doss, Haimes, et al. (2018) study. On the MFA, they rated MBC as having more benefits (t(360.72) = −6.98, p < .0001, d = −.58 [95% CI −.75, −.40]) and less harm (t(394.82) = 12.99, p < .0001, d = 1.01 [95% CI .84, 1.2]). COMET clinicians also had higher scores on the ASA–MF, indicating that they viewed standardized measures as more clinically useful (t(383.1) = −11.76, p < .0001, d = −.94 [95% CI −1.12, −.76]), more beneficial for treatment planning (t(394.33) = −7.76, p < .0001, d = −.61 [95% CI −.79, −.43]), and more practical to use (t(354.98) = −11.78, p < .0001, d = −.97 [95% CI −1.15, −.79]) than did the Jensen-Doss, Ehrenreich-May, et al. (2018) and Jensen-Doss, Haimes, et al. (2018) sample. All findings remained statistically significant (p < .05) after correcting for multiple comparisons using the Benjamini–Hochberg method (n = 20 analyses).

Given the significant differences in clinician background characteristics across study samples, multiple regression models were used to examine how scale scores compared after controlling for these variables. Results indicate that after controlling for demographic and professional characteristics, COMET clinicians continued to have more favorable attitudes towards EBPs (Table 3) and MBC (Table 4) compared to the national survey samples.

Table 3 Multiple regression models for the Evidence-Based Practice Attitude Scale (Aarons et al., 2010)
Table 4 Multiple regression models for the Monitoring and Feedback Attitudes Scale and Attitudes towards Standardized Assessment Scales–Monitoring and Feedback (Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018)

Discussion

While it is generally assumed that samples from effectiveness trials are representative of the broader population, this question has not previously been tested in clinician samples. Understanding how effectiveness trial clinicians compare to the wider population of practicing community mental health clinicians is important for characterizing who participates in research studies and implementation efforts, contextualizing results, and comparing findings across study samples, given that trial participation may itself reflect greater engagement with the research process and use of EBPs. To address this gap, we compared background characteristics and attitudes of clinicians participating in an NIMH effectiveness trial, COMET (Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018), to national U.S. survey samples of community mental health clinicians (Aarons et al., 2010; Jensen-Doss, Ehrenreich-May, et al., 2018; Jensen-Doss, Haimes, et al., 2018). Consistent with hypotheses, results indicated that COMET clinicians had more favorable attitudes towards EBPs and MBC than clinicians in the two survey samples, even after controlling for demographic and professional characteristics.

These results raise questions about how well effectiveness trials represent a test of evidence-based practices in community mental health. If the goal of effectiveness studies is to examine EBP delivery in routine practice settings with frontline providers, researchers should critically evaluate the background characteristics, attitudes, and experiences of the clinician samples recruited into effectiveness studies. Our findings suggest that clinician samples in effectiveness trials hold markedly more positive attitudes towards EBPs than clinicians sampled from the community, which may translate into selection bias in recruitment and overestimation of the acceptability of EBPs. For example, given that COMET clinicians were recruited by their agency leaders to participate in the trial, these individuals may be more likely to value and engage in the research process (e.g., complete additional study measures, attend consultation calls), which may explain their favorable EBP attitudes. Indeed, recruitment procedures may have influenced the background characteristics and attitudes of the clinicians who participated in COMET, and thus findings may not generalize to effectiveness trials that use other recruitment strategies (e.g., agency-wide EBP implementation). Future studies should examine whether variability in clinician attitudes towards EBPs and in effectiveness trial recruitment procedures influences engagement in the research process.

Although it is possible that recruiting clinicians with more positive attitudes towards EBPs is an appropriate use of clinical research resources, future work should consider whether effectiveness trials are reaching clinicians with less favorable attitudes towards EBPs. Given that findings from effectiveness trials are often used to identify EBPs for widespread implementation, results from clinician samples with favorable attitudes may not accurately predict implementation outcomes in the broader workforce. It will remain important to critically examine how community clinicians and agencies are selected for effectiveness trials and to develop strategies to engage clinicians with less positive attitudes in the research and EBP evaluation process.

Many studies describe a “voltage drop” in effectiveness when interventions are used in community settings (Chambers et al., 2013). It may be that clinicians who participate in research trials are more willing to adopt EBPs like MBC because of beliefs about the importance of evidence-informed practice. This could help explain the small effect sizes observed between treatment as usual and EBP conditions in effectiveness trials, in that the usual care condition may be more “evidence-based” than routine clinical practice. Given that clinician sample selection is an important consideration for both the usual care and intervention conditions, findings suggest that the usual care arm of an effectiveness trial may warrant closer examination (Löfholm et al., 2013). Describing the “voltage drop” in treatment effectiveness may also require further examination of clinician-level characteristics, including demographic and professional variables (e.g., years of experience, theoretical orientation, productivity requirements, administrative responsibilities), which may influence EBP attitudes and, consequently, EBP use.

This study has several limitations. First, this was a retrospective study, and only a limited number of comparisons could be made about clinician attitudes across the EBPAS, MFA, and ASA–MF scales. Additionally, because the nested data structures of the national survey samples were unknown, clustering was not accounted for in the analyses. The generalizability of effectiveness samples could be assessed across a wider range of practice and implementation characteristics, including intervention fidelity, acceptability, and sustained use (Proctor et al., 2011). Future studies should also proactively investigate how clinicians who participate in effectiveness trials differ from those who choose not to (e.g., through a follow-up survey or qualitative interview) and identify predictors of early EBP adoption to help implementation efforts allocate EBP training and intervention resources more efficiently.

Second, although this study used national U.S. survey samples for comparison, survey samples are also research samples. It is possible that clinicians who participated in the surveys where the EBPAS, MFA, and ASA–MF were administered may have more favorable attitudes towards EBPs than those who did not participate. As such, it is possible that gaps between effectiveness samples and the general population of clinicians may be even more profound than those observed here. Third, this study did not assess or compare clinicians on professional or organizational factors, which may directly or indirectly influence EBP use and research participation. The importance of inner organizational context and outer system and policy context factors is well recognized in implementation frameworks like EPIS and should be further considered with respect to generalizability (Aarons et al., 2011).

Finally, given that the Aarons et al. (2010) sample was recruited over 10 years before the COMET sample, and that both national survey samples had participants who were significantly older than the COMET sample, findings may be confounded by time. In other words, the positive attitudes that COMET clinicians hold may reflect attitudinal shifts as training and implementation efforts have emphasized EBPs over the past decade. Comparison of clinician background characteristics to the broader mental health workforce is also warranted. For example, the American Psychological Association estimates that in 2019 about 70% of psychologists were female, 83% were White, and 55% were between 31 and 50 years old (American Psychological Association, 2022). These demographics are generally reflective of the participants in the national survey samples used in this study, most of whom were master’s-level providers. The COMET sample was more racially and ethnically diverse than the U.S. clinician workforce, with 32.9% identifying as Hispanic, which raises further questions about representativeness. Further study of how background characteristics moderate EBP attitudes over time is needed.

Overall, findings suggest that researchers should be cautious when drawing conclusions about the generalizability of findings from a single effectiveness trial to practicing clinicians. The use of innovative sampling methods, recruitment procedures, and study designs to encourage greater representation among clinicians and to evaluate EBP effectiveness in routine practice are important areas for further study. Studies could also examine how differences in attitudes about EBPs and the research process influence clinician engagement in implementation as well as professional advancement (e.g., being more likely to finish training requirements or to be promoted to clinical supervisor because of familiarity with EBPs). Understanding the factors that differentiate clinicians who participate in effectiveness research from those who do not can lend greater insight into the generalizability of study findings and inform the development of more targeted supports to increase EBP adoption among the broader population of community mental health clinicians.