Introduction

This is a particularly interesting and challenging time in the health promotion and disease prevention field: On the research-driven side, the U.S. government has mandated the use of empirically-supported programs at all schools and major efforts are underway to promote the diffusion and high-fidelity implementation of empirically-supported programs. On the community-driven side, multiple funding agencies now require community advisory boards for intervention research. The NIH, in its funding of community-partnered processes, proposed that these approaches have the potential to develop more effective interventions by: Increasing relevance of intervention approaches and thus likelihood for success; targeting interventions to the identified needs of community members; and developing intervention strategies that incorporate community norms into scientifically valid approaches (National Institutes of Health 2008). Thus, there is strong recognition of the value of high-fidelity implementation of empirically-supported programs and a growing recognition of the need for local stakeholders to be involved in the planning and contextual adaptation of programs that serve them.

Leaders in prevention science have called for further integration of community and research-driven approaches for maximal intervention effectiveness and sustainability and have highlighted the need for more scientific inquiry in the largely neglected but fundamentally important areas of contextual influences on program effectiveness and diffusion (Biglan et al. 2000; Gray et al. 2003; Wandersman and Florin 2003; Weissberg and Greenberg 1998). A central issue in the local adaptation of programs is the extent to which changes strengthen or undermine the “active ingredients” or core components of the program. Adaptations that are consistent with program theory would generally not be considered a problem, unlike those that undermine the core components (Greenberg 2004). Despite interest in pursuing a “best of both” integration of prevention science and community-collaborative approaches, there is surprisingly little empirical research to guide such efforts. While advocates of participatory approaches assert that the input of local stakeholders in intervention planning and implementation is essential, we know little about the adaptations local stakeholders suggest or enact, and if these adaptations would be expected to strengthen or weaken interventions.

Further, although youth and their behavior are frequently the target of prevention programming, interventions are typically designed and implemented by adults. Thus, the perspective of youth regarding the interventions delivered to them represents an under-utilized resource in prevention programming and adaptation. Major theoretical frameworks underlying health promotion and prevention programs highlight the importance of motivation for change and the personal relevance of the health behavior and outcomes for the individual (e.g. Bandura 1997; Fisher and Misovich 1990). Therefore, local adaptations that enhance perceived relevance for youth may improve effectiveness if they do not undermine core program components.

To help address gaps in the field, the present study provides a primarily qualitative in-depth investigation of the diffusion of two empirically-supported prevention programs for high school students in four ethnically-diverse urban schools. We explore three main questions regarding implementation in these settings: First, what changes does a select group of experienced teachers make as they implement these two empirically-supported violence and substance abuse programs, and why? Second, when asked to reflect on the program, what adaptations do teachers and youth suggest to make it maximally effective? Third, to what extent are these changes suggested by students and teachers consistent with program theory? Our study is designed to generate insights and hypotheses regarding the practices that influence implementation and adaptation in these settings rather than to test claims of evidence more generally (Biglan 2004; Biglan et al. 2000). Our report offers evidence to support theory-building in the nascent field of “Type II” translational research (Rohrbach et al. 2006).

Conceptual Frameworks of Adaptation

While there has been little empirical research on the adaptation of interventions to specific cultural groups and settings, there has been much conceptual work. Building on prior efforts (Resnicow et al. 1999), Kreuter and colleagues (2003) characterize common strategies for improving the cultural appropriateness of prevention programs as peripheral, linguistic, evidential, sociocultural, and constituent-involving. Peripheral refers to surface-level changes to make the program appear more relevant and appealing; e.g., culturally-consistent images; linguistic refers to using appropriate languages. Evidential strategies involve the presentation of data that highlight the relevance of the health issue for the particular group. Sociocultural or “deep” adaptations reflect changes to make the program consistent with deeply-held cultural values, beliefs, or practices. Research on school-based programs has specified domains expected to influence adaptation: Features of the school context, implementation resources, implementers’ attitudes towards the curriculum, and characteristics of the audiences targeted (Ringwalt et al. 2004a, b).

To guide practitioners in their implementation and potential adaptation of school-based prevention programs in the areas of pregnancy and sexually-transmitted infections, the U.S. Centers for Disease Control (CDC) and ETR Associates have worked with program developers to establish a framework and process for adaptations (ETR Associates and CDC Division of Reproductive Health 2007). Using a user-friendly “stoplight” metaphor, this framework uses reviews of effective programs to identify the core content, pedagogical methods, and implementation components that cannot be modified, in contrast to the elements that can be changed in response to the particular needs or characteristics of the population served.

Fidelity and Adaptation of School-Based Prevention Programs: Empirical Research

The substance abuse prevention field has benefited from systematic study of program fidelity for school-based programs; to our knowledge, there is no parallel research on violence prevention. Findings from a 1999 U.S. survey of a representative sample of school staff responsible for implementing drug prevention programs in middle schools indicate that most do not implement empirically-supported programs or use interactive teaching techniques found to be most effective (Ennett et al. 2003; Tobler et al. 2000). Further research found that middle school teachers with a large proportion of ethnic minority youth reported more program adaptations (Ringwalt et al. 2004a, b), and that 80% of teachers reported making adaptations in response to students’ issues (Ringwalt et al. 2004a, b). In other survey research conducted in 12 states, prevention coordinators reported much variability and uncertainty in how programs were implemented (Hallfors and Godette 2002).

While there is much to be learned from large-scale survey research, observational studies of program implementation in classrooms provide substantial benefits for assessing the quality of implementation, the type of adaptations made, and the rationale for these adaptations. Data from prevention coordinators or other district officials are not likely to provide valid information regarding actual implementation because they are typically not present in the classroom. Concerns have also been raised regarding the validity of teachers’ reports of their own implementation of prevention activities (Hansen and McNeal 1999). A recent study conducted by Dusenbury and colleagues examined the implementation of Life Skills Training (LST) (Botvin et al. 1995) by 11 teachers in urban middle schools. Observers rated the “re-teaching” of one session of the curriculum and found that all teachers adapted it. Although teachers made some adaptations that were viewed by the investigators as likely to strengthen the program, the majority of adaptations were considered likely to detract (Dusenbury et al. 2005). Most teachers reported that they supplemented the curriculum to make it more culturally relevant to their primarily African American students or to add enrichment materials. Agreement between teachers’ and observers’ ratings of adaptations was low, suggesting that some teachers are not aware of making changes.

Thus, research on real-world implementation of substance abuse prevention programs attests to the likelihood that empirically-supported programs will be modified as they are diffused into schools and classrooms. Guidance from program developers as to how to make adaptations that capitalize on the potentially useful insights of local stakeholders and that strengthen rather than undermine empirically-supported programs is essential (Dusenbury et al. 2005; Greenberg 2004). The present study’s examination of the changes made by teachers as they implement empirically-supported curricula extends the existing literature on the diffusion of school-based prevention in several unique ways. We focus on high school settings, an understudied area relative to elementary and middle schools; consider adaptation issues in the prevention of violence as well as substance abuse; and assess adaptations made by teachers across multiple sessions as delivered to students. Further, our observations were conducted across the teachers’ first and second implementations of the new curricula mandated by their districts, providing an opportunity to study this dynamic diffusion process.

Suggested Adaptations: Perspectives of Students and Teachers

The second question addressed by the present study concerns the adaptations to the curricula that students and their teachers suggest. Collaborative approaches to planning in the mental health and youth development fields provide models for engaging youth and adults in participatory planning and evaluation (London et al. 2003) and demonstrate the value of youths’ perspectives (Peirson et al. 1997). While eliciting the perspectives of youth likely occurs in the development of many prevention programs, and more rarely in the adaptation of others (Komro et al. 2004), no study to our knowledge has systematically characterized the changes suggested by youth stakeholders, their consistency with program theory, and areas of agreement with adult stakeholders.

Methods

Participants and Settings

Recruitment of Sites and Teachers

We used a purposive approach to recruit the settings and teachers for this research, consistent with the goals of the study. First, we identified two urban school districts in Northern California with highly diverse student populations with respect to ethnicity, immigration status, and socio-economic status, as this would provide an excellent crucible for examining the demands on teachers for serving their students. Second, we confirmed that these two districts were conducting a roll-out of an empirically-supported substance abuse or violence prevention program in their high schools, had a prevention coordinator responsible for this effort, and were providing formal training to teachers in program implementation. Although the districts differed in size and in their respective resources for implementing prevention curricula, both would be characterized as within the higher end of the implementation spectrum compared to other U.S. school districts because they had district-level prevention coordinators and provided formal training for teachers (Ennett et al. 2003). It is important to note that the districts were required to implement empirically-supported curricula in their high schools as a condition of federal funding and to be in accordance with federal and state guidelines.

The larger district was implementing Project Towards No Drug Abuse (TND) (Sussman et al. 2004) and the smaller was implementing Too Good for Drugs and Violence (TGDV) (Mendez Foundation 2000). With IRB approval from the university, the first author then sought the support of the district prevention coordinators and health officers in the school district research approval process. After meeting with the prevention coordinators in each of the two districts, the first author asked the coordinators to identify one to two teachers to recruit for the in-depth observational component of this research using the following criteria: (a) any high school health teachers who were planning to implement empirically-supported curricula in their classes that year; (b) who could be considered “expert” and likely to provide high-quality implementations of those programs; and (c) taught at a school with sufficient stability to enable the research to take place over 1 to 2 years. We sought to recruit the teachers likely to provide the highest-quality implementation because our theoretical focus was on the issue of adaptation rather than basic implementation problems. In the smaller district, the prevention coordinator identified one teacher/setting combination as appropriate for research; this teacher was studied for both years of the project across four semesters of implementing the curriculum. In the larger district, several teacher/setting combinations were identified and the first author recruited three teachers/settings to work with who met the above criteria and would provide highly diverse cases with respect to geographic location as well as the students’ ethnicity, risk factors for substance use, and aggregate achievement. We studied one of the teachers’ implementation over 2 years; the others participated for 1 year only. 
All of the teachers selected for in-depth observations taught the curriculum in the context of a semester-long health class, rather than in the context of regular academic instruction. In addition to the teachers recruited to participate in the in-depth observational component, the first author also conducted interviews with additional teachers to provide a broader view of implementation in each district.

Overview of Prevention Programs

The 12-session TND curriculum utilizes a “motivation-skills-decision-making” theoretical model focused on correcting cognitive misperceptions about drug use (e.g. that the consequences are positive, that the majority of youth are using), building social and self-control skills in order to bond to lower-risk groups, and strengthening rational decision-making processes to make informed choices about their drug use (Sussman et al. 2004). TND is the only universal substance abuse prevention intervention for high school students that meets rigorous criteria as a “model” program by the Blueprints for Violence Prevention (Center for the Study and Prevention of Violence 1998). The 14-session TGDV program for high school is based in a theory of change grounded in approaches including social norms (e.g., correcting misperceptions that targeted risk behavior is rampant), social cognitive theory (e.g. discussing role models, interactive role plays), and the social development model (Mendez Foundation 2000). TGDV is the only universal high school intervention with a major emphasis on peer violence prevention that is considered to be a model program by the U.S. Substance Abuse and Mental Health Services Administration (SAMHSA); four sessions are specifically devoted to the prevention of substance abuse and the other ten are devoted to non-violent conflict resolution, emotion regulation, and other communication and social skills.

Characteristics of High Schools

Enrollment at the schools ranged from 800 to 2,300 students. Schools ranged from 605 to 780 on the Academic Performance Index (API), a scale used by the state to reflect aggregate standardized test scores (possible scores = 200 to 1,000). Schools in the larger district included two comprehensive high schools and one arts “magnet” school. According to district data, the percentages of “socio-economically disadvantaged” students were 58% and 46% at the two comprehensive schools and 20% at the arts school. Asian American students were the largest group in each of the comprehensive high schools (School 1: 63% Asian American, 11% Latino, 7% African American; School 2: 37% Asian American, 19% Latino, 16% Filipino, 11% African American). At the arts school, European Americans were the largest group (32%), followed by Asian Americans (26%), Latinos (16%), and African Americans (10%). The school in the smaller district was a large high school with Asian Americans (52%) and African Americans (30%) as the major groups; 76% of its students were socio-economically disadvantaged.

Procedures and Data Collection

Our primary data sources were classroom observations of multiple sessions of the programs as delivered to students, interviews with students, interviews with teachers, and interviews with program developers. We first conducted extensive observation of the teachers’ initial implementation of the curricula in as naturalistic a manner as possible before engaging them in detailed discussions about what they thought worked well and what they would suggest changing about the curriculum. In all conversations with teachers about the curriculum, the research team expressed a neutral and supportive stance, acknowledging positive or negative comments raised by the teachers, emphasizing that we were not evaluating their performance, and suggesting that they contact their prevention coordinators with curricular questions.

Classroom Observations

The research team consisting of trained advanced undergraduate and masters’ level graduate students conducted a total of 163 observations of the teachers’ implementation of the respective curriculum in their classes over two school years. Because of teachers’ comments regarding meaningful differences in the demands of implementing the curriculum across their classrooms, a team member typically observed the teacher implement the same session with more than one class on the same day when feasible. Observers sat in the back of the room and wrote detailed field notes in order to capture the specific activities and wording used by the teacher, and the responses of students to the curriculum. Observers debriefed the teacher at the end of the session to follow up on any possible adaptations that had occurred and the rationale for these changes. Research team members received a minimum of 10 h of training in the curriculum and in observational methods prior to conducting observations; their field notes were reviewed in weekly meetings for quality control purposes. Research team members also participated in one additional full-day training by a certified trainer for TND.

Student Interviews

The research team, consisting of the first author and two masters-level graduate students, conducted a series of group interviews in each school setting to elicit students’ perceptions of the strengths and weaknesses of the prevention curricula and any suggestions for making it maximally effective. Fourteen focus groups (n = 85 students) were conducted in the larger district; 11 focus groups (n = 103 students) were conducted in the smaller district. To promote students’ recall and the richness of the data, interviews focused on the first half of the curricula were conducted at the mid-point and those focused on the second half were conducted immediately after the end of the implementation. Sample sizes indicated above reflect the number of “unique” participants in total; because these interviews were conducted at least twice in some classes, some students participated more than once. The interviews were approximately 45 min in length. Our initial 13 group interviews were scheduled during lunch (with pizza and $10 gift card provided); we then were granted permission to conduct the remaining group interviews during class periods dedicated to health that were not needed for instruction. This latter strategy enabled a longer time period and maximized participation. In total, 29 students (14%) declined to participate and/or did not receive parental permission. Interviews were tape recorded and transcribed verbatim for later coding.

To guide students in generating detailed feedback on the curriculum, the research team developed the following process for the in-class groups: (1) Students in each classroom divided into small groups of three to five students and each group was given the materials for a specific session; (2) they discussed their answers to several questions regarding what worked and could be improved; and (3) the whole class then discussed each session, starting with the report of the small group responsible for the session with supplementary comments made by other students.

Teacher Interviews

We conducted 22 formal interviews with teachers; all interviews were audio-taped and transcribed verbatim. This number reflects a combination of multiple interviews conducted individually with the four “target” teachers who participated in the in-depth observational component of the study, as well as interviews conducted with five additional teachers in the larger district after they had implemented the curriculum (see below). In order to promote accurate recall and generate specific comments, the teacher interviews were timed during or immediately after their implementation of the program, and the teachers’ guide for the curriculum was used as a prompt. The target teachers were interviewed an average of one to two times per semester of participation in the study. All teachers invited to participate in interviews accepted. Interviews with teachers focused on eliciting their report of any adaptations they had made to the curriculum, their rationale for making any adaptations, and their suggested adaptations to the curriculum. All teachers received a $20 gift card for their participation in each interview. In order to elicit additional perspectives on the curriculum, the first author conducted a group interview of all five “non-target” health teachers who attended an end-of-the-year training at the district office and who had taught the empirically-supported curriculum that year.

Consultation with Program Developers

After initial coding of the qualitative data (see below), the first author used interviews and written exchanges to gather data from the developers of the programs under study in these two districts. Prior to eliciting responses, we refined our database of suggested adaptations to create a list of “actionable” suggestions to which the program developers could respond. In this process, we eliminated 65 comments that were not specific enough to be coded as well as many comments that reflected the same idea from different people. The condensing of the suggestions resulted in the retaining of 97 suggestions from teachers (of an original pool of 163) and 111 suggestions from students (of an original pool of 333) for review. We then asked the program developers to characterize the extent to which the adaptations suggested by students and teachers were consistent with the core content and pedagogical components of the program. Two of the program developers for TND provided data via written comments in an MS Excel database in which they rated each suggested change in terms of whether this was “ok” or “green light;” if not, they explained their concerns or problems with the suggested change. Two TGDV program developers responded to each suggestion during an audio-taped telephone interview that was then transcribed. As discussed below, these responses were then coded systematically by the research team. We used a sensitive approach to address comments from the two program developers from each team: If either expressed a concern, the suggestion was flagged with the appropriate yellow or red code.
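The rule for combining the two developers' responses can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the research team's actual procedure: the function name `combine_ratings`, the severity ordering, and the choice to keep the more severe color when one developer says yellow and the other red are all hypothetical.

```python
# Hypothetical severity ordering for the stoplight codes (an assumption,
# not taken from the study).
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def combine_ratings(rating_a: str, rating_b: str) -> str:
    """Return the more severe of two developers' stoplight ratings.

    If either developer expresses a concern (yellow or red), the
    suggestion is flagged; green results only when both say green.
    """
    for rating in (rating_a, rating_b):
        if rating not in SEVERITY:
            raise ValueError(f"unknown rating: {rating!r}")
    # max() with a key picks the rating with the highest severity.
    return max(rating_a, rating_b, key=SEVERITY.__getitem__)


print(combine_ratings("green", "yellow"))  # yellow
```

Under this sketch a single expressed concern is enough to flag a suggestion, which errs on the side of over-flagging rather than missing a potential conflict with the program.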

Data Analysis

A combination of qualitative and quantitative analytic methods was used. Consistent with guidelines for qualitative analysis (Miles and Huberman 1994; Patton 2002), a multi-stage analytic process was used that combined inductive, deductive, and verification techniques to strengthen the reliability of the coding system and the validity of the findings. To facilitate analysis of key themes across interviews and observational field notes, we used an iterative process to develop and refine the coding of meaningful “chunks” within each data source (Miles and Huberman 1994; Ryan and Bernard 2003). Our initial coding framework was primarily descriptive, with the purpose of identifying all of the relevant data in specific domains of interest for more detailed interpretive coding. Thus, initial codes represented several broad domains: Field notes and teacher interviews were coded for any observed or reported adaptations, and the rationale for the adaptation if available. Transcribed interviews with teachers and students were coded to identify specific suggestions to adapt the program. These initial codes were entered as nodes in NVivo. Data were coded by at least two members of the research team to establish consistent application of codes to text; any discrepancies were resolved in weekly meetings and through revision of the codebook to ensure clear inclusion and exclusion criteria.

Using NVivo, we then conducted more focused coding of the data related to teachers’ adaptations of programs and students’ and teachers’ suggestions for adaptations (see Table 1 for details regarding codes). We coded the data for specific nodes across all respondents. To develop our coding specific to each node, we engaged in a similar iterative process as described above to develop a new set of codes and to check the reliability of our coding. In terms of deductive process, some codes represented types of adaptations outlined by existing conceptual models such as deep, surface, and evidential (Kreuter et al. 2003). With respect to inductive process, we refined the coding system to include themes that emerged from our analysis (e.g. content, bringing real people into the classroom, providing real-life examples to illustrate the curriculum, time limitations). For example, the following field note excerpt was coded as adding informational content: “Teacher deviates from session to talk about Narcanon and other topics.” Excerpts that were coded as a change in teaching approach include: “The teacher…says, ‘We are going to do this round robin style’” and “teacher reads student poem…which talks about relationships and emotions.” More than one code was applied to a chunk of text if appropriate. Sixty-five comments were not coded because they were too vague or not able to be understood.

Table 1 Definitions and examples of major codes

With respect to the program developers’ feedback, we coded their comments into two “stoplight” categories, theory and logistics, in line with the framework discussed earlier (ETR Associates and CDC Division of Reproductive Health 2007). “Green” for the theory stoplight signified that the program developer(s) expressed no theoretical concern, “yellow” indicated caution, and “red” signified a clear response that the suggested change would be inconsistent with the program theory. If the stoplight was coded yellow or red, a secondary code indicating the rationale was also included. Likewise, the logistical stoplight was color-coded in accordance with the program developers’ expression of concern regarding the suggested adaptation for non-theoretical, logistical reasons such as time considerations; a secondary code indicated the rationale. The program developers’ comments were coded by at least two members of the research team to establish consistent application of codes to text; any discrepancies were resolved in weekly meetings to ensure clear inclusion and exclusion criteria. Nine comments (six for TND; three for TGDV) did not receive a code as the program developer had indicated that they did not understand the meaning of the suggestion well enough to characterize it.

As an extension of the qualitative analysis, we developed an SPSS database containing all of the observed and suggested adaptations as well as their codes to provide a means of further exploring the relationships of interest using quantitative techniques. Each suggestion or change reflected one observation in the dataset, with each major code from the qualitative analysis representing a variable. For example, a student’s suggestion that the wording of a session be changed would be one observation to be coded in terms of the curriculum, district, session, date, source (student versus teacher), and type of adaptation (observed versus suggested) as well as 10 variables (yes/no) indicating if the data point had been coded for each of the kinds of adaptation in our coding framework (deep, surface, evidential, teaching format, addition of informational content, bringing real people into the classroom, providing real-life examples to illustrate the curriculum, time limitations, sequencing of activities, and classroom management issues). The program developers’ responses to suggested adaptations (red/yellow/green for theoretical reasons; red/yellow/green for logistical reasons) were entered as variables in the SPSS dataset.
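To make the structure of such a dataset concrete, the following Python sketch models one row as a record with the descriptive fields, the ten yes/no code variables, and the two stoplight ratings. It is an illustration only: the class name `AdaptationRecord`, the field names, and the example values are hypothetical and do not reproduce the actual SPSS variable names or any real data point from the study.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

# The ten kinds of adaptation listed in the text; the identifiers
# themselves are hypothetical labels.
ADAPTATION_CODES = (
    "deep", "surface", "evidential", "teaching_format",
    "informational_content", "real_people", "real_life_examples",
    "time_limitations", "sequencing", "classroom_management",
)


@dataclass
class AdaptationRecord:
    """One observed or suggested adaptation (one row of the dataset)."""
    curriculum: str            # "TND" or "TGDV"
    district: str              # "larger" or "smaller"
    session: int
    date: str
    source: str                # "student" or "teacher"
    adaptation_type: str       # "observed" or "suggested"
    codes: Set[str] = field(default_factory=set)  # more than one code allowed
    theory_light: Optional[str] = None      # developer rating, if any
    logistics_light: Optional[str] = None   # developer rating, if any

    def flags(self) -> Dict[str, bool]:
        """Expand the applied codes into ten yes/no variables."""
        unknown = self.codes - set(ADAPTATION_CODES)
        if unknown:
            raise ValueError(f"unknown codes: {sorted(unknown)}")
        return {code: code in self.codes for code in ADAPTATION_CODES}


# Hypothetical example: a student suggests rewording part of a session.
example = AdaptationRecord(
    curriculum="TND", district="larger", session=3, date="unspecified",
    source="student", adaptation_type="suggested", codes={"surface"},
    theory_light="green", logistics_light="yellow",
)
print(example.flags()["surface"])  # True
```

Storing the applied codes as a set and expanding them on demand keeps multiply-coded excerpts in a single row, which matches the one-observation-per-suggestion layout described above.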

Results

Adaptations Made By Teachers in Program Implementation

Our first question was: What changes do these teachers make as they implement empirically-supported violence and substance abuse programs in their classrooms, and why? In total, 99 adaptations were observed or reported by the four teachers who participated in the in-depth component of the study. The most frequently observed type of adaptation involved changes in the instructional format (45%), such as restructuring activities from individual work into group work or, less commonly, converting group work into individualized activities. Teachers’ primary rationale for making these adaptations was pedagogical in nature, such as their beliefs or experiences that more interactive or experiential learning opportunities make for effective teaching, or to address the diverse learning styles and behavior of their students. For example, one teacher reported that he “used the visual of bb’s dropping onto a metal plate to highlight deaths in the U.S. associated with drug use” and that he “posted on (the) board words to define and discuss for lessons.” In some cases, teachers made adaptations in response to unusually small or large classes that made it difficult to implement the curricular activities as indicated. For example, one of the teachers had almost 40 students in a small class space; the transitions into group work in some classes were observed to be time-consuming because of off-task behavior. In contrast, a small class of highly motivated students in another school had too few students to provide an audience for a talk-show activity, and the teacher helped them to develop some new roles.

The next most common type of adaptation (21%) was the integration of real-life experiences of the teacher or others into the curriculum; for example, teachers’ sharing their own experiences about how they deal with stress or communication problems with friends (e.g., a teacher talking about how she managed her anger when her car was damaged), bringing in a recent newspaper article to demonstrate a point from the curriculum, or giving a personal example about coffee drinking as a form of drug use. In a session on the cycle of positive and negative thoughts and behavior, for example, a teacher was observed to “digress to discuss a scenario that happened at school recently, which was a fight that was prevented from escalating.” When asked later about the rationale for the change, the teacher reported, “Yeah, that’s what good teaching is. You find something that’s happened today and then put it in there.”

Sixteen percent of adaptations were characterized as supplementation with additional resources, typically one that the teacher had used before and liked (e.g., sharing an excerpt from Chicken Soup for the Soul, adding a section from an existing health text). Eleven percent of adaptations were “surface” types of adaptations such as teachers’ changing of words and language to make the curricula understandable, a particular challenge in classes with a high proportion of English language learners. Seven percent of adaptations were made in response to time limitations, most commonly the teachers’ skipping activities that they deemed less critical; 5% were changes in the sequence of activities, ranging from minor re-arrangements to skipping an activity because the topic was covered earlier in the semester using different curricula.

We observed few adaptations that appeared to be explicitly socio-cultural or deep in terms of relating to the core values, beliefs, or practices of the specific cultural backgrounds of the students in these highly diverse classes. Teachers reported, however, that their adaptations involving supplementation with existing resources or real-life examples were frequently aimed at responding to the specific needs and “realities” of their students; these adaptations could thus be characterized as “mid-level,” if not “deep,” responses to the values and practices of the diverse urban youth represented in these classes. As outlined below, students’ and teachers’ suggested adaptations were much more explicit about the need for deep socio-cultural changes.

We next compared the frequency and type of adaptations that teachers made in their first versus second time implementing their respective curricula. Across sites, roughly the same number of adaptations were observed for each iteration, although more adaptations involving instructional format [24 vs. 16; \(\chi^{2}(1, N = 89) = 3.97\), \(p < 0.05\)] and the supplementation of resources [12 vs. 3; \(\chi^{2}(1, N = 89) = 7.25\), \(p < 0.01\)] were observed the second time the curriculum was taught. These counts demonstrating increased changes in some areas paralleled several of the teachers’ comments that they intended to “follow the script” the first time, and then “make it their own” the next time. Multiple teachers expressed feeling a tension between implementing the curricula in a high-fidelity manner and providing the most effective teaching for their students; e.g. “Fidelity held me back from communicating with the students,” and, “When a student gives you a question you can’t give answers.” No other significant differences in the types of changes in the first versus second iteration were found; due to small cell sizes for some codes when broken down by iteration, however, we were not able to test all analyses.

Suggested Adaptations

Our second research question concerned the kinds of adaptations that these teachers and youth suggest regarding the empirically-supported prevention programs that were implemented in their schools, and the extent to which these suggested changes are consistent with the program theory. A total of 494 suggested changes were elicited from the students (331 suggestions) and teachers (163 suggestions); see Table 2 for detailed information. The most common adaptations suggested by students concerned their desire for additional content (29%) such as discussing the perceived positive aspects of drug use; e.g., “talk about more pros and cons in consequences of marijuana,” “show consequences of what happens when you use drugs” and “talk about how to build self-esteem and how to meditate (like listening to music) when addressing drugs.” In some cases, students expressed that they valued the strategy being taught but that they needed more guidance in putting it to use; e.g., “I know I should think positively, but it is hard to remember it when I am in a difficult situation.” The next most frequent adaptations suggested by students were “surface” in nature (26%), concerning changes in the terminology or visual organization of the curriculum to make it more engaging and attractive to youth; e.g., including “realistic” language or including celebrities as examples. They made suggestions such as: “Change the word ‘dweeb’ cause I don’t hear that many people saying that,” and include “more examples and cartoons of drug abuse.” Students also provided feedback on teaching practices (20%) such as making the curriculum more interactive; e.g., “playing games… I guess it would make you want to participate more,” or “maybe have more hands on type things, not just lectures.”

Table 2 Suggested adaptations by students and teachers

We characterized 10% of students’ suggestions as socio-cultural insofar as they concerned the need for the curriculum to be more socially and culturally appropriate to their daily lives. In some cases, the students’ comments indicated that the strategy being taught could not be realistically applied without negative consequences; e.g., “I don’t think ‘I messages’ would work. If you are about to fight somebody you can’t like stop and like, ‘I feel mad when you say this to me.’” The need to fight back in order to salvage one’s reputation in the school or community was a theme echoed by male and female adolescents regarding the conflict resolution material in TGDV. Other socio-cultural suggestions focused on changing the depiction of drug use to reflect the specific risks and experiences in their own communities; e.g., “I have plenty of friends who smoke a lot and do a lot of drugs…and the drugs are only affecting them health-wise…They still have their friends; they still get along with their parents; they keep their grades up,” and, “A lot of drugs are more popular in [my city]… if the lesson…was just for a certain district it would be a lot more effective.”

Teachers’ suggested adaptations were primarily concerned with instructional practices such as adding homework and projects to deepen students’ reflection on the topics and to engage different learning styles (34%); e.g., “instead of scripted scenarios, have students go in small groups and create their own,” and “include modeling techniques to reduce stress.” These were followed by the need for more content, particularly about different types of drugs and their effects (16%); e.g., “add a discussion of legal drugs: caffeine, pain killers, over the counter drugs…” Thirteen percent of teachers’ suggested changes focused on surface aspects of the curriculum such as language and including more visuals; e.g., “Change the icons or names of people” to fit role models respected by the specific student population and “use words like ‘hyphy’ (a local hip-hop genre) to have credibility with students.” Beyond changes to make the language more appealing or relevant to the youth, there were also comments that some low-performing and limited-English students had difficulty with the vocabulary. Consistent with the adaptations noted earlier, 12% of teachers’ suggestions called for the use of additional resources such as DVDs; e.g., to tell the “real stories of people and their use and recovery.”

Acceptability of Suggested Adaptations with Core Program Components

We examined the data in which program developers characterized students’ and teachers’ suggested adaptations with respect to the core components of each program. As described earlier, we had condensed these suggested adaptations prior to review by the program developers to eliminate repetition of the same suggestion. Regarding theoretical concerns, we found that approximately three-quarters of all suggestions made for each program were “green light” adaptations from the perspective of the program developers (72/104 for TND; 79/104 for TGDV). Chi-square analyses indicated that there were no differences in the theoretical consistency of teachers’ and students’ suggested adaptations for either program [TND \(\chi^{2}(2, N = 103) = 0.31\), \(p = 0.86\); TGDV \(\chi^{2}(2, N = 80) = 0.71\), \(p = 0.70\)]. Teachers and students were equally likely to propose “green-,” “yellow-,” and “red-light” changes. Regarding logistical concerns, we found that the suggestions made by teachers and students were equally likely to be judged as “green,” “yellow,” or “red” [TND \(\chi^{2}(2, N = 75) = 1.64\), \(p = 0.44\); TGDV \(\chi^{2}(2, N = 76) = 0.72\), \(p = 0.66\)].
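
As a reading aid, and assuming these comparisons are standard Pearson chi-square tests of independence on a 2 × 3 table (stakeholder group by “green,” “yellow,” or “red” rating; the exact table layout is an assumption, as it is not spelled out here), the statistic takes the following form, where \(O_{ij}\) and \(E_{ij}\) are the observed and expected counts in row \(i\) and column \(j\), \(R_{i}\) and \(C_{j}\) are the row and column totals, and \(N\) is the total number of coded suggestions:

\[
\chi^{2} = \sum_{i=1}^{2}\sum_{j=1}^{3}\frac{(O_{ij}-E_{ij})^{2}}{E_{ij}}, \qquad E_{ij} = \frac{R_{i}\,C_{j}}{N}, \qquad df = (2-1)(3-1) = 2
\]

Under that assumption, the two degrees of freedom reported for each test follow from the table dimensions, and a small statistic with a large p value indicates that the distribution of ratings was similar for teachers and students.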

We next explored patterns in our codes regarding the kinds of suggested adaptations to each program that were coded “red,” “yellow” or “green,” as well as the rationales provided by the program developers. For TND, multiple students’ suggestions were rejected by program developers because they carried a risk of “deviancy training” (Dishion et al. 1996) or appeared to use “scare tactics” shown to be ineffective in prior research. Examples of red light adaptations suggested by students included asking for outside speakers to talk to them about their experiences of addiction and recovery, more realistic portrayals of “typical high school students using drugs” as characters in role play activities, “visuals to emphasize the effects of using marijuana,” and having students “pressure other students to stop using.” Recommended adaptations by students that were coded “yellow light” included suggestions to provide evidence from research studies to “back the session up,” which was deemed potentially distracting by program developers. Green light adaptations suggested by students included providing more opportunities to practice acting in an assertive manner, and discussing how “jealousness can lead to violence.” There were multiple green light suggestions to enhance the surface appeal of the curriculum by using “more pictures” in order to “catch our attention.” As one student described the curriculum: “I’m looking at it and I’m like ‘another packet to read,’ but then if there’s like color to it I’m going to be like ‘oh what does this say?’”

For TGDV, the program developers expressed few concerns with students’ suggested changes; they viewed the use of outside speakers, the substitution of local characters into role play activities, and students’ addition of their own culturally-relevant role models into the curriculum as consistent with the program theory, with the caveat that these replacements have “clean” backgrounds with respect to their history of drug use and violence. Examples of green light adaptations included students’ suggestions to “teach automatic decisions ’cause when I get angry or if I’m going to have a fight I just blank out,” “help find ways to manage your emotions,” and use “more realistic language” in the curriculum. As in the case of TND, program developers clearly expressed that any discussion of drug use or fighting that could appear as condoning the behavior or could be characterized as a “harm reduction” approach would be unacceptable.

As indicated earlier, teachers’ suggestions for instructional changes included using small group discussions rather than didactic instruction for some lessons, using experiential health education exercises to dramatize the effects of drug use, adding homework assignments, and re-organizing the structure of some sessions. For TND, some of the teachers’ suggested changes to in-class sessions were characterized by the program developers as “red light” because they would replace empirically-tested activities with untested activities, or might distract from the main goals of the session (e.g. spending additional time on family roles in chemical dependency). Multiple teachers suggested more information on drugs other than alcohol and marijuana and on the effects of drugs on students’ physical and mental health. The majority of these suggested changes were not endorsed by the program developers, who cited a lack of evidence that a greater focus on knowledge of drugs would affect students’ motivation and behavior regarding drug use. An example of a suggested adaptation that was coded “yellow light” was a teacher’s suggestion to do an activity in which students write down their stressors on pieces of paper and “throw them away.” Acceptable or “green light” changes included altering the look of the curriculum (e.g. “more cartoons”) and minor pedagogical adaptations, such as, “have the students prepare for the talk show before the actual session by giving a warm up” and providing students with opportunities to write in a journal to respond to questions about the curricular content.

For TGDV, the program developers expressed that changes in instructional groupings or adding content to address the needs of their students would be acceptable if it did not involve the elimination of curricular activities. One change deemed essential by a teacher but considered to be “red light” by program developers was the replacement of a session on stereotypes in a racially charged school. As the teacher reported after trying it out, “The dialogue was dangerous. No way is this a safe place to have that dialogue.” Instead, the teacher used a session on stereotypes from another curriculum that used potatoes rather than people to discuss differences. As considered in the Discussion, this example illustrates the tension between local context and program fidelity: The teacher used her experience and judgment to make an adaptation that made sense in her classroom and school, but program developers are understandably reluctant to endorse replacing activities from their curriculum. Green light adaptations included the addition of discussions about how to “save face and not lose respect” when practicing non-violent strategies, supplementing lessons with other curricula and activities (e.g., adding a stress test to a session about stress), and changing the wording for low-performing or limited English students.

Discussion

This study used qualitative methods to extend our understanding of the diffusion of empirically-supported prevention programs in two main areas. First, using sensitizing concepts from existing conceptual frameworks, we identified the kinds of adaptations made by competent and experienced teachers in four urban, ethnically-diverse high schools as they implemented empirically-supported prevention programs in their classrooms, and their rationale for these changes. Second, we elicited and systematically analyzed the feedback of teachers and students regarding how to maximize the effectiveness of these two programs in preventing drug abuse and violence, assessing the extent to which suggestions by these two key stakeholder groups were consistent with the core components of the program as articulated by the program developers. Our findings regarding teachers’ adaptations to programs indicate that all teachers made adaptations to their respective programs, in line with prior observational research on Life Skills Training implemented in middle school settings (Dusenbury et al. 2005). Here, we observed that teachers’ most frequent adaptations involved changing the instructional format of the curricular activities to “work” with what they perceived to be the learning needs or interests of their students, illustrating the curriculum with real-life examples, or altering the wording.

With respect to the adaptations to the curricula suggested by students and teachers, we found that these two groups were equally likely to suggest theoretically- and logistically-acceptable adaptations as judged by the program developers. Not surprisingly, students tended to suggest “surface” adaptations to make the substance abuse curriculum (TND) even more realistic and engaging for their age and cultural backgrounds. The majority of these surface-level changes were judged to be theoretically acceptable by the program developers. Students’ requests for content, however, were viewed more ambivalently on theoretical grounds when the content included learning about positive aspects of drug use, which raised program developers’ concerns regarding deviancy training or the possibility that the program might be erroneously viewed as taking a harm reduction stance.

While almost all of the teachers’ proposed adaptations appeared to reflect reasonable educational practices in terms of engaging students and promoting their learning of relevant concepts, some suggestions entailed the use of health education strategies that are not empirically-supported (e.g. looking at tar to show the effects of smoking on the lungs). This tension between teaching practices viewed as effective by teachers and the recommendations of research studies has been demonstrated in prior observational studies of middle school health education (Hansen and McNeal 1999) and may be particularly accentuated in the current U.S. educational climate focused on the accountability of teachers regarding specific knowledge standards. But while Hansen and McNeal’s earlier research indicated that middle school teachers tended not to use interactive lessons shown to be most effective in health promotion, the high school teachers in this study primarily sought to increase the interactivity of the lessons while also suggesting more of a focus on knowledge transmission regarding drugs and their consequences.

The strong teachers in our sample viewed good teaching in part as directly connecting with students and making the material “their own” by tailoring and improvising activities to best meet their perceptions of their students’ needs. Beyond the capacity-related challenges encountered in the scaling up of empirically-supported prevention programs in urban schools discussed extensively in the prevention science literature, the present study identifies a set of practice-related tensions among this group of competent and experienced teachers in ethnically-diverse urban high schools that raise new questions with meaningful implications for diffusion practice and future research efforts. The implementations studied here reflect the intersection of prevention science, secondary school education, and health education. Future research should explore if and how practice-based pulls for adaptation observed here are evident in other districts and with other kinds of prevention curricula, and examine the value of engaging teachers in collegial dialogue as part of the diffusion of empirically-supported programs. Another key step is to examine contextual processes at the district, state, and national levels that provide a “press” on teachers and students within the prevention system (Wandersman et al. 2008).

The existing literature on implementation and adaptation in school-based prevention has primarily emphasized middle schools and substance abuse prevention. The present study’s focus on high schools is of particular theoretical interest because high school students are further along in the trajectory of some risk behaviors, there are fewer universal prevention programs to serve them, and far fewer that have been shown to be effective. For example, TND is the only universal program for high schools that meets the most rigorous criteria as a model program in the Blueprints system, and TND and TGDV are among the small number of model programs for high schools in the SAMHSA system. This shortage of empirically-supported violence and substance abuse prevention programs for older adolescents suggests that programs will be delivered to highly diverse populations. Further, because older adolescents are more likely to have experience with risk behaviors, engaging them may be more challenging and their teachers may face more pulls to adapt programs to make them relevant.

The present study is the first to our knowledge to systematically investigate the consistency of local stakeholders’ suggestions with the core components of the program. Our finding that high school students provided meaningful and, for the most part, theoretically-consistent suggestions to enhance the effectiveness and relevance of the programs implemented in their school has several implications. First, many of the students’ “surface” suggestions did not affect the core activities of the program and raised little in the way of theoretical or logistical concerns. Some of these minor changes, however, were viewed by students as critical in terms of how they perceived the program as being interesting versus, in their words, “boring” or “for little kids.” Major theoretical frameworks that guide many prevention efforts emphasize the importance of engagement and perceived relevance as critical to social learning (Bandura 1997), raising the possibility that flexibility in implementation that allows students’ “green light” changes could potentially strengthen effectiveness.

Strengths of this study include the extensive naturalistic observations of the two prevention programs as they were implemented by teachers, the integration of multiple methods for data gathering and analysis, and rigorous attention to increasing the “trustworthiness” of the findings by taking systematic steps to establish the reliability of the analysis among coders and the validity of our interpretation of data (Patton 2002). It should be noted that we cannot rule out the possibility that the presence of observers in the classroom may have influenced the implementation despite our extensive efforts at maintaining a neutral stance. Despite this risk of expectancy effects with observational methods, prior research suggests that this approach yields more sensitive and valid data about teachers’ program implementation (Dusenbury et al. 2005).

Although we used quantitative techniques to extend our thematic analyses, the present study did not seek to make generalizable claims about our processes of interest in the larger population of schools. Rather, our purposive sampling strategy was designed to study the higher end of implementation quality in two ethnically diverse urban districts in a particular context and time. Accordingly, we selected teachers who had been trained in the curriculum and were expected to implement it well. Further, these curricula were taught as part of semester-long health classes. Thus, teachers’ time spent on the curriculum was not in direct competition with academic instruction; we might see more dropping of sessions if this were the case. While these conditions likely enhanced adherence, it is also possible that health teachers are more likely than typical teachers to resist new health curricula. Several teachers reported that they used time after the implementation of the empirically-supported curriculum to supplement it with activities that have not been empirically tested. This may account for our findings that teachers suggested the need for additional content and socio-cultural adaptations but were not observed to make many of these adaptations within the mandated curricula. The patterns of suggested and enacted adaptations found for the two specific prevention programs studied here may not generalize to other prevention programs or to other types of educational and social interventions in schools. Programs differ on characteristics that could influence local fit and pulls for adaptation including theoretical grounding, delivery approaches, and the flexibility permitted in implementation. Ideally, future studies of other programs and settings will help to generate a broader knowledge base to enable meta-analytic investigation of questions generated here.

As it is unlikely that any one program can use wording and examples that will be perceived as relevant to all adolescents, specifying the parameters that practitioners can use to modify their programs without undermining core components is critical (Dusenbury et al. 2005). Evidence “farming,” a process in medicine that describes the integration of practitioners’ clinical experiences into a more systematic knowledge base to supplement evidence from randomized clinical trials (Hay et al. 2009), provides a model that may be of use to dissemination science in the school-based prevention area. It is also important to note that eliciting and responding to teenagers’ perspectives on programs where feasible could provide benefits because control is particularly salient for motivation in adolescence (Eccles et al. 1993). Future research that builds on the findings of the present study is needed to investigate whether the inclusion of adolescents’ feedback in the adaptation of prevention curricula can strengthen program impact, either by making adaptations to the curricula themselves to better fit the audience, by fostering adolescents’ engagement and buy-in to the process of implementation, or both.