Characterizing Science Classroom Discourse Across Scales


Classroom discourse has long been analyzed as a marker for understanding how learning in schools is socially organized (Mehan, 1979). Lemke’s (1990) seminal analysis of what it looked like to “talk science” sparked serious interest in understanding science classroom discourse, especially in how discourse patterns structure what it means to know in the disciplines. As science education has taken an increasing interest in practices of explanation and argumentation and how those practices contribute to understanding the epistemic aspects of science (Duschl, 2008), discourse analysis has featured prominently in studies of science learning focused on students’ appropriation of various science practices. This work highlights that such practices are inherently dialogic (Kelly, 2014).

Reform efforts are thus, at their core, efforts to change the nature of science classroom discourse. Reorganizing the structure of classroom tasks and participation necessarily shifts the nature of classroom discourse (Erickson, 1982). Organizing classroom tasks such that students have to make decisions about how to engage in particular practices creates discursive spaces for students to negotiate what makes those practices authentically scientific (Ford & Wargo, 2007; Lehrer & Schauble, 2004; Manz, 2016; Ryu & Sandoval, 2012; Stroupe, 2014; Warren & Rosebery, 1996). These and other studies of science classroom discourse highlight the crucial role teachers play in framing discourse as merely “doing the lesson” or authentically “doing science” (Jiménez-Aleixandre, Bugallo Rodríguez, & Duschl, 2000). Consequently, efforts to prepare teachers to teach to the latest reforms, like the Next Generation Science Standards (NGSS) in the USA, focus on “core” practices of teaching that center the promotion of rich classroom discourse (Kloser, 2014; Windschitl, Thompson, Braaten, & Stroupe, 2012).

Accepting that classroom discourse is a central, perhaps the central, aspect of student experience in science classrooms means that the field needs methods that allow discourse to be characterized in ways that reliably show differences between teachers, classrooms, and schools that reflect real variations in teaching practice and student experiences. Here, we present an approach to characterizing classroom discourse that directly supports qualitative comparisons between classrooms, both to show differences between teachers and to show change over time. We are especially interested in characterizing how teacher talk indicates the nature of the scientific opportunities students have in classrooms and how students consequently respond to those opportunities.

We first summarize common approaches in science education to characterize teaching practice. We then describe our method and illustrate its application. We close with considerations of the viability of our method as a tool for understanding variations in science classroom discourse.

Current Tools for Characterizing Science Teaching Practice

Characterizing science teaching practice in ways that support analyses that could be used by researchers, teacher educators, and others to understand the nature of student learning experiences has proven quite difficult. We discern three broad approaches to this task, each with strengths and weaknesses.

Observation Protocols

For decades, rounds of science education reform have been attended by the development of observation protocols ostensibly aligned with them. These aim to determine whether any particular lesson adheres to researchers’ judgments of the crucial aspects of reform. Consequently, protocols are developed along particular dimensions, with each dimension rated or scored according to a rubric. A number of such observation protocols were developed to assess the quality of inquiry science teaching in the wake of the first US National Science Education Standards (e.g., Horizon Research, 2000; Piburn & Sawada, 2000; Schultz & Pecheone, 2015). The continuing development of protocols (e.g., Nava et al., 2019) suggests the complexity of the effort and lingering dissatisfaction with available instruments.

These protocols are used to characterize an array of features of instruction, usually rated on some scale from worse to better in relation to a standard like “reform teaching” or “quality.” An advantage of these sorts of instruments is that they tend to be developed over long periods of time, derived from relevant research, and put through validation efforts. They also employ external observers to reduce the biases of self-report instruments. They have demonstrated value for characterizing large-scale trends in teaching (Banilower, Smith, Weiss, & Pasley, 2006) in ways that are particularly useful for policy analysis.

Observation protocols, however, generally do a poor job of characterizing classroom discourse. While most recent instruments attend to features of classroom talk, often in science-specific ways, they rely on high-level inferences from observers to produce ratings of target features of discourse. These high-level judgments become the data, severing the connection between observation codes and observed interactions. Consequently, assigned codes often mask analytically useful distinctions in how students and teachers interact.

Self-report Instruments

Another category of tools used to characterize science teaching is self-report instruments. Some rely on teachers to assess how often they use certain practices, as in “How often do you… discuss students’ prior knowledge or experience related to the science topic or concept.” (Hayes, Lee, DiStefano, O’Connor, & Seitz, 2016). While such items can help document teachers’ perceptions of their practice, they cannot illuminate, in this case, what different teachers might mean by “discuss,” how those discussions unfold in real classrooms, or the consequences of those differences. More fundamentally, self-report instruments are not generally attuned to discursive features. Not only does this make them analytically inappropriate for researchers interested in discourse, but it also means they obscure very real and consequential variations in discursive practices and students’ experiences.

Interaction Coding Schemes

As suggested at the start, many studies of science teaching rely on analysis of classroom interactions and discourse. Some of these aim to illustrate processes of interaction and their consequences for learning (Ryu & Sandoval, 2012; Stroupe, 2014), while others are intended specifically to support comparisons of teaching practice (Sandoval et al., 2019; Berland & Reiser, 2011). Generally, such work follows the same pattern of identifying relevant episodes of instruction and coding particular utterances or interactions with a scheme that offers analytic leverage for the particular study. These schemes are generally not intended to be scalable or shareable.

The great advantage of this method is that it elucidates how particular forms of interaction and talk produce particular forms of learning, in ways that are usable as models of teaching practice. This approach can be tied to particular task or activity structures, such as how variations in the “launch” of lessons are consequential for subsequent student engagement (Kang, Windschitl, Stroupe, & Thompson, 2016). Such efforts can draw attention specifically to important discursive aspects of teaching practice and their consequences (O’Connor, Michaels, Chapin, & Harbaugh, 2017).

How such efforts scale is unclear. One issue is that analytic schemes are typically idiosyncratic. Another is that researchers do not often describe analytic schemes and methods in enough detail for others to take them up. In particular, assigning codes to discourse requires judgments about not just how to code some set of utterances but how much talk to code as a meaningful unit, and there is no real infrastructure for training researchers on techniques developed by others. A major obstacle is simply that fine-grained analysis of discourse is time and labor intensive, making it difficult to do with large samples of teachers or lessons.

A promising approach to resolving some of these issues is the low-inference discourse observation (LIDO) protocol developed from O’Connor and Michaels’ long-running work on “accountable talk” (O’Connor, Michaels, & Chapin, 2015). Their approach involves coding teacher and student utterances during whole class discussions from transcripts of video-recorded lessons. The aim is to link specific sorts of teacher talk, such as asking open-ended questions or pressing students to elaborate on their reasoning, to specific forms of student talk, such as elaborated answers to questions or responses to other students’ ideas. We adopted their approach in the work that we report here, and it does appear sensitive to changes in classroom discourse in whole class discussions (Sandoval et al., 2018). The scheme remains time intensive, however, and while it is quite sensitive to differences in whole class discussion, it does not capture other features of classroom discourse that may be of value. Thus, we set out to develop an approach better suited to our goals.

A Multilevel Approach to Characterizing Discourse

We developed the approach that we describe here from a need to trace changes in teaching practice among a group of 25 science teachers engaged in a multiyear professional development (PD) program to teach to the NGSS. Given our view of science practice as fundamentally dialogic, we identified changes in classroom discourse as a primary marker of teaching change. For the reasons outlined above, we judged that existing observation and self-report instruments could not characterize practice at a grain size likely to be sensitive to incremental changes in teaching. Yet, the need to characterize changes in teaching both across the full cohort of teachers and longitudinally suggested traditional discourse analysis was infeasible. We needed a method, suitable for longitudinal analysis of our sample, that could identify and demonstrate qualitative differences in teaching derived from differences in classroom discourse.

Derived from Interactional Episodes

Our approach is derived from interaction analysis (Erickson, 1992; Jordan & Henderson, 1995). This includes analysis of talk and other aspects of interaction, including the material resources with which interlocutors engage. These resources can include curricular materials (e.g., worksheets) and investigative materials, including both the physical materials that might be manipulated in an experiment and the inscriptions of data collection and analysis (e.g., tables or graphs). Other aspects of interaction that could be potentially meaningful for understanding variations in teaching practice and their influences on student activity, such as gesture or prosody, may be included as needed.

The first step of interaction analysis involves identifying and bounding episodes of analytic interest. Our aim was to develop a method to identify and characterize such episodes without requiring fine-grained analysis of talk for each episode. We hoped to avoid transcription whenever possible yet maintain a close enough connection to the video record that fine-grained analysis could be done when desirable. Our first step in analysis, then, is an initial review of the video record of a lesson to log the broad segments of activity in terms of the academic task structure (Erickson, 1982). Task structures change when either the social/physical arrangements of participants change, e.g., from whole class to small group to individual activity, or when the academic task changes, as in a move from the lesson warm-up to the introduction of the main lesson focus. Logging involves recording the timestamp of when a new task structure starts, a coarse code of participation structure (whole class, small group, or individual), and a 1–2 sentence description of the task.
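For readers who keep such logs digitally, the segment record described above can be sketched as a simple data structure. This is an illustrative sketch rather than part of the protocol itself; the field and code names are our own:

```python
from dataclasses import dataclass

# Coarse participation codes recorded in the log
PARTICIPATION = {"whole_class", "small_group", "individual"}

@dataclass
class TaskSegment:
    start: str          # timestamp at which the new task structure begins
    participation: str  # one of the coarse PARTICIPATION codes
    description: str    # 1-2 sentence description of the academic task

    def __post_init__(self) -> None:
        if self.participation not in PARTICIPATION:
            raise ValueError(f"unknown participation code: {self.participation!r}")

# Hypothetical log entries for the opening of a lesson
log = [
    TaskSegment("00:00:00", "whole_class", "Teacher reviews the warm-up question."),
    TaskSegment("00:06:10", "individual", "Students draw initial models of the phenomenon."),
]
```

A log of this kind preserves the link back to the video record: any segment can be relocated by its timestamp for finer-grained analysis later.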

Derived from/for Analytic Categories

Once we have logged a video record, we review it a second time to identify episodes of interest. While this review is grounded in the particulars of discursive interactions, our analysis is organized by a priori categories that we view as central to our overarching interest in capturing variation in teaching practice. These particular analytic categories were developed to capture variation in relation to demands identified with “ambitious” science teaching (Windschitl, Thompson, & Braaten, 2018) that is student-centered, focused around science practices, and richly dialogic.

We define these analytic categories in relation to a positive or negative valence that captures the locus of each category along a rough continuum. For each category, an interactional episode can be coded 1 or −1 to capture the dichotomous poles of the category or 0 if the locus is in between or ambiguous.

Teacher Framing of Activity

The notion of framing was developed in sociolinguistics to characterize how people interpret the purpose of and their expected roles in an interaction (Bateson, 1972; Goffman, 1974). In science education, framing has been taken up most notably by Hammer and colleagues to analyze how students interpret the epistemic aims of classroom activity (Elby & Hammer, 2010; Rosenberg, Hammer, & Phelan, 2006) and to highlight the critical role teachers play in framing academic tasks (Berland & Hammer, 2012). Framing is primarily done by teachers as they launch lesson activities (Kang et al., 2016), although Rosenberg et al. show how teachers can reframe the purpose of ongoing activities in response to students’ interpretations of the task. While students can potentially resist a teacher’s framing of a task, we code framing primarily from teacher talk.

We are interested in whether teachers frame students’ work in relation to understanding phenomena in the world (1) or as understanding canonical science content (−1). We code a phenomenological orientation when teachers explicitly frame the purpose of learning science concepts and practices as addressing a question or problem in the world. For example, a group of teachers that we worked with developed a unit on cell structure and function, including mitosis, anchored to the phenomenon of how skin wounds heal. When instructional tasks were framed as directly addressing this question, we coded framing as positive (1).

The content framing lacks this orientation to the world and focuses on conceptual mastery for its own sake. If framing language refers to the science topic or concepts that are the focus of an activity but does not link the topic or concepts to a phenomenon, we code such framing negatively (−1). For example, in the wound healing unit, if a teacher started a lesson by announcing, “Today we’re going to learn about mitosis, or how cells divide” but did not link that process to some aspect of healing, then that framing would be coded as oriented toward the science only (−1).

When activities are not clearly framed, are ambiguously framed, or are framed toward problems that are not related to an anchoring phenomenon, they are coded between these two poles (0). Kawasaki & Sandoval (2019) describe the case of a teacher introducing an experiment on inertia in which the task was to explain the trajectory of the fall of a penny when the card holding it above a cup is rapidly pulled out. This is a phenomenon, but the anchoring phenomenon of the unit in which this lesson occurred was to explain the forces at play when two automobiles crash into each other. The teacher linked the penny drop experiment to the concept of inertia but not to the anchoring crash.

Locus of Epistemic Agency

Epistemic agency has recently become a focus of attention in science education and more broadly (Elgin, 2013). We define epistemic agency as the locus of authority and accountability for what counts as knowledge in the classroom and the acceptable means for knowledge construction and evaluation. Here, we identify epistemic agency in teacher-student discursive interactions in which aspects of authority or accountability arise. These include interactions around what problems or questions are worth investigating, who decides how to conduct investigative or analytic tasks, and what standards should apply to classroom work, such as experimental design and models and explanations.

Episodes in which teachers are positioned as the primary authority for knowledge and standards of accountability are negatively coded (−1). These include episodes of traditional IRE discourse (Mehan, 1979), because if students are responding only to known-answer forms of questions, with the teacher in charge of evaluating responses, then the epistemic locus is with the teacher, as the proxy of the discipline. Teacher talk suggesting that students should arrive at the same answer in completing a task, or task instructions that prescribe how to complete tasks, would be coded as positioning the teacher as the primary epistemic agent.

If students are positioned as the locus of authority and accountability, the episode is positively coded (1). These include episodes where teachers explicitly grant students autonomy, whether by asking them to decide how to do some task or evaluate some answer, by putting them in dialog with each other to resolve disagreements, or by otherwise granting them authority to determine procedures or criteria for accomplishing tasks. The science education literature is replete with examples of this (e.g., Sandoval et al., 2019; Engle & Conant, 2002; Lehrer, Schauble, & Lucas, 2008; Manz, 2016). For an episode to be scored positively, we require evidence in student talk that students assume the authority and autonomy offered to them.

If students are given limited agency over non-substantive features of a task (e.g., designing their own data table in an otherwise scripted experiment), that is coded as neutral (0). We also use this code in episodes where the locus of epistemic authority is unclear, either because it is not explicitly raised by teachers or students (including cases in which students are silent or their contributions cannot be distinctly heard) or because the teacher offers agency but students do not take it up (e.g., by insisting the teacher provide standards for evaluating answers).

Version of Science Practice

Ambitious science teaching envisions students engaging in authentic versions of science practice. In our view, the forms of practice that students might engage in can be either highly schooled (−1) or more legitimately scientific (1) versions. This code is less discursively defined in some respects, as it is often reified in material aspects of task structure and in the location of tasks within units of instruction. We consider a modeling task presented as an assessment of understanding at the end of a sequence of instruction to be a schooled version of modeling practice. Similarly, a traditional verification lab would be scored as schooled science because such labs typically present scripted procedures with no room for students to substantively influence the conduct of the work. Fine-grained interaction analysis is often not necessary to make this judgment.

We consider a task in which students revise an earlier version of a model in light of new evidence or concepts to be a more scientific version of modeling practice (1). This would especially be the case when talk around model revision demonstrated that students had authority to decide what and how to revise and how to evaluate the effects of those revisions on the model’s quality. Similarly, if students argue about the methods they use to accomplish an investigation or make substantive choices about how to design experiments, we would consider those more scientific versions of practice.

We would code between the two poles (0) an episode in which students are tasked with revising an earlier model of their own construction but the teacher sets the criteria for evaluation and students have only limited control over what to modify. We also code as neutral episodes in which the amount of responsibility students have for deciding how to conduct a specific scientific practice is ambiguous or indeterminate.

Relations Between Categories

How instructional tasks are framed affects their implementation and consequent student opportunities to learn. Yet, framing activity around explaining phenomena could be done while maintaining the locus of epistemic agency with the teacher. For the locus of epistemic agency to be with students, it has to be offered by the teacher and taken up by students. Agency and version of practice are more closely related, in that more authentic versions of practice are partially characterized by higher levels of student agency over how work can be accomplished. Consequently, activity coded as scientific practice (1) is likely to include at least some shared student epistemic agency.

Coding Analytic Categories

Our coding of these categories begins with an examination of each logged segment of a lesson to see whether it addresses Framing, Agency, or Version of Practice (F/A/V). If it appears so, the video segment is reviewed to understand the substance of the interaction in relation to one of the three analytic categories and to bound relevant episodes. For example, a single segment of a teacher giving the instructions for the main task of a lesson typically includes an episode of framing and may often include instructions for accomplishing the task that indicate the locus of epistemic agency. These episodes within a segment are marked with a start and end timestamp and coded as described above. Multiple events in each category (F/A/V) can be coded within a single lesson.
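The episode records produced at this step can likewise be sketched as data. Again, this is only an illustration of the record-keeping, with field names of our own choosing:

```python
from dataclasses import dataclass

CATEGORIES = ("F", "A", "V")  # Framing, Agency, Version of practice
VALENCES = (-1, 0, 1)         # negative pole, neutral/ambiguous, positive pole

@dataclass
class Episode:
    category: str  # one of CATEGORIES
    valence: int   # one of VALENCES
    start: str     # timestamp bounding the start of the episode
    end: str       # timestamp bounding the end of the episode
    note: str      # brief description justifying the code

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.valence not in VALENCES:
            raise ValueError(f"valence must be -1, 0, or 1, got {self.valence!r}")

# Hypothetical coded episode from a lesson launch
ep = Episode("F", 1, "00:02:15", "00:04:40",
             "Teacher frames the task as answering the anchoring question.")
```

Because multiple episodes in each category can be coded within a single lesson, a lesson's record is simply a list of such entries, each traceable back to the video by its timestamps.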

Our research team collaboratively developed this approach in the context of a multiyear PD effort that includes over 150 h of video of classroom lessons from 25 secondary science teachers. The scheme has been applied so far to approximately 25 h of lessons. Once the scheme seemed calibrated, two raters (the 2nd and 3rd authors) assessed its reliability by independently rating 10 h of lessons from 3 teachers. The two raters disagreed on only 4 codes out of a set of 30, for an 87% agreement rate. We interpret this as suggesting that the scheme can be calibrated with relatively little effort.
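The agreement figure is simple percent agreement: 4 disagreements in 30 jointly coded items leaves 26/30, or roughly 87%. As arithmetic:

```python
def percent_agreement(n_items: int, n_disagreements: int) -> float:
    """Simple two-rater percent agreement (no chance correction)."""
    return 100.0 * (n_items - n_disagreements) / n_items

print(round(percent_agreement(30, 4)))  # -> 87
```

Note that simple percent agreement does not correct for chance; with only three possible valence values per code, a chance-corrected statistic such as Cohen's kappa would be a more conservative check.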

Application of the Approach

We have used this F/A/V scheme to demonstrate the challenges one teacher faced in consistently framing her teaching toward anchoring phenomena and the consequences this had for student agency and practice (Kawasaki & Sandoval, 2019). Here, we describe an application to show that this analysis characterizes consequential differences between teachers. Participating teachers worked in grade and subject-alike teams throughout the project to revise units of instruction to align with NGSS, with coaching from PD staff (see Kawasaki & Sandoval, 2019). Here, we focus on two teachers, Allison and Kathy (pseudonyms), who worked together on a team for the duration of our project. They agreed to our request to each teach the same intact unit in both the fall and spring semesters and allow us to record them. To illustrate our use of the scheme, we compare the instruction of Allison and Kathy as each taught the fall unit.

The anchoring phenomenon was the question “How do wounds heal?” Kathy, Allison, and their team felt that this would be an engaging anchor for students and would be a good question to enable them to focus on the scientific practice of modeling. Their idea was that they would introduce the phenomenon through a time-lapse video of an open cut forming a scab and the scab eventually falling off to show new skin underneath. Students were asked to draw an initial model of what they thought was happening, and over the course of an intended 6 additional lessons, they would revise the model twice. Teachers planned lessons in which student readings and investigations could be used to generate the evidence and explanatory concepts needed to revise their models.

Generating Analytic Codes

As mentioned, codes were assigned to marked episodes of video-recorded interaction, with brief descriptions of the episode to justify a particular code. As an example, the wound unit opened with students watching the video mentioned above, followed by the teacher asking students to individually attempt to model the process. During her lesson, Kathy framed the purpose of the task multiple times: first, when she introduced the task, shown in the first line of Table 1, and then again as she walked around the room answering students’ questions.

Table 1 Example episodes from Kathy’s opening lesson framing (F) student work toward phenomenon

These are coded positively because Kathy directly states the goal as answering the anchoring question of how wounds heal. During this segment, Kathy sometimes appeared to offer students a high level of epistemic agency and at other times seemed to assert her own agency as the primary knower (Table 2). First, she offers that students can structure their models any way they like. She then encourages students to think microscopically, which we coded as ambiguous with respect to agency: she, as the teacher, effectively asserts the appropriate scale (molecular) of the model, but she makes no bid to control what or how students might represent in a molecular model. Later in the activity, she asks students to decide for themselves what information they need to determine if their model answers the question of how wounds heal, which we coded as offering agency to students. She then follows this up with the additional offer to ask her what they need to know, which clearly positions her as the authoritative source of knowledge about the phenomenon.

Table 2 Example episodes of epistemic agency in Kathy’s lesson

Tables 1 and 2 together summarize the start of the main segment of activity in Kathy’s first lesson of the unit. She begins, as in the first row of Table 1, by saying “So you’re drawing into the initial model box. Now remember we are answering the question, how do skin wounds heal? We saw the images already, you wrote about it, now I want you to show me. Okay?” Students spent several minutes working individually, seated along lab tables organized in 3 long rows from the front to the back of the room. Many students drew macroscopic level pictures of cuts moving through various stages of healing (Fig. 1), while a few students attended to, if not a microscopic level, at least an internal view of what might be going on in the body (Fig. 2).

Fig. 1

View of an initial model of a wound healing, showing a cut finger at the top left, followed by a sequence of drawings showing scabbing and healing

Fig. 2

An initial model of wound healing showing in cross-section 7 phases of scab and tissue growth

Students were thinking about the phenomenon in physiological terms. For instance, as Kathy was walking through the room framing the task of writing down at least 4 pieces of information they needed, students were focused on mechanistic features of the phenomenon, as with a pair of girls working together:

Girl 1: You know how at first it bleeds for a while, and then you know, after a while, it stops bleeding, right?
Girl 2: Yeah
Girl 1: So, that’s why, maybe that first part?

This and other instances of student talk support the student agency (1) code that we assigned to this event, shown in row 3 of Table 2. These girls decide together, on their own, that they need to understand what causes bleeding to stop. Considering talk and students’ developing models together, we coded version of practice as scientific.

Two features of our approach visible here bear emphasis. First, our codes are derived from interactional events but are based on summaries of talk rather than on discrete sets of turns or utterances. Second, in this application, we are interested in assigning an aggregate value for each coding category for each lesson. For epistemic agency, we consider the variation shown in Table 2 to suggest an ambiguity in the locus of agency, and so assigned the entire lesson a neutral code for that category. On the other hand, Kathy was consistent in this lesson in orienting student work toward explaining how wounds heal and in not establishing firm evaluative criteria for their initial models, so both framing and version of practice were scored positively. We stress that this choice suits our interest in understanding, across a sample of many teachers over several years, how teaching practice may have changed through their participation in our PD program. Yet, because the codes are tied to interactional events within each lesson, a finer-grained analysis can be easily undertaken.
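One way to operationalize this lesson-level aggregation is a simple rule: episodes within a category that share a valence keep that shared value, while mixed valences are treated as ambiguous and coded neutral. The following is a minimal sketch of that rule as we applied it here, not a definitive algorithm:

```python
def lesson_code(valences: list[int]) -> int:
    """Aggregate episode-level codes (-1, 0, 1) for one category into a
    lesson-level code: a consistent set keeps its shared value; mixed or
    empty sets are treated as ambiguous and coded neutral (0)."""
    distinct = set(valences)
    if len(distinct) == 1:
        return distinct.pop()
    return 0

# In Kathy's opening lesson, agency codes varied, so the lesson is neutral
# for agency; her framing codes were uniformly positive.
assert lesson_code([1, 0, 1, 1]) == 0  # mixed -> neutral
assert lesson_code([1, 1, 1]) == 1     # consistent -> positive
```

Other aggregation rules (e.g., majority vote) are possible; we prefer this conservative one because a lesson with mixed signals about, say, epistemic agency genuinely offers students an ambiguous experience.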

Our summary analysis of Allison’s and Kathy’s wound unit is shown in Table 3. Allison was inconsistent in her framing of lessons, oscillating between explicitly orienting tasks toward explaining the anchoring phenomenon and framing lessons around particular concepts with no apparent link to the question of how wounds heal. Kathy was completely consistent in framing each lesson toward the wound phenomenon and was generally more successful than her colleague in providing opportunities for her students to express agency in authentic versions of modeling practice.

Table 3 Summary of F/A/V coding of the same unit taught by two teachers (Allison took one more lesson for the unit)

We can see other patterns in this table that are informative for our purposes of understanding how teachers make sense of and enact instruction in relation to the NGSS. First, framing classroom activities toward an anchoring phenomenon appears easier to do than offering epistemic agency or authentic versions of practice to students. This may be a result of our particular PD program, as we spent considerable time during the first 2 years helping teachers reorient their lesson planning toward anchoring phenomena. At the same time, Allison’s inconsistency in framing represents a difficulty shown by other teachers in our project, namely, the ease with which lesson framing slipped back into a focus on the “content” without a clear connection to how that content was helpful for explaining the anchor phenomenon (Kawasaki & Sandoval, 2019). A second thing to notice from Table 3 is that agency and version of practice seem linked in ways that may derive from our definitions, as noted above.

A contrast between Kathy and Allison illustrates how framing toward the world is, on its own, insufficient to create legitimately scientific, agentic practice, and it highlights how our method captures this. The contrast is between how Allison and Kathy framed the 5th lesson in their units. Prior to this lesson, students in both classes had mostly engaged in reading and watching videos about the components of blood and how those work during the phases of wound healing, first to revise their models (lesson 3) and then to take up the concept of mitosis to understand how new skin forms (lesson 4). The goal for lesson 5 was to examine cell structures through what turned out to be a very schooled version of a lab activity.

Kathy begins by reviewing the work students had done during prior lessons. She points students toward their initial models and asks them to recall what they had learned prior to their first model revision. This is a typical call-and-response summary of the components of blood (platelets, white blood cells, red blood cells, and plasma). She continues in this recitation mode to review the general process of scabbing and the development of new skin through several turns of IRE talk shown in Table 4.

Table 4 Kathy’s framing of the cheek cell lab in lesson 5 of her wound healing unit

Kathy clearly frames the purpose of her last question in terms of answering the anchoring question about how wounds heal. Yet, given that some of her students’ initial models on the first day of the unit represented cells, and in the activity about blood components students had already identified both white and red blood cells, the broad question of whether our bodies are actually made up of cells undermines this frame. The lab that Kathy had her students conduct for the rest of the lesson was a carefully scripted lab in which they swabbed their own cheek cells, examined them under a microscope, and identified major cellular structures.

Allison begins the same lesson in her unit with her students answering a warm-up question, “What closes a wound?” After a brief review of their answers, Allison begins to introduce the main activity for the lesson, with a clear orientation to the anchor phenomenon.

Allison: Ok! So! We have- shh. (2 s) We have been talking about a wound healing.

She then summarizes the activities over the last few lessons and gets to the same summary as Kathy.

Allison: We’ve learned- (5 s). We have learned about, um, the parts of the blood that aid in the cessation or the stopping of the bleeding. Ok? All of those things are going to go into what you are going to do today. Today, you are going to look at [picks up a worksheet and waves it] cheek cells. Cheek cells. Your cheek cells. Under the microscope. Today we are going to look at cheek cells. I do not think we are going to have time to look at the onion. Tomorrow we are going to look at an onion, and yeast. (1 s) What is the difference (1 s) between (1 s) your cheek cell and an onion cell? (3 s)
Students: [uninterpretable]
Allison: Say it again please?
Boy: The cheek cells are an animal cell and the onion is a plant cell.
Allison: Good. The cheek cell is a- is an animal cell. You are animals. What kind of animals are you?
Class: Mammals.
Allison: Mammals. We are mammals. The broad category of what we are, we are mammals. Anybody tell us- tell me- well, tell us. The three things that make up mammals.

Here, Allison segues into a discussion of the properties of mammals, taking particular care to note that the platypus and echidna are the only two mammals that do not give live birth. At the start of this sequence, Allison frames the class activity as building on the work the students have done to this point. As she continues, however, she gets further and further from the anchoring phenomenon. It is not made clear what being a mammal has to do with wound healing, nor is the contrast between cheek and onion cells motivated by that phenomenon. Following the mammal tangent, Allison returns to the cheek cells, asking students what “structures” they can expect to see under the microscope. After some initial hesitation from students, one boy answers:

Boy: Cell membrane.
Allison: You should be able to see what? [gesturing toward boy]
Boy: Cell membrane.
Allison: Cell membrane. And I’m going to, um- I’m going to kind of… make you sound like a genius here. We’re going to call this plasma membrane [writes “plasma membrane” on board].

By this point, Allison has severed any connection between the cheek cell lab and the anchoring phenomenon. She also, in the final turn, makes quite clear that she is the locus of knowledge in the classroom by converting the boy’s response to the question into her preferred response. Her students then did the scripted cheek cell lab.

Viability of this Approach

We believe that our approach is a viable technique for characterizing qualitative differences in classroom experiences, in our case teaching practice, in a way that is directly derived from classroom discourse. An express goal was an approach that does not require fine-grained discourse analysis to support judgments of analytic value, but that can support that level of analysis as needed: it characterizes discursive features at the level of an entire lesson while maintaining the ability to pursue microanalysis of interactional episodes within lessons. We see three features of our approach that suggest broad viability.

Discursive Foundations of Analytic Categories

We stress the discursive foundations of our analytic constructs to emphasize that the viability of our approach rests on the ability of our coding categories to capture valued features of discourse. We aim to have shown here that this is the case, and that our specific codes capture salient differences in the valence of classroom discourse: namely, whether it is primarily oriented in, toward the disciplines students are meant to master, or out, toward the world in which students live and for which science concepts and practices may be useful. These differences have identifiable consequences for students’ opportunities to learn. While our analytic categories are, of course, related and overlap conceptually, we have shown that they are analytically distinct.

Malleability of Analytic Categories

We think our specific analytic framework is useful for researchers interested particularly in issues of epistemic agency, but this approach could be adapted around other discourse-related categories. We consider the malleability of possible analytic categories a major strength of our approach. There are a great many features of science classroom discourse that analysts could potentially find salient, for a wide variety of analytic purposes. Our project is not to assert that framing, agency, and version of practice are the most important aspects of classroom discourse to attempt to capture. We simply find them useful for understanding the extent to which our PD supported teacher change in ways that we valued.

We stress, therefore, that the judgments encoded by our analytic categories represent our values for teaching practice, values that we do not assume are or should be shared by teachers. We are mindful that in making judgments about the qualities of teaching practice around framing, agency, and versions of practice, we mean quality as “the nature of” rather than “the goodness of” teaching practice. We believe that shifting epistemic agency from teachers to students and engaging students in legitimately scientific versions of practice will produce more meaningful science learning, a belief grounded in a substantial body of evidence (NRC, 2012). Similarly, it is now well established that framing is a crucial means of orienting students’ engagement and cognition in classrooms (Berland & Hammer, 2012; Engle, 2006; Kang et al., 2016; Rosenberg et al., 2006).

While it is beyond the scope of the present paper to argue this point fully, we see in these coded patterns of teaching practice shortcomings in our approach to PD. For example, one of our major emphases in PD was to support teachers in revising their own instruction to open it up for student agency, specifically by engaging students in discussions about how to conduct various science practices. That after 2 years (about 70 contact hours) Kathy, Allison, and their peers were still struggling to consistently offer authentic, agentic opportunities for practice suggests our approach was not working very well. It may also reflect the difficulties posed by the NGSS in relation to the accountability pressures teachers face in the USA, pressures that seem to work against increased agency and autonomy for students. Rather than interpreting the codes we have assigned to aspects of Kathy’s and Allison’s teaching as judgments of their teaching abilities, we interpret them as indicative of our own struggles to organize PD experiences that move them toward the teaching practices we value.

Scalability of Analysis

Our motive for developing this approach was to be able to compare teaching practice among a cohort of 25 science teachers over a 3-year period. While we have a way to go to complete those comparisons, we see our approach as suited to the job because, while it is tied to features of classroom discourse, reliable analytic judgments can be made without transcribing and microanalyzing episodes of interaction. This makes it feasible to analyze relatively large corpora of video data in much shorter time spans than traditional discourse or interaction analysis. We suggested above some of the advantages this has over observational methods that take as their fundamental data source observers’ inferences about classroom interaction. It also differs from approaches typical of large-scale studies, which characterize classroom differences at such gross grain sizes that they obscure features of interaction that are often of analytic interest (e.g., Cannady, Vincent-Ruz, Chung, & Schunn, 2019).

At the same time, because our approach is directly tied to episodes of classroom interaction, more fine-grained analyses can be easily initiated (e.g., see Kawasaki & Sandoval, 2019). We are not at all suggesting that this approach supersedes fine-grained interaction or discourse analysis. We simply aim to expand the scale at which such work can be applied by taking this sort of middle-out approach. Methods for analyzing science classroom discourse are developed for a range of scholarly purposes. We find this approach useful for the set of purposes that we are currently pursuing. To the extent that other science education researchers are interested in being able to compare discursive features of science classrooms at scale, our approach may be useful.


  1. Banilower, E., Smith, P. S., Weiss, I. R., & Pasley, J. D. (2006). The status of K-12 science teaching in the United States: Results from a national observation survey. In D. W. Sunal & E. L. Wright (Eds.), The impact of the state and national standards on K-12 science teaching (pp. 83–122). Greenwich, CT: Information Age Publishing.

  2. Bateson, G. (1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. San Francisco: Chandler Publishing Co.

  3. Berland, L. K., & Hammer, D. (2012). Framing for scientific argumentation. Journal of Research in Science Teaching, 49(1), 68–94.

  4. Berland, L. K., & Reiser, B. J. (2011). Classroom communities’ adaptations of the practice of scientific argumentation. Science Education, 95(2), 191–216.

  5. Cannady, M. A., Vincent-Ruz, P., Chung, J. M., & Schunn, C. D. (2019). Scientific sensemaking supports science content learning across disciplines and instructional contexts. Contemporary Educational Psychology, 59.

  6. Duschl, R. A. (2008). Science education in three-part harmony: Balancing conceptual, epistemic and social goals. Review of Research in Education, 32, 268–291.

  7. Elby, A., & Hammer, D. (2010). Epistemological resources and framing: A cognitive framework for helping teachers interpret and respond to their students’ epistemologies. In L. D. Bendixen & F. C. Feucht (Eds.), Personal epistemology in the classroom: Theory, research, and implications for practice (pp. 409–434). Cambridge: Cambridge University Press.

  8. Elgin, C. Z. (2013). Epistemic agency. Theory and Research in Education, 11(2), 135–152.

  9. Engle, R. A. (2006). Framing interactions to foster generative learning: A situative explanation of transfer in a community of learners classroom. Journal of the Learning Sciences, 15(4), 451–498.

  10. Engle, R. A., & Conant, F. R. (2002). Guiding principles for fostering productive disciplinary engagement: Explaining an emergent argument in a community of learners classroom. Cognition and Instruction, 20(4), 399–483.

  11. Erickson, F. (1982). Classroom discourse as improvisation: Relationships between academic task structure and social participation structures in lessons. In L. C. Wilkinson (Ed.), Communicating in the classroom (pp. 153–181). New York: Academic Press.

  12. Erickson, F. (1992). Ethnographic microanalysis of interaction. In M. D. LeCompte, W. L. Millroy, & J. Preissle (Eds.), The handbook of qualitative research in education (pp. 201–225). San Diego, CA: Academic Press.

  13. Ford, M. J., & Wargo, B. M. (2007). Routines, roles, and responsibilities for aligning scientific and classroom practices. Science Education, 91(1), 133–157.

  14. Goffman, E. (1974). Frame analysis: An essay on the organization of experience. Cambridge, MA: Harvard University Press.

  15. Hayes, K. N., Lee, C. S., DiStefano, R., O’Connor, D., & Seitz, J. C. (2016). Measuring science instructional practice: A survey tool for the age of NGSS. Journal of Science Teacher Education, 27, 137–164.

  16. Horizon Research. (2000). Inside the classroom: Observation and analytic protocol. Chapel Hill, NC: Horizon Research.

  17. Jiménez-Aleixandre, M. P., Bugallo Rodríguez, A., & Duschl, R. A. (2000). “Doing the lesson” or “doing science”: Argument in high school genetics. Science Education, 84, 757–792.

  18. Jordan, B., & Henderson, A. (1995). Interaction analysis: Foundations and practice. Journal of the Learning Sciences, 4(1), 39–103.

  19. Kang, H., Windschitl, M., Stroupe, D., & Thompson, J. (2016). Designing, launching, and implementing high quality learning opportunities for students that advance scientific thinking. Journal of Research in Science Teaching, 53(9), 1316–1340.

  20. Kawasaki, J., & Sandoval, W. A. (2019). The role of teacher framing in producing coherent NGSS-aligned teaching. Journal of Science Teacher Education, 30(8), 906–922.

  21. Kelly, G. J. (2014). Discourse practices in science learning and teaching. In N. G. Lederman & S. K. Abell (Eds.), Handbook of research on science education (Vol. 2, pp. 321–336). New York: Routledge.

  22. Kloser, M. (2014). Identifying a core set of science teaching practices: A Delphi expert panel approach. Journal of Research in Science Teaching, 51(9), 1185–1217.

  23. Lehrer, R., & Schauble, L. (2004). Modeling natural variation through distribution. American Educational Research Journal, 41(3), 635–679.

  24. Lehrer, R., Schauble, L., & Lucas, D. (2008). Supporting development of the epistemology of inquiry. Cognitive Development, 23(4), 512–529.

  25. Lemke, J. L. (1990). Talking science: Language, learning, and values. Norwood, NJ: Ablex.

  26. Manz, E. (2016). Examining evidence construction as the transformation of the material world into community knowledge. Journal of Research in Science Teaching, 53(7), 1113–1140.

  27. Mehan, H. (1979). Learning lessons: Social organization in the classroom. Cambridge, MA: Harvard University Press.

  28. Nava, I., Park, J., Dockterman, D., Kawasaki, J., Schweig, J., Hunter Quartz, K., & Martinez, J. F. (2019). Measuring teaching quality of secondary mathematics and science residents: A classroom observation framework. Journal of Teacher Education, 70(2), 139–154.

  29. NRC. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. Washington, DC: National Academies Press.

  30. O'Connor, C., Michaels, S., Chapin, S., & Harbaugh, A. G. (2017). The silent and the vocal: Participation and learning in whole-class discussion. Learning and Instruction, 48, 5–13.

  31. O’Connor, C., Michaels, S., & Chapin, S. (2015). “Scaling down” to explore the role of talk in learning: From district intervention to controlled classroom study. In L. B. Resnick, C. S. C. Asterhan, & S. N. Clarke (Eds.), Socializing intelligence through academic talk and dialogue (pp. 111–126). Washington, DC: American Educational Research Association.

  32. Piburn, M., & Sawada, D. (2000). Reformed teaching observation protocol (RTOP): Reference manual (ACEPT Technical Report No. IN00-3). Tempe, AZ: Arizona State University.

  33. Rosenberg, S., Hammer, D., & Phelan, J. (2006). Multiple epistemological coherences in an eighth-grade discussion of the rock cycle. Journal of the Learning Sciences, 15(2), 261–292.

  34. Ryu, S., & Sandoval, W. A. (2012). Improvements to elementary children’s epistemic understanding from sustained argumentation. Science Education, 96(3), 488–526.

  35. Sandoval, W. A., Kwako, A. J., Modrek, A., & Kawasaki, J. (2018). Patterns of classroom talk through participation in discourse-focused teacher professional development. In J. Kay & R. Luckin (Eds.), Rethinking learning in the digital age: Making the learning sciences count. 13th International Conference of the Learning Sciences (ICLS) 2018 (Vol. 2, pp. 760–767). London: ISLS.

  36. Sandoval, W. A., Enyedy, N., Redman, E. H., & Xiao, S. (2019). Organising a culture of argumentation in elementary science. International Journal of Science Education, 41(3), 1848–1869.

  37. Schultz, S. E., & Pecheone, R. L. (2015). Assessing quality teaching in science. In T. J. Kane, K. A. Kerr, & R. C. Pianta (Eds.), Designing teacher evaluation systems: New guidance from the measures of effective teaching project (pp. 444–483). John Wiley & Sons.

  38. Stroupe, D. (2014). Examining classroom science practice communities: How teachers and students negotiate epistemic agency and learn science-as-practice. Science Education, 98(3), 487–516.

  39. Warren, B., & Rosebery, A. S. (1996). “This question is just too, too easy!” Students’ perspectives on accountability in science. In L. Schauble & R. Glaser (Eds.), Innovations in learning: New environments for education (pp. 97–125). Mahwah, NJ: Erlbaum.

  40. Windschitl, M., Thompson, J., & Braaten, M. (2018). Ambitious science teaching. Boston, MA: Harvard Education Press.

  41. Windschitl, M., Thompson, J., Braaten, M., & Stroupe, D. (2012). Proposing a core set of instructional practices and tools for teachers of science. Science Education, 96(5), 878–903.



This work is supported by a grant from the National Science Foundation (award #1503511). The views and opinions expressed herein are those of the authors only and do not represent the official views and opinions of the NSF.

Corresponding author

Correspondence to William A. Sandoval.




Sandoval, W.A., Kawasaki, J. & Clark, H.F. Characterizing Science Classroom Discourse Across Scales. Res Sci Educ 51, 35–49 (2021).



Keywords: Classroom discourse · Teaching practice · Student agency