1 Introduction

Teaching is attributed key importance for society: It is supposed to support the young generations in their personal growth and autonomy (“Bildung”), to help qualify a workforce, to socialize citizens and integrate them into society, and thereby both to reproduce and stabilize society and to build the human powers needed to develop and change it. Teaching also plays a role in allocating students to different career paths (Fend, 2008, p. 53; see also Biesta, this volume). [Footnote 1] Thus, one aim of research on teaching is to find out how teaching can best fulfil one or several of these functions. However, the functions are controversial, and some of them are difficult to reconcile with each other. Teaching is set within fields of tension between different aims and expectations; for example, between fostering student autonomy and ensuring that students achieve specific predefined educational goals, and between treating all students equally and compensating for social disadvantage (Helsper et al., 2001; see also Biesta, this volume). Moreover, teaching is a social activity and, as such, intricate, not fully controllable, and ambiguous (e.g., Cohen, 1989; Luhmann, 2002; Ricken, 2009). This makes it difficult to answer the question of what constitutes good and successful teaching. Consequently, not only teachers but also educational researchers operate within a field of tension: The expectation that educational research should produce implementable advice for practice can be considered difficult to reconcile with the demand that it should account for the whole complexity and ambiguity of its research topic.

Different research paradigms address this field of tension in fundamentally different ways. According to Kuhn (1962), a paradigm refers to a unique combination of ontology, epistemology, and methodology, or to “a whole way of doing science, in some particular field” (Godfrey-Smith, 2003, p. 76). [Footnote 2] Any paradigm may include a variety of theories which cover that field in general, or some part of it. The field of research on teaching is heterogeneous at the beginning of the twenty-first century, with several paradigms and many theories being concurrently relevant. [Footnote 3] Some paradigms of research on teaching explicitly aim at answering the question of how teaching can best achieve its functions and aims. This is often framed as the quest for teaching quality. Acknowledging that quality is both a normative and an empirical concept, answers are given in reference to conceptualizations of “good teaching” and/or “successful teaching” (Fenstermacher & Richardson, 2005). Paradigms in traditional pedagogy and didactics aim at specifying “good teaching” and providing guidance for reflective practitioners, combining philosophical and scientific concepts, professional wisdom, norms and rules for manoeuvring the complex space of teaching and supporting the process of “Bildung” (Prange, 2012; Terhart, 2016; Westbury et al., 2000). The concept of “successful teaching”, in contrast, requires empirical analysis of “what works”. It is the focus of teaching effectiveness research (TER; Kyriakides et al., this volume; Muijs et al., 2014; Scheerens, this volume; Seidel & Shavelson, 2007), which can be understood as a research paradigm that uses quantitative methods for explaining and predicting criteria for “teaching success” with characteristics of teaching. This paradigm is based on an instrumental understanding of teaching (see Chazan et al., 2016).
Other paradigms present in the field of research on teaching explicitly refrain from defining and analysing “teaching quality” and ask more fundamental questions, such as “What is teaching?” and “What distinguishes teaching from other forms of social practice?” (e.g., Breidenstein, 2006; Kolbe et al., 2008). They place more emphasis on the complexity, context-specificity, and ambiguity of teaching. The underlying understanding of teaching has been called “fundamental” by Chazan et al. (2016; see also Herbst & Chazan, this volume). The practice theoretical paradigm is one example. It aims at reconstructing practices to gain an understanding of the social order in the classroom without a priori assumptions regarding their desirability.

In the present chapter we bring together two of these contrasting paradigms: We present a specific instantiation of TER, the Theory of Basic Dimensions of Teaching Quality (TBD), and identify its major limitations and unresolved issues. We argue that major limitations of TER are an overly simplistic conception of the relations between teaching and learning and a neglect of the sequencing of interactions in the classroom and, thus, a lack of understanding of classroom dynamics. [Footnote 4] In the attempt to better understand and, to some extent, overcome these limitations, the chapter refers to a fundamentally different combination of ontology, epistemology, and methodology: the paradigm of practice theories. The chapter aims at critically reflecting on TER, and in particular TBD, by contrasting it with a practice theoretical perspective and, based on this reflection, at developing first ideas for a reconceptualization of the theoretical foundations of TBD and of the research methods used in the field. Building bridges between disparate paradigms is a risky project, yet it may help strengthen the theoretical foundations of research on teaching. Among other things, such a venture creates awareness of the particular sets of assumptions, values, and beliefs about the social, about knowledge, and about research itself that characterize a research paradigm and may appear self-evident, almost “natural”, from within. [Footnote 5]

Consequently, the chapter deviates from the pattern of other chapters in this volume. From an epistemological point of view, our aim is “doing theory”: We explore and outline a specific theory of teaching quality (TBD) as an example of TER theories, and then revisit and reconceptualize this theory and its measurement approach by discussing it from the perspective of a fundamentally different paradigm of social science (practice theories).

The chapter is structured as follows: In Sect. 2, the foundations and limitations of general TER are discussed, as well as some recent developments in this field of research. TBD, as one specific theory within TER, is introduced in Sect. 3, and specific voids concerning this theory are identified. Section 4 introduces “opportunity-use-models of the effects of teaching”, a further development of TER, which integrates mediated process-product research with constructivist systems theory. In Sect. 5 the practice theoretical perspective is introduced, and in Sect. 6 its potential for reconceptualising classroom dynamics in TER is mapped out. Finally, in Sect. 7, we answer the questions put forward by the editors of this volume.

2 Foundations and Limitations of TER

TER (Scheerens, this volume; Seidel & Shavelson, 2007) responds to the quest for teaching quality through empirical and quantitative studies of “successful” teaching. The paradigm is part of the Educational Effectiveness branch of Educational Research, which has developed over several decades (Hall et al., 2020; Kyriakides et al., this volume; Reynolds et al., 2014). Its approach to researching teaching is rooted in the epistemological perspective of “critical rationalism” (Popper, 1959). Its core is the search for teaching characteristics, patterns, or types of teaching which statistically predict so-called “student outcomes”, mostly learning gains in different subjects. TER thus aims at offering comparatively straightforward answers to the question of how classroom teaching and learning can be improved. [Footnote 6] Since its inception more than half a century ago, TER has mostly been based on the observation of classroom processes, combined with the measurement of so-called “student outcomes” (Creemers & Kyriakides, 2015). In its early days, it aimed at “determining how more and less effective teachers act and then trying to get teachers to act in the ways that distinguish the more effective ones” (Gage & Needels, 1989, p. 253) by examining direct effects of processes (mostly teacher behaviour, sometimes student behaviour, and teacher-student interactions) on outcomes (mostly student achievement). Later it was merged with the paradigm of cognitivism, and student cognitions were included as mediators between process and product in the so-called “mediated process-product approach” (e.g., Borich, 1986; Creemers & Kyriakides, 2015; Doyle, 1977; Rothkopf, 1976; Winne, 1987; see also Hiebert & Stigler, this volume). By now, TER has grown into a wide range of research activities which receive much attention within different research communities (educational psychology, organization research, Large Scale Assessment) as well as communities of professionals and policy makers.

2.1 Criticisms Against TER

TER has been harshly criticized ever since its emergence (Scheerens, this volume). Various points of criticism were summarized, reviewed, and evaluated by Gage and Needels (1989) as early as the 1980s, yet their paper is still relevant in many regards. They distinguished between conceptual criticism, methodological criticism, criticism of productivity, and criticisms of interpretation-evaluation. Conceptual criticism includes a neglect of teachers’ intentions (i.e., the teachers’ own conceptions of the purposes of their behaviour in the classroom), a neglect of contextual conditions influencing teaching (e.g., subject matter, grade level, student characteristics), and a neglect of the sequential nature of classroom interactions (i.e., that teaching a topic requires an introduction to the topic, consolidation of new knowledge, reasoning about the topic, and transfer, and that teaching effectiveness might depend on the concrete positioning of a teacher behaviour within such a sequence). Another conceptual criticism is that “the goal directed, normative nature of teaching makes it not amenable to empirical investigation” (ibid., p. 258). Teaching aims at achieving purposes which have been defined by humans and, thus, are subject to constant change. This variability, the critique argues, precludes the development of nomological laws and, thus, the use of scientific methods. [Footnote 7] Criticism further concerns the assumption that teaching is directly related to learning. This has been dismissed as being too simplistic and mechanical, as reflecting “an overly simplified notion of causality” (Tom, 1984, p. 70). Even more fundamentally, process-product research has been criticized as being “atheoretical” (see Gage & Needels, 1989).
Criticisms of methodology hold that process-product research has searched for “implausible relationships between teaching behaviour occurring at one point in time and student achievement in another subject-matter at a relatively distant other point in time” (ibid., p. 265) and, relatedly, that it has treated teacher behaviours as generalizable across time and subject matters; that it has used predetermined coding categories based on common sense and prior unstructured observations instead of systematic ethnography; that content is often ignored; that cognitive, emotional, and motivational processes are neglected; that experiments are not used enough; and that inadequate achievement tests are used (ibid.). Further, process-product research has been criticized for not being sufficiently productive in solving practical problems related to teaching, in particular in answering the question of how best to support student learning. Criticism of interpretation and application concerned the use of meta-analysis, difficulties with implementation in experimental studies, and the derivation of universal rules for teachers from correlational findings, which gives teachers mistaken confidence in the certainty of scientific results (ibid.).

Gage and Needels (1989) rebutted most of this criticism, even though they agreed that investigating longer sequences of teaching and learning activities and including content would be enlightening. Since they wrote their paper, additional progress in addressing the aforementioned points of criticism has been achieved (see Kyriakides et al., this volume, for an overview of phases of TER). The productivity of process-product research can no longer be called into question in terms of quantity and impact, e.g., on teacher education and educational policy. Moreover, several experiments suggest that inducing teachers to use teaching strategies and methods found to be correlated with achievement gains in other classes can actually help them increase the effectiveness of teaching (e.g., Good & Grouws, 1979; Griffin & Barnes, 1986). Further, as outlined above, process-product research has become more complex in the past decades, e.g., by moving from behaviouristic to cognitive approaches. The role of context (e.g., Creemers & Scheerens, 1994; Dunkin & Biddle, 1974) and of content matter (Scheerens, 2017; Schmidt & Maier, 2009) has received increasing attention, as have teacher cognitions (Bardach & Klassen, 2020; Clark & Yinger, 1979; Hill et al., 2005, 2019; Kunter et al., 2013; Shavelson, 1983; Shulman, 1986). In addition to achievement, motivation (e.g., Pintrich, 2003) and emotions (e.g., Mayring & Rhoeneck, 2003) were examined as so-called “student outcomes”. Plain taxonomies of effectiveness factors have been converted into theoretical models and even theories, such as the integrative process-mediation-product model based on developmental and educational theories which Cappella et al. (2016) presented in the latest edition of the Handbook of research on teaching, the comprehensive dynamic model of educational effectiveness (Kyriakides et al., this volume), or more focused approaches such as the TBD outlined in the present chapter (see Sect. 3).
In terms of methodology, some recent coding protocols have been more strongly anchored in theoretical foundations than their predecessors (in particular those developed by Bell et al., 2012 and Hamre et al., 2013). Moreover, the rating process has been better understood and geared to support validity arguments (ibid.). Content-focused longitudinal designs (e.g., Klieme et al., 2009; Wright & Nuthall, 1970), including experimental designs (Decristan et al., 2015), allow for studying proximal relationships between teaching and learning, as pre-post measures and observations are both focused on a single specific unit of instruction. Recently, such a design has even been combined with a study examining the generalizability of effects across systemic contexts (Opfer et al., 2020). Many of these newer studies also acknowledge the normative nature of teaching by measuring multiple “products” (so-called “student outcomes”) in parallel.

To sum up, on an international scale, there is a large body of empirical investigations of teaching within TER, more specifically within the enhanced mediated process-product approach, and many of the critical issues listed by Gage and Needels (1989) have been addressed. However, we argue in the following that past attempts at addressing the criticisms concerning conceptions of causality and the neglect of the sequentiality of teaching interactions are still unsatisfactory. Even with mediating and moderating factors included, most TER still assumes a unidirectional causal chain ultimately connecting teacher behaviour with student learning. As a consequence, the complexity of the moment-to-moment flow of classroom interactions, with teachers and students sometimes initiating an exchange and sometimes responding to each other (as shown, e.g., by classroom ethnography), as well as the contingencies and ambiguities involved in social interactions, are not well understood in the teaching effectiveness paradigm. Before we discuss these issues, we present TBD as one specific instantiation of TER, since the arguments become more vivid when they are illustrated with a specific theory.

3 The Theory of Three Basic Dimensions of Teaching Quality

The theory of basic dimensions of teaching quality (TBD) intends to give an account of the results of TER in a systematic and parsimonious way, building upon findings of process-product research as well as mediated process-product research (Seidel & Shavelson, 2007; Wang et al., 1993). Yet, it adds conceptions of human learning rooted in the paradigm of cognitive constructivism (Aebli, 1963; DeCorte, 2000; Piaget, 1955; Posner et al., 1982). TBD has grown out of an attempt (a) to identify basic dimensions among the many constructs used in TER, and (b) to explain how teaching quality, as covered by these basic dimensions, drives student learning and educational outcomes. Therefore, as Praetorius et al. (2020) pointed out, the theory has two main parts: (a) a structural part, specifying three dimensions which span the space of teaching quality, and (b) an explanatory part, showing how teaching quality explains and predicts student learning. In Sects. 3.1 and 3.2, both parts of the theory are outlined together with related research findings. In both cases, empirical findings are mixed: They sometimes confirm the theory, sometimes rebut it, and sometimes suggest a revision, e.g., the introduction of additional dimensions.

While the conceptual foundations for TBD were established by Klieme et al. as early as 2001, the model had not been evaluated in a comprehensive way until recently. Praetorius et al. (2018) reviewed more than 20 research projects guided by TBD, finding partial empirical support for the model. Applying criteria from the logical empiricism tradition in philosophy of science, Praetorius et al. (2020) stopped short of calling TBD a “theory”, as they deplored a lack of clarity, coherence, and comprehensiveness. Yet, they acknowledged that TBD is a parsimonious set of theoretical statements linking teaching to learning, related (although not always in a clear way) to well established theories, providing testable hypotheses and some guidance for professional practice. The present chapter, using a wider notion of “theory” (see Sect. 6), does call TBD a theory. However, we agree that TBD needs further theory development, including “elaborating on the underlying socio-cultural assumptions more explicitly” (Praetorius et al., 2020, p. 28). This chapter intends to respond to that request.

3.1 The Structural Part of TBD: Identifying Basic Dimensions of Teaching Quality

Clausen (2002) developed a broad set of high-inference video ratings based partly on pedagogical traditions (didactics, reform pedagogy), partly on empirical research in teaching effectiveness and classroom climate, and applied them to the German sample of the TIMSS 1995 Video Study. Through factor analysis of these ratings, Klieme et al. (2001) identified three “basic dimensions” labelled (1) Classroom management, (2) Cognitive activation, and (3) Student support (see also Klieme, 2019; Klieme & Rakoczy, 2003; Kunter & Trautwein, 2013, who provide detailed references to the research literature). Following Praetorius et al. (2018), these dimensions may be characterized as follows.

  • Classroom management covers two key principles of teaching: identifying and strengthening desirable student behaviours (e.g., through communicating clear rules and establishing stable routines) and preventing undesirable ones (e.g., through monitoring and intervening immediately if necessary).

  • Cognitive activation includes exploring and building on students’ prior knowledge and ways of thinking, assigning challenging problems, engaging students in higher-level thinking processes and metacognition—as suggested by constructivist concepts of teaching for understanding.

  • Student support is indicated by warmth and respect in classroom interactions, good social relationships, and teachers’ helping with student learning.

Several empirical studies supported the three-dimensional structure, using high-inference observations of classroom practice (Klieme et al., 2001; Rakoczy, 2008), student questionnaires (e.g., Fauth et al., 2014), or questionnaires combined with an assessment of teaching materials (Baumert et al., 2010; Kunter & Voss, 2011; for an overview of related research see Praetorius et al., 2018). Similar dimensions have been suggested by other researchers. In particular, the “Teaching Through Interactions (TTI)” framework, operationalized by the CLASS observation instrument (Hamre et al., 2013; Pianta & Hamre, 2009), includes classroom organization, instructional support, and emotional support, which bear some resemblance to the dimensions of TBD, even though specific definitions and operationalisations are not the same (Praetorius et al., 2018; Praetorius & Charalambous, 2018; see also Bell, 2020). Moreover, Diederich and Tenorth (1997) argued that classroom teaching requires a certain level of student attentiveness, student understanding, and student motivation, conditions which Klieme et al. (2001) related to the basic dimensions of their model.

However, it should be noted that within the TBD approach (in contrast to TTI and CLASS), there is no canonical operationalization. Consequently, findings regarding the dimensional structure of teaching measures vary sometimes by mode (questionnaires vs. observations) or by perspective (teachers vs. students); also by grade level and subject taught. Some studies have suggested a need for additional dimensions such as clarity (Nilsen & Gustafsson, 2016), subject matter quality (Lipowsky et al., 2018), and cognitive support (Kleickmann et al., 2020).

It should further be noted that the dimensions of teaching quality are not independent of each other. The lack of understanding of their relationships was a major criticism when Praetorius et al. (2020) evaluated the state of the art in TBD. Conceptually, teaching subject matter for student understanding and helping students to feel competent (a major aspect of student support) do overlap. Alternative modelling approaches developed outside of TBD suggest a hierarchy with classroom management as the foundational or “easiest” and cognitive activation as the most demanding area (Pietsch, 2010).

3.2 The Process Part of TBD: Explaining So-Called “Student Outcomes”

The theory developed by Diederich and Tenorth (1997) served as a starting point for outlining potential effects of the three basic dimensions on students, more specifically on their attentiveness, achievement and motivation. In order to provide more detailed arguments, the explanatory part of the TBD theory additionally refers to different paradigms of classroom research and learning theory:

  • Classroom management lays the foundation for learning by preventing disruptions, noise and disorder, e.g. through continuous monitoring of student behaviour (Kounin, 1970). A certain level of quietness and orderliness is a precondition for “time on task”, for attentively engaging with the learning content, which should have a positive effect on achievement (Evertson & Weinstein, 2013). If characterized by “informational behavioural regulation” rather than strict teacher control, classroom management may also foster students’ experience of autonomy and competence (Kunter et al., 2007).

  • Students’ achievement and depth of understanding will further depend on the way the learning content is framed and presented. Based on cognitive constructivist learning theories (Aebli, 1963; DeCorte, 2000; Piaget, 1955; Posner et al., 1982) it can be assumed that knowledge and understanding will be fostered if, among other things, students’ pre-knowledge is activated, new content challenges pre- or misconceptions, and students are required to provide arguments and negotiate meaning. TBD assumes that “Cognitive activation”, comprising those features, makes deep processing of the learning content more likely.

  • Finally, TBD assumes that “Student support” (including, among other things, opportunities for students to present their thinking, informative feedback, and respectful and warm relationships between teachers and students) fosters the experience of autonomy, competence, and relatedness which, according to the self-determination theory of motivation (Ryan & Deci, 2000), will deepen students’ learning motivation and interest in the subject matter.

Figure 3.1 summarizes the hypothesized paths in a mediated process-product type of model. However, empirical research based on pre-post-designs and carried out mainly in Germany and Switzerland mostly addressed direct effects of teaching quality on student learning.

Fig. 3.1 Process part of TBD (Praetorius et al., 2020, p. 20; adapted and translated from Klieme et al., 2006). [Figure not reproduced: a process flow diagram relating the opportunities provided (cognitive activation, classroom management, student support) to students’ use of opportunities and to effects.]

According to the review by Praetorius et al. (2018), the relationship between classroom management and achievement growth has been supported by the majority of studies—e.g. in secondary mathematics classes (Kuger et al., 2017; Lipowsky et al., 2009), primary science classes (Decristan et al., 2015), secondary German (reading) classes (Klieme et al., 2008) and English as a foreign language classes (Helmke et al., 2008). Some studies report classroom management to be additionally related to growth in student motivation (Doan et al., 2020; Kunter & Voss, 2011; Rakoczy, 2008). The effects hypothesized for cognitive activation and student support have found weaker empirical support. Cognitive activation was associated with achievement growth, e.g., for secondary mathematics (Dubberke et al., 2008; Kunter & Voss, 2011; Lipowsky et al., 2009), secondary German (reading) classes (Klieme et al., 2010), and primary science education (Decristan et al., 2015). Student support was associated with growth in students’ interest in primary (Fauth et al., 2019) and secondary schools (Klieme et al., 2008; Kunter, 2005). Yet, when restricting the review to the most powerful empirical design, multi-level longitudinal analyses modelling all three dimensions at once, less than half of the expected effects for cognitive activation and support were confirmed (Praetorius et al., 2018).
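
As a rough illustration of what a multi-level longitudinal analysis with all three dimensions entered at once specifies, a two-level random-intercept model for student i in class j could be written as follows. This is a sketch under assumptions: the symbols and the random-intercept structure are chosen here for illustration and are not taken from the reviewed studies, which differ in covariates and model details.

```latex
\begin{align}
Y^{\text{post}}_{ij} &= \beta_{0j} + \beta_{1}\, Y^{\text{pre}}_{ij} + e_{ij}, \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathit{CM}_{j} + \gamma_{02}\,\mathit{CA}_{j} + \gamma_{03}\,\mathit{SS}_{j} + u_{0j}
\end{align}
```

Here CM, CA, and SS stand for class-level scores on classroom management, cognitive activation, and student support; e and u are student-level and class-level residuals. Because the pre-test is controlled, each coefficient from gamma_01 to gamma_03 describes the association of one dimension with growth net of the other two, which is why correlated dimensions can compete for the same variance.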

Although not explicated in TBD, student support sometimes also correlates with achievement growth (e.g., Decristan et al., 2015, for science education in German primary schools), and cognitive activation correlates with change in student motivation (as shown, e.g., for mathematics classrooms in Shanghai in the TALIS Video Study; see Doan et al., 2020). One study even found cognitive activation to mediate effects of student support and classroom management on student interest in biology (Dorfner et al., 2018). Thus, functional consequences of teaching quality dimensions are not as clear-cut as expected. This may be due to the interrelation between the three dimensions discussed before. When two or all three (partially confounded) dimensions are included at once in predicting so-called “student outcomes”, oftentimes just one dimension prevails. So far, little is known about how the three dimensions of teaching quality interact and complement each other.
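
How partial confounding can let a single dimension prevail may be sketched with simulated data. The following is a minimal illustration, not a reanalysis of any cited study: the data-generating process, all coefficients, and the variable names (`cm`, `ca`, `ss`, `growth`) are assumptions, and a single-level regression stands in for the multi-level models used in the field.

```python
import numpy as np

rng = np.random.default_rng(1)
n_classes = 2000  # hypothetical number of classes

# Assumed data-generating process: the three ratings share a common
# component, so they correlate at about 0.5 with each other.
shared = rng.normal(size=n_classes)
cm = 0.7 * shared + rng.normal(scale=0.7, size=n_classes)  # classroom management
ca = 0.7 * shared + rng.normal(scale=0.7, size=n_classes)  # cognitive activation
ss = 0.7 * shared + rng.normal(scale=0.7, size=n_classes)  # student support

# By construction, only classroom management affects achievement growth.
growth = 0.5 * cm + rng.normal(size=n_classes)

# Taken one at a time, every dimension correlates positively with growth.
bivariate_r = [np.corrcoef(x, growth)[0, 1] for x in (cm, ca, ss)]

# Entered jointly, only the dimension with a true effect keeps a sizeable weight.
X = np.column_stack([np.ones(n_classes), cm, ca, ss])
beta, *_ = np.linalg.lstsq(X, growth, rcond=None)

print("bivariate r (cm, ca, ss):", np.round(bivariate_r, 2))
print("joint coefficients (cm, ca, ss):", np.round(beta[1:], 2))
```

In this constructed case the pattern has a simple cause, but the same surface pattern could equally arise from other structures, so it does not settle how the dimensions interact in real classrooms.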

Likewise, there has been little research testing the mediation part of the model, i.e. the hypothesis that teaching quality has an effect on so-called “student outcomes” through students’ attentiveness, cognitive activity, and feeling of self-determination. Some empirical findings supported parts of this mediation model (Rakoczy, 2008). Recently, a German enhancement to the TALIS Video Study, applying a post-hoc student questionnaire to measure individual mediators or “the individual use of learning opportunities”, confirmed effects of use on achievement and interest, but failed to establish mediation (Praetorius et al., 2020). To account for the interplay between the individual “use” and the “opportunities” provided by teaching, some researchers (e.g., Seidel, 2020) suggested moderation instead of mediation models, allowing for direct effects of either variable on “outcomes” plus an interaction term. However, as we argue in the following section, it is questionable whether “use” can in fact be disentangled from “opportunities” and measured through standardized student questionnaires.
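
The difference between a mediation and a moderation specification can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the analysis of any cited study: the helper `ols`, the variable names (`opportunity`, `use`, `outcome`), and all effect sizes are hypothetical, and ordinary least squares replaces the multi-level models that would be used in practice.

```python
import numpy as np

def ols(y, *predictors):
    """Return least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000  # hypothetical number of students

opportunity = rng.normal(size=n)  # assumed teaching-quality score
use = rng.normal(size=n)          # assumed individual use, unrelated to opportunity here

# Assumed moderation structure: both variables have direct effects on the
# outcome, and the effect of opportunity grows with use (interaction term).
outcome = (0.3 * opportunity + 0.4 * use
           + 0.2 * opportunity * use + rng.normal(size=n))

# Mediation view: path a (opportunity -> use) times path b (use -> outcome,
# controlling for opportunity) gives the indirect effect.
a = ols(use, opportunity)[1]
b = ols(outcome, opportunity, use)[2]
indirect = a * b

# Moderation view: direct effects of both variables plus their product.
interaction = ols(outcome, opportunity, use, opportunity * use)[3]

print("indirect effect (a*b):", round(indirect, 3))
print("interaction coefficient:", round(interaction, 3))
```

Under these assumptions, use predicts the outcome while the indirect effect stays close to zero, because opportunity does not shift use. A mediation test therefore fails even though opportunity and use jointly shape the outcome, which is the kind of pattern a moderation model is meant to capture.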

4 Teaching Effectiveness Beyond Claims of Unidirectional Causal Impact: The Concept of Opportunity and Use

The idea that teachers can directly cause student learning has long been questioned (see Biesta as well as Hiebert & Stigler, this volume). The criticism of mechanistic linear conceptions of causality in TER has been addressed with the concept of “opportunity and use”. This approach, which, as we argue, transcends mediated process-product models, is popular in German-speaking quantitative research on teaching effectiveness (Vieluf et al., 2020). A few authors (most from German-speaking countries) have also published research in English-language journals citing an opportunity-use-model (e.g., Brühwiler & Blatchford, 2011; Göbel & Helmke, 2010; Lipowsky et al., 2009). Yet, there is no publication in English which provides a detailed description of the theoretical concept and its background.

Fend (1981, 1982) and Helmke (2003), who developed the first opportunity-use models (meanwhile other authors have formulated additional variations of these models), built upon international mediated process-product research as well as discussions about “opportunity to learn” (McDonnell, 1995). The “opportunity” in their models refers to teaching processes and the “use” to individual cognitive, motivational and emotional processes, i.e. the mediators in mediated process-product research. Learning, according to opportunity-use-models, only takes place when the learning opportunities emerging during the lesson are used by the students. Additionally, the roles of the context at different levels of the educational system and that of the individual characteristics of students and teachers are recognised. These are conceptualized as independent variables affecting opportunities and/or use, but sometimes also as moderator variables that moderate the associations between opportunity and use.

As shown in Fig. 3.1, the process part of TBD has also been framed as an opportunity-use-model (Klieme et al., 2006; Kunter & Trautwein, 2013; Praetorius et al., 2020). The three dimensions—classroom management, cognitive activation, and student support—are thought to describe patterns of classroom interaction indicating a specific quality of learning opportunities. Their effects on student achievement and student motivation are conceptualized to be mediated by students’ use of the learning opportunities. Students are more motivated to learn and learn more, the more they get involved with the lesson content (“time on task”), the deeper they process this content, and the more they experience autonomy, competence, and relatedness during the lesson. Effective classroom management, cognitive activation and student support make this more likely.

Some parallels to Hiebert and Stigler’s (this volume) concept of “sustained learning opportunities (SLOs)” become apparent. In particular, the three dimensions refer to patterns of interaction “that emerge during classroom lessons from the interactions of multiple mediating variables to create the contexts in which learning occurs” (ibid., p. 62), and it is assumed that teachers contribute to their interactive emergence, but cannot directly cause student learning. However, Fend went further than that. He also drew on systems theory (Luhmann, 2002; Luhmann & Schorr, 1979) to map out the relation between teaching and learning. He argued that the same opportunities are not always used by all students and that they are more likely to be used by students whose psychic structures (e.g., pre-knowledge) they are compatible with. Hence, teaching has no universal quality, but needs to be adaptive to the particular needs of each individual student. This idea could be understood as implying the existence of multiple moderation effects, i.e., systematic variation in the strength of the relations between opportunities and use depending on different student and context characteristics. The context at different levels of the educational system as well as individual characteristics of students and teachers are assumed not only to affect the qualities of teaching and learning themselves, but also the relation between both.

Yet, Fend also argued that the potential influence of teachers on students is, even more fundamentally, limited by the psychic system’s own momentum (in the sense of the German term “Eigendynamik”) or by the “autonomous intentionality” (Fend, 2008, p. 130) of students. Whether and how students use an opportunity therefore remains to a certain extent uncertain.

To understand this argument, it appears helpful to include a short summary of some aspects of Luhmann’s complex constructivist systems theory, because Fend (in later publications of his opportunity-use-model) explicitly referred to this theory. Luhmann (1986) conceptualized systems as self-organized and autopoietic.Footnote 8 They need to ensure their continued existence and, to this end, they only take up information that is relevant for their survival and development. In the process, systems develop immanent structures and stabilize themselves, which implies that the elements of the system are continuously reproduced by the elements of the system. For social systems (e.g., school classes) and psychic systems (e.g., students and teachers) the elements are not substance but meanings. Social systems reproduce themselves on the basis of communication, while psychic systems reproduce themselves on the basis of thoughts. Different systems are operatively closed against each other: no system can contribute elements to the respective other system. Hence, no teacher can instil knowledge into students or change their thoughts, nor can the social system of the classroom directly produce changes in a student. However, different systems can be structurally coupled: A personal or social system can observe other systems, learn how they function, and start adjusting its own structures accordingly. Systems can further be self-reflective; they can notice “before-after differences”. Learning, in this perspective, means that structural changes in the psychic systems of students take place with the aim of adapting to an environment. Such changes are self-induced and need to build upon the existing structures. Teachers can only try to trigger and support them, but cannot intervene directly in the psychic structures of students. Thus, there is a “technology deficit” inherent to education, i.e., a lack of a linear causal relation between teaching and learning (Luhmann & Schorr, 1979).

Fend’s (1982, 2008) concept of opportunity-use refers to Luhmann’s notion of a “technology deficit” when arguing that teaching cannot directly cause student learning, but only open up or limit opportunities for individual and autonomous forms of accommodation, i.e., for cognitive processes of revising existing cognitive schemas, perceptions, and understandings so that new information can be incorporated.

Similar to many other theorists of teaching (e.g., Cohen, 1989 or Biesta, this volume), Fend (1981, 1982) also emphasized that teaching is a social interaction, which is inherently uncertain. In social situations it is impossible to know exactly how others think and feel and what they mean when they say or do something. The behaviours of all others are to a certain extent unpredictable. Each individual decides what to do and how to behave under considerable uncertainty (this is called “double contingency” by Luhmann (1986)). Therefore, teachers can contribute to the interactive emergence of learning opportunities, but they cannot determine them. How their doings and sayings are understood by students might differ from how they were intended, and reactions of students—which are to a certain extent unpredictable—also fundamentally shape the interactions. Hiebert and Stigler (this volume) also write that teachers cannot create SLOs on their own, but only together with the students. Yet, Fend made this argument more explicit by representing the relations between opportunity and use as reciprocal and moderated, and as affected by a certain “Eigendynamik” of the different systems involved.

The following example aims at illustrating all three argumentsFootnote 9: One strategy for stimulating a cognitively activating dialogue is asking questions like: “Well, could you please explain why you think so?”Footnote 10 Yet, such questions cannot directly change students’ thinking. The opportunity for cognitive activation inherent in such questions only unfolds when the student addressed by the question, or at least some classmates, understand the question as an invitation to reflect on their own preconceptions (some might, for example, understand it as implicit negative feedback revealing that what they had said was wrong and, consequently, pull out of the dialogue). When students understand the question as an invitation to reflect, they must also be motivated and able to cooperate and contribute to the dialogue by giving responses that offer insight into the way they construct the subject matter. Whether students are able and motivated is likely to depend on individual characteristics (their general learning motivation (trait), for example) as well as characteristics of the situation and momentary emotional states. Yet, how the student reacts is probably also to a certain extent spontaneous and incidental, a result of the student’s autonomous intentionality (sometimes even largely unmotivated students participate). So, how the student reacts to the teacher question depends on many things and is quite uncertain. Yet, this reaction fundamentally shapes the subsequent course of classroom interaction and, thus, also the emergence of further learning opportunities. For example, when the student who was asked to explain her thoughts answers: “no idea”, the teacher might insist or ask others. But when nobody replies, ultimately, the teacher cannot force students to think about the question, and will probably drop the topic. If students are often unwilling to get involved in such debates, then the teacher might give up and generally stop asking questions of this kind. 
If, however, the student participates and explains the reasons for her assumption, then the teacher gets a chance to inquire further about her ideas, ask why- and how-questions, and support the student with explaining her thoughts.Footnote 11 Then additional opportunities for reflecting preconceptions emerge in the interaction for the student herself, but also for her classmates. Hence, from this point of view, learning is not a consequence or “outcome” of classroom interactions, but rather it is part thereof, since students’ use (and non-use) of learning opportunities shapes the course of the classroom interaction and consequently also the emergence of further learning opportunities.

Modelling a causal chain of variables, such as “inquiring into students’ thoughts causes students to reflect on their ideas, which causes students to learn”, does not live up to the complexity of classroom interactions, where inquiring into students’ thoughts requires the participation of students who might decide not to participate, and where the use of learning opportunities and the moment-to-moment changes in students’ concepts and ideas are not only shaped by the opportunities but also shape the opportunities. Hence, an opportunity-use approach fundamentally differs from a mediation approach and even goes beyond a moderation approach. The reason is the highly interactive nature of classroom activities: opportunity and use, teaching and learning can hardly be separated. As a consequence, conceptualizing the interplay of teacher and student behaviours as well as their cognitions, emotions, and motivational states in the classroom is quite difficult in a quantitative research paradigm that assumes linear causality between separable elements (see also Fauth et al., 2020).

Taking the opportunity-use idea seriously, we now conclude that Fig. 3.1 does not reflect this idea properly. So far, TBD has mostly been presented as a classical mediated process-product theory, i.e., a typical example of TER assuming linear causal relationships. This view is now challenged from a true opportunity-use perspective. The complexity of reciprocal interrelations between opportunities and use, teaching and learning, and teacher and student behaviour in the classroom is also reflected in an inconsistency in operationalisations of TBD, which has been pointed out by Fauth et al. (2020): Items meant to assess TBD dimensions sometimes refer to student behaviours, sometimes to teacher behaviours, and some leave open whose behaviours exactly they are referring to. More specifically, “classroom management” sometimes refers to teacher actions aimed at preventing disruptions and sometimes to the occurrence of disruptions, i.e., student behaviour. “Student support” sometimes refers to teacher behaviours, e.g., the type of feedback they give, and sometimes it refers to the quality of relationships between students and teachers, which is inherently reciprocal. “Cognitive activation” is predominantly used as a label for specific teacher behaviours, such as inducing cognitive conflict, but sometimes it also refers to students’ contributions to the classroom discourse, such as providing reasons for their answers to teacher questions (see also Praetorius et al., 2018). In a traditional mediation model, teacher and student behaviours have different positions within one causal chain and are, thus, not interchangeable. From an opportunity-use perspective it could be argued that opportunity and use are separable only at the level of concrete doings and sayings, i.e., single utterances or single gestures, because they stand in a complex non-causal but reciprocal relation. 
What we see when we observe learning opportunities is often the result of a complex process of situational adaptations of what teachers had planned and developed beforehand to their perception of students’ needs in any concrete situation (and sometimes also to their own situational needs) and students’ contributions to the interaction. Thus, the teacher and student behaviours observed in the context of TBD research might be considered to be different sides of the same pattern of interaction.Footnote 12

In conclusion, opportunity-use-models suggest a reconceptualization of the relation between teaching and learning that better takes the interactive, and therefore uncertain, nature of teaching and the “technology deficit” into account and conceptualizes learning as an autonomous process that cannot be enforced from the outside. What the opportunity-use model does not explain well is why specific sequences of interaction frequently emerge during lessons even though teacher behaviour cannot cause student behaviour and vice versa. Why do, for example, many (but not all!) students stop chatting with their neighbour when the teacher gazes at them? The gaze is unlikely to have a causal effect, since it does not physically prevent chatting. Yet, framing the gaze as an “opportunity” to stop chatting is also not fully convincing. In the following we argue that perspectives from practice theories can make a significant contribution to answering this question.

5 Perspectives from the Paradigm of “Practice Theories”

The notion of “teaching practices” or “classroom practices” is oftentimes used when getting into details of classroom interaction and measurement thereof (e.g., Bell et al., 2020a). Creemers and Kyriakides (2015, p. 108) consider “understanding effective teaching practices” to be the main goal of process-product-studies on teacher (!) effectiveness, but they do not provide any definition of “practices”. Rather, they move on to list strands of “teacher behaviour”, ultimately elaborating eight “teacher factors that attempt to measure teacher behaviour in the classroom” within their dynamic model (ibid, p. 116). Likewise, Ball refers to teachers’ classroom activities such as explaining, eliciting, diagnosing, and providing feedback as “high leverage teaching practices” (Ball & Forzani, 2009). Ball’s conception of “practices”, which is very influential in the US, also includes generic aspects of teaching such as “implementing organizational routines”, “coordinating and adjusting instruction during a lesson”, “building respectful relationships with students”, and professional activities outside of classrooms (e.g. “talking about a student with parents or other caregivers”). As in Ball’s list, descriptions of “practices” are often focused on the teacher, although in observation and measurement it is acknowledged that the enactment of practices is a co-construction by teachers and students (Bell et al., 2012). All in all, for at least 20 years (see Walberg & Paik, 2000), the term “teaching practices” has been used in a pragmatic way, without clear definition, when describing, classifying, or measuring activities inside or even outside the classroom (see Grossman et al., 2009; Lampert, 2010).

In contrast to this pragmatic and rather fuzzy talk of “teaching practices”, there is a deeper theoretical tradition of “practice theories”, based in sociology.

In Germany there is already a large body of research on teaching based on practice theories (e.g., Breidenstein, 2006; Idel & Rabenstein, 2013; Kolbe et al., 2008; Reh & Rabenstein, 2013; Reh et al., 2011). Also in the English-speaking discourse this perspective has gained significance (e.g., Edwards-Groves, 2017; Grootenboer & Edwards-Groves, 2019; Herbst & Chazan, 2003).

Sociological “practice theories” are heterogeneous in many regards, but commonly refer to an understanding of practices influenced by American pragmatism (Peirce, Dewey, and James) and by Wittgenstein. Fundamental for the development of practice theories are the works of Bourdieu and Giddens, as well as the late work of Foucault. Butler, Latour, Garfinkel and Taylor are also often referenced in this context. Schatzki (1996) as well as Reckwitz (2002, 2003) have worked out the commonalities of these theories to further develop the foundations of a practice theory.Footnote 13 Similar to Luhmann’s (1984) constructivist theory of social systems, practice theories can be considered a sub-type of “cultural theories”, i.e., of theories “which explain or understand action and social order by referring to symbolic and cognitive structures and their ‘social construction of reality’” (Reckwitz, 2002, p. 246). However, while Luhmann described the social as systems that self-reproduce through communication, practice theories argue that the social consists of practices which include more than communication.

A practiceFootnote 14 has been defined as a nexus or “set of hierarchically organized doings/sayings, tasks and projects” (Schatzki, 2002, p. 73) and as a “routinized way in which bodies are moved, objects are handled, subjects are treated, things are described and the world is understood” (Reckwitz, 2002, p. 250). Practices also encompass know-how, affects, ends, and purposes (which are not considered to belong to an individual but to be part of the practice) as well as artefacts. All the elements connected within a practice routinely occur together in a specific way and form a block of meaning that is intersubjectively understood (Reckwitz, 2002, p. 249). Yet, this meaning is largely tacit.Footnote 15

This definition is in accordance with a common understanding of practice as “a habitual way or mode of acting” (e.g. Lampert, 2010). Yet, practice theories go beyond this and they understand practices not as an individual habit but as collectively shared. Practices do not serve the purpose of an individual, they include a shared purpose in themselves. They also include mental doings and sayings and affects as well as artefacts, not only physical doings or sayings. Moreover, practices are not a concrete combination of elements. In carrying out practices there is always the possibility of small changes in interpretations and patterns of action, so there are always nuances which do not necessarily change the intersubjective meaning of the practice (Reckwitz, 2003; Reh et al., 2011; Schatzki, 1996). The practice theoretical perspective further has a “flat ontology” and does not distinguish between macro and micro levels (Schatzki, 2016). This means that the term “practice” can refer to events of differing complexity.

Teaching itself can be understood as a complex social practice, but it is also the interconnection (“bundle” or “complex”) of a multiplicity of more basic practices (e.g. putting one’s hand up, picking somebody, answering, looking for help, helping, reading, calculating, see e.g. Reh et al., 2011, p. 214). German research on teaching rooted in a practice theoretical perspective has often focused on two practices: pedagogical pointing and addressing (Idel & Rabenstein, 2013; Reh et al., 2011; Ricken, 2009).Footnote 16 These practices and many (maybe even all) other basic practices included in the practice of teaching can be found in other social contexts as well. Yet, the way they hang together within the practice of teaching is specific. Thus, in the practice theoretical perspective, practices (rather than social norms, accumulated individual rational choices, or the autopoiesis of systems) are the foundation of the social order in the classroom, i.e. the reason for the constancy and continuity of patterns of doings and sayings in the classroom. To go back to our example in Sect. 4: From a practice theoretical perspective, students stop chatting with their neighbour when the teacher gazes at them, because they have come to participate in a practice of “studenting”.

Learning can be considered part of every practice inside and outside of schools, including teaching (Lave, 1993; Lave & Wenger, 1991). Learning is existential to social practices as such, because “practices exist only if learned” (Schatzki, 2017, p. 34). Performing social practices always requires “knowing how to x, knowing how to identify x-ings, and knowing how to prompt as well as to respond to x-ings” (Schatzki, 2002, p. 77). Hence, coming to participate in a practice involves learning or coming to know what is needed to participate in it. It is coming to be able to carry out the sayings, doings, tasks, and projects that compose a practice, attaining increasingly greater facility with the performance, performing a wider variety of actions that make up the practice, using the artefacts, organisms, and things and arrangements in the settings where practices are carried out more flexibly and skilfully, choosing better what to do in a practice, coping better with relevant rules and starting to contribute to the determination of normativity related to the practice (Schatzki, 2017, pp. 31–34). This requires propositional knowledge, but, in particular, practical understandings or “know-how” (ibid, p. 24) as well as routinized modes of intentionality, i.e. of wanting or desiring certain things and avoiding others, and also a certain emotionality (Reckwitz, 2002, p. 254). Hence, learning, from a practice theoretical perspective, is that transformation of a subject which is necessary for participation in the social practices a learner encounters (Lave & Wenger, 1991; Schatzki, 2017).

6 Reflecting TBD from the Background of Practice Theories

In the previous Sects. 2, 3, 4 and 5, TER, TBD, opportunity-use-theories, and practice theories were introduced and discussed separately. In this section we aim at bringing those perspectives together. Practice theories and TER, including TBD, appear to have little in common at first sight. TER assumes that the mind is the place of knowledge and meaning structures. TER even aims at finding out how the minds of students can be changed purposefully in a specific way. Practice theories, in contrast, locate know-how and meaning within practices. Similar to systems theory and the opportunity-use model of Fend (1981, 1982), practice theories further advocate an understanding of learning as situated (Ricken, 2009) and reject the idea that teaching processes can purposefully “produce” changes in students’ minds. Another fundamental difference between TER and practice theories concerns normativityFootnote 17: TER implicitly presumes that a high score in an achievement test (or a motivation questionnaire) is a desirable goal and central aim of schooling—which is an a priori normative decision (e.g. Sauerwein & Klieme, 2016). Practice theoretical research rather aims at understanding the inner logic of teaching (e.g., Fritzsche et al., 2010, p. 97). Its stance has often been described as one of “normative abstinence”. Practice theoretical research on teaching reconstructs implicit ends and shared (often implicit) understandings of what is appropriate and not appropriate as part of practices. But it mostly refrains from determining which ends teaching should have, and from evaluating practices as good or bad, effective or ineffective, from the normative perspective of the researcher. Of course, education is always normatively charged and practices reconstructed may well bear normative consequences. However, the normative evaluation is ultimately left to the reader. 
Thus, there is more room for ambiguity, ambivalence and contradictions in this paradigm than in TER.Footnote 18 Accordingly, research on teaching that uses practice theories as its theoretical foundation mainly uses qualitative methods, whereas TER mainly uses quantitative methods. Yet, it is precisely these fundamental differences between the two paradigms which make it interesting to bring them together. Referring to Mannheim’s (1931/1995) theory of perspectivism (“Standortgebundenheit”) and inspired by the ethnographic strategy of “alienation” (“Befremdung”; Hirschauer & Amann, 1997) we argue that a deeper understanding of the familiar can be achieved when it is moved into distance, when it is irritated by taking a different perspective. More specifically, we argue that practice theories can help develop a conception of the relation between teaching and learning beyond the assumption of a linear causation and that they can contribute ideas on how to better take the interactive nature of classroom teaching into account in TER.Footnote 19

6.1 Associations Between Teaching Dimensions and So-Called “Student Outcomes” Reinterpreted from a Practice-Theoretical Perspective

From the practice theoretical perspective, the observed correlations between teaching dimensions and so-called “student outcomes” (e.g., changes in achievement test results or in measures of learning motivation, etc.) can be seen in a different light. A practice theoretical perspective suggests understanding teaching as well as test-taking and questionnaire-responding each as specific bundles of practices:

Test-taking describes a practice of producing written (or sometimes oral) responses to questions or assignments, which fulfil certain criteria like being presented with a characteristic expressive style, having a certain structure, being focused, etc. The practice of test-taking might further be interpreted as one variation of the practice of pointing, more specifically, a form of “re-pointing”, i.e., of showing and explaining to the teacher something that he or she had shown and explained before. Often, test-taking also involves general academic practices (e.g., argumentation), and subject-specific academic practices (e.g., mathematical reasoning or solving quadratic equations). Hence, the results of a specific test can be seen as indicative of students’ participation in a specific nexus of practices at a certain point in time; a nexus of practices that has a priori been defined as “good” within research practice (or professional practice or policy guidelines).

Scores in questionnaires aimed at measuring so-called “student outcomes” alternative to achievement (e.g., learning motivation) can also be considered the result of a specific practice of filling out questionnaires. They are further self-reports of individual prior participation in the practices the questionnaire asks about, e.g. active learning. Sometimes they only focus on the affective component of these practices, e.g. experiencing enjoyment during learning.

Test scores and questionnaire responses might thus be seen to reveal whether students have come to participate in a priori defined and normatively charged practices. However, they can inform only whether students have come to participate in these practices, but not where (inside or outside school). In contrast, indicators of teaching dimensions (codings and ratings done by external observers or by participants themselves) inform about the presence or absence of specific a priori selected practices or bundles of practices during lessons. For example, a high score on the rating dimension “disciplinary climate” for a lesson indicates the absence of practices of disruption and the presence of the practice of collectively focusing attention on a defined learning content. A high score on the scale “cognitive activation” for a lesson indicates that practices such as irritating preconceptions or arguing have been present during that lesson (Klieme, 2019; Rakoczy & Pauli, 2006; Reusser, 2006; Schreyer et al., in press). Practices of using “errors” as learning opportunities or the absence of practices of social devaluation are, for example, observed in classrooms with a high score for “student support” (Rakoczy & Klieme, 2016).

When test-taking, questionnaire responding and teaching are all understood as complex bundles of practices, then correlations between so-called “student outcomes” and teaching dimensions can come about for three reasons:

First, the teaching dimensions refer to practices that are also part of test-taking. When a large part of the lesson time is spent on practicing these practices and many students participate, then it is likely that students will also participate skilfully in these practices when taking the test. Solving mathematical equations is one example for such practices that might be practiced during a lesson and later be part of an achievement test. Other examples include “comparing and evaluating different task solutions” or “providing reasons for answers to a question”, which are both part of instruments aimed at measuring the dimension “cognitive activation” (see Praetorius et al., 2018). Hence, it appears that “cognitive activation” during the lesson should be particularly closely associated with test results. Empirical evidence for this is mixed (ibid.). One reason might be that in many studies measures of “cognitive activation” and achievement tests are not systematically aligned in terms of including similar practices.

Second, correlations between ratings/codes and test-scores can also come about when practices, which are observed to be frequent during a lesson, preclude participation in practices that are part of test-taking. This concerns, in particular, practices of disturbing a lesson which are sometimes observed as indicators for the dimension “classroom management”. When students participate a lot in these practices, then they have less opportunity to come to participate in practices that will be part of the test, such as argumentation, solving mathematical equations, etc. In fact, many (but also not all) studies examining effects of TBD on so-called “student outcomes” reviewed by Praetorius et al. (2018) found negative correlations between the presence of disruptions and discipline problems in the classroom and outcomes. One reason for the inconsistent results could be that students also practice the practices relevant for the test at home when it is too noisy in the classroom.

Third, correlations can also be observed when practices involved in taking a test form a nexus with practices assessed by an observation instrument to measure teaching dimensions. For example, specific teacher practices of asking “why and how questions” can be associated with specific types of student argumentation, but not in a sense that “why and how questions” cause student argumentation. Rather, students may have come to participate in the practice of answering “why and how questions” with a specific type of argumentation. In this case, a correlation between the teacher practice of asking “why and how questions” and student test-scores would be observed if taking the test required argumentation practices, because the former would imply that students have often practiced argumentation in the classroom.

This interpretation of the teaching dimensions and their associations with so-called “student outcomes” has some parallels with the concept of “opportunity to learn” (McDonnell, 1995), only that it is not exclusively focused on content but on practices more generally. And it can better explain the empirical observation that correlations between teaching dimensions and so-called “student outcomes” are often only weak or moderate and sometimes expected effects are even absent (see e.g., Seidel & Shavelson, 2007 for TER in general and Praetorius et al., 2018 for research on TBD) than the idea of a linear causal effect between teaching and student learning. More specifically, one reason could be that practices, for example the practice of answering “why and how questions” by providing a certain type of arguments, exist in some classes only and not in others (because here students and teachers have not come to participate in it) or even only for some students, but not for others (because some have come to participate in the practices, others not). Then a correlation cannot be observed universally. Another possible reason for weak correlations is that the two types of research instruments (classroom observations vs. tests or other “outcome”-measures) provide information about the prevalence of practices in the classroom from different angles and with different blind spots: Classroom observations allow for exploring in much detail what happens in the classroom and who participates in which practices, which artefacts are used, etc. during one or several specific lessons. However, accessing mental doings and sayings is difficult through classroom observation. It is, for example, difficult to observe whether students in the classroom, who are not actively participating in a classroom debate, nevertheless formulate answers to the teacher questions “in their heads” or whether they drift off to think about something else. 
A test can help answering the latter question to a certain extent. However, with test results it can never be excluded that high test scores only reflect that students have participated in relevant practices at home with parents or friends. This is a particular weakness of using achievement tests in research on teaching. Hence, even when tests and observations are well-aligned and really provide different proxies for the presence of the same practices, they are still likely to differ in their results to a certain extent. The practice theoretical perspective can create a particular awareness of this difficulty in research on teaching, because it emphasizes that both types of instruments ultimately aim at assessing similar practices.
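
The first of these reasons can be made concrete with a small simulation. The following Python sketch is purely illustrative and not part of the studies cited: the classrooms are hypothetical, the function and variable names are invented, and all numeric values are arbitrary assumptions. In one half of the simulated classrooms a nexus between teacher “why and how questions” and student argumentation exists, in the other half it does not, and the correlation is computed within each group and across all classrooms.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_classrooms(n_per_group=2000, seed=1):
    """Simulate hypothetical classrooms with and without the practice nexus.

    Every value here is an illustrative assumption, not empirical data.
    Returns two lists of (question_frequency, test_score) pairs.
    """
    rng = random.Random(seed)
    with_nexus, without_nexus = [], []
    for _ in range(n_per_group):
        # Nexus present: frequent questions go along with practiced argumentation.
        q = rng.uniform(0, 10)
        with_nexus.append((q, q + rng.gauss(0, 2)))
        # Nexus absent: test scores vary independently of question frequency.
        q = rng.uniform(0, 10)
        without_nexus.append((q, rng.uniform(0, 10) + rng.gauss(0, 2)))
    return with_nexus, without_nexus


def correlation(pairs):
    """Correlation between question frequency and test score across classrooms."""
    questions, scores = zip(*pairs)
    return pearson(questions, scores)


with_nexus, without_nexus = simulate_classrooms()
r_with = correlation(with_nexus)
r_without = correlation(without_nexus)
r_pooled = correlation(with_nexus + without_nexus)

print(f"nexus present: r = {r_with:.2f}")
print(f"nexus absent:  r = {r_without:.2f}")
print(f"pooled:        r = {r_pooled:.2f}")
```

Under these assumptions the pooled coefficient falls between the two within-group coefficients, so a weak overall correlation is compatible with a strong association in a subset of classrooms.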

Giving up the idea of causality in TER is radical and we do not argue that all TER should do this. However, we think that going along with one alternative argumentation can be instructive and a good complement. In particular, it might be insightful to examine associations between teacher and student practices not only under the assumption that they cause each other, but also under the assumption that they may be associated through reiteration of the association, and, consequently, solely in some classrooms but not in others. This implies that, in the future, research on teaching quality should reflect more systematically on similarities between the practices needed for taking achievement tests and the practices enacted in classrooms (research on instructional sensitivity already moves in this direction, see e.g., Naumann et al., 2019). Moreover, it suggests that research on teaching quality should not only search for strong correlations, but also systematically examine differences between classrooms regarding the size of correlations between teacher and student practices, regarding patterns of behaviour-response. A practice theoretical perspective further raises awareness that, in schools, students learn constantly and not only subject-specific academic content; they also come to participate in many other social practices. This points to a need for identifying practices that students should not come to participate in at school. High quality teaching might not only imply that students learn normatively desirable practices such as argumentation, solving mathematical problems, interpreting poems and the like, but it might additionally imply that students do not come to participate in practices that can be considered undesirable, such as devaluing others to secure one’s position of power or denying oneself in order to be accepted. 
Of crucial interest might further become the process of initiation into “high-quality” classroom practices as well as that into practices considered “low quality” (for an example of research examining the process of initiation see Kemmis et al., 2014). The latter type of research might also focus on the question of why some students in some classrooms do not come to participate in “high-quality” practices while others in the same classroom do. Consequently, an important approach to researching teaching effectiveness might become the detailed reconstruction of the interactive emergence of “high-quality” as well as “low-quality” practices (for an example of a reconstruction of the interactive emergence of cognitive activation see Schreyer et al., in press) as well as the reconstruction of shared meanings or “practical rationality” (for an example see Herbst & Chazan, 2003), in combination with the common quantitative analysis of correlations.
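The suggestion to examine differences between classrooms in the size of correlations between teacher and student practices can be illustrated with a minimal sketch. It is not a method taken from the studies cited here: the data layout (one teacher score and one student score per observed lesson segment), the function names, and all numbers are assumptions invented for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences.

    Assumes both sequences vary; a constant sequence would divide by zero.
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_by_classroom(segments):
    """Map classroom id -> correlation between teacher and student scores.

    `segments` is a list of (classroom_id, teacher_score, student_score)
    tuples, one per observed lesson segment (hypothetical data layout).
    """
    grouped = {}
    for classroom, teacher, student in segments:
        teacher_scores, student_scores = grouped.setdefault(classroom, ([], []))
        teacher_scores.append(teacher)
        student_scores.append(student)
    return {c: pearson(t, s) for c, (t, s) in grouped.items()}


# Invented illustration data, not results from any study.
segments = [
    ("A", 1, 1), ("A", 2, 2), ("A", 3, 3), ("A", 4, 4),
    ("B", 1, 2), ("B", 2, 1), ("B", 3, 2), ("B", 4, 1),
]
by_class = correlations_by_classroom(segments)

# Range of the classroom-specific correlations: a large value signals that
# the teacher-student association holds in some classrooms but not in others.
spread = max(by_class.values()) - min(by_class.values())
```

In this sketch the pooled correlation would hide the fact that the association is strong in classroom A and absent or reversed in classroom B; the classroom-specific values and their spread make that difference an object of analysis in its own right.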

6.2 The Sequentiality of Classroom Interactions and Implications for the Observation of Teaching Dimensions

The extension of perspective inspired by practice theories, mapped out in this chapter, might also be instructive for a further development of instruments for classroom observation with the aim of better taking the sequentiality of classroom interactions into account. Quantitative observation-based analysis of teaching can take the form of low-inference scoring or high-inference rating. For low-inference scoring, the occurrence of observable, separate events, types of utterances or types of questions during the lesson is counted or classified. Examples are Bales’ Interaction Process Analysis (Bales, 1976) and measures of teacher clarity (e.g., Rosenshine & Fürst, 1971). In contrast, high-inference rating requires more interpretation. The observers assess the degree or intensity (but sometimes also the frequency or a combination of frequency and intensity) of more complex patterns of teacher-student interactions. Gage and Needels (1989) argued that even low-inference coding requires that events preceding and following the behaviour of interest be used as contextual information to infer meaning. Nevertheless, many of the earlier instruments coded rather isolated behaviours of teachers or students and used preceding and subsequent events only in a rather indirect way, as background information for choosing the correct code.

In contrast, recent high-inference observation protocols explicitly acknowledge the complexity of social interactions in the classroom. Two examples are the CLASS system (Hamre et al., 2013) and the observation system recently developed for use in different education systems in the TALIS Video Study (Bell et al., 2020b). For many codes included in these instruments, raters are instructed to use evidence from both teacher and student behaviour. Some codes even explicitly refer to the dynamics of teacher-student interaction, reflecting the foundational assumption that “teaching is intertwined with learning” (Bell, 2020, p. 57). This is true, e.g., for “Aligning instruction to student thinking”. The observation manual for this component (Bell et al., 2020b, p. 75) actually refers to two types of interactions:

  • “The teacher uses students’ contributions.” The manual identifies “four types of evidence that count as using student contributions”, e.g. “asking a question in response” or “having students provide the next step”, and provides several examples, e.g.: “A student gives an answer and the teacher says to another student ‘Is that correct?’”; “Students are working in groups and the teacher selects groups to present their work in front of the whole class”.

  • “If students make errors or struggle mathematically, the teacher provides cues or hints to support student understanding”. Again, the manual provides a definition of “cues and hints” and several examples, e.g.: “Look at it again, here, look at this side.”; “Anything else?”

The manual further specifies grading schemes for the two types of interactions, discriminating by frequency (“not at all—rarely—sometimes—frequently”), which raters shall apply to segments of 15 min. In addition to the written manual, the observation system is comprised of training procedures, including master-rated training videos to be discussed between master raters and trainees. Thus, Bell et al. (2020b) conceptualize rating as socially co-constructed: a professional practice sui generis.

Yet, a suggestion made by Reh and Rabenstein (2013) goes even beyond this. They proposed making more use of the respective interaction partner’s reactions for interpreting a doing or saying in the classroom and inferring the meaning of the situation. They illustrated this with the example of a teacher saying to a student: “you did this well”, which can be praise but also sarcasm. Relevant for the further course of the interaction is not the interpretation of this event by external observers, not even what the teacher intended to communicate, but first and foremost the interpretation of the addressee as well as that of the by-standing students. Another example illustrating this suggestion actually comes from the TALIS Video Study. The observation protocol developed in this study was used in different education systems. To address potential “cultural” differences it states: “To understand whether a disruption is occurring in a specific culture, the raters must attend to how the other students and teacher react to the behaviour. A student eating food in class might not be a disruption in a classroom in one country’s context but in another, it is a disruption.” (Bell et al., 2020b, p. 28). Arguably, differences in the interpretation of eating in class might not only be related to different traditions in different regions of the world, but they might also differ between schools (depending on school cultures) and even between individuals within schools. Hence, Bell et al.’s argument might be put in more general terms: Eating in the classroom very likely has no universal meaning. Relevant for the further course of the classroom interaction is, therefore, not the objective event as such, but the meaning attributed to the event by those present in the situation.

The crucial point for operationalisations of TBD and similar dimensional frameworks is the following: In order to come up with quantitative measures, certain episodes of teacher-student interaction need to be qualitatively understood. Ratings of teaching quality may require raters to identify instances of certain teaching practices, to reconstruct episodes (e.g., does a student “struggle”? Does the teacher react to this “struggle”? Is this reaction meant and/or perceived as supporting student understanding?), and to judge qualities of their enactment (e.g., does some teacher utterance qualify as a hint? Does some student behaviour qualify as a disruption?). As teacher and student behaviour, opportunity and use, are inextricably connected within such episodes, raters need to develop a holistic understanding of classroom activity, its co-construction by all participants and the socio-cultural fibres woven into it. Hence, in the process of coding/rating, it might be helpful to make greater use of the larger situational context and, in particular, of the reactions of interaction partners to an event of interest in order to infer the meaning of that event. Methods developed within a qualitative research paradigm, in particular in the context of research examining practices, might be a useful basis.

The suggestion made by Reh and Rabenstein further raises awareness that for understanding teaching and learning in the classroom it may not only be important whether teachers do something during a lesson, but also how they do it. Going back to the example “aligning instruction to student thinking” presented above, the manual states: “The teacher uses students’ contributions” and provides an example: “A student gives an answer and the teacher says to another student ‘Is that correct?’”. In this example, the question “Is that correct?” can be understood in different ways: Some students might think that the teacher implies that the first answer was definitely not correct and that they should present the correct response instead. Other students might feel invited to think about the first student’s answer. In both cases the teacher has used students’ contributions. However, in order to understand classroom routines relevant in the context of “cognitive activation” or a deep processing of the learning content, it additionally appears relevant how the teacher used the student contribution and, in particular, how the students perceived and interpreted this use.

Another issue is the choice of the coding unit: Observation systems often focus on specific and concrete behaviours (low-inference systems) while systems previously used in the context of TBD mostly used broad characteristics of the whole lesson (high-inference systems). High inference ratings often create an ideal picture of teaching without informing how exactly this ideal, e.g., a quiet and engaged or supportive climate, emerges in interactions. Low-inference ratings, on the other hand, often inform whether and how often specific behaviours occur during the lesson but not why. Another alternative may be identifying blocks of meaning inspired by the idea of practices as interconnected elements—forms of bodily activities, forms of mental activities, ‘things’ and their use, understandings, states of emotion and motivational knowledge. Specific behaviours within such units of meaning would be considered interchangeable; a practice can involve different behaviours and still be the same practice (Reckwitz, 2003; Reh et al., 2011; Schatzki, 1996). Sophisticated protocols such as TALIS-Video are in fact referring to such complex units of meaning, as shown above. It should be noted that the rating ultimately aims at grading some “quality component”, such as the degree of alignment between instruction and student thinking, cutting across various practices. The degree of alignment between instruction and student thinking, as rated in the TALIS-Video protocol, does not indicate a certain practice in the sense of practice theories. It is a more abstract measurement of a feature that cuts across various practices. As shown above, implementing the protocol requires raters to understand the type and quality of practice they observe, but the rating as such refers to the abstract feature rather than the practice as a unit of meaning. 
Yet, to make this inference, it is important to understand as well as possible the different practices in which this abstract feature shows itself.

Considering the importance of bodies (e.g. pointing with a finger, smiling) and artefacts (e.g. the blackboard, a pen for writing or an experimental kit) more systematically might also be helpful for realizing this. For example, facial expressions and gestures of teachers and students indicating excitement might suggest that the teacher question “Is that correct?” is, in this class, routinely the start of a lively debate about solutions to math tasks that the teacher and (at least some) students usually enjoy. A sceptical facial expression of the teacher asking this question might, in contrast, indicate that the teacher is not content with the prior answer given by the student and wants other students to correct it. Yet another teacher might ask this question and, in the same moment, take a piece of chalk and turn her back to the class. This probably indicates that the question “Is that correct?” is meant as an invitation to demonstrate the solution step by step while the teacher notes it on the chalkboard. Explicitly including descriptions of bodies and artefacts in coding manuals could help to increase the reliability and validity of coding and ratings of teaching dimensions.

Another difficult question is the choice of level for analysing teaching effects. Ethnographic analysis of teaching rooted in practice theories usually identifies specific and characteristic situations which often involve only a few students, not necessarily the whole class. TER often uses multilevel models and focuses on the class level. It could be argued that within-class differences should receive more attention in this latter strand of research. A large body of research shows that teachers interact differently with different students in the classroom and that students participate in very different ways in classroom practices. For example, students perceived as struggling more with learning often get more learning support and less pressure, but teachers often give high achievers more warmth and emotional support (Babad, 1993). High achievers are further often more involved in whole-class interactions than low achievers and, consequently, get more opportunities for “practicing” several practices such as argumentation (Brophy, 1983). Even the same classroom situation provides different opportunities for different students. Schatzki (2017) pointed out: “Learning also takes a course in the literal sense that its occurrences form a broken space-time path through bundles of practices and arrangements (cf. Dreier’s notion of personal trajectories). The shape taken by any such path typically reflects opportunities to learn that are afforded at particular space-time locations in bundles: at or in particular workstations, stoves, classrooms, training fields, meeting rooms, and the like” (p. 30). Whether and how students can participate in classroom practices also depends on their prior participation in related practices, both inside and outside school. Hence, it appears most realistic to judge the quality of the lesson for each individual student separately. 
At least, the evaluation of teaching quality should take intraclass differences into account in some way, e.g., by using variances and extreme values in addition to mean scores or by including information on how many of the students are participating in which practices during lessons (see also Vieluf et al., 2020; either type of score specification has also been used for some codes in Bell et al., 2020b).
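What such a class-level summary could look like is sketched below. This is only an illustration under stated assumptions, not the scoring used in any of the cited instruments: it presupposes that one quality score and one participation flag are available per student, and the function name, field names, and data are hypothetical.

```python
from statistics import mean, pvariance


def summarise_class(ratings, participated):
    """Summarise student-level information for one class.

    ratings: one (hypothetical) quality score per student.
    participated: one boolean per student, True if the student was
        observed taking part in the practice of interest.
    """
    return {
        "mean": mean(ratings),
        "variance": pvariance(ratings),  # population variance within the class
        "min": min(ratings),
        "max": max(ratings),
        "share_participating": sum(participated) / len(participated),
    }


# Invented illustration data: two classes with the same mean score.
class_even = summarise_class([2, 3, 2, 3], [True, True, True, True])
class_uneven = summarise_class([1, 1, 4, 4], [False, False, True, True])
```

Both invented classes have the same mean score, so a mean-only evaluation would treat them as equivalent; the variance, the extreme values, and the share of participating students show that the lesson was a very different experience for different students in the second class.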

7 Conclusion—With a Response to the Questions Guiding This Book

  • Do we already have a theory/ theories on teaching? If so, which are they?

  • In the future, in what ways might it be possible, if at all, to create a (more comprehensive) theory of teaching?

The present chapter answers the first question with a clear “yes”: There is a multitude of theories of teaching (for the German-speaking context see e.g., Lüders, 2014). The aim of the present chapter, however, is not to provide an overview. Rather, it brings together two disparate paradigms, TER and practice theories, with the aim of refining one specific theory, TBD, by reflecting on and scrutinizing it from the perspective of practice theories.

At the same time, we are reluctant to answer the second question in an affirmative way. From our perspective, creating “A” comprehensive theory of teaching does not seem to be a reasonable goal of scientific discourse. The reasons for this position are discussed in combination with a response to the third, meta-theoretical, and the fourth, more substantive question:

  • What is a theory (of teaching)?

  • What should it contain and why?

“Theory” is a fuzzy concept (see also Praetorius & Charalambous, this volume). Definitions of “theory” differ considerably between research paradigms, depending on epistemological and ontological perspectives (see e.g., Abend, 2008; Zima, 2017). The goal of creating “A” comprehensive theory of teaching only makes sense within the traditional “statement view” of theory from critical rationalism (Popper, 1965/2005), which assumes a theory to be a coherent set of definitions, axioms, derived hypotheses, and empirical statements testing (i.e. potentially falsifying) these hypotheses. Within this perspective various criteria for the quality of theories have been formulated, such as consistency of statements, parsimony and inclusion of definitions of all terms, but also testability and empirical support (e.g., Kane & Marsh, 1980; see also Praetorius & Charalambous, this volume). TER is associated with this epistemological perspective (Scheerens, this volume). “Theory” here usually consists of constructs covering various elements and features of classroom teaching, procedures operationalizing those constructs, and models linking them with student learning and other constructs which have been a priori defined as desirable outcomes of schooling. Teaching effectiveness theories attempt to explain and predict so-called “student outcomes”, explicitly modelled as effects of the learning environment. Earlier work within this paradigm often just listed or grouped variables that had been identified as correlates of student achievement. Current work in TER, such as the TBD, includes more complex sets of statements, including theoretical postulates about why specific teaching dimensions have effects on student learning and other so-called “student outcomes”. 
These theories may still not live up to the quality criteria formulated by Kane and Marsh (1980, for a specific discussion of TBD in light of these criteria see Praetorius et al., 2020b), but they are closer to this postulated ideal as compared to earlier approaches in TER.

Alternative epistemological perspectives, however, challenge fundamental assumptions of critical rationalism, in particular, the idea that an objective truth can be discovered using scientific methods. These alternative perspectives also have a long history, including approaches emphasizing the “site-dependency” (Mannheim, 1931/1995) and social constructedness of knowledge (e.g., Fleck, 1935/1980), those addressing the development, rise and fall of theories (Kuhn, 1962), and the so-called “Non-statement view” (Sneed, 1979). According to Kuhn, general principles such as, in the field of education, (a) the idea of the learning environment having causal impact on students’ information processing vs. (b) the idea that the classroom is a social sphere consisting of practices, can hardly be contested empirically, although they have inspired much sound empirical work (mostly quantitative in the first case, qualitative in the second case). These general principles belong to the core assumptions of separate paradigms which are basically incommensurable, since they frame, if not constitute, the field of classroom teaching and learning in different ways.

Separate paradigms not only include different basic assumptions about the social, about teaching and learning, but also differ with regard to their understandings of “theory” (Kuhn, 1962, p. 94). For example, Reckwitz (2002), one exponent of practice theories, understands social theories as vocabularies which offer “contingent systems of interpretation which enable us to make certain empirical statements (and exclude other forms of empirical statements)” and “a heuristic device, a sensitizing ‘framework’ for empirical research in the social sciences” which “opens up a certain way of seeing and analysing social phenomena” (p. 257). The core concepts and principles provide a framework for the development of theories of specific practices (Hirschauer, 2015, p. 172). Yet, a priori normative assumptions about what these theories should look like are often avoided within the practice theoretical paradigm (“normative abstinence”, see Sect. 4). Instead, practice theories provide a theoretical framework for analysing research practices themselves, i.e., processes of “doing theory”, “doing empirical studies”, and “doing publications” (e.g., Bourdieu, 2015).

Hence, answers to the questions of what constitutes a theory and what it should contain depend on the perspective.Footnote 20 The epistemological perspective of critical rationalism has been the key reference for TER and TBD. In this paper we argue in favour of recognizing diversity of perspectives (also with reference to epistemology) instead of opting for a single set of criteria for a “good theory”, because different perspectives always have different blind spots and can complement each other. In particular, since TBD integrates constructivist learning theories with TER to explain why certain types of classroom interaction are more effective than others for co-constructing knowledge in the classroom, it seems prudent to also draw on a constructivist understanding of the co-construction of knowledge within the social sciences. From our excursion into practice theories (in particular the reading of Bourdieu, 2015) we further take along for future research the idea of engaging more in critical reflection on research practices (including the micro-politics and struggles for positions) in the field of quantitative empirical educational research, and in critical reflection on the researchers’ role in the process of knowledge construction.

Considering the incommensurability of paradigms, we think that it is desirable that TBD, TER in general, and practice theories alike will grow and become more and more sophisticated, and, instead of converging into one grand theory of teaching, even diversify into separate (sub-)theories. New paradigms, such as neuroscience, may further start to compete with existing strands of social science and the humanities. Nevertheless, we argue (in opposition to Kuhn) that fruitful exchange between paradigms is possible, and we attempted to engage in just that in the present chapter, which has the aim of using practice theories for refining TBD in a process inspired by the idea of “alienation”.Footnote 21

  • Can such a theory accommodate differences across subject matters and student populations taught? If so, how? If not, why?

This question points to what is probably the most striking difference between TER/TBD and practice theories. Bell (2020, p. 57) claims that “teaching is definitionally situated in social-historical contexts”. Yet, educational effectiveness research (EER) traditionally assumes that constructs and measures apply across contexts, and that relationships between teaching and learning are universal. Without this assumption (mostly left implicit), researchers would not be able to refer to studies from all kinds of contexts (countries, language areas, social groups, school types, age and grade levels, with different learning trajectories and classroom experiences) when deriving and discussing their own research questions, or to merge all kinds of studies in meta-analyses. At the same time, using seemingly “identical” constructs and measures across contexts allows EER/TER to identify differences across subject matters and student populations taught. First, teaching variables have been compared, and it has been claimed that mean levels differ between groups of students, institutions, subjects or even education systems (e.g., more demanding mathematical tasks were observed in Japanese classrooms compared to German classrooms; Bell et al., 2020c; Stigler & Hiebert, 1999). Second, the size and direction of relationships between teaching and learning outcomes have been compared and claimed to differ between groups of students (e.g., classroom management having a stronger effect on student achievement for low-achieving students; Seiz et al., 2016), institutions (e.g., student-oriented teaching being correlated with achievement in comprehensive schools only; Bayer, 2020) or between different education systems (Doan et al., 2020). Thus, the assumption that educational processes are universal has been questioned from within the EER/TER paradigm.

Accommodating differences by explicitly comparing contexts or groups, however, has been challenged on three levels: (1) Adopting methods from cross-cultural psychology (e.g., van de Vijver & Tanzer, 1997), the equivalence of measures has been questioned. (2) Even when differences are measured in a valid, methodologically sound way, this does not mean they are understood on a theoretical level. (3) More fundamentally, any comparison requires a priori categorization and often uses binaries (e.g. male-female, low achievers vs. high achievers). Often, the complexity, situatedness, social constructedness and dynamic nature of such categories as well as their embeddedness in societal power structures are neglected (e.g., Phillips, 2010).

Practice theories, in contrast, refrain from any claims about “universal” relationships. A practice, understood as a nexus of doings, sayings, teleoaffective structures (affects, aims and purposes which are part of the practices) and artefacts, exists only when it is reiterated. Thus, relations between the doings and sayings included can be found across time and space. Yet, because the relations are not assumed to be causal, they exist only within the practice. They are not singular, but also never universal. They exist in their specific form only for those who have come to participate in them (Schäfer, 2016). Consequently, classroom ethnography (Breidenstein, 2012) attempts to reconstruct practices in a given social context. Understanding the role of the context (and the school subject) is part of understanding practices. General ideas (such as “practice”, “shared meaning”, and “pedagogical pointing”) are used across studies and cases. Yet, they are supplying language to talk about teaching, while full, theory-driven, empirically saturated understanding is achieved on the basis of individual cases or groups of cases. Thus, PT also “accommodates” differences across subject matters and student populations taught, but conceptualizes these as socially constructed (see also Rabenstein et al., 2013).

8 Final Note

In the introduction we argued that not only teaching, but also educational research itself, is situated in fields of tension. One such field of tension is between the intention to provide educational practice with clear and convertible recommendations and the wish to do justice to the whole complexity, contingency, uncertainty and ambiguity of social interactions. Multiple research paradigms address this tension in different ways. By themselves they are necessarily limited and “under-determined by empirical ‘facts’” (Reckwitz, 2002, p. 257). Yet, they all contribute substantially to our understanding of the social world. Mannheim (1931/1995) argued that a “true” picture can emerge from integrating different perspectives. Our aim was not to find such a synthesized truth in the middle. We argue more cautiously that dialogue between paradigms helps researchers reflect on their own paradigmatic perspectives and research practices as well as on underlying values, and that it can inspire new research ideas. Accordingly, our paper is the result of an open process of bringing perspectives together and reflecting on irreconcilabilities with the purpose of “doing theory”.