Fresh evidence on the relationship between years of experience and teaching quality

Gore, Jennifer; Rosser, Brooke; Jaremus, Felicia; Miller, Andrew; Harris, Jess

doi:10.1007/s13384-023-00612-0

Open access
Published: 03 March 2023

Volume 51, pages 547–570, (2024)

The Australian Educational Researcher

Abstract

It is commonly assumed that experienced teachers are more proficient than beginners. However, evidence supporting this premise is complicated by diverging research traditions and mixed results. We explore the fundamental relationship between years of experience and teaching quality using a comprehensive pedagogical model. Our analysis of 990 lessons, taught by 512 primary teachers in New South Wales during 2014–15 and 2019–21, found no significant differences in pedagogy across the experience range (<1 to 24+ years). We canvass two possible explanations: that initial teacher education (ITE) performs better than is typically assumed; and/or that experience, including ongoing participation in many forms of professional development (PD), has minimal impact on pedagogical quality. The important lesson from this study, however, is that the continual positioning of beginning teachers and ITE as deficient is unwarranted and, instead, we should focus on providing teachers with access to high-impact PD throughout their careers.

Introduction

Politicians and media commentators consistently bemoan the quality of teachers in the face of declining or stagnating performance on international assessments (Churchward & Willis, 2019; Dinham, 2015). In this context, beginning teachers and initial teacher education (ITE) have been subjected to an unrelenting procession of reviews and reform efforts (Tatto et al., 2018). Over the past 40 years Australia has seen more than 100 inquiries into ITE (Louden, 2008), with the latest commissioned by the Education Minister in 2021 (Paul et al., 2021; Tudge, 2021). As a result, ITE in Australia has undergone numerous reforms including greater prescription of teacher education course content, new teacher accreditation schemes, new minimum literacy and numeracy standards, and new ‘classroom readiness’ assessments for graduate teachers (Barnes & Cross, 2018; Rowe & Skourdoumbis, 2019; Simpson et al., 2021; Teacher Education Ministerial Advisory Group [TEMAG], 2014). Similar reforms have occurred around the globe, with countries such as England, France, Germany, Norway, Austria, and the United States instituting regulatory and policy changes to improve the quality of new teachers (Furlong, 2013; Mayer, 2021; Page, 2015; Simpson et al., 2021; Tatto et al., 2018).

In this article, we ask: to what extent is the focus on ITE justified? While initiatives designed to improve the quality of graduate teachers have intensified, we are concerned about the absence of strong evidence documenting how, or indeed if, teaching quality varies by years of experience (Churchward & Willis, 2019; Graham et al., 2020; Mockler, 2018). The methodological challenges involved in measuring teaching quality mean that few robust large-scale studies have been conducted to provide such evidence (Hill et al., 2015). One of the few such studies conducted in Australia, involving classroom observations of 80 teachers, indicates that quality may not vary significantly with experience, finding no difference between teachers with 0–3 years’ experience and those with 5+ years’ experience (Graham et al., 2020).

It is often assumed, without robust evidence, that declining student outcomes stem from declining teacher quality and, further, that to improve student achievement, nations must, by necessity, raise the quality of new teachers (Churchward & Willis, 2019; Mockler, 2018; Tatto et al., 2018). Such assumptions imply significant problems with those enrolled in teaching degrees, with recent graduates, and/or with the ITE programs in which they participate. If new teachers (or their preparation) are to blame for stagnating student achievement, one might expect beginning teachers to deliver ‘poorer’ quality lessons than their more experienced colleagues. It is this concern about how quality of teaching changes with experience that our research interrogates.

We have provided fresh evidence on this question by analysing the quality of 990 lessons from a sample of 512 Australian teachers, ranging from those in their first year to those with more than 24 years’ experience. We used a comprehensive model of pedagogy called the Quality Teaching (QT) Model to address our research question: What is the relationship between teachers’ years of experience and the quality of their teaching? In what follows, we begin by reviewing three distinct research traditions that contribute insights on the relationship between teachers’ experience and teaching quality. Next, we describe our research methods and results and, finally, canvass different explanations for our key and somewhat surprising finding that beginning teachers deliver instruction that is of commensurate quality to that of their experienced colleagues.

Background to the study

While large-scale studies of classroom practice using standardised instruments for assessing quality have been a relatively recent addition to the literature (Graham et al., 2020; Hill et al., 2015), decades of research from diverging research traditions provide insights into the value and effects of teaching experience. We will discuss three categories. The first (which we refer to as Category 1), largely based in the United States (US), tests associations between teacher characteristics (including years of experience) and student achievement on standardised tests (e.g., Harris & Sass, 2011; Kini & Podolsky, 2016; Ladd & Sorensen, 2017; Papay & Kraft, 2015; Rockoff, 2004). Here, experience is defined as years spent teaching in classrooms post-graduation. As Graham et al. (2020) observe, these studies do not directly measure teaching practice and tend to use narrow measures of student outcomes (i.e., standardised test results in a few subjects).

Category 2 studies focus on differences in the cognition, behaviour, and performance of expert and novice teachers (e.g., Borko & Livingston, 1989; Gudmundsdottir & Shulman, 1987; Hattie & Yates, 2014; Leinhardt, 1989). Research in this category is US-centric but includes studies from European and Asian education systems. Often conducted in controlled ‘laboratory’ settings (Hattie & Yates, 2014; Tsui, 2005, 2009), Category 2 studies decontextualise teachers’ work by assessing how they perform on specific tasks and make comparisons that are not necessarily about teachers’ years of experience.

Category 3 studies (with which our own work is most closely aligned) measure differences in teaching quality using direct observations of classroom practice. These studies, largely based in the US, use a variety of pedagogical frameworks, such as the Classroom Assessment Scoring System (CLASS) (Pianta et al., 2008), Danielson’s (2007) Framework for Teaching (FfT), and subject-specific frameworks such as the Mathematical Quality of Instruction (MQI) instrument (Hill, 2005) and the Protocol for Language Arts Teaching Observation (PLATO) (Grossman et al., 2014). While such studies typically do not make years of experience their key focus, the researchers often include experience categories in their statistical models. Our study extends this third group by focussing specifically on the role of teacher experience and contributing much needed insight into teaching quality beyond US contexts. Key findings generated by these three research agendas are outlined below, in turn.

Category 1. Studies of student achievement as proxy for quality

Studies investigating the relationship between teacher experience and student achievement have generated mixed insights (Graham et al., 2020). When teacher characteristics such as years of experience were first put into models predicting student achievement scores, they were often shown to be weak (or non-significant) predictors (Hanushek & Rivkin, 2006, 2012; Nye et al., 2004; Rivkin et al., 2005). Kini and Podolsky (2016) argue that recent studies showing a stronger association between teacher experience and student achievement are able to do so because of increased availability of data to match students with individual teachers and more advanced research methods. The association has typically been found in the first 3 to 5 years of teaching, with a sizeable number of studies now reporting ‘rapid’ gains in effectiveness during teachers’ first few years on the job (Araujo et al., 2016; Harris & Sass, 2011; Henry et al., 2012; Kini & Podolsky, 2016; Ladd & Sorensen, 2017; Papay & Kraft, 2015; Rice, 2010, 2013; Rockoff, 2004). However, not all studies have found such effects (Hill et al., 2015).

The picture is consistently less clear after the first 3 to 5 years. While some studies show small but significant improvement in teachers’ effectiveness well into their careers (Harris & Sass, 2011; Kraft & Papay, 2014; Ladd & Sorensen, 2017; Papay & Kraft, 2015), others indicate the ‘value’ added to student achievement scores plateaus or even declines after 3 to 5 years (Hanushek & Rivkin, 2006, 2012; Henry et al., 2012; Rice, 2010, 2013; Rockoff, 2004). Importantly, these findings vary by schooling context. For example, Kraft and Papay (2014) demonstrated that after 5 and 10 years of experience, teachers in the most supportive schools outperform their counterparts in the least supportive schools by 20% and 38%, respectively. In addition, the type of experience matters. Huang and Moon (2009) found total years of experience was not a significant predictor of student achievement, but years of experience teaching a particular grade level was. Despite conflicting evidence, the prevailing view is “for most teachers, experience increases effectiveness” (Kini & Podolsky, 2016, p. 1).

The validity of these studies, however, has been challenged. First, the results on standardised tests themselves can be distorted by factors such as content type, student socioeconomic status, and gender (Leder & Forgasz, 2018), casting doubt that student test results are a reliable or valid measure of teacher effectiveness. Second, the value-added models (VAMs)^{Footnote 1} on which studies of teacher effectiveness tend to rely have been critiqued as incomplete, volatile, and inconsistent (Amrein-Beardsley & Close, 2019; Darling-Hammond et al., 2012; Hallinger et al., 2014; Reynolds et al., 2014). While VAMs have become increasingly sophisticated and now include ‘controls’ for a range of factors, Rockoff and Speroni (2010) argue results are still “biased if some teachers are persistently given students that are difficult to teach” (p. 261) while others have greater choice over which schools they teach in (Hanushek & Rivkin, 2006). Notably, a measure of teaching or pedagogy is rarely included in VAMs, so what experienced teachers actually ‘do’ to achieve higher outcomes, when such a relationship is found, remains a mystery (Hill et al., 2015; Ingvarson & Rowe, 2008).

Category 2. Studies of differences between expert and novice teachers

Research focussed on differences in cognition, behaviour, and functioning between expert and novice teachers overwhelmingly documents the superiority of expert teachers (Hattie & Yates, 2014; Tsui, 2009). Novice teachers have been found to struggle to effectively plan and deliver coherent lessons and teaching units and to select developmentally appropriate content and teaching strategies (e.g., Borko & Livingston, 1989; Gudmundsdottir & Shulman, 1987; Leinhardt, 1989; Westerman, 1991). During lessons, novice teachers often fail to activate students’ prior knowledge, to improvise when things go awry, and to notice and interpret classroom patterns (e.g., Berliner, 1988; Borko & Livingston, 1989; Hattie & Yates, 2014; Leinhardt, 1989; Westerman, 1991). The difficulties novice teachers face in reflecting on teaching and interpreting student behaviour (Kim & Klassen, 2018) can also lead to greater attention on student discipline than student learning and thinking (Huang & Li, 2012; McIntyre et al., 2017; Wolff et al., 2017), compared to expert teachers.

Despite this documented list of novice inadequacies, we find it risky to generalise findings from particular expertise studies to populations beyond the research context for at least three reasons. First, there is no consensus on how to define an expert teacher (Berliner, 2001; Tsui, 2009), substantially influencing results across the literature base. Second, while the term ‘novice’ pertains to those with little practical experience in a particular domain, an ‘expert’ is not simply someone who has accumulated more years of experience (Berliner, 2001; Hattie & Yates, 2014; Johnson, 2005). Researchers select ‘expert’ teachers based on a number of other characteristics, such as recommendations from school leaders, the attainment of state- and national-level teaching awards, and student achievement scores (Tsui, 2009). Third, expertise research is often cross-sectional and carried out in controlled settings away from teachers’ classrooms in response to controlled stimuli, such as classroom vignettes or video excerpts (Hattie & Yates, 2014; Tsui, 2005, 2009).

However, it is well accepted that the contexts of the school and classroom, as well as the resources, goals, and orientations of teachers, are important contributors to teacher expertise (Berliner, 2001; Schoenfeld, 2011), as are opportunities to engage in ‘deliberate’ goal-directed practice with feedback, support, and encouragement from peers and knowledgeable others (Berliner, 2001; Hattie & Yates, 2014). As such, it is difficult to make valid assessments of teacher expertise through the deployment of controlled stimuli alone.

In response to these concerns, a small but growing number of recent studies have reconceptualised teacher expertise as a process of development, situated in schools and classrooms (Tsui, 2005, 2009). While these ‘more naturalistic’ studies, which take the form of in-depth case studies and longitudinal analyses, offer far greater ecological validity than those conducted in ‘laboratory’ settings, the insights they generate still cannot be used to make generalisations about the quality of teaching delivered by broader populations of beginning and experienced teachers.

Category 3. Studies using observational frameworks to assess teaching quality

The use of observation frameworks to study pedagogy is an emerging field of research (Hill et al., 2015) in which the relationship between teacher experience and teaching quality has rarely been the focus (Graham et al., 2020). We identified 11 studies that used direct observational measures of teaching quality in primary/elementary, middle, or high school classrooms. When teacher experience has been addressed it has often been: (1) investigated as part of a host of other teacher background characteristics (e.g., Bryant et al., 1991; Guo et al., 2012; Hill et al., 2015; Mihaly & McCaffrey, 2015; National Institute of Child Health and Human Development Early Child Care Research Network [NICHD ECCRN], 2002, 2005; Stuhlman & Pianta, 2009); (2) represented by a few blunt categories such as greater or fewer than 5 years (e.g., Cortina et al., 2015; Graham et al., 2020; Hill et al., 2015; Mihaly & McCaffrey, 2015); or (3) largely overlooked with limited or no discussion of the results relating to teacher experience (e.g., Bryant et al., 1991; Gitomer et al., 2014; Mihaly & McCaffrey, 2015; Pianta et al., 2002).

Nevertheless, these limited investigations typically report few significant pedagogical differences between teachers of different experience levels across grades using a variety of observation tools including CLASS (e.g., Cortina et al., 2015; Gitomer et al., 2014; Graham et al., 2020), MQI (e.g., Hill et al., 2015), FfT, and PLATO (e.g., Mihaly & McCaffrey, 2015). Indeed, akin to some Category 1 studies, teacher characteristics including experience have been found to explain very little (if any) of the variability in teaching quality (see Gitomer et al., 2014; Hill et al., 2015; Mihaly & McCaffrey, 2015; NICHD ECCRN 2002, 2005). Arguably, this lack of difference is surprising given current governmental anxieties about and efforts to improve the quality of new teachers and ITE both in Australia and overseas.

When studies have found significant pedagogical differences by experience, the differences have been isolated to specific aspects of instruction and/or are difficult to interpret. For example, Guo et al. (2012) found a small but significant negative relationship between years of experience and the amount of time spent on ‘academic’ activities, while earlier research in first and third grade classrooms found the opposite trend (NICHD ECCRN, 2002, 2005). Graham et al. (2020) reported that ‘transitioning’ teachers with 4–5 years’ experience had worse scores for the Negative Climate and Instructional Learning Formats dimensions of CLASS, while Hill et al. (2015) found teachers with more than 2 years’ experience outperformed novices on the Classroom Organisation domain.

Similarly, using data from the Measures of Effective Teaching (MET) study, the largest known study of classroom practice to date (Bill & Melinda Gates Foundation, 2012), Mihaly and McCaffrey (2015) found no systematic differences in classroom observation scores by teachers’ years of experience across subjects (English and Math), grades (4–8), or observation frameworks (CLASS, FfT, and PLATO). However, puzzlingly, English language teachers with 3 years’ experience scored significantly higher on observations using CLASS and FfT than those with 4 or more years’ experience, but not when the subject- or domain-specific instrument PLATO was used. Conversely, there were no significant differences on CLASS and FfT scores across experience categories for Math teachers.

Despite such findings, it would be premature to infer that experience is irrelevant. These studies tend to classify all teachers with at least 5 years in the profession as ‘experienced,’ thereby obscuring possible differences across the career span. Furthermore, most studies examine teaching quality in US classrooms using pedagogical models developed for that context, such as CLASS (e.g., Cortina et al., 2015; Gitomer et al., 2014; Graham et al., 2020; Hill et al., 2015; Mihaly & McCaffrey, 2015) and its predecessors (e.g., Guo et al., 2012; Pianta et al., 2002; NICHD ECCRN, 2002, 2005; Stuhlman & Pianta, 2009). As a result, it is unclear whether findings would be consistent across teaching populations using alternative frameworks.

In sum, our review demonstrates that diverging research traditions, each with their own strengths and limitations, provide complicated answers to the question of whether concern about the quality of new teachers and ITE is warranted. Each tradition shines a different light on the question. Studies on teacher experience and student achievement (Category 1) show that student test scores improve in the first few years of a teacher’s career, with mixed findings after this point. However, they provide limited insight into how the actions of teachers might drive these trends. Studies of expert and novice teachers (Category 2) document the superiority of experts, but acknowledge that experience and expertise are not synonymous. Observational studies of pedagogy (Category 3), which have primarily been conducted in US classrooms, demonstrate few differences overall between beginning and experienced teachers yet tend to rely on blunt comparisons among a few experience categories.

In this paper, we favour the third in situ approach because it enables investigation of differences in pedagogy by years of experience, but we seek to address two gaps. First, we extend our vista to teaching across the career span, from beginning teachers to those with more than 24 years’ experience, thus conducting a more fine-grained analysis of how experience matters beyond the 2-, 3- or 5-year marks commonly used in Category 3 studies. Second, we contribute much needed evidence of teaching quality beyond the US using a comprehensive pedagogical model developed for use in Australian schools, known as the Quality Teaching (QT) Model. In previous work, the QT Model has been used to examine: the relationship between teaching quality and school socioeconomic status (Gore et al., 2022); improvement in teaching quality following participation in Quality Teaching Rounds (QTR) professional development (PD) (Gore et al., 2017); and, associated improvement in student achievement (Gore et al., 2021). By employing the QT Model and examining teaching in Australia across the career span, we seek to provide fresh insights into the relationship between teaching quality and experience.

Methods

To address the research question—What is the relationship between teachers’ years of experience and the quality of their teaching?—we drew on classroom observational data derived from two randomised controlled trials (RCTs) conducted in NSW government schools during 2014–15 and 2019–21. Both trials were designed to assess the efficacy of QTR—an approach to teacher PD that involves teachers working in professional learning communities to observe and analyse each other’s lessons using the QT Model.^{Footnote 2} Participating teachers reported their years of experience in surveys and had lessons observed by the research team before they were randomly allocated to treatment groups. Although the RCTs were designed for a different purpose, the baseline (i.e., pre-intervention) data have enabled us to explore associations between teaching quality and years of experience for a relatively large sample. The trials received university and education department ethics approvals before recruitment commenced. Below, we include only those details pertinent to our research question. Readers interested in details of the trials might like to access earlier publications of the study protocols and outcomes (see Gore et al., 2015, 2017, 2021; Miller et al., 2019).

The Quality Teaching Model

The QT Model is a comprehensive model of pedagogy derived from an extensive research synthesis of classroom factors that positively impact student learning (Ladwig & King, 2003). Applicable to any developmental stage or curriculum area, the QT Model has its roots in research undertaken on Authentic Pedagogy (Newmann, 1996) and Productive Pedagogy (Lingard et al., 2001). For almost two decades, the model has been endorsed by the NSW Department of Education as a model of teaching quality for government schools (NSW Department of Education and Training, 2006; Quality Teaching Academy, 2020), signalling its enduring resonance with teachers and school leaders.

The QT Model has three dimensions, each consisting of six elements (18 elements in total) that focus teachers’ attention on principles underpinning the quality of teaching as manifest in classroom practice: (1) Intellectual Quality, (2) Quality Learning Environment, and (3) Significance (see Table 1). Each element is accompanied by a 1-to-5 coding scale and associated descriptors that distinguish quality at a high level of specificity (see online Appendix for an elaboration of one of the elements). Together, the 18 elements comprise a holistic model of pedagogy addressing lesson content and the intellectual demands placed on students, the environment within which learning occurs, and the relevance of lesson content to students’ lives beyond the classroom.

Table 1 The Quality Teaching Model

While there is no consensus on what should be included in a pedagogical framework (Coe et al., 2014; Martinez et al., 2016), research using the QT Model has demonstrated several broad positive impacts on the profession (see Gore & Bowe, 2015; Gore et al., 2017, 2021; Gore & Rickards, 2021; Gore & Rosser, 2022), lending validity to its use as a tool for teachers and making its use in this study an important contribution to the literature. For example, the aforementioned RCTs demonstrated that the QT Model, when used as the core component of QTR PD, is associated with improved student achievement in mathematics, improved teaching quality, improved teacher morale, and improved teacher perceptions of appraisal and recognition (Gore et al., 2017, 2021).

Data sources

To address the relationship between experience and quality, we drew on pre-intervention data only, given that post-intervention data (after teachers participate in QTR) would confound the results. A total of 990 baseline observations of whole lessons taught by 512 teachers in 260 schools were conducted. The schools were representative of schools in Australia, with an average Index of Community Socio-Educational Advantage^{Footnote 3} (ICSEA) of 1005 (standard deviation = 83), consistent with the national mean of 1000 and standard deviation of 100. All observations used in this analysis were of primary school lessons, the majority (79%) of which were conducted in Grade 3 or 4 classrooms (age 8–10 years). The observed lessons were mostly in the key learning areas (KLAs) of English (52.8%) and Mathematics (28.4%), with a range of other KLAs such as Human Society and its Environment (HSIE) (7%), Personal Development, Health and Physical Education (PDHPE) (3.9%), Creative Arts (3.7%), and Science (3.5%) represented. All teachers participating in the research were employed on at least a 12-month contract, given that the RCTs sought to measure change over the course of a school year for teachers who had their own classes and were not in casual employment.

Years of experience

In an online questionnaire, teachers reported demographic information, including their years of teaching experience, using the following categories: less than 1 year, 1–3 years, 4–6 years, 7–9 years, 10–12 years, 13–15 years, 16–18 years, 19–21 years, 22–24 years, and more than 24 years. While much of the literature uses the starting category of 0–3 years (e.g., Graham et al., 2020), we were keen to contribute fresh insights about the first year of teaching and so used < 1 and 1–3 years’ experience for our main analyses. Given concerns about the quality of ITE and readiness of beginning teachers, it is useful to make this distinction. The sample included teachers across the entire range. While the majority of lesson observations were taught by teachers with between 1 and 15 years’ experience, at least 34 observations occurred in every experience category (Table 2).

Table 2 Number of observations by teachers’ years of experience

Sample

Where possible each teacher was observed twice during data collection, which was the case for 92% of the teachers. To scrutinise the sample for potential bias, we investigated if there were any systematic patterns or differences in QT scores for those with only one observation (Table 2). The proportion of single observations by experience category ranged from 0 to 16% and, when expressed as an effect size (Cohen’s d), the mean differences in QT scores between teachers with one or two observations ranged from −0.89 to 0.49. This suggests that there is no systematic variation across the experience categories for those with only one observation. Similarly, the distribution of teachers in each experience category by ICSEA indicates no clear pattern. Using cut points representing half of one standard deviation away from the national mean ICSEA value of 1000 (Table 2), we found nothing to suggest our results by years of experience would be biased due to over-representation from any specific experience category or ICSEA category.
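The ICSEA cut points described above can be illustrated with a minimal sketch. It assumes the national mean of 1000 and standard deviation of 100 reported earlier, which places the cut points at 950 and 1050. The band labels and the treatment of values that fall exactly on a cut point are assumptions made for illustration, not details reported in the study.

```python
NATIONAL_MEAN = 1000  # national mean ICSEA, as stated in the text
NATIONAL_SD = 100     # national standard deviation, as stated in the text


def icsea_band(icsea):
    """Place a school's ICSEA value into one of three illustrative bands.

    Cut points sit half a standard deviation either side of the national mean.
    The labels "low", "mid", and "high" are hypothetical, and values exactly
    on a cut point are assigned to the middle band (an assumption).
    """
    low_cut = NATIONAL_MEAN - NATIONAL_SD / 2   # 950
    high_cut = NATIONAL_MEAN + NATIONAL_SD / 2  # 1050
    if icsea < low_cut:
        return "low"
    if icsea > high_cut:
        return "high"
    return "mid"
```

Under these assumptions, the sample's average ICSEA of 1005 falls in the middle band.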

Quality of teaching measure

Dimension level scores and an overall QT score were obtained by researchers observing and coding whole lessons using the QT Model. Dimension level scores were calculated using the mean of the six elements for each of the Intellectual Quality, Quality Learning Environment, and Significance dimensions (range 1–5). The total QT score was calculated using the mean of the 18 elements (range 1–5).
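The scoring arithmetic can be sketched as follows. This is a minimal illustration, assuming each lesson's 18 element codes are stored in a flat list ordered by dimension (six per dimension); the storage order and the function name are hypothetical rather than taken from the study.

```python
from statistics import mean

# Assumed ordering: the first six codes belong to Intellectual Quality, the
# next six to Quality Learning Environment, and the last six to Significance.
DIMENSIONS = (
    "Intellectual Quality",
    "Quality Learning Environment",
    "Significance",
)


def qt_scores(element_codes):
    """Return (dimension means, overall QT score) for one coded lesson."""
    if len(element_codes) != 18:
        raise ValueError("expected 18 element codes")
    if any(not 1 <= code <= 5 for code in element_codes):
        raise ValueError("each code must lie on the 1-to-5 scale")
    dimension_means = {
        name: mean(element_codes[i * 6:(i + 1) * 6])
        for i, name in enumerate(DIMENSIONS)
    }
    # The overall score is the mean of all 18 elements.
    return dimension_means, mean(element_codes)
```

Because every dimension has the same number of elements, the overall QT score also equals the mean of the three dimension scores.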

In total, 64 raters were involved in data collection across the two RCTs. They all received two days’ intensive training and subsequently completed independent scoring against pre-rated (20-min) lesson extracts. To ensure reliable scoring, no rater was sent into the field for data collection unless they achieved above 90% exact scoring. To further investigate inter-rater reliability among the large pool of raters, 317 lessons were double-coded (approximately 32% of total observations) and the intraclass correlation coefficient (ICC(1), one-way random effects) of the QT score was calculated. The ICC for a single measure (single-rater score used for analysis) was 0.848 (95% CI 0.814 to 0.876), indicating good reliability at the lesson level.

The two observations of the same teacher at each time point (which account for 960 of the 990 observations) were investigated for consistency at the teacher level, using an intraclass correlation coefficient (ICC(3), two-way mixed effects). The ICC (average measures) for the two observations displayed moderate reliability at 0.603 (95% CI 0.524 to 0.669), indicating some variability between the two lessons at the teacher level. However, the raw change in mean QT score (overall) between repeated observations was −0.009 (95% CI −0.069 to 0.050), equating to a negligible difference of 0.33%. This indicates that, while there was some variability between the repeated measures, the magnitude of the variability renders it largely inconsequential.
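For readers unfamiliar with the single-measure, one-way random-effects ICC, the sketch below shows the standard textbook calculation from between-lesson and within-lesson mean squares. It is an illustration only: the study's values were produced with statistical software, and the function name and data layout here are hypothetical.

```python
def icc_one_way_single(ratings):
    """Single-measure ICC from a one-way random-effects ANOVA.

    `ratings` is a list of per-lesson lists, each holding the same number k
    of scores (k = 2 for double-coded lessons).
    """
    n = len(ratings)
    k = len(ratings[0]) if ratings else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 lessons, each with the same k >= 2 ratings")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    lesson_means = [sum(row) / k for row in ratings]
    # Mean square between lessons and mean square within lessons.
    ms_between = k * sum((m - grand_mean) ** 2 for m in lesson_means) / (n - 1)
    ms_within = sum(
        (score - m) ** 2
        for row, m in zip(ratings, lesson_means)
        for score in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

When two raters agree exactly on every lesson, the within-lesson mean square is zero and the coefficient equals 1; values near the reported 0.848 indicate that most score variance lies between lessons rather than between raters.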

Analysis

Data were analysed using IBM PASW Statistics 27 (SPSS Inc. Chicago, IL) software with alpha levels set at p < 0.05. Linear mixed models were fitted, treating years of experience as the explanatory variable (categorical) and teaching quality as the dependent variable (continuous). Teaching quality was calculated at the dimension level using the mean of the six elements in the dimension. The mean of all 18 elements was used as a total measure of teaching quality given that the elements combine to form a holistic model of pedagogy. To account for the hierarchical nature of the data (teachers within schools and multiple lessons per teacher), random intercepts for school, and teacher within school, were included in the model. The school ICSEA value was also included as a covariate. To ensure the correct p-value when comparing the different experience categories, pairwise comparisons (Sidak contrasts) were used to assess differences between categories in relation to the reference category of less than 1 year. Effect sizes were calculated using Cohen’s d = (reference group mean − comparison group mean) / pooled standard deviation (reference and comparison groups). Ninety-five per cent confidence intervals (95% CIs) of the effect size were computed using the compute.es function (Del Re, 2013) in R version 3.4.4 (R Core Team, 2022). This function computes the confidence intervals using the variance in d derived by the Hedges and Olkin (1985) formula. For comparison to previous research in the field, we also conducted a second analysis using a combined 0–3 years category as the reference category in pairwise comparisons.
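The effect size calculation can be sketched in a few lines. This is a hedged illustration, not the authors' code: it assumes the usual degrees-of-freedom-weighted pooled standard deviation, the large-sample variance of d commonly attributed to Hedges and Olkin (1985), and a normal critical value of 1.96 for the interval. The R function used in the study may differ in detail (for example, in the critical value), and the function name here is hypothetical.

```python
import math


def cohens_d_with_ci(mean_ref, sd_ref, n_ref, mean_cmp, sd_cmp, n_cmp, z=1.96):
    """Return (d, lower, upper) for reference minus comparison group means."""
    # Pooled standard deviation, weighted by each group's degrees of freedom
    # (an assumption; the text does not spell out the pooling formula).
    pooled_sd = math.sqrt(
        ((n_ref - 1) * sd_ref ** 2 + (n_cmp - 1) * sd_cmp ** 2)
        / (n_ref + n_cmp - 2)
    )
    d = (mean_ref - mean_cmp) / pooled_sd
    # Large-sample variance of d.
    var_d = (n_ref + n_cmp) / (n_ref * n_cmp) + d ** 2 / (2 * (n_ref + n_cmp))
    half_width = z * math.sqrt(var_d)  # normal approximation (assumed)
    return d, d - half_width, d + half_width
```

With made-up inputs (group means of 3.0 and 2.75, standard deviations of 0.5, and 50 lessons per group, none of which come from the study), the function returns d = 0.5 with an interval whose half-width is 1.96 times the square root of 0.04125.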

Results

Figure 1 illustrates the mean QT scores for each experience category, with lesson scores, group means, and 95% confidence intervals depicted. Table 3 provides the means, standard deviations, and 95% confidence intervals for the overall QT score, as well as for each QT dimension, by experience category. The overlapping confidence intervals across the experience categories, visible in Fig. 1 and outlined in Table 3, highlight the similarity in the average dimension and QT scores.

Table 3 Pairwise comparisons


The test of fixed effects for QT scores formally demonstrated no significant differences between experience categories at the p < 0.05 level (F(9, 486) = 0.569, p = 0.823). There were also no significant differences in the Intellectual Quality, Quality Learning Environment, and Significance dimensions by experience category. The pairwise comparisons also demonstrated no significant differences between the reference category (< 1 year) and all other categories (Table 3) at the p < 0.05 level. In short, years of teaching experience did not explain a significant proportion of the variance in the quality of teaching.

When analysed using 0–3 years as a combined reference category to mimic comparisons used in previous literature, we also found no significant difference (F(8, 489) = 0.535, p = 0.830) for the QT score. Likewise, there were no significant differences identified in the dimension level analysis.

Discussion

This study investigated how pedagogical quality, as measured by the QT Model, varies by years of teaching experience in classrooms in NSW, Australia. We sought to test this relationship with a robust and comprehensive model of pedagogy and with a wide range of teaching experience levels. Our analysis of nearly 1000 lessons found no significant differences between experience categories across the range, from teachers in their first year to those teaching for more than 24 years. No significant differences were found for overall QT score or among the dimensions of Intellectual Quality, Quality Learning Environment, and Significance by experience category. This somewhat counterintuitive finding makes an important empirical contribution, calling into question relentless critiques of the adequacy of beginning teachers.

Empirically, our study adds to the handful of international (Category 3) studies that use observation tools to examine teaching quality by years of experience. Overall, these studies also show small, non-significant differences between beginning and more experienced teachers. Significant differences found have been isolated to specific aspects of instruction and are inconsistent across samples.

Taken as a whole, evidence is accumulating that newly qualified teachers, on average, demonstrate a level of teaching quality commensurate with that of experienced teachers in a variety of contexts. In Australia, this pattern of evidence has now been found in two states, using two different measurement frameworks, the QT Model in NSW and CLASS in Queensland (for CLASS, see Graham et al., 2020), which suggests the result is not simply due to the sample or selection of observation tool. Our larger sample size, broader career span, and closer examination of the first year of teaching than most prior studies make important contributions to this body of literature.

Our finding that graduate teachers (on average) demonstrate pedagogical quality that is equivalent to that of their experienced colleagues is somewhat at odds with (Category 1) studies that report rapid gains in student achievement during the first few years of teaching (Kini & Podolsky, 2016; Ladd & Sorensen, 2017). It is also at odds with (Category 2) studies that document weaknesses in the cognition, behaviour, and functioning of novice teachers. Methodological differences in what is being measured and the extent to which context is considered might account for these different findings.

The fact that students are not randomly allocated to teachers, nor teachers randomly allocated to schools, must also be considered when explaining different findings between categories of studies. Early-career teachers are more often employed in hard-to-staff and disadvantaged schools (Luschei & Jeong, 2018; McKenzie et al., 2014; Rice, 2010, 2013). While we found no significant association between years of experience and school disadvantage (ICSEA) in our sample, this may be due to the small number of teachers who participated in each school. The complex relationships among school advantage, teaching experience, teaching quality, and student outcomes all warrant further investigation with larger samples of teachers from participating schools.

Nonetheless, we suggest two explanations that, if valid, have significant implications for education in Australia and beyond. First, ITE programs could be performing relatively well and better than is typically assumed in policy and the media (Mockler, 2022). That is, new graduates may enter the profession ‘classroom ready’ (TEMAG, 2014) and capable of demonstrating levels of pedagogical skill commensurate with their experienced colleagues. It is possible that improvements to ITE programs by higher education institutions, including lengthening the required days of professional experience and the use of standardised capstone assessments of teaching practice, have considerably advanced the teaching capacity of graduates compared to earlier cohorts. Such generational or ‘cohort effects’ are well-documented in the expertise (Category 2) literature whereby average performers today, across many skill domains, outperform the ‘experts’ of generations past (Hattie & Yates, 2014).

This is not to say that graduate teachers do not face difficulties. Indeed, the high rates of beginning teacher attrition (Dadvand & Dawborn-Gundlach, 2021) attest to the challenges of effective induction and adequate support. Nonetheless, our evidence suggests that those entering the teaching profession today might be better equipped to deliver higher quality instruction than their predecessors were at the outset of their careers. If this is the case, resources spent on the continuous procession of reviews and reforms in ITE might be more effective if directed elsewhere.

A second possible explanation is that on-the-job experience is insufficient to improve the quality of teaching over teachers’ careers. Indeed, the three categories of literature we have reviewed converge in finding that experience alone does not guarantee continual improvement. However, teachers gain more than classroom experience over the course of their careers, including participation in countless hours of PD, much of which is designed to enhance practice. In Australia, almost 80% of teachers have been teaching for more than 5 years (McKenzie et al., 2014) and 99% of teachers participate in various forms of in-service training each year (Organisation for Economic Co-operation and Development [OECD], 2019). Specifically, in NSW, full-time teachers must document a minimum of 100 hours of PD every 5 years to maintain their accreditation (New South Wales Education Standards Authority, 2021).

Our finding of no difference by years of experience raises important questions about the role of PD in strengthening the quality of teaching, especially given that many of the most rigorously evaluated PD interventions produce little change in student outcomes (Sims et al., 2021). The lack of difference we found in the quality of teaching across career stages could suggest that years of participation in PD have not translated into improvements in the quality of teaching (as measured by the QT Model), at least not above the level demonstrated by new members of the teaching profession. At the same time, importantly, some studies show that it is possible to enhance teachers’ classroom practice through PD (Garrett et al., 2019; Gore et al., 2017). To improve the quality of teaching across the career span, we need to ensure that all teachers participate in high-impact forms of PD with demonstrated positive effects on pedagogy.

While we have presented alternative explanations for our findings, both might have merit. Beginning teachers might be better prepared to enter the classroom than were previous cohorts, and all teachers might require better support to continue to improve their pedagogy throughout their careers. Of course, any explanation remains speculative without further investigation, including qualitative research to understand more deeply the relationship between years of experience and teaching quality. For now, we posit that: (1) new graduates can produce teaching of equivalent quality to their more experienced colleagues; and (2) years of experience do not ensure superior quality instruction. Therefore, we argue for policy that does not assume the inadequacy of beginning teachers or ITE and, instead, recognises the need for investment in high-impact forms of PD at all stages of teachers’ careers.

To be clear, we do not wish to imply that participation in PD is not worthwhile. It is plausible that PD has many positive effects that are not directly observable in teachers’ pedagogy as measured by the QT Model; for example, improving their content knowledge, morale, or self-efficacy—clearly worthy outcomes. Nor are we suggesting that years of experience are irrelevant. Experienced teachers make valuable contributions to school improvement through such activities as leadership, mentoring, and coaching.

More broadly, it is important to remember teaching is a contextual endeavour, with the structural inequalities that pervade society likely to continue affecting teaching quality, irrespective of the quality of PD provided (Gore et al., 2022). What students bring to school (including family background or conditions of poverty) remains predictive of academic performance even when statistical models control for teacher effects (Hanushek, 2016; Hattie, 2003, 2009; Konstantopoulos, 2006; Konstantopoulos & Borman, 2011; OECD, 2005). However, the impact of societal factors remains obscured when current debates focus so heavily on teachers. As Graham et al. (2020) argue, the “narrow focus on ITE and the graduates it produces may mean that the true nature and breadth of the problems impacting school education remain undetected and unresolved, while others are magnified beyond their actual or practical significance” (p. 2).

Limitations

Several limitations of this research should be noted. First, in terms of study design, our results pertain only to primary school teachers (mostly Years 3 and 4) in the NSW government school sector. Additional research is needed beyond these parameters to assess whether the findings hold more broadly. Second, only a select number of teachers from each school participated in the RCTs. Given research ethics requirements, teachers were asked to opt in and participant places at each school were limited. Hence, our results may reflect a more motivated sample of teachers unrepresentative of whole schools. It is also possible that only very confident beginning teachers opted in or were asked to be involved, although this could be true of teachers across the entire range of experience levels. Third, given that our analysis is cross-sectional, no information is provided to demonstrate changes to pedagogical quality over time. Nor is it possible to determine how cohort effects influence the results.

There are innumerable methodological challenges associated with ‘measuring’ something as complicated as the quality of teaching. While the QT Model offers a holistic model of teaching quality, we acknowledge that the work of teaching is broader than the pedagogical quality of lessons measured by the model. There are complex skills, knowledges, and dispositions that go beyond what can be directly observed in classrooms, highlighting the importance of considering a wide range of measures when evaluating teachers and teaching.

Finally, we acknowledge that ratings of lessons require a degree of subjective judgement. However, this limitation is mitigated by the careful elaboration of descriptors for each point on the 1-to-5 rating scale (see online Appendix), extensive training of observers, and strong ICCs. While inter-rater reliability for the QT score is considered ‘good’ (ICC = 0.848), the ‘moderate’ reliability at the teacher level (ICC = 0.603) indicates variability between lessons taught by the same teacher (i.e., 60% of teaching quality, as measured by the QT Model, is captured using two observations). Increasing the number of observations might reduce variability in QT score and increase the explanatory power of the statistical models presented. Without such data, appropriate caution should be used in generalising these results.

Conclusion

This study used a standardised pedagogical observation instrument, the Quality Teaching Model, to assess the quality of teaching produced by teachers across a broad range of experience levels. Despite continued government and media questioning of the quality of new teachers and ITE, we found no evidence to indicate that new teachers were inadequate, even with less on-the-job experience. Our findings suggest that ITE programs are producing graduates whose classroom practice is on par with those across the career span and that experience, including participation in formal and informal PD, does not necessarily produce higher quality pedagogy. In response, we urge policy makers to: (1) acknowledge the good work being done by beginning teachers and ITE programs; and (2) ensure that teachers have access to demonstrably high-impact PD over the entire course of their careers.

To be clear, we are not suggesting that beginning teachers do not face immense challenges upon entry to the profession. Nor do we wish to imply that experienced teachers are not valuable or that PD in general is not worthwhile. Rather, we argue that policy efforts to raise the quality of teaching must focus on the provision of PD with evidence of positive effects on pedagogy at all career stages. At the same time, we acknowledge that high-impact PD alone cannot compensate for the stratifying effects on student outcomes of increasing inequity, school resourcing disparities, and school socioeconomic segregation (Bonnor & Shepherd, 2017). Schools should be resourced properly and teachers supported well, especially in difficult contexts where delivering quality teaching is harder (Gore et al., 2022). Nonetheless, the fresh evidence we have provided—showing no relationship between quality and experience—raises important questions about assumptions held and provisions made to support ongoing improvement in teaching across all contexts and career stages.

Data availability

Not applicable.

Notes

Value-added models (VAMs) are statistical models that use a variety of measures to predict student performance on standardised tests (e.g., past test results and demographic information). A teacher’s ‘value added’ is the difference between the statistical model’s predictions and their students’ actual test performance (see Opper, 2019).
Note that our research adopts a post-positivist stance. We are of the view that it is possible to produce meaningful measurements of pedagogical quality using robust observation frameworks and rigorous research designs (e.g., RCT designs). At the same time, however, we recognise that such measurements are necessarily value-laden and partial.
ICSEA is a standardised measure of school-level advantage in Australia. It includes parental education and occupation, school location, and proportion of Indigenous students. The mean ICSEA score is 1000 and the standard deviation is 100.

References

Amrein-Beardsley, A., & Close, K. (2019). Teacher-level value-added models on trial: Empirical and pragmatic issues of concern across five court cases. Educational Policy, 26, 5–28. https://doi.org/10.1177/0895904819843593
Araujo, M. C., Carneiro, P., Cruz-Aguayo, Y., & Schady, N. (2016). Teacher quality and learning outcomes in Kindergarten. The Quarterly Journal of Economics, 131(3), 1415–1454. https://doi.org/10.2307/26372667
Barnes, M., & Cross, R. (2018). ‘Quality’ at a cost: The politics of teacher education policy in Australia. Critical Studies in Education, 62(4), 455–470. https://doi.org/10.1080/17508487.2018.1558410
Berliner, D. C. (1988). Implication of studies on expertise in pedagogy for teacher education and evaluation. In J. Pfleiderer (Ed.), New directions for teacher assessment. Proceedings of the 1988 ETS Invitational Congress (pp. 39–68). Educational Testing Service.
Berliner, D. C. (2001). Learning about learning from expert teachers. International Journal of Educational Research, 35, 463–482.
Bill & Melinda Gates Foundation. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Retrieved from https://usprogram.gatesfoundation.org/news-and-insights/usp-resource-center/resources/gathering-feedback-on-teaching-combining-high-quality-observations-with-student-surveys-and-achievement-gains--report
Bonnor, C., & Shepherd, B. (2017). Losing the game: State of our schools in 2017. Retrieved from https://cpd.org.au/wp-content/uploads/2017/06/FINAL-Losing-the-Game-June-21-with-dedication.pdf
Borko, H., & Livingston, C. (1989). Cognition and improvisation: Differences in mathematics instruction by expert and novice teachers. American Educational Research Journal, 26, 473–498. https://doi.org/10.3102/00028312026004473
Bryant, D. M., Clifford, R. M., & Peisner, E. S. (1991). Best practices for beginners: Developmental appropriateness in Kindergarten. American Educational Research Journal, 28(4), 783–803.
Churchward, P., & Willis, J. (2019). The pursuit of teacher quality: Identifying some of the multiple discourses of quality that impact the work of teacher educators. Asia-Pacific Journal of Teacher Education, 47(3), 251–264. https://doi.org/10.1080/1359866X.2018.1555792
Coe, R., Aloisi, C., Higgins, S., & Major, L. E. (2014). What makes great teaching? Review of the underpinning research. Retrieved from https://www.suttontrust.com/wp-content/uploads/2014/10/What-Makes-Great-Teaching-REPORT.pdf.
Cortina, K. S., Miller, K. F., McKenzie, R., & Epstein, A. (2015). Where low and high inference data converge: Validation of CLASS assessment of mathematics instruction using mobile eye tracking with expert and novice teachers. International Journal of Science and Mathematics Education, 13(2), 389–403. https://doi.org/10.1007/s10763-014-9610-5
Dadvand, B., & Dawborn-Gundlach, M. (2021). The challenge to retain second-career teachers. Retrieved from https://pursuit.unimelb.edu.au/articles/the-challenge-to-retain-second-career-teachers
Danielson, C. (2007). Enhancing professional practice: A framework for teaching. Association for Supervision and Curriculum Development.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8–15. https://doi.org/10.1177/003172171209300603
Del Re, A. C. (2013). Compute.es: Compute effect sizes [Computer Software]. Retrieved from https://cran.r-project.org/package=compute.es
Dinham, S. (2015). The worst of both worlds: How the US and UK are influencing education in Australia. Education Policy Analysis Archives, 23, 1–20. https://doi.org/10.14507/epaa.v23.1865
Furlong, J. (2013). Globalisation, neoliberalism, and the reform of teacher education in England. The Educational Forum, 77(1), 28–50. https://doi.org/10.1080/00131725.2013.739017
Garrett, R., Citkowicz, M., & Williams, R. (2019). How responsive is a teacher’s classroom practice to intervention? A meta-analysis of randomized field studies. Review of Research in Education, 43(1), 106–137. https://doi.org/10.3102/0091732X19830634
Gitomer, D., Bell, C., Qi, Y., McCaffrey, D., Hamre, B., & Pianta, R. C. (2014). The instructional challenge in improving teaching quality: Lessons from a classroom observation protocol. Teachers College Record, 116(6), 1–32.
Gore, J. M., & Bowe, J. M. (2015). Interrupting attrition? Re-shaping the transition from preservice to inservice teaching through Quality Teaching Rounds. International Journal of Educational Research, 73, 77–88. https://doi.org/10.1016/j.ijer.2015.05.006
Gore, J., Jaremus, F., & Miller, A. (2022). Do disadvantaged schools have poorer teachers? Rethinking assumptions about the relationship between teaching quality and school-level advantage. The Australian Educational Researcher, 49, 635–656. https://doi.org/10.1007/s13384-021-00460-w
Gore, J., Lloyd, A., Smith, M., Bowe, J., Ellis, H., & Lubans, D. (2017). Effects of professional development on the quality of teaching: Results from a randomised controlled trial of Quality Teaching Rounds. Teaching and Teacher Education, 68, 99–113. https://doi.org/10.1016/j.tate.2017.08.007
Gore, J. M., Miller, A., Fray, L., Harris, J., & Prieto, E. (2021). Improving student achievement through professional development: Results from a randomised controlled trial of Quality Teaching Rounds. Teaching and Teacher Education, 101, 103297. https://doi.org/10.1016/j.tate.2021.103297
Gore, J., & Rickards, B. (2021). Rejuvenating experienced teachers through Quality Teaching Rounds professional development. Journal of Educational Change, 22, 335–354. https://doi.org/10.1007/s10833-020-09386-z
Gore, J., & Rosser, B. (2022). Beyond content-focused professional development: Powerful professional learning through genuine learning communities across grades and subjects. Professional Development in Education, 48(2), 218–232. https://doi.org/10.1080/19415257.2020.1725904
Gore, J., Smith, M., Bowe, J., Ellis, H., Lloyd, A., & Lubans, D. (2015). Quality teaching rounds as a professional development intervention for enhancing the quality of teaching: Rationale and study protocol for a cluster randomised controlled trial. International Journal of Educational Research, 74, 82–95. https://doi.org/10.1016/j.ijer.2015.08.002
Grossman, P., Cohen, J., & Brown, L. (2014). Understanding instructional quality in English Language Arts: Variations in the relationship between PLATO and value-added by content and context. In K. Kerr, R. Pianta, & T. Kane (Eds.), Designing teacher evaluation systems: New guidance from the Measures of Effective Teaching Project (pp. 303–331). Jossey-Bass.
Gudmundsdottir, S., & Shulman, L. (1987). Pedagogical content knowledge in social studies. Scandinavian Journal of Educational Research, 31, 59–70. https://doi.org/10.1080/0031383870310201
Guo, Y., Connor, C. M., Yang, Y., Roehrig, A. D., & Morrison, F. J. (2012). The effects of teacher qualification, teacher self-efficacy, and classroom practices on fifth graders’ literacy outcomes. The Elementary School Journal, 113, 3–24. https://doi.org/10.1086/665816
Graham, L. J., White, S. L. J., Cologon, K., & Pianta, R. C. (2020). Do teachers’ years of experience make a difference in the quality of teaching? Teaching and Teacher Education, 96, 1–10. https://doi.org/10.1016/j.tate.2020.103190
Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis of the evidence. Educational Assessment, Evaluation and Accountability, 26(1), 5–28. https://doi.org/10.1007/s11092-013-9179-5
Hanushek, E. A. (2016). What matters for student achievement: Updating Coleman on the influence of families and schools. Education Next, 16(2), 18–26.
Hanushek, E. A., & Rivkin, S. G. (2006). Teacher quality. In E. Hanushek & F. Welch (Eds.), Handbook of the economics of education (Vol. 2, pp. 1051–1078). Elsevier.
Hanushek, E. A., & Rivkin, S. G. (2012). The distribution of teacher quality and implications for policy. Annual Review of Economics, 4, 131–157. https://doi.org/10.1146/annurev-economics-080511-111001
Harris, D. N., & Sass, T. R. (2011). Teacher training, teacher quality and student achievement. Journal of Public Economics, 95, 798–812. https://doi.org/10.1016/j.jpubeco.2010.11.009
Hattie, J. (2003). Teachers make a difference: What is the research evidence? [Paper presentation]. Australian Council for Educational Research 2003 Research Conference, “Building teacher quality: What does the research tell us?,” Melbourne, Australia. Retrieved from http://research.acer.edu.au/research_conference_2003/4/
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
Hattie, J., & Yates, G. C. R. (2014). Visible learning and the science of how we learn. Routledge.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Academic Press. Retrieved from https://idostatistics.com/hedges-olkin-1985-statistical-methods-for-metaanalysis/
Henry, G. T., Fortner, C. K., & Bastian, K. C. (2012). The effects of experience and attrition for novice high-school science and mathematics teachers. Science, 335(6072), 1118–1121. https://doi.org/10.1126/science.1215343
Hill, H. (2005). Content across communities: Validating measures of elementary mathematics instruction. Educational Policy, 19(3), 447–475. https://doi.org/10.1177/0895904805276142
Hill, H. C., Blazar, D., & Lynch, K. (2015). Resources for teaching: Examining personal and institutional predictors of high-quality instruction. AERA Open, 1(4), 1–23. https://doi.org/10.1177/2332858415617703
Huang, R., & Li, Y. (2012). What matters most: A comparison of expert and novice teachers’ noticing of mathematics classroom events. School Science and Mathematics, 112(7), 420–432. https://doi.org/10.1111/j.1949-8594.2012.00161.x
Huang, F. L., & Moon, T. R. (2009). Is experience the best teacher? A multilevel analysis of teacher characteristics and student achievement in low performing school. Educational Assessment, Evaluation and Accountability, 21, 209–234. https://doi.org/10.1007/s11092-009-9074-2
Ingvarson, L., & Rowe, K. (2008). Conceptualising and evaluating teacher quality: Substantive and methodological issues. Australian Journal of Education, 52(1), 5–35.
Johnson, K. (2005). The ‘general’ study of expertise. In K. Johnson (Ed.), Expertise in second language learning and teaching (pp. 11–33). Palgrave Macmillan.
Kim, L. E., & Klassen, R. M. (2018). Teachers’ cognitive processing of complex school-based scenarios: Differences across experience levels. Teaching and Teacher Education, 73, 215–226. https://doi.org/10.1016/j.tate.2018.04.006
Kini, T., & Podolsky, A. (2016). Does teaching experience increase teacher effectiveness? A review of the research. Learning Policy Institute.
Konstantopoulos, S. (2006). Trends of school effects on student achievement. Teachers College Record, 108(12), 2550–2581.
Konstantopoulos, S., & Borman, G. D. (2011). Family background and school effects on student achievement: A multilevel analysis of the Coleman data. Teachers College Record, 113(1), 97–132. https://doi.org/10.1177/016146811111300101
Kraft, M. A., & Papay, J. P. (2014). Can professional environments in schools promote teacher development? Explaining heterogeneity in returns to teaching experience. Educational Evaluation and Policy Analysis, 36(4), 476–500. https://doi.org/10.3102/0162373713519496
Ladd, H. F., & Sorensen, L. C. (2017). Returns to teacher experience: Student achievement and motivation in middle school. Education Finance and Policy, 12(2), 241–279. https://doi.org/10.1162/EDFP_a_00194
Ladwig, J. G., & King, M. B. (2003). Quality teaching in NSW public schools: An annotated bibliography. NSW Department of Education and Training Professional Support and Curriculum Directorate.
Leder, G. C., & Forgasz, H. J. (2018). Measuring who counts: Gender and mathematics assessment. ZDM Mathematics Education, 50(4), 687–697. https://doi.org/10.1007/s11858-018-0939-z
Leinhardt, G. (1989). Math lessons: A contrast of novice and expert competence. Journal for Research in Mathematics Education, 20, 52–75. https://doi.org/10.5951/jresematheduc.20.1.0052
Lingard, B., Ladwig, J., Mills, M., Bahr, M., Chant, D., Warry, M., Ailwood, J., Capeness, R., Christie, P., Gore, J., Hayes, D., & Luke, A. (2001). Queensland school reform longitudinal study: Final report. Education Queensland.
Louden, W. (2008). 101 damnations: The persistence of criticism and the absence of evidence about teacher education in Australia. Teachers and Teaching, 14(4), 357–368. https://doi.org/10.1080/13540600802037777
Luschei, T. F., & Jeong, D. W. (2018). Is teacher sorting a global phenomenon? Cross-national evidence on the nature and correlates of teacher quality opportunity gaps. Educational Researcher, 47(9), 556–576. https://doi.org/10.3102/0013189X18794401
Martinez, F., Taut, S., & Schaaf, K. (2016). Classroom observation for evaluating and improving teaching: An international perspective. Studies in Educational Evaluation, 49, 15–29. https://doi.org/10.1016/j.stueduc.2016.03.002
Mayer, D. (Ed.). (2021). Teacher education policy and research: Global perspectives. Springer. https://doi.org/10.1007/978-981-16-3775-9
McIntyre, N. A., Mainhard, M. T., & Klassen, R. M. (2017). Are you looking to teach? Cultural, temporal and dynamic insights into expert teacher gaze. Learning and Instruction, 49, 41–53. https://doi.org/10.1016/j.learninstruc.2016.12.005
McKenzie, P., Weldon, P., Rowley, G., Murphy, M., & McMillan, J. (2014). Staff in Australia’s schools 2013: Main report on the survey. Retrieved from https://research.acer.edu.au/cgi/viewcontent.cgi?article=1021&context=tll_misc
Mihaly, K., & McCaffrey, D. F. (2015). Grade-level variation in observational measures of teacher effectiveness. In T. J. Kane, K. A. Kerr, & R. C. Pianta (Eds.), Designing teacher evaluation systems: New guidance from the measures of effective teaching project (pp. 9–49). Wiley.
Miller, A., Gore, J., Wallington, C., Harris, J., Prieto-Rodriguez, E., & Smith, M. (2019). Improving student outcomes through professional development: Protocol for a cluster randomised controlled trial of Quality Teaching Rounds. International Journal of Educational Research, 98, 146–158. https://doi.org/10.1016/j.ijer.2019.09.002
Mockler, N. (2018). Early career teachers in Australia: A critical policy historiography. Journal of Education Policy, 33(2), 262–278. https://doi.org/10.1080/02680939.2017.1332785
Mockler, N. (2022). Constructing teacher identities: How the print media define and represent teachers and their work. Bloomsbury Publishing.
National Institute of Child Health and Human Development Early Child Care Research Network. (2002). The relation of global first-grade classroom environment to structural classroom features and teacher and student behaviors. The Elementary School Journal, 102(5), 367–387. https://doi.org/10.1086/499709
National Institute of Child Health and Human Development Early Child Care Research Network. (2005). A day in third grade: A large-scale study of classroom quality and teacher and student behavior. The Elementary School Journal, 105(3), 305–323. https://doi.org/10.1086/428746
Newmann, F. M. (1996). Authentic achievement: Restructuring schools for intellectual quality. Jossey Bass.
New South Wales Education Standards Authority. (2021). Maintenance of teacher accreditation policy. Retrieved from https://educationstandards.nsw.edu.au/wps/portal/nesa/teacher-accreditation/resources/policies-procedures/maintenance-of-teacher-accreditation-policy
New South Wales Department of Education and Training. (2006). Quality teaching in NSW public schools: A classroom practice guide (2nd ed.). Retrieved from https://app.education.nsw.gov.au/quality-teaching-rounds/Assets/Classroom_Practice_Guide_ogogVUqQeB.pdf
Nye, B., Konstantopoulos, S., & Hedges, L. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.
Opper, I. M. (2019). Value-added modeling 101: Using student test scores to help measure teaching effectiveness. RAND Corporation.
Organisation for Economic Co-operation and Development. (2005). Teachers matter: Attracting, developing and retaining effective teachers. Retrieved from https://www.oecd.org/education/school/attractingdevelopingandretainingeffectiveteachers-finalreportteachersmatter.htm
Organisation for Economic Co-operation and Development. (2019). Country note - Australia: Results from TALIS 2018. Retrieved from https://www.oecd.org/education/talis/TALIS2018_CN_AUS.pdf
Page, T. M. (2015). Common pressures, same results? Recent reforms in professional standards and competences in teacher education for secondary teachers in England, France and Germany. Journal of Education for Teaching, 41(2), 180–202. https://doi.org/10.1080/02607476.2015.1011900
Papay, J. P., & Kraft, M. A. (2015). Productivity returns to experience in the teacher labor market: Methodological challenges and new evidence on long-term career improvement. Journal of Public Economics, 130, 105–119. https://doi.org/10.1016/j.jpubeco.2015.02.008
Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom assessment scoring system™: Manual K-3. Brookes Publishing.
Pianta, R. C., La Paro, K. M., Payne, C., Cox, M. J., & Bradley, R. (2002). The relation of kindergarten classroom environment to teacher, family, and school characteristics and child outcomes. The Elementary School Journal, 102(3), 225–238. https://doi.org/10.1086/499701
Paul, L., Louden, W., Elliot, M., & Scott, D. (2021). Next steps: Report of the Quality Initial Teacher Education Review. Report prepared for the Australian Government. Retrieved from https://www.dese.gov.au/quality-initial-teacher-education-review/resources/next-steps-report-quality-initial-teacher-education-review
Quality Teaching Academy. (2020). Quality teaching: Classroom practice guide (3rd ed.). NSW Department of Education.
R Core Team. (2022). R: A language and environment for statistical computing (Version 3.4.4) [Computer Software]. R Foundation for Statistical Computing. Retrieved from https://www.r-project.org/
Reynolds, D., Sammons, P., De Fraine, B., Van Damme, J., Townsend, T., Teddlie, C., & Stringfield, S. (2014). Educational effectiveness research (EER): A state-of-the-art review. School Effectiveness and School Improvement, 25(2), 197–230. https://doi.org/10.1080/09243453.2014.885450
Rice, J. K. (2010). The impact of teacher experience: Examining the evidence and policy implications. National Center for Analysis of Longitudinal Data in Education Research. Retrieved from https://eric.ed.gov/?id=ED511988
Rice, J. K. (2013). Learning from experience? Evidence on the impact and distribution of teacher experience and implications for teacher policy. Education Finance and Policy, 8(3), 332–348. https://doi.org/10.1162/EDFP_a_00099
Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417–458. https://doi.org/10.1111/j.1468-0262.2005.00584.x
Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. The American Economic Review, 94(2), 247–252. https://doi.org/10.1257/0002828041302244
Rockoff, J. E., & Speroni, C. (2010). Subjective and objective evaluations of teacher effectiveness. American Economic Review, 100, 261–266. https://doi.org/10.1257/aer.100.2.261
Rowe, E. E., & Skourdoumbis, A. (2019). Calling for ‘urgent national action to improve the quality of initial teacher education’: The reification of evidence and accountability in reform agendas. Journal of Education Policy, 34(1), 44–60. https://doi.org/10.1080/02680939.2017.1410577
Schoenfeld, A. H. (2011). Reflections on teacher expertise. In Y. Li & G. Kaiser (Eds.), Expertise in mathematics instruction (pp. 327–341). Springer.
Simpson, A., Cotton, W., & Gore, J. (2021). Teacher education/ors in Australia: Still shaping the profession despite policy intervention. In D. Mayer (Ed.), Teacher education policy and research: Global perspectives (pp. 11–26). Springer.
Sims, S., Fletcher-Wood, H., O’Mara-Eves, A., Cottingham, S., Stansfield, C., Van Herwegen, J., & Anders, J. (2021). What are the characteristics of teacher professional development that increase pupil achievement? A systematic review and meta-analysis. Education Endowment Foundation. Retrieved from https://educationendowmentfoundation.org.uk/education-evidence/evidence-reviews/teacherprofessional-development-characteristics
Stuhlman, M. W., & Pianta, R. C. (2009). Profiles of educational quality in first grade. The Elementary School Journal, 109(4), 323–342. https://doi.org/10.1086/593936
Tatto, M. T., Burn, K., Menter, I., Mutton, T., & Thompson, I. (2018). Learning to teach in England and the United States: The evolution of policy and practice. Routledge.
Teacher Education Ministerial Advisory Group. (2014). Action now, classroom ready teachers [Report]. Retrieved from https://www.dese.gov.au/teaching-and-school-leadership/resources/action-now-classroom-ready-teachers-report-0
Tsui, A. B. M. (2005). Expertise in teaching: Perspectives and issues. In K. Johnson (Ed.), Expertise in second language learning and teaching (pp. 167–189). Palgrave Macmillan.
Tsui, A. B. M. (2009). Teaching expertise: Approaches, perspectives and characterizations. In A. Burns & J. C. Richards (Eds.), The Cambridge guide to second language teacher education (pp. 190–197). Cambridge University Press.
Tudge, A. (2021). Initial teacher education review launched [Media release]. Retrieved from https://ministers.dese.gov.au/tudge/initial-teacher-education-review-launched
Westerman, D. A. (1991). Expert and novice teacher decision making. Journal of Teacher Education, 42(4), 292–305. https://doi.org/10.1177/002248719104200407
Wolff, C. E., Jarodzka, H., & Boshuizen, H. P. A. (2017). Differences between expert and novice teachers’ interpretations of problematic classroom management events. Teaching and Teacher Education, 66, 295–308. https://doi.org/10.1016/j.tate.2017.04.015
Zeichner, K. (2006). Different conceptions of teacher expertise and teacher education in the USA. Education Research and Perspectives, 33(2), 60–79.

Acknowledgements

We wish to thank all teachers and principals who took part in this research, as well as all project staff. We are especially grateful to the project manager of both RCTs, Wendy Taggart.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions. This research was funded by the Paul Ramsay Foundation, the NSW Department of Education and the Australian Research Council (Grant No. DP180100285).

Author information

Authors and Affiliations

Teachers and Teaching Research Centre, School of Education, The University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
Jennifer Gore, Brooke Rosser, Felicia Jaremus, Andrew Miller & Jess Harris

Contributions

The authors confirm contributions to the paper as follows: Study conception and design: JG, AM, JH; Data analysis: AM, FJ, JG; Draft manuscript preparation: BR, FJ, JG. All authors reviewed the analysis, refined the manuscript and approved the final version.

Corresponding author

Correspondence to Brooke Rosser.

Ethics declarations

Competing interests

The authors report no potential conflict of interest.

Ethical approval

The studies were approved by the University of Newcastle’s Human Research Ethics Committee (Approval Nos. H-2014-0123, H-2018-0340) and the NSW Department of Education State Education Research Applications Process (SERAP) (Approval Nos. 2014103, 2018458).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 46 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gore, J., Rosser, B., Jaremus, F. et al. Fresh evidence on the relationship between years of experience and teaching quality. Aust. Educ. Res. 51, 547–570 (2024). https://doi.org/10.1007/s13384-023-00612-0

Received: 25 July 2022
Accepted: 01 February 2023
Published: 03 March 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s13384-023-00612-0
