Abstract
In recent years, substantial investments have been made in reengineering systems of teacher evaluation. New-generation models of teacher evaluation typically adopt a standards-based view of teaching quality and include a value-added measure of growth in student learning. With more than a decade of experience and research, it is timely to assess the empirical evidence bearing on the efficacy of this school improvement strategy. This paper examines the new generation of teacher evaluation along three lines of analysis: evidence on the magnitude, consistency, and stability of teacher effects on student learning; evidence on the impact of teacher evaluation on growth in student learning; and literature from the sociology of organizations on how schools function. Although the trend towards focusing on teacher evaluation is increasingly evident internationally, most of the empirical research evaluated in this paper is from the USA. This critical evaluation of the empirical literature yields two key conclusions. First, we conclude that the policy logic supporting this reform remains considerably stronger than the empirical evidence. Second, we suggest that alternative improvement strategies may yield more positive results at a lower cost in staff time and district funds.
Cite this article
Hallinger, P., Heck, R.H. & Murphy, J. Teacher evaluation and school improvement: An analysis of the evidence. Educ Asse Eval Acc 26, 5–28 (2014). https://doi.org/10.1007/s11092-013-9179-5
DOI: https://doi.org/10.1007/s11092-013-9179-5