Evaluating Teacher Performance and Teaching Effectiveness: Conceptual and Methodological Considerations

  • Chapter
Teacher Evaluation Around the World

Abstract

Educational theory inextricably links teachers to student learning, positioning them as the key factor mediating between educational policies and students' experiences in the classroom, and research consistently shows that a range of teacher and classroom variables exert an important influence on student outcomes. This chapter highlights the key conceptual and methodological issues involved in the evaluation of teaching and teachers, with a particular focus on the distinction between the concepts of performance and effectiveness. It considers the implications of assumptions and choices about why the evaluation is conducted, what is evaluated, and how it is evaluated, and presents a range of methods for collecting data on performance and effectiveness. Additionally, we analyze issues related to the reliability and validity of the resulting inferences about teacher performance or effectiveness, and their implications for policy and practice. Finally, different models of teacher evaluation are presented to illustrate the distinctions and commonalities between evaluating performance and evaluating effectiveness in practice.


Notes

  1. Subject knowledge; commitment to student learning; monitoring and managing student learning; reflecting on and learning about their own practice; and membership in learning communities.

  2. Learner development; learning differences; learning environments; content knowledge; application of content; assessment; planning for instruction; instructional strategies; professional learning and ethical practice; and leadership and collaboration.

  3. In 2020, guidelines for remote teaching were issued for the FFT; these focus on the components thought to be most relevant for online learning and remote instruction (The Danielson Group, 2020).

  4. The area of emotional support encompasses the dimensions of classroom climate, teacher sensitivity, and regard for student perspectives, while classroom organization includes behavior management, productivity, and instructional learning format. Finally, instructional support is operationalized as concept development, quality of feedback, and language modeling.

  5. In these models, teachers who have been identified for their excellence in teaching and mentoring are chosen as coaches to support new teachers as well as experienced colleagues who may require help. Coaches are also responsible for the teachers’ formal personnel evaluations. Typically, coaches do not work in a single school, but are matched with teachers from different schools according to grade level or subject area.

  6. AYP (adequate yearly progress) was defined as the specific amount of progress in standardized test scores that a school, district, or state was expected to make in a year.

  7. Schools can adopt commercially available tests or develop their own, provided these are “rigorous, aligned to content standards, and appropriate for the teacher’s classes and students” (District of Columbia Public Schools, 2011, p. 2; Gitomer & Joyce, 2015).

References

  • AERA, APA, NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.

  • Amrein-Beardsley, A. (2008). Methodological concerns about the education value-added assessment system. Educational Researcher, 37(2), 65–75.

  • Anderson, J. (2013, March 30). Curious grade for teachers: Nearly all pass. New York Times.

  • Apple, M. W. (2007). Ideological success, educational failure? On the politics of No Child Left Behind. Journal of Teacher Education, 58(2), 108–116.

  • Australian Institute for Teaching and School Leadership. (2018). Australian Professional Standards for Teachers. AITSL.

  • Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Shepard, L. A., et al. (2010). Problems with the use of student test scores to evaluate teachers. EPI Briefing Paper (278).

  • Ball, D. L., & Rowan, B. (2004). Introduction: Measuring instruction. The Elementary School Journal, 105(1), 3–10.

  • Ball, D. L., Thames, M. H., & Phelps, G. (2008). Content knowledge for teaching: What makes it special? Journal of Teacher Education, 59(5), 389–407.

  • Bell, C. A., Dobbelaer, M. J., Klette, K., & Visscher, A. (2019). Qualities of classroom observation systems. School Effectiveness and School Improvement, 30(1), 3–29.

  • Bell, C. A., Klieme, E., & Praetorius, A.-K. (2020). Conceptualising teaching quality into six domains for the Study. In OECD, global teaching insights technical report (pp. 1–24). OECD Publishing.

  • Betebenner, D. W. (2009). Norm- and criterion-referenced student growth. Educational Measurement: Issues and Practice, 28(4), 42–51.

  • Betebenner, D. W. (2011). A technical overview of the student growth percentile methodology: Student growth percentiles and percentile growth projections/trajectories. The National Center for the Improvement of Educational Assessment.

  • Bill & Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Bill & Melinda Gates Foundation.

  • Bill & Melinda Gates Foundation. (2012a). Gathering feedback for teaching. Research Paper. Bill & Melinda Gates Foundation.

  • Bill & Melinda Gates Foundation. (2012b). Asking students about teaching. Policy and practice brief.

  • Bransford, J., Darling-Hammond, L., & LePage, P. (2005). Introduction. In L. Darling-Hammond & J. Bransford (Eds.), Preparing teachers for a changing world: What teachers should learn and be able to do (pp. 1–39). Jossey-Bass.

  • Brennan, R. L. (2001). Some problems, pitfalls, and paradoxes in educational measurement. Educational Measurement: Issues and Practice, 20(4), 6–18.

  • Brookhart, S. M. (2009). The many meanings of multiple measures. Educational Leadership, 67(3), 6–12.

  • Brophy, J., & Good, T. L. (1986). Teacher behavior and student achievement. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328–375). Macmillan.

  • California Commission on Teacher Credentialing. (2009). California standards for the teaching profession (CSTP).

  • CASEL. (2020). CASEL’s SEL framework: What are the core competence areas and where are they promoted? CASEL.

  • Close, K., Amrein-Beardsley, A., & Collins, C. (2020). Putting teacher evaluation systems on the map: An overview of states’ teacher evaluation systems post–Every Student Succeeds Act. Education Policy Analysis Archives, 28(58), 1–26.

  • Cohen, D. K. (1995). Rewarding teachers for student performance. In S. Fuhrman, & J. O’Day (Eds.), Rewards and reforms: Creating educational incentives that work. Jossey-Bass.

  • Cole, M. S., Bedeian, A. G., Hirschfeld, R. R., & Vogel, B. (2011). Dispersion-composition models in multilevel research: A data-analytic framework. Organizational Research Methods, 14(4), 718–734.

  • Connecticut State Department of Education. (2010). Common core of teaching: Foundational skills. CSDE.

  • Corcoran, S. P. (2010). Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures of teacher effectiveness in policy and practice. Annenberg Institute for School Reform.

  • Council of Chief State School Officers. (2013). InTASC model core teaching standards and learning progressions for teachers 1.0. CCSSO.

  • Danielson, C. (2013). The framework for teaching evaluation instrument, 2013 edition. The Danielson Group.

  • Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Education Policy Analysis Archives, 8(1), 1–44.

  • Darling-Hammond, L. (2006). Constructing 21st-century teacher education. Journal of Teacher Education, 57(3), 300–314.

  • Darling-Hammond, L. (2008). Reshaping teaching policy, preparation, and practice: Influences of the national board for professional teaching standards. In R. Stake, S. Kushner, L. Ingvarson, & J. Hattie (Eds.), Assessing teachers for professional certification: The first decade of the national board for professional teaching standards (Advances in Program Evaluation) (Vol. 11, pp. 25–53). Emerald Group Publishing Limited.

  • Darling-Hammond, L. (2015). Can value added add value to teacher evaluation? Educational Researcher, 44(2), 132–137.

  • Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation. Phi Delta Kappan, 93(6), 8–15.

  • De Corte, W., Lievens, F., & Sackett, P. R. (2007). Combining predictors to achieve optimal tradeoffs between selection quality and adverse impact. Journal of Applied Psychology, 92, 1380–1393.

  • De Pascale, C. (2012). Managing multiple measures. Principal, 91(5), 6–10.

  • Department for Education, England. (2013). Teachers’ standards: Guidance for school leaders, school staff and governing bodies. DFE.

  • District of Columbia Public Schools. (2011). Teacher-assessed student achievement data (TAS) guidance. DCPS.

  • Doss, C. J. (2019). Student growth percentiles 101: Using relative ranks in student test scores to help measure teaching effectiveness. RAND Corporation.

  • Duncan, A. (2012, August 22). Change is hard. Retrieved from US Department of Education: https://www.ed.gov/news/speeches/change-hard

  • Dynarski, M. (2016). Teacher observations have been a waste of time and money. Brookings Institution.

  • Ehlert, M., Koedel, C., Parsons, E., & Podgursky, M. J. (2014). The sensitivity of value-added estimates to specification adjustments: Evidence from school- and teacher-level models in Missouri. Statistics and Public Policy, 1(1), 19–27.

  • Elmore, R. F. (1996). Getting to scale with good educational practice. Harvard Educational Review, 66(1), 1–26.

  • Every Student Succeeds Act, Title I Section 1111(2)(B)(III)(vi) (2015).

  • Ferguson, R. F. (2012). Can student surveys measure teaching quality? Phi Delta Kappan, 94(3), 24–28.

  • Florida Department of Education. (2018). 2017–18 District educator evaluation ratings. Retrieved from Archived Statewide District Evaluation Results: http://www.fldoe.org/teaching/performance-evaluation/archive.stml

  • Gitomer, D. H., & Joyce, J. (2015). A review of the DC IMPACT teacher evaluation system. National Research Council.

  • Gitomer, D. H., & Zisk, R. C. (2015). Knowing what teachers know. Review of Research in Education, 39, 1–53.

  • Gitomer, D. H., Martinez, J. F., Battey, D., & Hyland, N. E. (2019). Assessing the assessment: Evidence of reliability and validity in the edTPA. American Educational Research Journal, 58(1), 3–31.

  • Glazerman, S., Goldhaber, D., Loeb, S., Raudenbush, S., Staiger, D. O., & Whitehurst, G. J. (2011). Passing muster: Evaluating evaluation systems. Brown Center on Education Policy at Brookings.

  • Glazerman, S., Loeb, S., Goldhaber, D., Staiger, D., Raudenbush, S., & Whitehurst, G. (2010). Evaluating teachers: The important role of value-added. Brookings Institution.

  • Goe, L. (2007). The link between teacher quality and student outcomes: A research synthesis. National Comprehensive Center for Teacher Quality.

  • Goe, L., & Croft, A. (2009). Methods of evaluating teacher effectiveness. National Comprehensive Center for Teacher Quality.

  • Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. National Comprehensive Center for Teacher Quality.

  • Goldhaber, D., & Anthony, E. (2007). Can teacher quality be effectively assessed? National board certification as a signal of effective teaching. The Review of Economics and Statistics, 89(1), 134–150.

  • Goldhaber, D., Walch, J., & Gabele, B. (2014). Does the model matter? Exploring the relationship between different student achievement-based teacher assessments. Statistics and Public Policy, 1(1), 28–39.

  • Goldstein, J., & Noguera, P. A. (2006). A thoughtful approach to teacher evaluation. Educational Leadership, 63(6), 31–37.

  • Good, T. L. (2014). What do we know about how teachers influence student performance on standardized tests: And why do we know so little about other student outcomes? Teachers College Record, 116, 1–41.

  • Goodman, S. F., & Turner, L. J. (2013). The design of teacher incentive pay and educational outcomes: Evidence from the New York City bonus program. Journal of Labor Economics, 31(2), 409–420.

  • Grossman, P., Loeb, S., Cohen, J., & Wyckoff, J. (2013). Measure for measure: The relationship between measures of instructional practice in middle school English language arts and teachers’ value-added scores. American Journal of Education, 119, 445–470.

  • Guarino, C. M., Reckase, M. D., & Wooldridge, J. M. (2012). Can value-added measures of teacher performance be trusted? Education Policy Center at Michigan State University.

  • Guarino, C. M., Reckase, M. D., Stacy, B., & Wooldridge, J. M. (2015). A comparison of student growth percentile and value-added models of teacher performance. Statistics and Public Policy, 2(1), 1–11.

  • Guerriero, S. (2018). Teachers’ pedagogical knowledge and the teaching profession: Background report and project objectives. OECD Publishing.

  • Hallinger, P., Heck, R. H., & Murphy, J. (2014). Teacher evaluation and school improvement: An analysis of the evidence. Educational Assessment, Evaluation and Accountability, 26(1), 5–28.

  • Hamilton, L. (2005). Lessons from performance measurement in education. In R. Klitgaard & P. C. Light (Eds.), High-performance government (pp. 381–405). RAND Corporation.

  • Hamre, B. K., & Pianta, R. C. (2007). Learning opportunities in preschool and early elementary classrooms. In R. C. Pianta, M. J. Cox, & K. L. Snow (Eds.), School readiness & the transition to kindergarten in the era of accountability (pp. 49–84). Paul H. Brookes Publishing Co.

  • Hanushek, E. A., & Rivkin, S. G. (2010). Using value-added measures of teacher quality. CALDER - Urban Institute.

  • Hanushek, E. A., Kain, J. F., & Rivkin, S. G. (1999). Do higher salaries buy better teachers? NBER Working Paper No. 7082.

  • Harris, D. N., & Sass, T. R. (2009). What makes for a good teacher and who can tell? CALDER working paper.

  • Harris, D. N., Ingle, W. K., & Rutledge, S. A. (2014). How teacher evaluation methods matter for accountability: A comparative analysis of teacher effectiveness ratings by principals and teacher value-added measures. American Educational Research Journal, 51(1), 73–112.

  • Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.

  • Hill, H. C., Blunk, M., Charalambous, C., Lewis, J., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition and Instruction, 26, 430–511.

  • Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64.

  • Jackson, C. K. (2016). What do test scores miss? The importance of teacher effects on non-test score outcomes. NBER.

  • Johnson, S. M., & Fiarman, S. E. (2012). The potential of peer review. Educational Leadership, 70(3), 20–25.

  • Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). American Council on Education.

  • Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation. NBER Working Paper 14607.

  • Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching. Bill & Melinda Gates Foundation. Retrieved from http://k12education.gatesfoundation.org/download/?Num=2680&filename=MET_Gathering_Feedback_Research_Paper1.pdf

  • Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011, Summer). Evaluating teacher effectiveness: Can classroom observations identify practices that raise achievement? Education Next (pp. 55–60).

  • Kane, T., McCaffrey, D., Miller, T., & Staiger, D. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. Research Paper. Bill & Melinda Gates Foundation.

  • Kennedy, M. M. (2008). Sorting out teacher quality. Phi Delta Kappan, 90(1), 59–63.

  • Kloser, M. (2014). Identifying a core set of science teaching practices: A Delphi expert panel approach. Journal of Research in Science Teaching, 51(9), 1185–1217.

  • Kloser, M., Edelman, A., Floyd, C., Martinez, J. F., Stecher, B., Srinivasan, J., & Lavin, E. (2021). Interrogating practice or show and tell? Using a digital portfolio to anchor a professional learning community of science teachers. Journal of Science Teacher Education, 32(2), 210–241.

  • Kuhfeld, M. (2017). When students grade their teachers: A validity analysis of the tripod student survey. Educational Assessment, 22(4), 253–274.

  • Kurtz, M. D. (2018). Value-added and student growth percentile models: What drives differences in estimated classroom effects? Statistics and Public Policy, 5(1), 1–8.

  • Lachlan-Haché, L., Cushing, E., & Bivona, L. (2012a). Student learning objectives as measures of educator effectiveness: The basics. American Institutes for Research.

  • Lachlan-Haché, L., Cushing, E., & Bivona, L. (2012b). Student learning objectives: Benefits, challenges, and solutions. American Institutes for Research.

  • LAUSD. (2021a, April 3). History of EDST. Retrieved from https://achieve.lausd.net/Page/11782#spn-content

  • LAUSD. (2021b). Teaching and learning framework. LAUSD.

  • Linn, R. L. (2000). Assessments and accountability. Educational Researcher, 29(2), 4–16.

  • Lockwood, J. R., McCaffrey, D. F., Hamilton, L. S., Stecher, B., Le, V.-N., & Martinez, J. F. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement, 44(1), 47–67.

  • Los Angeles Unified School District. (2019). 2018–2019 EDS final evaluation report for teachers and non-classroom teachers: Administrator handbook. LAUSD.

  • Maine Department of Education. (2012). Common core teaching standards. MDE.

  • Martínez Rizo, F. (2015). La evaluación del desempeño docente. Una propuesta para la educación básica en México. In G. Guevara Niebla, M. T. Melendez Irigoyen, F. E. Ramon Castaño, H. Sanchez Perez, & F. Tirado Segura (Eds.), La evaluación docente en México (pp. 64–95). INEE-Fondo de Cultura Económica.

  • Martinez, J. F. (2012). Consequences of omitting the classroom in multilevel models of schooling: An illustration using opportunity to learn and reading achievement. School Effectiveness and School Improvement, 23(3), 305–326.

  • Martinez, J. F., & Fernandez, M. P. (2019). Evaluación docente con indicadores múltiples: Consideraciones conceptuales y metodológicas en torno a la validez. In J. Manzi, M. R. Garcia, & S. Taut (Eds.), Validez de Evaluaciones Educacionales en Chile y Latinoamérica (pp. 531–562). Ediciones UC.

  • Martinez, J. F., Borko, H., & Stecher, B. (2012). Measuring instructional practices in middle school science using classroom artifacts. Journal of Research in Science Teaching, 49, 38–67.

  • Martinez, J. F., Schweig, J., & Goldschmidt, P. (2016a). Approaches for combining multiple measures of teacher performance: Reliability, validity, and implications for evaluation policy. Educational Evaluation and Policy Analysis, 38(4), 738–756.

  • Martinez, J. F., Taut, S., & Schaaf, K. (2016b). Classroom observation for evaluating and improving teaching: An international perspective. Studies in Educational Evaluation, 49, 15–29.

  • Marzano, R. J., & Toth, M. D. (2013). Teacher evaluation that makes a difference: A new model for teacher growth and student achievement. ASCD.

  • Matsumura, L. C., Garnier, H. E., Slater, S. C., & Boston, M. D. (2008). Toward measuring instructional interactions “at-scale.” Educational Assessment, 13, 267–300.

  • Medley, D. M., & Coker, H. (1987). The accuracy of principals’ judgments of teacher performance. The Journal of Educational Research, 80(4), 242–247.

  • Meyer, R. H. (1996). Value-added indicators of school performance. In E. A. Hanushek & D. W. Jorgenson (Eds.), Improving America’s schools: The role of incentives (pp. 197–223). The National Academies Press.

  • Meyer, R., Pier, L., Mader, J., Christian, M., Rice, A., Loeb, S., Hough, H., et al. (2019). Can we measure classroom supports for social-emotional learning? Applying value-added models to student surveys in the CORE districts. PACE.

  • Mihaly, K., McCaffrey, D., Staiger, D., & Lockwood, J. R. (2013). A composite estimator of effective teaching (MET Project). The RAND Corporation.

  • Millman, J. (1981). Student achievement as a measure of teacher competence. In Handbook of teacher evaluation (pp. 146–166). Sage.

  • Ministry of Education, Chile. (2008). Marco para la Buena Enseñanza. MINEDUC.

  • Muijs, D. (2006). Measuring teacher effectiveness: Some methodological reflections. Educational Research and Evaluation, 12(1), 53–74.

  • Mullens, J. E. (1995). Classroom instructional processes: A review of existing measurement approaches and their applicability for the teacher followup survey. U.S. Department of Education.

  • Mullis, I. V., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 international results in mathematics and science. TIMSS & PIRLS International Study Center.

  • National Board for Professional Teaching Standards. (2016). What teachers should know and be able to do (2nd ed.). NBPTS.

  • National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. U.S. Department of Education.

  • National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. NCTM.

  • National Research Council. (2010). Preparing teachers: Building evidence for sound policy. National Academy of Sciences.

  • NCTQ. (2015). State teacher policy yearbook: National summary. National Council on Teacher Quality (NCTQ).

  • NCTQ. (2017). Running in place: How new teacher evaluations fail to live up to promises. NCTQ.

  • NCTQ. (2019). State of the states 2019: Teacher & principal evaluation policy. National Council on Teacher Quality (NCTQ).

  • New York City Department of Education. (2019). Advance guide for educators 2019–2020. NYCDE.

  • OECD. (2013). Teachers for the 21st century: Using evaluation to improve teaching. OECD Publishing.

  • OECD. (2019). TALIS 2018 results: Teachers and school leaders as lifelong learners (Vol. 1). OECD Publishing.

  • OECD. (2020). Global teaching insights: A video study of teaching. OECD Publishing.

  • Paige, M. (2020). Moving forward while looking back: How can VAM lawsuits guide teacher evaluation policy in the age of ESSA? Education Policy Analysis Archives, 28(64), 1–18.

  • Papay, J. (2012). Refocusing the debate: Assessing the purposes and tools of teacher evaluation. Harvard Educational Review, 82(1), 123–141.

  • Pecheone, R. L., Shear, B., Whittaker, A., & Darling-Hammond, L. (2013). 2013 edTPA field test: Summary report. SCALE.

  • Peterson, K. D. (1995). Teacher evaluation: A comprehensive guide to new directions and practices. Corwin.

  • Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes: Standardized observation can leverage capacity. Educational Researcher, 38(2), 109–119.

  • Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2007). Classroom assessment scoring system. Paul H. Brookes.

  • Popham, W. J. (1971). Performance tests of teaching proficiency: Rationale, development, and validation. American Educational Research Journal, 8(1), 105–117.

  • Popham, W. J. (2007). Instructional insensitivity of tests: Accountability’s dire drawback. Phi Delta Kappan, 146–155.

  • Porter, A., Youngs, P., & Odden, A. (2001). Advances in teacher assessments and their uses. In V. Richardson (Ed.), Handbook of research on teaching (4th ed., pp. 259–297). AERA.

  • Reynolds, A. (1992). Getting to the core of the apple: A theoretical view of the knowledge base of teaching. Journal of Personnel Evaluation in Education, 6, 41–55.

  • Rivkin, S. G., Hanushek, E. A., & Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73(2), 417–458.

  • Rothstein, J. (2016). Can value-added models identify teachers’ impacts? IRLE—UC Berkeley.

  • Rowan, B., & Correnti, R. (2009). Studying reading instruction with teacher logs: Lessons from the study of instructional improvement. Educational Researcher, 38(2), 120–131.

  • Rubin, D. B., Stuart, E. A., & Zanutto, E. L. (2004). A potential outcomes view of value-added assessment in education. Journal of Educational and Behavioral Statistics, 29(1), 103–116.

  • S.B. 736, Student Success Act. (2010). St. FL.

  • S.B. 736, Student Success Act, Section 1012.343(3)(a)1 (2010).

  • Sass, T. R. (2008). The stability of value‐added measures of teacher quality and implications for teacher compensation policy. CALDER—Urban Institute.

  • Sato, M. (2014). What is the underlying conception of teaching of the edTPA? Journal of Teacher Education, 65(5), 421–434.

  • Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics, 102(6), 245–253.

  • Schweig, J. D. (2016). Moving beyond means: Revealing features of the learning environment by investigating the agreement of student ratings. Learning Environments Research, 19(3), 441–462.

  • Schweig, J., Baker, G., Hamilton, L. S., & Stecher, B. M. (2018). Building a repository of assessments of interpersonal, intrapersonal, and higher-order cognitive competencies. RAND Corporation.

  • Shulman, L. (1998). Teacher portfolios: A theoretical activity. In N. Lyons (Ed.), With portfolio in hand (pp. 23–37). Teachers College Press.

  • Shulman, L. S. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57(1), 1–22.

  • Stecher, B. M., Wood, A. C., Gilbert, M., Borko, H., Kuffner, K. L., Arnold, S. C., & Dorman, E. H. (2005). Using classroom artifacts to measure instructional practices in middle school mathematics: A two-state field test (CSE Report 662). CRESST.

  • Stecher, B., & Kirby, S. N. (2004). Organizational improvement and accountability: Lessons for education from other sectors. RAND Corporation.

  • Steele, J., Hamilton, L. S., & Stecher, B. M. (2010). Incorporating student performance measures into teacher evaluation systems. The RAND Corporation.

  • Stodolsky, S. S. (1990). Classroom observation. In J. Millman & L. Darling-Hammond (Eds.), The new handbook of teacher evaluation: Assessing elementary and secondary school teachers (pp. 175–190). Corwin Press.

  • Taut, S., & Sun, Y. (2014). The development and implementation of a national, standards-based, multi-method teacher performance assessment system in Chile. Education Policy Analysis Archives, 22(71).

  • The Danielson Group. (2020). The framework for remote teaching. The Danielson Group.

  • Tucker, P. D., & Stronge, J. H. (2005). Linking teacher evaluation and student learning. Association for Supervision and Curriculum Development.

  • U.S. Department of Education. (2001). No Child Left Behind Act (Executive Summary). U.S. Department of Education.

  • Walkington, C., & Marder, M. (2018). Using the UTeach observation protocol (UTOP) to understand the quality of mathematics instruction. ZDM Mathematics Education, 50, 507–519.

  • Walsh, E., & Isenberg, E. (2013). How does a value-added model compare to the Colorado growth model? Mathematica Policy Research.

  • Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. The New Teacher Project.

  • West, M. R. (2016). Should non-cognitive skills be included in school accountability systems? Preliminary evidence from California’s CORE districts. Evidence Speaks Reports, 1(13), 1–7.

  • Windschitl, M., Thompson, J., & Braaten, M. (2018). Ambitious science teaching. Harvard Education Press.

  • Wise, A. E., Darling-Hammond, L., McLaughlin, M. W., & Bernstein, H. T. (1985). Teacher evaluation: A study of effective practices. The Elementary School Journal, 86(1), 60–121.

  • Wragg, E. C. (1999). An introduction to classroom observation. Routledge.

  • Yuan, K., Le, V., McCaffrey, D. F., Marsh, J. A., Hamilton, L. S., Stecher, B. M., & Springer, M. G. (2013). Incentive pay programs do not affect teacher motivation or reported practices: Results from three randomized studies. Educational Evaluation and Policy Analysis, 35(1), 3–22.


Author information

Corresponding author

Correspondence to María Paz Fernández.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Fernández, M.P., Martínez, J.F. (2022). Evaluating Teacher Performance and Teaching Effectiveness: Conceptual and Methodological Considerations. In: Manzi, J., Sun, Y., García, M.R. (eds) Teacher Evaluation Around the World. Teacher Education, Learning Innovation and Accountability. Springer, Cham. https://doi.org/10.1007/978-3-031-13639-9_3

  • DOI: https://doi.org/10.1007/978-3-031-13639-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13638-2

  • Online ISBN: 978-3-031-13639-9

  • eBook Packages: Education, Education (R0)
