Volume 50, Issue 3, pp 507–519

Using the UTeach Observation Protocol (UTOP) to understand the quality of mathematics instruction

  • Candace Walkington
  • Michael Marder
Original Article


The UTeach Observation Protocol (UTOP) was designed to inform STEM teacher education. The instrument has been used in prior studies examining inter-rater reliability and relationships to teacher value-added scores. However, prior work has not shown examples of how rating with the UTOP works in practice nor has it discussed the instrument’s strengths and limitations. Here, we describe how the UTOP draws upon theories and practices heavily emphasized in teacher preparation—including deep student engagement, classroom management, STEM content fluency, lesson structuring, and innovative instructional models. We then present the ratings of three sample elementary mathematics lessons on the UTOP. We show how the UTOP reveals important aspects of teachers’ instruction, and discuss key strengths and weaknesses of the instrument. We find that the UTOP provides a broad view of instructional practice useful for informing systemic professional development, while also addressing content-specific teaching behaviors critical to STEM teaching. However, it may be cumbersome to consider so many teaching indicators simultaneously, and less emphasis is given to theory-driven indicators of the development of mathematical reasoning. This article provides a novel theoretical, empirical, and practical base of knowledge for using or making decisions about whether to use the UTOP for math classroom observations.


Classroom observation, UTOP, Teacher preparation, Teaching effectiveness, Mathematics



Copyright information

© FIZ Karlsruhe 2018

Authors and Affiliations

  1. Department of Teaching and Learning, Southern Methodist University, Dallas, USA
  2. Department of Physics, University of Texas at Austin, Austin, USA
