Abstract
The UTeach Observation Protocol (UTOP) was designed to inform STEM teacher education. The instrument has been used in prior studies examining inter-rater reliability and relationships to teacher value-added scores. However, prior work has neither shown examples of how rating with the UTOP works in practice nor discussed the instrument’s strengths and limitations. Here, we describe how the UTOP draws upon theories and practices heavily emphasized in teacher preparation, including deep student engagement, classroom management, STEM content fluency, lesson structuring, and innovative instructional models. We then present the ratings of three sample elementary mathematics lessons on the UTOP. We show how the UTOP reveals important aspects of teachers’ instruction, and discuss key strengths and weaknesses of the instrument. We find that the UTOP provides a broad view of instructional practice useful for informing systemic professional development, while also addressing content-specific teaching behaviors critical to STEM teaching. However, it may be cumbersome to consider so many teaching indicators simultaneously, and the instrument gives less emphasis to theory-driven indicators of the development of mathematical reasoning. This article provides a novel theoretical, empirical, and practical base of knowledge for using, or deciding whether to use, the UTOP for math classroom observations.

Cite this article
Walkington, C., Marder, M. Using the UTeach Observation Protocol (UTOP) to understand the quality of mathematics instruction. ZDM Mathematics Education 50, 507–519 (2018). https://doi.org/10.1007/s11858-018-0923-7
DOI: https://doi.org/10.1007/s11858-018-0923-7