Abstract
In this article, we analyze theoretical and methodological challenges in measuring instructional quality in mathematics classrooms by examining standardized observational instruments. We first describe the results of a systematic literature review that identifies the subject-specific aspects measured in recent lesson studies in mathematics education. The main result is that there is little or no consistency in how subject-specific aspects are conceptualized and named. We therefore structure these aspects along two perspectives: a mathematical perspective on the mathematics educational quality of instruction and a pedagogical perspective. Furthermore, with regard to the use of these observational instruments in the field, we examine methodological challenges in measuring instructional quality in mathematics classrooms, e.g., the optimal number of raters and lessons to be observed. The results are twofold: on the one hand, recent studies provide useful answers to these questions. On the other hand, these answers appear to be specific to the given data. Therefore, the problem seems to be unsolved so far.
Notes
Sometimes the term instructional research is used only if classroom observation methods are employed; as Helmke, for example, puts it: “The silver bullet of the description and assessment of instruction is without doubt observation” (own translation, 2012, p. 288).
References
American Educational Research Association/American Psychological Association. (1999). Standards for educational and psychological testing. Washington: American Educational Research Association.
Atweh, B., Clarkson, P., & Nebres, B. (2003). Mathematics education in international and global contexts. In A. J. Bishop, M. A. Clements, C. Keitel, J. Kilpatrick, & F. K. S. Leung (Eds.), Second international handbook of mathematics education (pp. 185–229). Dordrecht: Springer Netherlands.
Baumert, J., Kunter, M., Blum, W., Brunner, M., Voss, T., Jordan, A., & Tsai, Y.-M. (2010). Teachers’ mathematical knowledge, cognitive activation in the classroom, and student progress. American Educational Research Journal, 47(1), 133–180.
Beaton, A. E., Mullis, I. V. S., Martin, M. O., Gonzales, E. J., Kelly, D. L., & Smith, T. A. (1996). Mathematics achievement in the middle school years: IEA’s Third International Mathematics and Science Study. Chestnut Hill: Boston College.
Blömeke, S., Gustafsson, J.-E., & Shavelson, R. J. (2015). Beyond Dichotomies. Zeitschrift für Psychologie, 223(1), 3–13.
Blum, W., Drücke-Noe, C., Hartung, R., & Köller, O. (2006). Bildungsstandards Mathematik: Konkret. Sekundarstufe 1: Aufgabenbeispiele, Unterrichtsanregungen, Fortbildungsideen. Berlin: Cornelsen Scriptor.
Brennan, R. L. (2001). Generalizability theory. New York: Springer.
Brennan, R. L. (2011). Generalizability theory and classical test theory. Applied Measurement in Education, 24(1), 1–21. doi:10.1080/08957347.2011.532417.
Brophy, J. (2000). Teaching. Brussels: International Academy of Education.
Brophy, J. (2006). Observational research on generic aspects of classroom teaching. In P. A. Alexander & P. H. Winne (Eds.), Handbook of educational psychology (2nd ed., pp. 755–780). Mahwah: Erlbaum.
Buchholtz, N., Kaiser, G., & Blömeke, S. (2014). Die Erhebung mathematikdidaktischen Wissens—Konzeptualisierung einer komplexen Domäne. Journal für Mathematik-Didaktik, 35(1), 101–128.
Casabianca, J. M., McCaffrey, D. F., Gitomer, D. H., Bell, C. A., Hamre, B. K., & Pianta, R. C. (2013). Effect of observation mode on measures of secondary mathematics teaching. Educational and Psychological Measurement, 73(5), 757–783.
Charalambous, C. Y., & Hill, H. C. (2012). Teacher knowledge, curriculum materials, and quality of instruction: Unpacking a complex relationship. Journal of Curriculum Studies, 44(4), 443–466.
Clare, L., Valdés, R., Pascal, J., & Steinberg, J. (2001). Teachers’ assignments as indicators of instructional quality in elementary schools (CSE Technical Report No. 545). Los Angeles: National Center for Research on Evaluation.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability. New York: Wiley.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Perspectives in social psychology. New York: Plenum.
Drollinger-Vetter, B. (2011). Verstehenselemente und strukturelle Klarheit: Fachdidaktische Qualität der Anleitung von mathematischen Verstehensprozessen im Unterricht. Münster: Waxmann.
Drollinger-Vetter, B., & Lipowsky, F. (2006). Fachdidaktische Qualität der Theoriephasen. In E. Klieme, C. Pauli, & K. Reusser (Eds.), Dokumentation der Erhebungs- und Auswertungsinstrumente zur schweizerisch-deutschen Videostudie “Unterrichtsqualität, Lernverhalten und mathematisches Verständnis”. Teil 3: Hugener, Isabelle; Pauli, Christine & Reusser, Kurt: Videoanalysen (pp. 189–205). Frankfurt am Main: GFPF.
Fend, H. (1981). Theorie der Schule (2., durchges. Aufl). U- & -S-Pädagogik. München [u.a.]: Urban & Schwarzenberg.
Gates Foundation (2012). Gathering feedback for teaching: Combining high quality observations with student surveys and achievement gains. Research paper, http://www.metproject.org/downloads/MET_Gathering_Feedback_Research_Paper.pdf. Accessed 22 Jan 2016.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. London: Routledge.
Helmke, A. (2012). Unterrichtsqualität und Lehrerprofessionalität: Diagnose, Evaluation und Verbesserung des Unterrichts. Seelze: Klett-Kallmeyer.
Hiebert, J., Gallimore, R., Garnier, H., & Stigler, J. (2003). Teaching mathematics in seven countries. Results from the TIMSS 1999 video study. Washington: National Center for Education Statistics.
Hiebert, J., & Grouws, D. A. (2007). The effects of classroom mathematics teaching on students’ learning. In F. K. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 371–404). Charlotte: Information Age.
Hill, H. C., Blunk, M. L., Charalambous, C. Y., Lewis, J. M., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition and Instruction, 26(4), 430–511.
Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64. doi:10.3102/0013189X12437203.
Hill, H. C., Kapitula, L., & Umland, K. (2010). A validity argument approach to evaluating teacher value-added scores. American Educational Research Journal, 48(3), 794–831. doi:10.3102/0002831210387916.
Hill, H. C., Rowan, B., & Ball, D. L. (2005). Effects of teachers’ mathematical knowledge for teaching on student achievement. American Educational Research Journal, 42(2), 371–406.
Horizon Research, Inc. (2000). Inside the classroom observation and analytic protocol. Chapel Hill: Horizon Research, Inc.
Howard, G. S., Maxwell, S. E., Weiner, R. L., Boynton, K. S., & Rooney, W. M. (1980). Is a behavioral measure the best estimate of behavioral parameters? Perhaps not. Applied Psychological Measurement, 4, 293–311.
Jacobs, J., Garnier, H., Gallimore, R., Hollingsworth, H., Givvin, K. B., Rust, K., Kawanaka, T., Smith, M., Wearne, D., Manaster, A., Etterbeek, W., Hiebert, J., & Stigler, J. (2003). TIMSS 1999 video study technical report: Volume 1: Mathematics study, NCES (2003-012), U.S. Department of Education. Washington, DC: National Center for Education Statistics.
Kane, M. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 18–64). Westport: Praeger.
Kersting, N. B., Givvin, K. B., Thompson, B. J., Santagata, R., & Stigler, J. W. (2012). Measuring usable knowledge: Teachers’ analyses of mathematics classroom videos predict teaching quality and student learning. American Educational Research Journal, 49(3), 568–589. doi:10.3102/0002831212437853.
Klieme, E., Pauli, C., & Reusser, K. (2009). The Pythagoras study. In T. Janik & T. Seidel (Eds.), The power of video studies in investigating teaching and learning in the classroom (pp. 137–160). Münster: Waxmann.
Klieme, E., & Rakoczy, K. (2008). Empirische Unterrichtsforschung und Fachdidaktik. Outcome-orientierte Messung und Prozessqualität des Unterrichts. Zeitschrift für Pädagogik, 54, 222–237.
Kounin, J. S. (1970). Discipline and group management in classrooms. New York: Holt, Rinehart and Winston.
Kunter, M., & Baumert, J. (2006). Who is the expert? Construct and criteria validity of student and teacher ratings of instruction. Learning Environments Research, 9(3), 231–251. doi:10.1007/s10984-006-9015-7.
Kunter, M., Baumert, J., & Köller, O. (2007). Effective classroom management and the development of subject-related interest. Learning and Instruction, 17(5), 494–509. doi:10.1016/j.learninstruc.2007.09.002.
Kunter, M., Klusmann, U., Baumert, J., Richter, D., Voss, T., & Hachfeld, A. (2013). Professional competence of teachers: Effects on instructional quality and student development. Journal of Educational Psychology, 105(3), 805–820. doi:10.1037/a0032583.
Learning Mathematics for Teaching Project. (2011). Measuring the mathematical quality of instruction. Journal of Mathematics Teacher Education, 14, 25–47.
Lipowsky, F., Rakoczy, K., Pauli, C., Drollinger-Vetter, B., Klieme, E., & Reusser, K. (2009). Quality of geometry instruction and its short-term impact on students’ understanding of the Pythagorean Theorem. Learning and Instruction, 19(6), 527–537. doi:10.1016/j.learninstruc.2008.11.001.
Lotz, M., Lipowsky, F., & Faust, G. (2013). Dokumentation der Erhebungsinstrumente des Projekts “Persönlichkeits- und Lernentwicklung von Grundschülern” (PERLE). 3. Technischer Bericht zu den PERLE-Videostudien. Materialien zur Bildungsforschung: Vol. 23,3. Frankfurt am Main: Gesellschaft zur Förderung Pädagogischer Forschung [u.a.].
Lüdtke, O., Robitzsch, A., Trautwein, U., & Kunter, M. (2009). Assessing the impact of learning environments: How to use student ratings of classroom or school characteristics in multilevel modeling. Contemporary Educational Psychology, 34, 120–131. doi:10.1016/j.cedpsych.2008.12.001.
Marder, M., & Walkington, C. (2014). Classroom observation and value-added models give complementary information about quality of mathematics teaching. In T. Kane, K. Kerr, & R. Pianta (Eds.), Designing teacher evaluation systems: New guidance from the Measuring Effective Teaching project (pp. 234–277). New York: Wiley.
Matsumura, L. C., Garnier, H. E., Pascal, J., & Valdés, R. (2002). Measuring instructional quality in accountability systems: Classroom assignments and student achievement. Educational Assessment, 8, 207–229.
Matsumura, L. C., Garnier, H., Slater, S. C., & Boston, M. D. (2008). Toward measuring instructional interactions “at-scale”. Educational Assessment, 13, 267–300.
Oser, F., Dick, A., & Patry, J.-L. (Eds.). (1992). Effective and responsible teaching: The new synthesis. San Francisco: Jossey Bass.
Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes: Standardized observation can leverage capacity. Educational Researcher, 38(2), 109–119. doi:10.3102/0013189X09332374.
Praetorius, A.-K., Lenske, G., & Helmke, A. (2012). Observer ratings of instructional quality: Do they fulfill what they promise? Learning and Instruction, 22, 387–400.
Praetorius, A.-K., Pauli, C., Reusser, K., Rakoczy, K., & Klieme, E. (2014). One lesson is all you need? Stability of instructional quality across lessons. Learning and Instruction, 31, 2–12.
Reyes, M. R., Brackett, M. A., Rivers, S. E., White, M., & Salovey, P. (2012). Classroom emotional climate, student engagement, and academic achievement. Journal of Educational Psychology, 104, 700–712. doi:10.1037/a0027268.
Rosenshine, B. (1970). Evaluation of instruction. Review of Educational Research, 40, 279–300.
Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and Mathematics, 102(6), 245–253. doi:10.1111/j.1949-8594.2002.tb17883.
Scheerens, J. (2004). Review of school and instructional effectiveness. Background paper prepared for the Education for All Global Monitoring Report 2005. Paris: UNESCO.
Scheerens, J., & Bosker, R. (1997). The foundations of educational effectiveness. Oxford, UK: Pergamon.
Schmidt, W. H., Tatto, M. T., Bankov, K., Blömeke, S., Cedillo, T., Cogan, L., et al. (2007). The preparation gap: Teacher education for middle school mathematics in six countries. Mathematics teaching in the 21st century (MT21). East Lansing: Michigan State University, Center for Research in Mathematics and Science Education.
Schoenfeld, A. H. (2013). Classroom observations in theory and practice. ZDM-The International Journal on Mathematics Education, 45(4), 607–621.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499. doi:10.3102/0034654307310317.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Thousand Oaks: Sage.
Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2), 4–31.
Shulman, L. (1987). Knowledge and teaching: Foundations of the new reform. Harvard Educational Review, 57(1), 1–22.
Smith, E., & Gorard, S. (2007). Improving teacher quality: Lessons from America’s No Child Left Behind. Cambridge Journal of Education, 37(2), 191–206.
Soar, R. S., Medley, D. M., & Coker, H. (1983). Teacher evaluation: A critique of currently used methods. The Phi Delta Kappan, 65, 239–246.
Thompson, C. J., & Davis, S. B. (2014). Classroom observation data and instruction in primary mathematics education: Improving design and rigour. Mathematics Education Research Journal, 26(2), 301–323. doi:10.1007/s13394-013-0099-y.
Veenman, S., Kenter, B., & Post, K. (2000). Cooperative learning in Dutch primary classrooms. Educational Studies, 26(3), 281–302.
Acknowledgments
We thank Nils Buchholtz, Andreas Busse and the reviewers for helpful suggestions and comments on earlier versions of this article.
Cite this article
Schlesinger, L., Jentsch, A. Theoretical and methodological challenges in measuring instructional quality in mathematics education using classroom observations. ZDM Mathematics Education 48, 29–40 (2016). https://doi.org/10.1007/s11858-016-0765-0
DOI: https://doi.org/10.1007/s11858-016-0765-0