Classroom observation frameworks for studying instructional quality: looking back and looking forward
Abstract
Observation-based frameworks of instructional quality differ considerably in the approach to and purposes of their development, their theoretical underpinnings, the instructional aspects they cover, their operationalization and measurement, and the existing evidence on their reliability and validity. This paper summarizes and reflects on these differences by considering the 12 frameworks included in this special issue. By comparing the analyses of three focal mathematics lessons through the lens of each framework, as presented in the preceding papers, it also examines the similarities, differences, and potential complementarities of these frameworks for describing and evaluating mathematics instruction. To this end, a common structure for comparing all frameworks is proposed and applied to the analyses of the three selected lessons. The paper concludes that although considerable work has been done in recent years to explore instructional quality through classroom observation frameworks, the field would benefit from establishing agreed-upon standards for understanding and studying instructional quality, as well as from more collaborative work.
Keywords
Research synthesis · Instructional quality · Mathematics instruction · Observation frameworks
Acknowledgements
We would like to thank all authors who contributed to this special issue and invested considerable time and energy in responding to our questions and requests. Our gratitude also goes to the reviewers of each individual paper in this special issue, as well as the reviewers of this paper, whose feedback considerably improved the quality of the special issue.