Abstract
Observation-based frameworks of instructional quality differ considerably in the approaches to and purposes of their development, their theoretical underpinnings, the instructional aspects they cover, their operationalization and measurement, and the existing evidence on their reliability and validity. The current paper summarizes and reflects on these differences by considering the 12 frameworks included in this special issue. By comparing the analyses of three focal mathematics lessons through the lens of each framework, as presented in the preceding papers, this paper also examines the similarities, differences, and potential complementarities of these frameworks in describing and evaluating mathematics instruction. To do so, a common structure for comparing all frameworks is suggested and applied to the analyses of the three selected lessons. The paper concludes that although significant work has been pursued in recent years in exploring instructional quality through classroom observation frameworks, the field would benefit from establishing agreed-upon standards for understanding and studying instructional quality, as well as from more collaborative work.
Notes
For ease of discussion, we refer to generic, content-specific and hybrid frameworks, although, as explained in the introductory paper (see Charalambous and Praetorius 2018), these frameworks can be situated along a continuum in terms of how generic and content-specific they are.
It is possible, however, that the intended uses of the frameworks are (better) clarified in other publications.
The quotes here and in what follows are directly drawn from the preceding papers of this special issue.
In the publications in the special issue, this challenge is solved by the CLASS researchers by focusing on one version of CLASS (i.e., CLASS-UE), and by the TBD researchers by providing a comprehensive overview of elements included in different studies. In the following, we refer to the results presented based on these decisions.
For ease of reference, in what follows we use the term “elements” to collectively capture the different terms employed across instruments to describe what they contain at their respective Levels 1, 2, and 3.
DMEE and MECORS complement these with low-inference instruments. High-inference instruments require a high degree of subjective judgment on the raters’ part, thus allowing more latitude for interpretation. In contrast, low-inference instruments constrain such interpretations by focusing on more readily observable behaviors and thus reduce both ambiguity and the need for interpretation (see more on this distinction in the publication by Kennedy 2010, p. 231).
Whereas the two preceding measurement decisions are closely tied to the specific framework, the following two might vary to a certain degree from study to study, but still allow us to point to some differences among frameworks. Presented here is the typical approach according to the authors of the respective papers in the special issue.
Instead of conducting a comprehensive review on these issues, we contacted the authors of each paper and asked them to provide us with information regarding reliability and validity. For some frameworks (e.g., TBD) a systematic literature review was conducted to gather such information. For other frameworks for which a large number of publications was available (e.g., CLASS), doing so would have required considerable effort; therefore, the data provided should be seen as indicative.
Because the frameworks differed in the criteria used to examine reliability and validity, Tables C1 and C2 include those common criteria that were also reported by most frameworks.
For example, Shavelson et al. (1986) suggest that a crossed design should be preferred when the lessons are considered as interchangeable, which is the case when the teachers are observed on the same day teaching the same or similar lessons content-wise; this design should also be used when certain features are largely similar across teachers (e.g., the time interval between observations among teachers is shorter than the time interval between observations within lessons of each teacher).
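The crossed design mentioned above can be made concrete with a small simulation. The sketch below is purely illustrative (the data, effect sizes, and variable names are invented): in a crossed teacher-by-lesson (p × l) design where every teacher is observed on every lesson, the teacher, lesson, and residual variance components can be separated with standard two-way mean squares, and a generalizability coefficient can then be computed for a given number of observed lessons.

```python
import random

# Hypothetical crossed p x l design: every teacher is scored on every lesson.
# All numbers below are simulated for illustration only.
random.seed(7)
n_teachers, n_lessons = 20, 4

# Simulate ratings as teacher effect + lesson effect + noise.
teacher_eff = [random.gauss(0, 1.0) for _ in range(n_teachers)]
lesson_eff = [random.gauss(0, 0.3) for _ in range(n_lessons)]
scores = [[3.0 + teacher_eff[p] + lesson_eff[l] + random.gauss(0, 0.5)
           for l in range(n_lessons)] for p in range(n_teachers)]

grand = sum(sum(row) for row in scores) / (n_teachers * n_lessons)
t_means = [sum(row) / n_lessons for row in scores]
l_means = [sum(scores[p][l] for p in range(n_teachers)) / n_teachers
           for l in range(n_lessons)]

# Mean squares for a two-way crossed design with one observation per cell.
ms_t = n_lessons * sum((m - grand) ** 2 for m in t_means) / (n_teachers - 1)
ms_l = n_teachers * sum((m - grand) ** 2 for m in l_means) / (n_lessons - 1)
ss_res = sum((scores[p][l] - t_means[p] - l_means[l] + grand) ** 2
             for p in range(n_teachers) for l in range(n_lessons))
ms_res = ss_res / ((n_teachers - 1) * (n_lessons - 1))

# Estimated variance components (negative estimates clamped to zero).
var_res = ms_res
var_t = max(0.0, (ms_t - ms_res) / n_lessons)
var_l = max(0.0, (ms_l - ms_res) / n_teachers)

# Generalizability coefficient for relative decisions over n_lessons lessons.
g_coef = var_t / (var_t + var_res / n_lessons)
print(f"teacher var={var_t:.2f}, lesson var={var_l:.2f}, residual={var_res:.2f}")
print(f"G coefficient over {n_lessons} lessons: {g_coef:.2f}")
```

Because the design is fully crossed, lesson effects can be estimated and removed from the teacher comparison; in a nested design they would be confounded with teachers, which is why the crossed design is preferable when lessons are interchangeable across teachers.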
In making this argument we do not mean to imply that the MET study was without limitations in terms of its design and the results obtained.
For reasons of parsimony, we used Level-1 elements instead of Level-2 elements to organize the extensive list generated from the preceding procedure.
The general description of the Level-1 indicators was not always consistent with the Level-2 and Level-3 indicators. Some Level-1 classifications were very broad and encompassed other Level-1 or Level-2 classifications within the same instrument. Some Level-3 indicators captured diverse elements that did not necessarily correspond to or reflect the Level-2 indicators within the same instrument.
The distinction among the intended, implemented, and achieved curriculum (McKnight 1979) comes close to this idea, but still does not emphasize the importance of explicitly attending to students’ use of opportunities.
References
AERA/APA/NCME (2014). Standards for educational and psychological testing. Washington: American Educational Research Association.
Ball, D. L., Sleep, L., Boerst, T. A., & Bass, H. (2009). Combining the development of practice and the practice of development in teacher education. The Elementary School Journal, 109(5), 458–474.
Bell, C., Gitomer, D. H., McCaffrey, D. F., Hamre, B. K., Pianta, R. C., & Qi, Y. (2012). An argument approach to observation protocol validity. Educational Assessment, 17(2–3), 62–87. https://doi.org/10.1080/10627197.2012.715014.
Bell, C. A., Qi, Y., Croft, A., Leusner, D., McCaffrey, D. F., Gitomer, D. H., & Pianta, R. (2014). Improving observational score quality: Challenges in observer thinking. In K. Kerr, R. Pianta & T. Kane (Eds.), Designing teacher evaluation systems: New guidance from the Measures of Effective Teaching project (pp. 50–97). San Francisco: Jossey-Bass.
Berlin, R., & Cohen, J. (2018). Understanding instructional quality through a relational lens. ZDM Mathematics Education. (this issue).
Berliner, D. C. (2005). The near impossibility of testing for teacher quality. Journal of Teacher Education, 56(3), 205–213. https://doi.org/10.1177/0022487105275904.
Boston, M. D., & Candela, A. G. (2018). The instructional quality assessment as a tool for reflecting on instructional practice. ZDM Mathematics Education. (this issue).
Brennan, R. L. (2001). Generalizability theory. New York: Springer.
Casabianca, J. M., Lockwood, J. R., & McCaffrey, D. F. (2015). Trends in classroom observation scores. Educational and Psychological Measurement, 75(2), 311–337. https://doi.org/10.1177/0013164414539163.
Chapman, C., Reynolds, D., Muijs, D., Sammons, P., Stringfield, S., & Teddlie, C. (2016). Educational effectiveness and improvement research and practice. In C. Chapman, D. Muijs, D. Reynolds, P. Sammons & C. Teddlie (Eds.), The Routledge international handbook of educational effectiveness and improvement: research, policy, and practice (pp. 1–24). New York: Routledge.
Charalambous, C. Y., & Litke, E. (2018). Studying instructional quality by using a content-specific lens: The case of the Mathematical Quality of Instruction framework. ZDM Mathematics Education. (this issue).
Charalambous, C. Y., & Pitta-Pantazi, D. (2016). Perspectives on priority mathematics education: Unpacking and understanding a complex relationship linking teacher knowledge, teaching, and learning. In L. English & D. Kirshner (Eds.), Handbook of international research in mathematics education (3rd edn., pp. 19–59). Abingdon: Routledge.
Charalambous, C. Y., & Praetorius, A. K. (2018). Studying instructional quality in mathematics through different lenses: In search of common ground. ZDM Mathematics Education. (this issue).
Cohen, D. K. (2011). Teaching and its predicaments. Cambridge: Harvard University Press.
Cronbach, L. (1990). Essentials of psychological testing (5th edn.). Boston: Allyn & Bacon, Inc.
Diederich, J., & Tenorth, H. E. (1997). Theorie der Schule. Ein Studienbuch zu Geschichte, Funktionen und Gestaltung [Theory of the school: A study book on its history, functions, and design]. Berlin: Cornelsen.
Fend, H. (1981). Theorie der Schule [Theory of the school]. München: Urban & Schwarzenberg.
Gitomer, D. (2009). Crisp measurement and messy context: A clash of assumptions and metaphors—Synthesis of Section III. In D. H. Gitomer (Ed.), Measurement issues and assessment for teaching quality (pp. 223–233). Thousand Oaks: Sage.
Gitomer, D. H., & Bell, C. A. (2013). Evaluating teaching and teachers. In K. F. Geisinger (Ed.), APA handbook of testing and assessment in psychology (Vol. 3, pp. 415–444). Washington: American Psychological Association.
Grossman, P., & McDonald, M. (2008). Back to the future: Directions for research in teaching and teacher education. American Educational Research Journal, 45(1), 184–205. https://doi.org/10.3102/0002831207312906.
Hattie, J. A. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. New York: Routledge.
Herlihy, C., Karger, E., Pollard, C., Hill, H. C., Kraft, M. A., Williams, M., & Howard, S. (2014). State and local efforts to investigate the validity and reliability of scores from teacher evaluation systems. Teachers College Record, 116(1), 1–28.
Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: teacher observation systems and a case for the generalizability study. Educational Researcher, 41(2), 56–64. https://doi.org/10.3102/0013189X12437203.
Kennedy, M. M. (2010). Approaches to annual performance assessment. In M. M. Kennedy (Ed.), Teacher assessment and the quest for teacher quality: A handbook (pp. 225–250). San Francisco: Jossey-Bass.
Ko, J., Sammons, P., & Bakkum, L. (2016). Effective teaching. Education Development Trust. https://www.educationdevelopmenttrust.com/~/media/EDT/Reports/Research/2015/r-effective-teaching.pdf. Accessed September 15, 2017.
Konstantopoulos, S. (2012). Teacher effects: Past, present and future. In S. Kelly (Ed.), Assessing teacher quality: Understanding teacher effects on instruction and achievement (pp. 33–48). New York: Teachers College Press.
Koretz, D. (2008). Measuring up: What educational testing really tells us. Cambridge: Harvard University Press.
Krosnick, J. A., & Presser, S. (2010). Questionnaire design. In J. D. Wright & P. V. Marsden (Eds.), Handbook of survey research (2nd edn., pp. 503–512). West Yorkshire: Emerald Group.
Kyriakides, L., Creemers, B. P. M., & Panayiotou, A. (2018). Using educational effectiveness research to promote quality of teaching: The contribution of the dynamic model. ZDM Mathematics Education. (this issue).
Lampert, M. (2010). Learning teaching in, from, and for practice: What do we mean? Journal of Teacher Education, 61(1–2), 21–34.
Lindorff, A., & Sammons, P. (2018). Going beyond structured observations: Looking at classroom practice through a mixed method lens. ZDM Mathematics Education. (this issue).
Maykut, P. S., & Morehouse, R. (1994). Beginning qualitative research: A philosophic and practical guide. London: Falmer Press.
McKnight, C. C. (1979). Model for the Second Study of Mathematics. In Bulletin 4: Second IEA Study of Mathematics. Urbana: SIMS Study Center.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd edn., pp. 13–103). Washington: American Council on Education & National Council on Measurement in Education.
Metzler, H. (1990). Methodological interdependencies between conceptualization and operationalization in empirical social sciences. In E. Zarnecka-Bialy (Ed.), Logic counts. Reason and argument (Vol. 3, pp. 167–176). Dordrecht: Springer.
Muijs, D., Kyriakides, L., van der Werf, G., Creemers, B., Timperley, H., & Earl, L. (2014). State of the art-teacher effectiveness and professional learning. School Effectiveness and School Improvement, 25(2), 231–256. https://doi.org/10.1080/09243453.2014.885451.
Muijs, D., Reynolds, D., Sammons, P., Kyriakides, L., Creemers, B. P. M., & Teddlie, C. (2018). Assessing individual lessons using a generic teacher observation instrument: How useful is the International System for Teacher Observation and Feedback (ISTOF)? ZDM Mathematics Education. (this issue).
Open Science Collaboration (2017). Maximizing the reproducibility of your research. In S. O. Lilienfeld & I. D. Waldman (Eds.), Psychological science under scrutiny: Recent challenges and proposed solutions (pp. 1–21). New York: Wiley.
Patton, M. Q. (2002). Qualitative research & evaluation methods (3rd edn.). London: Sage Publications.
Praetorius, A. K., Lenske, G., & Helmke, A. (2012). Observer ratings of instructional quality: Do they fulfill what they promise? Learning and Instruction, 22(6), 387–400. https://doi.org/10.1016/j.learninstruc.2012.03.002.
Praetorius, A.-K., Pauli, C., Reusser, K., Rakoczy, K., & Klieme, E. (2014). One lesson is all you need? Stability of instructional quality across lessons. Learning and Instruction, 31, 2–12.
Praetorius, A.-K., Klieme, E., Herbert, B., & Pinger, P. (2018). Generic dimensions of teaching quality: The German framework of the three basic dimensions. ZDM Mathematics Education. (this issue).
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd edn.). Thousand Oaks: Sage Publications.
Rosenshine, B. (1983). Teaching functions in instructional programs. The Elementary School Journal, 83(4), 335–351. https://doi.org/10.1086/461321.
Scheerens, J. (2013). The use of theory in school effectiveness research revisited. School Effectiveness and School Improvement, 24(1), 1–38. https://doi.org/10.1080/09243453.2012.691100.
Schlesinger, L., Jentsch, A., Kaiser, G., König, J., & Blömeke, S. (2018). Subject-specific characteristics of instructional quality in mathematics education. ZDM Mathematics Education. (this issue).
Schoenfeld, A. (2018). Video analyses for research and professional development: the teaching for robust understanding (TRU) framework. ZDM Mathematics Education. (this issue).
Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47, 609–612. https://doi.org/10.1016/j.jrp.2013.05.009.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499. https://doi.org/10.3102/0034654307310317.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Thousand Oaks: Sage.
Shavelson, R. J., Webb, N. M., & Burstein, L. (1986). Measurement of teaching. In M. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 50–91). New York: Macmillan.
Stein, M. K., Grover, B., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal, 33, 455–488. https://doi.org/10.3102/00028312033002455.
Tomlinson, C. A., & Moon, T. R. (2013). Assessment and student success in a differentiated classroom. Alexandria: ASCD.
Walkington, C., & Marder, M. (2018). Using the UTeach Observation Protocol (UTOP) to understand the quality of mathematics instruction. ZDM Mathematics Education. (this issue).
Walkowiak, T. A., Berry, R. Q., Pinter, H. H., & Jacobson, E. D. (2018). Utilizing the M-Scan to measure standards-based mathematics teaching practices: Affordances and limitations. ZDM Mathematics Education. (this issue).
Whetten, D. A. (1989). What constitutes a theoretical contribution? Academy of Management Review, 14, 490–495.
Wirtz, M., & Caspar, F. (2002). Beurteilerübereinstimmung und Beurteilerreliabilität [Interrater agreement and interrater reliability]. Göttingen: Hogrefe.
Acknowledgements
We would like to thank all authors who contributed to this special issue and invested considerable time and energy in replying to all our questions and requests. Our gratitude also goes to the reviewers of each paper within this special issue, as well as the reviewers of this paper, who helped considerably to improve the quality of the special issue.
Praetorius, AK., Charalambous, C.Y. Classroom observation frameworks for studying instructional quality: looking back and looking forward. ZDM Mathematics Education 50, 535–553 (2018). https://doi.org/10.1007/s11858-018-0946-0