Moving beyond means: revealing features of the learning environment by investigating the consensus among student ratings

  • Original Paper
  • Published in: Learning Environments Research

Abstract

Student ratings, a critical component in policy efforts to assess and improve teaching, are often collected using questionnaires, and inferences about teachers are then based on aggregated student survey responses. While considerable attention has been paid to the reliability and validity of these aggregates, much less attention has been paid to within-classroom consensus and what that consensus can reveal about classrooms. This study used data from the Measures of Effective Teaching Project to investigate how the consensus among student ratings in a classroom can enhance our understanding of the learning environment and potentially be used to understand features of instructional practice. The results suggest that consensus is related to teacher effectiveness, the questioning strategies used by teachers, and the demographic heterogeneity of students. The possibility of instructional subclimates and the implications for the use of overall averages in teacher appraisal are discussed, together with directions for future research.

Notes

  1. For example, in Classroom 1, the mean is given by \(\frac{10(3)}{10} = 3\). In Classroom 2, the mean is given by \(\frac{5(1) + 5(5)}{10} = 3\). In Classroom 3, the mean is given by \(\frac{2(1) + 2(2) + 2(3) + 2(4) + 2(5)}{10} = 3\). The first sketch following these notes works through this example in code.

  2. Alternative specifications of this model were also used, including a two-level model ignoring the clustering of class sections within teachers, a three-level model with district fixed effects, and a four-level model with district random effects. Because these alternative specifications did not influence the substantive conclusions, only one set of models is presented here. The second sketch following these notes illustrates two of these specifications.
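
To make the contrast in Note 1 concrete, the sketch below (illustrative Python, not code from the study) reproduces the three hypothetical classrooms and reports, alongside their identical means, the observed variance and an rwg-style agreement index computed against a uniform null distribution for a five-point scale; negative values of the index, which arise when ratings are more dispersed than the uniform benchmark, are conventionally truncated to zero.

```python
# Illustrative sketch only: three hypothetical classrooms of ten students that
# share a mean rating of 3 on a 1-5 scale but differ sharply in consensus.
import statistics

classrooms = {
    "Classroom 1": [3] * 10,                        # complete agreement
    "Classroom 2": [1] * 5 + [5] * 5,               # two opposing camps
    "Classroom 3": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],  # ratings evenly spread
}

# Variance of a uniform ("no agreement") null distribution over A = 5 options.
null_variance = (5 ** 2 - 1) / 12

for name, ratings in classrooms.items():
    mean = statistics.mean(ratings)
    observed = statistics.variance(ratings)              # sample variance
    agreement = max(0.0, 1 - observed / null_variance)   # 1 = perfect consensus
    print(f"{name}: mean = {mean:.1f}, variance = {observed:.2f}, "
          f"agreement = {agreement:.2f}")
```

All three classrooms print a mean of 3.0, but the agreement index separates complete consensus (Classroom 1) from the other two patterns, and the variance further separates the polarized classroom from the evenly spread one.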

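The second sketch, again purely illustrative and using hypothetical column names rather than the MET data files, shows how two of the specifications mentioned in Note 2 could be written with the statsmodels mixed-effects interface: a two-level model that ignores the clustering of class sections within teachers, and a three-level model with district fixed effects and random intercepts for class sections nested within teachers. The four-level specification with district random effects does not map as cleanly onto this interface and is omitted.

```python
# Illustrative sketch only; the file and the columns 'rating', 'section',
# 'teacher', and 'district' are hypothetical stand-ins for a long-format
# student-level data set.
import pandas as pd
import statsmodels.formula.api as smf

ratings = pd.read_csv("student_ratings.csv")

# (a) Two-level model: students nested in class sections, ignoring the
#     clustering of class sections within teachers.
two_level = smf.mixedlm("rating ~ 1", ratings, groups=ratings["section"]).fit()

# (b) Three-level model with district fixed effects: random intercepts for
#     teachers and for sections nested within teachers; districts as dummies.
three_level = smf.mixedlm(
    "rating ~ C(district)",
    ratings,
    groups=ratings["teacher"],
    vc_formula={"section": "0 + C(section)"},
).fit()

print(two_level.summary())
print(three_level.summary())
```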

Author information

Corresponding author

Correspondence to Jonathan David Schweig.

Ethics declarations

Conflict of interest

Jonathan Schweig declares that he has no conflict of interest.

Ethical standards

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008.

Informed consent

Informed consent was obtained from all individuals included in the study.

About this article

Cite this article

Schweig, J.D. Moving beyond means: revealing features of the learning environment by investigating the consensus among student ratings. Learning Environ Res 19, 441–462 (2016). https://doi.org/10.1007/s10984-016-9216-7
