Abstract
This paper examines the effects of two background variables on students' ratings of teaching effectiveness (SETs): class size and student motivation, the latter proxied by students' likelihood of responding randomly. A resampling simulation is used to test the sensitivity of the SET scale for three hypothetical instructors (excellent, average, and poor). In an ideal scenario without confounding factors, SET statistics unmistakably distinguish the three instructors. At different class sizes and levels of random responding, however, SET class averages are significantly biased. The results suggest that evaluations based on SET statistics should look at more than class averages. Resampling methodology (bootstrap simulation) is useful in SET research for studying scale sensitivity, validating research results, and analyzing actual SET scores. Examples show how bootstrap simulation can be applied to comparisons of real-life SET data.
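The kind of bootstrap simulation the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's actual design: the 5-point scale, the two rating pools (`EXCELLENT`, `POOR`), the uniform model of random responding, and the function names are all assumptions made for the example.

```python
import random
import statistics

SCALE = [1, 2, 3, 4, 5]  # assumed 5-point rating scale

# Hypothetical pools of "true" ratings for two instructors (illustrative only).
EXCELLENT = [5] * 6 + [4] * 3 + [3]
POOR = [1] * 3 + [2] * 4 + [3] * 2 + [4]


def bootstrap_class_means(pool, class_size, random_rate, n_boot=2000, seed=0):
    """Resample class averages from a rating pool.

    Each simulated student answers at random (uniformly over SCALE) with
    probability `random_rate`; otherwise the rating is drawn with
    replacement from `pool`.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        ratings = [
            rng.choice(SCALE) if rng.random() < random_rate else rng.choice(pool)
            for _ in range(class_size)
        ]
        means.append(statistics.fmean(ratings))
    return means


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple percentile interval over the resampled class averages."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[int(n * lower / 100)]
    high = ordered[min(n - 1, int(n * upper / 100))]
    return low, high


if __name__ == "__main__":
    for class_size in (10, 50):
        for random_rate in (0.0, 0.3):
            for name, pool in (("excellent", EXCELLENT), ("poor", POOR)):
                means = bootstrap_class_means(pool, class_size, random_rate)
                low, high = percentile_interval(means)
                print(f"n={class_size:>2} random={random_rate:.1f} {name:<9} "
                      f"mean={statistics.fmean(means):.2f} "
                      f"95% interval=({low:.2f}, {high:.2f})")
```

Under these assumptions, random responding pulls every class average toward the scale midpoint, and smaller classes give wider intervals, so the intervals for different instructors lie closer together or overlap. That is one way class averages alone can mislead.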
Cite this article
Chau, CT. A Bootstrap Experiment on the Statistical Properties of Students' Ratings of Teaching Effectiveness. Research in Higher Education 38, 497–517 (1997). https://doi.org/10.1023/A:1024918711471