Research in Higher Education, Volume 53, Issue 8, pp 888–904

Measuring Teaching Effectiveness: Correspondence Between Students’ Evaluations of Teaching and Different Measures of Student Learning

  • Sebastian Stehle
  • Birgit Spinath
  • Martina Kadmon


Relating students’ evaluations of teaching (SETs) to student learning as an approach to validate SETs has produced inconsistent results. The present study tested the hypothesis that the strength of association of SETs and student learning varies with the criteria used to indicate student learning. A multisection validity approach was employed to investigate the association of SETs and two different criteria of student learning, a multiple-choice test and a practical examination. Participants were N = 883 medical students, enrolled in k = 32 sections of the same course. As expected, results showed a strong positive association between SETs and the practical examination but no significant correlation between SETs and multiple-choice test scores. Furthermore, students’ subjective perception of learning significantly correlated with the practical examination score, whereas no relation was found between subjective learning and the multiple-choice test. It is discussed whether these results might be due to different measures of student learning varying in the degree to which they reflect teaching effectiveness.
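The multisection validity approach named in the abstract can be illustrated with a minimal sketch: ratings and a learning criterion are averaged within each section, and the section means are then correlated. The code below is an assumption-laden illustration, not the authors' analysis. The function names, the record layout, and the toy data are hypothetical, and Pearson's r is assumed as the association measure because the abstract does not specify one.

```python
from collections import defaultdict
from math import sqrt


def section_means(records, key):
    """Average one variable within each course section."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec["section"]] += rec[key]
        counts[rec["section"]] += 1
    return {section: sums[section] / counts[section] for section in sums}


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def multisection_correlation(records, rating_key, criterion_key):
    """Correlate section-mean ratings with section-mean criterion scores."""
    ratings = section_means(records, rating_key)
    scores = section_means(records, criterion_key)
    sections = sorted(ratings)  # the section, not the student, is the unit of analysis
    return pearson([ratings[s] for s in sections],
                   [scores[s] for s in sections])


# Hypothetical toy data: three sections, two students each (not from the study).
toy_records = [
    {"section": "A", "set": 3.0, "practical": 60.0},
    {"section": "A", "set": 4.0, "practical": 70.0},
    {"section": "B", "set": 4.0, "practical": 75.0},
    {"section": "B", "set": 5.0, "practical": 85.0},
    {"section": "C", "set": 2.0, "practical": 50.0},
    {"section": "C", "set": 2.0, "practical": 50.0},
]

r = multisection_correlation(toy_records, "set", "practical")
print(f"section-level correlation: {r:.3f}")
```

Aggregating to section means before correlating is what distinguishes this design from a student-level correlation: with k = 32 sections, the correlation would be computed over 32 pairs of means rather than 883 individual pairs.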


Keywords: Students’ evaluations of teaching; SETs; Teaching effectiveness; Student learning



Preparation of the manuscript was supported by a doctoral fellowship through the Landesgraduiertenförderung-LGFG (Funding program of the German Federal State of Baden-Württemberg) awarded to Sebastian Stehle. We thank Gerald Wibbecke and Dr. Monika Porsche for their assistance in the data collection and Dr. Anna Ropeter and Janine Kahman for reviewing the manuscript.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Sebastian Stehle (1), corresponding author
  • Birgit Spinath (2)
  • Martina Kadmon (3)

  1. Department of Psychology, Goethe University Frankfurt, Frankfurt am Main, Germany
  2. Department of Psychology, Heidelberg University, Heidelberg, Germany
  3. Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
