Regular Feedback from Student Ratings of Instruction: Do College Teachers Improve their Ratings in the Long Run?

Published in Instructional Science

Abstract

The authors examined whether feedback from student ratings of instruction not augmented with consultation helps college teachers to improve their student ratings on a long-term basis. The study reported was conducted in an institution where no previous teaching-effectiveness evaluations had taken place. At the end of each of four consecutive semesters, student ratings were assessed and teachers were provided with feedback. Data from 3122 questionnaires evaluating 12 teachers were analyzed using polynomial and piecewise random coefficient models. Results revealed that student ratings increased from the no-feedback baseline semester to the second semester and then gradually decreased from the second to the fourth semester, although feedback was provided after each semester. The findings suggest that student ratings not augmented with consultation are far less effective than typically assumed when considered from a long-term perspective.
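The piecewise random coefficient approach mentioned in the abstract rests on splitting the time variable at a breakpoint so that the rise and the subsequent decline get separate slopes. The coding below is an illustrative sketch (the semester coding and variable names are assumptions, not the authors' actual analysis code):

```python
import numpy as np

# Semesters coded 0-3: 0 = no-feedback baseline, 1-3 = semesters
# following each round of student-rating feedback.
t = np.array([0, 1, 2, 3])

# Piecewise coding with a breakpoint at the second semester (t = 1):
# "rise" carries the change from baseline to semester 2,
# "decline" carries the change from semester 2 onward.
rise = np.minimum(t, 1)         # -> [0, 1, 1, 1]
decline = np.maximum(t - 1, 0)  # -> [0, 0, 1, 2]

# In a random coefficient model, ratings would be regressed on both
# predictors with teacher-level random effects; the pattern reported
# in the abstract corresponds to a positive "rise" slope and a
# negative "decline" slope.
print(rise.tolist(), decline.tolist())
```

With this coding, a single linear trend cannot mimic the reported rise-then-decline pattern, which is why the piecewise model is compared against the polynomial one.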


References

  • Abrami P.C., d’Apollonia S. (1991). Multidimensional students’ evaluations of teaching effectiveness? Generalizability of “N = 1” research: Comment on Marsh (1991). Journal of Educational Psychology 83:411–415

  • Abrami P.C., d’Apollonia S., Cohen P.A. (1990). Validity of student ratings of instruction: What we know and what we do not. Journal of Educational Psychology 82:219–231

  • Adair J.G., Sharpe D., Huynh C.L. (1989). Hawthorne control procedures in educational experiments: A reconsideration of their use and effectiveness. Review of Educational Research 59:215–228

  • Akaike H. (1973). Information theory as an extension of the maximum likelihood principle. In: Petrov B.N., Csaki F. (eds). Second international symposium on information theory. Akademiai Kiado, Budapest, Hungary, pp. 267–281

  • Armstrong S.J. (1998). Are student ratings of instruction useful? American Psychologist 53:1223–1224

  • Basow S.A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology 87:656–665

  • Biesanz J.C., Deeb-Sossa N., Papadakis A.A., Bollen K.A., Curran P.J. (2004). The role of coding time in estimating and interpreting growth curve models. Psychological Methods 9:30–52

  • Bliese P.D. (2000). Within-group agreement, non-independence, and reliability: Implications for data aggregation and analysis. In: Klein K.J., Kozlowski S.W.J. (eds). Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions. Jossey-Bass, San Francisco, CA, pp. 349–381

  • Bliese P.D., Jex S.M. (2002). Incorporating a multilevel perspective into occupational stress research: Theoretical, methodological, and practical implications. Journal of Occupational Health Psychology 7:265–276

  • Bliese P.D., Ployhart R.E. (2002). Growth modeling using random coefficient models: Model building, testing, and illustrations. Organizational Research Methods 5:362–387

  • Bryk A.S., Raudenbush S.W. (1987). Applications of hierarchical linear models to assessing change. Psychological Bulletin 101:147–158

  • Campbell D.T., Stanley J.C. (1963). Experimental and quasi-experimental designs for research. Rand McNally, Chicago

  • Carlson K.D., Schmidt F.L. (1999). Impact of experimental design on effect size: Findings from the research literature on training. Journal of Applied Psychology 84:851–862

  • Carter R.E. (1989). Comparison of criteria for academic promotion of medical-school and university-based psychologists. Professional Psychology: Research and Practice 20:400–403

  • Cashin W.E., Downey R.G. (1992). Using global student rating items for summative evaluation. Journal of Educational Psychology 84:563–572

  • Cohen J., Cohen P., West S.G., Aiken L.S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed). Erlbaum, Mahwah, NJ

  • Cohen P.A. (1980). Effectiveness of student-rating feedback for improving college instruction: A meta-analysis. Research in Higher Education 13:321–341

  • Coleman J., McKeachie W.J. (1981). Effects of instructor/course evaluations on student course selection. Journal of Educational Psychology 73:224–226

  • Cronbach L.J., Furby L. (1970). How we should measure “change”: Or should we? Psychological Bulletin 74:68–80

  • d’Apollonia S., Abrami P.C. (1997). Navigating student ratings of instruction. American Psychologist 52:1198–1208

  • DeShon R.P., Ployhart R.E., Sacco J.M. (1998). The estimation of reliability in longitudinal models. International Journal of Behavioral Development 22:493–515

  • Diehl, J.M. (2002) VBVOR–VBREF. Fragebögen zur studentischen Evaluation von Hochschulveranstaltungen – Manual. [VBVOR – VBREF. Questionnaires for students’ evaluations of college courses–Manual]. Retrieved on March 17, 2005 from http://www.psychol.uni-giessen.de/dl/det/diehl/2368/

  • Diehl J.M. (2003). Normierung zweier Fragebögen zur studentischen Beurteilung von Vorlesungen und Seminaren [Student evaluations of lectures and seminars: Norms for two recently developed questionnaires]. Psychologie in Erziehung und Unterricht 50:27–42

  • Diehl J.M., Kohr H.-U. (1977). Entwicklung eines Fragebogens zur Beurteilung von Hochschulveranstaltungen im Fach Psychologie [Development of a psychology course evaluation questionnaire]. Psychologie in Erziehung und Unterricht 24:61–75

  • Firebaugh G. (1978). A rule for inferring individual-level relationships from aggregate data. American Sociological Review 43:557–572

  • Franklin J., Theall M. (2002). Faculty thinking about the design and evaluation of instruction. In: Hativa N., Goodyear P. (eds). Teacher thinking, beliefs and knowledge in higher education. Kluwer, Dordrecht, The Netherlands

  • Greenwald A.G. (1997). Validity concerns and usefulness of student ratings of instruction. American Psychologist 52:1182–1186

  • Greenwald A.G., Gillmore G.M. (1997a). Grading leniency is a removable contaminant of student ratings. American Psychologist 52:1209–1217

  • Greenwald A.G., Gillmore G.M. (1997b). No pain, no gain? The importance of measuring course workload in student ratings of instruction. Journal of Educational Psychology 89:743–751

  • Greenwald A.G., Gillmore G.M. (1998). How useful are student ratings? Reactions to comments on the current issues section. American Psychologist 53:1228–1229

  • Guzzo R.A., Jette R.D., Katzell R.A. (1985) The effects of psychologically based intervention programs on worker productivity: A meta-analysis. Personnel Psychology 38:275–291

  • Hernández-Lloreda M.V., Colmenares F., Martínez-Arias R. (2004). Application of piecewise hierarchical linear growth modeling to the study of continuity in behavioral development of baboons (Papio hamadryas). Journal of Comparative Psychology 118:316–324

  • Hofmann D.A., Jacobs R., Baratta J.E. (1993). Dynamic criteria and the measurement of change. Journal of Applied Psychology 78:194–204

  • Howell A.J., Symbaluk D.G. (2001). Published student ratings of instruction: Revealing and reconciling the views of students and faculty. Journal of Educational Psychology 93:790–796

  • James L.R., Demaree R.G., Wolf G. (1984). Estimating within-group interrater reliability with and without response bias. Journal of Applied Psychology 69:85–98

  • Klein K.J., Dansereau F., Hall R.J. (1994). Levels issues in theory development, data collection, and analysis. Academy of Management Review 19:195–229

  • Kluger A.N., DeNisi A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin 119:254–284

  • L’Hommedieu R., Menges R.J., Brinko K.T. (1990). Methodological explanations for the modest effects of feedback from student ratings. Journal of Educational Psychology 82:232–241

  • Longford N. (1993). Random coefficient models. Oxford University Press, Oxford

  • Marsh H.W. (1991). Multidimensional students’ evaluations of teaching effectiveness: A test of alternative higher-order structures. Journal of Educational Psychology 83:285–296

  • Marsh H.W. (1994). Comments to: “Review of the dimensionality of student ratings of instruction: I. Introductory remarks. II. Aggregation of factor studies. III. A meta-analysis of the factor studies”. Instructional Evaluation and Faculty Development 14:13–19

  • Marsh H.W., Hocevar D. (1991). The multidimensionality of students’ evaluations of teaching effectiveness: The generality of factor structures across academic discipline, instructor level, and course level. Teaching and Teacher Education 7:9–18

  • Marsh H.W., Roche L.A. (1993). The use of student evaluations and an individually structured intervention to enhance university teaching effectiveness. American Educational Research Journal 30:217–251

  • Marsh H.W., Roche L.A. (1997). Making students’ evaluations of teaching effectiveness effective: The critical issues of validity, bias, and utility. American Psychologist 52:1187–1197

  • Marsh H.W., Roche L.A. (2000). Effects of grading leniency and low workload on students’ evaluations of teaching: Popular myth, bias, validity, or innocent bystanders? Journal of Educational Psychology 92:202–228

  • McKeachie W.J. (1997). Student ratings—the validity of use. American Psychologist 52:1218–1225

  • Pinheiro J.C., Bates D.M. (2000). Mixed-effects models in S and S-PLUS. Springer, New York

  • R Development Core Team (2005). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

  • Raftery A.E. (1995). Bayesian model selection in social research. Sociological Methodology 25:111–196

  • Raudenbush S.W., Bryk A.S. (2002). Hierarchical linear models: Applications and data analysis methods, 2nd edn. Sage, Thousand Oaks, CA

  • Rousseau D. (1985). Issues of level in organizational research: Multi-level and cross-level perspectives. In: Cummings L.L., Staw B.M. (eds). Research in organizational behavior, vol. 7. JAI Press, Greenwich, CT, pp. 1–37

  • Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics 6:461–464

  • Stevens J.J., Aleamoni L.M. (1985). The use of evaluative feedback for instructional improvement: A longitudinal perspective. Instructional Science 13:285–304

  • Ting K. (2000). Cross-level effects of class characteristics on students’ perceptions of teaching quality. Journal of Educational Psychology 92:818–825

  • Wilhelm W.B. (2004). The relative influence of published teaching evaluations and other instructor attributes on course choice. Journal of Marketing Education 26:17–30

  • Wood R.E., Locke E.A. (1990). Goal setting and strategy effects on complex tasks. In: Cummings L.L., Staw B.M. (eds). Research in organizational behavior, vol. 12. JAI Press, Greenwich, CT, pp. 73–109


Acknowledgements

We would like to thank Jessica Ippolito, Anette Kluge, Jan Schilling, and two anonymous reviewers for their helpful comments on an earlier version of this article, and Paul D. Bliese for answering questions on random coefficient modeling and data aggregation. Further thanks go to Susannah Goss for improving the language of this article.

Author information

Correspondence to Jonas W. B. Lang.

Appendix: Items used in the study

  1. What grade would you give the instructor?

  2. What grade would you give this class?

  3. The instructor was not particularly interested in the students’ progress. (R)

  4. The instructor’s attitude toward the students was cold and impersonal. (R)

  5. The instructor seemed to see teaching as a duty and a routine activity. (R)

  6. The instructor was clearly only interested in getting through the material. (R)

  7. It was easy to follow the material covered in the course.

  8. Too much material was covered in the course. (R)

  9. The pace was too fast. (R)

  10. You had to put in a lot of extra work to keep up with the course. (R)

  11. The course was often confusing because it seemed to lack structure, and it was easy to get lost. (R)

  12. The instructor presented the material in a clear and understandable manner.

  13. The instructor planned and delivered the course well.

  14. The course was clearly structured.

Note: Original German versions of the items may be found in Diehl (2002). Items 1 and 2 are from the global subscale, items 3–6 from the rapport subscale, items 7–10 from the difficulty subscale, and items 11–14 from the teaching skill subscale. R = items reverse-scored to form the overall index.
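Combining regular and reverse-scored (R) items into one overall index can be sketched as follows. The 5-point response scale and the example responses are illustrative assumptions, not the questionnaire's actual format (which is documented in Diehl, 2002):

```python
# Reverse-score a response on an assumed k-point scale: (k + 1) - response.
def reverse_score(response: int, k: int = 5) -> int:
    # Illustrative: on a 5-point scale this maps 1 -> 5, 2 -> 4, ..., 5 -> 1.
    return (k + 1) - response

# Hypothetical responses to items 3-6 (rapport subscale, all marked R):
responses = [2, 1, 3, 2]
rescored = [reverse_score(r) for r in responses]
print(rescored)  # [4, 5, 3, 4]

# After rescoring, regular and reversed items point in the same direction
# and can be averaged into the overall index.
overall = sum(rescored) / len(rescored)
print(overall)  # 4.0
```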

Cite this article

Lang, J.W.B., Kersting, M. Regular Feedback from Student Ratings of Instruction: Do College Teachers Improve their Ratings in the Long Run? Instr Sci 35, 187–205 (2007). https://doi.org/10.1007/s11251-006-9006-1
