Abstract
This study pioneers a Rasch scoring approach and compares it with a conventional summative approach for measuring longitudinal gains in student learning. In this methodological note, the proposed methodology is demonstrated using rating scales from a student survey conducted as part of a higher education outcome assessment. Such assessments have become increasingly important worldwide for purposes of institutional accreditation and accountability to stakeholders. Data were collected in a longitudinal study that tracked the self-reported learning outcomes of individual students in the same cohort who completed the student learning experience questionnaire (SLEQ) in their first and final years. The Rasch model was employed for item calibration and latent trait estimation, together with a concurrent-calibration scaling procedure incorporating a randomly equivalent group design and a single group design to measure the gains in self-reported learning outcomes yielded by the repeated measures. The extent to which Rasch scoring outperformed the conventional summative scoring method in its sensitivity to change was quantified by a statistical index, namely relative performance (RP). Findings indicated that Rasch scoring captured gains in learning outcomes better than the conventional summative scoring method, with RP values ranging from 3 to 17% in the cognitive, social, and value domains of the SLEQ. The Rasch scoring approach and the scaling procedure presented in the study can be readily generalised to studies using rating scales to measure change in student learning in the higher education context. The methodological innovations and contributions of this study are discussed.
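The abstract does not give the formula for the relative performance (RP) index. A minimal sketch of one common definition from the sensitivity-to-change literature — the ratio of squared paired t-statistics for the two scoring methods, expressed as a percentage advantage — is shown below on simulated data. The data, cohort size, and the exact RP formula are all assumptions for illustration, not the paper's actual specification.

```python
# Hedged sketch: comparing the sensitivity to change of two scoring
# methods via a relative-performance (RP) index, here assumed to be the
# ratio of squared paired t-statistics expressed as a percentage
# advantage. All data are simulated; nothing below reproduces the
# paper's actual SLEQ results.
import numpy as np

rng = np.random.default_rng(42)
n = 300  # hypothetical cohort size

# Simulated first-year and final-year latent-trait (Rasch) scores for
# the same students, with a true mean gain of 0.4 logits.
gain = 0.4
rasch_y1 = rng.normal(0.0, 1.0, n)
rasch_y4 = rasch_y1 + gain + rng.normal(0.0, 0.8, n)

# A noisier summative-score proxy of the same underlying traits.
sum_y1 = rasch_y1 + rng.normal(0.0, 0.3, n)
sum_y4 = rasch_y4 + rng.normal(0.0, 0.3, n)

def paired_t(pre, post):
    """Paired t-statistic for repeated measures on the same students."""
    d = post - pre
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

t_rasch = paired_t(rasch_y1, rasch_y4)
t_sum = paired_t(sum_y1, sum_y4)

# RP > 0 means the first method is more sensitive to change.
rp = (t_rasch**2 / t_sum**2 - 1) * 100
print(f"RP = {rp:.1f}%")
```

Under this definition, an RP of 10% would mean the Rasch scores detect the same gain with a 10% larger squared t-statistic, i.e. the summative approach would need a proportionally larger sample to reach the same statistical power.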
Notes
In each administration year, a response rate of around 61% was achieved.
The other SLEQ items which are out of the scope of the current study assess students’ perceptions of their teaching and learning environment.
Zhao, Y., Huen, J.M.Y. & Chan, Y.W. Measuring Longitudinal Gains in Student Learning: A Comparison of Rasch Scoring and Summative Scoring Approaches. Res High Educ 58, 605–616 (2017). https://doi.org/10.1007/s11162-016-9441-z