Evaluating Equating Transformations from Different Frameworks
Test equating is used to ensure that scores from different test forms can be used interchangeably. This paper compares the statistical and computational properties of three equating frameworks: item response theory observed-score equating (IRTOSE), kernel equating, and kernel IRTOSE. The real data applications suggest that the IRT-based frameworks tend to provide more stable and accurate results than kernel equating. Nonetheless, kernel equating can provide satisfactory results if a good model for the data can be found, and it is much faster than the IRT-based frameworks. Our general recommendation is to try all methods and examine how much the equated scores change, always ensuring that the assumptions are met and that a good model for the data can be found.
Keywords: Test equating; Item response theory; Kernel equating; Observed-score equating
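The recommendation to try all methods and examine how much the equated scores change can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function names, the half-point threshold, and the toy equated scores are all hypothetical and chosen only for illustration. The sketch assumes each method has already produced one equated score per raw score on a common scale.

```python
from itertools import combinations


def max_pairwise_gap(equated):
    """For each raw score, return the largest absolute difference between
    the equated scores produced by any two methods.

    `equated` maps a method name to a list of equated scores, one per raw
    score, with all lists in the same raw-score order.
    """
    methods = list(equated)
    if len(methods) < 2:
        raise ValueError("need at least two methods to compare")
    n = len(equated[methods[0]])
    if any(len(equated[m]) != n for m in methods):
        raise ValueError("all methods must cover the same raw scores")
    return [
        max(abs(equated[a][i] - equated[b][i]) for a, b in combinations(methods, 2))
        for i in range(n)
    ]


def flag_scores(equated, threshold=0.5):
    """Return the raw-score indices where the methods disagree by more than
    `threshold` score points (the default of 0.5 is an assumed cut-off)."""
    return [i for i, gap in enumerate(max_pairwise_gap(equated)) if gap > threshold]


# Made-up equated scores for raw scores 0..4, for illustration only.
toy = {
    "irtose": [0.0, 1.2, 2.1, 3.3, 4.0],
    "kernel": [0.1, 1.0, 2.9, 3.4, 4.0],
    "kernel_irtose": [0.0, 1.1, 2.2, 3.2, 4.0],
}

print(flag_scores(toy))
```

Raw scores flagged in this way would be candidates for a closer look at model fit and assumptions before choosing among the frameworks.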
The research in this article was funded by the Swedish Research Council grant 2014-578 and by the Fondazione Cassa di Risparmio di Padova e Rovigo.