Item Response Theory Equating

González, Jorge; Wiberg, Marie

doi:10.1007/978-3-319-51824-4_5

Part of the book series: Methodology of Educational Measurement and Assessment (MEMA)

Abstract

This chapter discusses different methods of item response theory (IRT) linking and equating and illustrates them using the SNSequate (González, J Stat Softw 59(7):1–30, 2014) and equateIRT (Battauz, J Stat Softw 68(7):1–22, 2015) packages. Other useful packages include ltm (Rizopoulos, J Stat Softw 17(5):1–25, 2006) and mirt (Chalmers, J Stat Softw 48(6):1–29, 2012), which allow the user to model response data using different IRT models. IRT objects obtained from ltm and mirt can also be read into equateIRT and kequate (Andersson et al., J Stat Softw 55(6):1–25, 2013) to perform IRT equating and linking.

Notes

1. The model shown in Example 1.2 corresponds to the fixed-effects version of IRT models. For more details on the difference between the fixed-effects version and the random-effects version of the model that is presented here, see San Martín et al. (2015).
2. When multiple test forms are to be linked, the argument coef needs a list of matrices containing the item parameter estimates corresponding to each test form.
3. In this case, an internal call to irt.link() is made.
4. Note that the item parameter estimates shown in Table 6.10 in Kolen and Brennan (2014) are already rescaled. We therefore set the equating coefficients to A = 1 and B = 0 so that the results are comparable with those in Kolen and Brennan (2014).
5. Figure 6.6 in Kolen and Brennan (2014) also shows the curve for frequency estimation equating. This curve can easily be obtained and added using the equate package, as illustrated in Chap. 3.
6. Because a Rasch model is used to fit the 0/1 data, item discrimination parameters are fixed to 1 and guessing parameters are fixed to 0.
7. Some columns in the output are omitted.
8. The mirt() function implements a general four-parameter model of which the 1PL, 2PL, and 3PL models are special cases. The discrimination, difficulty, and guessing parameters are denoted by a1, d, and g, respectively, whereas a fourth upper asymptote parameter is denoted by u. In the case of the Rasch model, a1 = u = 1 and g = 0.
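
Notes 4, 6, and 8 can be made concrete with a small Python sketch. It is only an illustration, not code from the chapter or from any of the R packages mentioned. The function names irf_4pl and rescale_item are hypothetical. The sketch assumes that the four-parameter model is written in slope-intercept form, g + (u - g) / (1 + exp(-(a1 * theta + d))), in which case d acts as an intercept and the usual difficulty would be b = -d / a1. It also assumes the usual linear scale transformation theta* = A * theta + B, under which a discrimination a becomes a / A and a difficulty b becomes A * b + B.

```python
import math


def irf_4pl(theta, a1=1.0, d=0.0, g=0.0, u=1.0):
    """Probability of a correct response under a four-parameter logistic model.

    Assumed slope-intercept form: g + (u - g) * logistic(a1 * theta + d).
    g is the lower asymptote (guessing), u the upper asymptote.
    """
    return g + (u - g) / (1.0 + math.exp(-(a1 * theta + d)))


def rescale_item(a, b, A=1.0, B=0.0):
    """Hypothetical helper: move a discrimination a and a difficulty b to the
    scale theta* = A * theta + B. With A = 1 and B = 0 nothing changes."""
    return a / A, A * b + B


# Rasch case from Notes 6 and 8: a1 = u = 1 and g = 0.
p_rasch = irf_4pl(theta=0.0, a1=1.0, d=0.0, g=0.0, u=1.0)

# 3PL case: a nonzero lower asymptote g raises the curve.
p_3pl = irf_4pl(theta=0.0, a1=1.0, d=0.0, g=0.2, u=1.0)

# Identity transformation from Note 4 versus a non-trivial one.
same = rescale_item(1.2, -0.5, A=1.0, B=0.0)
moved = rescale_item(2.0, 1.0, A=2.0, B=0.5)
```

Setting A = 1 and B = 0 leaves the item parameters untouched, which is why that choice is appropriate when the estimates are already on the target scale.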

References

Andersson, B., Bränberg, K., & Wiberg, M. (2013). Performing the kernel method of test equating with the package kequate. Journal of Statistical Software, 55(6), 1–25.
Andersson, B., & Wiberg, M. (2016). Item response theory observed-score kernel equating. Psychometrika. doi:10.1007/s11336-016-9528-7.
Baker, F., & Kim, S. (2004). Item response theory: Parameter estimation techniques. New York: Marcel Dekker.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
Battauz, M. (2015). equateIRT: An R package for IRT test equating. Journal of Statistical Software, 68(7), 1–22.
Bechger, T., & Maris, G. (2015). A statistical test for differential item pair functioning. Psychometrika, 80(2), 317–340.
Birnbaum, A. (1968). Some latent trait models and their use in inferring any examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading: Addison-Wesley.
Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29.
Chen, M. (2004). Skewed link models for categorical response data. In M. Genton (Ed.), Skew-elliptical distributions and their applications: A journey beyond normality (Vol. 1, pp. 131–152). Boca Raton: Chapman & Hall/CRC.
Chen, M.-H., Dey, D. K., & Shao, Q.-M. (1999). A new skewed link model for dichotomous quantal response data. Journal of the American Statistical Association, 94(448), 1172–1186.
Cook, L. L., & Eignor, D. (1991). IRT equating methods. Educational Measurement: Issues and Practice, 10(3), 37–45.
De Boeck, P., Bakker, M., Zwitser, R., Nivard, M., Hofman, A., Tuerlinckx, F., & Partchev, I. (2011). The estimation of item response models with the lmer function from the lme4 package in R. Journal of Statistical Software, 39(12), 1–28.
De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer.
DeMars, C. (2002). Incomplete data and item parameter estimates under JMLE and MML estimation. Applied Measurement in Education, 15(1), 15–31.
Estay, G. (2012). Characteristic curves scale transformation methods using asymmetric ICCs for IRT equating. Unpublished master’s thesis, Department of Statistics, Pontificia Universidad Católica de Chile.
Fischer, G., & Molenaar, I. (1995). Rasch models: Foundations and recent developments. New York: Springer.
González, J. (2014). SNSequate: Standard and nonstandard statistical models and methods for test equating. Journal of Statistical Software, 59(7), 1–30.
González, J., Wiberg, M., & von Davier, A. A. (2016). A note on the Poisson’s binomial distribution in item response theory. Applied Psychological Measurement, 40(4), 302–310.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144–149.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Dordrecht: Kluwer Nijhoff Publishing.
Kiefer, T., Robitzsch, A., & Wu, M. (2016). TAM: Test analysis modules. R Package Version 1.995-0.
Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355–381.
Kolen, M., & Brennan, R. (2014). Test equating, scaling, and linking: Methods and practices (3rd ed.). New York: Springer.
Lord, F. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum Associates.
Lord, F., & Novick, M. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.
Lord, F., & Wingersky, M. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8(4), 453–461.
Loyd, B. H., & Hoover, H. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17(3), 179–193.
Mair, P., & Hatzinger, R. (2007). Extended Rasch modeling: The eRm package for the application of IRT models in R. Journal of Statistical Software, 20, 1–20.
Marco, G. L. (1977). Item characteristic curve solutions to three intractable testing problems. Journal of Educational Measurement, 14(2), 139–160.
Mislevy, R. J., & Bock, R. D. (1990). BILOG 3: Item analysis and test scoring with binary logistic models. Mooresville: Scientific Software International.
Ogasawara, H. (2000). Asymptotic standard errors of IRT equating coefficients using moments. Economic Review (Otaru University of Commerce), 51(1), 1–23.
Partchev, I. (2014). Irtoys: Simple interface to the estimation and plotting of IRT models. R Package Version 0.1.7.
Rizopoulos, D. (2006). ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software, 17(5), 1–25.
Robitzsch, A. (2016). sirt: Supplementary item response theory models. R Package Version 1.12.2.
San Martín, E., González, J., & Tuerlinckx, F. (2015). On the unidentifiability of the fixed-effects 3PL model. Psychometrika, 80(2), 450–467.
Skaggs, G., & Lissitz, R. (1986). An exploration of the robustness of four test equating models. Applied Psychological Measurement, 10(3), 303.
Stocking, M., & Lord, F. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201–210.
Tuerlinckx, F., Rijmen, F., Molenberghs, G., Verbeke, G., Briggs, D., van den Noortgate, W., Meulders, M., & De Boeck, P. (2004). Estimation and software. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (Vol. 1, pp. 343–373). New York: Springer.
van der Linden, W. J. (Ed.) (2016). Handbook of item response theory. Three volume set. Boca Raton: Chapman and Hall/CRC.
van der Linden, W. J., & Barrett, M. (2016). Linking item response model parameters. Psychometrika, 81(3), 650–673.
von Davier, M., & von Davier, A. (2011). A general model for IRT scale linking and scale transformations. In A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (Vol. 1, pp. 225–242). New York: Springer.
Weeks, J. P. (2010). plink: An R package for linking mixed-format tests using IRT-based methods. Journal of Statistical Software, 35(12), 1–33.
Wiberg, M., van der Linden, W. J., & von Davier, A. A. (2014). Local observed-score kernel equating. Journal of Educational Measurement, 51, 57–74.
Wingersky, M. S., & Lord, F. M. (1984). An investigation of methods for reducing sampling error in certain IRT procedures. Applied Psychological Measurement, 8(3), 347–364.

Author information

Authors and Affiliations

Faculty of Mathematics, Pontificia Universidad Católica de Chile, Santiago, Chile
Jorge González
Department of Statistics, Umeå School of Business and Economics, Umeå University, Umeå, Sweden
Marie Wiberg

About this chapter

Cite this chapter

González, J., Wiberg, M. (2017). Item Response Theory Equating. In: Applying Test Equating Methods. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-319-51824-4_5

DOI: https://doi.org/10.1007/978-3-319-51824-4_5
Published: 07 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51822-0
Online ISBN: 978-3-319-51824-4
eBook Packages: Education, Education (R0)
