IRT Test Equating in Complex Linkage Plans

Abstract

Linkage plans can be rather complex, including many forms, several links, and the connection of forms through different paths. This article studies item response theory equating methods for complex linkage plans when the common-item nonequivalent group design is used. An efficient way to average equating coefficients that link the same two forms through different paths is presented, and the asymptotic standard errors of indirect and average equating coefficients are derived. The methodology is illustrated using simulation studies and a real data example.




Acknowledgements

This work was supported by grants from the Italian Ministry for Education, University, and Research (MIUR).

The author thanks the editor, the associate editor, and three anonymous reviewers for their comments, which helped improve the quality of the paper. The author is grateful to Professor R. Bellio for his helpful suggestions.

Author information


Correspondence to Michela Battauz.

Appendices

Appendix A. Partial Derivatives of the Equating Coefficients with Respect to the Item Parameters

A.1 Indirect Coefficients

Irrespective of the method used to obtain direct equating coefficients, the partial derivatives of indirect equating coefficients used in Equation (5) are as follows:

$$\frac{\partial A_{0, \dots, l}}{\partial a_{gj}}= A_{0,\dots,g-1} \frac{\partial A_{g-1, g}}{\partial a_{gj}} A_{g, \dots, l} + A_{0,\dots,g} \frac{\partial A_{g, g+1}}{\partial a_{gj}} A_{g+1, \dots, l}. $$

Note that \(\frac{\partial A_{g-1, g}}{\partial a_{gj}}\) and \(\frac{\partial A_{g, g+1}}{\partial a_{gj}}\) are both different from zero only if item \(j\) of form \(g\) is present in both form \(g-1\) and form \(g+1\). Similarly,

$$\frac{\partial A_{0, \dots, l}}{\partial b_{gj}}= A_{0,\dots,g-1} \frac{\partial A_{g-1, g}}{\partial b_{gj}} A_{g, \dots, l} + A_{0,\dots,g} \frac{\partial A_{g, g+1}}{\partial b_{gj}} A_{g+1, \dots, l} , $$

while

$$\frac{\partial B_{0, \dots, l}}{\partial a_{gj}}= \sum_{h=1}^l \biggl( \frac{\partial B_{h-1, h}}{\partial a_{gj}} A_{h, \dots, l} + B_{h-1, h} \frac{\partial A_{h, \dots, l}}{\partial a_{gj}} \biggr) , $$

where \(\frac{\partial B_{h-1, h}}{\partial a_{gj}}\) is equal to zero if \(g \neq h-1\) and \(g \neq h\), and \(\frac{\partial A_{h, \dots, l}}{\partial a_{gj}}\) is equal to zero if \(g<h\). Finally,

$$\frac{\partial B_{0, \dots, l}}{\partial b_{gj}}= \sum_{h=1}^l \biggl( \frac{\partial B_{h-1, h}}{\partial b_{gj}} A_{h, \dots, l} + B_{h-1, h} \frac{\partial A_{h, \dots, l}}{\partial b_{gj}} \biggr). $$
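The structure of these derivatives can be checked numerically. The following is a minimal sketch, not the author's implementation: it assumes that each direct link maps the ability scale as \(\theta_h = A_{h-1,h}\theta_{h-1} + B_{h-1,h}\), that a path containing a single form has coefficient 1, and that the derivatives of the direct coefficients are already available. The function names and the toy links in `toy_links` are hypothetical and serve only as an illustration. The product rule is written for an arbitrary set of links depending on the parameter; the expressions above are the special case in which only the links \((g-1, g)\) and \((g, g+1)\) depend on \(a_{gj}\) or \(b_{gj}\).

```python
from typing import Sequence, Tuple


def chain_coefficients(A: Sequence[float], B: Sequence[float]) -> Tuple[float, float]:
    """Compose direct links; A[h-1], B[h-1] stand for A_{h-1,h}, B_{h-1,h}."""
    A_chain, B_chain = 1.0, 0.0
    for a, b in zip(A, B):
        # Applying one more link rescales the accumulated intercept as well.
        A_chain, B_chain = a * A_chain, a * B_chain + b
    return A_chain, B_chain


def path_product(A: Sequence[float], start: int, stop: int) -> float:
    """Product of A[start:stop]; an empty path gives 1 (assumed convention)."""
    out = 1.0
    for a in A[start:stop]:
        out *= a
    return out


def path_product_derivative(A, dA, start: int, stop: int) -> float:
    """Product rule applied to the product of A[start:stop]."""
    return sum(
        path_product(A, start, k) * dA[k] * path_product(A, k + 1, stop)
        for k in range(start, stop)
    )


def chain_derivatives(A, B, dA, dB) -> Tuple[float, float]:
    """Derivatives of the indirect coefficients with respect to one parameter.

    dA[h-1] and dB[h-1] are the derivatives of A_{h-1,h} and B_{h-1,h}
    with respect to that parameter (zero for links that do not involve it).
    """
    l = len(A)
    dA_chain = path_product_derivative(A, dA, 0, l)
    dB_chain = sum(
        dB[h - 1] * path_product(A, h, l)
        + B[h - 1] * path_product_derivative(A, dA, h, l)
        for h in range(1, l + 1)
    )
    return dA_chain, dB_chain


def toy_links(x: float):
    """Hypothetical three-link path in which only links 2 and 3 depend on x."""
    A = [1.1, 0.9 + 0.2 * x, 1.3 - 0.1 * x * x]
    B = [0.3, -0.2 + 0.5 * x, 0.1 * x]
    dA = [0.0, 0.2, -0.2 * x]
    dB = [0.0, 0.5, 0.1]
    return A, B, dA, dB


if __name__ == "__main__":
    x, eps = 0.4, 1e-6
    A, B, dA, dB = toy_links(x)
    analytic = chain_derivatives(A, B, dA, dB)
    up = chain_coefficients(*toy_links(x + eps)[:2])
    down = chain_coefficients(*toy_links(x - eps)[:2])
    numeric = tuple((u - d) / (2 * eps) for u, d in zip(up, down))
    print("analytic:", analytic)
    print("numeric: ", numeric)
```

Comparing the analytic values with central finite differences is a cheap way to catch indexing mistakes in the tail products \(A_{h, \dots, l}\), which are the most error-prone part of the sum for \(B_{0, \dots, l}\).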

A.2 Bisector Equating Coefficients

The partial derivatives of the bisector coefficients with respect to the direct and indirect equating coefficients of one of the paths that link two forms, used in Equation (7), are as follows:

Appendix B. Equating Coefficients for Common-Item Equating to a Calibrated Pool

It is assumed that \(a_{gj}=1\) for all \(g\) and \(j\). Then the equating coefficient for converting the parameters of Form 2 to the scale of Form 1 is given by

$$B_{21}=\frac{1}{n_{12}}\sum_{j\in I_{12}}b_{1j}-\frac{1}{n_{12}}\sum _{j\in I_{12}}b_{2j}, $$

where \(I_{12}\) is the set of items common to Forms 1 and 2. \(I_{13}\) and \(I_{23}\) are defined similarly. We denote by \(I_{p3}=I_{13}\cup I_{23}\) the set of items common to the pool and Form 3, by \(n_{p3}=n_{13}+n_{23}\) (assuming that \(I_{13}\cap I_{23}=\emptyset\)) the cardinality of \(I_{p3}\), and by \(b_{pj}\) the \(j\)-th item difficulty parameter of the pool. The equating coefficient for converting the parameters of Form 3 to the scale of the pool is given by

Cite this article

Battauz, M. IRT Test Equating in Complex Linkage Plans. Psychometrika 78, 464–480 (2013). https://doi.org/10.1007/s11336-012-9316-y


Key words

  • asymptotic standard errors
  • chain equating
  • double equating
  • equating coefficients
  • item response theory
  • multiple equating
  • weighted bisector