Item Selection in Multidimensional Computerized Adaptive Testing—Gaining Information from Different Angles

Abstract

Over the past thirty years, obtaining diagnostic information from examinees’ item responses has become an increasingly important feature of educational and psychological testing. This objective can be achieved by sequentially selecting multidimensional items to fit the class of latent traits being assessed, and Multidimensional Computerized Adaptive Testing (MCAT) is therefore one reasonable approach to such a task. This study conducts a rigorous investigation of the relationships among four promising item selection methods: D-optimality, the KL information index, continuous entropy, and mutual information. Some theoretical connections among the methods are demonstrated to show how information about the unknown vector θ can be gained from different perspectives. Two simulation studies were carried out to compare the performance of the four methods. The simulation results showed that mutual information not only improved the overall estimation accuracy but also yielded the smallest conditional mean squared error in most regions of θ. Finally, overlap rates were calculated to show empirically the similarities and differences among the four methods.
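To make the idea of sequential item selection concrete, the following is a minimal sketch of one of the four criteria, D-optimality, and is not the authors' implementation. It assumes a compensatory multidimensional two-parameter logistic model with logit a·θ + d, evaluates Fisher information at the current ability estimate, and adds a prior precision matrix so the determinant is defined early in the test. The function names (`m2pl_prob`, `item_information`, `select_d_optimal`) and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def m2pl_prob(theta, a, d):
    """Probability of a correct response, assuming logit = a . theta + d."""
    return 1.0 / (1.0 + np.exp(-(a @ theta + d)))

def item_information(theta, a, d):
    """Fisher information matrix of one item at theta: P(1 - P) a a^T."""
    p = m2pl_prob(theta, a, d)
    return p * (1.0 - p) * np.outer(a, a)

def select_d_optimal(theta_hat, A, D, administered, prior_precision):
    """Return the index of the unused item that maximizes the determinant
    of the accumulated information matrix at theta_hat.

    A: (n_items, n_dims) discrimination vectors (assumed known)
    D: (n_items,) intercepts (assumed known)
    administered: set of item indices already given
    prior_precision: (n_dims, n_dims) matrix added as an assumption so the
        total information is nonsingular from the first item onward
    """
    # Information accumulated from the items already administered.
    base = prior_precision.astype(float)
    for j in administered:
        base = base + item_information(theta_hat, A[j], D[j])

    best_item, best_logdet = None, -np.inf
    for j in range(len(D)):
        if j in administered:
            continue
        candidate = base + item_information(theta_hat, A[j], D[j])
        # The log-determinant is compared instead of the raw determinant
        # for numerical stability; the ranking of items is the same.
        sign, logdet = np.linalg.slogdet(candidate)
        if sign > 0 and logdet > best_logdet:
            best_item, best_logdet = j, logdet
    return best_item
```

In an adaptive test this selection step would be repeated after each response, with `theta_hat` re-estimated in between. The other three criteria differ in the quantity being maximized, not in this overall loop.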

Author information

Corresponding author

Correspondence to Chun Wang.

About this article

Cite this article

Wang, C., Chang, HH. Item Selection in Multidimensional Computerized Adaptive Testing—Gaining Information from Different Angles. Psychometrika 76, 363–384 (2011). https://doi.org/10.1007/s11336-011-9215-7

  • DOI: https://doi.org/10.1007/s11336-011-9215-7

Keywords

  • Kullback–Leibler information
  • Fisher information
  • mutual information
  • multidimensional computerized adaptive test
  • continuous entropy