Psychometrika, Volume 76, Issue 3, pp 363–384

Item Selection in Multidimensional Computerized Adaptive Testing—Gaining Information from Different Angles

  • Chun Wang
  • Hua-Hua Chang
Article

Abstract

Over the past thirty years, obtaining diagnostic information from examinees’ item responses has become an increasingly important feature of educational and psychological testing. This objective can be achieved by sequentially selecting multidimensional items to fit the class of latent traits being assessed, and Multidimensional Computerized Adaptive Testing (MCAT) is therefore one reasonable approach to such a task. This study conducts a rigorous investigation of the relationships among four promising item selection methods: D-optimality, the KL information index, continuous entropy, and mutual information. Some theoretical connections among the methods are demonstrated to show how information about the unknown vector θ can be gained from different perspectives. Two simulation studies were carried out to compare the performance of the four methods. The simulation results showed that mutual information not only improved the overall estimation accuracy but also yielded the smallest conditional mean squared error in most regions of θ. Finally, overlap rates were calculated to show empirically the similarities and differences among the four methods.
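
As a rough illustration of what sequential item selection can look like for one of the four methods, the sketch below implements a D-optimality-style rule: pick the unadministered item that maximizes the determinant of the accumulated information matrix at the current estimate of θ. Several things here are assumptions and not taken from the article: the compensatory multidimensional two-parameter logistic response model, the function names, and the `prior_precision` term (added so the matrix stays nonsingular early in a test). It is a minimal sketch, not the authors' implementation.

```python
import numpy as np

def m2pl_prob(theta, a, b):
    """P(correct) under an assumed compensatory multidimensional 2PL model."""
    return 1.0 / (1.0 + np.exp(-(a @ theta - b)))

def item_information(theta, a, b):
    """Fisher information matrix of one item at theta: p(1 - p) * a a^T."""
    p = m2pl_prob(theta, a, b)
    return p * (1.0 - p) * np.outer(a, a)

def select_d_optimal(theta_hat, A, B, administered, prior_precision):
    """Return the index of the unused item that maximizes the log-determinant
    of (prior precision + information of administered items + candidate).

    A: (n_items, n_dims) discrimination vectors (hypothetical item bank)
    B: (n_items,) intercept-style difficulty parameters
    """
    info = prior_precision.copy()
    for j in administered:
        info += item_information(theta_hat, A[j], B[j])

    best_item, best_logdet = None, -np.inf
    for j in range(len(B)):
        if j in administered:
            continue
        candidate = info + item_information(theta_hat, A[j], B[j])
        sign, logdet = np.linalg.slogdet(candidate)
        if sign > 0 and logdet > best_logdet:
            best_item, best_logdet = j, logdet
    return best_item

# Toy bank: item 0 loads on dimension 1, item 1 on dimension 2,
# item 2 loads weakly on both.
A = np.array([[1.5, 0.0], [0.0, 1.5], [0.2, 0.2]])
B = np.zeros(3)
next_item = select_d_optimal(np.zeros(2), A, B, administered=[0],
                             prior_precision=np.eye(2))
```

In this toy bank, item 0 has already supplied information about the first dimension, so the rule favors the item that is informative about the second dimension. The KL, entropy, and mutual information criteria named in the abstract would replace the log-determinant objective with a different quantity.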

Keywords

Kullback–Leibler information; Fisher information; mutual information; multidimensional computerized adaptive test; continuous entropy

Copyright information

© The Psychometric Society 2011

Authors and Affiliations

  1. Department of Psychology, University of Illinois at Urbana-Champaign, Champaign, USA
