An Overview of Computerized Adaptive Testing

  • David Magis
  • Duanli Yan
  • Alina A. von Davier
Chapter in the Use R! book series (USE R)

Abstract

In this chapter, we present a brief overview of computerized adaptive testing (CAT) theory, covering test design, test assembly, item banks, item selection, scoring and equating, content balancing, and item exposure and security. We also summarize the IRT-based item selection process, list the most commonly used item selection methods, and briefly outline tree-based adaptive testing.
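
To make the IRT-based item selection step concrete, the short sketch below illustrates the maximum Fisher information rule under the two-parameter logistic (2PL) model in base R. It is a minimal illustration only: the item bank, parameter values, and function names (item_info, next_item) are invented for this example and are not the chapter's code.

    ## Minimal sketch: maximum Fisher information (MFI) item selection
    ## under the 2PL model. Item parameters are invented for illustration.
    set.seed(1)
    bank <- data.frame(
      a = runif(20, 0.8, 2.0),   # discrimination parameters
      b = rnorm(20, 0, 1)        # difficulty parameters
    )

    ## 2PL item information: I_j(theta) = a_j^2 * P_j(theta) * (1 - P_j(theta))
    item_info <- function(theta, a, b) {
      p <- 1 / (1 + exp(-a * (theta - b)))
      a^2 * p * (1 - p)
    }

    ## Pick the not-yet-administered item with maximum information
    ## at the current ability estimate.
    next_item <- function(theta_hat, bank, administered = integer(0)) {
      info <- item_info(theta_hat, bank$a, bank$b)
      info[administered] <- -Inf   # exclude items already administered
      which.max(info)
    }

    next_item(theta_hat = 0, bank)   # index of the item selected at theta = 0

In an operational CAT, this selection step sits inside a loop that re-estimates ability after each response and is further constrained by content balancing and item exposure controls, as discussed in the chapter; R packages for CAT, such as catR, provide production implementations of this and other selection rules.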

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • David Magis, Department of Education, University of Liege, Liege, Belgium
  • Duanli Yan, Educational Testing Service, Princeton, USA
  • Alina A. von Davier, ACTNext by ACT, Iowa City, USA
