Item Selection and Ability Estimation in Adaptive Testing

van der Linden, Wim J.; Pashley, Peter J.

doi:10.1007/978-0-387-85461-8_1

Chapter
First Online: 31 December 2009

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

The last century saw tremendous progress in the refinement and use of standardized linear tests. The College Board administered its first exam in 1901, and the first Scholastic Assessment Test (SAT) was given in 1926. Since then, progressively more sophisticated standardized linear tests have been developed for a multitude of assessment purposes, such as college placement, professional licensure, higher-education admissions, and tracking educational standing or progress. Standardized linear tests are now administered around the world. For example, the Test of English as a Foreign Language (TOEFL) has been delivered in approximately 88 countries.

Author information

Authors and Affiliations

CTB/McGraw-Hill, 20 Ryan Ranch Road, Monterey, CA, 93940, USA
Wim J. van der Linden
Law School Admission Council, 40, Newtown, PA, 18940–0040, USA
Peter J. Pashley

Editor information

Editors and Affiliations

CTB/McGraw-Hill LLC, Ryan Ranch Road 20, Monterey, 93940, U.S.A.
Wim J. van der Linden
Fac. Behavioural Sciences, Twente University, Enschede, Netherlands
Cees A.W. Glas

About this chapter

Cite this chapter

van der Linden, W.J., Pashley, P.J. (2009). Item Selection and Ability Estimation in Adaptive Testing. In: van der Linden, W., Glas, C. (eds) Elements of Adaptive Testing. Statistics for Social and Behavioral Sciences. Springer, New York, NY. https://doi.org/10.1007/978-0-387-85461-8_1

DOI: https://doi.org/10.1007/978-0-387-85461-8_1
Published: 31 December 2009
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-85459-5
Online ISBN: 978-0-387-85461-8
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
