Abstract
Users of content-heterogeneous assessments based on unidimensional trait models often request information about examinee strengths and weaknesses in specific subareas. This is commonly called diagnostic information, and a standard way of providing it is by computing and reporting subscores. However, in the many cases where subscores fail to provide reliable information sufficiently independent of the total score, they cannot support claims about subarea strengths and weaknesses relative to total score expectations. These kinds of claims are referred to here as diagnostic inferences. This paper introduces a method to support diagnostic inferences for assessment programs developed and maintained using item response theory (IRT). The method establishes null and alternative hypotheses for the number correct on subsets of items or subtests. Statistical significance testing is then conducted to determine the strength of the statistical evidence in favor of a diagnostic inference. If the subtest score is modeled as following a Poisson binomial distribution, with success probabilities set to those expected under the IRT model conditional on fixed item parameters and person scores, then it can be determined, for individuals or for groups, whether diagnostic inferences are supported and which ones. This paper presents results of power computations showing the subtest lengths generally required for supporting diagnostic inferences under different conditions and effect sizes.
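The significance test described above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a two-parameter logistic (2PL) IRT model, and the item parameters, the person score `theta_hat`, and the function names are hypothetical values chosen for illustration. Under the null hypothesis, the number correct on the subtest follows the Poisson binomial distribution implied by the model probabilities, and the tail probabilities of the observed score give one-sided p-values for a relative weakness or strength.

```python
import math


def irt_2pl_prob(theta, a, b):
    """Probability of a correct response under a 2PL model (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def poisson_binomial_pmf(probs):
    """PMF of the number correct over independent items with unequal probabilities.

    Built by adding one item at a time: after each item, pmf[k] is the
    probability of exactly k correct responses among the items seen so far.
    """
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # item answered incorrectly
            nxt[k + 1] += mass * p       # item answered correctly
        pmf = nxt
    return pmf


def subtest_p_values(observed, probs):
    """One-sided p-values for an observed subtest number-correct score.

    Returns (lower, upper), where lower = P(X <= observed) is evidence of a
    relative weakness and upper = P(X >= observed) is evidence of a relative
    strength, both computed under the null distribution.
    """
    pmf = poisson_binomial_pmf(probs)
    lower = sum(pmf[: observed + 1])
    upper = sum(pmf[observed:])
    return lower, upper


# Hypothetical subtest: (discrimination a, difficulty b) for six items.
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.3), (1.0, 0.8), (0.9, 1.2)]
theta_hat = 0.5  # hypothetical person score, treated as fixed

probs = [irt_2pl_prob(theta_hat, a, b) for a, b in items]
lower_p, upper_p = subtest_p_values(1, probs)
print(f"P(X <= 1) = {lower_p:.4f}, P(X >= 1) = {upper_p:.4f}")
```

A small lower-tail probability would support an inference of weakness in the subarea relative to what the total score predicts, and a small upper-tail probability would support an inference of strength. The item-by-item recursion is exact and adequate for short subtests; faster algorithms exist for long ones.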
Notes
1. Approximate numbers of students and schools are reported here to preserve anonymity of the data source.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Lorié, W. (2017). Supporting Diagnostic Inferences Using Significance Tests for Subtest Scores. In: van der Ark, L.A., Wiberg, M., Culpepper, S.A., Douglas, J.A., Wang, WC. (eds) Quantitative Psychology. IMPS 2016. Springer Proceedings in Mathematics & Statistics, vol 196. Springer, Cham. https://doi.org/10.1007/978-3-319-56294-0_6
DOI: https://doi.org/10.1007/978-3-319-56294-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56293-3
Online ISBN: 978-3-319-56294-0
eBook Packages: Mathematics and Statistics (R0)