Supporting Diagnostic Inferences Using Significance Tests for Subtest Scores

  • Conference paper
Quantitative Psychology (IMPS 2016)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 196)

Abstract

Users of content-heterogeneous assessments based on unidimensional trait models often request information about examinee strengths and weaknesses in specific subareas. This is commonly called diagnostic information, and a standard way to provide it is to compute and report subscores. However, in the many cases where subscores fail to provide reliable information sufficiently independent of the total score, they cannot support claims about subarea strengths and weaknesses relative to total score expectations. Such claims are referred to here as diagnostic inferences. This paper introduces a method to support diagnostic inferences for assessment programs developed and maintained using item response theory (IRT). The method establishes null and alternative hypotheses for the number-correct score on subsets of items, or subtests. Statistical significance testing is then conducted to determine the strength of the statistical evidence in favor of a diagnostic inference. If the subtest score is modeled with a Poisson binomial distribution whose success probabilities are those expected under the IRT model, conditional on fixed item parameters and person scores, then it can be determined, for individuals or for groups, whether diagnostic inferences are supported and which ones. This paper presents the results of power computations showing the subtest lengths generally required to support diagnostic inferences under different conditions and effect sizes.
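The kind of test the abstract describes can be illustrated with a minimal sketch. It is not the paper's implementation, whose details are not visible in this preview. The sketch assumes a two-parameter logistic (2PL) item response function, one-sided tail probabilities as the test statistic, and made-up item parameters and a made-up person score; the function names are hypothetical. Under the null hypothesis that the examinee performs on the subtest as the IRT model expects, the number correct follows a Poisson binomial distribution, computed here by direct convolution.

```python
import math


def irt_prob(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def poisson_binomial_pmf(probs):
    """PMF of the number correct over independent items.

    probs[i] is the success probability of item i; the result has
    len(probs) + 1 entries, one for each possible number correct.
    """
    pmf = [1.0]  # with zero items, the score is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # item answered incorrectly
            nxt[k + 1] += mass * p       # item answered correctly
        pmf = nxt
    return pmf


def subtest_p_values(observed, probs):
    """One-sided tail probabilities for an observed subtest score.

    Returns (lower, upper): P(X <= observed) and P(X >= observed)
    under the null that the IRT-expected probabilities hold.
    """
    pmf = poisson_binomial_pmf(probs)
    lower = sum(pmf[: observed + 1])  # small value suggests a relative weakness
    upper = sum(pmf[observed:])       # small value suggests a relative strength
    return lower, upper


# Hypothetical example: six subtest items given as (a, b) pairs and a
# person score of 0.5; none of these values come from the paper.
theta = 0.5
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.3), (1.0, 0.8), (0.9, 1.2)]
probs = [irt_prob(theta, a, b) for a, b in items]

lower, upper = subtest_p_values(1, probs)
print(f"expected number correct: {sum(probs):.2f}")
print(f"P(X <= 1) = {lower:.4f}, P(X >= 1) = {upper:.4f}")
```

Comparing the relevant tail probability with a chosen significance level then indicates whether the evidence supports a diagnostic inference for that subtest. With short subtests the distribution is coarse, which is consistent with the abstract's emphasis on the subtest lengths required for adequate power.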

Notes

  1. Approximate numbers of students and schools are reported here to preserve anonymity of the data source.

Author information

Corresponding author

Correspondence to William Lorié.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Lorié, W. (2017). Supporting Diagnostic Inferences Using Significance Tests for Subtest Scores. In: van der Ark, L.A., Wiberg, M., Culpepper, S.A., Douglas, J.A., Wang, WC. (eds) Quantitative Psychology. IMPS 2016. Springer Proceedings in Mathematics & Statistics, vol 196. Springer, Cham. https://doi.org/10.1007/978-3-319-56294-0_6
