
A Procedure for Dimensionality Analyses of Response Data from Various Test Designs

Published in Psychometrika.

Abstract

In some popular test designs (including computerized adaptive testing and multistage testing), many item pairs are not administered to any test takers, which complicates dimensionality analyses. In this paper, a modified DETECT index is proposed to perform dimensionality analyses for response data from such designs. It is proved that, under certain conditions, the modified DETECT can successfully find the dimensionality-based partition of items. Furthermore, the modified DETECT index is decomposed into two parts, which can serve as indices of the reliability of results from the DETECT procedure when response data are judged to be multidimensional. A simulation study shows that the modified DETECT can successfully recover the dimensional structure of response data under reasonable specifications. Finally, the modified DETECT procedure is applied to real response data from two-stage tests to demonstrate how to utilize these indices and interpret their values in dimensionality analyses.
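To make the idea concrete, the following is a minimal sketch (not the author's implementation) of a DETECT-style index in which conditional covariances between item pairs are signed by whether the two items fall in the same cluster of a candidate partition, and — the modification motivated above — item pairs that were never administered together are simply skipped. The function name `modified_detect`, the use of proportion correct as the conditioning score, and the coarse quartile binning are all simplifying assumptions for illustration; operational DETECT procedures use more refined conditioning (e.g., rest scores) and bias corrections.

```python
import numpy as np

def modified_detect(responses, partition):
    """Illustrative DETECT-style index for incomplete designs.

    responses : (n_examinees, n_items) array of 0/1 scores, with
                np.nan marking items not administered to an examinee.
    partition : length n_items array of cluster labels for the items.

    Item pairs observed together by fewer than two examinees are
    skipped, so designs where many pairs are never co-administered
    (CAT, multistage) can still be analyzed.
    """
    n_persons, n_items = responses.shape
    # Conditioning score: proportion correct on the items each
    # examinee actually took (a crude stand-in for a rest score).
    total = np.nanmean(responses, axis=1)

    num, count = 0.0, 0
    for i in range(n_items):
        for j in range(i + 1, n_items):
            both = ~np.isnan(responses[:, i]) & ~np.isnan(responses[:, j])
            if both.sum() < 2:
                continue  # pair never administered together: skip
            xi, xj, s = responses[both, i], responses[both, j], total[both]
            # Coarse conditional covariance: pool within-quartile
            # covariances, weighted by bin proportions.
            edges = np.unique(np.quantile(s, [0.25, 0.5, 0.75]))
            bins = np.digitize(s, edges)
            ccov = 0.0
            for b in np.unique(bins):
                m = bins == b
                if m.sum() > 1:
                    ccov += m.mean() * np.cov(xi[m], xj[m])[0, 1]
            # Same-cluster pairs contribute with sign +1, cross-cluster
            # pairs with sign -1, as in the DETECT criterion.
            sign = 1.0 if partition[i] == partition[j] else -1.0
            num += sign * ccov
            count += 1
    return num / count if count else 0.0
```

Under this criterion, the dimensionality-based partition should maximize the index: within-dimension pairs tend to have positive conditional covariance and cross-dimension pairs negative, so the correct partition turns every contribution positive.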



Acknowledgements

The author would like to thank Brenda Tay-Lim and Demin Iris Qu for providing the LAMP field response data and other information about LAMP design, and Ting Lu and Sarah Zhang for their comments and suggestions.

Author information

Correspondence to Jinming Zhang.


About this article

Cite this article

Zhang, J. A Procedure for Dimensionality Analyses of Response Data from Various Test Designs. Psychometrika 78, 37–58 (2013). https://doi.org/10.1007/s11336-012-9287-z
