Abstract
In some popular test designs (including computerized adaptive testing and multistage testing), many item pairs are not administered to any test takers, which complicates dimensionality analyses. In this paper, a modified DETECT index is proposed for performing dimensionality analyses of response data from such designs. It is proven that, under certain conditions, the modified DETECT can successfully find the dimensionality-based partition of items. Furthermore, the modified DETECT index is decomposed into two parts, which can serve as indices of the reliability of results from the DETECT procedure when response data are judged to be multidimensional. A simulation study shows that the modified DETECT can successfully recover the dimensional structure of response data under reasonable specifications. Finally, the modified DETECT procedure is applied to real response data from two-stage tests to demonstrate how to utilize these indices and interpret their values in dimensionality analyses.
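The modified index builds on the classical DETECT statistic of Zhang and Stout (1999), which scores a candidate partition of items by summing conditional covariances of item pairs, with a positive sign for pairs in the same cluster and a negative sign for pairs in different clusters. The sketch below illustrates that classical index (not the modification proposed in the paper) for complete binary data, conditioning on the "rest score" that excludes both items of a pair; the function names and the choice of conditioning score are illustrative, and the usual ×100 scaling from the literature is applied.

```python
import numpy as np

def conditional_covariances(X):
    """Estimate E[Cov(X_i, X_j | rest score)] for every item pair.

    X: (n_examinees, n_items) binary response matrix with no missing data.
    The conditioning variable for pair (i, j) is the examinee's score on
    the remaining items. Returns a symmetric (n_items, n_items) matrix.
    """
    n, J = X.shape
    ccov = np.zeros((J, J))
    total = X.sum(axis=1)
    for i in range(J):
        for j in range(i + 1, J):
            rest = total - X[:, i] - X[:, j]  # score on the other J-2 items
            acc = 0.0
            for s in np.unique(rest):
                grp = rest == s
                m = int(grp.sum())
                if m < 2:
                    continue  # covariance undefined in a one-examinee stratum
                # weight each stratum's sample covariance by its proportion
                acc += (m / n) * np.cov(X[grp, i], X[grp, j], bias=True)[0, 1]
            ccov[i, j] = ccov[j, i] = acc
    return ccov

def detect_index(X, labels):
    """DETECT value (scaled by 100, as in the literature) for a partition.

    labels[k] is the cluster label of item k; same-cluster pairs enter
    with sign +1, different-cluster pairs with sign -1.
    """
    ccov = conditional_covariances(X)
    J = X.shape[1]
    val = 0.0
    for i in range(J):
        for j in range(i + 1, J):
            sign = 1.0 if labels[i] == labels[j] else -1.0
            val += sign * ccov[i, j]
    return 100.0 * 2.0 * val / (J * (J - 1))
```

The dimensionality-based partition is the one maximizing this index; in designs where many item pairs are never jointly administered, the pairwise conditional covariances above are unavailable for those pairs, which is the complication the modified DETECT addresses.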
Acknowledgements
The author would like to thank Brenda Tay-Lim and Demin Iris Qu for providing the LAMP field response data and other information about the LAMP design, and Ting Lu and Sarah Zhang for their comments and suggestions.
Cite this article
Zhang, J. A Procedure for Dimensionality Analyses of Response Data from Various Test Designs. Psychometrika 78, 37–58 (2013). https://doi.org/10.1007/s11336-012-9287-z