Assessing the performance of classical test theory item discrimination estimators in Monte Carlo simulations

Bazaldua, Diego A. Luna; Lee, Young-Sun; Keller, Bryan; Fellers, Lauren

doi:10.1007/s12564-017-9507-4

Published: 18 November 2017

Asia Pacific Education Review, Volume 18, pages 585–598 (2017)

Diego A. Luna Bazaldua¹ (ORCID: orcid.org/0000-0002-8535-7775), Young-Sun Lee², Bryan Keller³ & Lauren Fellers⁴

Abstract

Studies comparing classical test theory (CTT) item discrimination estimators on empirical and simulated data have reached mixed conclusions about which estimators are preferable. This study analyzes the performance of five CTT item discrimination estimators: the point-biserial correlation, the point-biserial correlation with the item excluded from the test total score, the biserial correlation, the phi coefficient computed after splitting the total score at the median, and the discrimination index. Data were generated from unidimensional logistic item response theory (IRT) models with one and two parameters. The factors considered were test length, the intervals for the item difficulty and item discrimination parameters, and the composition of the examinees as one or two groups with specific ability distribution parameters. Results indicate that the biserial coefficient was most highly correlated with the IRT discrimination parameter across simulation conditions. The degree of comparability among estimators and estimator invariance varied across conditions.
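The five estimators named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce their design. It assumes a single group with N(0, 1) ability, 20 items, 1,000 examinees, uniform ranges for the discrimination (a) and difficulty (b) parameters, and upper and lower 27% tails for the discrimination index. All of these values, and every function name, are illustrative choices rather than details taken from the article.

```python
import math
import random
from statistics import NormalDist, mean, median


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_2pl(n_persons, a, b, rng):
    """Draw 0/1 responses from a 2PL model (a 1PL is the case of equal a)."""
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # assumption: standard normal ability
        data.append([
            int(rng.random() < 1.0 / (1.0 + math.exp(-aj * (theta - bj))))
            for aj, bj in zip(a, b)
        ])
    return data


def discrimination_estimates(data, j, tail=0.27):
    """Return the five CTT discrimination estimates for item j."""
    item = [row[j] for row in data]
    total = [sum(row) for row in data]
    rest = [t - x for t, x in zip(total, item)]  # total score without item j

    r_pb = pearson(item, total)
    p = mean(item)  # proportion correct; must lie strictly between 0 and 1
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(p))

    med = median(total)
    high = [int(t > med) for t in total]  # median split of the total score

    # Discrimination index: proportion correct in the top tail minus the
    # proportion correct in the bottom tail (27% tails are an assumption).
    order = sorted(range(len(total)), key=total.__getitem__)
    k = max(1, round(tail * len(total)))
    upper = mean(item[i] for i in order[-k:])
    lower = mean(item[i] for i in order[:k])

    return {
        "point_biserial": r_pb,
        "corrected_point_biserial": pearson(item, rest),
        "biserial": r_pb * math.sqrt(p * (1 - p)) / ordinate,
        "phi_median_split": pearson(item, high),
        "discrimination_index": upper - lower,
    }


rng = random.Random(2017)  # fixed seed so the run is repeatable
n_items = 20
a_true = [rng.uniform(0.5, 2.0) for _ in range(n_items)]   # assumed range
b_true = [rng.uniform(-1.5, 1.5) for _ in range(n_items)]  # assumed range
responses = simulate_2pl(1000, a_true, b_true, rng)
estimates = [discrimination_estimates(responses, j) for j in range(n_items)]

# Correlate each estimator with the generating a parameter across items.
for name in estimates[0]:
    r = pearson(a_true, [e[name] for e in estimates])
    print(f"{name:>26s}: r with a = {r:.3f}")
```

A single replication like this only shows how the quantities are computed. A Monte Carlo study of the kind the abstract describes would repeat the draw many times under each condition (test length, parameter intervals, one or two examinee groups) and summarize the results across replications.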

Author information

Authors and Affiliations

National Autonomous University of Mexico, Facultad de Psicologia, Building B, Office B 220, Av. Universidad 3004, Col. Copilco Universidad, Mexico City, 04510, Mexico
Diego A. Luna Bazaldua
Teachers College, Columbia University, 456A, Grace Dodge Hall, 525 West 120th St, New York, NY, 10027, USA
Young-Sun Lee
Teachers College, Columbia University, 453E, Grace Dodge Hall, 525 West 120th St, New York, NY, 10027, USA
Bryan Keller
New York University, Kimball Hall, 246 Greene Street Floor 3, New York, NY, 10003, USA
Lauren Fellers

Corresponding author

Correspondence to Diego A. Luna Bazaldua.

About this article

Cite this article

Bazaldua, D.A.L., Lee, YS., Keller, B. et al. Assessing the performance of classical test theory item discrimination estimators in Monte Carlo simulations. Asia Pacific Educ. Rev. 18, 585–598 (2017). https://doi.org/10.1007/s12564-017-9507-4

Received: 04 May 2016
Revised: 04 November 2017
Accepted: 09 November 2017
Published: 18 November 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s12564-017-9507-4
