Applying the General Diagnostic Model to Proficiency Data from a National Skills Survey

  • Xueli Xu
  • Matthias von Davier
Chapter
Part of the Methodology of Educational Measurement and Assessment book series (MEMA)

Abstract

Large-scale educational surveys (including NAEP, TIMSS, PISA) use item-response-theory (IRT) calibration together with a latent regression model to make inferences about subgroup ability distributions, such as subgroup means, percentiles, and standard deviations. It has long been recognized that grouping variables not included in the latent regression model can produce secondary bias in estimates of group differences (Mislevy RJ, Psychometrika 56:177–196, 1991). To accommodate the ever-increasing number of background variables collected and required for reporting purposes, a principal component analysis based on the background variables (von Davier M, Sinharay S, Oranje A, Beaton AE, The statistical procedures used in national assessment of educational progress: recent developments and future directions. In: Rao CR, Sinharay S (eds) Handbook of statistics: vol. 26. Psychometrics. Elsevier B.V., Amsterdam, pp 1039–1055, 2007; Moran R, Dresher A, Results from NAEP marginal estimation research on multivariate scales. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, 2007; Oranje A, Li D, On the role of background variables in large scale survey assessments. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY, 2008) is used to keep the number of predictors in the latent regression models within a reasonable range. However, even this approach often results in the inclusion of several hundred variables, and it is unknown whether the principal component approach or similar approaches (such as latent-class approaches) are able to generate consistent estimates for individual subgroups (e.g., Wetzel E, Xu X, von Davier M, Educ Psychol Meas 75(5):1–25, 2014). The primary goal of the current study is to provide an exemplary application of diagnostic models for large-scale-assessment data.
Specifically, a latent-class structure is used for covariates while continuing to use IRT models for item responses in the analytic model. Previous applications focused on adult literacy data (von Davier M, Yamamoto K, A class of models for cognitive diagnosis. Paper presented at the 4th Spearman invitational conference, Philadelphia, PA, 2004), as well as large-scale English-language testing programs (von Davier M, A general diagnostic model applied to language testing data (Research report no. RR-05-16). Educational Testing Service, Princeton, 2005; von Davier M, The mixture general diagnostic model. In: Hancock GR, Samuelsen KM (eds) Advances in latent variable mixture models. Information Age Publishing, Charlotte, pp 255–274, 2008), while the current application uses diagnostic modeling approaches on data from NAEP.

References

  1. Dresher, A. (2006, April). Results from NAEP marginal estimation research. Presented at the annual meeting of the National Council on Measurement in Education, San Francisco, CA.
  2. Haberman, S., von Davier, M., & Lee, Y.-H. (2009). Comparison of multidimensional item response models: multivariate normal ability distributions versus multivariate polytomous ability distributions (Research Report No. RR-08-45). Princeton, NJ: Educational Testing Service.
  3. Johnson, E. (1992). The design of the national assessment of educational progress. Journal of Educational Measurement, 29, 95–110.
  4. Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177–196.
  5. Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133–162.
  6. Moran, R., & Dresher, A. (2007). Results from NAEP marginal estimation research on multivariate scales. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.
  7. Oranje, A., & Li, D. (2008, April). On the role of background variables in large scale survey assessments. Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY.
  8. Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20, 345–354.
  9. Thomas, N. (2002). The role of secondary covariates when estimating latent trait population distributions. Psychometrika, 67, 33–48. https://doi.org/10.1007/BF02294708
  10. von Davier, M. (2005). A general diagnostic model applied to language testing data (Research Report No. RR-05-16). Princeton, NJ: Educational Testing Service.
  11. von Davier, M. (2008). The mixture general diagnostic model. In G. R. Hancock & K. M. Samuelsen (Eds.), Advances in latent variable mixture models (pp. 255–274). Charlotte, NC: Information Age Publishing.
  12. von Davier, M. (2009, March). Some notes on the reinvention of latent structure models as diagnostic classification models. Measurement: Interdisciplinary Research and Perspectives, 7(1), 67–74.
  13. von Davier, M., Gonzalez, E., & Mislevy, R. J. (2009). What are plausible values and why are they useful? IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 2, 9–36.
  14. von Davier, M., & Rost, J. (2007). Mixture distribution item response models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26. Psychometrics (pp. 643–661). Amsterdam, the Netherlands: Elsevier B.V.
  15. von Davier, M., & Rost, J. (2016). Logistic mixture-distribution response models. In W. J. van der Linden (Ed.), Handbook of item response theory. Vol. 1: Models (pp. 393–406). Boca Raton, FL: Chapman and Hall/CRC.
  16. von Davier, M., & Sinharay, S. (2014). Analytics in international large-scale assessments: Item response theory and population models. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis (pp. 155–174). Boca Raton, FL: CRC Press.
  17. von Davier, M., Sinharay, S., Oranje, A., & Beaton, A. E. (2007). The statistical procedures used in national assessment of educational progress: Recent developments and future directions. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26. Psychometrics (pp. 1039–1055). Amsterdam, the Netherlands: Elsevier B.V.
  18. von Davier, M., & Yamamoto, K. (2004, October). A class of models for cognitive diagnosis. Paper presented at the 4th Spearman invitational conference, Philadelphia, PA.
  19. Wetzel, E., Xu, X., & von Davier, M. (2014). An alternative way to model population ability distributions in large-scale educational surveys. Educational and Psychological Measurement, 75(5), 1–25.
  20. Xu, X., & von Davier, M. (2008a). Fitting the structured general diagnostic model to NAEP data (Research Report No. RR-08-27). Princeton, NJ: Educational Testing Service.
  21. Xu, X., & von Davier, M. (2008b). Comparing multiple-group multinomial loglinear models for multidimensional skill distributions in the general diagnostic model (Research Report No. RR-08-35). Princeton, NJ: Educational Testing Service.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Educational Testing Service, Princeton, USA
  2. National Board of Medical Examiners (NBME), Philadelphia, USA