Nonparametric Methods in Cognitively Diagnostic Assessment

Chiu, Chia-Yi; Köhn, Hans-Friedrich

doi:10.1007/978-3-030-05584-4_5

Part of the book series: Methodology of Educational Measurement and Assessment (MEMA)

Abstract

Parametric estimation is the prevailing method for fitting diagnostic classification models. In the early days of cognitively diagnostic modeling, publicly available implementations of parametric estimation methods were scarce and often encountered technical difficulties in practice. In response to these difficulties, a number of researchers explored the potential of methods that do not rely on a parametric statistical model (nonparametric methods, for short) as alternatives to, for example, maximum likelihood estimation (MLE) for assigning examinees to proficiency classes. Of particular interest were clustering methods because efficient implementations were readily available in the major statistical software packages. This chapter reviews nonparametric concepts and methods as they have been developed and adopted for cognitive diagnosis: clustering methods and the Asymptotic Classification Theory of Cognitive Diagnosis (ACTCD), the Nonparametric Classification (NPC) method, and its generalization, the General NPC method. Also included in this review are two methods that employ the NPC method as a computational device: joint MLE for cognitive diagnosis and the nonparametric Q-matrix refinement and reconstruction method.

Notes

1. R is an open source statistical computing language available through the Comprehensive R Archive Network (CRAN) for free public use.
2. Recall that “classification” typically refers to supervised learning (that is, the groups are known a priori) and “clustering” to unsupervised learning, where the groups are to be discovered in the analysis. Thus, strictly speaking, neither “classification” nor “clustering” alone seems a fully accurate description of the use of HACA with CD, because each term fits only in part: (a) the number of realizable proficiency classes is known in advance and used to “cut” the HACA tree accordingly, so that assigning examinees to clusters might legitimately be addressed as “classification,” and (b) HACA produces unlabeled groups (i.e., groups not identified in terms of the underlying attribute vectors \(\boldsymbol{\alpha}\)) that require additional steps to determine the underlying \(\boldsymbol{\alpha}\), so that “clustering” might also appear as a fairly accurate characterization of the use of HACA in CD.
3. A \(K\)-dimensional vector \(\boldsymbol{\alpha}^{\ast} \neq \boldsymbol{\alpha}\) is said to be nested within the vector \(\boldsymbol{\alpha}\), written as \(\boldsymbol{\alpha} \succ \boldsymbol{\alpha}^{\ast}\), if \(\alpha^{\ast}_k \leq \alpha_k\) for all elements \(k\), and \(\alpha^{\ast}_k < \alpha_k\) for at least one \(k\).
4. Parameterization and notation refer to a general DCM as defined in Eqs. 5.1 and 5.2.
5. Notation: As the QMR method relies on the NPC method, which can be used for conjunctive as well as disjunctive models, only \(\eta_{ij}\) is used in this section, instead of \(\eta^{(c)}_{ij}\) and \(\eta^{(d)}_{ij}\), to denote the conjunctive as well as the disjunctive case.
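
The definitions in Notes 3 and 5 can be illustrated with a short Python sketch. It is not taken from the chapter: the function names (`is_nested`, `ideal_response`, `classify_by_hamming`) are hypothetical, and the ideal response is assumed to follow the usual conjunctive rule (an item is answered correctly only if all attributes it requires are mastered) or disjunctive rule (mastering at least one required attribute suffices). The last function assumes that classification by proximity to ideal response profiles uses the plain Hamming distance, with ties resolved in favor of the first attribute vector encountered; this is a simplification chosen for illustration, not the authors' implementation.

```python
from itertools import product


def is_nested(alpha_star, alpha):
    """Return True if alpha_star is nested within alpha (Note 3).

    Requires alpha_star[k] <= alpha[k] for all k, with strict
    inequality for at least one k.
    """
    if len(alpha_star) != len(alpha):
        raise ValueError("attribute vectors must have the same length K")
    all_leq = all(a_s <= a for a_s, a in zip(alpha_star, alpha))
    some_less = any(a_s < a for a_s, a in zip(alpha_star, alpha))
    return all_leq and some_less


def ideal_response(alpha, q_row, rule="conjunctive"):
    """Ideal (error-free) response eta_ij for one item.

    q_row is the item's row of the Q-matrix (1 = attribute required).
    The two rules are assumed standard definitions, not quoted from the source.
    """
    required = [a for a, q in zip(alpha, q_row) if q == 1]
    if rule == "conjunctive":   # every required attribute must be mastered
        return int(all(required))
    if rule == "disjunctive":   # one mastered required attribute suffices
        return int(any(required))
    raise ValueError("rule must be 'conjunctive' or 'disjunctive'")


def classify_by_hamming(y, Q, rule="conjunctive"):
    """Pick the attribute vector whose ideal response profile is closest to y.

    Enumerates all 2^K binary attribute vectors, so it is only practical
    for small K. Ties keep the first vector found (an assumption).
    """
    K = len(Q[0])
    best_alpha, best_dist = None, None
    for alpha in product((0, 1), repeat=K):
        eta = [ideal_response(alpha, q_row, rule) for q_row in Q]
        dist = sum(abs(y_j - e_j) for y_j, e_j in zip(y, eta))
        if best_dist is None or dist < best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha, best_dist


# Hypothetical three-item test measuring K = 2 attributes.
Q = [(1, 0), (0, 1), (1, 1)]
alpha_hat, distance = classify_by_hamming((1, 0, 0), Q)
```

With this hypothetical Q-matrix, an examinee who answers only the first item correctly is assigned the attribute vector `(1, 0)`, whose ideal response profile matches the observed responses exactly. Passing `rule="disjunctive"` reuses the same code path, which mirrors the single symbol \(\eta_{ij}\) covering both cases in Note 5.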

Author information

Authors and Affiliations

Department of Educational Psychology, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
Chia-Yi Chiu
Department of Psychology, University of Illinois at Urbana-Champaign, Champaign, IL, USA
Hans-Friedrich Köhn

Corresponding author

Correspondence to Chia-Yi Chiu.

Editor information

Editors and Affiliations

National Board of Medical Examiners (NBME), Philadelphia, PA, USA
Matthias von Davier
Teachers College, Columbia University, New York, NY, USA
Young-Sun Lee

About this chapter

Cite this chapter

Chiu, CY., Köhn, HF. (2019). Nonparametric Methods in Cognitively Diagnostic Assessment. In: von Davier, M., Lee, YS. (eds) Handbook of Diagnostic Classification Models. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-030-05584-4_5

DOI: https://doi.org/10.1007/978-3-030-05584-4_5
Published: 12 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05583-7
Online ISBN: 978-3-030-05584-4
eBook Packages: Education, Education (R0)
