Abstract
Parametric estimation is the prevailing method for fitting diagnostic classification models. In the early days of cognitively diagnostic modeling, publicly available implementations of parametric estimation methods were scarce and often encountered technical difficulties in practice. In response to these difficulties, a number of researchers explored the potential of methods that do not rely on a parametric statistical model (nonparametric methods, for short) as alternatives to, for example, maximum likelihood estimation (MLE) for assigning examinees to proficiency classes. Of particular interest were clustering methods because efficient implementations were readily available in the major statistical software packages. This article reviews nonparametric concepts and methods as they have been developed and adopted for cognitive diagnosis: clustering methods and the Asymptotic Classification Theory of Cognitive Diagnosis (ACTCD), the Nonparametric Classification (NPC) method, and its generalization, the General NPC method. Also included in this review are two methods that employ the NPC method as a computational device: joint MLE for cognitive diagnosis and the nonparametric Q-matrix refinement and reconstruction method.
Notes
- 1.
R is an open source statistical computing language available through the Comprehensive R Archive Network (CRAN) for free public use.
- 2.
Recall that “classification” typically refers to supervised learning (that is, the groups are known a priori) and “clustering” to unsupervised learning, where the groups are to be discovered in the analysis. Thus, strictly speaking, neither classification nor clustering seems an accurate description of the use of HACA with CD: (a) the number of realizable proficiency classes is known in advance and used to “cut” the HACA tree accordingly, so that assigning examinees to clusters might legitimately be called “classification”; yet (b) HACA produces unlabeled groups (i.e., not identified in terms of the underlying attribute vectors α) that require additional steps to determine the underlying α, so that “clustering” might also appear to be a fairly accurate characterization of the use of HACA in CD.
- 3.
A K-dimensional vector \(\alpha ^{\ast } \neq \alpha \) is said to be nested within the vector \(\alpha \), written as \(\alpha \succ \alpha ^{\ast }\), if \(\alpha ^{\ast }_k \leq \alpha _k\) for all elements k, and \(\alpha ^{\ast }_k < \alpha _k\) for at least one k.
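The nesting relation defined in this note can be checked element by element. The following is a minimal sketch in Python; the function name `is_nested` and the example vectors are illustrative assumptions and do not come from the chapter or from any of the R packages it cites.

```python
from typing import Sequence


def is_nested(alpha_star: Sequence[int], alpha: Sequence[int]) -> bool:
    """Return True if alpha_star is nested within alpha.

    Hypothetical helper: implements only the element-wise definition
    given in the note (<= everywhere, < at least once).
    """
    if len(alpha_star) != len(alpha):
        raise ValueError("attribute vectors must have the same length K")
    pairs = list(zip(alpha_star, alpha))
    # Every element of alpha_star must not exceed the matching element of alpha.
    all_leq = all(a_star <= a for a_star, a in pairs)
    # At least one element must be strictly smaller; this also excludes equality.
    some_less = any(a_star < a for a_star, a in pairs)
    return all_leq and some_less


print(is_nested([1, 0, 0], [1, 1, 0]))  # True: differs only by one missing attribute
print(is_nested([1, 0, 1], [1, 1, 0]))  # False: third element exceeds that of alpha
print(is_nested([1, 1, 0], [1, 1, 0]))  # False: identical vectors are not nested
```

Because the strict inequality must hold for at least one element, a vector is never nested within itself, which matches the requirement that the two vectors differ.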
- 4.
- 5.
Notation: Because the QMR method relies on the NPC method, which can be used for conjunctive as well as disjunctive models, only \(\eta _{ij}\) is used in this section, instead of \(\eta ^{(c)}_{ij}\) and \(\eta ^{(d)}_{ij}\), to denote the conjunctive as well as the disjunctive case.
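To make the distinction between the two cases concrete, the sketch below computes both kinds of ideal item response for one examinee and one item. It assumes the standard definitions used with the DINA and DINO models, which this note does not spell out: the conjunctive response equals 1 only if all attributes required by the item are mastered, and the disjunctive response equals 1 if at least one required attribute is mastered. The function names and example vectors are hypothetical.

```python
from typing import Sequence


def eta_conjunctive(alpha: Sequence[int], q: Sequence[int]) -> int:
    """Conjunctive ideal response (assumed DINA-type definition)."""
    # 1 only if every attribute the item requires (q_k = 1) is mastered.
    return int(all(a == 1 for a, q_k in zip(alpha, q) if q_k == 1))


def eta_disjunctive(alpha: Sequence[int], q: Sequence[int]) -> int:
    """Disjunctive ideal response (assumed DINO-type definition)."""
    # 1 if at least one attribute the item requires is mastered.
    return int(any(a == 1 for a, q_k in zip(alpha, q) if q_k == 1))


alpha = [1, 0, 1]   # examinee masters attributes 1 and 3
q_item = [1, 1, 0]  # item requires attributes 1 and 2

print(eta_conjunctive(alpha, q_item))  # 0: attribute 2 is required but not mastered
print(eta_disjunctive(alpha, q_item))  # 1: attribute 1 is required and mastered
```

Under these assumptions, a single symbol can cover both cases because a method that works on ideal responses only needs to know which of the two functions produced them.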
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Chiu, CY., Köhn, HF. (2019). Nonparametric Methods in Cognitively Diagnostic Assessment. In: von Davier, M., Lee, YS. (eds) Handbook of Diagnostic Classification Models. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-030-05584-4_5
DOI: https://doi.org/10.1007/978-3-030-05584-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05583-7
Online ISBN: 978-3-030-05584-4
eBook Packages: Education, Education (R0)