Nonparametric Methods in Cognitively Diagnostic Assessment

Handbook of Diagnostic Classification Models

Abstract

Parametric estimation is the prevailing method for fitting diagnostic classification models. In the early days of cognitively diagnostic modeling, publicly available implementations of parametric estimation methods were scarce and often encountered technical difficulties in practice. In response to these difficulties, a number of researchers explored the potential of methods that do not rely on a parametric statistical model—nonparametric methods for short—as alternatives to, for example, MLE for assigning examinees to proficiency classes. Of particular interest were clustering methods because efficient implementations were readily available in the major statistical software packages. This article provides a review of nonparametric concepts and methods, as they have been developed and adopted for cognitive diagnosis: clustering methods and the Asymptotic Classification Theory of Cognitive Diagnosis (ACTCD), the Nonparametric Classification (NPC) method, and its generalization, the General NPC method. Also included in this review are two methods that employ the NPC method as a computational device: joint MLE for cognitive diagnosis and the nonparametric Q-matrix refinement and reconstruction method.

Notes

  1. R is an open source statistical computing language available through the Comprehensive R Archive Network (CRAN) for free public use.

  2. Recall that “classification” typically refers to supervised learning (that is, the groups are known a priori) and “clustering” to unsupervised learning, where the groups are to be discovered in the analysis. Thus, strictly speaking, neither classification nor clustering seems an accurate description of the use of HACA with CD because (a) the number of realizable proficiency classes is known in advance and used to “cut” the HACA tree accordingly, so that assigning examinees to clusters might legitimately be addressed as “classification,” and (b) HACA produces unlabeled groups (i.e., not identified in terms of the underlying attribute vectors α) that require additional steps to determine the underlying α, so that “clustering” might also appear as a fairly accurate characterization of the use of HACA in CD.

  3. A K-dimensional vector \(\alpha^{\ast} \neq \alpha\) is said to be nested within the vector α, written as \(\alpha \succ \alpha^{\ast}\), if \(\alpha^{\ast}_k \leq \alpha_k\) for all elements k, and \(\alpha^{\ast}_k < \alpha_k\) for at least one k.

  4. Parameterization and notation refer to a general DCM as defined in Eqs. 5.1 and 5.2.

  5. Notation: As the QMR method relies on the NPC method, which can be used for conjunctive as well as disjunctive models, only \(\eta_{ij}\) is used in this section, instead of \(\eta^{(c)}_{ij}\) and \(\eta^{(d)}_{ij}\), to denote both the conjunctive and the disjunctive case.
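
The nesting relation defined in Note 3 can be checked mechanically. The following is a minimal Python sketch; the function name `is_nested` and the encoding of attribute vectors as lists of 0/1 entries are assumptions made for illustration and do not come from the chapter.

```python
from typing import Sequence


def is_nested(alpha_star: Sequence[int], alpha: Sequence[int]) -> bool:
    """Return True if alpha_star is nested within alpha (Note 3).

    Hypothetical helper: alpha_star is nested within alpha when every
    element of alpha_star is <= the matching element of alpha and at
    least one element is strictly smaller.
    """
    if len(alpha_star) != len(alpha):
        raise ValueError("both vectors must have the same length K")
    pairs = list(zip(alpha_star, alpha))
    no_greater = all(a_s <= a for a_s, a in pairs)
    strictly_less = any(a_s < a for a_s, a in pairs)
    return no_greater and strictly_less


# (1, 0, 0) masters a strict subset of the attributes in (1, 1, 0).
print(is_nested([1, 0, 0], [1, 1, 0]))  # True
# (1, 0, 1) has attribute 3, which (1, 1, 0) lacks, so it is not nested.
print(is_nested([1, 0, 1], [1, 1, 0]))  # False
# A vector is never nested within itself, because the two must differ.
print(is_nested([1, 1, 0], [1, 1, 0]))  # False
```

Because both conditions are required, the relation is a strict partial order: two attribute vectors of which neither dominates the other are simply not comparable.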

Author information

Corresponding author

Correspondence to Chia-Yi Chiu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Chiu, CY., Köhn, HF. (2019). Nonparametric Methods in Cognitively Diagnostic Assessment. In: von Davier, M., Lee, YS. (eds) Handbook of Diagnostic Classification Models. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-030-05584-4_5

  • DOI: https://doi.org/10.1007/978-3-030-05584-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05583-7

  • Online ISBN: 978-3-030-05584-4

  • eBook Packages: Education, Education (R0)
