Abstract
In contrast to unidimensional item response models that postulate a single underlying proficiency, cognitive diagnosis models (CDMs) posit multiple, discrete skills or attributes, thus allowing CDMs to provide a finer-grained assessment of examinees’ test performance. A common component of CDMs for specifying the attributes required for each item is the Q-matrix. Although construction of the Q-matrix is typically performed by domain experts, it remains, to a large extent, a subjective process, and misspecifications in the Q-matrix, if left unchecked, can have important practical implications. To address this concern, this paper proposes a discrimination index that can be used with a wide class of CDMs subsumed by the generalized deterministic input, noisy “and” gate model to empirically validate the Q-matrix specifications by identifying and replacing misspecified entries in the Q-matrix. The rationale for using the index as the basis for the proposed validation method is provided in the form of mathematical proofs of several relevant lemmas and a theorem. The feasibility of the proposed method was examined using simulated data generated under various conditions. The proposed method is illustrated using fraction subtraction data.

Acknowledgments
This research was supported in part by National Science Foundation Grant DRL-0744486.
Appendix: Proofs of the Lemmas and the Theorem
Lemma 1
\(\bar{p}(\varvec{\alpha }_{k:K''})=\bar{p}(\varvec{\alpha }_{1:K''})\) for all \(k<K''\).
Proof
According to Definition 1,
Because (12) holds for all \(k<K''\), \(\bar{p}(\varvec{\alpha }_{K':K''})=\bar{p}(\varvec{\alpha }_{1:K'})=\bar{p}(\varvec{\alpha }_{1:K^{*}})=\bar{p}(\varvec{\alpha }_{1:K''})\). \(\square \)
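The invariance of the mean stated in Lemma 1 can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that the success probability of a collapsed latent group is the weight-averaged success probability of the full attribute patterns it contains; Definition 1 is not reproduced here, so this reading is an assumption. The function names, class weights, and probabilities are hypothetical illustration values.

```python
from itertools import product

def collapse(weights, probs, kept):
    """Collapse full attribute patterns onto the attributes listed in `kept`.

    weights, probs: dicts keyed by full 0/1 attribute tuples.
    Returns (group weights, group success probabilities), keyed by reduced patterns.
    """
    w_group, wp_group = {}, {}
    for alpha, w in weights.items():
        key = tuple(alpha[k] for k in kept)
        w_group[key] = w_group.get(key, 0.0) + w
        wp_group[key] = wp_group.get(key, 0.0) + w * probs[alpha]
    # Group probability = weighted average over the patterns in the group (assumed).
    p_group = {g: wp_group[g] / w_group[g] for g in w_group}
    return w_group, p_group

def weighted_mean(w_group, p_group):
    """Weighted mean success probability across the collapsed groups."""
    return sum(w_group[g] * p_group[g] for g in w_group)

# Hypothetical setup: K = 3 attributes; the item truly requires attributes 0 and 1.
patterns = list(product((0, 1), repeat=3))
raw = [1, 2, 3, 4, 2, 1, 3, 4]  # assumed unnormalised class weights
weights = {a: r / sum(raw) for a, r in zip(patterns, raw)}
p_required = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.4, (1, 1): 0.9}  # assumed
probs = {a: p_required[(a[0], a[1])] for a in patterns}

# Mean success probability under several candidate attribute specifications.
candidates = [(0,), (1,), (0, 1), (1, 2), (0, 1, 2)]
means = {kept: weighted_mean(*collapse(weights, probs, kept)) for kept in candidates}
```

Under these assumptions every entry of `means` agrees up to floating-point error, because collapsing only regroups the same weighted sum.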
Lemma 2
Suppose \(K'+1\le K^{*}\).
Proof
The LHS of (11) equals
The RHS of (13) is
Subtracting (14) from (15), and simplifying,
Therefore, (11) holds. \(\square \)
Theorem 1
\(\varsigma ^2_{K':K''}\le \varsigma ^2_{1:K^{*}}\).
Proof
Case 1: The provisional q-vector is strictly overspecified, that is, \(K'=K^{*}<K''\) resulting in \(\varvec{q}=\varvec{q}_{1:K''}\) and \(\varsigma ^2=\varsigma ^2_{1:K''}\).
By Lemma 1, the theorem for this case can be proved by showing that
Based on Definition 1,
and
The RHS of (16) can be expressed as
By definition of a correct q-vector,
Thus, (17) is equal to
which is the LHS of (16).
Case 2: When the provisional q-vector is both under- and overspecified, that is, \(K'<K^{*}<K''\), \(\varvec{q}=\varvec{q}_{K':K''}\) and \(\varsigma ^2=\varsigma ^2_{K':K''}\).
Case 2 will be proved by induction. By Lemma 1 and the result for Case 1, Case 2 of the theorem can be proved by showing that
Step 1: Show that (18) is true for \(K'=1\).
The LHS of (18) is
which is the RHS of (18).
Step 2: Assume that (18) is true for \(K'=k\), that is,
Step 3: Show that (18) is true for \(K'=k+1\), as in,
According to Lemma 2, the LHS of (19) is
According to the assumption in Step 2, (20) can further be expressed as
which is the RHS of (19). \(\square \)
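The inequality in Theorem 1 can also be illustrated numerically. The sketch below is not the authors' implementation. It assumes that \(\varsigma ^2\) for a candidate q-vector is the weighted variance of the group-level success probabilities, where each group probability is the weight-averaged probability of the full attribute patterns in that group. The function name `weighted_variance`, the class weights, and the success probabilities are hypothetical.

```python
from itertools import combinations, product

def weighted_variance(weights, probs, kept):
    """Weighted variance of success probabilities after collapsing onto `kept`.

    weights, probs: dicts keyed by full 0/1 attribute tuples; weights sum to 1.
    """
    w_group, wp_group = {}, {}
    for alpha, w in weights.items():
        key = tuple(alpha[k] for k in kept)
        w_group[key] = w_group.get(key, 0.0) + w
        wp_group[key] = wp_group.get(key, 0.0) + w * probs[alpha]
    p_bar = sum(wp_group.values())  # overall mean, the same for every `kept`
    return sum(w_group[g] * (wp_group[g] / w_group[g] - p_bar) ** 2
               for g in w_group)

# Hypothetical setup: K = 3 attributes; the correct q-vector requires 0 and 1.
K, correct = 3, (0, 1)
patterns = list(product((0, 1), repeat=K))
raw = [1, 2, 3, 4, 2, 1, 3, 4]  # assumed unnormalised class weights
weights = {a: r / sum(raw) for a, r in zip(patterns, raw)}
p_required = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.4, (1, 1): 0.9}  # assumed
probs = {a: p_required[tuple(a[k] for k in correct)] for a in patterns}

# Evaluate every nonempty attribute subset as a candidate q-vector.
subsets = [s for r in range(1, K + 1) for s in combinations(range(K), r)]
variances = {s: weighted_variance(weights, probs, s) for s in subsets}
target = variances[correct]
```

With these values no candidate exceeds `target`, and the overspecified candidate `(0, 1, 2)` attains it, which matches the equality used in Case 1. Under the stated assumption this is an instance of the law of total variance: grouping by fewer of the required attributes can only remove between-group variability.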
de la Torre, J., Chiu, CY. A General Method of Empirical Q-matrix Validation. Psychometrika 81, 253–273 (2016). https://doi.org/10.1007/s11336-015-9467-8
DOI: https://doi.org/10.1007/s11336-015-9467-8