
A General Method of Empirical Q-matrix Validation


In contrast to unidimensional item response models, which postulate a single underlying proficiency, cognitive diagnosis models (CDMs) posit multiple discrete skills or attributes, allowing them to provide a finer-grained assessment of examinees’ test performance. The Q-matrix is a common component of CDMs that specifies the attributes required for each item. Although the Q-matrix is typically constructed by domain experts, its construction remains, to a large extent, a subjective process, and misspecifications in the Q-matrix, if left unchecked, can have important practical implications. To address this concern, this paper proposes a discrimination index that can be used with a wide class of CDMs subsumed by the generalized deterministic input, noisy “and” gate (G-DINA) model to empirically validate the Q-matrix by identifying and replacing misspecified entries. The rationale for using the index as the basis of the proposed validation method is provided in the form of mathematical proofs of several relevant lemmas and a theorem. The feasibility of the proposed method was examined using simulated data generated under various conditions, and the method is illustrated using fraction subtraction data.





This research was supported in part by National Science Foundation Grant DRL-0744486.

Author information


Corresponding author

Correspondence to Jimmy de la Torre.

Appendix: Proofs of the Lemmas and the Theorem

Lemma 1

\(\bar{p}(\varvec{\alpha }_{k:K''})=\bar{p}(\varvec{\alpha }_{1:K''})\) for all \(k<K''\).


According to Definition 1,

$$\begin{aligned} \bar{p}(\varvec{\alpha }_{k:K''})= & {} \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{k:K''})p(\varvec{\alpha }_{k:K''})\nonumber \\= & {} \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{k:K''})\frac{\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{k-1}=0}^{1} w(\varvec{\alpha }_{1:K''})p(\varvec{\alpha }_{1:K''})}{w(\varvec{\alpha }_{k:K''})}\nonumber \\= & {} \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{k-1}=0}^{1}w(\varvec{\alpha }_{1:K''})p(\varvec{\alpha }_{1:K''})\nonumber \\= & {} \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{1:K''})p(\varvec{\alpha }_{1:K''})\nonumber \\= & {} \bar{p}(\varvec{\alpha }_{1:K''}) \end{aligned}\tag{12}$$

Because (12) holds for all \(k<K''\), \(\bar{p}(\varvec{\alpha }_{K':K''})=\bar{p}(\varvec{\alpha }_{1:K'})=\bar{p}(\varvec{\alpha }_{1:K^{*}})=\bar{p}(\varvec{\alpha }_{1:K''})\). \(\square \)
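Lemma 1 can be checked numerically. The sketch below is illustrative only and is not the authors' code: it assumes, as the proof above suggests, that Definition 1 takes \(w\) of a reduced pattern to be the marginal weight and \(p\) of a reduced pattern to be the weight-averaged success probability. The helper names `collapse` and `weighted_mean`, the random weights, and the 0-based attribute positions are all hypothetical choices made for the example.

```python
import itertools
import random


def collapse(weights, probs, keep):
    """Collapse full attribute patterns onto the positions in `keep`.

    Returns (w, p): the marginal weight and the weight-averaged success
    probability of every reduced pattern (assumed form of Definition 1).
    """
    keep = list(keep)
    w, wp = {}, {}
    for alpha, wt in weights.items():
        key = tuple(alpha[i] for i in keep)
        w[key] = w.get(key, 0.0) + wt
        wp[key] = wp.get(key, 0.0) + wt * probs[alpha]
    p = {key: wp[key] / w[key] for key in w}
    return w, p


def weighted_mean(w, p):
    """p-bar: the weighted mean of the success probabilities."""
    return sum(w[key] * p[key] for key in w)


random.seed(0)
K = 4  # plays the role of K'' (number of attributes in the full pattern)
patterns = list(itertools.product((0, 1), repeat=K))
raw = [random.random() for _ in patterns]
weights = {a: r / sum(raw) for a, r in zip(patterns, raw)}  # arbitrary positive weights
probs = {a: random.random() for a in patterns}  # arbitrary success probabilities

full_mean = weighted_mean(*collapse(weights, probs, range(K)))
reduced_means = [
    weighted_mean(*collapse(weights, probs, range(k, K))) for k in range(K)
]

# Dropping the leading attributes never changes the weighted mean.
assert all(abs(m - full_mean) < 1e-12 for m in reduced_means)
```

The check passes for any positive weights, which is what the telescoping sums in the proof imply.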

Lemma 2

Suppose \(K'+1\le K^{*}\).

$$\begin{aligned} \sum _{\alpha _{K'+1}=0}^1\cdots \sum _{\alpha _{K''}=0}^1 w(\varvec{\alpha }_{(K'+1):K''})p^{2}(\varvec{\alpha }_{(K'+1):K''}) \le \sum _{\alpha _{K'}=0}^1\cdots \sum _{\alpha _{K''}=0}^1 w(\varvec{\alpha }_{K':K''})p^{2}(\varvec{\alpha }_{K':K''}). \end{aligned}\tag{13}$$


The LHS of (11) equals

$$\begin{aligned}&\sum _{\alpha _{K'+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}\sum _{\alpha _{K'}=0}^{1}w(\varvec{\alpha }_{K':K''})\Bigg [ \frac{\sum _{\alpha _{K'}=0}^{1}w(\varvec{\alpha }_{K':K''})p(\varvec{\alpha }_{K':K''})}{\sum _{\alpha _{K'}=0}^{1}w(\varvec{\alpha }_{K':K''})}\Bigg ]^{2}\nonumber \\&\quad =\sum _{\alpha _{K'+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} \frac{\Bigg [\sum _{\alpha _{K'}=0}^{1}w(\varvec{\alpha }_{K':K''})p(\varvec{\alpha }_{K':K''})\Bigg ]^{2}}{\sum _{\alpha _{K'}=0}^{1}w(\varvec{\alpha }_{K':K''})}\nonumber \\&\quad =\sum _{\alpha _{K'+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} \left[ w(0,\varvec{\alpha }_{(K'+1):K''})+w(1,\varvec{\alpha }_{(K'+1):K''})\right] ^{-1}\nonumber \\&\qquad \qquad \quad \Big [w^{2}(0,\varvec{\alpha }_{(K'+1):K''})p^{2}(0,\varvec{\alpha }_{(K'+1):K''})+ w^{2}(1,\varvec{\alpha }_{(K'+1):K''})p^{2}(1,\varvec{\alpha }_{(K'+1):K''})\nonumber \\&\qquad \qquad \qquad +2w(0,\varvec{\alpha }_{(K'+1):K''})w(1,\varvec{\alpha }_{(K'+1):K''}) p(0,\varvec{\alpha }_{(K'+1):K''})p(1,\varvec{\alpha }_{(K'+1):K''}) \Big ]. \end{aligned}\tag{14}$$

The RHS of (13) is

$$\begin{aligned} \sum _{\alpha _{K'+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}\Big [ w(0,\varvec{\alpha }_{(K'+1):K''})p^{2}(0,\varvec{\alpha }_{(K'+1):K''})+ w(1,\varvec{\alpha }_{(K'+1):K''})p^{2}(1,\varvec{\alpha }_{(K'+1):K''}) \Big ]. \end{aligned}\tag{15}$$

Subtracting (14) from (15), and simplifying,

$$\begin{aligned}&\sum _{\alpha _{K'+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}\Bigg [ \frac{w(0,\varvec{\alpha }_{(K'+1):K''})w(1,\varvec{\alpha }_{(K'+1):K''})}{w(0,\varvec{\alpha }_{(K'+1):K''})+w(1,\varvec{\alpha }_{(K'+1):K''})}\Bigg ] \Big [p(0,\varvec{\alpha }_{(K'+1):K''})-p(1,\varvec{\alpha }_{(K'+1):K''})\Big ]^{2}\\&\quad \ge 0. \end{aligned}$$

Therefore, (11) holds. \(\square \)
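The inequality in Lemma 2 and the closed-form difference obtained in its proof can also be verified numerically. The following is a minimal sketch under the same assumed reading of Definition 1 (marginal weights and weight-averaged probabilities); `collapse`, `second_moment`, and the random inputs are hypothetical names and data introduced only for illustration, and attribute positions are 0-based.

```python
import itertools
import random


def collapse(weights, probs, keep):
    """Marginal weight and weight-averaged probability per reduced pattern."""
    keep = list(keep)
    w, wp = {}, {}
    for alpha, wt in weights.items():
        key = tuple(alpha[i] for i in keep)
        w[key] = w.get(key, 0.0) + wt
        wp[key] = wp.get(key, 0.0) + wt * probs[alpha]
    return w, {key: wp[key] / w[key] for key in w}


def second_moment(w, p):
    """Sum of w * p^2 over the reduced patterns."""
    return sum(w[key] * p[key] ** 2 for key in w)


random.seed(1)
K = 4  # plays the role of K''
patterns = list(itertools.product((0, 1), repeat=K))
raw = [random.random() for _ in patterns]
weights = {a: r / sum(raw) for a, r in zip(patterns, raw)}
probs = {a: random.random() for a in patterns}

gaps, closed_forms = [], []
for k in range(K - 1):  # position k plays the role of attribute K'
    w_big, p_big = collapse(weights, probs, range(k, K))          # keeps attribute K'
    w_small, p_small = collapse(weights, probs, range(k + 1, K))  # drops attribute K'
    gaps.append(second_moment(w_big, p_big) - second_moment(w_small, p_small))
    # Closed form from the proof: sum of w0*w1/(w0+w1) * (p0 - p1)^2.
    closed_forms.append(sum(
        w_big[(0,) + b] * w_big[(1,) + b] / (w_big[(0,) + b] + w_big[(1,) + b])
        * (p_big[(0,) + b] - p_big[(1,) + b]) ** 2
        for b in w_small
    ))

assert all(g >= -1e-12 for g in gaps)                                 # the inequality
assert all(abs(g - c) < 1e-12 for g, c in zip(gaps, closed_forms))    # the difference
```

The gap is zero only when the dropped attribute does not change the success probability within any remaining pattern, which matches the squared-difference form of the final expression in the proof.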

Theorem 1

\(\varsigma ^2_{K':K''}\le \varsigma ^2_{1:K^{*}}\).


Case 1: The provisional q-vector is strictly overspecified, that is, \(K'=K^{*}<K''\) resulting in \(\varvec{q}=\varvec{q}_{1:K''}\) and \(\varsigma ^2=\varsigma ^2_{1:K''}\).

By Lemma 1, the theorem for this case can be proved by showing that

$$\begin{aligned} \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})p^{2}(\varvec{\alpha }_{1:K''}) =\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1}w(\varvec{\alpha }_{1:{K^{*}}})p^{2}(\varvec{\alpha }_{1:{K^{*}}}). \end{aligned}\tag{16}$$

Based on Definition 1,

$$\begin{aligned} w(\varvec{\alpha }_{1:K^{*}})=\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{1:K''}), \end{aligned}$$


$$\begin{aligned} p(\varvec{\alpha }_{1:K^{*}})=\frac{\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{1:K''})p(\varvec{\alpha }_{1:K''}) }{\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})}. \end{aligned}$$

The RHS of (16) can be expressed as

$$\begin{aligned} \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1} w(\varvec{\alpha }_{1:K^{*}})\Bigg [\frac{\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''}) p(\varvec{\alpha }_{1:K''})}{\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})}\Bigg ]^{2}. \end{aligned}\tag{17}$$

By definition of a correct q-vector,

$$\begin{aligned} p(\varvec{\alpha }_{1:K''})=p(\varvec{\alpha }_{1:K^{*}}) \quad \forall \; \alpha _{K^{*}+1},\ldots ,\alpha _{K''}. \end{aligned}$$

Thus, (17) is equal to

$$\begin{aligned}&\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1} w(\varvec{\alpha }_{1:K^{*}})\Bigg [\frac{p(\varvec{\alpha }_{1:K^{*}})\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})}{\sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})}\Bigg ]^{2}\\&\quad =\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1} \sum _{\alpha _{K^{*}+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''}) p^{2}(\varvec{\alpha }_{1:K''})\\&\quad =\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''}) p^{2}(\varvec{\alpha }_{1:K''}) \end{aligned}$$

which is the LHS of (16).

Case 2: When the provisional q-vector is both under- and overspecified, that is, \(K'<K^{*}<K''\), \(\varvec{q}=\varvec{q}_{K':K''}\) and \(\varsigma ^2=\varsigma ^2_{K':K''}\).

Case 2 will be proved by induction. By Lemma 1 and the result for Case 1, Case 2 of the theorem can be proved by showing that

$$\begin{aligned}&\sum _{\alpha _{K'}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{K':K''})p^{2}(\varvec{\alpha }_{K':K''}) \nonumber \\&\quad \le \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1}w(\varvec{\alpha }_{1:K^{*}})p^{2}(\varvec{\alpha }_{1:K^{*}})\nonumber \\&\quad =\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})p^{2}(\varvec{\alpha }_{1:K''}) \end{aligned}\tag{18}$$

Step 1: Show that (18) is true for \(K'=1\).

The LHS of (18) is

$$\begin{aligned} \sum _{\alpha _{K'}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{K':K''})p^{2}(\varvec{\alpha }_{K':K''}) =\sum _{\alpha _1=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})p^{2}(\varvec{\alpha }_{1:K''}), \end{aligned}$$

which is the RHS of (18).

Step 2: Assume that (18) is true for \(K'=k\), that is,

$$\begin{aligned} \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{k:K''}) p^{2}(\varvec{\alpha }_{k:K''}) \le \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})p^{2}(\varvec{\alpha }_{1:K''}). \end{aligned}$$

Step 3: Show that (18) is true for \(K'=k+1\), that is,

$$\begin{aligned}&\sum _{\alpha _{k+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{k+1:K''}) p^{2}(\varvec{\alpha }_{k+1:K''})\nonumber \\&\quad \le \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K^{*}}=0}^{1}w(\varvec{\alpha }_{1:K^{*}})p^{2}(\varvec{\alpha }_{1:K^{*}})\nonumber \\&\quad =\sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''}) p^{2}(\varvec{\alpha }_{1:K''}) \end{aligned}\tag{19}$$

By Lemma 2, the LHS of (19) satisfies

$$\begin{aligned} \sum _{\alpha _{k+1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1} w(\varvec{\alpha }_{k+1:K''})p^{2}(\varvec{\alpha }_{k+1:K''})\le \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{k:K''}) p^{2}(\varvec{\alpha }_{k:K''}). \end{aligned}\tag{20}$$

By the assumption in Step 2, the RHS of (20) is in turn bounded as

$$\begin{aligned} \sum _{\alpha _{k}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{k:K''}) p^{2}(\varvec{\alpha }_{k:K''})\le \sum _{\alpha _{1}=0}^{1}\cdots \sum _{\alpha _{K''}=0}^{1}w(\varvec{\alpha }_{1:K''})p^{2}(\varvec{\alpha }_{1:K''}), \end{aligned}$$

whose right-hand side is the RHS of (19). Chaining the two inequalities establishes (19). \(\square \)
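The theorem can be illustrated with a small simulated item. The sketch below is not the authors' implementation: it assumes that \(\varsigma ^2\) is the weighted variance of the collapsed success probabilities, \(\sum w\,(p-\bar{p})^2\), which is consistent with the way the proof reduces the theorem to a comparison of \(\sum w\,p^{2}\) terms via Lemma 1. The function names, the choice of four attributes with the first two required, and the random inputs are hypothetical, and attribute positions are 0-based.

```python
import itertools
import random


def collapse(weights, probs, keep):
    """Marginal weight and weight-averaged probability per reduced pattern."""
    keep = list(keep)
    w, wp = {}, {}
    for alpha, wt in weights.items():
        key = tuple(alpha[i] for i in keep)
        w[key] = w.get(key, 0.0) + wt
        wp[key] = wp.get(key, 0.0) + wt * probs[alpha]
    return w, {key: wp[key] / w[key] for key in w}


def varsigma2(weights, probs, keep):
    """Assumed discrimination index: weighted variance of collapsed probabilities."""
    w, p = collapse(weights, probs, keep)
    mean = sum(w[key] * p[key] for key in w)
    return sum(w[key] * (p[key] - mean) ** 2 for key in w)


random.seed(2)
K, K_STAR = 4, 2  # four attributes in total; only the first two are required
patterns = list(itertools.product((0, 1), repeat=K))
raw = [random.random() for _ in patterns]
weights = {a: r / sum(raw) for a, r in zip(patterns, raw)}

# Success probability depends only on the required attributes (correct q-vector).
base = {a: random.random() for a in itertools.product((0, 1), repeat=K_STAR)}
probs = {a: base[a[:K_STAR]] for a in patterns}

correct = varsigma2(weights, probs, range(K_STAR))  # q-vector 1:K*
over = varsigma2(weights, probs, range(K))          # overspecified q-vector 1:K''

# Every contiguous q-vector K':K'' is compared against the correct one.
candidates = {
    (lo, hi): varsigma2(weights, probs, range(lo, hi))
    for lo in range(K) for hi in range(lo + 1, K + 1)
}

assert abs(over - correct) < 1e-12                            # Case 1: equality
assert all(v <= correct + 1e-12 for v in candidates.values()) # Theorem 1: upper bound
```

In this toy setting the correct q-vector and every overspecified q-vector that contains it attain the maximum, while q-vectors that omit a required attribute fall at or below it.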


Cite this article

de la Torre, J., Chiu, CY. A General Method of Empirical Q-matrix Validation. Psychometrika 81, 253–273 (2016).
