Commentary on “Extending the Basic Local Independence Model to Polytomous Data” by Stefanutti, de Chiusole, Anselmi, and Spoto

The Polytomous Local Independence Model (PoLIM) by Stefanutti, de Chiusole, Anselmi, and Spoto, is an extension of the Basic Local Independence Model (BLIM) to accommodate polytomous items. BLIM, a model for analyzing responses to binary items, is based on Knowledge Space Theory, a framework developed by cognitive scientists and mathematical psychologists for modeling human knowledge acquisition and representation. The purpose of this commentary is to show that PoLIM is simply a paraphrase of a DINA model in cognitive diagnosis for polytomous items. Specifically, BLIM is shown to be equivalent to the DINA model when the BLIM-items are conceived as binary single-attribute items, each with a distinct attribute; thus, PoLIM is equivalent to the DINA for polytomous single-attribute items, each with a distinct attribute.

where $q$ refers to an individual item in $Q$. The principal purpose of fitting item responses by the BLIM is estimation of the knowledge state $K$.

Whereas BLIM describes ability in a domain as the set of items $K$ an examinee is capable of solving, CD, a paradigm of educational measurement based on psychometric theory (e.g., Haberman & von Davier, 2007; von Davier & Lee, 2019), describes ability as a composite of $A$ latent binary skills $\alpha_a$, $a = 1, 2, \ldots, A$, called "attributes," each of which an examinee may or may not have mastered. Attribute mastery is recorded as an $A$-dimensional binary vector $\alpha$. Different proficiency classes are identified by distinct $\alpha$. The individual items of a cognitively diagnostic test are also characterized by $A$-dimensional binary attribute vectors $q$ that determine for each item $j$, $j = 1, 2, \ldots, J$, which attributes must be mastered for a correct response ($q_{ja} = 1$ if the $a$th attribute is required, 0 otherwise). The conjunction parameter $\xi_{ij} \in \{0, 1\}$ indicates whether examinee $i$ has mastered all the attributes needed to answer item $j$ correctly:

$$\xi_{ij} = \prod_{a=1}^{A} I[\alpha_{ia} \geq q_{ja}] \tag{2}$$

Thus, the $J$-dimensional vector $\xi$ is the CD-equivalent to $K$ in KST.

The Deterministic Input Noisy Output "AND" Gate (DINA) model (e.g., Haertel, 1989; Junker & Sijtsma, 2001; Macready & Dayton, 1977; von Davier, 2014) is presumably among the most frequently used diagnostic classification models (DCMs). The conditional probability of a correct response $Y_{ij}$ to item $j$, given an examinee's attribute profile $\alpha_i$, is modeled as a function of the examinee's mastery of the attributes required for item $j$, subject to slipping and guessing. The latter are implemented in the item response function (IRF) as the item-related parameters $s_j = P(Y_{ij} = 0 \mid \xi_{ij} = 1)$ and $g_j = P(Y_{ij} = 1 \mid \xi_{ij} = 0)$, with $0 \leq g_j < 1 - s_j \leq 1$:

$$P(Y_{ij} = 1 \mid \alpha_i) = (1 - s_j)^{\xi_{ij}} \, g_j^{1 - \xi_{ij}} \tag{3}$$

The responses $Y_j$ are assumed to be locally independent. Hence, given an examinee's attribute profile $\alpha_i$, the conditional probability of her response vector $Y_i$ is

$$P(Y_i \mid \alpha_i) = \prod_{j=1}^{J} P(Y_{ij} = 1 \mid \alpha_i)^{Y_{ij}} \, [1 - P(Y_{ij} = 1 \mid \alpha_i)]^{1 - Y_{ij}} \tag{4}$$

The principal goal of CD is to classify examinees based on $Y_i$, that is, to estimate $\alpha_i$. Thus, the major difference between BLIM and DINA is that the former describes ability in terms of the items an examinee is able to solve, whereas DINA further dissects ability as a function of item attribute requirements and the attributes mastered by an examinee.
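The binary DINA model described above can be illustrated with a short Python sketch. It computes the conjunctive ideal response, the probability of a correct response under slipping and guessing, and the probability of a full response vector under local independence. The function names, the Q-matrix, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the commentary.

```python
import numpy as np

def ideal_response(alpha, q):
    """Conjunctive ideal response: 1 if every attribute required by q is mastered."""
    return int(np.all(np.asarray(alpha) >= np.asarray(q)))

def dina_irf(alpha, q, slip, guess):
    """P(Y = 1 | alpha) = (1 - slip)^xi * guess^(1 - xi)."""
    xi = ideal_response(alpha, q)
    return (1.0 - slip) ** xi * guess ** (1 - xi)

def dina_pattern_prob(y, alpha, Q, slip, guess):
    """Probability of a whole response vector, assuming local independence."""
    prob = 1.0
    for y_j, q_j, s_j, g_j in zip(y, Q, slip, guess):
        p_j = dina_irf(alpha, q_j, s_j, g_j)
        prob *= p_j if y_j == 1 else 1.0 - p_j
    return prob

# Illustrative values only (hypothetical, not from the commentary).
Q_example = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
alpha_example = [1, 0, 1]

p_master = dina_irf(alpha_example, Q_example[0], slip=0.1, guess=0.2)     # xi = 1
p_nonmaster = dina_irf(alpha_example, Q_example[1], slip=0.1, guess=0.2)  # xi = 0
p_pattern = dina_pattern_prob([1, 0, 1], alpha_example, Q_example,
                              slip=[0.1, 0.1, 0.1], guess=[0.2, 0.2, 0.2])
```

In this toy setting the examinee masters the attributes needed for items 1 and 3 but not for item 2, so the first and third items are governed by the slip parameter and the second by the guessing parameter.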

BLIM as a DINA Model
Beyond the equivalence of $\xi$ and $K$, synchronizing the notation of CD and KST reveals further correspondences. Since $Y_{ij}$ and $\xi_{ij}$ are binary, Eq. (4) simplifies to

$$P(Y_i \mid \alpha_i) = \prod_{j=1}^{J} s_j^{\xi_{ij}(1 - Y_{ij})} \, (1 - s_j)^{\xi_{ij} Y_{ij}} \, g_j^{(1 - \xi_{ij}) Y_{ij}} \, (1 - g_j)^{(1 - \xi_{ij})(1 - Y_{ij})} \tag{5}$$

Comparing Eq. (5) with Eq. (1) (repeated here for convenience),

$$P(R \mid K) = \prod_{q \in K \setminus R} \beta_q \prod_{q \in K \cap R} (1 - \beta_q) \prod_{q \in R \setminus K} \eta_q \prod_{q \in Q \setminus (R \cup K)} (1 - \eta_q) \tag{1}$$

shows that item $q \in Q$ corresponds to item $j$ in a CD context; in fact, $Q$ is mapped to $\{0, 1\}^J$, where $J = |Q|$. Also, slipping, $\beta_q$, and guessing, $\eta_q$, in BLIM correspond to $s_j$ and $g_j$ in the DINA model.

Several comments are warranted. First, the set $R \subseteq Q$ of correctly answered items is mapped to the responses in CD by $Y_j = I[q \in R]$. Second, the transition from the knowledge state $K$ to its CD analogue, the vector of ideal responses $\xi$, is achieved by mapping $K$ to the $j$th element of $\xi$ through the indicator function $\xi_j = I[q \in K]$. Third, the goal in KST is to estimate $K$; in CD, however, the goal is to estimate $\alpha$. So, how can $K$ be linked to $\alpha$? Recall that $K$ is equivalent to $\xi$; for single-attribute items in CD, it is true that $\xi = \alpha$: an examinee's vector of ideal responses is equal to her attribute profile (Chiu et al., 2009; Köhn & Chiu, 2019). Hence, estimating BLIM using DINA requires that all BLIM-items be single-attribute items, each with a distinct attribute, because only then is $\alpha = \xi$ true. Said differently, only under this condition is estimating $\alpha$ equivalent to estimating $\xi$, which in turn is equivalent to estimating $K$.
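The correspondence between the two models can be checked numerically. The sketch below evaluates the BLIM response probability in its usual set-based form and the DINA response probability with an identity Q-matrix, using $s_j = \beta_q$ and $g_j = \eta_q$. The parameter values, the knowledge state, and the response pattern are hypothetical; only the five item labels follow the example domain used in this commentary.

```python
import numpy as np

ITEMS = ["a", "b", "c", "d", "e"]

def blim_prob(R, K, beta, eta):
    """BLIM: P(R | K) with careless-error (beta) and lucky-guess (eta) rates."""
    prob = 1.0
    for q in ITEMS:
        if q in K:
            prob *= (1.0 - beta[q]) if q in R else beta[q]
        else:
            prob *= eta[q] if q in R else (1.0 - eta[q])
    return prob

def dina_prob(y, alpha, Q, slip, guess):
    """DINA: P(y | alpha) under local independence."""
    y, alpha, Q = np.asarray(y), np.asarray(alpha), np.asarray(Q)
    slip = np.asarray(slip, dtype=float)
    guess = np.asarray(guess, dtype=float)
    xi = np.all(alpha >= Q, axis=1).astype(int)          # ideal responses
    p_correct = (1.0 - slip) ** xi * guess ** (1 - xi)   # item response function
    return float(np.prod(np.where(y == 1, p_correct, 1.0 - p_correct)))

# Hypothetical parameter values, knowledge state, and response pattern.
beta = dict(zip(ITEMS, [0.10, 0.05, 0.20, 0.15, 0.10]))
eta = dict(zip(ITEMS, [0.05, 0.10, 0.10, 0.20, 0.15]))
K = {"a", "b", "c"}
R = {"a", "c", "d"}

# Map sets to binary vectors; with an identity Q-matrix, alpha equals xi.
alpha = [int(q in K) for q in ITEMS]
y = [int(q in R) for q in ITEMS]
Q_identity = np.eye(len(ITEMS), dtype=int)

p_blim = blim_prob(R, K, beta, eta)
p_dina = dina_prob(y, alpha, Q_identity,
                   slip=[beta[q] for q in ITEMS],
                   guess=[eta[q] for q in ITEMS])
```

Both calls return the same value because, with one distinct attribute per item, the ideal response vector coincides with the attribute profile and hence with the indicator vector of the knowledge state.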

The BLIM as a DINA Model: An Empirical Demonstration
A synthetic data set comprising responses to five items by 1000 examinees (Doignon & Falmagne, 1999, Ch. 7) is used to demonstrate the equivalence of BLIM and DINA. The knowledge domain is $Q = \{a, b, c, d, e\}$; the knowledge structure is defined as an ordinal knowledge space, $\mathcal{K} = \{\emptyset, \{a\}, \{b\}, \{a,b\}, \{a,b,c\}, \{a,b,d\}, \{a,b,c,d\}, \{a,b,c,e\}, Q\}$ (see Eq. [1] in Doignon & Falmagne, 1999, p. 142). Notice that the prerequisite relations among items impose a hierarchical structure on the knowledge states. Consequently, $\mathcal{K}$ contains only nine out of $2^5 = 32$ theoretically possible knowledge states. As a courtesy to the reader, the data are presented below.

Note. Data are retrieved from Table 7.1 in Doignon and Falmagne, 1999, Ch. 7, p. 147.

BLIM was fitted using the default settings of the blim function in the R package pks. The data were also fitted with the DINA model using the GDINA function in the R package GDINA, with the Q-matrix specified as a $5 \times 5$ identity matrix representing the $|Q| = J$ BLIM-items as single-attribute items, each with a different attribute. The GDINA function uses marginal maximum likelihood estimation relying on the EM algorithm; hence, the blim function was used with the option "ML" to secure compatibility of the estimation. For the GDINA function, the initial values of the item parameters $s$ and $g$ were set to 0.1, which is the default setting for $\beta$ and $\eta$ in the blim function. The KST knowledge structure $\mathcal{K}$ consisted of nine different knowledge states $K$; thus, to match this setting, the attribute profiles $\alpha$ were limited to nine (recall $\alpha = \xi \equiv K$). The convergence criterion was set to $10^{-7}$.

The results are presented below: the estimates of the item parameters and the probabilities $P(K \mid R)$ and $P(\xi = \alpha \mid y)$ are identical (except for a few discrepancies at the third or fourth decimal position); notice that $y$ denotes the vector of observed item responses in CD.

Stefanutti et al. (2020) define $L$ as the set of response levels of a polytomous item $q \in Q$, with $l, l' \in L$; $K(q) = l$ and $R(q) = l'$ denote (possibly distinct) levels of responses to item $q$ in $K$ and in $R$, respectively. Describing disagreements between $K$ and $R$ as "slips" and "guesses" might be an inadequate simplification in a polytomous setting. Instead, Stefanutti et al. (2020) define the function $\varepsilon_q(K(q), R(q)) = \varepsilon_q(l, l')$ to account for agreement and disagreement between $K(q)$ and $R(q)$. The constraint $\sum_{l' \in L} \varepsilon_q(l, l') = 1$, for given $q$ and $l$, identifies $\varepsilon_q$ as a probability. The conditional probability of observing $R$, given the latent knowledge state $K$, is defined for the PoLIM as

$$P(R \mid K) = \prod_{q \in Q} \varepsilon_q(K(q), R(q)) \tag{6}$$

The Polytomous DINA Model
[...] von Davier, 2008). But as BLIM corresponds to the DINA model when the BLIM-items are conceived as binary single-attribute items, each with a distinct attribute, and PoLIM extends BLIM to polytomous items, the focus here is on the polytomous DINA model when only single-attribute items are used. In the polytomous DINA, as in the PoLIM, observed and ideal item responses are conceived as polytomous. However, unlike PoLIM, DINA involves latent skills/attributes, which calls for some adjustments of the DINA model for binary items.

The Case of a Polytomous Single-Attribute Item
Let $\alpha_a$ have levels $0, 1, 2, \ldots, H_a$; there are then $H_a + 1$ distinct levels of $\alpha_a$. The concept of polytomous attributes applies to examinee attribute profiles, $\alpha$, as well as to item attribute vectors, $q$, which, in turn, affects the definition of the ideal response $\xi$ that uses $q$ and $\alpha$ as key ingredients. The extension of the binary $\xi$ (defined in Eq. (2) and repeated here for convenience),

$$\xi_{ij} = \prod_{a=1}^{A} I[\alpha_{ia} \geq q_{ja}] \tag{2}$$

to a polytomous ideal item response needs to account for the fact that the argument of the indicator function, $\alpha_{ia} \geq q_{ja}$, now compares two quantities that are no longer binary, but take on values ranging from 0 to $H_a$ (i.e., the levels of attribute $\alpha_a$).

The entries of the item attribute vector $q_j$ are all zero except for the entry $q_{ja} \in \{1, 2, \ldots, H_a\}$ that corresponds to the single polytomous attribute $\alpha_a$ required for a correct response to item $j$. The $q$-vector of the polytomous single-attribute item $j$ is denoted as $q_j^{(a)} = (0, \ldots, 0, q_{ja}, 0, \ldots, 0)$. Let $L_j$ denote the highest level of the polytomous ideal response $\xi_{ij}$, with different levels $l \in \{0, 1, \ldots, L_j\}$. Of course, for a single-attribute item having $q$-vector $q_j^{(a)}$, $L_j = q_{ja}$ is true, and so the ideal response is

$$\xi_{ij} = \max\{\, l \cdot I[\alpha_{ia} \geq l] : l = 0, 1, \ldots, L_j \,\}$$

As an example, consider a single-attribute item $j$ with $q_j^{(3)} = (0, 0, 3)$ and an examinee with attribute profile $\alpha_i = (1, 2, 3)$; then $\xi_{ij} = \max\{0, 1, 2, 3\} = 3$. Finally, notice that the observable random variable $Y_{ij}$ has $L_j + 1$ response levels, like its latent counterpart $\xi_{ij}$.
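The polytomous ideal response of a single-attribute item can be sketched in a few lines of Python. The sketch assumes that the ideal response is the largest level $l \leq q_{ja}$ for which $\alpha_{ia} \geq l$ holds, which matches the worked computations of the form $\max\{0(1), 1(1), 2(0), \ldots\}$ used in this commentary; the function name is hypothetical.

```python
def polytomous_ideal_response(alpha, q):
    """Ideal response for a single-attribute polytomous item.

    Assumes q has exactly one nonzero entry q_ja = L_j and returns
    max over l in {0, ..., L_j} of l * I[alpha_a >= l].
    """
    required = [(a, level) for a, level in enumerate(q) if level > 0]
    if len(required) != 1:
        raise ValueError("expected exactly one nonzero entry in q")
    a, top_level = required[0]
    return max(l * int(alpha[a] >= l) for l in range(top_level + 1))

# Example from the text: q = (0, 0, 3) and alpha = (1, 2, 3).
xi_full = polytomous_ideal_response(alpha=(1, 2, 3), q=(0, 0, 3))
# An examinee who reached only level 1 on the third attribute.
xi_partial = polytomous_ideal_response(alpha=(1, 2, 1), q=(0, 0, 3))
```

For nonnegative attribute levels, this maximum equals $\min(\alpha_{ia}, q_{ja})$, which may be a convenient shortcut in practice.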

Perturbations (Formerly Known as "Slips" and "Guesses")
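As mentioned earlier in connection with PoLIM, describing the item parameters as "slips" and "guesses" does not adequately account for the increased complexity of potential disagreements between ideal and observed responses in a polytomous setting. Instead, the more general term "perturbation" should be preferred, which also calls for adjusting the notation. Recall that $l$ denotes the category of the ideal response, $\xi_{ij}$, of examinee $i$ to item $j$; let $l'$ denote the category of her observed response $Y_{ij}$ to this item. Following Stefanutti et al. (2020), the corresponding probabilities are denoted by $\varepsilon$:

$$\varepsilon_{jll'} = P(Y_{ij} = l' \mid \xi_{ij} = l)$$

As an example, consider again item $j$ having $q_{ja} = L_j = 3$. The table below summarizes, for all combinations of the levels $l$ and $l'$, the corresponding item parameters $\varepsilon_j$:

$$
\begin{array}{c|cccc}
 & l' = 0 & l' = 1 & l' = 2 & l' = 3 \\ \hline
l = 0 & 1 - \sum_{m \neq 0} \varepsilon_{j0m} & \varepsilon_{j01} & \varepsilon_{j02} & \varepsilon_{j03} \\
l = 1 & \varepsilon_{j10} & 1 - \sum_{m \neq 1} \varepsilon_{j1m} & \varepsilon_{j12} & \varepsilon_{j13} \\
l = 2 & \varepsilon_{j20} & \varepsilon_{j21} & 1 - \sum_{m \neq 2} \varepsilon_{j2m} & \varepsilon_{j23} \\
l = 3 & \varepsilon_{j30} & \varepsilon_{j31} & \varepsilon_{j32} & 1 - \sum_{m \neq 3} \varepsilon_{j3m}
\end{array}
$$

As an aside, notice that the form of the main diagonal entries, $1 - \sum_{m \neq l} \varepsilon_{jlm}$, is slightly reminiscent of the terms $1 - s_j$ and $1 - g_j$ in the binary DINA model.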

The Item Response Function of the Polytomous DINA
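The IRF of the polytomous DINA model is

$$P(Y_{ij} = l' \mid \alpha_i) = \varepsilon_{j \xi_{ij} l'}, \qquad \text{with } \varepsilon_{jll} = 1 - \sum_{m \neq l} \varepsilon_{jlm}$$

Using the previous example of the polytomous single-attribute item $j$ having $q_j = (0, 0, 3)$, the IRF of the polytomous DINA model returns the following probabilities for the response categories of $Y_{ij}$, given $\alpha_i = (1, 2, 3)$ (notice that $\xi_{ij} = 3$):

$$
\begin{aligned}
P(Y_{ij} = 0 \mid \alpha_i) &= \varepsilon_{j30} \\
P(Y_{ij} = 1 \mid \alpha_i) &= \varepsilon_{j31} \\
P(Y_{ij} = 2 \mid \alpha_i) &= \varepsilon_{j32} \\
P(Y_{ij} = 3 \mid \alpha_i) &= 1 - \varepsilon_{j30} - \varepsilon_{j31} - \varepsilon_{j32}
\end{aligned}
$$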
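A minimal Python sketch may help to see how the perturbation parameters and the IRF fit together. It assumes that the perturbation parameters of an item are stored as a square matrix whose row index is the ideal response level and whose column index is the observed response level, with the diagonal filled so that each row sums to one. The function names and all numerical values are hypothetical.

```python
import numpy as np

def perturbation_matrix(off_diagonal):
    """Complete a perturbation matrix so that every row sums to 1.

    off_diagonal[l][m] is P(Y = m | xi = l) for m != l; any values placed
    on the diagonal are ignored and replaced by 1 minus the row sum.
    """
    eps = np.array(off_diagonal, dtype=float)
    np.fill_diagonal(eps, 0.0)
    np.fill_diagonal(eps, 1.0 - eps.sum(axis=1))
    if np.any(eps < 0):
        raise ValueError("off-diagonal entries of a row sum to more than 1")
    return eps

def polytomous_dina_irf(xi, eps):
    """Distribution of the observed response, given the ideal response xi."""
    return eps[xi]

# Hypothetical perturbations for an item with levels 0, 1, 2, 3.
eps_j = perturbation_matrix([
    [0.00, 0.10, 0.05, 0.05],
    [0.10, 0.00, 0.10, 0.05],
    [0.05, 0.10, 0.00, 0.10],
    [0.05, 0.05, 0.10, 0.00],
])

# Ideal response xi = 3, as in the example with q = (0, 0, 3), alpha = (1, 2, 3).
probs = polytomous_dina_irf(3, eps_j)
```

Storing the parameters row-wise means that the IRF reduces to selecting one row, which mirrors the role of the ideal response as the only link between the attribute profile and the response distribution.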

PoLIM as a Polytomous DINA Model
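PoLIM is equivalent to the polytomous DINA model if the $J = |Q|$ items are conceived as single-attribute items, with all attributes distinct; recall that the concept of attributes is foreign to BLIM as well as PoLIM. Consider again Eq. (6) (repeated here for convenience):

$$P(R \mid K) = \prod_{q \in Q} \varepsilon_q(K(q), R(q)) \tag{6}$$

which can be rewritten as

$$P(R \mid K) = \prod_{q:\, K(q) = R(q)} \varepsilon_q(l, l) \prod_{q:\, K(q) \neq R(q)} \varepsilon_q(l, l')$$

where $l = K(q)$ and $l' = R(q)$, and where $\varepsilon_q(l, l)$ refers to the case $K(q) = R(q)$, which can be reexpressed as $1 - \sum_{l' \neq l} \varepsilon_q(l, l')$ due to the constraint $\sum_{l' \in L} \varepsilon_q(l, l') = 1$ for all $l$ and each $q$.

For the polytomous DINA model, the conditional probability of an examinee's polytomous response vector $Y_i$ is

$$P(Y_i \mid \alpha_i) = \prod_{j:\, \xi_{ij} = Y_{ij}} \Big(1 - \sum_{m \neq \xi_{ij}} \varepsilon_{j \xi_{ij} m}\Big) \prod_{j:\, \xi_{ij} \neq Y_{ij}} \varepsilon_{j \xi_{ij} Y_{ij}}$$

Eq. (9) confirms that the two expressions for the conditional probabilities are equivalent.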
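Suppose polytomous responses to $J$ items have been collected. Equivalent to using PoLIM, the data can be fitted by the polytomous DINA model by constructing a $J \times J$ Q-matrix with each item $j$ expressed as a single-attribute item having $q$-vector $q_j^{(a)}$, where entry $q_{ja} = q_{jj} = L_j$ for all $j$. Notice that, as with BLIM-as-DINA, there are $J = A$ attributes that must all be distinct. Therefore, the Q-matrix is a $J \times J$ diagonal matrix, with the main diagonal containing the $L_j$. As is true for BLIM-as-DINA, the conjunctive ideal item response vector $\xi_i$ of examinee $i$ is identical to her attribute profile $\alpha_i$. Thus, estimating $\alpha$ is equivalent to estimating $\xi$, and the latter is the analogue to $K$ that is to be estimated by PoLIM.

Here is a small-scale example involving only two items. Suppose the highest response levels are 3 and 4, respectively. Thus, when the data are fitted with the polytomous DINA model, the Q-matrix is

$$Q = \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}$$

Suppose examinee $i$ has attribute profile $\alpha_i = (3, 1)$. Using Eq. (10), $\xi_{i1}$ and $\xi_{i2}$ can be obtained by computing

$$
\begin{aligned}
\xi_{i1} &= \max\{0(1), 1(1), 2(1), 3(1)\} = 3 \\
\xi_{i2} &= \max\{0(1), 1(1), 2(0), 3(0), 4(0)\} = 1
\end{aligned}
$$

Hence, the vector of ideal item responses is $\xi_i = (3, 1) = \alpha_i$. The associated item response probabilities are

$$
\begin{aligned}
P(Y_{i1} = l' \mid \alpha_i) &= \varepsilon_{13l'}, \quad l' = 0, 1, 2, 3, \quad \text{with } \varepsilon_{133} = 1 - \varepsilon_{130} - \varepsilon_{131} - \varepsilon_{132} \\
P(Y_{i2} = l' \mid \alpha_i) &= \varepsilon_{21l'}, \quad l' = 0, 1, 2, 3, 4, \quad \text{with } \varepsilon_{211} = 1 - \varepsilon_{210} - \varepsilon_{212} - \varepsilon_{213} - \varepsilon_{214}
\end{aligned}
$$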
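The two-item example can be reproduced with a short Python sketch. It builds the diagonal Q-matrix, computes the ideal responses as the largest attained level per item, and evaluates the probability of one response pattern as a product of perturbation parameters. The perturbation matrices are hypothetical (a fixed agreement probability with the remainder spread evenly over the other categories), as are the function names and the chosen response pattern.

```python
import numpy as np

def ideal_responses(alpha, Q):
    """Ideal responses for single-attribute items with a diagonal Q-matrix."""
    xi = []
    for q_j in np.asarray(Q):
        a = int(np.flatnonzero(q_j)[0])   # the single attribute required
        top = int(q_j[a])                 # highest level L_j of the item
        xi.append(max(l * int(alpha[a] >= l) for l in range(top + 1)))
    return tuple(xi)

def uniform_perturbations(n_categories, agree=0.85):
    """Hypothetical perturbation matrix: 'agree' on the diagonal, rest uniform."""
    eps = np.full((n_categories, n_categories),
                  (1.0 - agree) / (n_categories - 1))
    np.fill_diagonal(eps, agree)
    return eps

def pattern_prob(y, xi, eps_per_item):
    """P(y | alpha) as the product of eps_j[xi_j, y_j] over items."""
    return float(np.prod([eps[x, r] for eps, x, r in zip(eps_per_item, xi, y)]))

Q = np.diag([3, 4])      # highest response levels 3 and 4
alpha = (3, 1)           # attribute profile from the example
xi = ideal_responses(alpha, Q)

eps_per_item = [uniform_perturbations(4), uniform_perturbations(5)]
p = pattern_prob((3, 2), xi, eps_per_item)   # item 1 agrees, item 2 is perturbed
```

Because every item has its own attribute, the ideal response vector returned by the sketch equals the attribute profile, which is the property the equivalence argument relies on.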

The PoLIM as a Polytomous DINA Model: An Empirical Demonstration
Although software for fitting PoLIM is not publicly available, PoLIM can still be estimated using the polytomous DINA model based on the principles developed in the previous paragraphs. The validity of this approach is demonstrated with a small-scale computational experiment that emulates the unavailable PoLIM algorithm in analyzing the synthetic BLIM data from Doignon and Falmagne (1999, Ch. 7) that were already used earlier. The key idea is to predict from the BLIM results those of a PoLIM analysis if the item response format were changed from binary to polytomous. These results are then compared with those obtained from the polytomous DINA model.

For maximal precision, the response format of only a single item was changed from binary to trinary. Also, an item was chosen that is included in a minimal number of knowledge states; Item $e$ met this requirement: it was only an element of $K = \{a, b, c, e\}$ and $K = \{a, b, c, d, e\}$ (recall the knowledge domain is $Q = \{a, b, c, d, e\}$). The correct responses to Item $e$ were split into two parts: either into equal halves (50:50) or according to 40:60. The first part of the correct responses to Item $e$ was kept as "1"; the second part was changed to "2," which induces a split of each of the knowledge states, $K = \{a, b, c, e\}$ and $K = \{a, b, c, d, e\}$, into two, thereby increasing the number of knowledge states from nine to 11. Specifically, in switching from set to vector notation, $K = \{a, b, c, e\}$ can be written as (11101), which is then split into (11101) and (11102); and $K = \{a, b, c, d, e\}$ is written as (11111), split into (11111) and (11112).

Fitting the modified data with the polytomous DINA model was predicted to provide estimates of the proportions of the knowledge states (11101) and (11102) that would add up to the proportion of $K = \{a, b, c, e\} \equiv (11101)$, obtained earlier by BLIM as 0.1417. The same reasoning applies to the estimated proportions of (11111) and (11112), which would add up to the BLIM estimate of the proportion of $K = \{a, b, c, d, e\} \equiv (11111)$, equal to 0.1568. An additional prediction was that the proportions estimated by BLIM for the remaining $K$ not involving Item $e$ should remain the same when fitting the data with the polytomous DINA model.

Notice that splitting the responses for the patterns involving Item $e$ amounted to producing fractional subjects (e.g., splitting the frequency of $R = \{a, b, d, e\}$, which is 19, into two would have resulted in 9.5 subjects). Hence, the frequencies of the observed response patterns were multiplied by two (i.e., $N = 2{,}000$), which preserves the proportions. The data were fitted with the polytomous DINA model, which was coded in R by the authors, implementing marginal maximum likelihood estimation relying on the EM algorithm. The Q-matrix was specified as a $5 \times 5$ matrix:

$$Q = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}$$

The KST knowledge structure $\mathcal{K}$ now consisted of 11 different knowledge states $K$; thus, to match this setting, the attribute profiles $\alpha$ were limited to 11. The results are presented below: the probabilities $P(K \mid R)$ and $P(\xi = \alpha \mid y)$ are identical (except for a few discrepancies at the third or fourth decimal position). Specifically, notice that the probabilities of the split knowledge states, $K = \{a, b, c, e\} \equiv (11101)$ and $K = \{a, b, c, d, e\} \equiv (11111)$, add up, as predicted, to the proportion estimates obtained from BLIM. For completeness, the estimates of the item parameters are also reported.
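The data modification step can be sketched as follows. The helper below scales the frequency of a response pattern and then splits it into the part that keeps the response "1" on Item e and the part recoded as "2"; it refuses splits that would yield fractional subjects. The function name and its interface are hypothetical; only the frequency 19 and the doubling of the sample are taken from the text.

```python
from fractions import Fraction

def split_frequency(freq, share_first, multiplier=2):
    """Scale a pattern frequency, then split it into two integer parts.

    Returns (kept_as_1, recoded_as_2). Raises ValueError if the split
    would produce fractional subjects.
    """
    scaled = freq * multiplier
    first = scaled * Fraction(share_first)
    if first.denominator != 1:
        raise ValueError("split would produce fractional subjects")
    return int(first), scaled - int(first)

# Pattern R = {a, b, d, e} has frequency 19; a 50:50 split of the doubled
# frequency yields whole numbers of subjects.
kept_as_1, recoded_as_2 = split_frequency(19, Fraction(1, 2))

# Without doubling, the same split would require 9.5 subjects.
try:
    split_frequency(19, Fraction(1, 2), multiplier=1)
    needs_scaling = False
except ValueError:
    needs_scaling = True
```

Exact rational arithmetic is used here so that the check for fractional subjects is not affected by floating-point rounding.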

[Table: Item parameter estimates for BLIM and for the polytomous DINA model]

Conclusion and Outlook
BLIM models the responses to dichotomous items within a KST framework. PoLIM extends BLIM to model the responses to polytomous items. BLIM was shown to be equivalent to the DINA model if all items in $Q$ are conceived as single-attribute items; likewise, PoLIM is equivalent to the polytomous DINA model if all items require only a single attribute. In conclusion, a few aspects of single-attribute items should be addressed, including how they connect BLIM and PoLIM in KST with DINA and its polytomous companion in CD.
The first aspect concerns the identifiability of the model parameters. BLIM is known not to be identifiable for most knowledge structures (Heller, 2017; Spoto et al., 2012, 2013; Stefanutti et al., 2018). In light of the equivalence of BLIM and the single-attribute DINA model, the nonidentifiability of the BLIM can be rephrased from a CD point of view. Gu and Xu (2019) showed that the DINA model is only identifiable if each attribute is used by at least three items (Condition 1 (ii); p. 471). By definition, this condition cannot be fulfilled by the single-attribute DINA model, as its Q-matrix is a $J \times J$ identity matrix, in which each attribute is required by exactly one item; hence, due to the equivalence of the two models, identifiability cannot hold for BLIM either. Within this context, an interesting question awaiting future research is whether PoLIM is identifiable.
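The argument about the identity Q-matrix can be illustrated with a small Python check. The sketch tests only the single requirement quoted above, namely that each attribute is required by at least three items; it is not a full identifiability check, and the function names are hypothetical.

```python
import numpy as np

def items_per_attribute(Q):
    """Number of items that require each attribute (column counts of Q)."""
    return (np.asarray(Q) > 0).sum(axis=0)

def meets_three_item_condition(Q):
    """True if every attribute is required by at least three items.

    This covers only the one condition quoted in the text, not the
    complete set of identifiability conditions.
    """
    return bool(np.all(items_per_attribute(Q) >= 3))

# The 5 x 5 identity Q-matrix of the single-attribute DINA model.
Q_identity = np.eye(5, dtype=int)
counts_identity = items_per_attribute(Q_identity)
identity_ok = meets_three_item_condition(Q_identity)

# A hypothetical Q-matrix in which each of two attributes is used three times.
Q_stacked = np.vstack([np.eye(2, dtype=int)] * 3)
stacked_ok = meets_three_item_condition(Q_stacked)
```

With the identity matrix every column count equals one, so the requirement fails regardless of how many items the test contains.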
Second, for single-attribute items, the choice of the particular DCM is immaterial, as they all result in the same parameter estimates and examinee classification, provided the attributes do not have a hierarchical structure. The items in the example from Doignon and Falmagne (1999), however, are hierarchically organized, and so are the attributes underlying these items (Tatsuoka, 2009). So, is it still true that any DCM using the $J \times J$ Q-matrix with only single-attribute items allows for reproducing the BLIM results, as was possible with the DINA model? In fact, this is not true. The data from Doignon and Falmagne (1999) were also fitted with the G-DINA model using the same $5 \times 5$ identity matrix as $Q$ that was used for the DINA model (with the same initial values), but, unlike with the DINA model, the BLIM estimates could not be reproduced when using the G-DINA. Said differently, which models are equivalent when attributes have a hierarchy, and which Q-matrices lead to identical results, is currently uncharted territory. Only for the DINA model can we claim with certainty that explicit and implicit Q-matrices (Akbay & de la Torre, 2020), the latter meaning the identity Q-matrix, lead to the same result when attributes are hierarchically organized. In fact, the $5 \times 5$ identity matrix used in the commentary as $Q$ for the DINA model to fit the data from Doignon and Falmagne (1999) is an implicit Q-matrix in the Akbay and de la Torre (2020) sense.

Appendix: The Polytomous DINA Model
The section "The Polytomous DINA Model" only describes the special case of single-attribute items. For multi-attribute items, the polytomous DINA is more complex. This appendix provides a description of the technical details of the polytomous DINA when used for modeling responses to polytomous items. Recall that the $q$-vector of a polytomous single-attribute item $j$ is written as $q_j^{(a)} = (0, \ldots, 0, q_{ja}, 0, \ldots, 0)$. Let $L_j$ denote the highest level of the polytomous ideal response $\xi_{ij}$, with different levels $l \in \{0, 1, \ldots, L_j\}$. For a single-attribute item having $q$-vector $q_j^{(a)}$, $L_j = q_{ja}$ is true, and so the ideal response is

$$\xi_{ij} = \max\{\, l \cdot I[\alpha_{ia} \geq l] : l = 0, 1, \ldots, L_j \,\}$$

So, for a single-attribute polytomous item, the relation between the levels $l$ and the item attribute vector $q$ is relatively straightforward; this is not the case for polytomous multi-attribute items, where the relation is far more complex and requires adjustment of the notation.

The "Star" Notation
First, consider the entries $q_{ja}$ of the item attribute vector $q_j = (q_{j1}, q_{j2}, \ldots, q_{jA})$ that document which attributes an examinee must have mastered to answer item $j$ correctly. Suppose item $j$ requires $A_j \leq A$ attributes. Also, assume that the entries of the item attribute vector have been rearranged such that the attributes required for item $j$ have been shifted to the first $A_j$ positions of $q_j$; the remaining entries beyond $A_j$ are all zero. The shuffled item attribute vector is denoted as $q_j^* = (q_{j1}^*, q_{j2}^*, \ldots, q_{jA_j}^*)$. Notice that the zero entries in $q_j$ have been deleted in $q_j^*$. Thus, distinct from the $q_{ja}$, the entries of $q_j^*$, $q_{ja}^*$, take on values ranging only from 1 to $H_a$, denoting the different levels of the attributes required for item $j$.

Polytomous Ideal Response Categories
Switching from $q_j$ to $q_j^*$ allows for expressing the number of categories of $\eta_{ij}$ (that is, of the polytomous ideal response to item $j$) as the product of the entries of $q_j^*$: define $L_j = q_{j1}^* q_{j2}^* \cdots q_{jA_j}^*$; then, the number of categories is $L_j + 1$, where the additional category refers to the zero response. The ideal response categories of $\eta_{ij}$ are indexed by $l = 0, 1, \ldots, L_j$. (As $\eta_{ij}$ is the latent counterpart of the manifest response, $L_j + 1$ also denotes the number of response categories of the observable random variable $Y_{ij}$.) Notice the terminology: "response categories," not "response levels." The latter would have implied ordered response categories. However, the general case considered here also includes non-ordered response categories; hence, the indices $l = 0, 1, \ldots, L_j$ should be interpreted merely as category labels.
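The bookkeeping behind the "star" notation and the category count can be sketched in Python. The sketch drops the zero entries of an item attribute vector to obtain the reduced vector, multiplies its entries to obtain $L_j$, and adds one for the zero category. The function names and the example vector are hypothetical.

```python
from math import prod

def star_vector(q):
    """Reduced item attribute vector: the nonzero entries of q, order kept."""
    return tuple(level for level in q if level > 0)

def top_category(q):
    """L_j, defined as the product of the entries of the reduced vector."""
    return prod(star_vector(q))

def n_categories(q):
    """Number of ideal (and observed) response categories: L_j + 1."""
    return top_category(q) + 1

# Hypothetical multi-attribute item requiring level 2 of the second
# attribute and level 3 of the fourth attribute.
q_multi = (0, 2, 0, 3)
q_star = star_vector(q_multi)
L_multi = top_category(q_multi)

# Single-attribute item from the main text: L_j equals q_ja.
L_single = top_category((0, 0, 3))
```

For a single-attribute item the product has a single factor, so the definition reduces to $L_j = q_{ja}$, in line with the special case treated in the main text.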

The Polytomous Ideal Item Response
With these preliminaries in place, the polytomous ideal item response can then be defined as