Incremental Model Fit Assessment in the Case of Categorical Data: Tucker–Lewis Index for Item Response Theory Modeling

The Tucker–Lewis index (TLI; Tucker & Lewis, 1973), also known as the non-normed fit index (NNFI; Bentler & Bonett, 1980), is one of the numerous incremental fit indices widely used in linear mean and covariance structure modeling, particularly in exploratory factor analysis, tools popular in prevention research. It augments information provided by other indices such as the root-mean-square error of approximation (RMSEA). In this paper, we develop and examine an analogous index for categorical item-level data modeled with item response theory (IRT). The proposed Tucker–Lewis index for IRT (TLIRT) is based on Maydeu-Olivares and Joe's (2005) $M_2$ family of limited-information overall model fit statistics. The limited-information fit statistics have significantly better chi-square approximation and power than traditional full-information Pearson or likelihood ratio statistics under realistic situations. Building on the incremental fit assessment principle, the TLIRT compares the fit of the model under consideration along a spectrum of worst to best possible model fit scenarios. We examine the performance of the new index using simulated and empirical data. Results from a simulation study suggest that the new index behaves as theoretically expected, and it can offer additional insights about model fit not available from other sources. In addition, a more stringent cutoff value is perhaps needed than Hu and Bentler's (1999) traditional cutoff criterion with continuous variables. In the empirical data analysis, we use a data set from a measurement development project in support of cigarette smoking cessation research to illustrate the usefulness of the TLIRT.
We noticed that had we only utilized the RMSEA index, we could have arrived at qualitatively different conclusions about model fit, depending on the choice of test statistics, an issue to which the TLIRT is relatively more immune.

Supplementary Information: The online version of this article (10.1007/s11121-021-01253-4) contains supplementary material, which is available to authorized users.

In $M_2$, the marginal residuals up to order 2 are used. For $n$ dichotomous items, there are $n$ first-order marginal residuals and $n(n-1)/2$ second-order marginal residuals.
They are obtained via operator matrices. Let $\dot{L}$ be an $n \times C$ fixed matrix consisting of zeros and ones. The $(i, c)$th element of $\dot{L}$ is one if and only if item $i$ is endorsed in the $c$th response pattern. Similarly, let $\ddot{L}$ be an $n(n-1)/2 \times C$ fixed matrix of zeros and ones. Each row of this matrix corresponds to an item pair, and for each row, column $c$ is equal to one if and only if both items in the pair are endorsed in the $c$th response pattern.
Pre-multiplying $e$ by $\dot{L}$ leads to $\dot{e} = \dot{L}e$, the $n \times 1$ vector of first-order marginal residuals.
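Similarly, pre-multiplying $e$ by $\ddot{L}$ leads to $\ddot{e} = \ddot{L}e$, the $n(n-1)/2 \times 1$ vector of second-order marginal residuals. Stacking $\dot{e}$ and $\ddot{e}$, the vector of marginal residuals up to order 2 is $e_2 = (\dot{e}', \ddot{e}')' = L_2 e$, where $L_2 = (\dot{L}', \ddot{L}')'$ stacks $\dot{L}$ and $\ddot{L}$. Because $e_2$ is a linear function of $e$, the asymptotic distribution of $e_2$ is also normal, with asymptotic covariance matrix $L_2 \Omega L_2'$. To simplify the notation, let $\Omega_2 = L_2 \Omega L_2'$. The $M_2$ statistic is the quadratic form

$$M_2 = N e_2' \left[ \hat{\Xi}_2^{-1} - \hat{\Xi}_2^{-1} \hat{\Delta}_2 \left( \hat{\Delta}_2' \hat{\Xi}_2^{-1} \hat{\Delta}_2 \right)^{-1} \hat{\Delta}_2' \hat{\Xi}_2^{-1} \right] e_2,$$

where $N$ is the sample size, and $\hat{\Delta}_2$ and $\hat{\Xi}_2$ denote the evaluation of $\Delta_2$ and $\Xi_2$ at the maximum likelihood estimate $\hat{\theta}$. It follows from Proposition 4 in Browne (1984) that $M_2$ is asymptotically chi-square distributed with $n(n+1)/2 - \dim(\theta)$ degrees of freedom under the null hypothesis that the model fits exactly in the population.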
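To make the operator matrices concrete, the following Python sketch builds the first- and second-order operator matrices for $n = 3$ dichotomous items and collapses a vector of cell residuals into marginal residuals. The function name `operator_matrices`, the lexicographic ordering of the response patterns, and the cell proportions are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations, product

import numpy as np


def operator_matrices(n):
    """Return (L_dot, L_ddot) for n dichotomous items.

    Columns follow the lexicographic order of the C = 2**n response
    patterns (an assumed ordering; any fixed ordering works).
    """
    patterns = np.array(list(product([0, 1], repeat=n)))  # C x n
    # Row i flags the patterns in which item i is endorsed.
    L_dot = patterns.T.copy()
    # One row per item pair, flagging patterns in which both are endorsed.
    pairs = list(combinations(range(n), 2))
    L_ddot = np.array([patterns[:, i] * patterns[:, j] for i, j in pairs])
    return L_dot, L_ddot


n = 3
L_dot, L_ddot = operator_matrices(n)
L2 = np.vstack([L_dot, L_ddot])  # n(n + 1)/2 x C

# Hypothetical observed and model-implied cell proportions (C = 8 cells).
p = np.array([0.20, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.20])
pi_hat = np.full(8, 0.125)

e = p - pi_hat  # cell residuals
e2 = L2 @ e     # marginal residuals up to order 2
print(e2)
```

In this toy example the univariate margins fit exactly, so all of the misfit shows up in the three second-order residuals.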
For $n$ polytomous items with $K$ categories, the total number of possible response patterns becomes $C = K^n$. Consequently, the vectors of multinomial cell probabilities $\pi(\theta)$ and $\hat{\pi}$ are $C \times 1$ in size. In limited-information goodness-of-fit testing, expanded operator matrices $\dot{L}$ and $\ddot{L}$ collapse cell probabilities into univariate and bivariate marginal probabilities or residual vectors, as detailed in Cai and Hansen (2013), among others.
The asymptotic distribution theory of residuals under polytomous data can be derived along the same lines as Equation (2), enabling the formation of chi-square statistics similar to $M_2$ in Equation (3).
Again, with no loss of generality, we illustrate the computations related to the full-independence model under dichotomous data. Recall that the 2PL model is

$$P(U_i = 1 \mid \eta) = \frac{1}{1 + \exp[-(\alpha_i + \beta_i'\eta)]},$$

where $\alpha_i$ is the intercept term and $\beta_i$ a potentially vector-valued item slope parameter conformable with the dimensions of $\eta$. For the complete-independence null model, no latent variable is present, so that

$$P(U_i = 1) = \frac{1}{1 + \exp(-\alpha_i)}.$$

Note that $P(U_i = 0) = 1 - P(U_i = 1)$. Now let us turn to the computation of the $M_2$ statistic for the independence model.
The required components are the first- and second-order marginal probabilities, the Jacobian, and the multinomial covariance matrix. As a practical matter, the fit statistics for the independence model are implemented in flexMIRT® (Cai, 2015). We note that $M_2$ for the independence model also allows us to compute other relative fit indices based on $M_2$, such as the comparative fit index (CFI; Bentler, 1990), the normed fit index (NFI; Bentler & Bonett, 1980), and the incremental fit index (IFI; Bollen, 1989).
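As a rough illustration of how the null-model statistic feeds into incremental fit indices, the sketch below plugs a pair of $M_2$ values into the textbook TLI, CFI, NFI, and IFI formulas. It assumes that the $M_2$-based indices simply substitute $M_2$ and its degrees of freedom for the usual chi-square quantities; the function name `incremental_fit_indices` and the numerical inputs are hypothetical.

```python
def incremental_fit_indices(stat_model, df_model, stat_null, df_null):
    """Textbook incremental fit indices from a fitted model and a null model.

    stat_* are chi-square-type statistics (here assumed to be M2 values)
    and df_* their degrees of freedom.
    """
    ratio_null = stat_null / df_null
    ratio_model = stat_model / df_model
    tli = (ratio_null - ratio_model) / (ratio_null - 1.0)
    nfi = (stat_null - stat_model) / stat_null
    ifi = (stat_null - stat_model) / (stat_null - df_model)
    # CFI truncates the estimated noncentrality parameters at zero.
    d_model = max(stat_model - df_model, 0.0)
    d_null = max(stat_null - df_null, d_model)
    cfi = 1.0 - d_model / d_null if d_null > 0 else 1.0
    return {"TLI": tli, "CFI": cfi, "NFI": nfi, "IFI": ifi}


# Hypothetical values for n = 8 dichotomous items: the independence model
# has 8 * 7 / 2 = 28 degrees of freedom, and a unidimensional 2PL model
# with 16 parameters has 36 - 16 = 20.
indices = incremental_fit_indices(30.0, 20, 500.0, 28)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Because the TLI divides each statistic by its degrees of freedom, it penalizes model complexity in a way that the NFI does not.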

Marginal probabilities
Let us begin with the computation of the first- and second-order marginal probabilities. Again, without loss of generality, we only consider dichotomous items. Let there be $n$ items scored $U_i = 0$ or 1. Let $\dot{\pi}_i$ denote the first-order marginal probability for item $i$ and $\ddot{\pi}_{ij}$ the second-order marginal probability for item pair $(i, j)$. Accordingly, $\dot{\pi}_i$ is the model-implied probability that $U_i = 1$. Likewise, $\ddot{\pi}_{ij}$ is the model-implied joint probability that $U_i = 1$ and $U_j = 1$. Both can be computed directly from the model. In our independence model, the first-order marginal probability when $U_i = 1$ becomes

$$\dot{\pi}_i = P(U_i = 1) = \frac{1}{1 + \exp(-\alpha_i)}.$$

It follows that, at the maximum likelihood estimate, $\dot{\pi}_i = \dot{p}_i$, where $\dot{p}_i$ is the observed counterpart. This is because the intercepts perfectly fit the observed univariate proportions. Once this is established, the second-order marginal probabilities can be obtained simply as the products of two observed univariate proportions, i.e., $\ddot{\pi}_{ij} = \dot{p}_i \dot{p}_j$.
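The computation just described can be sketched in a few lines of Python. The data matrix `U` is made up, and the function name `independence_marginals` is an illustrative choice; the sketch also assumes every item has an observed proportion strictly between 0 and 1 so that the intercepts are finite.

```python
from itertools import combinations

import numpy as np


def independence_marginals(U):
    """Marginal probabilities implied by the fitted independence model.

    U is an N x n array of 0/1 item responses.
    """
    U = np.asarray(U, dtype=float)
    pairs = list(combinations(range(U.shape[1]), 2))
    # First-order margins: the fitted values equal the observed proportions.
    pi_dot = U.mean(axis=0)
    # Intercepts that reproduce those proportions (inverse logistic).
    alpha = np.log(pi_dot / (1.0 - pi_dot))
    # Second-order margins: products of univariate proportions.
    pi_ddot = np.array([pi_dot[i] * pi_dot[j] for i, j in pairs])
    # Observed second-order proportions, for comparison.
    p_ddot = np.array([np.mean(U[:, i] * U[:, j]) for i, j in pairs])
    return alpha, pi_dot, pi_ddot, p_ddot


# Hypothetical responses of four people to three items.
U = [[1, 1, 0],
     [1, 0, 0],
     [0, 1, 1],
     [1, 1, 1]]
alpha, pi_dot, pi_ddot, p_ddot = independence_marginals(U)
print(pi_dot)            # fitted first-order margins
print(p_ddot - pi_ddot)  # second-order marginal residuals
```

The first-order residuals are zero by construction, so only the second-order residuals carry information about misfit of the independence model.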

The Jacobian
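Next, the Jacobian, $\Delta_2$, for the independence model can be expressed as

$$\Delta_2 = \begin{pmatrix} \partial \dot{\pi} / \partial \alpha' \\ \partial \ddot{\pi} / \partial \alpha' \end{pmatrix},$$

where $\dot{\pi}$ and $\ddot{\pi}$ collect the first- and second-order marginal probabilities and $\alpha = (\alpha_1, \ldots, \alpha_n)'$. Note that we only have to take the first-order derivatives of the model-implied marginal probabilities with respect to the intercepts. A typical element in the upper block is

$$\frac{\partial \dot{\pi}_i}{\partial \alpha_i} = \dot{\pi}_i (1 - \dot{\pi}_i),$$

with $\partial \dot{\pi}_i / \partial \alpha_k = 0$ for $k \neq i$, and those of the lower block are

$$\frac{\partial \ddot{\pi}_{ij}}{\partial \alpha_i} = \dot{\pi}_i (1 - \dot{\pi}_i)\, \dot{\pi}_j, \qquad \frac{\partial \ddot{\pi}_{ij}}{\partial \alpha_j} = \dot{\pi}_i\, \dot{\pi}_j (1 - \dot{\pi}_j),$$

with $\partial \ddot{\pi}_{ij} / \partial \alpha_k = 0$ for $k \notin \{i, j\}$.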
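A minimal sketch of this Jacobian follows, assuming a logistic intercept-only model for each item and second-order margins that are products of first-order margins, as in the independence model above. The function name `independence_jacobian` and the choice of intercepts are illustrative.

```python
from itertools import combinations

import numpy as np


def independence_jacobian(alpha):
    """Jacobian of the marginal probabilities with respect to the intercepts.

    Rows: n first-order margins, then n(n - 1)/2 second-order margins.
    Columns: the n intercepts.
    """
    alpha = np.asarray(alpha, dtype=float)
    pi_dot = 1.0 / (1.0 + np.exp(-alpha))
    slope = pi_dot * (1.0 - pi_dot)  # derivative of the logistic function
    n = alpha.size
    pairs = list(combinations(range(n), 2))
    lower = np.zeros((len(pairs), n))
    for row, (i, j) in enumerate(pairs):
        # Product rule on pi_i * pi_j; every other column stays zero.
        lower[row, i] = slope[i] * pi_dot[j]
        lower[row, j] = pi_dot[i] * slope[j]
    # The upper block is diagonal: margin i depends only on intercept i.
    return np.vstack([np.diag(slope), lower])


Delta2 = independence_jacobian([0.0, 0.0, 0.0])
print(Delta2)
```

The result has full column rank whenever every fitted proportion lies strictly between 0 and 1, since the upper block is then a diagonal matrix with positive entries.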

The multinomial covariance matrix
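The multinomial covariance matrix $\Xi_2$ may be rewritten as (see Cai & Hansen, 2013):

$$\Xi_2 = \Sigma - \pi_2 \pi_2',$$

where $\pi_2 = L_2 \pi$ is the vector of marginal probabilities up to order 2 and $\Sigma = L_2\, \mathrm{diag}(\pi)\, L_2'$, which can be partitioned into $\Sigma_{11}$, $\Sigma_{21}$, and $\Sigma_{22}$ as follows:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{21}' \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} \dot{L}\, \mathrm{diag}(\pi)\, \dot{L}' & \dot{L}\, \mathrm{diag}(\pi)\, \ddot{L}' \\ \ddot{L}\, \mathrm{diag}(\pi)\, \dot{L}' & \ddot{L}\, \mathrm{diag}(\pi)\, \ddot{L}' \end{pmatrix}.$$

In short, calculating the elements of $\Sigma$ involves calculating the first-, second-, third-, and fourth-order marginal probabilities. Because of full independence, the model-implied probabilities for the third- and fourth-order margins can be obtained as products of univariate proportions.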
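Putting the three components together, the sketch below computes $M_2$ for the independence model by brute force over all $2^n$ response patterns. It assumes the standard quadratic form of Maydeu-Olivares and Joe (2005), $M_2 = N e_2' [\Xi_2^{-1} - \Xi_2^{-1} \Delta_2 (\Delta_2' \Xi_2^{-1} \Delta_2)^{-1} \Delta_2' \Xi_2^{-1}] e_2$, and is practical only for a small number of items. The function name `independence_m2` and the simulated data are hypothetical, and this is not how flexMIRT® implements the statistic.

```python
from itertools import combinations, product

import numpy as np


def independence_m2(U):
    """M2 and its components for the complete-independence model."""
    U = np.asarray(U, dtype=float)
    N, n = U.shape
    pairs = list(combinations(range(n), 2))
    patterns = np.array(list(product([0, 1], repeat=n)), dtype=float)  # C x n

    # Operator matrix: n first-order rows on top of the second-order rows.
    L2 = np.vstack([patterns.T] + [patterns[:, i] * patterns[:, j] for i, j in pairs])

    # Fitted univariate proportions equal the observed ones; cell
    # probabilities are products over items.
    p_dot = U.mean(axis=0)
    pi = np.prod(np.where(patterns == 1, p_dot, 1.0 - p_dot), axis=1)
    pi2 = L2 @ pi

    # Multinomial covariance matrix of the margins: Sigma - pi2 pi2'.
    Sigma = L2 @ np.diag(pi) @ L2.T
    Xi2 = Sigma - np.outer(pi2, pi2)

    # Jacobian via the cells: d pi_c / d alpha_i = pi_c * (u_ci - pi_i).
    Delta2 = L2 @ (pi[:, None] * (patterns - p_dot))

    # Marginal residuals (the first n are zero by construction).
    p2 = np.concatenate([p_dot, [np.mean(U[:, i] * U[:, j]) for i, j in pairs]])
    e2 = p2 - pi2

    Xi_inv = np.linalg.inv(Xi2)
    A = Xi_inv @ Delta2
    C2 = Xi_inv - A @ np.linalg.inv(Delta2.T @ A) @ A.T
    m2 = float(N * e2 @ C2 @ e2)
    df = n * (n - 1) // 2  # n(n + 1)/2 margins minus n intercepts
    return m2, df, Sigma, Xi2, Delta2


# Simulated independent responses of 500 people to four items.
rng = np.random.default_rng(0)
U = (rng.random((500, 4)) < 0.6).astype(int)
m2, df, Sigma, Xi2, Delta2 = independence_m2(U)
print(f"M2 = {m2:.3f} on {df} degrees of freedom")
```

Forming $\Sigma$ through the full vector of cell probabilities is the simplest way to see its structure; with many items one would instead fill its entries directly from products of univariate proportions, as the text describes.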