Abstract
We propose a nonparametric item response theory model for dichotomously scored items in a Bayesian framework. The model is based on a latent class (LC) formulation, and it is multidimensional, with dimensions corresponding to a partition of the items into homogeneous groups that are specified on the basis of inequality constraints among the conditional success probabilities given the latent class. Moreover, an innovative system of prior distributions is proposed following the encompassing approach, in which the largest model is the unconstrained LC model. A reversible-jump type algorithm is described for sampling from the joint posterior distribution of the model parameters of the encompassing model. By suitably post-processing its output, we then make inference on the number of dimensions (i.e., the number of groups of items measuring the same latent trait) and we cluster items according to the dimensions when unidimensionality is violated. The approach is illustrated by two examples based on simulated data and two applications based on educational and quality-of-life data.
References
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csaki (Eds.), Second international symposium on information theory (pp. 267–281). Budapest: Akademiai Kiado.
Bacci, S., & Bartolucci, F. (2016). Two-tier latent class IRT models in R. The R Journal, 8, 139–166.
Bacci, S., Bartolucci, F., & Gnaldi, M. (2014). A class of multidimensional latent class IRT models for ordinal polytomous item responses. Communications in Statistics – Theory and Methods, 43, 787–800.
Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141–157.
Bartolucci, F., Bacci, S., & Gnaldi, M. (2015). Statistical analysis of questionnaires: A unified approach based on Stata and R. Boca Raton, FL: Chapman and Hall/CRC Press.
Bartolucci, F., Bacci, S., & Gnaldi, M. (2016). MultiLCIRT: Multidimensional latent class item response theory models. R package version 2.10. https://cran.r-project.org/web/packages/MultiLCIRT/index.html.
Bartolucci, F., & Forcina, A. (2005). Likelihood inference on the underlying structure of IRT models. Psychometrika, 70, 31–43.
Bartolucci, F., Scaccia, L., & Farcomeni, A. (2012). Bayesian inference through encompassing priors and importance sampling for a class of marginal models for categorical data. Computational Statistics and Data Analysis, 56, 4067–4080.
Béguin, A. A., & Glas, C. A. W. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541–561.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading, MA: AddisonWesley.
Bock, R., Gibbons, R. D., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395–414.
Chalmers, P., Pritikin, J., Robitzsch, A., Zoltak, M., Kim, K., Falk, C. F., & Meade, A. (2017). mirt: Multidimensional item response theory. R package version 1.23. https://cran.r-project.org/web/packages/mirt/index.html.
Christensen, K. B., Bjorner, J. B., Kreiner, S., & Petersen, J. H. (2002). Testing unidimensionality in polytomous Rasch models. Psychometrika, 67, 563–574.
Costantini, M., Musso, M., Viterbori, P., Bonci, F., Del Mastro, L., Garrone, O., et al. (1999). Detecting psychological distress in cancer patients: Validity of the Italian version of the hospital anxiety and depression scale. Supportive Care in Cancer, 7, 121–127.
Diebolt, J., & Robert, C. (1994). Estimation of finite mixture distributions through Bayesian sampling. Journal of the Royal Statistical Society, Series B, 56, 363–375.
Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61, 215–231.
Graham, R. L., Knuth, D. E., & Patashnik, O. (1988). Concrete mathematics: A foundation for computer science. Reading, MA: AddisonWesley.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711–732.
Green, P. J., & Richardson, S. (2002). Hidden Markov models and disease mapping. Journal of the American Statistical Association, 97, 1055–1070.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer Nijhoff.
Hoijtink, H., & Molenaar, I. W. (1997). A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks. Psychometrika, 62, 171–189.
Hurvich, C. M., & Tsai, C.-L. (1989). Regression and time series model selection in small samples. Biometrika, 76, 297–307.
Junker, B. W., & Sijtsma, K. (2001). Nonparametric item response theory in action: An overview of the special issue. Applied Psychological Measurement, 25, 211–220.
Karabatsos, G. (2001). The Rasch model, additive conjoint measurement, and new models of probabilistic measurement theory. Journal of Applied Measurement, 2, 389–423.
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.
Klugkist, I., Kato, B., & Hoijtink, H. (2005). Bayesian model selection using encompassing priors. Statistica Neerlandica, 59, 57–69.
Kuo, T., & Sheng, Y. (2015). Bayesian estimation of a multi-unidimensional graded response IRT model. Behaviormetrika, 42, 79–94.
Lazarsfeld, P. F., & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton Mifflin.
Lindley, D. V. (1957). A statistical paradox. Biometrika, 44, 187–192.
Lindsay, B., Clogg, C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86, 96–107.
Martin-Löf, P. (1973). Statistiska modeller [Statistical models]. Stockholm: Institutet för Försäkringsmatematik och Matematisk Statistik vid Stockholms Universitet.
Pan, J. C., & Huang, G. H. (2014). Bayesian inferences of latent class models with an unknown number of classes. Psychometrika, 79, 621–646.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4, 321–333.
Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Tierney, L. (1994). Markov chains for exploring posterior distributions. Annals of Statistics, 22, 1701–1762.
Tuyl, F., Gerlach, R., & Mengersen, K. (2009). Posterior predictive arguments in favor of the Bayes–Laplace prior as the consensus prior for binomial and multinomial parameters. Bayesian Analysis, 4, 151–158.
Van Onna, M. J. H. (2002). Bayesian estimation and model selection in ordered latent class models for polytomous items. Psychometrika, 67, 519–538.
Verhelst, N. D. (2001). Testing the unidimensionality assumption of the Rasch model. Methods of Psychological Research Online, 6, 231–271.
Vermunt, J. K. (2001). The use of restricted latent class models for defining and testing nonparametric and parametric item response theory models. Applied Psychological Measurement, 25, 283–294.
von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61, 287–307.
Zigmond, A. S., & Snaith, R. P. (1983). The hospital anxiety and depression scale. Acta Psychiatrica Scandinavica, 67, 361–370.
Appendix
Prior Probabilities of \(\mathcal{P}\) and s
In this section, we show how to calculate the prior probability of any partition \(\mathcal{P}\) and the prior probability of the number s of groups, conditionally on k, under independent uniform priors on the success probabilities \(\lambda _{jc}\). Under the more general priors of type (6), analytical derivation of the priors for \(\mathcal{P}\) and s becomes prohibitive, and resorting to simulation is unavoidable.
We start by calculating the conditional prior probability of a single partition \(\mathcal{P}\). To this aim, it is convenient to associate with each item j the ranking vector \({\varvec{q}}_j=(q_{j1},\ldots ,q_{jk})\), in which \(q_{jc}\) is the rank of \(\lambda _{jc}\) after ordering \(\lambda _{j1},\ldots ,\lambda _{jk}\) in increasing order. Let \({\varvec{Q}}=({\varvec{q}}_1',\ldots ,{\varvec{q}}_r')\) be the \(k\times r\) matrix with the vectors \({\varvec{q}}_j\) arranged by columns. Each vector \({\varvec{q}}_{j_1}\) can take one out of k! possible configurations, independently of any other \({\varvec{q}}_{j_2}\), for \(j_1\ne j_2\). Thus, there are \((k!)^r\) possible configurations of the matrix \({\varvec{Q}}\), all having the same probability under a uniform prior on the success probabilities \(\lambda _{jc}\). Notice that \({\varvec{Q}}\) is a function of \(\varvec{\Lambda }\).
The number of matrices \({\varvec{Q}}\) determining the same partition \(\mathcal{P}\) is
$$\begin{aligned} k!(k!-1)\cdots (k!-s+1), \end{aligned}$$where \(s=|\mathcal{P}|\) is the number of groups in the partition. This is easily proved by considering that the ranking vector of the first item in the first group can have any of the k! possible configurations, which consequently determines the configuration of the ranking vectors of the other items in the same group; the ranking vector of the first item in the second group can have any possible configuration except that of the items in the first group, that is, \(k!-1\) possible configurations. By iterating this process, we conclude that the ranking vector of the first item in the sth group can have any possible configuration except those of the items in the previous groups, that is, \(k!-s+1\) possible configurations. Therefore, the probability of any partition \(\mathcal{P}\), conditionally on k, is given by
$$\begin{aligned} p(\mathcal{P}\mid k)=\frac{k!(k!-1)\cdots (k!-s+1)}{(k!)^r}, \end{aligned}$$(15)
where the denominator corresponds to the number of possible configurations of \({\varvec{Q}}\).
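As a numerical sanity check, (15) is simple to evaluate directly. The following is a minimal Python sketch; the function name and interface are ours, not part of the paper:

```python
from math import factorial

def partition_prob(s, k, r):
    """Prior probability (15) of a single partition of r items into s
    groups, conditionally on k latent classes, under uniform priors on
    the success probabilities."""
    kf = factorial(k)
    num = 1
    for g in range(s):      # falling factorial k!(k!-1)...(k!-s+1)
        num *= kf - g
    return num / kf**r      # (k!)^r equally likely configurations of Q
```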
We now consider the prior conditional probability of the number s of groups, that is, \(p(s\mid k)\). Since any partition into s groups has the same probability, given in (15), to obtain \(p(s\mid k)\) we simply have to count how many partitions into s groups can be obtained and multiply this number by the probability of the single partition. The number of ways in which a set of r items can be partitioned into s non-empty groups is known as the Stirling number of the second kind (Graham et al., 1988), which is equal to
$$\begin{aligned} S_{r,s}=\frac{1}{s!}\sum _{t=0}^{s}(-1)^t \left( {\begin{array}{c}s\\ t\end{array}}\right) (s-t)^r, \end{aligned}$$with
$$\begin{aligned} \left( {\begin{array}{c}s\\ t\end{array}}\right) =\frac{s!}{t!\,(s-t)!}. \end{aligned}$$Thus, the conditional probability of s can be obtained as
$$\begin{aligned} p(s\mid k)=S_{r,s}\, p(\mathcal{P}\mid k), \end{aligned}$$where \(p(\mathcal{P}\mid k)\) is the probability of a partition \(\mathcal{P}\) such that \(|\mathcal{P}|=s\), which is given in (15).
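These quantities are easy to compute for moderate values of r and k. A minimal Python sketch, reusing partition_prob from the block above (names are ours):

```python
from math import comb, factorial

def stirling2(r, s):
    """Stirling number of the second kind: the number of ways to
    partition r items into s non-empty groups (Graham et al., 1988)."""
    total = sum((-1)**t * comb(s, t) * (s - t)**r for t in range(s + 1))
    return total // factorial(s)   # the sum is always divisible by s!

def prob_s_given_k(s, k, r):
    """Prior probability p(s | k): the number of partitions into s
    groups times the probability (15) of any single such partition."""
    return stirling2(r, s) * partition_prob(s, k, r)
```

A convenient check: \(\sum _{s=1}^{r} p(s\mid k)=1\), which follows from the standard identity \(x^r=\sum _s S_{r,s}\,x(x-1)\cdots (x-s+1)\) evaluated at \(x=k!\).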
The RJMCMC Algorithm
In this section we provide a detailed description of the RJMCMC algorithm implemented in this paper. We first illustrate the fixed-dimension moves and then present the changing dimension moves for updating the number of latent classes.
Fixed-Dimension Moves
Model parameters are updated according to the following steps:

Update class weights. The full conditional distribution of \(\varvec{\pi }\) is a Dirichlet distribution with parameters \((1+n_{1},\ldots ,1+ n_{k})\), where \(n_c = \#\{i:z_{ic}=1\}\). The vector \(\varvec{\pi }\) is then updated by means of Gibbs sampling, drawing a new vector from such a distribution.
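In code, this step is one line once the class counts are available. A minimal sketch, assuming Z is the \(n\times k\) matrix of 0/1 allocation variables and rng is a numpy Generator (names are ours):

```python
import numpy as np

def update_weights(Z, rng):
    """Gibbs step for the class weights: draw pi from its full
    conditional Dirichlet(1 + n_1, ..., 1 + n_k)."""
    n_counts = Z.sum(axis=0)             # n_c = #{i : z_ic = 1}
    return rng.dirichlet(1.0 + n_counts)
```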

Update latent class variables. The latent class allocation variables \(z_{ic}\), \(i=1,\ldots ,n\), \(c=1,\ldots ,k\), can be updated by means of Gibbs sampling, drawing them independently from their full conditional
$$\begin{aligned} p(z_{ic}=1\mid \cdots ) = \frac{\pi _c \prod _{j=1}^r (1-\lambda _{jc})^{1-y_{ij}}\lambda _{jc}^{y_{ij}}}{\sum _{d=1}^k \pi _d \prod _{j=1}^r (1-\lambda _{jd})^{1-y_{ij}}\lambda _{jd}^{y_{ij}}}, \end{aligned}$$where “\(\cdots \)” denotes “all other parameters and data.”
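A vectorized sketch of this step, under the same conventions as above (Y is the \(n\times r\) binary data matrix, Lam the \(r\times k\) matrix of the \(\lambda _{jc}\); names are ours):

```python
import numpy as np

def update_allocations(Y, pi, Lam, rng):
    """Gibbs step for the allocations: evaluate the full conditional
    p(z_ic = 1 | ...) on the log scale for numerical stability and
    sample one class per subject."""
    n, k = Y.shape[0], len(pi)
    # log pi_c + sum_j [y_ij log lambda_jc + (1 - y_ij) log(1 - lambda_jc)]
    logp = np.log(pi) + Y @ np.log(Lam) + (1 - Y) @ np.log(1 - Lam)
    w = np.exp(logp - logp.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    Z = np.zeros((n, k), dtype=int)
    for i in range(n):
        Z[i, rng.choice(k, p=w[i])] = 1
    return Z
```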

Update the success probabilities. In order to update the success probabilities, we build a proposal based on independent zero-centered normal increments of \(\mathrm{logit}\,(\lambda _{jc})\), separately for each \(j=1,\ldots ,r\) and \(c=1,\ldots ,k\). The candidate \(\lambda _{jc}^{\star }\) is accepted with probability \(\min (1,p_{\lambda _{jc}^\star })\), where
$$\begin{aligned} p_{\lambda _{jc}^\star }=\prod _{i=1}^n&\{ (\lambda _{jc}^\star /\lambda _{jc})^{y_{ij}}[(1-\lambda _{jc}^\star )/(1-\lambda _{jc})]^{(1-y_{ij})}\}^{z_{ic}}\nonumber \\&\times (\lambda _{jc}^\star /\lambda _{jc})^{(\alpha _c-1)}[(1-\lambda _{jc}^\star )/(1-\lambda _{jc})]^{(\beta _c-1)}\nonumber \\&\times (\lambda _{jc}^\star )(1-\lambda _{jc}^\star )/[(\lambda _{jc})(1-\lambda _{jc})]. \end{aligned}$$(16)The first line on the right-hand side is the likelihood ratio, while the second line corresponds to the ratio between the prior densities. The ratio between the proposal densities cancels out, apart from the Jacobian of the logit transformation, given in the third line of (16).
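A sketch of this Metropolis step on the log scale; step is the standard deviation of the normal increment, and alpha and beta_par hold the prior parameters \(\alpha _c\) and \(\beta _c\) (names are ours):

```python
import numpy as np

def update_lambda(Y, Z, Lam, alpha, beta_par, step, rng):
    """Metropolis update of each lambda_jc: propose a zero-centered
    normal increment on logit(lambda_jc) and accept with the ratio
    in (16): likelihood x prior x Jacobian of the logit transform."""
    r, k = Lam.shape
    for j in range(r):
        for c in range(k):
            lam = Lam[j, c]
            logit_new = np.log(lam / (1 - lam)) + step * rng.standard_normal()
            lam_new = 1.0 / (1.0 + np.exp(-logit_new))
            y = Y[Z[:, c] == 1, j]          # item j for subjects in class c
            log_ratio = (
                y.sum() * np.log(lam_new / lam)
                + (len(y) - y.sum()) * np.log((1 - lam_new) / (1 - lam))
                + (alpha[c] - 1) * np.log(lam_new / lam)                  # prior
                + (beta_par[c] - 1) * np.log((1 - lam_new) / (1 - lam))
                + np.log((lam_new * (1 - lam_new)) / (lam * (1 - lam)))   # Jacobian
            )
            if np.log(rng.uniform()) < log_ratio:
                Lam[j, c] = lam_new
    return Lam
```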
Changing Dimension Moves for Updating k

Split/merge move. For the combine proposal, we pick a class at random among \(2,\ldots ,k\), with probability \(1/(k-1)\), and denote it by \(c_2\). Then, we draw another class at random among \(1,\ldots ,c_2-1\), with probability \(1/(c_2-1)\), and denote it by \(c_1\). Classes \(c_1\) and \(c_2\) are then merged into a new class \(c^\prime \), decreasing k by 1, with the merged class \(c^\prime \) occupying the place \(c_2-1\), once the place \(c_1\) has been deleted. We then create new values for \(\pi _{c^\prime }\) and \(\lambda _{jc^\prime }\), for \(j=1,\ldots ,r\), and we reallocate to the merged class \(c^\prime \) all those observations for which \(z_{ic_1}=1\) or \(z_{ic_2}=1\). A new vector of weights is created by letting
$$\begin{aligned} \pi _{c^\prime }=\pi _{c_1}+\pi _{c_2}. \end{aligned}$$The new parameters \(\lambda _{jc^\prime }\) are created in such a way that \(\lambda _{jc^\prime }=\lambda _{jc_2}\), while the \(\lambda _{jc_1}\) are simply deleted for all j. The split proposal begins by choosing a class at random among \(1,\ldots ,k\), say \(c^\prime \), with probability \(1/k\), and splitting it into two new classes, labeled \(c_1\) and \(c_2\), augmenting k by 1. Let \(c_2\) take the place \(c^\prime +1\) and \(c_1\) be inserted in a place preceding \(c_2\), chosen at random with probability \(1/c^\prime \). We then need to create new values for \(\pi _{c_1},\pi _{c_2}\) and for \(\lambda _{jc_1},\lambda _{jc_2}\), \(j=1,\ldots ,r\), and to reallocate all those observations for which \(z_{ic^\prime }=1\), while the hyperparameters \(\alpha _c\) and \(\beta _c\) are simply recalculated as \(\alpha _c^\star = vc+1\) and \( \beta _c^\star =v(k^\star +1-c)+1\), for \(c=1,\ldots ,k^\star \), with \(k^\star =k+1\). We start by splitting the weight \(\pi _{c^\prime }\) into \(\pi _{c_1}\) and \(\pi _{c_2}\) in such a way that \(\pi _{c_1}+\pi _{c_2}=\pi _{c^\prime }\). We accomplish this by generating a random value \(u\sim \hbox {Beta}\left( u_1,u_2\right) \), where \(u_1\) and \(u_2\) are the parameters of the Beta density, and setting
$$\begin{aligned} \pi _{c_1}=\pi _{c^\prime }u \quad \hbox {and}\quad \pi _{c_2}=\pi _{c^\prime }(1-u). \end{aligned}$$The new vector of weights is denoted by \(\varvec{\pi }^\star \). We then split the parameters \(\lambda _{jc^\prime }\) into \(\lambda _{jc_1}\) and \(\lambda _{jc_2}\) for all j. We accomplish this by sampling a vector \({\varvec{w}}=(w_1,\ldots ,w_r)\) from the prior distribution of \(\lambda _{jc_1}\), that is, \(w_j\sim \hbox {Beta}(\alpha ^\star _{c_1},\beta _{c_1}^\star )\), for \(j=1,\ldots ,r\), and setting
$$\begin{aligned} \lambda _{jc_1}=w_j \quad \hbox {and}\quad \lambda _{jc_2}=\lambda _{jc^\prime }, \quad j=1,\ldots ,r. \end{aligned}$$The new matrix of success probabilities is denoted by \(\varvec{\Lambda }^\star \). Finally, we reallocate between \(c_1\) and \(c_2\) all those observations for which \(z_{ic^\prime }=1\), in a way analogous to the standard Gibbs allocation move, and we let \({\varvec{Z}}^\star \) denote the new allocation matrix. According to the RJ framework, the acceptance probability for the split move is \(\min (1,p_k)\), where
$$\begin{aligned} p_k= & {} \prod _{i=1}^n\prod _{j=1}^r \frac{[ (\lambda _{jc_1}^\star )^{y_{ij}} (1-\lambda _{jc_1}^\star )^{(1-y_{ij})}]^{z_{ic_1}^\star } [ (\lambda _{jc_2}^\star )^{y_{ij}} (1-\lambda _{jc_2}^\star )^{(1-y_{ij})}]^{z_{ic_2}^\star }}{[ (\lambda _{jc^\prime })^{y_{ij}} (1-\lambda _{jc^\prime })^{(1-y_{ij})}]^{z_{ic^\prime }}}\nonumber \\&\times \frac{p(k^\star )}{p(k)}\times \frac{\mathcal {D}(\pi _1^\star ,\ldots ,\pi _{k^\star }^\star )}{\mathcal {D}(\pi _1,\ldots ,\pi _{k})}\times \frac{(\pi _{c_1}^\star )^{\sum _{i=1}^n z_{ic_1}^\star }(\pi _{c_2}^\star )^{\sum _{i=1}^n z_{ic_2}^\star }}{(\pi _{c^\prime })^{\sum _{i=1}^n z_{ic^\prime }}} \nonumber \\&\times \frac{\prod _{j=1}^r \prod _{c=1}^{k^\star }\mathcal {B}_{\alpha _c^\star ,\beta _c^\star }(\lambda _{jc}^\star )}{\prod _{j=1}^r \prod _{c=1}^{k} \mathcal {B}_{\alpha _c,\beta _c}(\lambda _{jc})}\times \frac{d_{k^\star }}{b_k p_{\hbox {alloc}}\mathcal {B}_{u_1,u_2}(u)\prod _{j=1}^r \mathcal {B}_{\alpha _{c_1}^\star ,\beta _{c_1}^\star }(w_j)}\times \pi _{c^\prime }. \end{aligned}$$(17)The first two lines in (17) represent, respectively, the likelihood ratio and the prior ratio, with \(\mathcal {D}\) being the Dirichlet density with parameters all equal to 1, and \(\mathcal {B}\) being the Beta density with the parameters specified in the subscript. In the third line, the first term represents the proposal ratio, with \(p_{\hbox {alloc}}\) being the probability of the particular allocation made in the split move, and the second term is the Jacobian of the transformation from \((\pi _{c^\prime }, \lambda _{1 c^\prime },\ldots , \lambda _{r c^\prime }, u,w_{1}, \ldots , w_{r})\) to \((\pi _{c_1}^\star , \lambda _{1 c_2}^\star ,\ldots , \lambda _{r c_2}^\star , \pi _{c_2}^\star , \lambda _{1 c_1}^\star ,\ldots , \lambda _{r c_1}^\star )\). The combine move is accepted with probability \(\min (1,p_k^{-1})\), with some obvious substitutions in the expression for \(p_k\).
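To fix ideas, the bookkeeping of the split proposal (before the reallocation step and the accept/reject decision based on (17)) can be sketched as follows; this is only an illustration under our own naming conventions, with 0-based indexing:

```python
import numpy as np

def propose_split(pi, Lam, alpha1, beta1, u1, u2, rng):
    """Sketch of the split proposal: pick a class c' with probability
    1/k, split its weight with u ~ Beta(u1, u2), let c_2 inherit the
    success probabilities of c', and draw those of c_1 from the prior
    Beta(alpha1, beta1) of class c_1."""
    r, k = Lam.shape
    c = rng.integers(k)                     # class c' to split
    u = rng.beta(u1, u2)
    pi1, pi2 = pi[c] * u, pi[c] * (1 - u)   # pi_{c_1} + pi_{c_2} = pi_{c'}
    w = rng.beta(alpha1, beta1, size=r)     # lambda_{jc_1} = w_j
    pos = rng.integers(c + 1)               # place of c_1, prob 1/c'
    pi_star = pi.copy()
    pi_star[c] = pi2                        # c_2 keeps the remaining weight ...
    pi_star = np.insert(pi_star, pos, pi1)  # ... and ends up at place c'+1
    Lam_star = np.insert(Lam, pos, w, axis=1)  # lambda_{jc_2} = lambda_{jc'}
    return pi_star, Lam_star, u, w, c, pos
```

The reallocation of the subjects with \(z_{ic^\prime }=1\) and the evaluation of (17) would then follow, exactly as described above.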

Birth/death move. We first make a random choice between birth and death, using the same probabilities \(b_k\) and \(d_k\) as above. For a birth, we pick a position at random among \(1,\ldots ,k^\star \), with probability \(1/k^\star \) for the place to be occupied by the new class \(c^\prime \). Then, a weight and a vector of success probabilities for the new class are drawn using
$$\begin{aligned} \pi _{c^\prime }\sim \hbox {Beta}(1,k), \quad \lambda _{jc^\prime }\sim \hbox {Beta}(\alpha ^\star _{c^\prime },\beta _{c^\prime }^\star ), \quad j=1,\ldots ,r, \end{aligned}$$where the hyperparameters \(\varvec{\alpha }\) and \(\varvec{\beta }\) are recalculated as in the split move, leading to \(\varvec{\alpha }^\star \) and \(\varvec{\beta }^\star \), respectively. To “make space” for the new class, the existing weights are rescaled so that all weights sum to 1, using \(\pi _c^\star = \pi _c(1-\pi _{c^\prime })\). The weights and the success probabilities proposed under the birth move are denoted by \(\varvec{\pi }^\star \) and \(\varvec{\Lambda }^\star \), respectively. The allocation variables remain unchanged, and no subjects are allocated to the new component, that is, \(z_{ic^\prime }=0\), \(i=1,\ldots ,n\). The enlarged allocation matrix is denoted by \({\varvec{Z}}^\star \). For a death, a random choice is made among the existing empty components; the chosen component is deleted and the remaining weights are rescaled to sum to 1. No other changes are proposed and, in particular, the allocations are unaltered. The acceptance probabilities for birth and death are \(\min (1, p_k)\) and \(\min (1, p_k^{-1})\), respectively, where
$$\begin{aligned} p_k= & {} \frac{p(k^\star )}{p(k)}\times \frac{\mathcal {D}(\pi _1^\star ,\ldots ,\pi _{k^\star }^\star )}{\mathcal {D}(\pi _1,\ldots ,\pi _{k})}\times \frac{\prod _{c=1}^{k^\star }(\pi _c^\star )^{\sum _{i=1}^n z_{ic}^\star }}{\prod _{c=1}^k \pi _c^{\sum _{i=1}^n z_{ic}}} \times \frac{\prod _{j=1}^r\prod _{c=1}^{k^\star }\mathcal {B}_{\alpha _c^\star ,\beta _c^\star }(\lambda _{jc}^\star )}{\prod _{j=1}^r \prod _{c=1}^{k} \mathcal {B}_{\alpha _c,\beta _c}(\lambda _{jc})}\nonumber \\&\times \frac{d_{k^\star }k^\star }{b_k (k_0+1) \mathcal {B}_{1,k}(\pi _{c^\prime }^\star )\prod _{j=1}^r \mathcal {B}_{\alpha _{c^\prime }^\star ,\beta _{c^\prime }^\star }(\lambda _{jc^\prime }^\star )}\times (1-\pi _{c^\prime }^\star ), \end{aligned}$$(18)with \(k_0\) being the number of empty classes before the birth. In Equation (18), the first line is the prior ratio, and the second line contains the proposal ratio and the Jacobian; the likelihood ratio is equal to 1.
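The birth half of the move is equally mechanical; a minimal sketch under the same conventions (names are ours; the death move simply deletes a randomly chosen empty class and rescales the remaining weights):

```python
import numpy as np

def propose_birth(pi, Lam, alpha_new, beta_new, rng):
    """Sketch of the birth proposal: draw the weight of the new class
    from Beta(1, k) and its success probabilities from the prior;
    existing weights are rescaled by (1 - pi_new) to sum to one."""
    r, k = Lam.shape
    pos = rng.integers(k + 1)                   # place of the new class, prob 1/(k+1)
    pi_new = rng.beta(1.0, k)                   # weight of the new (empty) class
    lam_new = rng.beta(alpha_new, beta_new, size=r)
    pi_star = np.insert(pi * (1.0 - pi_new), pos, pi_new)
    Lam_star = np.insert(Lam, pos, lam_new, axis=1)
    return pi_star, Lam_star                    # allocations stay unchanged
```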
Cite this article
Bartolucci, F., Farcomeni, A., & Scaccia, L. (2017). A nonparametric multidimensional latent class IRT model in a Bayesian framework. Psychometrika, 82, 952–978. https://doi.org/10.1007/s11336-017-9576-7
Keywords
 cluster analysis
 encompassing priors
 item response theory
 unidimensionality
 Markov chain Monte Carlo
 reversible-jump algorithm
 stochastic partitions