Abstract
Latent class analysis is used to perform model-based clustering for multivariate categorical responses. Selection of the variables most relevant for clustering is an important task which can affect the quality of clustering considerably. This work considers a Bayesian approach for selecting the number of clusters and the best clustering variables. The main idea is to reformulate the problem of group and variable selection as a probabilistically driven search over a large discrete space using Markov chain Monte Carlo (MCMC) methods. Both selection tasks are carried out simultaneously using an MCMC approach based on a collapsed Gibbs sampling method, whereby several model parameters are integrated out of the model, substantially improving computational performance. Post-hoc procedures for parameter and uncertainty estimation are outlined. The approach is tested on simulated and real data.
References
Aitkin, M., Anderson, D., Hinde, J.: Statistical modelling of data on teaching styles. J. R. Stat. Soc. Ser. A 144, 419–461 (1981)
Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csáki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó, Budapest (1973)
Bartholomew, D.J., Knott, M.: Latent Variable Models and Factor Analysis, 2nd edn. Kendall’s Library of Statistics, Hodder Arnold (1999)
Bennet, N.: Teaching Styles and Pupil Progress. Open Books, London (1976)
Bensmail, H., Celeux, G., Raftery, A., Robert, C.: Inference in model-based cluster analysis. Stat. Comput. 7, 1–10 (1997)
Cappé, O., Robert, C.P., Rydén, T.: Reversible jump, birth-and-death and more general continuous time Markov chain Monte Carlo samplers. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 65(3), 679–700 (2003)
Carpaneto, G., Toth, P.: Algorithm 548: solution of the assignment problem [H]. ACM Trans. Math. Softw. 6, 104–111 (1980)
Celeux, G., Hurn, M., Robert, C.P.: Computational and inferential difficulties with mixture posterior distributions. J. Am. Stat. Assoc. 95, 957–970 (2000)
Celeux, G., Forbes, F., Robert, C.P., Titterington, D.: Deviance information criteria for missing data models. Bayesian Anal. 1, 651–673 (2006)
Chopin, N., Robert, C.P.: Properties of nested sampling. Biometrika 97(3), 741–755 (2010)
Dean, N., Raftery, A.E.: Latent class analysis variable selection. Ann. Inst. Stat. Math. 62, 11–35 (2010)
Dellaportas, P., Papageorgiou, I.: Multivariate mixtures of normals with unknown number of components. Stat. Comput. 16, 57–68 (2006)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum Likelihood from incomplete data via the EM Algorithm. J. R. Stat. Soc. B 39, 1–38 (1977)
Fraley, C., Raftery, A.: Model-based methods of classification: using the mclust software in chemometrics. J. Stat. Softw. 18, 1–13 (2007)
Frühwirth-Schnatter, S.: Estimating marginal likelihoods for mixture and Markov switching models using bridge sampling techniques. Econom. J. 7(1), 143–167 (2004)
Frühwirth-Schnatter, S.: Finite Mixture and Markov Switching Models: Modeling and Applications to Random Processes. Springer, Berlin (2006)
Garrett, E.S., Zeger, S.L.: Latent class model diagnosis. Biometrics 56, 1055–1067 (2000)
Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984)
Geweke, J.: Bayesian inference in econometric models using Monte Carlo integration. Econometrica 57(6), 1317–1339 (1989)
Gollini, I., Murphy, T.B.: Mixture of latent trait analyzers for model-based clustering of categorical data. Stat. Comput. (to appear) (2013)
Goodman, L.A.: Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61, 215–231 (1974)
Green, P.: Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732 (1995)
Kass, R.E., Raftery, A.E.: Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995)
Ley, E., Steel, M.F.J.: On the effect of prior assumptions in Bayesian model averaging with applications to growth regression. J. Appl. Econom. 24, 651–674 (2009)
Marin, J.M., Mengersen, K., Robert, C.P.: Bayesian modelling and inference on mixtures of distributions. In: Dey, D., Rao, C. (eds.) Bayesian Thinking: Modeling and Computation, vol. 25, 1st edn, chap. 16, pp. 459–507. Handbook of Statistics, North Holland, Amsterdam (2005)
McDaid, A.F., Murphy, T.B., Friel, N., Hurley, N.: Improved Bayesian inference for the stochastic block model with application to large networks. Comput. Stat. & Data Anal. 60, 12–31 (2013)
McLachlan, G., Peel, D.: Finite Mixture Models. John Wiley & Sons, New York (2002)
Meng, X.L., Wong, W.H.: Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Stat. Sin. 6, 831–860 (1996)
Moran, M., Walsh, C., Lynch, A., Coen, R.F., Coakley, D., Lawlor, B.A.: Syndromes of behavioural and psychological symptoms in mild Alzheimer's disease. Int. J. Geriatr. Psychiatry 19, 359–364 (2004)
Newton, M.A., Raftery, A.E.: Approximate Bayesian inference with the weighted likelihood bootstrap. J. R. Stat. Soc. Ser. B (Methodol.) 56(1), 3–48 (1994)
Nobile, A.: Bayesian finite mixtures: a note on prior specification and posterior computation. Tech. Rep. 05–3, University of Glasgow, Glasgow, UK (2005)
Nobile, A., Fearnside, A.: Bayesian finite mixtures with an unknown number of components: the allocation sampler. Stat. Comput. 17, 147–162 (2007)
Pan, J.C., Huang, G.H.: Bayesian inferences of latent class models with an unknown number of classes. Psychometrika, pp. 1–26 (2013)
Pandolfi, S., Bartolucci, F., Friel, N.: A generalized multiple-try version of the reversible jump algorithm. Comput. Stat. & Data Anal. 72, 298–314 (2014)
Plummer, M., Best, N., Cowles, K., Vines, K.: CODA: convergence diagnosis and output analysis for MCMC. R News 6, 7–11 (2006)
R Core Team.: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2013). http://www.R-project.org/
Raftery, A.E., Dean, N.: Variable selection for model-based clustering. J. Am. Stat. Assoc. 101, 168–178 (2006)
Raftery, A.E., Newton, M.A., Satagopan, J.M., Krivitsky, P.N.: Estimating the integrated likelihood via posterior simulation using the harmonic mean identity (with discussion). In: Bernardo, J., Bayarri, M., Berger, J., Dawid, A., Heckerman, D., Smith, A., West, M. (eds.) Bayesian Statistics, vol. 8, pp. 1–45. Oxford University Press, Oxford (2007)
Richardson, S., Green, P.J.: On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. R. Stat. Soc. Ser. B (Stat. Methodol.) 59, 731–792 (1997)
Rousseau, J., Mengersen, K.: Asymptotic behaviour of the posterior distribution in overfitted mixture models. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73, 689–710 (2011)
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
Smart, K.M., Blake, C., Staines, A., Doody, C.: The discriminative validity of "nociceptive", "peripheral neuropathic", and "central sensitization" as mechanisms-based classifications of musculoskeletal pain. Clin. J. Pain 27, 655–663 (2011)
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A.: Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B 64, 583–639 (2002)
Stephens, M.: Bayesian analysis of mixture models with an unknown number of components: an alternative to reversible jump methods. Ann. Stat. 28(1), 40–74 (2000a)
Stephens, M.: Dealing with label switching in mixture models. J. R. Stat. Soc. Ser. B 62, 795–809 (2000b)
Tadesse, M.G., Sha, N., Vannucci, M.: Bayesian variable selection in clustering high-dimensional data. J. Am. Stat. Assoc. 100, 602–617 (2005)
Walsh, C.: Latent class analysis identification of syndromes in Alzheimer's disease: a Bayesian approach. Metodol. Zvezki Adv. Methodol. Stat. 3, 147–162 (2006)
White, A., Murphy, B.: BayesLCA: Bayesian Latent Class Analysis (2013). http://CRAN.R-project.org/package=BayesLCA, R package version 1.3
Wyse, J., Friel, N.: Block clustering with collapsed latent block models. Stat. Comput. 22, 415–428 (2012)
Acknowledgments
The work of Arthur White and Thomas Brendan Murphy was partly supported by Science Foundation Ireland under the Clique Strategic Research Cluster (08/SRC/I1407), while the work of Jason Wyse was partly carried out at the Insight Centre for Data Analytics, which is supported by Science Foundation Ireland [SFI/12/RC/2289].
Appendix: Comparison with reversible jump MCMC
In this section we investigate how the performance of a collapsed Gibbs sampler compares with a reversible jump MCMC (RJMCMC) sampler that retains all model parameters. We divide this investigation into two tasks: selecting (a) the number of classes, and (b) which variables to include. We implement the approach of Pan and Huang (2013) using already available software to investigate the efficacy of RJMCMC for the former task, and outline our own approach to perform the latter for the case where the observed data is binary only. We find that the RJMCMC approach performs reasonably well when selecting the number of classes, although it is slower than the collapsed sampler, and that it performs poorly when performing variable selection.
1.1 Number of classes
To identify the number of groups in a dataset using RJMCMC methods, we apply software implementing the approach of Pan and Huang (2013). We applied the software to the binary and non-binary Dean and Raftery datasets described in Sect. 2.2, running the sampler for 100,000 iterations in both cases. All priors were left at their default settings. In both cases, the non-informative variables were removed, since the sole task was to identify the correct number of classes.
As the software for this approach is implemented as a C++ program, its runtime can be considered broadly comparable to that of our own collapsed sampler, which is implemented in C. For the binary and non-binary datasets, the software took roughly 25 and 90 min to run, respectively, on the same hardware described previously. In both cases this was markedly longer than the collapsed sampler, despite the fact that the model was exploring the group space only and the dimension of the data had been reduced.
The results from the samplers are shown in Table 14. In the case of the binary data, the correct number of groups is chosen as the most likely candidate, although with a lower posterior probability than under the collapsed sampler. In the case of the non-binary data, \(G = 2\) is incorrectly chosen as the most likely candidate, with some uncertainty surrounding which model is the most suitable.
1.2 Variable selection
Recall that for the variable inclusion/exclusion step, a variable \({m^*}\) is selected at random from \(1, \ldots , M.\) An inclusion or exclusion move is then proposed, based on the current status of the variable. In what follows, we assume that the state space has \(G\) groups, and that the data is binary, so that \(X_{nm} \in \{0, 1\}\), for all \(n = 1, \ldots , N\) and \(m = 1, \ldots , M.\)
1.2.1 Inclusion step
Suppose we select a variable \({m^*},\) which is currently excluded from the model. For the inclusion step, dropping the variable index, we propose the following move:
1. Generate \(u_1, \ldots , u_{G-1} \sim \hbox {Uniform}(-\epsilon , \epsilon ),\) and set \(u_{G} = - \sum ^{G-1}_{i=1} u_i.\)
2. Set
$$\begin{aligned} \log \left( \frac{\theta _{1}}{1 - \theta _{1}}\right) = \log \left( \frac{\beta }{1 - \beta }\right) + u_1, \end{aligned}$$
which is equivalent to setting
$$\begin{aligned} \theta _1 = \frac{\beta e^{u_1}}{1 + \beta (e^{u_1} - 1)}. \end{aligned}$$
Similarly, for \(g = 2, \ldots , G,\) set:
$$\begin{aligned} \theta _g = \frac{\beta e^{u_g}}{1 + \beta (e^{u_g} - 1)}. \end{aligned}$$
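The two steps above can be sketched in code. The following is a minimal illustration (function and variable names are hypothetical, not from the paper's implementation): it draws the auxiliary values \(u\) subject to the zero-sum constraint and applies the logit-shift transformation to obtain the proposed class-specific parameters.

```python
import numpy as np

def propose_inclusion(beta, G, epsilon, rng):
    """Sketch of the inclusion proposal: from the common parameter beta
    of an excluded binary variable, propose class-specific theta_1..theta_G."""
    # Step 1: draw u_1, ..., u_{G-1} uniformly on (-epsilon, epsilon)
    # and set u_G so that the u's sum to zero.
    u = rng.uniform(-epsilon, epsilon, size=G - 1)
    u = np.append(u, -u.sum())
    # Step 2: shift the logit of beta by u_g, i.e.
    # theta_g = beta * exp(u_g) / (1 + beta * (exp(u_g) - 1)).
    theta = beta * np.exp(u) / (1.0 + beta * (np.exp(u) - 1.0))
    return theta, u

rng = np.random.default_rng(7)
theta, u = propose_inclusion(0.3, 4, 1.0, rng)
```

Note that, by construction, \(\hbox {logit}(\theta _g) = \hbox {logit}(\beta ) + u_g\) for every class, so each proposed \(\theta _g\) stays in \((0, 1)\).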
Then the proposed move is accepted with probability \(\alpha = \min \{1, A\},\) where
$$\begin{aligned} A = \frac{p(\mathbf {X} \mid \mathbf {Z}, \xi ^*)\, p(\xi ^*)\, p(\xi ^* \rightarrow \xi )}{p(\mathbf {X} \mid \mathbf {Z}, \xi )\, p(\xi )\, p(\xi \rightarrow \xi ^*)\, q(\mathbf {u})} \left| \det {\mathcal J} \right| , \end{aligned}$$
where \(q(\mathbf {u}) = (2\epsilon )^{-(G-1)}\) is the density of the proposed \(u_1, \ldots , u_{G-1},\) the ratio of likelihood contributions for the candidate variable is
$$\begin{aligned} \frac{\prod ^G_{g=1} \theta _g^{S_{g{m^*}}} (1 - \theta _g)^{S^C_{g{m^*}}}}{\beta ^{N_{m^*}} (1 - \beta )^{N - N_{m^*}}}, \end{aligned}$$
and we define \(S_{g{m^*}} = \sum ^N_{n=1} X_{nm^*}Z_{ng},\) \(S^C_{g{m^*}} = \sum ^N_{n=1} (1 - X_{nm^*})Z_{ng},\) and \(N_{m^*} = \sum ^N_{n=1} X_{nm^*}.\) Here we use \(p(\xi \rightarrow \xi ^*) = 1/M\) to denote the probability of the proposed move. Finally, the Jacobian \({\mathcal J}\) is defined as \({\mathcal J}_{1g} = \frac{\partial \theta _{gm^*}}{\partial \rho _{m^*}},\) and \({\mathcal J}_{kg} = \frac{\partial \theta _{gm^*}}{\partial u_{k-1}},\) for \(g = 1, \dots , G \hbox { and } k = 2, \dots , G.\)
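The sufficient statistics \(S_{g{m^*}}\), \(S^C_{g{m^*}}\) and \(N_{m^*}\) defined above are simple cross-tabulations of the candidate variable against the allocation matrix. A short illustration (hypothetical helper name, assuming \(\mathbf {Z}\) is stored as an \(N \times G\) one-hot matrix):

```python
import numpy as np

def variable_stats(X, Z, m_star):
    """Compute S_{g m*}, S^C_{g m*} and N_{m*} for a candidate binary
    variable m_star, where Z is the N x G one-hot allocation matrix."""
    x = X[:, m_star]          # binary responses for variable m*
    S = Z.T @ x               # S_{g m*}: count of ones within each class
    Sc = Z.T @ (1 - x)        # S^C_{g m*}: count of zeros within each class
    N_m = x.sum()             # N_{m*}: total count of ones
    return S, Sc, N_m

# Four observations, two classes: the first two belong to class 1.
X = np.array([[1], [0], [1], [1]])
Z = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
S, Sc, N_m = variable_stats(X, Z, 0)
```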
1.2.2 Exclusion step
If the variable \({m^*}\) is currently included in the model, we propose the exclusion step
$$\begin{aligned} \log \left( \frac{\rho }{1 - \rho }\right) = \frac{1}{G} \sum ^G_{g=1} \log \left( \frac{\theta _g}{1 - \theta _g}\right) , \end{aligned}$$
where again we have dropped the variable index. Using this expression, we then obtain
$$\begin{aligned} u_g = \log \left( \frac{\theta _g}{1 - \theta _g}\right) - \log \left( \frac{\rho }{1 - \rho }\right) , \end{aligned}$$
for \(g = 1, \ldots , G-1\), demonstrating the required bijection between \({\varvec{\theta }}_{m^*}\) and \((\rho _{m^*}, \mathbf {u})\). The proposed move is again accepted with probability \(\alpha = \min \{1, A^{-1}\},\) where the calculations are inverted, so that the likelihood, prior and proposal ratios appear with their roles reversed and the Jacobian enters as \(\left| \det {\mathcal J} \right| ^{-1}.\) The probability of the proposed move remains \(p(\xi \rightarrow \xi ^*) = 1/M\).
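As a numerical check on the bijection, the sketch below (hypothetical helper names) maps \({\varvec{\theta }}\) back to \((\rho , \mathbf {u})\) by placing \(\hbox {logit}(\rho )\) at the mean of the \(\hbox {logit}(\theta _g)\), an assumption consistent with the zero-sum constraint on \(\mathbf {u}\) in the inclusion step, and verifies that the inclusion transformation recovers \({\varvec{\theta }}\) exactly.

```python
import numpy as np

def propose_exclusion(theta):
    """Map theta_1..theta_G back to (rho, u). Placing logit(rho) at the
    mean of the logit(theta_g) forces sum_g u_g = 0, matching the
    constraint used in the inclusion step (an assumption of this sketch)."""
    logit = np.log(theta / (1.0 - theta))
    m = logit.mean()                      # logit of the proposed rho
    rho = 1.0 / (1.0 + np.exp(-m))
    u = logit - m                         # u_1, ..., u_G, summing to zero
    return rho, u

# Round trip: applying the inclusion map to (rho, u) recovers theta.
theta = np.array([0.2, 0.5, 0.7])
rho, u = propose_exclusion(theta)
theta_back = rho * np.exp(u) / (1.0 + rho * (np.exp(u) - 1.0))
```

The round trip holds because \(\hbox {logit}(\theta ^{\text {back}}_g) = \hbox {logit}(\rho ) + u_g = \hbox {logit}(\theta _g)\).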
1.2.3 Dean and Raftery data application
We apply this approach to the binary Dean and Raftery dataset described previously in Sect. 2.2. Here, we fix the number of groups to the true value \(G = 2\), so that the model search is based on variable selection only. The sampler was run for 50,000 iterations, with \(\epsilon = 1\), which resulted in an average acceptance rate for the inclusion/exclusion moves of approximately 0.12.
The posterior probabilities for variable inclusion from the sampler are shown in Table 15. None of the informative variables is selected as frequently as under the collapsed sampler: the model finds only weak evidence for variable 1, and fails to distinguish between the other variables.
White, A., Wyse, J. & Murphy, T.B. Bayesian variable selection for latent class analysis using a collapsed Gibbs sampler. Stat Comput 26, 511–527 (2016). https://doi.org/10.1007/s11222-014-9542-5