Bayesian approach for mixture models with grouped data

Gau, Shiow-Lan; de Dieu Tapsoba, Jean; Lee, Shen-Ming

doi:10.1007/s00180-013-0478-6

Original Paper
Published: 16 January 2014

Volume 29, pages 1025–1043, (2014)
Computational Statistics

Shiow-Lan Gau¹,
Jean de Dieu Tapsoba² &
Shen-Ming Lee¹

Abstract

Finite mixture modeling is widely used to analyze bimodal or multimodal data when the observations are recorded individually. In some applications, however, the analysis becomes substantially more challenging because the available data are grouped into categories. In this work, we assume that the observed data are grouped into distinct non-overlapping intervals and follow a finite mixture of normal distributions. For inference on the model parameters, we propose a parametric approach that accounts for the categorical features of the data. The main idea of our method is to impute the missing information of the original data within the Bayesian framework using Gibbs sampling. The proposed method was compared with the maximum likelihood approach, which uses the Expectation-Maximization algorithm to estimate the model parameters. It was also illustrated with an application to the Old Faithful geyser data.

Acknowledgments

We thank the Editor and two anonymous reviewers for their insightful comments, which helped improve the content of the paper. We also acknowledge support from the National Science Council (NSC) of Taiwan under Grant NSC101-2118-M-035-004-MY2 (Lee).

Author information

Authors and Affiliations

Department of Statistics, Feng Chia University, Taichung, 40724, Taiwan, ROC
Shiow-Lan Gau & Shen-Ming Lee
Division of Public Health, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
Jean de Dieu Tapsoba

Corresponding author

Correspondence to Shiow-Lan Gau.

Appendices

Appendix 1: Conditional distribution of ${\mathbf {Z}}_{i}$ given $\varvec{\Psi }$ and $D_{ik}=1$

The conditional distribution of ${\mathbf {D}}_i$ given $(\varvec{\Psi },Z_{ri}=1)$ is the multinomial distribution $\textit{Multinomial}\left( 1,P_{1r}(\varvec{\Theta }_r), \ldots ,P_{Kr}(\varvec{\Theta }_r)\right) $. Hence, conditionally on $\varvec{\Psi }$ and $Z_{ri} = 1$, the probability density function (pdf) of $\left( D_{i1},\ldots ,D_{iK}\right) $ can be expressed as $\prod ^K_{k=1}\{P_{kr}(\varvec{\Theta }_r)\}^{D_{ik}}$. It follows that the general expression of the pdf of $\left( D_{i1},\ldots ,D_{iK}\right) |(Z_{1i},\ldots ,Z_{Ri})$ is $\prod ^R_{r=1} [\prod ^K_{k=1}\{P_{kr}(\varvec{\Theta }_r)\}^{D_{ik}}]^{Z_{ri}}$. Multiplying by the pdf of $(Z_{1i},\ldots ,Z_{Ri})$, which is $\prod ^R_{r=1}\pi _r^{Z_{ri}}$, gives the joint pdf of $\left( D_{i1},\ldots ,D_{iK}\right) $ and $(Z_{1i},\ldots ,Z_{Ri})$:

$$\begin{aligned} f({\mathbf {D}}_{i},{\mathbf {Z}}_{i})&= \prod ^R_{r=1}\left[ \prod ^K_{k=1}\Bigl \{P_{kr}\bigl (\varvec{\Theta }_r \bigr )\Bigr \}^{D_{ik}Z_{ri}}\right] \pi _r^{Z_{ri}}. \end{aligned}$$

Moreover, the marginal pdf of $(D_{i1} = d_{i1},\ldots ,D_{iK} = d_{iK})$ is given as

$$\begin{aligned} P(D_{i1} = d_{i1},\ldots ,D_{iK} = d_{iK})&= \sum \limits _{z_{1i}+\cdots +z_{Ri}=1}\prod \limits ^R_{r=1}\left[ \prod \limits ^K_{k=1} \Bigl \{P_{kr}(\varvec{\Theta }_r)\Bigr \}^{d_{ik}z_{ri}}\right] \pi _r^{z_{ri}}\\&= \prod \limits ^K_{k=1}\left\{ \sum \limits ^R_{r=1}\pi _r P_{kr}(\varvec{\Theta }_r)\right\} ^{d_{ik}}. \end{aligned}$$

Therefore,

$$\begin{aligned} P(Z_{1i}&= z_{1i},\ldots ,Z_{Ri}=z_{Ri}|D_{i1}=d_{i1},\ldots ,D_{iK}= d_{iK})\\&= \prod \limits ^K_{k=1}\left[ \prod \limits ^R_{r=1}\left\{ \frac{\pi _r P_{kr}(\varvec{\Theta }_r)}{\sum ^R_{l=1}\pi _lP_{kl}(\varvec{\Theta }_l)}\right\} ^{z_{ri}} \right] ^{d_{ik}}, \end{aligned}$$

and

$$\begin{aligned} {\mathbf {Z}}_i|\varvec{\Psi },D_{ik}=1\sim \textit{Multinomial}\left( 1,\frac{\pi _1 P_{k1}(\varvec{\Theta }_1)}{\sum ^R_{r=1}\pi _r P_{kr}(\varvec{\Theta }_r)}, \ldots ,\frac{\pi _R P_{kR}(\varvec{\Theta }_R)}{\sum ^R_{r=1}\pi _r P_{kr}(\varvec{\Theta }_r)}\right) . \end{aligned}$$
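As a rough illustration of the multinomial result above, the following Python sketch computes the membership probabilities for one interval and draws a component label from them. It assumes that $P_{kr}(\varvec{\Theta }_r)$ is the probability mass that the normal component $r$ assigns to the $k$-th interval; the interval edges, parameter values, and function names are hypothetical and are not taken from the paper.

```python
import random
from math import inf
from statistics import NormalDist


def interval_mass(mu, sigma, lo, hi):
    """Assumed form of P_kr: mass of N(mu, sigma^2) on the interval (lo, hi]."""
    dist = NormalDist(mu, sigma)
    upper = 1.0 if hi == inf else dist.cdf(hi)
    lower = 0.0 if lo == -inf else dist.cdf(lo)
    return upper - lower


def membership_probs(k, weights, mus, sigmas, edges):
    """P(Z_ri = 1 | Psi, D_ik = 1) for every component r (0-based k and r)."""
    lo, hi = edges[k], edges[k + 1]
    joint = [w * interval_mass(m, s, lo, hi)
             for w, m, s in zip(weights, mus, sigmas)]
    marginal = sum(joint)  # sum_r pi_r P_kr, the marginal cell probability
    return [j / marginal for j in joint]


def draw_label(probs, rng):
    """Draw one component index from Multinomial(1, probs)."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical two-component mixture and interval edges (illustration only).
weights, mus, sigmas = [0.4, 0.6], [2.0, 4.5], [0.5, 0.7]
edges = [-inf, 2.0, 3.0, 4.0, 5.0, inf]

rng = random.Random(0)
probs_low = membership_probs(0, weights, mus, sigmas, edges)
label = draw_label(probs_low, rng)
```

With these illustrative values, an observation in the lowest interval is assigned almost entirely to the first component, because the second component places very little mass below 2.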

Appendix 2: Joint probability density function of $(x_i,{\mathbf {D}}_{i},{\mathbf {Z}}_{i},\varvec{\Psi })$

The conditional pdf of $x_i$ given $D_{ik}=1$ and $Z_{ri}=1$ is specified as in (3). It follows that the general expression of the pdf of $x_i$ given $\varvec{\Psi }$, ${\mathbf {D}}_i$ and ${\mathbf {Z}}_i$ can be expressed as

$$\begin{aligned} f(x_i|\varvec{\Psi },{\mathbf {D}}_i,{\mathbf {Z}}_i)=\prod ^R_{r=1}\left[ \prod ^K_{k=1}\left\{ \frac{f_r(x_i |\varvec{\Theta }_r)}{P_{kr}(\varvec{\Theta }_r)}\right\} ^{D_{ik}}\right] ^{Z_{ri}}, \end{aligned}$$

and the joint pdf of $(x_i, {\mathbf {D}}_i, {\mathbf {Z}}_i, \varvec{\Psi })$ is

$$\begin{aligned} f(x_i,{\mathbf {D}}_i,{\mathbf {Z}}_i,\varvec{\Psi })&= \prod \limits ^R_{r=1}\left[ \prod \limits ^K_{k=1} \left\{ \frac{f_r(x_i|\varvec{\Theta }_r)}{P_{kr}(\varvec{\Theta }_r)}\right\} ^{D_{ik}Z_{ri}}\left\{ P_{kr}(\varvec{\Theta }_r)\right\} ^{D_{ik}Z_{ri}}\right] \pi _r^{Z_{ri}}\\&= \prod \limits ^R_{r=1}\left[ \prod \limits ^K_{k=1}\left\{ f_r(x_i|\varvec{\Theta }_r)\right\} ^{D_{ik}Z_{ri}}\right] \pi _r^{Z_{ri}}. \end{aligned}$$
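For a normal component, the conditional density $f_r(x_i|\varvec{\Theta }_r)/P_{kr}(\varvec{\Theta }_r)$ restricted to the $k$-th interval is a truncated normal density, provided $P_{kr}(\varvec{\Theta }_r)$ is the component's mass on that interval (an assumption here). The Python sketch below shows one possible way to impute an unobserved $x_i$ from it by inverse-CDF sampling. The function name, the clamping constant, and the numerical values are hypothetical; this is not necessarily the sampler used in the paper.

```python
import random
from math import inf
from statistics import NormalDist


def draw_truncated_normal(mu, sigma, lo, hi, rng):
    """Draw x from N(mu, sigma^2) restricted to [lo, hi] via the inverse CDF."""
    dist = NormalDist(mu, sigma)
    p_lo = 0.0 if lo == -inf else dist.cdf(lo)
    p_hi = 1.0 if hi == inf else dist.cdf(hi)
    u = rng.uniform(p_lo, p_hi)
    # inv_cdf requires 0 < u < 1, so keep u strictly inside the unit interval.
    u = min(max(u, 1e-15), 1.0 - 1e-15)
    x = dist.inv_cdf(u)
    # Guard against rounding that could push x just outside the interval.
    return min(max(x, lo), hi)


# Hypothetical setting: component with mean 4.5 and sd 0.7, interval [3, 4].
rng = random.Random(0)
draws = [draw_truncated_normal(4.5, 0.7, 3.0, 4.0, rng) for _ in range(1000)]
sample_mean = sum(draws) / len(draws)
```

Inverse-CDF sampling is simple but can lose accuracy when the interval lies far in a tail of the component; a rejection sampler designed for tails may be preferable there.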

Under the assumption that the components of the mixture distribution are normal we have

$$\begin{aligned} f_r(x|\varvec{\Theta }_r)=\frac{1}{\sqrt{2\pi }\sigma _r}\exp \left\{ -\frac{(x- \mu _r)^2}{2\sigma ^2_r}\right\} , \end{aligned}$$

for $r=1,\ldots ,R$. Moreover, noting that $\sum _{k=1}^{K}D_{ik}=1, i=1,\ldots ,n$, it is straightforward to see that

$$\begin{aligned}&\prod ^n_{i=1}f(x_i,{\mathbf {D}}_i,{\mathbf {Z}}_i,\varvec{\Psi }) = \prod ^n_{i=1}\prod ^R_{r=1}\left[ \prod ^K_{k=1} \left\{ f_{r}(x_{i}|\varvec{\Theta }_r)\right\} ^{D_{ik}Z_{ri}}\right] \pi _r^{Z_{ri}}\\&\quad = \prod ^n_{i=1}\prod ^R_{r=1}\left[ \prod ^K_{k=1}\Bigl (\frac{1}{\sqrt{2\pi }} \Bigr )^{D_{ik}Z_{ri}}\Bigl (\frac{1}{\sigma ^2_r}\Bigr )^{\frac{1}{2}D_{ik}Z_{ri}} \exp \left\{ -\frac{1}{2}\frac{D_{ik}Z_{ri}(x_i-\mu _r)^{2}}{\sigma ^2_r}\right\} \right] \pi _r^{Z_{ri}}\\&\quad =\Bigl (\frac{1}{\sqrt{2\pi }}\Bigr )^n \prod ^R_{r=1}\left( \frac{1}{\sigma ^2_r} \right) ^{\frac{1}{2}\sum \limits ^n_{i=1}\sum \limits ^K_{k=1}D_{ik}Z_{ri}} \exp \left\{ -\frac{1}{2}\sum \limits ^n_{i=1}\sum \limits ^K_{k=1}\frac{D_{ik} Z_{ri}(x_i-\mu _r)^2}{\sigma ^2_r}\right\} \pi _r^{\sum \limits ^n_{i=1}Z_{ri}}\\&\quad =\Bigl (\frac{1}{\sqrt{2\pi }}\Bigr )^n \prod ^R_{r=1}\left( \frac{1}{\sigma ^2_r} \right) ^{\frac{1}{2}\sum \limits ^n_{i=1}Z_{ri}}\exp \left\{ -\frac{1}{2} \sum \limits ^n_{i=1}\sum \limits ^K_{k=1}\frac{D_{ik}Z_{ri}(x_i-\mu _r)^2}{\sigma ^2_r}\right\} \pi _r^{\sum \limits ^n_{i=1}Z_{ri}}. \end{aligned}$$
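The final expression depends on the data only through the component counts $\sum _i Z_{ri}$ and the within-component sums of squares. As a minimal sketch, the Python code below evaluates its logarithm from those summaries and compares it with a direct sum of $\log \{\pi _r f_r(x_i|\varvec{\Theta }_r)\}$ over observations. Labels are stored as 0-based component indices rather than indicator vectors, and all names and numerical values are hypothetical.

```python
from math import log, pi
from statistics import NormalDist


def complete_data_loglik(xs, labels, weights, mus, sigmas):
    """Log of the last line above, using per-component counts and sums of squares."""
    n = len(xs)
    total = -0.5 * n * log(2.0 * pi)
    for r, (w, mu, sigma) in enumerate(zip(weights, mus, sigmas)):
        members = [x for x, z in zip(xs, labels) if z == r]
        n_r = len(members)                           # sum_i Z_ri
        ss_r = sum((x - mu) ** 2 for x in members)   # sum_i Z_ri (x_i - mu_r)^2
        total += -0.5 * n_r * log(sigma ** 2) - 0.5 * ss_r / sigma ** 2
        if n_r > 0:
            total += n_r * log(w)
    return total


# Hypothetical imputed values, labels, and parameters (illustration only).
xs = [1.8, 2.3, 4.1, 4.9, 5.2]
labels = [0, 0, 1, 1, 1]
weights, mus, sigmas = [0.4, 0.6], [2.0, 4.5], [0.5, 0.7]

value = complete_data_loglik(xs, labels, weights, mus, sigmas)
direct = sum(
    log(weights[z] * NormalDist(mus[z], sigmas[z]).pdf(x))
    for x, z in zip(xs, labels)
)
```

The two quantities agree up to floating-point rounding, which is a quick check on the algebra in the derivation.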

About this article

Cite this article

Gau, SL., de Dieu Tapsoba, J. & Lee, SM. Bayesian approach for mixture models with grouped data. Comput Stat 29, 1025–1043 (2014). https://doi.org/10.1007/s00180-013-0478-6

Received: 19 November 2012
Accepted: 27 December 2013
Published: 16 January 2014
Issue Date: October 2014
DOI: https://doi.org/10.1007/s00180-013-0478-6
