
Investigation of parameter uncertainty in clustering using a Gaussian mixture model via jackknife, bootstrap and weighted likelihood bootstrap


Abstract

Mixture models with (multivariate) Gaussian components are a popular tool in model-based clustering. Such models are often fitted by a procedure that maximizes the likelihood, such as the EM algorithm. At convergence, the maximum likelihood parameter estimates are typically reported, but in most cases little emphasis is placed on the variability associated with these estimates. In part this may be because standard errors are not directly calculated in the model-fitting algorithm, either because they are not required to fit the model or because they are difficult to compute. The examination of standard errors in model-based clustering is therefore typically neglected. Sampling-based methods, such as the jackknife (JK), bootstrap (BS) and parametric bootstrap (PB), are intuitive, generalizable approaches to assessing parameter uncertainty in model-based clustering using a Gaussian mixture model. This paper provides a review and empirical comparison of the jackknife, bootstrap and parametric bootstrap methods for producing standard errors and confidence intervals for mixture parameters. However, the performance of such sampling methods in the presence of small and/or overlapping clusters requires consideration; here the weighted likelihood bootstrap (WLBS) approach is demonstrated to be effective in addressing this concern in a model-based clustering framework. The JK, BS, PB and WLBS methods are illustrated and contrasted through simulation studies, the traditional Old Faithful data set and the Thyroid data set. The MclustBootstrap function, available in the most recent release of the popular R package mclust, facilitates the implementation of the JK, BS, PB and WLBS approaches for estimating parameter uncertainty in the context of model-based clustering. The JK, WLBS and PB approaches to variance estimation are shown to be robust and to provide good coverage across a range of real and simulated data sets when performing model-based clustering, but care is advised when using the BS in such settings. In the case of poor model fit (for example, for data with small and/or overlapping clusters), the JK and BS suffer from being unable to fit the specified model in many of the sub-samples formed. The PB also suffers when model fit is poor, since it relies on data sets simulated from the model on which to base the variance estimation calculations. However, the WLBS generally provides a robust solution, driven by the fact that all observations are represented with some weight in each of the sub-samples formed under this approach.



Author information

Correspondence to Adrian O’Hagan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the Insight Research Centre (SFI/12/RC/2289) and Science Foundation Ireland under the Research Frontiers Programme (2007/RFP/MATH281).

Appendices

Appendix A: Pairs plots of a simulated data set from Simulation Setting Three

Simulation Setting Three explores the performance and computational features of the JK, BS, PB and WLBS approaches to parameter variance estimation in a higher-dimensional setting featuring overlapping and small clusters. Figures 7, 8 and 9 provide pairs plots from a single simulated data set under this setting, for which \(n = 500\), \(p = 25\) and \(G = 5\). Each colour/symbol combination in the plots denotes one of the 5 distinct clusters of observations simulated.
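To give a concrete sense of how a data set of this shape and its pairs plots can be generated, the following is a minimal R sketch; the mixing proportions, cluster means and covariances used here are purely illustrative and are not the actual parameter values of Simulation Setting Three.

library(MASS)  # for mvrnorm

set.seed(1)
n <- 500; p <- 25; G <- 5

# Illustrative mixing proportions, including one small cluster
pro <- c(0.40, 0.25, 0.15, 0.15, 0.05)

# Illustrative cluster means, drawn close together to induce overlap
mu <- matrix(rnorm(G * p, sd = 1.5), nrow = G)

# Simulate cluster labels, then observations from the component densities
z <- sample(1:G, size = n, replace = TRUE, prob = pro)
X <- t(sapply(z, function(g) mvrnorm(1, mu = mu[g, ], Sigma = diag(p))))

# Pairs plots in blocks of variables, coloured/symbolled by true cluster,
# mirroring the layout of Figs. 7, 8 and 9
pairs(X[, 1:10],  col = z, pch = z)
pairs(X[, 11:20], col = z, pch = z)
pairs(X[, 21:25], col = z, pch = z)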

Fig. 7: Pairs plots of the first 10 variables for a single simulated data set from Simulation Setting Three (\(n = 500\), \(p = 25\), \(G = 5\))

Fig. 8: Pairs plots of the second 10 variables for a single simulated data set from Simulation Setting Three (\(n = 500\), \(p = 25\), \(G = 5\))

Fig. 9: Pairs plots of the final 5 variables for a single simulated data set from Simulation Setting Three (\(n = 500\), \(p = 25\), \(G = 5\))

Appendix B: Covariance parameter estimates and standard errors for the Thyroid data

Estimated cluster covariance matrices for group 1 of the Thyroid data are presented below for the optimal mixture of Gaussians model (\(G = 3\), \(p = 5\), unequal diagonal covariance structure across clusters), as obtained via the jackknife (JK), bootstrap (BS), parametric bootstrap (PB) and weighted likelihood bootstrap (WLBS) methods, with associated standard errors in parentheses. Because the covariance structure is diagonal, only the diagonal entries are reported; all off-diagonal entries, and their standard errors, are zero.

$$\begin{aligned} \varSigma_{\mathrm{MCLUST},\,\mathrm{Group}\,1} &= \operatorname{diag}\bigl(66.39,\; 4.82,\; 0.23,\; 0.22,\; 3.19\bigr) \\ \varSigma_{\mathrm{JK},\,\mathrm{Group}\,1} &= \operatorname{diag}\bigl(67.50\,(7.82),\; 4.80\,(0.63),\; 0.24\,(0.03),\; 0.33\,(0.04),\; 3.25\,(0.36)\bigr) \\ \varSigma_{\mathrm{BS},\,\mathrm{Group}\,1} &= \operatorname{diag}\bigl(66.00\,(8.25),\; 4.80\,(0.64),\; 0.23\,(0.03),\; 0.22\,(0.05),\; 3.16\,(0.34)\bigr) \\ \varSigma_{\mathrm{PB},\,\mathrm{Group}\,1} &= \operatorname{diag}\bigl(65.85\,(7.80),\; 4.80\,(0.54),\; 0.23\,(0.03),\; 0.22\,(0.03),\; 3.17\,(0.37)\bigr) \\ \varSigma_{\mathrm{WLBS},\,\mathrm{Group}\,1} &= \operatorname{diag}\bigl(65.85\,(7.99),\; 4.78\,(0.62),\; 0.23\,(0.03),\; 0.22\,(0.05),\; 3.17\,(0.42)\bigr) \end{aligned}$$

The corresponding covariance estimates and standard errors for group 2 of the Thyroid data, under the same optimal model, are:

$$\begin{aligned} \varSigma_{\mathrm{MCLUST},\,\mathrm{Group}\,2} &= \operatorname{diag}\bigl(344.46,\; 17.44,\; 4.92,\; 0.15,\; 0.07\bigr) \\ \varSigma_{\mathrm{JK},\,\mathrm{Group}\,2} &= \operatorname{diag}\bigl(384.31\,(101.72),\; 14.84\,(3.00),\; 5.19\,(1.37),\; 0.15\,(0.03),\; 0.08\,(0.02)\bigr) \\ \varSigma_{\mathrm{BS},\,\mathrm{Group}\,2} &= \operatorname{diag}\bigl(336.73\,(98.03),\; 16.85\,(2.88),\; 4.77\,(1.31),\; 0.15\,(0.03),\; 0.07\,(0.02)\bigr) \\ \varSigma_{\mathrm{PB},\,\mathrm{Group}\,2} &= \operatorname{diag}\bigl(334.34\,(83.28),\; 17.04\,(4.45),\; 4.74\,(1.15),\; 0.15\,(0.04),\; 0.07\,(0.02)\bigr) \\ \varSigma_{\mathrm{WLBS},\,\mathrm{Group}\,2} &= \operatorname{diag}\bigl(332.50\,(92.04),\; 16.71\,(2.71),\; 4.81\,(1.28),\; 0.15\,(0.03),\; 0.07\,(0.02)\bigr) \end{aligned}$$

The corresponding covariance estimates and standard errors for group 3 of the Thyroid data, under the same optimal model, are:

$$\begin{aligned} \varSigma_{\mathrm{MCLUST},\,\mathrm{Group}\,3} &= \operatorname{diag}\bigl(95.23,\; 4.26,\; 0.28,\; 147.06,\; 231.22\bigr) \\ \varSigma_{\mathrm{JK},\,\mathrm{Group}\,3} &= \operatorname{diag}\bigl(95.47\,(29.87),\; 2.91\,(1.10),\; 0.24\,(0.06),\; 157.52\,(71.60),\; 234.45\,(71.18)\bigr) \\ \varSigma_{\mathrm{BS},\,\mathrm{Group}\,3} &= \operatorname{diag}\bigl(90.83\,(27.53),\; 3.93\,(0.94),\; 0.26\,(0.06),\; 143.33\,(65.03),\; 222.37\,(65.83)\bigr) \\ \varSigma_{\mathrm{PB},\,\mathrm{Group}\,3} &= \operatorname{diag}\bigl(91.11\,(25.03),\; 4.17\,(1.16),\; 0.27\,(0.08),\; 141.92\,(38.84),\; 222.99\,(61.33)\bigr) \\ \varSigma_{\mathrm{WLBS},\,\mathrm{Group}\,3} &= \operatorname{diag}\bigl(92.72\,(25.66),\; 3.91\,(0.85),\; 0.26\,(0.05),\; 139.92\,(61.20),\; 219.38\,(62.58)\bigr) \end{aligned}$$

The following code produces all variance estimation results for the Thyroid data set, using the MclustBootstrap function in mclust.

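A minimal sketch of such an analysis in R follows, assuming the thyroid data bundled with mclust and the optimal unequal diagonal ("VVI") model reported above; the choice of nboot = 999 replicates is illustrative.

library(mclust)

# Thyroid data shipped with mclust: column 1 holds the clinical
# diagnosis; columns 2-6 hold the five laboratory measurements
data(thyroid)
X <- thyroid[, -1]

# Fit the optimal model: G = 3 components with unequal diagonal
# covariance structure across clusters (model "VVI")
mod <- Mclust(X, G = 3, modelNames = "VVI")

# Resampling-based variance estimation for the mixture parameters
boot.jk   <- MclustBootstrap(mod, type = "jk")                # jackknife
boot.bs   <- MclustBootstrap(mod, nboot = 999, type = "bs")   # nonparametric bootstrap
boot.pb   <- MclustBootstrap(mod, nboot = 999, type = "pb")   # parametric bootstrap
boot.wlbs <- MclustBootstrap(mod, nboot = 999, type = "wlbs") # weighted likelihood bootstrap

# Standard errors and percentile confidence intervals for the mixing
# proportions, component means and covariances
summary(boot.wlbs, what = "se")
summary(boot.wlbs, what = "ci")

For the jackknife, resampling is over the n leave-one-out data sets, so the number of replicates is governed by the sample size rather than by nboot; the summary calls can be repeated for each of the fitted objects to obtain the corresponding standard errors for each method.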


Cite this article

O’Hagan, A., Murphy, T.B., Scrucca, L. et al. Investigation of parameter uncertainty in clustering using a Gaussian mixture model via jackknife, bootstrap and weighted likelihood bootstrap. Comput Stat 34, 1779–1813 (2019). https://doi.org/10.1007/s00180-019-00897-9
