Abstract
Multivariate mixture modeling based on the skew-t distribution has emerged as a powerful and flexible tool for robust model-based clustering. Missing data, however, are a ubiquitous problem in almost every scientific field. In this paper, we offer a computationally flexible EM-type procedure for learning multivariate skew-t mixture models in the presence of missing data under a missing at random (MAR) mechanism. Further, we present an information-based approach to approximating the asymptotic covariance matrix of the maximum likelihood estimators using the outer product of the scores. To assist the development and ease the implementation of our algorithm, two auxiliary permutation matrices are utilized for fast determination of the observed and missing parts of each observation. The practical usefulness of the proposed methodology is illustrated through simulations with varying proportions of artificially introduced missing values and a real data example with genuine missing values.
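As a minimal sketch of how auxiliary matrices can separate the observed and missing parts of an observation, the following Python code builds two 0/1 selection matrices for a single data vector. The names `selection_matrices`, `apply_selection`, and `gram`, the symbols `O` and `M`, and the use of NaN to mark missing entries are illustrative assumptions, not taken from the paper. Each matrix consists of rows of the identity matrix, so stacking `O` on top of `M` gives a permutation matrix.

```python
import math


def selection_matrices(y):
    """Build 0/1 matrices O and M that pick out the observed and missing
    coordinates of one observation y (missing entries are marked by NaN,
    an assumption of this sketch)."""
    p = len(y)
    obs_idx = [k for k, v in enumerate(y) if not math.isnan(v)]
    mis_idx = [k for k, v in enumerate(y) if math.isnan(v)]
    # Each row is a unit vector, i.e. one row of the p x p identity matrix.
    O = [[1 if c == k else 0 for c in range(p)] for k in obs_idx]
    M = [[1 if c == k else 0 for c in range(p)] for k in mis_idx]
    return O, M


def apply_selection(A, x):
    """Compute A x for a 0/1 selection matrix A, summing only over the
    selected positions so that NaN entries elsewhere are never touched."""
    return [sum(v for a, v in zip(row, x) if a) for row in A]


def gram(A, p):
    """Return A^T A as a p x p list of lists (all zeros if A has no rows)."""
    return [[sum(row[i] * row[j] for row in A) for j in range(p)]
            for i in range(p)]


y = [1.5, float("nan"), -0.3, float("nan")]
O, M = selection_matrices(y)
print(apply_selection(O, y))  # prints [1.5, -0.3]
```

In this construction `O^T O + M^T M` equals the identity matrix, which can be checked with `gram(O, 4)` and `gram(M, 4)`. This is what allows a full-dimensional mean vector or scale matrix to be reduced to its observed-data counterpart by pre- and post-multiplication.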
Cite this article
Wang, WL., Lin, TI. Robust model-based clustering via mixtures of skew-t distributions with missing information. Adv Data Anal Classif 9, 423–445 (2015). https://doi.org/10.1007/s11634-015-0221-y
DOI: https://doi.org/10.1007/s11634-015-0221-y