Abstract
The notion of defining a cluster as a component in a mixture model was put forth by Tiedeman in 1955; since then, the use of mixture models for clustering has grown into an important subfield of classification. The volume of work in this field over the past decade appears to equal all that went before, so a review of work to date is timely. First, the definition of a cluster is discussed and some historical context for model-based clustering is provided. Then, starting with Gaussian mixtures, the evolution of model-based clustering is traced, from the famous paper by Wolfe in 1965 to work that is currently available only in preprint form. This review ends with a look ahead to the next decade or so.
References
AITKEN, A.C. (1926), âA Series Formula for the Roots of Algebraic and Transcendental Equationsâ, Proceedings of the Royal Society of Edinburgh, 45, 14â22.
AITKIN, M., and WILSON, G.T. (1980), âMixture Models, Outliers, and the EM Algorithmâ, Technometrics, 22(3), 325â331.
ANDERLUCCI, L., and VIROLI, C. (2015), âCovariance Pattern Mixture Models for Multivariate Longitudinal Dataâ, The Annals of Applied Statistics, 9(2), 777â800.
ANDREWS, J.L., and MCNICHOLAS, P.D. (2011a), âExtending Mixtures of Multivariate t-Factor Analyzersâ, Statistics and Computing, 21(3), 361â373.
ANDREWS, J.L., and MCNICHOLAS, P.D. (2011b), âMixtures of Modified t-Factor Analyzers for Model-Based Clustering, Classification, and Discriminant Analysisâ, Journal of Statistical Planning and Inference, 141(4), 1479â1486.
ANDREWS, J.L., and MCNICHOLAS, P.D. (2012), âModel-Based Clustering, Classification, and Discriminant Analysis Via Mixtures of Multivariate t-Distributions: The tEIGEN Familyâ, Statistics and Computing, 22(5), 1021â1029.
ANDREWS, J.L., and MCNICHOLAS, P.D. (2013), vscc: Variable Selection for Clustering and Classification, R Package Version 0.2.
ANDREWS, J.L., and MCNICHOLAS, P.D. (2014), âVariable Selection for Clustering and Classificationâ, Journal of Classification, 31(2), 136â153.
ANDREWS, J.L., MCNICHOLAS, P.D., and SUBEDI, S. (2011), âModel-Based Classification Via Mixtures of Multivariate t-Distributionsâ, Computational Statistics and Data Analysis, 55(1), 520â529.
ANDREWS, J.L.,WICKINS, J.R., BOERS, N.M., and MCNICHOLAS, P.D. (2015), teigen: Model-Based Clustering and Classification with the Multivariate t Distribution, R Package Version 2.1.0.
ATTIAS, H. (2000), âA Variational Bayesian Framework for Graphical Modelsâ, in Advances in Neural Information Processing Systems, Volume 12, MIT Press, pp. 209â215.
AZZALINI, A., BROWNE, R.P., GENTON, M.G., and MCNICHOLAS, P.D. (2016), âOn Nomenclature for, and the Relative Merits of, Two Formulations of Skew Distributionsâ, Statistics and Probability Letters, 110, 201â206.
AZZALINI, A., and CAPITANIO, A. (1999), âStatistical Applications of the Multivariate Skew Normal Distributionâ, Journal of the Royal Statistical Society: Series B, 61(3), 579â602.
AZZALINI, A., and CAPITANIO, A. (2003), âDistributions Generated by Perturbation of Symmetry with Emphasis on a Multivariate Skew t Distributionâ, Journal of the Royal Statistical Society: Series B, 65(2), 367â389.
AZZALINI, A. (2014), The Skew-Normal and Related Families, with the collaboration of A. Capitanio, IMS monographs, Cambridge: Cambridge University Press.
AZZALINI, A., and VALLE, A.D. (1996), âThe Multivariate Skew-Normal Distributionâ, Biometrika / 83, 715â726.
BAEK, J., and MCLACHLAN, G.J. (2008), âMixtures of Factor Analyzers with Common Factor Loadings for the Clustering and Visualisation of High-Dimensional Dataâ, Technical Report NI08018-SCH, Preprint Series of the Isaac Newton Institute for Mathematical Sciences, Cambridge.
BAEK, J., and MCLACHLAN, G.J. (2011), âMixtures of Common t-Factor Analyzers for Clustering High-Dimensional Microarray Dataâ, Bioinformatics, 27, 1269â1276.
BAEK, J., MCLACHLAN, G.J., and FLACK, L.K. (2010), âMixtures of Factor Analyzers with Common Factor Loadings: Applications to the Clustering and Visualization of High-Dimensional Dataâ, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1298â1309.
BANFIELD, J.D., and RAFTERY, A.E. (1993), âModel-Based Gaussian and Non-Gaussian Clusteringâ, Biometrics, 49(3), 803â821.
BARNDORFF-NIELSEN,O.E. (1997), âNormal Inverse Gaussian Distributions and Stochastic Volatility Modellingâ, Scandinavian Journal of Statistics, 24(1), 1â13.
BARTLETT,M.S. (1953), âFactor Analysis in Psychology as a Statistician Sees Itâ, in Uppsala Symposium on Psychological Factor Analysis, Number 3 in Nordisk Psykologiâs Monograph Series, Copenhagen: Ejnar Mundsgaards, pp. 23â34.
BAUDRY, J.-P. (2015), âEstimation and Model Selection for Model-Based Clustering with the Conditional Classification Likelihoodâ, Electronic Journal of Statistics, 9, 1041â1077.
BAUM, L.E., PETRIE, T., SOULES, G., and WEISS, N. (1970), âA Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chainsâ, Annals of Mathematical Statistics, 41, 164â171.
BDIRI, T., BOUGUILA, N., and ZIOU, D. (2016), âVariational Bayesian Inference for Infinite Generalized Inverted Dirichlet Mixtures with Feature Selection and Its Application to Clusteringâ, Applied Intelligence, 44(3), 507â525.
BENSMAIL, H., CELEUX, G., RAFTERY, A.E., and ROBERT, C.P. (1997), âInference in Model-Based Cluster Analysisâ, Statistics and Computing, 7(1), 1â10.
BHATTACHARYA, S., and MCNICHOLAS, P.D. (2014), âA LASSO-Penalized BIC for Mixture Model Selectionâ, Advances in Data Analysis and Classification, 8(1), 45â61.
BIERNACKI, C., CELEUX, G., and GOVAERT, G. (2000), âAssessing a Mixture Model for Clustering with the Integrated Completed Likelihoodâ, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7), 719â725.
BIERNACKI, C., CELEUX, G., and GOVAERT, G. (2003), âChoosing Starting Values for the EM Algorithm for Getting the Highest Likelihood in Multivariate Gaussian Mixture Modelsâ, Computational Statistics and Data Analysis, 41, 561â575.
BIERNACKI, C., CELEUX, G., and GOVAERT, G. (2010), âExact and Monte Carlo Calculations of Integrated Likelihoods for the Latent Class Modelâ, Journal of Statistical Planning and Inference, 140(11), 2991â3002.
BIERNACKI, C., CELEUX, G., GOVAERT, G., and LANGROGNET, F. (2006), âModel-Based Cluster and Discriminant Analysis with the MIXMOD Softwareâ, Computational Statistics and Data Analysis, 51(2), 587â600.
BOUVEYRON, C., and BRUNET-SAUMARD, C. (2014), âModel-Based Clustering of High-Dimensional Data: A Reviewâ, Computational Statistics and Data Analysis, 71, 52â78.
BOUVEYRON, C., CELEUX, G., and Girard, S. (2011), âIntrinsic Dimension Estimation by Maximum Likelihood in Isotropic Probabilistic PCAâ, Pattern Recognition Letters, 32(14), 1706â1713.
BOUVEYRON, C., GIRARD, S., and SCHMID, C. (2007a), âHigh-Dimensional Data Clusteringâ, Computational Statistics and Data Analysis, 52(1), 502â519.
BOUVEYRON, C., GIRARD, S., and SCHMID, C. (2007b), âHigh Dimensional Discriminant Analysisâ, Communications in Statistics â Theory and Methods, 36(14), 2607â2623.
BRANCO, M.D., and DEY, D.K. (2001), âA General Class of Multivariate Skew-Elliptical Distributionsâ, Journal of Multivariate Analysis, 79, 99â113.
BROWNE, R.P., and MCNICHOLAS, P.D. (2012), âModel-Based Clustering and Classification of Data with Mixed Typeâ, Journal of Statistical Planning and Inference, 142(11), 2976â2984.
BROWNE, R.P., and MCNICHOLAS, P.D. (2014a), âEstimating Common Principal Components in High Dimensionsâ, Advances in Data Analysis and Classification, 8(2), 217â226.
BROWNE, R.P., and MCNICHOLAS, P.D. (2014b), mixture: Mixture Models for Clustering and Classification, R Package Version 1.1.
BROWNE, R.P., and P. D. MCNICHOLAS, P.D. (2014c), âOrthogonal Stiefel Manifold Optimization for Eigen-Decomposed Covariance Parameter Estimation in Mixture Modelsâ, Statistics and Computing, 24(2), 203â210.
BROWNE, R.P., and MCNICHOLAS, P.D. (2015), âA Mixture of Generalized Hyperbolic Distributionsâ, Canadian Journal of Statistics, 43(2), 176â198.
BROWNE, R.P., MCNICHOLAS, P.D., and SPARLING, M.D. (2012), âModel-Based Learning Using a Mixture of Mixtures of Gaussian and Uniform Distributionsâ, IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 814â817.
CAGNONE, S., and VIROLI, C. (2012), âA Factor Mixture AnalysisModel for Multivariate Binary Dataâ, Statistical Modelling, 12(3), 257â277.
CAMPBELL, N.A. (1984), âMixture Models and Atypical Valuesâ, Mathematical Geology, 16(5), 465â477.
CARVALHO, C., CHANG, J., LUCAS, J., NEVINS, J., WANG, Q., and WEST, M. (2008), âHigh-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomicsâ, Journal of the American Statistical Association, 103(484), 1438â1456.
CATELL, R.B. (1949), ââRâ and Other Coefficients of Pattern Similarityâ, Psychometrika, 14, 279â298.
CELEUX, G., and GOVAERT, G. (1991), âClustering Criteria for Discrete Data and Latent Class Modelsâ, Journal of Classification, 8(2), 157â176.
CELEUX, G., and GOVAERT, G. (1995), âGaussian Parsimonious Clustering Modelsâ, Pattern Recognition, 28(5), 781â793.
CORDUNEANU, A., and BISHOP, C.M. (2001), âVariational Bayesian Model Selection for Mixture Distributionsâ, in Artificial Intelligence and Statistics, Los Altos, CA: Morgan Kaufmann, pp. 27â34.
CORETTO, P., and HENNIG, C. (2015), âRobust Improper Maximum Likelihood: Tuning, Computation, and a Comparison with Other Methods for Robust Gaussian Clusteringâ, arXiv preprint arXiv:1405.1299v3.
CORMACK, R.M. (1971), âA Review of Classification (With Discussion)â, Journal of the Royal Statistical Society: Series A, 34, 321â367.
DANG, U.J., BROWNE, R.P., and MCNICHOLAS, P.D. (2015), âMixtures of Multivariate Power Exponential Distributionsâ, Biometrics, 71(4), 1081â1089.
DASGUPTA, A., and RAFTERY, A.E. (1998), âDetecting Features in Spatial Point Processes with Clutter ViaModel-Based Clusteringâ, Journal of the American Statistical Association, 93, 294â302.
DAY, N.E. (1969), âEstimating the Components of a Mixture of Normal Distributionsâ, Biometrika, 56, 463â474.
DE LA CRUZ-MESĂA, R., QUINTANA, R.A., and MARSHALL, G. (2008), âModel-Based Clustering for Longitudinal dataâ, Computational Statistics and Data Analysis, 52(3), 1441â1457.
DE VEAUX, R.D., and KRIEGER, A.M. (1990), âRobust Estimation of a Normal Mixtureâ, Statistics and Probability Letters, 10(1), 1â7.
DEAN, N., RAFTERY, A.E., and SCRUCCA, L. (2012), clustvarsel: Variable Selection for Model-Based Clustering, R package version 2.0.
DEMPSTER, A.P., LAIRD, N.M., and RUBIN, D.B. (1977), âMaximum Likelihood from Incomplete Data Via the EM Algorithmâ, Journal of the Royal Statistical Society: Series B, 39(1), 1â38.
DI LASCIO, F.M.L., and GIANNERINI, S. (2012), âA Copula-Based Algorithm for Discovering Patterns of Dependent Observationsâ, Journal of Classification, 29(1), 50â75.
EDWARDS, A.W.F., and CAVALLI-SFORZA, L.L. (1965), âA Method for Cluster Analysisâ, Biometrics, 21, 362â375.
EVERITT, B.S., and HAND, D.J. (1981), Finite Mixture Distributions, Monographs on Applied Probability and Statistics, London: Chapman and Hall.
EVERITT, B.S., LANDAU, S., LEESE, M., and STAHL, D. (2011), Cluster Analysis (5th ed.), Chichester: John Wiley & Sons.
FABRIGAR, L.R., WEGENER, D.T., MACCALLUM, R.C., and STRAHAN, E.J. (1999), âEvaluating the Use of Exploratory Factor Analysis in Psychological Researchâ, Psychological Methods, 4(3), 272â299.
FLURY, B. (1988), Common Principal Components and Related Multivariate Models, New York: Wiley.
FRALEY, C., and RAFTERY, A.E. (1998), âHow Many Clusters? Which Clustering Methods? Answers Via Model-Based Cluster Analysisâ, The Computer Journal, 41(8), 578â588.
FRALEY, C., and RAFTERY, A.E. (1999), âMCLUST: Software for Model-Based Cluster Analysisâ, Journal of Classification, 16, 297â306.
FRALEY, C., and RAFTERY, A.E. (2002a), âMCLUST: Software for Model-Based Clustering, Density Estimation, and Discriminant Analysisâ, Technical Report 415, University of Washington, Department of Statistics.
FRALEY, C., and RAFTERY, A.E. (2002b), âModel-Based Clustering, Discriminant Analysis, and Density Estimationâ, Journal of the American Statistical Association, 97(458), 611â631.
FRANCZAK, B.C., BROWNE, R.P., and MCNICHOLAS, P.D. (2014), âMixtures of Shifted Asymmetric Laplace Distributionsâ, IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6), 1149â1157.
FRIEDMAN, H.P., and RUBIN, J. (1967), âOn Some Invariant Criteria for Grouping Dataâ, Journal of the American Statistical Association, 62, 1159â1178.
FRITZ, H., GARCĂA-ESCUDERO, L.A., and MAYO-ISCAR, A. (2012), âtclust: An R Package for a Trimming Approach to Cluster Analysisâ, Journal of Statistical Software, 47(12), 1â26.
FRĂHWIRTH-SCHNATTER, S. (2006), Finite Mixture and Markov Switching Models, New York: Springer-Verlag.
GALIMBERTI, G., MONTANARI, A., and VIROLI, C. (2009), âPenalized Factor Mixture Analysis for Variable Selection in Clustered Dataâ, Computational Statistics and Data Analysis, 53, 4301â4310.
GARCĂA-ESCUDERO, L.A.,GORDALIZA,A., MATRN, C., andMAYO-ISCAR,A. (2008), âA General Trimming Approach to Robust Cluster Analysisâ, The Annals of Statistics, 36(3), 1324â1345.
GERSHENFELD, N. (1997), âNonlinear Inference and Cluster-Weighted Modelingâ, Annals of the New York Academy of Sciences, 808(1), 18â24.
GHAHRAMANI, Z., and HINTON, G.E. (1997), âThe EM Algorithm for Factor Analyzersâ, Technical Report CRG-TR-96-1, University of Toronto, Toronto, Canada.
GOLLINI, I., and MURPHY, T.B. (2014), âMixture of Latent Trait Analyzers for Model-Based Clustering of Categorical Dataâ, Statistics and Computing, 24(4), 569â588.
GĂMEZ, E., GĂMEZ-VIILEGAS, M.A., and MARIN, J.M. (1998), âA Multivariate Generalization of the Power Exponential Family of Distributionsâ, Communications in Statistics â Theory and Methods, 27(3), 589â600.
GĂMEZ-SĂ NCHEZ-MANZANO, E., GĂMEZ-VILLEGAS, M.A., and MarĂn, J.M. (2008), âMultivariate Exponential Power Distributions as Mixtures of Normal Distributions with Bayesian Applicationsâ, Communications in Statistics â Theory and Methods, 37(6), 972â985.
GOODMAN, L. (1974), âExploratory Latent Structure Analysis Using Both Identifiable and Unidentifiable Modelsâ, Biometrika, 61(2), 215â231.
GORDON, A.D. (1981), Classification, London: Chapman and Hall.
GRESELIN, F., and INGRASSIA, S. (2010), âConstrained Monotone EM Algorithms for Mixtures of Multivariate t-Distributionsâ, Statistics and Computing, 20(1), 9â22.
HATHAWAY, R.J. (1985), âA Constrained Formulation of Maximum Likelihood Estimation for Normal Mixture Distributionsâ, The Annals of Statistics, 13(2), 795â800.
HEISER, W.J. (1995), âRecent Advances in Descriptive Multivariate Analysisâ, in Convergent Computation by Iterative Majorization: Theory and Applications in Multidimensional Data Analysis, ed. W.J. Krzanowski, Oxford: Oxford University Press, pp. 157â189.
HENNIG, C. (2000), âIdentifiablity of Models for Clusterwise Linear Regressionâ, Journal of Classification, 17(2), 273â296.
HENNIG, C. (2004), âBreakdown Points for Maximum Likelihood Estimators of Location-Scale Mixturesâ, The Annals of Statistics, 32(4), 1313â1340.
HENNIG, C. (2015), âWhat are the True Clusters?â, Pattern Recognition Letters, 64, 53â62.
HORN, J.L. (1965), âA Rationale and Technique for Estimating the Number of Factors in Factor Analysisâ, Psychometrika, 30, 179â185.
HU, W. (2005), Calibration of Multivariate Generalized Hyperbolic Distributions Using the EM Algorithm, with Applications in Risk Management, Portfolio Optimization and Portfolio Credit Risk, Ph. D. thesis, The Florida State University, Tallahassee.
HUBER, P.J. (1964), âRobust Estimation of a Location Parameterâ, The Annals of Mathematical Statistics, 35, 73â101.
HUBER, P.J. (1981), Robust Statistics, New York: Wiley.
HUMBERT, S., SUBEDI, S., COHN, J., ZENG, B., BI, Y.-M., CHEN, X., ZHU, T., MCNICHOLAS, P.D., and ROTHSTEIN, S.J. (2013), âGenome-Wide Expression Profiling of Maize in Response to Individual and Combined Water and Nitrogen Stressesâ, BMC Genetics, 14(3).
HUMPHREYS, L.G., and ILGEN, D.R. (1969), âNote on a Criterion for the Number of Common Factorsâ, Educational and Psychological Measurements, 29, 571â578.
HUMPHREYS, L.G., and MONTANELLI, R.G. JR. (1975), âAn Investigation of the Parallel Analysis Criterion for Determining the Number of Common Factorsâ, Multivariate Behavioral Research, 10, 193â205.
INGRASSIA, S., MINOTTI, S.C., and PUNZO, A. (2014), âModel-Based Clustering Via Linear Cluster-Weighted Modelsâ, Computational Statistics and Data Analysis, 71, 159â182.
INGRASSIA, S., MINOTTI, S.C., PUNZO, A., and VITTADINI, G. (2015), âThe Generalized Linear Mixed Cluster-Weighted Modelâ, Journal of Classification, 32(1), 85â113.
INGRASSIA, S., MINOTTI, S.C., and VITTADINI, G. (2012), âLocal Statistical Modeling Via the Cluster-Weighted Approach with Elliptical Distributionsâ, Journal of Classification, 29(3), 363â401.
INGRASSIA, S., and PUNZO, A. (2015), âDecision Boundaries for Mixtures of Regressionsâ, Journal of the Korean Statistical Society, 44(2), 295â306.
JAAKKOLA, T.S., and JORDAN, M.I. (2000), âBayesian Parameter Estimation Via Variational Methodsâ, Statistics and Computing, 10(1), 25â37.
JAIN, S., and NEAL, R.M. (2004), âA Split-Merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Modelâ, Journal of Computational and Graphical Statistics, 13(1), 158â182.
JAJUGA, K., and PAPLA, D. (2006), âCopula Functions in Model Based Clusteringâ, in From Data and Information Analysis to Knowledge Engineering, Studies in Classification, Data Analysis, and Knowledge Organization, eds. M. Spiliopoulou, R. Kruse, C. Borgelt, A.N¨urnberger, and W. Gaul, Berlin, Heidelberg: Springer, pp. 603â613.
JORDAN, M.I., ZGHAHRAMANI, Z., JAAKKOLA, T.S., and SAUL, L.K. (1999), âAn Introduction to Variational Methods for Graphical Modelsâ, Machine Learning, 37, 183â233.
JĂRESKOG, K.G. (1990), âNew Developments in LISREL: Analysis of Ordinal Variables Using Polychoric Correlations and Weighted Least Squaresâ, Quality and Quantity, 24(4), 387â404.
KARLIS, D., and SANTOURIAN, A. (2009), âModel-Based Clustering with Non-Elliptically Contoured Distributionsâ, Statistics and Computing, 19(1), 73â83.
KASS, R.E., and RAFTERY, A.E. (1995), âBayes Factorsâ, Journal of the American Statistical Association, 90(430), 773â795.
KERIBIN, C. (2000), âConsistent Estimation of the Order of Mixture Modelsâ, SankhyÄ. The Indian Journal of Statistics. Series A, 62(1), 49â66.
KHARIN, Y. (1996), Robustness in Statistical Pattern Recognition, Dordrecht: Kluwer.
KOSMIDIS, I., and KARLIS, D. (2015), âModel-Based Clustering Using Copulas with Applicationsâ, arXiv preprint arXiv:1404.4077v5.
KOTZ, S., KOZUBOWSKI, T.J., and PODGORSKI, K. (2001), The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance (1st ed.), Boston: Burkhäuser.
LAWLEY, D.N., and MAXWELL, A.E. (1962), âFactor Analysis as a Statistical Methodâ, Journal of the Royal Statistical Society: Series D, 12(3), 209â229.
LEE, S., and MCLACHLAN, G.J. (2011), âOn the Fitting of Mixtures of Multivariate Skew t-distributions Via the EM Algorithmâ, arXiv:1109.4706.
LEE, S., and MCLACHLAN, G.J.(2014), âFinite Mixtures of Multivariate Skew t-Distributions: Some Recent and New Resultsâ, Statistics and Computing, 24, 181â202.
LEE, S.X., and MCLACHLAN, G.J. (2013a), âModel-Based Clustering and Classification with Non-Normal Mixture Distributionsâ, Statistical Methods and Applications, 22(4), 427â454.
LEE, S.X., and MCLACHLAN, G.J. (2013b), âOn Mixtures of Skew Normal and Skew t-Distributionsâ, Advances in Data Analysis and Classification, 7(3), 241â266.
LEISCH, F. (2004), âFlexmix: A General Framework For Finite Mixture Models And Latent Class Regression in Râ, Journal of Statistical Software, 11(8), 1â18.
LEROUX, B.G. (1992), âConsistent Estimation of a Mixing Distributionâ, The Annals of Statistics, 20(3), 1350â1360.
LI, J. (2005), âClustering Based on a Multi-Layer Mixture Modelâ, Journal of Computational and Graphical Statistics, 14(3), 547â568.
LI, K.C. (1991), âSliced Inverse Regression for Dimension Reduction (With Discussion)â, Journal of the American Statistical Association, 86, 316â342.
LI, K.C. (2000), âHigh Dimensional Data Analysis Via the SIR/PHD Approachâ, Unpublished manuscript.
LIN, T.-I. (2009), âMaximum Likelihood Estimation for Multivariate Skew Normal Mixture Modelsâ, Journal of Multivariate Analysis, 100, 257â265.
LIN, T.-I. (2010), âRobust Mixture Modeling Using Multivariate Skew t Distributionsâ, Statistics and Computing, 20(3), 343â356.
LIN, T.-I., MCLACHLAN, G.J., and LEE, S.X. (2016), âExtending Mixtures of Factor Models Using the Restricted Multivariate Skew-Normal Distributionâ, Journal of Multivariate Analysis, 143, 398â413.
LIN, T.-I., MCNicholas, P.D., and HSIU, J.H. (2014), âCapturing Patterns Via Parsimonious t Mixture Modelsâ, Statistics and Probability Letters, 88, 80â87.
LOPES, H.F., and WEST, M. (2004), âBayesian Model Assessment in Factor Analysisâ, Statistica Sinica, 14, 41â67.
MARBAC, M., BIERNACKI, C., and VANDEWALLE, V. (2014), âFinite Mixture Model of Conditional Dependencies Modes to Cluster Categorical Dataâ, arXiv preprint arXiv:1402.5103.
MARBAC, M., BIERNACKI, C., and VANDEWALLE, V. (2015), âModel-Based Clustering of Gaussian Copulas for Mixed Dataâ, arXiv preprint arXiv:1405.1299v3.
MARKATOU, M. (2000), âMixture Models, Robustness, and the Weighted Likelihood Methodologyâ, Biometrics, 56(2), 483â486.
MAUGIS, C. (2009), âThe Selvarclust Softwareâ, www.math.univ-toulouse.fr/~maugis/SelvarClustHomepage.html.
MAUGIS, C., CELEUX, G., and MARTIN-MAGNIETTE, M.-L. (2009a), âVariable Selection for Clustering with Gaussian Mixture Modelsâ, Biometrics, 65(3), 701â709.
MAUGIS, C., CELEUX, G., and MARTIN-MAGNIETTE, M.-L. (2009b), âVariable Selection in Model-Based Clustering: A General Variable Role Modelingâ, Computational Statistics and Data Analysis, 53(11), 3872â3882.
MCGRORY, C., and TITTERINGTON, D. (2007), âVariational Approximations in Bayesian Model Selection for Finite Mixture Distributionsâ, Computational Statistics and Data Analysis, 51(11), 5352â5367.
MCLACHLAN, G.J., and BASFORD, K.E. (1988), Mixture Models: Inference and Applications to Clustering, New York: Marcel Dekker Inc.
MCLACHLAN, G.J., BEAN, R.W., and JONES, L.B.-T. (2007), âExtension of the Mixture of Factor Analyzers Model to Incorporate the Multivariate t-Distributionâ, Computational Statistics and Data Analysis, 51(11), 5327â5338.
MCLACHLAN, G.J., and KRISHNAN, T. (2008), The EM Algorithm and Extensions (2nd ed.), New York: Wiley.
MCLACHLAN, G.J., and PEEL, D. (1998), âRobust Cluster Analysis Via Mixtures of Multivariate t-Distributionsâ, in Lecture Notes in Computer Science, Volume 1451, Berlin: Springer-Verlag, pp. 658â666.
MCLACHLAN, G.J., and PEEL, D. (2000a), Finite Mixture Models, New York: John Wiley & Sons.
MCLACHLAN, G.J., and PEEL, D. (2000b), âMixtures of Factor Analyzersâ, in Proceedings of the Seventh International Conference on Machine Learning, San Francisco, Morgan Kaufmann, pp. 599â606.
MCNEIL, A.J., FREY, R., and EMBRECHTS, P. (2005), Quantitative Risk Management: Concepts, Techniques and Tools., Princeton: Princeton University Press.
MCNICHOLAS, P.D. (2013), âModel-Based Clustering and Classification Via Mixtures of Multivariate t-Distributionsâ, in Statistical Models for Data Analysis, Studies in Classification, Data Analysis, and Knowledge Organization, eds. P. Giudici, S. Ingrassia, and M. Vichi, Switzerland: Springer International Publishing.
MCNICHOLAS, P.D. (2016), Mixture Model-Based Classification, Boca Raton FL: Chapman & Hall/CRC Press.
MCNICHOLAS, P.D., and BROWNE, R.P. (2013), âDiscussion of âHow to Find an Appropriate Clustering for Mixed-Type Variables with Application to Socio-Economic Stratificationâ by Hennig and Liaoâ, Journal of the Royal Statistical Society: Series C, 62(3), 352â353.
MCNICHOLAS, P.D., ELSHERBINY, A., MCDAID, A.F., and MURPHY, T.B. (2015), pgmm: Parsimonious Gaussian Mixture Models, R Package Version 1.2.
MCNICHOLAS, P.D., JAMPANI, K.R., and SUBEDI, S. (2015), longclust: Model-Based Clustering and Classification for Longitudinal Data, R Package Version 1.2.
MCNICHOLAS, P.D., and MURPHY, T.B. (2005), âParsimonious Gaussian Mixture Modelsâ, Technical Report 05/11, Department of Statistics, Trinity College Dublin, Dublin, Ireland.
MCNICHOLAS, P.D., and MURPHY, T.B. (2008), âParsimonious Gaussian Mixture Modelsâ, Statistics and Computing, 18(3), 285â296.
MCNICHOLAS, P.D., and MURPHY, T.B. (2010a), âModel-Based Clustering of Longitudinal Dataâ, Canadian Journal of Statistics, 38(1), 153â168.
MCNICHOLAS, P.D., and MURPHY, T.B. (2010b), âModel-Based Clustering of Microarray Expression Data Via Latent Gaussian Mixture Modelsâ, Bioinformatics, 26(21), 2705â2712.
MCNICHOLAS, P.D., and SUBEDI, S. (2012), âClustering Gene Expression Time Course Data Using Mixtures of Multivariate t-Distributionsâ, Journal of Statistical Planning and Inference, 142(5), 1114â1127.
MCNICHOLAS, S.M., MCNICHOLAS, P.D., and BROWNE, R.P. (2014), âMixtures of Variance-Gamma Distributionsâ, arxiv preprint arXiv:1309.2695v2.
MCPARLAND, D., GORMLEY, I.C., MCCORMICK, T.H., CLARK, S.J., KABUDULA, C.W., and COLLINSON, M.A. (2014), âClustering South African Households Based on Their Asset Status Using Latent Variable Modelsâ, The Annals of Applied Statistics, 8(2), 747â776.
MCQUITTY, L.L. (1956), âAgreement Analysis: A Method of Classifying Subjects According to Their Patterns of Responsesâ, British Journal of Statistical Psychology, 9, 5â16.
MELNYKOV, V. (2016), âModel-Based Biclustering of Clickstream Dataâ, Computational Statistics and Data Analysis, 93, 31â45.
MENG, X.-L., and RUBIN, D.B. (1993), âMaximum Likelihood Estimation Via the ECM Algorithm: A General Frameworkâ, Biometrika, 80, 267â278.
MENG, X.-L., and VAN DYK, D. (1997), âThe EM AlgorithmâAn Old Folk Song Sung to a Fast New Tune (With Discussion)â, Journal of the Royal Statistical Society: Series B, 59(3), 511â567.
MONTANARI, A., and VIROLI, C. (2010a), âHeteroscedastic Factor Mixture Analysisâ, Statistical Modelling, 10(4), 441â460.
MONTANARI, A., and VIROLI, C. (2010b), âA Skew-Normal Factor Model for the Analysis of Student Satisfaction Towards University Coursesâ, Journal of Applied Statistics, 43, 473â487.
MONTANARI, A., and VIROLI, C. (2011), âMaximum Likelihood Estimation of Mixture of Factor Analyzersâ, Computational Statistics and Data Analysis, 55, 2712â2723.
MONTANELLI, R.G., JR., and HUMPHREYS, L.G. (1976), âLatent Roots of Random Data Correlation Matrices with Squared Multiple Correlations on the Diagonal: A Monte Carlo Studyâ, Psychometrika, 41, 341â348.
MORRIS, K., and MCNICHOLAS, P.D. (2013), âDimension Reduction for Model-Based Clustering ViaMixtures of Shifted Asymmetric Laplace Distributionsâ, Statistics and Probability Letters, 83(9), 2088â2093, Erratum 2014, 85,168.
MORRIS, K., and MCNICHOLAS, P.D. (2016), âClustering, Classification, Discriminant Analysis, and Dimension Reduction Via Generalized Hyperbolic Mixturesâ, Computational Statistics and Data Analysis, 97, 133â150.
MORRIS, K., MCNICHOLAS, P.D., and SCRUCCA, L. (2013), âDimension Reduction for Model-Based Clustering Via Mixtures of Multivariate t-Distributionsâ, Advances in Data Analysis and Classification, 7(3), 321â338.
MURRAY, P.M., BROWNE, R.B., and MCNICHOLAS, P.D. (2014a), âMixtures of Skew-t Factor Analyzersâ, Computational Statistics and Data Analysis, 77, 326â335.
MURRAY, P.M., MCNICHOLAS, P.D., and BROWNE, R.B. (2014b), âA Mixture of Common Skew-t Factor Analyzersâ, Stat, 3(1), 68â82.
MUTHEN, B., and ASPAROUHOV, T. (2006), âItem Response Mixture Modeling: Application to Tobacco Dependence Criteriaâ, Addictive Behaviors, 31, 1050â1066.
OâHAGAN, A., MURPHY, T.B., GORMLEY, I.C., MCNICHOLAS, P.D., and KARLIS, D. (2016), âClustering with the Multivariate Normal Inverse Gaussian Distributionâ, Computational Statistics and Data Analysis, 93, 18â30.
ORCHARD, T., and WOODBURY, M.A. (1972), âA Missing Information Principle: Theory and Applicationsâ, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics, eds. L.M. Le Cam, J. Neyman, and E.L. Scott, Berkeley: University of California Press, pp. 697â715.
PAN, J., and MACKENZIE, G. (2003), âOn Modelling Mean-Covariance Structures in Longitudinal Studiesâ, Biometrika, 90(1), 239â244.
PEARSON, K. (1894), âContributions to the Mathematical Theory of Evolutionâ, Philosophical Transactions of the Royal Society, Part A, 185, 71â110.
PEEL, D., and MCLACHLAN, G.J. (2000), âRobust Mixture Modelling Using the t Distributionâ, Statistics and Computing, 10(4), 339â348.
POURAHMADI, M. (1999), âJoint Mean-Covariance Models with Applications to Longitudinal Data: Unconstrained Parameterisationâ, Biometrika, 86(3), 677â690.
POURAHMADI, M. (2000), âMaximum Likelihood Estimation of Generalised Linear Models for Multivariate Normal Covariance Matrixâ, Biometrika, 87(2), 425â435.
POURAHMADI, M., DANIELS, M., and PARK, T. (2007), âSimultaneous Modelling of the Cholesky Decomposition of Several Covariance Matricesâ, Journal of Multivariate Analysis, 98, 568â587.
PUNZO, A. (2014), âFlexible Mixture Modeling with the Polynomial Gaussian Cluster-Weighted Modelâ, Statistical Modelling, 14(3), 257â291.
PUNZO, A., and INGRASSIA, S. (2015a), âClustering Bivariate Mixed-Type Data Via the Cluster-Weighted Modelâ, Computational Statistics. To appear.
PUNZO, A., and INGRASSIA, S. (2015b), âParsimonious Generalized Linear Gaussian Cluster-Weighted Modelsâ, in, Advances in Statistical Models for Data Analysis, Studies in Classification, Data Analysis and Knowledge Organization, Switzerland, eds. I. Morlini, T. Minerva, and M. Vichi, Springer International Publishing, pp. 201â209.
PUNZO, A., and MCNICHOLAS, P.D. (2014a), âRobust Clustering in Regression Analysis Via the Contaminated Gaussian Cluster-Weighted Modelâ, arXiv preprint arXiv:1409.6019v1.
PUNZO, A., and MCNICHOLAS, P.D. (2014b), âRobust High-Dimensional Modeling with the Contaminated Gaussian Distributionâ, arXiv preprint arXiv:1408.2128v1.
PUNZO, A., and MCNICHOLAS, P.D. (2016), âParsimonious Mixtures of Multivariate Contaminated Normal Distributionsâ, Biometrical Journal. To appear.
R CORE TEAM (2015), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing.
RAFTERY, A.E. (1995), âBayesian Model Selection in Social Research (With Discussion)â, Sociological Methodology, 25, 111â193.
RAFTERY, A.E., and DEAN, N. (2006), âVariable Selection for Model-Based Clusteringâ, Journal of the American Statistical Association, 101(473), 168â178.
RANALLI, M., and ROCCI, R. (2016),âMixture Methods for Ordinal Data: A Pairwise Likelihood Approachâ, Statistics and Computing, 26(1), 529â547.
RAO, C.R. (1952), Advanced Statistical Methods in Biometric Research, New York: John Wiley and Sons, Inc.
RAU, A., MAUGIS-RABUSSEAU, C., MARTIN-MAGNIETTE, M.-L., and CELEUX, G. (2015), “Co-expression Analysis of High-Throughput Transcriptome Sequencing Data with Poisson Mixture Models”, Bioinformatics, 31(9), 1420–1427.
SAHU, K., DEY, D.K., and BRANCO, M.D. (2003), “A New Class of Multivariate Skew Distributions with Applications to Bayesian Regression Models”, Canadian Journal of Statistics, 31(2), 129–150. Corrigendum: Vol. 37 (2009), 301–302.
SCHÖNER, B. (2000), Probabilistic Characterization and Synthesis of Complex Data Driven Systems, Ph.D. thesis, Cambridge MA: MIT.
SCHROETER, P., VESIN, J., LANGENBERGER, T., and MEULI, R. (1998), “Robust Parameter Estimation of Intensity Distributions for Brain Magnetic Resonance Images”, IEEE Transactions on Medical Imaging, 17(2), 172–186.
SCHWARZ, G. (1978), “Estimating the Dimension of a Model”, The Annals of Statistics, 6(2), 461–464.
SCOTT, A.J., and SYMONS, M.J. (1971), “Clustering Methods Based on Likelihood Ratio Criteria”, Biometrics, 27, 387–397.
SCRUCCA, L. (2010), “Dimension Reduction for Model-Based Clustering”, Statistics and Computing, 20(4), 471–484.
SCRUCCA, L. (2014), “Graphical Tools for Model-Based Mixture Discriminant Analysis”, Advances in Data Analysis and Classification, 8(2), 147–165.
SHIREMAN, E., STEINLEY, D., and BRUSCO, M.J. (2015), “Examining the Effect of Initialization Strategies on the Performance of Gaussian Mixture Modeling”, Behavior Research Methods.
SPEARMAN, C. (1904), “The Proof and Measurement of Association Between Two Things”, American Journal of Psychology, 15, 72–101.
SPEARMAN, C. (1927), The Abilities of Man: Their Nature and Measurement, London: MacMillan and Co., Limited.
STEANE, M.A., MCNICHOLAS, P.D., and YADA, R. (2012), “Model-Based Classification Via Mixtures of Multivariate t-Factor Analyzers”, Communications in Statistics – Simulation and Computation, 41(4), 510–523.
STEELE, R.J., and RAFTERY, A.E. (2010), “Performance of Bayesian Model Selection Criteria for Gaussian Mixture Models”, in Frontiers of Statistical Decision Making and Bayesian Analysis, Vol. 2, New York: Springer, pp. 113–130.
STEPHENSON, W. (1953), The Study of Behavior, Chicago: University of Chicago Press.
SUBEDI, S., and MCNICHOLAS, P.D. (2014), “Variational Bayes Approximations for Clustering Via Mixtures of Normal Inverse Gaussian Distributions”, Advances in Data Analysis and Classification, 8(2), 167–193.
SUBEDI, S., and MCNICHOLAS, P.D. (2016), “A Variational Approximations-DIC Rubric for Parameter Estimation and Mixture Model Selection Within a Family Setting”, arXiv preprint arXiv:1306.5368v2.
SUBEDI, S., PUNZO, A., INGRASSIA, S., and MCNICHOLAS, P.D. (2013), “Clustering and Classification Via Cluster-Weighted Factor Analyzers”, Advances in Data Analysis and Classification, 7(1), 5–40.
SUBEDI, S., PUNZO, A., INGRASSIA, S., and MCNICHOLAS, P.D. (2015), “Cluster-Weighted t-Factor Analyzers for Robust Model-Based Clustering and Dimension Reduction”, Statistical Methods and Applications, 24(4), 623–649.
SUNDBERG, R. (1974), “Maximum Likelihood Theory for Incomplete Data from an Exponential Family”, Scandinavian Journal of Statistics, 1(2), 49–58.
TANG, Y., BROWNE, R.P., and MCNICHOLAS, P.D. (2015), “Model-Based Clustering of High-Dimensional Binary Data”, Computational Statistics and Data Analysis, 87, 84–101.
TESCHENDORFF, A., WANG, Y., BARBOSA-MORAIS, J., BRENTON, N., and CALDAS, C. (2005), “A Variational Bayesian Mixture Modelling Framework for Cluster Analysis of Gene-Expression Data”, Bioinformatics, 21(13), 3025–3033.
TIEDEMAN, D.V. (1955), “On the Study of Types”, in Symposium on Pattern Analysis, ed. S.B. Sells, Randolph Field, Texas: Air University, U.S.A.F. School of Aviation Medicine, pp. 1–14.
TIPPING, M.E. (1999), “Probabilistic Visualization of High-Dimensional Binary Data”, Advances in Neural Information Processing Systems (11), 592–598.
TIPPING, M.E., and BISHOP, C.M. (1997), “Mixtures of Probabilistic Principal Component Analysers”, Technical Report NCRG/97/003, Aston University (Neural Computing Research Group), Birmingham, UK.
TIPPING, M.E., and BISHOP, C.M. (1999), “Mixtures of Probabilistic Principal Component Analysers”, Neural Computation, 11(2), 443–482.
TITTERINGTON, D.M., SMITH, A.F.M., and MAKOV, U.E. (1985), Statistical Analysis of Finite Mixture Distributions, Chichester: John Wiley & Sons.
TORTORA, C., MCNICHOLAS, P.D., and BROWNE, R.P. (2015), “A Mixture of Generalized Hyperbolic Factor Analyzers”, Advances in Data Analysis and Classification. To appear.
TRYON, R.C. (1939), Cluster Analysis, Ann Arbor: Edwards Brothers.
TRYON, R.C. (1955), “Identification of Social Areas by Cluster Analysis”, in University of California Publications in Psychology, Volume 8, Berkeley: University of California Press.
VERMUNT, J.K. (2003), “Multilevel Latent Class Models”, Sociological Methodology, 33(1), 213–239.
VERMUNT, J.K. (2007), “Multilevel Mixture Item Response Theory Models: An Application in Education Testing”, in Proceedings of the 56th Session of the International Statistical Institute, Lisbon, Portugal, pp. 22–28.
VIROLI, C. (2010), “Dimensionally Reduced Model-Based Clustering Through Mixtures of Factor Mixture Analyzers”, Journal of Classification, 27(3), 363–388.
VRAC, M., BILLARD, L., DIDAY, E., and CHEDIN, A. (2012), “Copula Analysis of Mixture Models”, Computational Statistics, 27(3), 427–457.
VRBIK, I., and MCNICHOLAS, P.D. (2012), “Analytic Calculations for the EM Algorithm for Multivariate Skew-t Mixture Models”, Statistics and Probability Letters, 82(6), 1169–1174.
VRBIK, I., and MCNICHOLAS, P.D. (2014), “Parsimonious Skew Mixture Models for Model-Based Clustering and Classification”, Computational Statistics and Data Analysis, 71, 196–210.
VRBIK, I., and MCNICHOLAS, P.D. (2015), “Fractionally-Supervised Classification”, Journal of Classification, 32(3), 359–381.
WANG, Q., CARVALHO, C., LUCAS, J., and WEST, M. (2007), “BFRM: Bayesian Factor Regression Modelling”, Bulletin of the International Society for Bayesian Analysis, 14(2), 4–5.
WATERHOUSE, S., MACKAY, D., and ROBINSON, T. (1996), “Bayesian Methods for Mixture of Experts”, in Advances in Neural Information Processing Systems, Vol. 8. Cambridge, MA: MIT Press.
WEI, Y., and MCNICHOLAS, P.D. (2015), “Mixture Model Averaging for Clustering”, Advances in Data Analysis and Classification, 9(2), 197–217.
WEST, M. (2003), “Bayesian Factor Regression Models in the ‘Large p, Small n’ Paradigm”, in Bayesian Statistics, Volume 7, eds. J.M. Bernardo, M. Bayarri, J. Berger, A. Dawid, D. Heckerman, A. Smith, and M. West, Oxford: Oxford University Press, pp. 723–732.
WOLFE, J.H. (1963), “Object Cluster Analysis of Social Areas”, Master’s thesis, University of California, Berkeley.
WOLFE, J.H. (1965), “A Computer Program for the Maximum Likelihood Analysis of Types”, Technical Bulletin 65–15, U.S. Naval Personnel Research Activity.
WOLFE, J.H. (1970), “Pattern Clustering by Multivariate Mixture Analysis”, Multivariate Behavioral Research, 5, 329–350.
YOSHIDA, R., HIGUCHI, T., and IMOTO, S. (2004), “A Mixed Factors Model for Dimension Reduction and Extraction of a Group Structure in Gene Expression Data”, in Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference, pp. 161–172.
YOSHIDA, R., HIGUCHI, T., IMOTO, S., and MIYANO, S. (2006), “ArrayCluster: An Analytic Tool for Clustering, Data Visualization and Module Finder on Gene Expression Profiles”, Bioinformatics, 22, 1538–1539.
ZHOU, H., and LANGE, K.L. (2010), “On the Bumpy Road to the Dominant Mode”, Scandinavian Journal of Statistics, 37(4), 612–631.
Additional information
The author is grateful to Chapman & Hall/CRC Press for allowing some text and figures from his monograph (McNicholas 2016) to be used in this review paper. The author is thankful for the helpful comments of an anonymous reviewer and the Editor. The work is partly supported by the Canada Research Chairs program.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
McNicholas, P.D. Model-Based Clustering. J Classif 33, 331–373 (2016). https://doi.org/10.1007/s00357-016-9211-9
DOI: https://doi.org/10.1007/s00357-016-9211-9