Abstract
Finite mixtures of regression (FMR) models offer a flexible framework for investigating heterogeneity in data with functional dependencies. These models can be conveniently used for unsupervised learning on data with clear regression relationships. We extend such models by imposing an eigen-decomposition on the multivariate error covariance matrix. By constraining parts of this decomposition, we obtain families of parsimonious mixtures of regressions and mixtures of regressions with concomitant variables. These families of models account for correlations between multiple responses. An expectation-maximization algorithm is presented for parameter estimation, and its performance is illustrated on simulated and real data.
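To make the eigen-decomposition concrete, the following is a sketch in the notation commonly used for Gaussian parsimonious clustering. The symbols are assumptions for illustration and are not taken from the abstract: G components, p responses, covariate vector x, mixing proportions \pi_g, and regression coefficient matrices B_g. The chapter's own notation may differ.

```latex
% Assumed notation (not from the abstract): G components, p responses,
% covariates x, mixing proportions \pi_g, coefficient matrices B_g.
\[
  f(\mathbf{y} \mid \mathbf{x})
    = \sum_{g=1}^{G} \pi_g \,
      \phi_p\!\left(\mathbf{y};\ \mathbf{B}_g^{\top}\mathbf{x},\ \boldsymbol{\Sigma}_g\right),
  \qquad
  \boldsymbol{\Sigma}_g = \lambda_g \, \mathbf{D}_g \mathbf{A}_g \mathbf{D}_g^{\top}.
\]
% \lambda_g = |\Sigma_g|^{1/p}   : volume (a positive scalar)
% D_g                            : orthogonal matrix of eigenvectors (orientation)
% A_g                            : diagonal matrix with |A_g| = 1 (shape)
%
% Example of the parsimony gained by constraining the decomposition:
%   unconstrained \Sigma_g                 : G p (p + 1) / 2 covariance parameters
%   \Sigma_g = \lambda D A D^{\top} (equal): p (p + 1) / 2
%   \Sigma_g = \lambda I                   : 1
```

Under these assumptions, holding the volume, shape, or orientation equal across components, or fixing the shape or orientation to the identity, is what yields a family of models with differing numbers of free covariance parameters.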
Acknowledgements
This work is supported by an Alexander Graham Bell Canada Graduate Scholarship (CGS-D; Dang) as well as a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (McNicholas).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dang, U.J., McNicholas, P.D. (2015). Families of Parsimonious Finite Mixtures of Regression Models. In: Morlini, I., Minerva, T., Vichi, M. (eds) Advances in Statistical Models for Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-17377-1_9
DOI: https://doi.org/10.1007/978-3-319-17377-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17376-4
Online ISBN: 978-3-319-17377-1
eBook Packages: Mathematics and Statistics (R0)