Abstract
Variable selection for mixture of regression models has been the focus of much research in recent years. These models combine the ideas of mixture models, regression models, and variable selection to uncover group structures and key relationships between data sets. The objective is to identify homogeneous groups of objects and determine the cluster-specific subsets of covariates modulating the outcomes. In this chapter, we review frequentist and Bayesian methods that we have proposed to address, in a unified manner, the problems of cluster identification and cluster-specific variable selection in mixture of regression models. These methods have a wide range of applications, particularly in high-dimensional data analysis. We illustrate their performance in two diverse areas: ecology, for modeling species-rich ecosystems, and genomics, for integrating data from different genomic sources.
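As a point of reference, a generic finite mixture of regressions with a univariate response can be written as below. This is a standard textbook form given as a sketch, not necessarily the multivariate formulation developed in the chapter; the symbols $K$, $\pi_k$, $\beta_k$, $\sigma_k^2$ and $\mathcal{S}_k$ are notation assumed here for illustration.

```latex
% Generic K-component mixture of linear regressions (assumed notation)
\[
  f(y_i \mid x_i) \;=\; \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(y_i \,;\, x_i^{\top}\beta_k,\ \sigma_k^2\right),
  \qquad \pi_k \ge 0,\quad \sum_{k=1}^{K}\pi_k = 1 ,
\]
% Cluster-specific variable selection: each component keeps its own support
\[
  \mathcal{S}_k \;=\; \{\, j : \beta_{kj} \neq 0 \,\}, \qquad k = 1,\dots,K .
\]
```

Under this notation, cluster identification amounts to inferring which component generated each observation, and cluster-specific variable selection amounts to estimating the sets $\mathcal{S}_k$, which may differ across components.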
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Tadesse, M.G., Mortier, F., Monni, S. (2016). Uncovering Cluster Structure and Group-Specific Associations: Variable Selection in Multivariate Mixture Regression Models. In: Toni, B. (eds) Mathematical Sciences with Multidisciplinary Applications. Springer Proceedings in Mathematics & Statistics, vol 157. Springer, Cham. https://doi.org/10.1007/978-3-319-31323-8_21
DOI: https://doi.org/10.1007/978-3-319-31323-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31321-4
Online ISBN: 978-3-319-31323-8
eBook Packages: Mathematics and Statistics (R0)