Finite mixture models have been in use for more than a hundred years (Newcomb (1886), Pearson (1894)). They are a very popular statistical modeling technique because they constitute a flexible and easily extensible model class for (1) approximating general distribution functions in a semi-parametric way and (2) accounting for unobserved heterogeneity. The number of applications has increased tremendously in recent decades, as model estimation in both frequentist and Bayesian frameworks has become feasible with readily available computing power.
The simplest finite mixture models are finite mixtures of distributions, which are used for model-based clustering. In this case the model is a convex combination of a finite number of different distributions, each of which is referred to as a component. More complicated mixtures have been developed by inserting different kinds of models for each component. An obvious extension is to estimate a generalized linear model (GLM; McCullagh and Nelder (1989)) for each component. Finite mixtures of GLMs relax the assumption that the regression coefficients and dispersion parameters are the same for all observations. In contrast to mixed effects models, where the distribution of the parameters over the observations is assumed to be known, finite mixture models do not require this distribution to be specified a priori but allow it to be approximated in a data-driven way.
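The convex combination described above can be written out as follows. This is a sketch in notation that is an assumption here, since the text above does not fix any symbols: K is the number of components, \pi_k are the component weights, f is the component density, \theta_k collects the parameters of component k, and for a mixture of GLMs g denotes the link function, \beta_k the component-specific regression coefficients and \phi_k the component-specific dispersion parameter.

```latex
% Finite mixture with K components (notation assumed for illustration)
h(y \mid x, \Theta) = \sum_{k=1}^{K} \pi_k \, f(y \mid x, \theta_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .

% Mixture of GLMs: each component has its own coefficients and dispersion
\theta_k = (\beta_k, \phi_k), \qquad
\mathrm{E}[\, y \mid x, \text{component } k \,] = g^{-1}(x^{\top} \beta_k) .
```

With K = 1 this reduces to an ordinary GLM in which all observations share the same coefficients and dispersion.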
In a regression setting, unobserved heterogeneity occurs, for example, if important covariates were omitted during data collection, so that their influence is not accounted for in the data analysis. In addition, in some areas of application the modeling aim is to find groups of observations with similar regression coefficients. In market segmentation (Wedel and Kamakura (2001)), for example, one application of finite mixtures of GLMs is to determine groups of consumers with similar price elasticities in order to develop an optimal pricing policy for a market segment.
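To make the idea of groups with different regression coefficients concrete, the following is a minimal sketch, not the authors' implementation, of an EM algorithm for a two-component mixture of Gaussian linear regressions (the simplest GLM case). The function name, the simulated log-price data, the segment elasticities of -2 and -0.5, and the residual-sign starting split are all assumptions made for illustration.

```python
import numpy as np

def em_two_linear_regressions(X, y, n_iter=100):
    """EM for a two-component mixture of Gaussian linear regressions.

    X is an (n, p) design matrix (include a column of ones for the
    intercept), y an (n,) response. Returns (prior, beta, sigma, post).
    """
    n, p = X.shape
    # Heuristic start (an assumption of this sketch): split observations
    # by the sign of the residual from one pooled least-squares fit.
    pooled = np.linalg.lstsq(X, y, rcond=None)[0]
    above = (y - X @ pooled) > 0
    post = np.column_stack([above, ~above]).astype(float)
    beta = np.zeros((2, p))
    sigma = np.ones(2)
    for _ in range(n_iter):
        # M-step: component weights and weighted least squares per component.
        prior = np.clip(post.mean(axis=0), 1e-8, None)
        for k in range(2):
            w = post[:, k]
            Xw = X * w[:, None]
            beta[k] = np.linalg.lstsq(Xw.T @ X, Xw.T @ y, rcond=None)[0]
            resid = y - X @ beta[k]
            var = (w * resid**2).sum() / max(w.sum(), 1e-8)
            sigma[k] = max(np.sqrt(var), 1e-6)
        # E-step: posterior component probabilities on the log scale.
        resid = y[:, None] - X @ beta.T
        log_post = np.log(prior) - np.log(sigma) - 0.5 * (resid / sigma) ** 2
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return prior, beta, sigma, post

# Simulated (hypothetical) data: two consumer segments whose log-demand
# responds to log-price with different elasticities.
rng = np.random.default_rng(1)
n = 400
log_price = rng.uniform(0.0, 2.0, size=n)
segment = np.repeat([0, 1], n // 2)
true_beta = np.array([[5.0, -2.0], [1.0, -0.5]])  # (intercept, elasticity)
X = np.column_stack([np.ones(n), log_price])
y = (X * true_beta[segment]).sum(axis=1) + rng.normal(0.0, 0.2, size=n)

prior, beta, sigma, post = em_two_linear_regressions(X, y)
print(np.round(beta, 2))   # estimated (intercept, elasticity) per segment
print(np.round(prior, 2))  # estimated segment sizes
```

The segment label is never passed to the fitting function; it plays the role of the omitted covariate. The starting split matters: EM only finds a local optimum, and a near-symmetric start can take very many iterations to separate the components, which is why this sketch does not start from random weights.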
References
Aitkin M (1996) A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6:251-262
Aitkin M (1999) Meta-analysis by random effect modelling in generalized linear models. Statistics in Medicine 18(17-18):2343-2351
Böhning D, Dietz E, Schlattmann P, Mendonça L, Kirchner U (1999) The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology. Journal of the Royal Statistical Society A 162(2):195-209
Boiteau G, Singh M, Singh RP, Tai GCC, Turner TR (1998) Rate of spread of PVY-n by alate Myzus persicae (Sulzer) from infected to healthy plants under laboratory conditions. Potato Research 41(4):335-344
Celeux G, Diebolt J (1988) A random imputation principle: The stochastic EM algorithm. Rapports de Recherche 901, INRIA
Dasgupta A, Raftery AE (1998) Detecting features in spatial point processes with clutter via model-based clustering. Journal of the American Statistical Association 93(441):294-302
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39:1-38
Follmann DA, Lambert D (1989) Generalizing logistic regression by non-parametric mixing. Journal of the American Statistical Association 84(405):295-300
Frühwirth-Schnatter S (2006) Finite Mixture and Markov Switching Models. Springer Series in Statistics, Springer, New York
Grün B (2006) Identification and estimation of finite mixture models. PhD thesis, Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien, Friedrich Leisch, advisor
Grün B, Leisch F (2004) Bootstrapping finite mixture models. In: Antoch J (ed) Compstat 2004 — Proceedings in Computational Statistics, Physica Verlag, Heidelberg, pp 1115-1122
Grün B, Leisch F (2006) Fitting finite mixtures of linear regression models with varying & fixed effects in R. In: Rizzi A, Vichi M (eds) Compstat 2006—Proceedings in Computational Statistics, Physica Verlag, Heidelberg, Germany, pp 853-860
Grün B, Leisch F (2007) Flexmix 2.0: Finite mixtures with concomitant variables and varying and fixed effects. Submitted for publication
Grün B, Leisch F (2007) Identifiability of finite mixtures of multinomial logit models with varying and fixed effects, unpublished manuscript
Grün B, Leisch F (2007) Testing for genuine multimodality in finite mixture models: Application to linear regression models. In: Decker R, Lenz HJ (eds) Advances in Data Analysis, Proceedings of the 30th Annual Conference of the Gesellschaft für Klassifikation, Springer-Verlag, Studies in Classification, Data Analysis, and Knowledge Organization, vol 33, pp 209-216
Hennig C (2000) Identifiability of models for clusterwise linear regression. Journal of Classification 17(2):273-296
Jedidi K, Krider RE, Weinberg CB (1998) Clustering at the movies. Marketing Letters 9(4):393-405
Krider RE, Li T, Liu Y, Weinberg CB (2005) The lead-lag puzzle of demand and distribution: A graphical method applied to movies. Marketing Science 24(4):635-645
Leisch F (2004a) Exploring the structure of mixture model components. In: Antoch J (ed) Compstat 2004 — Proceedings in Computational Statistics, Physica Verlag, Heidelberg, pp 1405-1412
Leisch F (2004b) FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software 11(8), URL http://www.jstatsoft.org/v11/i08/
Lindsay BG (1989) Moment matrices: Applications in mixtures. The Annals of Statistics 17(2):722-740
McCullagh P, Nelder JA (1989) Generalized Linear Models (2nd edition). Chapman and Hall
McLachlan GJ, Krishnan T (1997) The EM Algorithm and Extensions, 1st edn. John Wiley and Sons
Naik PA, Shi P, Tsai CL (2007) Extending the Akaike information criterion to mixture regression models. Journal of the American Statistical Association 102(477):244-254
Newcomb S (1886) A generalized theory of the combination of observations so as to obtain the best result. American Journal of Mathematics 8:343-366
Pearson K (1894) Contributions to the mathematical theory of evolution. Philosophical Transactions of the Royal Society A 185:71-110
R Development Core Team (2007) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org
Redner RA, Walker HF (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26(2):195-239
Titterington DM, Smith AFM, Makov UE (1985) Statistical Analysis of Finite Mixture Distributions. Wiley
Wang P, Puterman ML (1998) Mixed logistic regression models. Journal of Agricultural, Biological, and Environmental Statistics 3(2):175-200
Wang P, Puterman ML, Cockburn IM, Le ND (1996) Mixed Poisson regression models with covariate dependent rates. Biometrics 52:381-400
Wedel M, Kamakura WA (2001) Market Segmentation — Conceptual and Methodological Foundations (2nd edition). Kluwer Academic Publishers
© 2008 Physica-Verlag Heidelberg
Grün, B., Leisch, F. (2008). Finite Mixtures of Generalized Linear Regression Models. In: Recent Advances in Linear Models and Related Areas. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-2064-5_11
DOI: https://doi.org/10.1007/978-3-7908-2064-5_11
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-2063-8
Online ISBN: 978-3-7908-2064-5
eBook Packages: Mathematics and Statistics (R0)