Journal of Classification, Volume 29, Issue 3, pp 363–401

Local Statistical Modeling via a Cluster-Weighted Approach with Elliptical Distributions

  • Salvatore Ingrassia
  • Simona C. Minotti
  • Giorgio Vittadini


Cluster-weighted modeling (CWM) is a mixture approach to modeling the joint probability of data coming from a heterogeneous population. Under Gaussian assumptions, we investigate the statistical properties of CWM from both theoretical and numerical points of view; in particular, we show that Gaussian CWM includes mixtures of distributions and mixtures of regressions as special cases. Further, we introduce CWM based on Student-t distributions, which provides a more robust fit for groups of observations with longer-than-normal tails or noisy data. Theoretical results are illustrated with empirical studies on both simulated and real data. Some generalizations of such models are also outlined.
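
As a minimal sketch of the model the abstract describes, the joint density in cluster-weighted modeling can be written as a weighted sum of local models. The notation here is an assumption chosen for illustration and does not come from the abstract: $G$ groups, weights $\pi_g$, regression coefficients $\mathbf{b}_g$ and $b_{g0}$, variances $\sigma_g^2$, local parameters $\boldsymbol{\mu}_g$ and $\boldsymbol{\Sigma}_g$. A scalar response $y$ with linear local regressions on a $d$-dimensional input $\mathbf{x}$ is also assumed.

```latex
% General decomposition: each group g contributes a local conditional
% density for y given x, weighted by a local density for x.
\[
  p(\mathbf{x}, y) \;=\; \sum_{g=1}^{G} \pi_g \,
      p(y \mid \mathbf{x}, \Omega_g)\, p(\mathbf{x} \mid \Omega_g),
  \qquad \pi_g > 0, \quad \sum_{g=1}^{G} \pi_g = 1 .
\]

% Linear Gaussian case (assumed form): Gaussian local regression
% and Gaussian local density for x.
\[
  p(\mathbf{x}, y) \;=\; \sum_{g=1}^{G} \pi_g \,
      \phi\!\left(y;\; \mathbf{b}_g^{\top}\mathbf{x} + b_{g0},\; \sigma_g^2\right)
      \phi_d\!\left(\mathbf{x};\; \boldsymbol{\mu}_g,\; \boldsymbol{\Sigma}_g\right).
\]

% Illustration of one special case: if all groups share the same
% marginal, p(x | Omega_g) = p(x) for every g, the common factor
% cancels when dividing by p(x) = sum_g pi_g p(x) = p(x).
\[
  p(y \mid \mathbf{x}) \;=\; \frac{p(\mathbf{x}, y)}{p(\mathbf{x})}
  \;=\; \sum_{g=1}^{G} \pi_g \,
      \phi\!\left(y;\; \mathbf{b}_g^{\top}\mathbf{x} + b_{g0},\; \sigma_g^2\right).
\]
```

Under the stated assumption of a common marginal across groups, the conditional density reduces to a finite mixture of linear regressions with constant weights. This is one way to read the special-case claim in the abstract; the exact conditions used in the paper may be stated differently.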


Cluster-weighted modeling; Mixture models; Model-based clustering





Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Salvatore Ingrassia (1)
  • Simona C. Minotti (2)
  • Giorgio Vittadini (3)

  1. Dipartimento di Economia e Impresa, Università di Catania, Catania, Italy
  2. Dipartimento di Statistica, Università di Milano-Bicocca, Milano, Italy
  3. Dipartimento di Metodi Quantitativi per l’Economia e le Scienze Aziendali, Università di Milano-Bicocca, Milano, Italy
