A mixture likelihood approach for generalized linear models

Journal of Classification

Abstract

A mixture model approach is developed that simultaneously estimates the posterior membership probabilities of observations to a number of unobservable groups or latent classes, and the parameters of a generalized linear model which relates the observations, distributed according to some member of the exponential family, to a set of specified covariates within each class. We demonstrate how this approach handles many of the existing latent class regression procedures as special cases, as well as a host of other parametric specifications in the exponential family heretofore not mentioned in the latent class literature. As such, we generalize the McCullagh and Nelder approach to a latent class framework. The parameters are estimated using maximum likelihood, and an EM algorithm for estimation is provided. A Monte Carlo study of the performance of the algorithm for several distributions is provided, and the model is illustrated in two empirical applications.
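To make the estimation idea concrete, the following is a minimal sketch of an EM loop for the simplest special case: a normal response with identity link, so that each class follows its own linear regression. It is not the authors' implementation. The function name `em_mixture_regression`, its arguments, the residual-rank starting rule, and the two numerical guards are all assumptions introduced for illustration. For other exponential-family members, the weighted least-squares solve in the M-step would presumably be replaced by a posterior-weighted GLM fit.

```python
import numpy as np


def em_mixture_regression(X, y, n_classes=2, n_iter=100):
    """EM for a finite mixture of Gaussian linear regressions (identity link).

    Returns mixing proportions, per-class coefficients, per-class variances,
    posterior membership probabilities, and the log-likelihood trace.
    """
    n, p = X.shape

    # Hypothetical starting rule (not from the paper): fit one pooled
    # regression, split observations by residual rank, then soften.
    pooled = np.linalg.lstsq(X, y, rcond=None)[0]
    ranks = np.argsort(np.argsort(y - X @ pooled))
    labels = ranks * n_classes // n
    post = 0.9 * np.eye(n_classes)[labels] + 0.1 / n_classes

    beta = np.empty((n_classes, p))
    sigma2 = np.empty(n_classes)
    trace = []
    for _ in range(n_iter):
        # M-step: class sizes, then one weighted least-squares fit per class.
        pi = np.clip(post.mean(axis=0), 1e-12, None)  # guard against log(0)
        for s in range(n_classes):
            w = post[:, s]
            Xw = X * w[:, None]
            beta[s] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
            resid = y - X @ beta[s]
            sigma2[s] = max((w * resid**2).sum() / w.sum(), 1e-8)  # variance floor

        # E-step: posterior membership probabilities by Bayes' rule,
        # computed on the log scale for numerical stability.
        resid = y[:, None] - X @ beta.T
        log_joint = np.log(pi) - 0.5 * (np.log(2 * np.pi * sigma2) + resid**2 / sigma2)
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        post = np.exp(log_joint - log_norm)
        trace.append(float(log_norm.sum()))
    return pi, beta, sigma2, post, np.array(trace)
```

Here `X` is an n-by-p design matrix (including an intercept column if one is wanted) and `y` is the response vector. The returned trace should be non-decreasing, which is a convenient sanity check on any EM implementation. In practice several starting values are usually tried, since the mixture likelihood can have multiple local maxima.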

References

  • AITKIN, M., and RUBIN, D. B. (1985), “Estimation and Hypothesis Testing in Finite Mixture Distributions,”Journal of the Royal Statistical Society, Series B, 47, 67–75.

  • AITKIN, M., ANDERSON, D., and HINDE, J. (1981), “Statistical Modelling of Data on Teaching Styles (with discussion),”Journal of the Royal Statistical Society, A 144, 419–461.

  • AKAIKE, H. (1974), “A New Look at Statistical Model Identification,”IEEE Transactions on Automatic Control, AC-19, 716–723.

  • ATKINSON, K. E. (1989),An Introduction to Numerical Analysis, New York: Wiley.

  • BANFIELD, C. F., and BASSIL, L. C. (1977), “A Transfer Algorithm for Non-Hierarchical Classification,”Applied Statistics, 26, 206–210.

  • BASFORD, K. E., and MCLACHLAN, G. J. (1985), “The Mixture Method of Clustering Applied to Three-Way Data,”Journal of Classification, 2, 109–125.

  • BAWA, K., and SHOEMAKER, R. W. (1987), “The Coupon-Prone Consumer: Some Findings Based on Purchase Behavior Across Product Classes,”Journal of Marketing 51, 99–110.

  • BAWA, K., and SHOEMAKER, R. W. (1989), “Analyzing Incremental Sales from a Direct Mail Coupon Promotion,”Journal of Marketing, 53, 66–78.

  • BERK, R. H. (1972), “Consistency and Asymptotic Normality of MLE's for Exponential Models,”Annals of Mathematical Statistics, 43, 193–204.

  • BHATTACHARYA, C. G. (1967), “A Simple Method for Resolution of a Distribution into its Gaussian Components,”Biometrics, 23, 115–135.

  • BLATTBERG, R. C., BUESING, T., PEACOCK, P. and SEN, S. K. (1978) “Identifying the Deal Prone Segment,”Journal of Marketing Research, 15, 369–377.

  • BOYLES, R. A. (1983), “On Convergence of the EM Algorithm,”Journal of the Royal Statistical Society, Series B, 45, 47–50.

  • BOZDOGAN, H. (1987), “Model Selection and Akaike's Information Criterion (AIC): The General Theory and its Analytical Extensions,”Psychometrika, 52, 345–370.

  • BOZDOGAN, H., and SCLOVE, S. L. (1984), “Multi-sample Cluster Analysis Using Akaike's Information Criterion,”Annals of the Institute of Statistical Mathematics, 36, 163–180.

  • CASSIE, R. M. (1954), “Some Uses of Probability Paper for the Graphical Analysis of Polymodal Frequency Distributions,”Australian Journal of Marine and Freshwater Research, 5, 513–522.

  • CHARLIER, C. V. L., and WICKSELL, S. D. (1924), “On the Dissection of Frequency Functions,”Arkiv för Matematik, Astronomi och Fysik, BD 18, 6.

  • COHEN, J. (1988),Statistical Power Analysis for the Behavioral Sciences, Hillsdale: Lawrence Erlbaum.

  • COTTON, B. C., and BABB, E. M. (1978), “Consumer Response to Promotional Deals,”Journal of Marketing, 42, 109–113.

  • CRAMÉR, H. (1946),Mathematical Methods of Statistics, Princeton: Princeton University Press.

  • DAVIES, R. B. (1977), “Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternative,”Biometrika, 64, 247–254.

  • DAY, N. E. (1969), “Estimating the Components of a Mixture of two Normal Distributions,”Biometrika, 56, 463–474.

  • DEMPSTER, A. P. (1971), “An Overview of Multivariate Data Analysis,”Journal of Multivariate Analysis, 1, 316–346.

  • DEMPSTER, A. P., LAIRD, N. M., and RUBIN, D. B. (1977), “Maximum Likelihood from Incomplete Data via the EM Algorithm,”Journal of the Royal Statistical Society, Series B, 39, 1–38.

  • DESARBO, W. S., OLIVER, R. L., and DE SOETE, G. (1986), “A Probabilistic Multi-Dimensional Scaling Vector Model,”Applied Psychological Measurement, 10, 79–98.

  • DESARBO, W. S., and CRON, W. L. (1988), “A Maximum Likelihood Methodology for Clusterwise Linear Regression,”Journal of Classification, 5, 249–282.

  • DESARBO, W. S., OLIVER, R. L., and RANGASWAMY, A. (1989), “A Simulated Annealing Methodology for Clusterwise Regression,”Psychometrika, 54, 707–736.

  • DESARBO, W. S., WEDEL M., VRIENS, M. and RAMASWAMY, V. (1992) “Latent Class Metric Conjoint Analysis,”Marketing Letters, 3, 273–288.

  • DESARBO, W. S., RAMASWAMY V., REIBSTEIN, D. J., and ROBINSON W. T. (1993), “A Latent Pooling Methodology for Regression Analysis with Limited Time Series of Cross Sections: a PIMS Data Application,”Marketing Science, 12, 103–124.

  • DE SOETE G., and DESARBO W. S. (1991), “A Latent Class Probit Model for Analyzing Pick Any/N Data,”Journal of Classification, 8, 45–63.

  • DODSON, J. A., TYBOUT, A. M., and STERNTHAL, B. (1978), “Impact of Deals and Deal Retraction on Brand Switching,”Journal of Marketing Research, 15, 72–81.

  • EVERITT, B. S. (1984), “Maximum Likelihood Estimation of the Parameters in a Mixture of two Univariate Normal Distributions: A Comparison of Different Algorithms,”Statistician, 33, 205–215.

  • EVERITT, B. S., and HAND D. J. (1981),Finite Mixture Distributions, London: Chapman and Hall.

  • FISHER, R. A. (1935), “The Case of Zero Survivors,” (Appendix to Bliss, C.I. (1935)),Annals of Applied Biology, 22, 164–165.

  • FOWLKES, E. B. (1979), “Some Methods for Studying Mixtures of two Normal (Lognormal) Distributions,”Journal of the American Statistical Association, 74, 561–575.

  • FRYER I. G., and ROBERTSON C. A. (1972), “A Comparison of Some Methods for Estimating Mixed Normal Distributions,”Biometrika, 59, 639–648.

  • GHOSH, J. M., and SEN, P. K. (1985), “On the Asymptotic Performance of the Loglikelihood Ratio Statistic for the Mixture Model and Related Results,”Proceedings of the Berkeley Conference, Neyman, and Kiefer, II, Monterey: Wadsworth, 789–806.

  • GOODMAN, L. A. (1974), “Exploratory Latent Structure Analysis Using Both Identifiable and Unidentifiable Models,”Biometrika, 61, 215–231.

  • GREEN, P. E., and RAO V. R. (1971), “Conjoint Measurement for Quantifying Judgmental Data,”Journal of Marketing Research, 8, 355–363.

  • GREEN P. J. (1984), “Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and Some Robust and Resistant Alternatives,”Journal of the Royal Statistical Society, Series B 46, 149–192.

  • HABERMAN, S. J. (1977), “Maximum Likelihood Estimates in Exponential Response Models,”Annals of Statistics, 5, 815–841.

  • HARDING, I. P. (1948), “The Use of Probability Paper for the Graphical Analysis of Polymodal Frequency Distributions,”Journal of the Marine Biological Association (UK), 28, 141–153.

  • HASSELBLAD, V. (1966), “Estimation of Parameters for a Mixture of Normal Distributions,”Technometrics, 8, 431–444.

  • HASSELBLAD, V. (1969) “Estimation of Finite Mixtures of Distributions from the Exponential Family,”Journal of the American Statistical Association, 64, 1459–1471.

  • HOPE, A. C. A. (1968), “A Simplified Monte Carlo Significance Test Procedure,”Journal of the Royal Statistical Society, Series B, 30, 582–598.

  • HOSMER, D. W. (1974), “Maximum Likelihood Estimates of the Parameters of a Mixture of two Regression Lines,”Communications in Statistics, 3, 995–1006.

  • JONES, P. N., and MCLACHLAN, G. J. (1992), “Fitting Finite Mixture Models in a Regression Context,”Australian Journal of Statistics, 43, 233–240.

  • JONES, P. N., and MCLACHLAN, G. J. (1992), “Improving the Convergence Rate of the EM Algorithm for a Mixture Model Fitted to Grouped and Truncated Data,”Journal of Statistical Computation and Simulation, 43, 31–44.

  • JORGENSEN, B. (1984), “The Delta Algorithm and GLIM,”International Statistical Review, 52, 283–300.

  • KAMAKURA, W. A., and RUSSELL G. J. (1989), “A Probabilistic Choice Model for Market Segmentation and Elasticity Structure,”Journal of Marketing Research, 26, 379–390.

  • KAMAKURA, W. A. (1991), “Estimating Flexible Distributions of Ideal-points with External Analysis of Preference,”Psychometrika, 56, 419–448.

  • LANGEHEINE, R., and ROST, J. (1988),Latent Trait and Latent Class Models, New York: Plenum.

  • LAZARSFELD P. F., and HENRY N. W. (1968),Latent Structure Analysis, Boston: Houghton-Mifflin.

  • LI, L. A., and SEDRANSK N. (1988), “Mixtures of Distributions: A Topological Approach,”Annals of Statistics, 16, 1623–1634.

  • LICHTENSTEIN, D. R., NETEMEYER, R. G., and BURTON, S. (1990), “Distinguishing Coupon Proneness from Value Consciousness: An Acquisition-Transaction Utility Theory Perspective,”Journal of Marketing, 54, 54–67.

  • LOUIS, T. A. (1982), “Finding the Observed Information Matrix When Using the EM Algorithm,”Journal of the Royal Statistical Society, Series B, 44, 226–233.

  • LWIN T., and MARTIN P. J. (1989), “Probits of Mixtures,”Biometrics, 45, 721–732.

  • MCCULLAGH, P., and NELDER, J. A. (1989),Generalized Linear Models, New York: Chapman and Hall.

  • MCHUGH, R. B. (1956), “Efficient Estimation and Local Identification in Latent Class Analysis,”Psychometrika, 21, 331–347.

  • MCHUGH, R. B. (1958), “Note on Efficient Estimation and Local Identification in Latent Class Analysis,”Psychometrika, 23, 273–274.

  • MCLACHLAN, G. J. (1982), “The Classification and Mixture Maximum Likelihood Approaches to Cluster Analysis,” inHandbook of Statistics (vol 2) Eds., P. R. Krishnaiah and L. N. Kanal, Amsterdam: North-Holland, 199–208.

  • MCLACHLAN, G. J. (1987), “On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture,”Applied Statistics, 36, 318–324.

  • MCLACHLAN, G. J., and BASFORD, K. E. (1988),Mixture Models: Inference and Application to Clustering, New York: Marcel Dekker.

  • MEILIJSON, J. (1989), “A Fast Improvement of the EM Algorithm on Its Own Terms,”Journal of the Royal Statistical Society, B 51, 127–138.

  • MOOIJAART, A., and VAN DER HEIJDEN, P. G. M. (1992), “The EM Algorithm for Latent Class Analysis with Constraints,”Psychometrika, 57, 261–271.

  • NARASIMHAN, C. (1984), “A Price Discrimination Theory of Coupons,”Marketing Science, 3, 125–145.

  • NELDER, J. A., and WEDDERBURN, R. W. M. (1972), “Generalized Linear Models,”Journal of the Royal Statistical Society, Series A, 135, 370–384.

  • NEWCOMB, S. (1886), “A Generalized Theory of the Combination of Observations So As To Obtain the Best Result,”American Journal of Mathematics, 8, 343–366.

  • NIELSEN, A. C. (1965), “The Impact of Retail Coupons,”Journal of Marketing (October), 11–15.

  • OLIVER, R. L. (1980), “A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions,”Journal of Marketing Research, 17, 460–469.

  • OLIVER, R. L., and DESARBO, W. S. (1988), “Response Determinants in Satisfaction Judgments,”Journal of Consumer Research, 14, 495–507.

  • PEARSON, K. (1894), “Contributions to the Mathematical Theory of Evolution,”Philosophical Transactions, A, 185, 71–110.

  • PETERS, B. C., and WALKER, H. F. (1978), “An Iterative Procedure for Obtaining Maximum Likelihood Estimates of the Parameters of a Mixture of Normal Distributions,”SIAM Journal on Applied Mathematics, 35, 362–378.

  • QUANDT, R. E., and RAMSEY, J. B. (1978), “Estimating Mixtures of Normal Distributions and Switching Regressions,”Journal of the American Statistical Association, 73, 730–738.

  • QUANDT, R. E. (1972), “A New Approach to Estimating Switching Regressions,”Journal of the American Statistical Association, 67, 306–310.

  • REDNER, R. A., and WALKER, H. F. (1984), “Mixture Densities, Maximum Likelihood and the EM Algorithm,”SIAM Review, 26, 195–239.

  • RUBINSTEIN, R. Y. (1981),Simulation and the Monte Carlo Method, New York: Wiley.

  • SCHWARZ, G. (1978), “Estimating the Dimension of a Model,”Annals of Statistics, 6, 461–464.

  • SCLOVE, S. L. (1987), “Applications of Model-Selection Criteria to some Problems in Multivariate Analysis,”Psychometrika, 52, 333–343.

  • SHIMP, T. A., and KAVAS, A. (1984), “The Theory of Reasoned Action Applied to Coupon Usage,”Journal of Consumer Research, 11, 795–809.

  • STIGLER, S. M. (1986),The History of Statistics, Cambridge, Mass: Harvard University Press.

  • SYMONS, M. J. (1981), “Clustering Criteria and Multivariate Normal Mixtures,”Biometrics, 37, 35–43.

  • TEEL, J. E., WILLIAMS, R. H., and BEARDEN, W. O. (1980), “Correlates of Consumer Susceptibility to Coupons in New Grocery Product Introductions,”Journal of Advertising, 3, 31–35.

  • TEICHER, H. (1961), “Identifiability of Mixtures,”Annals of Mathematical Statistics, 31, 55–73.

  • TITTERINGTON, D. M. (1990), “Some Recent Research in the Analysis of Mixture Distributions,”Statistics, 4, 619–641.

  • TITTERINGTON, D. M., SMITH, A. F. M., and MAKOV, U. E. (1985),Statistical Analysis of Finite Mixture Distributions, New York: Wiley.

  • THOMAS, E. A. C. (1966), “Mathematical Models for the Clustered Firing of Single Cortical Neurons,”British Journal of Mathematical and Statistical Psychology, 19, 151–162.

  • VILCASSIM, N. J., and WITTINK, D. R. (1987), “Supporting a Higher Shelf Price Through Coupon Distributions,”Journal of Consumer Marketing, 4, 29–39.

  • WARD, R. W., and DAVIS, J. E. (1978), “A Pooled Cross-Section Time Series Model of Coupon Promotions,”American Journal of Agricultural Economics, (August), 193–401.

  • WEDEL, M., and DESARBO, W. S. (1994), “A Review of Recent Developments in Latent Class Regression Models,” inAdvanced Methods of Marketing Research, Ed., R. Bagozzi, 352–388.

  • WEDEL, M., DESARBO, W. S., BULT J. R., and RAMASWAMY, V. (1993), “A Latent Class Poisson Regression Model for Heterogeneous Count Data With an Application to Direct Mail,”Journal of Applied Econometrics, 8, 397–411.

  • WEDEL, M., and DESARBO, W. S. (1993), “A Latent Class Binomial Logit Methodology for the Analysis of Paired Comparison Data: An Application Reinvestigating the Determinants of Perceived Risk,”Decision Sciences, 24, 1157–1170.

  • WOLFE, J. H. (1970), “Pattern Clustering by Multivariate Mixture Analysis,”Multivariate Behavioral Research, 5, 329–350.

  • WU, C. F. J. (1983), “On the Convergence Properties of the EM Algorithm,”Annals of Statistics, 11, 95–103.

Author information

Correspondence to Michel Wedel.

About this article

Cite this article

Wedel, M., DeSarbo, W.S. A mixture likelihood approach for generalized linear models. Journal of Classification 12, 21–55 (1995). https://doi.org/10.1007/BF01202266

  • DOI: https://doi.org/10.1007/BF01202266
