How collinearity affects mixture regression results

Marketing Letters

Abstract

Mixture regression models are an important method for uncovering unobserved heterogeneity. A fundamental challenge in their application relates to the identification of the appropriate number of segments to retain from the data. Prior research has provided several simulation studies that compare the performance of different segment retention criteria. Although collinearity between the predictor variables is a common phenomenon in regression models, its effect on the performance of these criteria has not been analyzed thus far. We address this gap in research by examining the performance of segment retention criteria in mixture regression models characterized by systematically increased collinearity levels. The results have fundamental implications and provide guidance for using mixture regression models in empirical (marketing) studies.
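To make the study design concrete, the following is a minimal sketch (not the authors' simulation code) of how one might examine segment retention criteria under increasing collinearity. It assumes normally distributed predictors with a common pairwise correlation rho, two segments with hypothetical coefficient vectors, a plain EM estimator, and textbook forms of AIC, AIC3, BIC, and CAIC. The function names, sample size, and parameter values are illustrative assumptions; the paper's own criteria are specified in Table A1 of its Online Supplement.

```python
import numpy as np


def simulate(n, rho, betas, weights, sigma, rng):
    """Simulate mixture regression data with equicorrelated predictors (assumed design)."""
    k, d = betas.shape                       # d counts the intercept
    p = d - 1
    corr = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)  # 1 on diagonal, rho elsewhere
    predictors = rng.multivariate_normal(np.zeros(p), corr, size=n)
    X = np.column_stack([np.ones(n), predictors])
    z = rng.choice(k, size=n, p=weights)     # true segment membership
    y = np.sum(X * betas[z], axis=1) + rng.normal(0.0, sigma, size=n)
    return X, y, z


def em_mixture_regression(X, y, k, rng, n_iter=300, tol=1e-8):
    """Fit a k-segment Gaussian mixture regression by EM and return the log-likelihood."""
    n, d = X.shape
    resp = rng.random((n, k)) + 1e-6         # random soft assignment as starting point
    resp /= resp.sum(axis=1, keepdims=True)
    loglik_old = -np.inf
    for _ in range(n_iter):
        # M-step: mixing proportions, weighted least squares, segment variances
        pi = np.clip(resp.mean(axis=0), 1e-12, None)
        beta = np.empty((k, d))
        var = np.empty(k)
        for s in range(k):
            w = resp[:, s]
            XtW = X.T * w
            # tiny ridge term keeps the solve stable if a segment nearly empties
            beta[s] = np.linalg.solve(XtW @ X + 1e-8 * np.eye(d), XtW @ y)
            resid = y - X @ beta[s]
            var[s] = max(np.sum(w * resid**2) / (np.sum(w) + 1e-12), 1e-6)
        # E-step: posterior segment probabilities via a stable log-sum-exp
        resid = y[:, None] - X @ beta.T
        log_dens = np.log(pi) - 0.5 * np.log(2 * np.pi * var) - 0.5 * resid**2 / var
        top = log_dens.max(axis=1, keepdims=True)
        log_norm = top[:, 0] + np.log(np.exp(log_dens - top).sum(axis=1))
        resp = np.exp(log_dens - log_norm[:, None])
        loglik = log_norm.sum()
        if abs(loglik - loglik_old) < tol:
            break
        loglik_old = loglik
    return loglik


def retention_criteria(loglik, k, d, n):
    """Textbook information criteria; smaller values indicate the preferred model."""
    n_par = k * d + k + (k - 1)              # coefficients, variances, free proportions
    dev = -2.0 * loglik
    return {
        "AIC": dev + 2 * n_par,
        "AIC3": dev + 3 * n_par,
        "BIC": dev + n_par * np.log(n),
        "CAIC": dev + n_par * (np.log(n) + 1),
    }


# Illustrative run: two hypothetical segments, low versus high collinearity.
rng = np.random.default_rng(1)
betas = np.array([[1.0, 2.0, -1.0], [-1.0, -2.0, 3.0]])
for rho in (0.0, 0.9):
    X, y, _ = simulate(400, rho, betas, [0.5, 0.5], 0.5, rng)
    for k in (1, 2, 3):
        # keep the best of several random starts to reduce the risk of local optima
        ll = max(em_mixture_regression(X, y, k, rng) for _ in range(5))
        crit = retention_criteria(ll, k, X.shape[1], len(y))
        print(f"rho={rho:.1f} k={k} " + " ".join(f"{c}={v:.1f}" for c, v in crit.items()))
```

Within each block of printed lines, the segment count with the smallest value of a criterion is the one that criterion would retain. Comparing the blocks for the two rho values gives only a rough impression from a single data set; a systematic study would repeat this over many replications and factor levels.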

Notes

  1. For the mathematical specification of the criteria, see Table A1 in the Online Supplement of this paper.

  2. These factor levels combine Andrews and Currim’s (2003b) two factors “number of individuals” (100 or 300) and “number of observations per individual” (5 or 10).

  3. Prior studies used unstandardized mean separations with a random distribution of coefficients. Because detailed information on the specified variances is missing, a full replication is impossible.

  4. The balanced factor level involves equally sized segments, while the unbalanced factor levels characterize the existence of one segment that is considerably larger than the other segments. Specifically, the unbalanced segments exhibit the following relative sizes: 65 %/35 % (unbalanced) and 80 %/20 % (very unbalanced) in a situation with two segments, 50 %/25 %/25 % (unbalanced) and 66.66 %/16.66 %/16.66 % (very unbalanced) in the case of three segments, and 40 %/20 %/20 %/20 % (unbalanced) and 55 %/15 %/15 %/15 % (very unbalanced) in the case of four segments.

  5. For the correlation matrices, see Table A2 in the Online Supplement of this paper.

  6. For an illustration of the difference between consistent and inconsistent correlation matrices between segments, see Table A3 in the Online Supplement of this paper.

  7. Note that the numbers do not always add to 100 % because of rounding inaccuracies. The more precise numbers of 82.38, 11.21, and 6.41 % add to 100 %.

  8. For detailed results, see Table A4 in the Online Supplement of this paper.

  9. We thank an anonymous reviewer for this suggestion.

  10. For the complete table with all criteria’s results, see Table A5 in the Online Supplement of this paper.

  11. For the ANCOVA results, see Table A6 in the Online Supplement of this paper.

  12. For example, Kim et al. (2013) extend the Bayesian latent structure regression model of Kim et al. (2012) by implementing model constraints, and they illustrate these in comparative analyses that contrast the performance of the proposed methodology with standard latent class finite mixture regression and with traditional Bayesian finite mixture regression. The authors show that the new Bayesian regression model is more robust against collinearity problems than both the finite mixture regression models and traditional Bayesian finite mixture models in terms of the RMSE and ARI. In addition, the new Bayesian regression model can simultaneously select the number of segments and the variables to retain per segment.

  13. We thank an anonymous reviewer for these comments.

References

  • Andrews, R. L., & Currim, I. S. (2003a). A comparison of segment retention criteria for finite mixture logit models. Journal of Marketing Research, 40(2), 235–243.

  • Andrews, R. L., & Currim, I. S. (2003b). Retention of latent segments in regression-based marketing models. International Journal of Research in Marketing, 20(4), 315–321.

  • Andrews, R. L., Ainslie, A., & Currim, I. S. (2002a). An empirical comparison of logit choice models with discrete versus continuous representations of heterogeneity. Journal of Marketing Research, 39(4), 479–487.

  • Andrews, R. L., Ansari, A., & Currim, I. S. (2002b). Hierarchical Bayes versus finite mixture conjoint analysis models: a comparison of fit, prediction and partworth recovery. Journal of Marketing Research, 39(1), 87.

  • Andrews, R. L., Currim, I. S., Leeflang, P., & Lim, J. (2007). Estimating the SCAN*PRO model of store sales: HB, FM or just OLS? International Journal of Research in Marketing, 25(1), 22–33.

  • Andrews, R. L., Brusco, M. J., Currim, I. S., & Davis, B. (2010). An empirical comparison of methods for clustering problems: are there benefits from having a statistical model? Review of Marketing Science, 8(1), 1–32.

  • Boone, D. S., & Roehm, M. (2002). Evaluating the appropriateness of market segmentation solutions using artificial neural networks and the membership clustering criterion. Marketing Letters, 13(4), 317–333.

  • Bozdogan, H. (1994). Mixture-model cluster analysis using model selection criteria in a new information measure of complexity. Paper presented at the Proceedings of the First US/Japan Conference on Frontiers of Statistical Modelling: An Information Approach.

  • Claeskens, G., & Hart, J. D. (2009). Goodness-of-fit tests in mixed models. Test, 18(2), 213–239.

  • R Core Team (2014). R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

  • Cortiñas, M., Chocarro, R., & Villanueva, M. L. (2010). Understanding multi-channel banking customers. Journal of Business Research, 63(11), 1215–1221.

  • DeSarbo, W. S., & Cron, W. L. (1988). A maximum likelihood methodology for clusterwise linear regression. Journal of Classification, 5(2), 249–282.

  • DeSarbo, W. S., Kamakura, W., & Wedel, M. (2004). Applications of multivariate latent variable models in marketing. In Y. Wind & P. E. Green (Eds.), Market Research and Modeling: Progress and Prospects. A Tribute to Paul E. Green (pp. 43–68). Boston: Kluwer Academic Publishers.

  • DeSarbo, W. S., Benedetto, C. A., & Song, M. (2007). A heterogeneous resource based view for exploring relationships between firm performance and capabilities. Journal of Modelling in Management, 2(2), 103–130.

  • Dubois, B., Czellar, S., & Laurent, G. (2005). Consumer segmentation based on attitudes toward luxury: empirical evidence from twenty countries. Marketing Letters, 16(2), 115–128.

  • Grewal, R., Cote, J. A., & Baumgartner, H. (2004). Multicollinearity and measurement error in structural equation models: implications for theory testing. Marketing Science, 23(4), 519–529.

  • Grewal, R., Chakravarty, A., Ding, M., & Liechty, J. (2008). Counting chickens before the eggs hatch: associating new product development portfolios with shareholder expectations in the pharmaceutical sector. International Journal of Research in Marketing, 25(3), 261–272.

  • Grewal, R., Chandrashekaran, M., & Citrin, A. V. (2010). Customer satisfaction heterogeneity and shareholder value. Journal of Marketing Research, 47(4), 612–626.

  • Grewal, R., Chandrashekaran, M., Johnson, J. L., & Mallapragada, G. (2013). Environments, unobserved heterogeneity, and the effect of market orientation on outcomes for high-tech firms. Journal of the Academy of Marketing Science, 41(2), 206–233.

  • Grün, B., & Leisch, F. (2008). FlexMix version 2: finite mixtures with concomitant variables and varying and constant parameters. Journal of Statistical Software, 28(4), 1–35.

  • Hahn, C., Johnson, M. D., Herrmann, A., & Huber, F. (2002). Capturing customer heterogeneity using a finite mixture PLS approach. Schmalenbach Business Review, 54(3), 243–269.

  • Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate data analysis (7th ed.). Englewood Cliffs: Prentice Hall.

  • Hawkins, D. S., Allen, D. M., & Stromberg, A. J. (2001). Determining the number of components in mixtures of linear models. Computational Statistics & Data Analysis, 38(1), 15–48.

  • Hennig, C. (2000). Identifiability of models for clusterwise linear regression. Journal of Classification, 17(2), 273–296.

  • Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.

  • Hutchinson, J. W., Kamakura, W. A., & Lynch, J. G. (2000). Unobserved heterogeneity as an alternative explanation for “reversal” effects in behavioral research. Journal of Consumer Research, 27(3), 324–344.

  • Jagpal, S., Jedidi, K., & Jamil, M. (2007). A multibrand concept-testing methodology for new product strategy. Journal of Product Innovation Management, 24(1), 34–51.

  • Jedidi, K., Jagpal, H. S., & DeSarbo, W. S. (1997). Finite-mixture structural equation models for response-based segmentation and unobserved heterogeneity. Marketing Science, 16(1), 39–59.

  • Kim, B.-D., Fong, D. K. H., & DeSarbo, W. S. (2012). Model-based segmentation featuring simultaneous segment-level variable selection. Journal of Marketing Research, 49(5), 725–736.

  • Kim, S., Blanchard, S. J., DeSarbo, W. S., & Fong, D. K. H. (2013). Implementing managerial constraints in model-based segmentation: extensions of Kim, Fong, and DeSarbo (2012) with an application to heterogeneous perceptions of service quality. Journal of Marketing Research, 50(5), 664–673.

  • Kotler, P., & Keller, K. L. (2012). Marketing management (14th ed.). Pearson: Prentice-Hall.

  • Mantrala, M. K., Naik, P. A., Sridhar, S., & Thorson, E. (2007). Uphill or downhill? Locating the firm on a profit function. Journal of Marketing, 71(2), 26–44.

  • Marcoulides, G. A., Chin, W. W., & Saunders, C. (2012). When imprecise statistical statements become problematic: a response to Goodhue, Lewis, and Thompson. MIS Quarterly, 36(3), 717–728.

  • Mason, C. H., & Perreault, W. D. (1991). Collinearity, power, and interpretation of multiple regression analysis. Journal of Marketing Research, 28(3), 268–280.

  • McLachlan, G. J., & Peel, D. (2000). Finite mixture models. New York, NY: Wiley.

  • Ofir, C., & Khuri, A. (1986). Multicollinearity in marketing models: diagnostics and remedial measures. International Journal of Research in Marketing, 3(3), 181–205.

  • Sarstedt, M. (2008). Market segmentation with mixture regression models: understanding measures that guide model selection. Journal of Targeting, Measurement and Analysis for Marketing, 16(3), 228–246.

  • Sarstedt, M., & Ringle, C. M. (2010). Treating unobserved heterogeneity in PLS path modelling: a comparison of FIMIX-PLS with different data analysis strategies. Journal of Applied Statistics, 37(8), 1299–1318.

  • Wedel, M., & Kamakura, W. A. (2000). Market segmentation: conceptual and methodological foundations (2nd ed.). Boston: Kluwer.

  • Wedel, M., Kamakura, W., Arora, N., Bemmaor, A., Chiang, J., Elrod, T., et al. (1999). Discrete and continuous representations of unobserved heterogeneity in choice modeling. Marketing Letters, 10(3), 219–232.

Acknowledgments

The authors would like to thank Jörg Henseler (Radboud University Nijmegen) and Edward E. Rigdon (Georgia State University) for their comments on earlier versions of the paper.

Author information

Corresponding author

Correspondence to Franziska Völckner.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1

(DOCX 90.7 kb)

About this article

Cite this article

Becker, JM., Ringle, C.M., Sarstedt, M. et al. How collinearity affects mixture regression results. Mark Lett 26, 643–659 (2015). https://doi.org/10.1007/s11002-014-9299-9

  • DOI: https://doi.org/10.1007/s11002-014-9299-9
