Finite Mixture Models with Student t Distributions: an Applied Example

Prevention Science

Abstract

The use of finite mixture modeling (FMM) to identify unobservable or latent groupings of individuals within a population has increased rapidly in applied prevention research. However, many prevention scientists are still unaware of the statistical assumptions underlying FMM. In particular, finite mixture models (FMMs) typically assume that the observed indicator variables are normally distributed within each latent subgroup (i.e., within-class normality). This assumption is rarely met in applied psychological and prevention research, and violating it when fitting an FMM can lead to the identification of spurious subgroups and/or biased parameter estimates. Although new methods have been developed that relax the within-class normality assumption, prevention scientists continue to rely on FMM methods that assume within-class normality. The purpose of the current article is to introduce prevention researchers to an FMM method for heavy-tailed data: FMM with Student t distributions. We begin by reviewing the distributional assumptions that underlie FMM and the limitations of FMM with normal distributions. Next, we introduce FMM with Student t distributions and show, step by step, the analytic and substantive results of fitting an FMM with normal and Student t distributions to data from a smoking-cessation trial. Finally, we extend the results of the applied example to draw conclusions about the use of FMM with Student t distributions in applied settings and to provide guidelines for researchers who wish to use these methods in their own research.
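For readers who want the two models side by side, the component densities can be sketched as follows. This is the standard textbook formulation rather than notation taken from the article; the symbols K (number of subgroups), p (number of indicators), the mixing proportions, the location and scale parameters, and the degrees of freedom are all assumed here for illustration.

```latex
% K-component finite mixture for a p-dimensional observation y
f(y) = \sum_{k=1}^{K} \pi_k \, f_k(y \mid \theta_k),
\qquad \pi_k > 0, \quad \sum_{k=1}^{K} \pi_k = 1

% Normal component (within-class normality)
f_k(y \mid \mu_k, \Sigma_k) = (2\pi)^{-p/2} \, |\Sigma_k|^{-1/2}
  \exp\!\left\{ -\tfrac{1}{2} (y - \mu_k)^{\top} \Sigma_k^{-1} (y - \mu_k) \right\}

% Student t component with \nu_k degrees of freedom
f_k(y \mid \mu_k, \Sigma_k, \nu_k) =
  \frac{\Gamma\!\left(\frac{\nu_k + p}{2}\right)}
       {\Gamma\!\left(\frac{\nu_k}{2}\right) (\pi \nu_k)^{p/2} \, |\Sigma_k|^{1/2}}
  \left[ 1 + \frac{(y - \mu_k)^{\top} \Sigma_k^{-1} (y - \mu_k)}{\nu_k} \right]^{-(\nu_k + p)/2}
```

Small values of the degrees of freedom give heavier tails than the normal component, and the t component approaches the normal one as the degrees of freedom grow large, which is why the t mixture can absorb outlying observations without adding a subgroup.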

Notes

  1. These data are part of a larger simulation study conducted to highlight the potential dangers of fitting an FMM with normal distributions (FMM-n) to non-normally distributed datasets. The results of this study are available in the online supplemental material.

  2. Soft-randomization scheme: uses initialization values for the ECM algorithm between 0 and 1. Hard-randomization scheme: uses initialization values of either 0 or 1 for the ECM algorithm.

  3. The k-means initialization procedure derives initialization values from a k-means clustering procedure and uses the parameter estimates derived from the k-means analysis as starting values in the ECM algorithm.

  4. A model’s entropy is an aggregate measure of classification uncertainty, derived from each individual’s posterior probabilities of subgroup membership. Entropy scores range from 0.00 to 1.00, with higher values (> .80) indicating adequate separation between the identified subgroups (Asparouhov and Muthén 2018). R code to derive entropy scores from a fitted FMM is available in the online supplemental material.

  5. To assign each individual to a unique subgroup, we took a “classify-analyze” approach in which participants were assigned to the subgroup corresponding to their highest posterior probability. This approach was deemed appropriate because the majority of the identified models had an entropy ≥ 0.90 (Clark and Muthén 2009).
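The entropy score in note 4 and the modal assignment in note 5 can both be computed from a table of posterior probabilities. The sketch below is illustrative only: it assumes the relative-entropy definition commonly reported by mixture-modeling software (one minus the summed classification uncertainty divided by n times ln K), the function names are hypothetical, and the posterior probabilities are made up. The article's own R code is in its supplemental material and may differ.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy in [0, 1] from an n x K table of posterior probabilities.

    Assumed definition: 1 - sum_i sum_k(-p_ik * ln p_ik) / (n * ln K).
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("entropy is undefined for a single-class model")
    # Total classification uncertainty; terms with p = 0 contribute 0.
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))


def modal_assignment(posteriors):
    """Index of the subgroup with the highest posterior probability per individual."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]


# Hypothetical posterior probabilities: four individuals, two subgroups.
posteriors = [
    [0.98, 0.02],
    [0.95, 0.05],
    [0.10, 0.90],
    [0.03, 0.97],
]
entropy = relative_entropy(posteriors)
classes = modal_assignment(posteriors)
```

Modal assignment discards each individual's classification uncertainty, which is why the note ties its use to high entropy: when posterior probabilities are close to 0 or 1, little information is lost by treating subgroup membership as known.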

References

  • Andrews, J. L., & McNicholas, P. D. (2012). Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions. Statistics and Computing, 22, 1021–1029. https://doi.org/10.1007/s11222-011-9272-x.

  • Andrews, J. L., Wickins, J. R., Boers, N. M., & McNicholas, P. D. (2018). teigen: An R package for model-based clustering and classification via the multivariate t distribution. Journal of Statistical Software, 83, 1–32. https://doi.org/10.18637/jss.v083.i07.

  • Andrews, J. L., McNicholas, P. D., & Subedi, S. (2011). Model-based classification via mixtures of multivariate t-distributions. Computational Statistics & Data Analysis, 55, 520–529.

  • Asparouhov, T., & Muthén, B. (2016). Structural equation models and mixture models with continuous nonnormal skewed distributions. Structural Equation Modeling: A Multidisciplinary Journal, 23, 1–19.

  • Asparouhov, T., & Muthén, B. (2018). Variable-specific entropy contribution. Retrieved from http://www.statmodel.com/download/UnivariateEntropy.pdf.

  • Bauer, D. J. (2007). Observations on the use of growth mixture models in psychological research. Multivariate Behavioral Research, 42, 757–786.

  • Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes. Psychological Methods, 8, 338–363. https://doi.org/10.1037/1082-989X.8.3.338.

  • Bauer, D. J., & Curran, P. J. (2004). The integration of continuous and discrete latent variable models: Potential problems and promising opportunities. Psychological Methods, 9, 3–29. https://doi.org/10.1037/1082-989X.9.1.3.

  • Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57, 289–300.

  • Blanca, M. J., Arnau, J., López-Montiel, D., Bono, R., & Bendayan, R. (2013). Skewness and kurtosis in real data samples. Methodology, 9, 78–84.

  • Bonanno, G. A., & Mancini, A. D. (2012). Beyond resilience and PTSD: Mapping the heterogeneity of responses to potential trauma. Psychological Trauma: Theory, Research, Practice, and Policy, 4, 74–83. https://doi.org/10.1037/a0017829.

  • Bonanno, G. A., Ho, S. M. Y., Chan, J. C. K., Kwong, R. S. Y., Cheung, C. K. Y., Wong, C. P. Y., & Wong, V. C. W. (2008). Psychological resilience and dysfunction among hospitalized survivors of the SARS epidemic in Hong Kong: A latent class approach. Health Psychology, 27, 659–667. https://doi.org/10.1037/0278-6133.27.5.659.

  • Burgess-Hull, A. J., Roberts, L. J., Piper, M. E., & Baker, T. B. (2018). The social networks of smokers attempting to quit: An empirically derived and validated classification. Psychology of Addictive Behaviors, 32, 64–75. https://doi.org/10.1037/adb0000336.

  • Clark, S. L., & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. Retrieved from: https://www.statmodel.com/download/relatinglca.pdf

  • Cudeck, R., & Henly, S. J. (2003). A realistic perspective on pattern representation in growth data: Comment on Bauer and Curran (2003). Psychological Methods, 8, 378–383.

  • Forster, M. R. (2000). Key concepts in model selection: Performance and generalizability. Journal of Mathematical Psychology, 44, 205–231.

  • Forster, M. (2004). Simplicity and unification in model selection. Retrieved from http://philosophy.wisc.edu/forster/520/Chapter 3.pdf.

  • Fraley, C., & Raftery, A. E. (1998). How many clusters? Which clustering method? Answers via model-based cluster analysis. Computer Journal, 41, 578–588.

  • Gerogiannis, D., Nikou, C., & Likas, A. (2009). The mixtures of Student’s t-distributions as a robust framework for rigid registration. Image and Vision Computing, 27, 1285–1294.

  • Gibson, W. A. (1959). Three multivariate models: Factor analysis, latent structure analysis and latent profile analysis. Psychometrika, 24, 229–252. https://doi.org/10.1007/BF02289845.

  • Hennig, C. (2015). What are the true clusters? Pattern Recognition Letters, 64, 53–62.

  • Jackson, K. M., Sher, K. J., & Wood, P. K. (2000). Trajectories of concurrent substance use disorders: A developmental, typological approach to comorbidity. Alcoholism: Clinical and Experimental Research, 24, 902–913.

  • Krueger, R. F., Markon, K. E., Patrick, C. J., & Iacono, W. G. (2005). Externalizing psychopathology in adulthood: A dimensional-spectrum conceptualization and its implications for DSM-V. Journal of Abnormal Psychology, 114, 537.

  • Lange, K. L., Little, R. J., & Taylor, J. M. (1989). Robust statistical modeling using the t distribution. Journal of the American Statistical Association, 84, 881–896.

  • Lanza, S. T., & Rhoades, B. L. (2013). Latent class analysis: An alternative perspective on subgroup analysis in prevention and treatment. Prevention Science, 14, 157–168.

  • Lee, S. X., & McLachlan, G. J. (2013). On mixtures of skew normal and skew t-distributions. Advances in Data Analysis and Classification, 7, 241–266.

  • Lei, H., Nahum-Shani, I., Lynch, K., Oslin, D., & Murphy, S. A. (2012). A “SMART” design for building individualized treatment sequences. Annual Review of Clinical Psychology, 8, 21–48. https://doi.org/10.1146/annurev-clinpsy-032511-143152.

  • Lo, Y., Mendell, N. R., & Rubin, D. B. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767–778. https://doi.org/10.1093/biomet/88.3.767.

  • Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics, 18, 50–60.

  • McLachlan, G. J., & Peel, D. (2000). Finite mixture models. Wiley.

  • McLachlan, G. J., & Peel, D. (1998). Robust cluster analysis via mixtures of multivariate t-distributions. In A. Amin, D. Dori, P. Pudil, & H. Freeman (Eds.), Advances in pattern recognition. SSPR/SPR 1998 (pp. 658–666). Berlin, Heidelberg: Springer.

  • McNicholas, P. D., & Subedi, S. (2012). Clustering gene expression time course data using mixtures of multivariate t-distributions. Journal of Statistical Planning and Inference, 142, 1114–1127.

  • Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166. https://doi.org/10.1037/0033-2909.105.1.156.

  • Muthén, B. (2003). Statistical and substantive checking in growth mixture modeling: Comment on Bauer and Curran (2003). Psychological Methods, 8, 369–377.

  • Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus User’s Guide (Eighth ed.). Los Angeles, CA: Muthén & Muthén.

  • Nagin, D. S., & Tremblay, R. E. (2005). Developmental trajectory groups: Fact or a useful statistical fiction? Criminology, 43, 873–904.

  • Nylund, K. L., Asparouhov, T., & Muthén, B. O. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling: A Multidisciplinary Journal, 14, 535–569.

  • Peel, D., & McLachlan, G. J. (2000). Robust mixture modelling using the t distribution. Statistics and Computing, 10, 339–348. https://doi.org/10.1023/A:1008981510081.

  • Piper, M. E., Smith, S. S., Schlam, T. R., Fiore, M. C., Jorenby, D. E., Fraser, D., & Baker, T. B. (2009). A randomized placebo-controlled clinical trial of 5 smoking cessation pharmacotherapies. Archives of General Psychiatry, 66, 1253–1262.

  • Piper, M. E., Cook, J. W., Schlam, T. R., Jorenby, D. E., Smith, S. S., Bolt, D. M., & Loh, W. Y. (2010). Gender, race, and education differences in abstinence rates among participants in two randomized smoking cessation trials. Nicotine & Tobacco Research, 12, 647–657.

  • Posada, D., & Buckley, T. R. (2004). Model selection and model averaging in phylogenetics: Advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology, 53, 793–808.

  • R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

  • Rocke, D. M., & Woodruff, D. L. (1997). Robust estimation of multivariate location and shape. Journal of Statistical Planning and Inference, 57, 245–255.

  • Sampson, R. J., & Laub, J. H. (2005). Seductions of method: Rejoinder to Nagin and Tremblay’s “Developmental trajectory groups: Fact or fiction?”. Criminology, 43, 905–913.

  • Tofighi, D., & Enders, C. K. (2008). Identifying the correct number of classes in growth mixture models. In Advances in latent variable mixture models (pp. 317–341). Information Age Publishing.

  • Van Horn, M. L., Smith, J., Fagan, A. A., Jaki, T., Feaster, D. J., Masyn, K., et al. (2012). Not quite normal: Consequences of violating the assumption of normality in regression mixture models. Structural Equation Modeling: A Multidisciplinary Journal, 19, 227–249.

  • Vermunt, J., & Magidson, J. (2002). Latent class cluster analysis. In J. Hagenaars & A. McCutcheon (Eds.), Applied latent class analysis (pp. 89–106).

  • Vrbik, I., & McNicholas, P. D. (2014). Parsimonious skew mixture models for model-based clustering and classification. Computational Statistics & Data Analysis, 71, 196–210.

  • Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: Journal of the Econometric Society, 57, 307–333.

Acknowledgments

The author would like to thank Daniel Bolt, Kristin Litzelman, Robert Nix, and David Epstein for helpful comments and feedback on earlier drafts of this article.

Funding

This research was supported by a Summertime Academic Research (STAR) award from the University of Wisconsin’s School of Human Ecology and by a training grant in connection with grant P50 DA019706 from the National Institute on Drug Abuse awarded to Michael Fiore and Timothy Baker.

Author information

Corresponding author

Correspondence to Albert J. Burgess-Hull.

Ethics declarations

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Conflict of Interest

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent or assent was obtained from all individual participants included in the study.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Preliminary results contained in the current manuscript were previously disseminated at the 29th annual convention for the Association for Psychological Science.

Electronic supplementary material

ESM 1 (DOCX 106 kb)

About this article

Cite this article

Burgess-Hull, A.J. Finite Mixture Models with Student t Distributions: an Applied Example. Prev Sci 21, 872–883 (2020). https://doi.org/10.1007/s11121-020-01109-3

  • DOI: https://doi.org/10.1007/s11121-020-01109-3
