Growth Mixture Modeling with Measurement Selection

Flynt, Abby; Dean, Nema

doi:10.1007/s00357-018-9275-9


Published: 29 October 2018

Journal of Classification, Volume 36, pages 3–25 (2019)

Abby Flynt¹ &
Nema Dean²


Abstract

Growth mixture models are an important tool for detecting group structure in repeated measures data. Unlike traditional clustering methods, they explicitly model the repeated measurements on observations, and the statistical framework on which they are based allows model selection methods to be used to choose the number of clusters. However, the basic growth mixture model assumes that all of the measurements in the data carry grouping information that separates the clusters. In other clustering contexts, it has been shown that including non-clustering variables in clustering procedures can lead to poor estimation of the group structure, both in terms of the number of clusters and cluster membership/parameters. In this paper, we present an extension of the growth mixture model that incorporates stepwise variable selection, based on the work of Maugis, Celeux, and Martin-Magniette (2009) and Raftery and Dean (2006). Results from a simulation study suggest that the method performs well in correctly selecting the clustering variables and improves recovery of the cluster structure compared with the basic growth mixture model. The paper also presents an application of the model to a clinical study dataset and concludes with a discussion and suggestions for future work in this area.
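The idea of asking whether a measurement carries clustering information can be illustrated with a BIC comparison. The sketch below is not the authors' algorithm: it is a deliberately simplified, univariate screen that compares a one-group normal model against a two-group normal mixture with a shared variance for a single measurement, using the convention BIC = 2 log-likelihood minus (number of parameters) times log n, so larger is better. The function names (`bic_one_group`, `bic_two_groups`, `normal_quantiles`) and the toy data are hypothetical, and the stepwise procedure in the paper additionally conditions on the measurements already selected, which this sketch omits.

```python
import math
from statistics import NormalDist


def normal_logpdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def bic_one_group(xs):
    """BIC for a single normal model (2 parameters: mean and variance)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    loglik = sum(normal_logpdf(x, mean, var) for x in xs)
    return 2.0 * loglik - 2 * math.log(n)


def bic_two_groups(xs, iters=300):
    """BIC for a two-component normal mixture with shared variance, fitted by EM
    (4 parameters: two means, one variance, one mixing proportion)."""
    n = len(xs)
    s = sorted(xs)
    m1, m2 = s[n // 4], s[(3 * n) // 4]  # start the means at the quartiles
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    p = 0.5
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to component 1
        resp = []
        for x in xs:
            a = math.log(p) + normal_logpdf(x, m1, var)
            b = math.log(1.0 - p) + normal_logpdf(x, m2, var)
            top = max(a, b)
            ea, eb = math.exp(a - top), math.exp(b - top)
            resp.append(ea / (ea + eb))
        # M-step: weighted means, pooled variance, mixing proportion
        n1 = min(max(sum(resp), 1e-8), n - 1e-8)  # keep both components non-empty
        n2 = n - n1
        m1 = sum(r * x for r, x in zip(resp, xs)) / n1
        m2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
        var = sum(r * (x - m1) ** 2 + (1.0 - r) * (x - m2) ** 2
                  for r, x in zip(resp, xs)) / n
        var = max(var, 1e-8)
        p = n1 / n
    # Mixture log-likelihood, computed with a log-sum-exp for stability
    loglik = 0.0
    for x in xs:
        a = math.log(p) + normal_logpdf(x, m1, var)
        b = math.log(1.0 - p) + normal_logpdf(x, m2, var)
        top = max(a, b)
        loglik += top + math.log(math.exp(a - top) + math.exp(b - top))
    return 2.0 * loglik - 4 * math.log(n)


def normal_quantiles(n, mu=0.0):
    """Deterministic toy sample: n evenly spaced quantiles of N(mu, 1)."""
    dist = NormalDist(mu, 1.0)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Toy data (hypothetical): one measurement with two well-separated groups,
# and one measurement with no group structure.
informative = normal_quantiles(100, 0.0) + normal_quantiles(100, 6.0)
uninformative = normal_quantiles(200, 0.0)

for name, xs in [("informative", informative), ("uninformative", uninformative)]:
    diff = bic_two_groups(xs) - bic_one_group(xs)
    decision = "keep" if diff > 0 else "drop"
    print(name, "BIC difference:", round(diff, 1), "->", decision)
```

A positive BIC difference is read here as evidence that the measurement separates groups, and a negative one as evidence that it does not. The shared variance is a design choice for the sketch: it keeps the mixture likelihood bounded, so EM cannot collapse a component onto a single point.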


References

BAUDRY, J.P., RAFTERY, A.E., CELEUX, G., LO, K., and GOTTARDO, R. (2010), “Combining Mixture Components for Clustering”, Journal of Computational and Graphical Statistics, 19, 332–353.
BIERNACKI, C., and GOVAERT, G. (1997), “Using the Classification Likelihood to Choose the Number of Clusters”, Computing Science and Statistics, 29, 451–457.
BIERNACKI, C., and GOVAERT, G. (1999), “Choosing Models in Model-Based Clustering and Discriminant Analysis”, Journal of Statistical Computation and Simulation, 64, 49–71.
DEAN, N., and RAFTERY, A.E. (2010), “Latent Class Analysis Variable Selection”, Annals of the Institute of Statistical Mathematics, 62(1), 11–35.
DEMPSTER, A.P., LAIRD, N.M., and RUBIN, D.B. (1977), “Maximum Likelihood from Incomplete Data Via the EM Algorithm”, Journal of the Royal Statistical Society, Series B (Methodological), 1–38.
EVERITT, B., LANDAU, S., LEESE, M., and STAHL, D. (2011), Cluster Analysis, Wiley Series in Probability and Statistics, Chichester, UK: Wiley.
FRALEY, C., and RAFTERY, A.E. (1998), “How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis”, The Computer Journal, 41(8), 578–588.
FRALEY, C., and RAFTERY, A.E. (2002), “Model-Based Clustering, Discriminant Analysis, and Density Estimation”, Journal of the American Statistical Association, 97, 611–631.
FRALEY, C., RAFTERY, A.E., MURPHY, T.B., and SCRUCCA, L. (2012), “Mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation”, Report No. 597, Department of Statistics, University of Washington.
GRÜN, B., and LEISCH, F. (2008), “Finite Mixtures of Generalized Linear Regression Models”, in Recent Advances in Linear Models and Related Areas: Essays in Honour of Helge Toutenburg, Physica-Verlag.
GUPTA, M.R., and CHEN, Y. (2011), “Theory and Use of the EM Algorithm”, Foundations and Trends® in Signal Processing, 4(3), 223–296.
HARTIGAN, J.A. (1975), Clustering Algorithms, Wiley.
HARTIGAN, J.A. (1981), “Consistency of Single Linkage for High-Density Clusters”, Journal of the American Statistical Association, 76, 388–394.
HENNIG, C. (2010), “Methods for Merging Gaussian Mixture Components”, Advances in Data Analysis and Classification, 4, 3–34.
HUBERT, L., and ARABIE, P. (1985), “Comparing Partitions”, Journal of Classification, 2(1), 193–218.
JAMES, G.M., and SUGAR, C.A. (2003), “Clustering for Sparsely Sampled Functional Data”, Journal of the American Statistical Association, 98, 565–576.
KERIBIN, C. (2000), “Consistent Estimation of the Order of Mixture Models”, Sankhya, 62, 49–66.
LAZARSFELD, P.F., and HENRY, N.W. (1968), Latent Structure Analysis, Houghton Mifflin.
MACQUEEN, J.B. (1967), “Some Methods for Classification and Analysis of Multivariate Observations”, in Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press.
MAUGIS, C., CELEUX, G., and MARTIN-MAGNIETTE, M-L. (2009), “Variable Selection for Clustering with Gaussian Mixture Models”, Biometrics, 65(3), 701–709.
MCLACHLAN, G.J., and KRISHNAN, T. (2008), The EM Algorithm and Extensions, Wiley.
MCNICHOLAS, P.D., and SUBEDI, S. (2012), “Clustering Gene Expression Time Course Data Using Mixtures of Multivariate t-Distributions”, Journal of Statistical Planning and Inference, 142(5), 1114–1127.
MELNYKOV, V. (2016), “Merging Mixture Components for Clustering Through Pairwise Overlap”, Journal of Computational and Graphical Statistics, 24(1), 66–90.
MURPHY, T.B., DEAN, N., and RAFTERY, A.E. (2010), “Variable Selection and Updating in Model-Based Discriminant Analysis for High Dimensional Data with Food Authenticity Applications”, Annals of Applied Statistics, 4, 396–421.
MUTHÉN, B., and SHEDDEN, K. (1999), “Finite Mixture Modeling with Mixture Outcomes Using the EM Algorithm”, Biometrics, 55(2), 463–469.
PEARSON, K. (1894), “Contribution to the Mathematical Theory of Evolution”, Philosophical Transactions of the Royal Society of London, Series A, 71.
R CORE TEAM (2015), “R: A Language and Environment for Statistical Computing”, R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/.
RAFTERY, A.E. (1995), “Bayesian Model Selection in Social Research (With Discussion)”, Sociological Methodology, 111–196.
RAFTERY, A.E., and DEAN, N. (2006), “Variable Selection for Model-Based Clustering”, Journal of the American Statistical Association, 101(473), 168–178.
RAM, N., and GRIMM, K.J. (2009), “Methods and Measures: Growth Mixture Modeling: A Method for Identifying Differences in Longitudinal Change Among Unobserved Groups”, International Journal of Behavioral Development, 33(6), 565–576.
RUSAKOV, D., and GEIGER, D. (2005), “Asymptotic Model Selection for Naive Bayesian Networks”, Journal of Machine Learning Research, 6, 1–35.
SCHWARZ, G.E. (1978), “Estimating the Dimension of a Model”, Annals of Statistics, 6(2), 461–464.
SCRUCCA, L. (2016), “Identifying Connected Components in Gaussian Finite Mixture Models for Clustering”, Computational Statistics and Data Analysis, 93, 5–17.
STEEL, R.G.D., and TORRIE, J.H. (1960), Principles and Procedures of Statistics with Special Reference to the Biological Sciences, McGraw Hill.
THASE, M.E., GREENHOUSE, J.B., FRANK, E., REYNOLDS, C.F., PILKONIS, P.A., HURLEY, K., GROCHOCINSKI, V., and KUPFER, D.J. (1997), “Treatment of Major Depression with Psychotherapy or Psychotherapy-Pharmacotherapy Combinations”, Archives of General Psychiatry, 54(11), 1009–1015.
TITTERINGTON, D.M., SMITH, A.F.M., and MAKOV, U.E. (1985), Statistical Analysis of Finite Mixture Distributions (Vol 7), New York: Wiley.
WARD, J.H. (1963), “Hierarchical Grouping to Optimize an Objective Function”, Journal of the American Statistical Association, 58(301), 236–244.
WISHART, D. (1969), “Mode Analysis: A Generalization of Nearest Neighbor Which Reduces Chaining Effects”, in Numerical Taxonomy, ed. A.J. Cole, Academic Press, pp. 282–311.


Author information

Authors and Affiliations

Department of Mathematics, Bucknell University, Lewisburg, PA, 17837, USA
Abby Flynt
University of Glasgow, Glasgow, Scotland
Nema Dean


Corresponding author

Correspondence to Abby Flynt.


About this article

Cite this article

Flynt, A., Dean, N. Growth Mixture Modeling with Measurement Selection. J Classif 36, 3–25 (2019). https://doi.org/10.1007/s00357-018-9275-9


Published: 29 October 2018
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s00357-018-9275-9
