MatTransMix: an R Package for Matrix Model-Based Clustering and Parsimonious Mixture Modeling

Journal of Classification

Abstract

Finite mixture modeling, when extended to matrix-valued data, faces several challenges. A major concern is overparameterization, which results from the large number of parameters involved in a matrix mixture. In addition, an appropriate power transformation is useful when the data are skewed. The R package MatTransMix is new software devoted to parsimonious models, based on the spectral decomposition of covariance matrices, for fitting heterogeneous matrix-valued data and producing model-based clustering results. The package implements a variety of parsimonious models obtained from various combinations of spectral decomposition and skewness parameters. The paper discusses some methodological foundations of the proposed models and illustrates the functions available in the package on carefully chosen examples.

Author information

Corresponding author

Correspondence to Xuwen Zhu.

About this article

Cite this article

Zhu, X., Sarkar, S. & Melnykov, V. MatTransMix: an R Package for Matrix Model-Based Clustering and Parsimonious Mixture Modeling. J Classif 39, 147–170 (2022). https://doi.org/10.1007/s00357-021-09401-9

  • DOI: https://doi.org/10.1007/s00357-021-09401-9
