A dual subspace parsimonious mixture of matrix normal distributions

Sharp, Alex; Chalatov, Glen; Browne, Ryan P.

doi:10.1007/s11634-022-00526-2


Regular Article
Published: 16 November 2022

Advances in Data Analysis and Classification, volume 17, pages 801–822 (2023)

Abstract

We present a parsimonious dual-subspace clustering approach for a mixture of matrix-normal distributions. By assuming certain principal components of the row and column covariance matrices are equally important, we express the model in fewer parameters without sacrificing discriminatory information. We derive update rules for an ECM algorithm and set forth necessary conditions to ensure identifiability. We use simulation to demonstrate parameter recovery, and we illustrate the parsimony and competitive performance of the model through two data analyses.
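The parsimony argument in the abstract can be illustrated with a small parameter count. The sketch below is not the paper's own count: it considers a single $p \times p$ covariance matrix whose first $d$ eigenvalues are free and whose remaining $p - d$ eigenvalues are assumed equal, and compares its number of free parameters with that of an unconstrained covariance matrix. The function names are hypothetical, and any extra constraint a matrix-normal model needs (for example, fixing the relative scale of the row and column covariances) is ignored.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unconstrained symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def subspace_cov_params(p: int, d: int) -> int:
    """Free parameters when the trailing p - d eigenvalues are assumed equal.

    ASSUMPTION: d distinct leading eigenvalues and one shared trailing
    eigenvalue, with 0 < d < p. This is an illustrative count only.
    """
    if not 0 < d < p:
        raise ValueError("need 0 < d < p")
    # Choosing the d-dimensional leading subspace: d * (p - d) parameters.
    # Orienting the d eigenvectors inside that subspace: d * (d - 1) / 2.
    orientation = d * (p - d) + d * (d - 1) // 2
    # d leading eigenvalues plus one shared trailing eigenvalue.
    return orientation + d + 1


if __name__ == "__main__":
    for p, d in [(5, 2), (28, 2)]:
        print(p, d, full_cov_params(p), subspace_cov_params(p, d))
```

For example, with $p = 5$ and $d = 2$ the constrained matrix has 10 free parameters against 15 for the unconstrained one, and the gap widens quickly as $p$ grows.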



Funding

Funding was provided by Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada (Grant No. 04444).

Author information

Authors and Affiliations

Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
Alex Sharp, Glen Chalatov & Ryan P. Browne


Corresponding author

Correspondence to Alex Sharp.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Parameters used in model selection simulation

The parameters used to generate observations are as follows. The mean parameters are

$$\begin{aligned}
\mathbf{M}_1 &= \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\mathbf{M}_2 = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \\
\mathbf{M}_3 &= \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \quad
\mathbf{M}_4 = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix}.
\end{aligned}$$

The covariance parameters are the same across groups and across dimensions and are specified as

$$\boldsymbol{\Phi}_{ig} = \begin{bmatrix} 1.5157166 & 0 \\ 0 & 1.5157166 \end{bmatrix}, \quad \text{and} \quad \eta_{ig} = 0.7578583,$$

where $i = 1,2$.
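As a rough illustration of how these parameters could be used to generate one observation, the sketch below draws a single $5 \times 5$ matrix from a matrix normal distribution with the first mean matrix. It rests on assumptions not stated in this appendix: each $5 \times 5$ covariance is taken to have eigenvalues $(1.5157166, 1.5157166, 0.7578583, 0.7578583, 0.7578583)$, that is, the diagonal of $\boldsymbol{\Phi}_{ig}$ followed by $\eta_{ig}$ repeated, and the eigenvectors are taken to be the standard basis. The helper `sample_matrix_normal` is a hypothetical name, not code from the paper.

```python
import numpy as np

# Values copied from the appendix.
phi = 1.5157166  # repeated diagonal entry of Phi_ig
eta = 0.7578583  # common value eta_ig

# First mean matrix from the appendix.
M1 = np.array([
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
], dtype=float)

# ASSUMPTION: leading eigenvalues come from Phi_ig, the trailing three equal
# eta_ig, and the eigenvector matrix is the identity (not given in the source).
eigenvalues = np.array([phi, phi, eta, eta, eta])
row_cov = np.diag(eigenvalues)
col_cov = np.diag(eigenvalues)

# Determinant of the assumed covariance (product of its eigenvalues).
det_row = np.linalg.det(row_cov)


def sample_matrix_normal(mean, row_cov, col_cov, rng):
    """Draw X = mean + A Z B^T, where A A^T = row_cov and B B^T = col_cov."""
    A = np.linalg.cholesky(row_cov)
    B = np.linalg.cholesky(col_cov)
    Z = rng.standard_normal(mean.shape)  # i.i.d. standard normal entries
    return mean + A @ Z @ B.T


rng = np.random.default_rng(0)
X = sample_matrix_normal(M1, row_cov, col_cov, rng)
print("determinant of assumed covariance:", det_row)
print("sample shape:", X.shape)
```

Under the stated assumption the determinant of each covariance matrix is approximately 1, which would be consistent with a unit-determinant scaling of the covariances, although the appendix itself does not say so.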

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Sharp, A., Chalatov, G. & Browne, R.P. A dual subspace parsimonious mixture of matrix normal distributions. Adv Data Anal Classif 17, 801–822 (2023). https://doi.org/10.1007/s11634-022-00526-2


Received: 04 May 2021
Revised: 02 September 2022
Accepted: 29 October 2022
Published: 16 November 2022
Issue Date: September 2023
DOI: https://doi.org/10.1007/s11634-022-00526-2


Mathematics Subject Classification

62H30
