Advances in Data Analysis and Classification, Volume 10, Issue 4, pp 423–440

A mixture of generalized hyperbolic factor analyzers

  • Cristina Tortora
  • Paul D. McNicholas
  • Ryan P. Browne
Regular Article


The mixture of factor analyzers model, which has been used successfully for the model-based clustering of high-dimensional data, is extended to generalized hyperbolic mixtures. The development of a mixture of generalized hyperbolic factor analyzers is outlined, drawing upon the relationship with the generalized inverse Gaussian distribution. An alternating expectation-conditional maximization algorithm is used for parameter estimation, and the Bayesian information criterion is used to select the number of factors as well as the number of components. The performance of our generalized hyperbolic factor analyzers model is illustrated on real and simulated data, where it performs favourably compared to its Gaussian analogue and other approaches.
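The BIC-based choice of the number of components and the number of factors mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the parameter count is the one for an unconstrained Gaussian mixture of factor analyzers (the generalized hyperbolic version carries additional skewness, concentration, and index parameters per component, which are not counted here), and the "larger is better" form of the BIC is assumed.

```python
import math


def mfa_free_params(G, p, q):
    """Free parameters in an unconstrained Gaussian mixture of factor analyzers.

    G: number of components, p: data dimension, q: number of factors.
    Assumption: Gaussian count only; extra generalized hyperbolic
    parameters are deliberately left out of this sketch.
    """
    loadings = p * q - q * (q - 1) // 2  # loading matrix, identified up to rotation
    per_component = p + loadings + p     # mean + loadings + diagonal noise variances
    return (G - 1) + G * per_component   # mixing proportions + component parameters


def bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' form: 2*loglik - n_params*log(n_obs)."""
    return 2.0 * loglik - n_params * math.log(n_obs)


def select_model(logliks, p, n_obs):
    """Return the (G, q) pair with the largest BIC, plus all scores.

    logliks: dict mapping (G, q) to the maximized log-likelihood of that fit.
    """
    scores = {
        (G, q): bic(ll, mfa_free_params(G, p, q), n_obs)
        for (G, q), ll in logliks.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods for three fitted models (illustrative values only).
example_logliks = {(1, 1): -500.0, (2, 1): -450.0, (2, 2): -449.0}
best_model, all_scores = select_model(example_logliks, p=5, n_obs=100)
```

In this toy grid the two-component, one-factor model is selected: the extra factor in the (2, 2) fit raises the log-likelihood by only one unit, which does not offset the penalty for its eight additional parameters.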


Keywords

Clustering · Generalized hyperbolic distribution · Mixture of factor analyzers · AECM algorithm

Mathematics Subject Classification

62H30 62F99 



The authors are grateful to an associate editor and anonymous reviewers for their very helpful comments and suggestions, the cumulative effect of which has been a stronger manuscript.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Cristina Tortora (1)
  • Paul D. McNicholas (1)
  • Ryan P. Browne (1)
  1. Department of Mathematics and Statistics, McMaster University, Hamilton, Canada