
Statistics and Computing, Volume 29, Issue 3, pp 415–428

Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions

  • Mohsen Maleki
  • Darren Wraith
  • Reinaldo B. Arellano-Valle

Abstract

In this paper, we introduce an unrestricted skew-normal generalized hyperbolic (SUNGH) distribution for use in finite mixture modeling or clustering problems. The SUNGH is a broad class of flexible distributions that includes various other well-known asymmetric and symmetric families, such as the scale mixtures of skew-normal, the skew-normal generalized hyperbolic, and their corresponding symmetric versions. The class of distributions provides a much-needed unified framework in which the choice of the best-fitting distribution can proceed quite naturally, either through parameter estimation or by placing constraints on specific parameters and assessing the result through model choice criteria. The class has several desirable properties, including an analytically tractable density and ease of computation for simulation and estimation of parameters. We illustrate the flexibility of the proposed class of distributions in a mixture modeling context using a Bayesian framework and assess its performance using simulated and real data.

Keywords

Bayesian analysis; Finite mixtures; MCMC; Unrestricted skew-normal generalized hyperbolic family; Skew-normal; Generalized hyperbolic distribution

Notes

Acknowledgements

The authors would like to thank the coordinating editor and anonymous reviewers for their suggestions, corrections and encouragement, which helped us to improve earlier versions of the manuscript.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Mohsen Maleki (1)
  • Darren Wraith (2), corresponding author
  • Reinaldo B. Arellano-Valle (3)

  1. Department of Statistics, Shiraz University, Shiraz, Iran
  2. Institute of Health and Biomedical Innovation (IHBI), Queensland University of Technology (QUT), Brisbane, Australia
  3. Department of Statistics, Universidad Católica de Chile, Santiago, Chile
