Abstract
Scale mixtures of normal distributions are often used as a challenging class for statistical analysis of symmetrical data. Recently, Ferreira et al. (Stat Methodol 8:154–171, 2011) defined the univariate skew scale mixtures of normal distributions, which offer much-needed flexibility by combining skewness with heavy tails. In this paper, we develop a multivariate version of the skew scale mixtures of normal distributions, with emphasis on the multivariate skew-Student-t, skew-slash and skew-contaminated normal distributions. The main virtue of the members of this family is that they are easy to simulate from and that they supply genuine expectation/conditional maximisation either (ECME) algorithms for maximum likelihood estimation. The observed information matrix is derived analytically to obtain standard errors. Results from real and simulated datasets are reported to illustrate the usefulness of the proposed method.
References
Andrews, D.F., Mallows, C.L.: Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B 36, 99–102 (1974)
Arellano-Valle, R.B., Bolfarine, H., Lachos, V.H.: Skew-normal linear mixed models. J. Data Sci. 3, 415–438 (2005)
Azzalini, A.: A class of distributions which includes the normal ones. Scand. J. Stat. 12, 171–178 (1985)
Azzalini, A., Capitanio, A.: Distributions generated by perturbation of symmetry with emphasis on the multivariate skew-\(t\) distribution. J. R. Stat. Soc. Ser. B 65, 367–389 (2003)
Azzalini, A., Dalla-Valle, A.: The multivariate skew-normal distribution. Biometrika 83(4), 715–726 (1996)
Azzalini, A., Capello, T.D., Kotz, S.: Log-skew-normal and log-skew-\(t\) distributions as models for family income data. J. Income Distrib. 11, 13–21 (2003)
Bolfarine, H., Lachos, V.: Skew probit error-in-variables models. Stat. Methodol. 3, 1–12 (2007)
Branco, M.D., Dey, D.K.: A general class of multivariate skew-elliptical distributions. J. Multivar. Anal. 79, 99–113 (2001)
Cabral, C.R.B., Lachos, V.H., Prates, M.O.: Multivariate mixture modeling using skew-normal independent distributions. Comput. Stat. Data Anal. 56(1), 126–142 (2012)
Cabral, C.R.B., Lachos, V.H., Zeller, C.B.: Multivariate measurement error models using finite mixtures of skew-Student \(t\) distributions. J. Multivar. Anal. 124, 179–198 (2014)
Cook, R.D., Weisberg, S.: An Introduction to Regression Graphics. Wiley, Hoboken (1994)
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977)
Ferreira, C.S., Bolfarine, H., Lachos, V.H.: Skew scale mixtures of normal distributions: properties and estimation. Stat. Methodol. 8, 154–171 (2011)
Gómez, H.W., Venegas, O., Bolfarine, H.: Skew-symmetric distributions generated by the normal distribution function. Environmetrics 18, 395–407 (2007)
Harville, D.: Matrix Algebra From a Statistician’s Perspective. Springer, New York (1997)
Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 1. Wiley, New York (1994)
Lachos, V.H., Vilca, L.F., Bolfarine, H., Ghosh, P.: Robust multivariate measurement error models with scale mixtures of skew-normal distributions. Statistics 44(6), 541–556 (2009)
Lachos, V.H., Ghosh, P., Arellano-Valle, R.B.: Likelihood based inference for skew-normal independent linear mixed models. Stat. Sin. 20(1), 303 (2010)
Lange, K.L., Sinsheimer, J.S.: Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Stat. 2, 175–198 (1993)
Lange, K.L., Little, R., Taylor, J.: Robust statistical modeling using \(t\) distribution. J. Am. Stat. Assoc. 84, 881–896 (1989)
Lin, T.I., Ho, H.J., Lee, C.R.: Flexible mixture modelling using the multivariate skew-\(t\)-normal distribution. Stat. Comput. 24, 531–546 (2013)
Little, R.J.A.: Robust estimation of the mean and covariance matrix from data with missing values. Appl. Stat. 37, 23–38 (1988)
Liu, C., Rubin, D.B.: The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 80, 267–278 (1994)
Osorio, F., Paula, G.A., Galea, M.: Assessment of local influence in elliptical linear models with longitudinal structure. Comput. Stat. Data Anal. 51(9), 4354–4368 (2007)
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2015). http://www.R-project.org/
Sahu, S.K., Dey, D.K., Branco, M.D.: A new class of multivariate distributions with applications to Bayesian regression models. Can. J. Stat. 31, 129–150 (2003)
Wang, J., Boyer, J., Genton, M.: A skew-symmetric representation of multivariate distributions. Stat. Sin. 14, 1259–1270 (2004)
Acknowledgments
We thank the editor, associate editor and two referees whose constructive comments led to an improved presentation of the paper. C.S. acknowledges support from FAPEMIG (Minas Gerais State Foundation for Research Development), Grant CEX APQ 01845/14. V.H. acknowledges support from CNPq-Brazil (Grant 305054/2011-2) and FAPESP-Brazil (Grant 2014/02938-9).
Appendices
Appendix 1: Details of the observed information matrix
Let \({\varvec{\alpha }}=\mathrm{Vech}({\mathbf {B}})\), where \({\varvec{\varSigma }}^{1/2}={\mathbf {B}}={\mathbf {B}}({\varvec{\alpha }})\). This appendix gives the first and second derivatives of \(\log |{\varvec{\varSigma }}|\), \(A_i\) and \(d_i\). The notation is that of Sect. 2 and, for a p-dimensional vector \({\varvec{\rho }}=(\rho _1,\ldots ,\rho _p)^{\top }\), we use the notation \(\dot{{\mathbf {B}}}_r=\partial {{\mathbf {B}}({\varvec{\alpha }})}/\partial {\alpha _r}\), with \(r=1,2,\ldots ,p(p+1)/2\). Thus,
- \({\varvec{\varSigma }}\)
$$\begin{aligned} \frac{\partial ^2 \log {|{\varvec{\varSigma }}|}}{\partial \alpha _k\partial \alpha _s}=-2 \text {tr}({\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_{s}{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_{k}), \end{aligned}$$
- \(A_i\)
$$\begin{aligned} \frac{\partial A_i}{\partial {\varvec{\mu }}}= & {} -{\mathbf {B}}^{-1}{\varvec{\lambda }},\quad \frac{\partial A_i}{\partial \alpha _{k}}=-{\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial A_i}{\partial {\varvec{\lambda }}}={\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}), \\ \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial {\varvec{\mu }}^{\top }}= & {} {\mathbf {0}},\quad \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial \alpha _k}={\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}{\varvec{\lambda }},\quad \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial {\varvec{\lambda }}^{\top }}=-{\mathbf {B}}^{-1},\\ \frac{\partial ^2 A_i}{\partial \alpha _k\partial \alpha _s}= & {} {\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial ^2 A_i}{\partial \alpha _k\partial {\varvec{\lambda }}}= & {} -{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial ^2 A_i}{\partial {\varvec{\lambda }}\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}}, \end{aligned}$$
- \(d_i\)
$$\begin{aligned} \frac{\partial d_i}{\partial {\varvec{\mu }}}= & {} -2{\mathbf {B}}^{-2}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial d_i}{\partial \alpha _k}=-({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial d_i}{\partial {\varvec{\lambda }}}= & {} {\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial {\varvec{\mu }}^{\top }}=2{\mathbf {B}}^{-2},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial \alpha _k}=2{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial {\varvec{\lambda }}^{\top }}= & {} {\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial \alpha _k\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\lambda }}\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}},\\ \frac{\partial ^2 d_i}{\partial \alpha _k\partial \alpha _s}= & {} ({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-2}\dot{{\mathbf {B}}}_s+\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-2}\dot{{\mathbf {B}}}_k\\&+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}). \end{aligned}$$
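Derivative formulas of this kind are easy to mistype, so a finite-difference check can be a useful safeguard before coding the information matrix. The sketch below is illustrative only and is not taken from the paper. It assumes \(A_i={\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }})\) and \(d_i=({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-2}({\mathbf {y}}_i-{\varvec{\mu }})\), which is what the first derivatives above imply, and it compares \(\partial A_i/\partial \alpha _k\), \(\partial d_i/\partial \alpha _k\) and \(\partial ^2\log |{\varvec{\varSigma }}|/\partial \alpha _k\partial \alpha _s\) with central differences. All function names and numerical values are hypothetical, and the ordering of the entries of \({\varvec{\alpha }}\) is the row-wise lower-triangle order that NumPy returns, which may differ from the usual Vech convention without affecting the check.

```python
import numpy as np

P = 3                                # dimension p (illustrative)
TRIL = np.tril_indices(P)            # positions of the p(p+1)/2 free entries
Q = len(TRIL[0])

def B_of(alpha):
    """Symmetric matrix B(alpha) whose lower triangle is filled from alpha."""
    low = np.zeros((P, P))
    low[TRIL] = alpha
    return low + np.tril(low, -1).T

def unit(k):
    """k-th unit vector of length p(p+1)/2."""
    e = np.zeros(Q)
    e[k] = 1.0
    return e

def B_dot(k):
    """dB/d(alpha_k): a constant 0/1 symmetric matrix, as B is linear in alpha."""
    return B_of(unit(k))

# Hypothetical test point; B is diagonally dominant, hence positive definite.
alpha = np.array([2.0, 0.3, 1.5, 0.1, 0.2, 1.0])
lam = np.array([1.0, -0.5, 2.0])
r = np.array([0.7, 1.3, -0.9])       # plays the role of y_i - mu

def A(a):                            # assumed A_i = lam' B^{-1} (y_i - mu)
    return lam @ np.linalg.solve(B_of(a), r)

def d(a):                            # assumed d_i = (y_i - mu)' B^{-2} (y_i - mu)
    z = np.linalg.solve(B_of(a), r)
    return z @ z

def logdet(a):                       # log|Sigma| = 2 log|B|
    return 2.0 * np.linalg.slogdet(B_of(a))[1]

def fd1(f, k, h=1e-6):
    """Central first difference in direction alpha_k."""
    return (f(alpha + h * unit(k)) - f(alpha - h * unit(k))) / (2 * h)

def fd2(f, k, s, h=1e-4):
    """Central second (cross) difference in directions alpha_k, alpha_s."""
    ek, es = h * unit(k), h * unit(s)
    return (f(alpha + ek + es) - f(alpha + ek - es)
            - f(alpha - ek + es) + f(alpha - ek - es)) / (4 * h * h)

Bi = np.linalg.inv(B_of(alpha))
# Largest absolute gap between each analytic formula and its numerical value.
err_A = max(abs(fd1(A, k) + lam @ Bi @ B_dot(k) @ Bi @ r) for k in range(Q))
err_d = max(abs(fd1(d, k) + r @ Bi @ (B_dot(k) @ Bi + Bi @ B_dot(k)) @ Bi @ r)
            for k in range(Q))
err_logdet = max(abs(fd2(logdet, k, s)
                     + 2.0 * np.trace(Bi @ B_dot(s) @ Bi @ B_dot(k)))
                 for k in range(Q) for s in range(Q))
print(err_A, err_d, err_logdet)
```

The remaining entries of the list can be checked the same way by swapping in the corresponding analytic expression.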
Appendix 2: Joint, conditional and marginal distributions of \(({\mathbf {Y}},U,T)\)
Note first that from (7), it follows that
with U and T independent, \({\varvec{\delta }}_u=\frac{{\varvec{\lambda }}}{\sqrt{u+{\varvec{\lambda }}^{\top }{\varvec{\lambda }}}}\), \({\varvec{\lambda }}_u={\varvec{\lambda }}/\sqrt{u}\).
Using some results given in Lachos et al. (2010), it follows that the joint distribution of \(({\mathbf {Y}},U,T)\) is given by
where \({\mathbf {A}}=\frac{{\varvec{\varSigma }}^{1/2}{\varvec{\delta }}_u}{u^{1/2}}\), \({\varvec{\varSigma }}_a=\frac{1}{u}{\varvec{\varSigma }}^{1/2}({\mathbf {I}}_p+{\varvec{\lambda }}_u{\varvec{\lambda }}_u^\top )^{-1}{\varvec{\varSigma }}^{1/2}\) and \(\varLambda =(1+{\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}{\mathbf {A}})^{-1}\). Using the results given in Harville (1997), and after some algebraic manipulations, it follows that \({\varvec{\varSigma }}_a+{\mathbf {A}}{\mathbf {A}}^\top =\frac{1}{u}{\varvec{\varSigma }}\), \(\varLambda =\frac{u}{u+{\varvec{\lambda }}^\top {\varvec{\lambda }}}\) and \(\varLambda {\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}=\varLambda ^{1/2}{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}\).
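The three identities stated above can be confirmed numerically. The following is a minimal sketch with arbitrary, hypothetical values of \({\varvec{\varSigma }}\), \({\varvec{\lambda }}\) and \(u\); the symmetric square root is computed by eigendecomposition, which is one common choice and an assumption here.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
M = rng.normal(size=(p, p))
Sigma = M @ M.T + p * np.eye(p)          # arbitrary positive definite matrix
w, V = np.linalg.eigh(Sigma)
B = (V * np.sqrt(w)) @ V.T               # symmetric square root Sigma^{1/2}
lam = rng.normal(size=p)                 # hypothetical skewness vector
u = 0.7                                  # hypothetical value of the mixing variable

delta_u = lam / np.sqrt(u + lam @ lam)
lam_u = lam / np.sqrt(u)
A = B @ delta_u / np.sqrt(u)
Sigma_a = B @ np.linalg.inv(np.eye(p) + np.outer(lam_u, lam_u)) @ B / u
Lambda = 1.0 / (1.0 + A @ np.linalg.solve(Sigma_a, A))

# Sigma_a + A A' = Sigma / u
id1 = np.allclose(Sigma_a + np.outer(A, A), Sigma / u)
# Lambda = u / (u + lam' lam)
id2 = np.isclose(Lambda, u / (u + lam @ lam))
# Lambda A' Sigma_a^{-1} = Lambda^{1/2} lam' Sigma^{-1/2} (compared as column vectors)
id3 = np.allclose(Lambda * np.linalg.solve(Sigma_a, A),
                  np.sqrt(Lambda) * np.linalg.solve(B, lam))
print(id1, id2, id3)
```

The first identity follows from the Sherman–Morrison formula, since \(({\mathbf {I}}_p+{\varvec{\lambda }}_u{\varvec{\lambda }}_u^\top )^{-1}={\mathbf {I}}_p-{\varvec{\delta }}_u{\varvec{\delta }}_u^\top \).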
Thus, the marginal distribution of \({\mathbf {Y}}\sim \mathrm{SSMN}_p({\varvec{\mu }},{\varvec{\varSigma }},{\varvec{\lambda }};H)\) is given by
Then the joint distribution of \(({\mathbf {Y}},T)\) is given by
and
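so that \(T|{\mathbf {Y}}={\mathbf {y}}\sim TN({\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}),1;(0,+\infty ))\), that is, a normal distribution with mean \({\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }})\) and unit variance truncated to \((0,+\infty )\).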
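Moments of this truncated normal are the kind of quantity an EM-type algorithm typically needs in its E-step. As a hedged illustration, not the authors' implementation, the sketch below uses the standard result that a \(N(m,1)\) variable truncated to \((0,+\infty )\) has mean \(m+\phi (m)/\varPhi (m)\), and compares it with inverse-CDF draws. The names `tn_mean` and `tn_sample` and the value of `m` are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def tn_mean(m):
    """Mean of N(m, 1) truncated to (0, +inf): m + phi(m) / Phi(m)."""
    return m + STD_NORMAL.pdf(m) / STD_NORMAL.cdf(m)

def tn_sample(m, rng):
    """One inverse-CDF draw from N(m, 1) truncated to (0, +inf)."""
    lo = STD_NORMAL.cdf(-m)                 # mass of N(m, 1) below zero
    p = lo + rng.random() * (1.0 - lo)      # uniform on [lo, 1)
    return m + STD_NORMAL.inv_cdf(p)

rng = random.Random(1)
m = 1.0   # hypothetical value of lambda' Sigma^{-1/2} (y - mu)
draws = [tn_sample(m, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(tn_mean(m), sample_mean)
```

Inverse-CDF sampling is chosen here for brevity; it loses accuracy when \(m\) is strongly negative, where a dedicated truncated-normal sampler would be preferable.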
Cite this article
Ferreira, C.S., Lachos, V.H. & Bolfarine, H. Likelihood-based inference for multivariate skew scale mixtures of normal distributions. AStA Adv Stat Anal 100, 421–441 (2016). https://doi.org/10.1007/s10182-016-0266-z
DOI: https://doi.org/10.1007/s10182-016-0266-z