Likelihood-based inference for multivariate skew scale mixtures of normal distributions

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

Scale mixtures of normal distributions are often used as a challenging class for statistical analysis of symmetrical data. Recently, Ferreira et al. (Stat Methodol 8:154–171, 2011) defined the univariate skew scale mixtures of normal distributions, which offer much-needed flexibility by combining skewness with heavy tails. In this paper, we develop a multivariate version of the skew scale mixtures of normal distributions, with emphasis on the multivariate skew-Student-t, skew-slash and skew-contaminated normal distributions. The main virtue of the members of this family is that they are easy to simulate from and that they admit genuine expectation/conditional maximisation either (ECME) algorithms for maximum likelihood estimation. The observed information matrix is derived analytically to obtain standard errors. Results from real and simulated datasets are reported to illustrate the usefulness of the proposed method.

References

  • Andrews, D.F., Mallows, C.L.: Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B 36, 99–102 (1974)

  • Arellano-Valle, R.B., Bolfarine, H., Lachos, V.H.: Skew-normal linear mixed models. J. Data Sci. 3, 415–438 (2005)

  • Azzalini, A.: A class of distributions which includes the normal ones. Scand. J. Stat. 12, 171–178 (1985)

  • Azzalini, A., Capitanio, A.: Distributions generated by perturbation of symmetry with emphasis on a multivariate skew-\(t\) distribution. J. R. Stat. Soc. Ser. B 65, 367–389 (2003)

  • Azzalini, A., Dalla-Valle, A.: The multivariate skew-normal distribution. Biometrika 83(4), 715–726 (1996)

  • Azzalini, A., Capello, T.D., Kotz, S.: Log-skew-normal and log-skew-\(t\) distributions as models for family income data. J. Income Distrib. 11, 13–21 (2003)

  • Bolfarine, H., Lachos, V.: Skew probit error-in-variables models. Stat. Methodol. 3, 1–12 (2007)

  • Branco, M.D., Dey, D.K.: A general class of multivariate skew-elliptical distributions. J. Multivar. Anal. 79, 99–113 (2001)

  • Cabral, C.R.B., Lachos, V.H., Prates, M.O.: Multivariate mixture modeling using skew-normal independent distributions. Comput. Stat. Data Anal. 56(1), 126–142 (2012)

  • Cabral, C.R.B., Lachos, V.H., Zeller, C.B.: Multivariate measurement error models using finite mixtures of skew-Student \(t\) distributions. J. Multivar. Anal. 124, 179–198 (2014)

  • Cook, R.D., Weisberg, S.: An Introduction to Regression Graphics. Wiley, Hoboken (1994)

  • Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977)

  • Ferreira, C.S., Bolfarine, H., Lachos, V.H.: Skew scale mixtures of normal distributions: properties and estimation. Stat. Methodol. 8, 154–171 (2011)

  • Gómez, H.W., Venegas, O., Bolfarine, H.: Skew-symmetric distributions generated by the normal distribution function. Environmetrics 18, 395–407 (2007)

  • Harville, D.: Matrix Algebra From a Statistician’s Perspective. Springer, New York (1997)

  • Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 1. Wiley, New York (1994)

  • Lachos, V.H., Vilca, L.F., Bolfarine, H., Ghosh, P.: Robust multivariate measurement error models with scale mixtures of skew-normal distributions. Statistics 44(6), 541–556 (2009)

  • Lachos, V.H., Ghosh, P., Arellano-Valle, R.B.: Likelihood based inference for skew-normal independent linear mixed models. Stat. Sin. 20(1), 303 (2010)

  • Lange, K.L., Sinsheimer, J.S.: Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Stat. 2, 175–198 (1993)

  • Lange, K.L., Little, R., Taylor, J.: Robust statistical modeling using the \(t\) distribution. J. Am. Stat. Assoc. 84, 881–896 (1989)

  • Lin, T.I., Ho, H.J., Lee, C.R.: Flexible mixture modelling using the multivariate skew-\(t\)-normal distribution. Stat. Comput. 24, 531–546 (2013)

  • Little, R.J.A.: Robust estimation of the mean and covariance matrix from data with missing values. Appl. Stat. 37, 23–38 (1988)

  • Liu, C., Rubin, D.B.: The ECME algorithm: a simple extension of EM and ECM with faster monotone convergence. Biometrika 81, 633–648 (1994)

  • Osorio, F., Paula, G.A., Galea, M.: Assessment of local influence in elliptical linear models with longitudinal structure. Comput. Stat. Data Anal. 51(9), 4354–4368 (2007)

  • R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2015). http://www.R-project.org/

  • Sahu, S.K., Dey, D.K., Branco, M.D.: A new class of multivariate distributions with applications to Bayesian regression models. Can. J. Stat. 31, 129–150 (2003)

  • Wang, J., Boyer, J., Genton, M.: A skew-symmetric representation of multivariate distributions. Stat. Sin. 14, 1259–1270 (2004)

Acknowledgments

We thank the editor, associate editor and two referees whose constructive comments led to an improved presentation of the paper. C.S. acknowledges support from FAPEMIG (Minas Gerais State Foundation for Research Development), Grant CEX APQ 01845/14. V.H. acknowledges support from CNPq-Brazil (Grant 305054/2011-2) and FAPESP-Brazil (Grant 2014/02938-9).

Author information

Corresponding author

Correspondence to Víctor H. Lachos.

Appendices

Appendix 1: Details of the observed information matrix

Let \({\varvec{\alpha }}=\mathrm{Vech}({\mathbf {B}})\), where \({\varvec{\varSigma }}^{1/2}={\mathbf {B}}={\mathbf {B}}({\varvec{\alpha }})\). This appendix gives the first and second derivatives of \(\log |{\varvec{\varSigma }}|\), \(A_i\) and \(d_i\) that enter the observed information matrix. The notation is that of Sect. 2; the first derivatives below correspond to \(A_i={\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }})\) and \(d_i=({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-2}({\mathbf {y}}_i-{\varvec{\mu }})\). We write \(\dot{{\mathbf {B}}}_r=\partial {{\mathbf {B}}({\varvec{\alpha }})}/\partial {\alpha _r}\), with \(r=1,2,\ldots ,p(p+1)/2\). Thus,

  • \({\varvec{\varSigma }}\)

    $$\begin{aligned} \frac{\partial ^2 \log {|{\varvec{\varSigma }}|}}{\partial \alpha _k\partial \alpha _s}=-2 \text {tr}({\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_{s}{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_{k}), \end{aligned}$$
  • \(A_i\)

    $$\begin{aligned} \frac{\partial A_i}{\partial {\varvec{\mu }}}= & {} -{\mathbf {B}}^{-1}{\varvec{\lambda }},\quad \frac{\partial A_i}{\partial \alpha _{k}}=-{\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial A_i}{\partial {\varvec{\lambda }}}={\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}), \\ \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial {\varvec{\mu }}^{\top }}= & {} {\mathbf {0}},\quad \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial \alpha _k}={\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}{\varvec{\lambda }},\quad \frac{\partial ^2 A_i}{\partial {\varvec{\mu }}\partial {\varvec{\lambda }}^{\top }}=-{\mathbf {B}}^{-1},\\ \frac{\partial ^2 A_i}{\partial \alpha _k\partial \alpha _s}= & {} {\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial ^2 A_i}{\partial \alpha _k\partial {\varvec{\lambda }}}= & {} -{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial ^2 A_i}{\partial {\varvec{\lambda }}\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}}, \end{aligned}$$
  • \(d_i\)

    $$\begin{aligned} \frac{\partial d_i}{\partial {\varvec{\mu }}}= & {} -2{\mathbf {B}}^{-2}({\mathbf {y}}_i-{\varvec{\mu }}),\quad \frac{\partial d_i}{\partial \alpha _k}=-({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial d_i}{\partial {\varvec{\lambda }}}= & {} {\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial {\varvec{\mu }}^{\top }}=2{\mathbf {B}}^{-2},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial \alpha _k}=2{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}),\\ \frac{\partial ^2 d_i}{\partial {\varvec{\mu }}\partial {\varvec{\lambda }}^{\top }}= & {} {\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial \alpha _k\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}},\quad \frac{\partial ^2 d_i}{\partial {\varvec{\lambda }}\partial {\varvec{\lambda }}^{\top }}={\mathbf {0}},\\ \frac{\partial ^2 d_i}{\partial \alpha _k\partial \alpha _s}= & {} ({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-1} [\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}+\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-2}\dot{{\mathbf {B}}}_s+\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-2}\dot{{\mathbf {B}}}_k\\&+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k+{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_k{\mathbf {B}}^{-1}\dot{{\mathbf {B}}}_s]{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }}). \end{aligned}$$
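
Closed-form derivatives of this kind are easy to mistype, so a finite-difference comparison can serve as a sanity check. The following Python sketch is not part of the paper. It assumes \(A_i={\varvec{\lambda }}^{\top }{\mathbf {B}}^{-1}({\mathbf {y}}_i-{\varvec{\mu }})\) and \(d_i=({\mathbf {y}}_i-{\varvec{\mu }})^{\top }{\mathbf {B}}^{-2}({\mathbf {y}}_i-{\varvec{\mu }})\), a column-wise ordering of \(\mathrm{Vech}({\mathbf {B}})\), and randomly generated test values. The helper names `vech_basis` and `check_appendix1` are hypothetical. It compares three of the expressions above with numerical derivatives.

```python
import numpy as np

def vech_basis(p):
    """Matrices dB/d(alpha_r) for alpha = Vech(B), column-wise lower triangle."""
    basis = []
    for j in range(p):
        for i in range(j, p):
            E = np.zeros((p, p))
            E[i, j] = E[j, i] = 1.0
            basis.append(E)
    return basis

def check_appendix1(p=3, k=1, s=4, h=1e-4, seed=0):
    """Return {name: (analytic, numeric)} for three Appendix 1 derivatives."""
    rng = np.random.default_rng(seed)
    basis = vech_basis(p)
    M = rng.normal(size=(p, p))
    B0 = M @ M.T + p * np.eye(p)              # symmetric positive definite B
    alpha = np.array([B0[i, j] for j in range(p) for i in range(j, p)])
    lam = rng.normal(size=p)                  # skewness vector lambda (test value)
    r = rng.normal(size=p)                    # residual y_i - mu (test value)
    e = np.eye(alpha.size)

    def B(a):
        return sum(ar * E for ar, E in zip(a, basis))

    def logdet_sigma(a):                      # log|Sigma| = 2 log|B|
        return 2.0 * np.linalg.slogdet(B(a))[1]

    def A(a):                                 # assumed A_i = lambda' B^{-1} (y_i - mu)
        return lam @ np.linalg.solve(B(a), r)

    def d(a):                                 # assumed d_i = (y_i - mu)' B^{-2} (y_i - mu)
        z = np.linalg.solve(B(a), r)
        return z @ z

    def central(f, j):                        # central difference, first derivative
        return (f(alpha + h * e[j]) - f(alpha - h * e[j])) / (2 * h)

    def mixed(f, j, l):                       # central difference, mixed second derivative
        return (f(alpha + h * e[j] + h * e[l]) - f(alpha + h * e[j] - h * e[l])
                - f(alpha - h * e[j] + h * e[l]) + f(alpha - h * e[j] - h * e[l])) / (4 * h * h)

    Bi, Bk, Bs = np.linalg.inv(B0), basis[k], basis[s]
    return {
        "d2 log|Sigma|": (-2.0 * np.trace(Bi @ Bs @ Bi @ Bk), mixed(logdet_sigma, k, s)),
        "dA/dalpha_k": (-lam @ Bi @ Bk @ Bi @ r, central(A, k)),
        "dd/dalpha_k": (-r @ Bi @ (Bk @ Bi + Bi @ Bk) @ Bi @ r, central(d, k)),
    }
```

Calling `check_appendix1()` returns pairs of analytic and numerical values that should agree up to finite-difference error. The same pattern extends to the remaining entries of the information matrix.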

Appendix 2: Joint, conditional and marginal distributions of \(({\mathbf {Y}},U,T)\)

Note first that from (7), it follows that

$$\begin{aligned} \begin{array}{rcl} {\mathbf {Y}}|T=t, U= u&{}\sim &{} N_p({\varvec{\mu }}+ \frac{t}{u^{1/2}}{\varvec{\varSigma }}^{1/2}{\varvec{\delta }}_u,\frac{1}{u}{\varvec{\varSigma }}^{1/2}({\mathbf {I}}_p+{\varvec{\lambda }}_u{{\varvec{\lambda }}_u}^\top )^{-1}{\varvec{\varSigma }}^{1/2}),\\ U&{} \sim &{}H({\varvec{\tau }}),\quad T\sim TN(0,1;(0,+\infty )),\end{array} \end{aligned}$$
(23)

with U and T independent, \({\varvec{\delta }}_u=\frac{{\varvec{\lambda }}}{\sqrt{u+{\varvec{\lambda }}^{\top }{\varvec{\lambda }}}}\), \({\varvec{\lambda }}_u={\varvec{\lambda }}/\sqrt{u}\).
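
Representation (23) translates directly into a sampler, which is one way to read the remark that members of this family are easy to simulate from. The Python sketch below is illustrative only and is not the authors' code: the function names `rssmn`, `sym_sqrt` and `gamma_mixing` are hypothetical, the mixing distribution \(H\) is passed in as a user-supplied sampler, and the Gamma mixing law shown for the skew-Student-t case is the usual normal/independent choice rather than something stated in this excerpt.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def rssmn(n, mu, Sigma, lam, draw_u, rng):
    """Draw n vectors via (23): U ~ H, T ~ TN(0,1;(0,inf)), then Y | T, U."""
    mu, lam = np.asarray(mu, float), np.asarray(lam, float)
    p = mu.size
    S_half = sym_sqrt(np.asarray(Sigma, float))
    u = draw_u(rng, n)                        # U ~ H(tau), supplied by the caller
    t = np.abs(rng.standard_normal(n))        # half-normal = TN(0, 1; (0, inf))
    y = np.empty((n, p))
    for i in range(n):
        delta_u = lam / np.sqrt(u[i] + lam @ lam)
        lam_u = lam / np.sqrt(u[i])
        mean = mu + t[i] / np.sqrt(u[i]) * (S_half @ delta_u)
        inner = np.linalg.inv(np.eye(p) + np.outer(lam_u, lam_u))
        cov = S_half @ inner @ S_half / u[i]
        y[i] = mean + np.linalg.cholesky(cov) @ rng.standard_normal(p)
    return y

def gamma_mixing(nu):
    """Assumed H for the skew-Student-t case: U ~ Gamma(nu/2, rate nu/2)."""
    return lambda rng, n: rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
```

For example, `rssmn(500, [0, 0], np.eye(2), [3, 0], gamma_mixing(4.0), np.random.default_rng(0))` draws 500 bivariate observations. The explicit loop favours readability over speed.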

Using some results given in Lachos et al. (2010), it follows that the joint distribution of \(({\mathbf {Y}},U,T)\) is given by

$$\begin{aligned} f({\mathbf {y}},u,t)= & {} 2\phi _p\left( {\mathbf {y}}|{\varvec{\mu }}+{\mathbf {A}}t,{\varvec{\varSigma }}_a\right) \phi _1(t|0,1)h(u;{\varvec{\tau }})\\= & {} 2\phi _p({\mathbf {y}}|{\varvec{\mu }},{\varvec{\varSigma }}_a+{\mathbf {A}}{\mathbf {A}}^\top )\phi _1(t|\varLambda {\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}({\mathbf {y}}-{\varvec{\mu }}),\varLambda )h(u;{\varvec{\tau }}),\\&{\mathbf {y}}\in {\mathbb {R}}^p,\;t >0,\;u>0, \end{aligned}$$

where \({\mathbf {A}}=\frac{{\varvec{\varSigma }}^{1/2}{\varvec{\delta }}_u}{u^{1/2}}\), \({\varvec{\varSigma }}_a=\frac{1}{u}{\varvec{\varSigma }}^{1/2}({\mathbf {I}}_p+{\varvec{\lambda }}_u{\varvec{\lambda }}_u^\top )^{-1}{\varvec{\varSigma }}^{1/2}\) and \(\varLambda =(1+{\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}{\mathbf {A}})^{-1}\). Using the results given in Harville (1997), and after some algebraic manipulations, it follows that \({\varvec{\varSigma }}_a+{\mathbf {A}}{\mathbf {A}}^\top =\frac{1}{u}{\varvec{\varSigma }}\), \(\varLambda =\frac{u}{u+{\varvec{\lambda }}^\top {\varvec{\lambda }}}\) and \(\varLambda {\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}=\varLambda ^{1/2}{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}\).
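
The three identities can be confirmed numerically. The first one rests on the Sherman-Morrison-type inverse \(({\mathbf {I}}_p+{\varvec{\lambda }}_u{\varvec{\lambda }}_u^\top )^{-1}={\mathbf {I}}_p-{\varvec{\delta }}_u{\varvec{\delta }}_u^\top \), presumably the kind of matrix result from Harville (1997) that is meant here. The Python sketch below is an illustration with arbitrary test values, not part of the paper; `appendix2_quantities` is a hypothetical helper that builds \({\mathbf {A}}\), \({\varvec{\varSigma }}_a\) and \(\varLambda \) from their definitions.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def appendix2_quantities(Sigma, lam, u):
    """Build A, Sigma_a and Lambda directly from their definitions."""
    p = lam.size
    S_half = sym_sqrt(Sigma)
    delta_u = lam / np.sqrt(u + lam @ lam)
    lam_u = lam / np.sqrt(u)
    A = S_half @ delta_u / np.sqrt(u)
    Sigma_a = S_half @ np.linalg.inv(np.eye(p) + np.outer(lam_u, lam_u)) @ S_half / u
    Lam = 1.0 / (1.0 + A @ np.linalg.solve(Sigma_a, A))
    return A, Sigma_a, Lam

# Arbitrary test values (assumptions for illustration only)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
lam = np.array([2.0, -1.0])
u = 0.7
A, Sigma_a, Lam = appendix2_quantities(Sigma, lam, u)

# Each flag is True when the corresponding identity holds numerically
identity_cov = np.allclose(Sigma_a + np.outer(A, A), Sigma / u)
identity_lam = np.isclose(Lam, u / (u + lam @ lam))
identity_mean = np.allclose(Lam * A @ np.linalg.inv(Sigma_a),
                            np.sqrt(Lam) * lam @ np.linalg.inv(sym_sqrt(Sigma)))
```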

Thus, the marginal distribution of \({\mathbf {Y}}\sim \mathrm{SSMN}_p({\varvec{\mu }},{\varvec{\varSigma }},{\varvec{\lambda }};H)\) is given by

$$\begin{aligned} f({\mathbf {y}})= & {} 2\int _0^{+\infty }\int _0^{+\infty }\phi _p \left( {\mathbf {y}}|{\varvec{\mu }},\frac{{\varvec{\varSigma }}}{u}\right) \phi _1 (t|\varLambda {\mathbf {A}}^\top {\varvec{\varSigma }}_a^{-1}({\mathbf {y}}-{\varvec{\mu }}),\varLambda )h(u;{\varvec{\tau }})\mathrm{d}t\mathrm{d}u\\= & {} 2\int _0^{+\infty }\phi _p\left( {\mathbf {y}}|{\varvec{\mu }},\frac{{\varvec{\varSigma }}}{u}\right) h(u;{\varvec{\tau }})\int _0^{+\infty }\phi _1(t|\varLambda ^{1/2}{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}),\varLambda )\mathrm{d}t\mathrm{d}u\\= & {} 2\int _0^{+\infty }\int _0^{+\infty }\phi _p\left( {\mathbf {y}}|{\varvec{\mu }},\frac{{\varvec{\varSigma }}}{u}\right) h(u;{\varvec{\tau }})\phi _1(t|{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}),1)\mathrm{d}t\mathrm{d}u\\= & {} 2\int _0^{+\infty }\phi _p\left( {\mathbf {y}}|{\varvec{\mu }},\frac{{\varvec{\varSigma }}}{u}\right) h(u;{\varvec{\tau }})\mathrm{d}u\varPhi _1({\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }})). \end{aligned}$$

Then the joint distribution of \(({\mathbf {Y}},T)\) is given by

$$\begin{aligned} f({\mathbf {y}},t)=2f_0({\mathbf {y}}|{\varvec{\mu }},{\varvec{\varSigma }})\phi _1 (t|{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}),1),\quad {\mathbf {y}}\in {\mathbb {R}}^p,\;t>0, \end{aligned}$$
(24)

and

$$\begin{aligned} f(t|{\mathbf {y}})=\frac{\phi _1(t|{\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2} ({\mathbf {y}}-{\varvec{\mu }}),1)}{\varPhi _1({\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}))}, \end{aligned}$$
(25)

so that \(T|{\mathbf {Y}}={\mathbf {y}}\sim TN({\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }}),1;(0,+\infty ))\).
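
A truncated normal of this form has a closed-form mean: writing \(a={\varvec{\lambda }}^\top {\varvec{\varSigma }}^{-1/2}({\mathbf {y}}-{\varvec{\mu }})\), the mean of \(TN(a,1;(0,+\infty ))\) is \(a+\phi _1(a)/\varPhi _1(a)\), a standard result. The short Python sketch below evaluates density (25) and this mean using only the standard library. The function names are hypothetical, and the use of this moment in an EM-type E-step is an assumption about how it would typically be applied, not a statement from this appendix.

```python
import math

def norm_pdf(x):
    """Standard normal density phi_1(x | 0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal distribution function Phi_1(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cond_density_t(t, a):
    """Density (25) of T | Y = y at t, with a = lambda' Sigma^{-1/2} (y - mu)."""
    return norm_pdf(t - a) / norm_cdf(a) if t > 0 else 0.0

def cond_mean_t(a):
    """Mean of TN(a, 1; (0, inf)): a + phi(a) / Phi(a)."""
    return a + norm_pdf(a) / norm_cdf(a)
```

For strongly negative values of `a`, `norm_cdf(a)` underflows towards zero, so a production implementation would need a more careful evaluation of the ratio.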

Cite this article

Ferreira, C.S., Lachos, V.H. & Bolfarine, H. Likelihood-based inference for multivariate skew scale mixtures of normal distributions. AStA Adv Stat Anal 100, 421–441 (2016). https://doi.org/10.1007/s10182-016-0266-z

  • DOI: https://doi.org/10.1007/s10182-016-0266-z
