Appendix: multivariate skew distributions
Several versions of the multivariate skew-normal (SN) and skew-\(t\) (ST) distributions have been proposed and used in the literature (Arellano-Valle and Genton 2005; Arellano-Valle et al. 2007; Azzalini and Capitanio 2003; Sahu et al. 2003, among others). Sahu et al. (2003) developed a new class of distributions by introducing skewness into multivariate elliptically symmetric distributions. The class, which is obtained by transformation and conditioning, contains many standard families, including the multivariate SN and ST distributions. For completeness, this appendix briefly summarizes the multivariate SN and ST distributions that are used to define the joint models with skew distributions considered in this paper. Assume that an \(m\)-dimensional random vector \(\varvec{Y}\) follows an \(m\)-variate SN or ST distribution with location vector \(\varvec{\mu }\), \(m \times m\) positive definite (diagonal) dispersion matrix \(\varvec{\Sigma }\), \(m \times m\) skewness matrix \(\varvec{\Delta }(\varvec{\delta })=\textit{diag}(\delta _1, \delta _2,\ldots , \delta _m)\) and, for the ST distribution, degrees of freedom \(\nu \), where \(\varvec{\delta }=(\delta _1,\ldots ,\delta _m)^T\) is the skewness parameter vector. In what follows, we briefly discuss the multivariate SN and ST distributions introduced by Sahu et al. (2003), which are well suited to Bayesian inference because they are built using the conditioning method. For a detailed discussion of the properties of the SN and ST distributions, see Sahu et al. (2003).
Skew-t distribution
An \(m\)-dimensional random vector \(\varvec{Y}\) follows an \(m\)-variate ST distribution if its probability density function (pdf) is given by
$$\begin{aligned} f(\varvec{y}|\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }),\nu )= 2^m t_{m,\nu }(\varvec{y}|\varvec{\mu }, \varvec{A})P(\varvec{U}>\mathbf{0}), \end{aligned}$$
(15)
where \(\varvec{A}=\varvec{\Sigma }+\varvec{\Delta }^2(\varvec{\delta })\), \(t_{m,\nu }(\varvec{\mu }, \varvec{A})\) denotes the \(m\)-variate \(t\) distribution with parameters \(\varvec{\mu },\,\varvec{A}\) and degrees of freedom \(\nu \), \(t_{m,\nu }(\varvec{y}|\varvec{\mu }, \varvec{A})\) denotes the corresponding pdf, and \(\varvec{U}\) follows the \(t\) distribution \(t_{m,\nu +m}(\cdot )\). We denote this distribution by \(ST_{m,\nu }(\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }))\). In particular, when \(\varvec{\Sigma }=\sigma ^2 \varvec{I}_m\) and \(\varvec{\Delta }(\varvec{\delta })=\delta \varvec{I}_m\), equation (15) simplifies to
$$\begin{aligned} f(\varvec{y}|\varvec{\mu },\sigma ^2,\delta ,\nu )&= 2^m (\sigma ^2+\delta ^2)^{-m/2}\frac{\Gamma ((\nu +m)/2)}{\Gamma (\nu /2)(\nu \pi )^{m/2}}\nonumber \\&\quad \times \left\{ 1+\frac{(\varvec{y}-\varvec{\mu })^T(\varvec{y}-\varvec{\mu })}{\nu (\sigma ^2+\delta ^2)} \right\} ^{-(\nu +m)/2} \nonumber \\&\quad \times \, T_{m,\nu +m} \left[ \left\{ \frac{\nu +(\sigma ^2+\delta ^2)^{-1}(\varvec{y}-\varvec{\mu })^T(\varvec{y}-\varvec{\mu })}{\nu +m}\right\} ^{-1/2}\right. \nonumber \\&\qquad \qquad \qquad \qquad \qquad \left. \frac{\delta (\varvec{y}-\varvec{\mu })}{\sigma \sqrt{\sigma ^2+\delta ^2}}\right] \!, \end{aligned}$$
(16)
where \(T_{m,\nu +m}(\cdot )\) denotes the cumulative distribution function (cdf) of \(t_{m,\nu +m}(\mathbf{0},\varvec{I}_m)\). Unlike the SN distribution below, however, the ST density cannot be written as a product of univariate ST densities: the components of \(\varvec{Y}\) are uncorrelated but not independent.
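As an illustrative special case of (16), offered as a sketch rather than a result taken from the cited references, set \(m=1\). Writing \(z=(y-\mu )/\sqrt{\sigma ^2+\delta ^2}\) (a shorthand introduced here) and letting \(t_{\nu }(\cdot )\) denote the standard univariate \(t\) pdf with \(\nu \) degrees of freedom, direct substitution gives
$$\begin{aligned} f(y|\mu ,\sigma ^2,\delta ,\nu )=\frac{2}{\sqrt{\sigma ^2+\delta ^2}}\; t_{\nu }(z)\; T_{1,\nu +1}\left\{ \frac{\delta }{\sigma }\, z\left( \frac{\nu +1}{\nu +z^2}\right) ^{1/2}\right\} . \end{aligned}$$
The first two factors are the \(t_{1,\nu }(y|\mu ,\sigma ^2+\delta ^2)\) density, and the argument of \(T_{1,\nu +1}\) is the bracketed term of (16) with \((y-\mu )^2/(\sigma ^2+\delta ^2)=z^2\).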
The mean and covariance matrix of the ST distribution \(ST_{m,\nu }(\varvec{\mu },\sigma ^2 \varvec{I}_m,\varvec{\Delta }(\varvec{\delta }))\) are given by
$$\begin{aligned} E(\varvec{Y})&= \varvec{\mu }+(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}\varvec{\delta },\nonumber \\ cov(\varvec{Y})&= \left[ \sigma ^2\varvec{I}_m+\varvec{\Delta }^2(\varvec{\delta })\right] \frac{\nu }{\nu -2}-\frac{\nu }{\pi }\left[ \frac{\Gamma \{(\nu -1)/2\}}{\Gamma (\nu /2)}\right] ^2\varvec{\Delta }^2(\varvec{\delta }). \end{aligned}$$
(17)
Note that when \(\varvec{\delta }=\mathbf{0}\), the ST distribution reduces to the usual \(t\) distribution. To obtain a zero mean vector, the location parameter must be set to \(\varvec{\mu }=-(\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}\varvec{\delta }\), which is what we assume in the paper. To illustrate the shape of an ST distribution, plots of the ST density for the skewness parameter values \(\delta = -3, 0, 3\) are shown in Fig. 5a.
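The moment formulas in (17) and the zero-mean choice of \(\varvec{\mu }\) can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function names (`st_mean_shift`, `st_zero_mean_location`, `st_covariance_diag`) are hypothetical and introduced here for illustration, and the code assumes \(\varvec{\Sigma }=\sigma ^2\varvec{I}_m\) as in (17), together with the usual existence conditions \(\nu >1\) for the mean and \(\nu >2\) for the covariance.

```python
import math


def st_mean_shift(nu):
    """Return c(nu) = sqrt(nu/pi) * Gamma((nu-1)/2) / Gamma(nu/2).

    With this constant, the mean in (17) reads E(Y) = mu + c(nu) * delta.
    """
    if nu <= 1:
        raise ValueError("the mean exists only for nu > 1")
    # Work on the log scale so that large nu does not overflow the gamma function.
    log_ratio = math.lgamma((nu - 1) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt(nu / math.pi) * math.exp(log_ratio)


def st_zero_mean_location(delta, nu):
    """Location vector mu = -c(nu) * delta that makes E(Y) = 0."""
    c = st_mean_shift(nu)
    return [-c * d for d in delta]


def st_covariance_diag(sigma2, delta, nu):
    """Diagonal of cov(Y) in (17); off-diagonal entries are zero."""
    if nu <= 2:
        raise ValueError("the covariance exists only for nu > 2")
    c = st_mean_shift(nu)
    return [(sigma2 + d * d) * nu / (nu - 2.0) - c * c * d * d for d in delta]


if __name__ == "__main__":
    delta = [-3.0, 0.0, 3.0]
    print(st_zero_mean_location(delta, nu=5.0))
    print(st_covariance_diag(1.0, delta, nu=5.0))
```

Working with `math.lgamma` rather than `math.gamma` is a design choice made here so that the ratio of gamma functions stays finite for large \(\nu \).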
By the proposition of Sahu et al. (2003), the ST distribution of \(\varvec{Y}\) has a convenient stochastic representation as follows.
$$\begin{aligned} \varvec{Y}=\varvec{\mu }+\varvec{\Delta }(\varvec{\delta })|\varvec{X}_0|+\varvec{\Sigma }^{1/2}\varvec{X}_1, \end{aligned}$$
(18)
where \(\varvec{X}_0\) and \(\varvec{X}_1\) are two \(m\)-dimensional random vectors, each following \(t_{m,\nu }(\mathbf{0},\varvec{I}_m)\) marginally. In the construction of Sahu et al. (2003) they are jointly \(t\) distributed, so they are uncorrelated but not independent, which is why the conditional scale \(\omega \) in (19) depends on \(\varvec{w}\). Expression (18) provides a convenient device for random number generation and for implementation. Let \(\varvec{w}=|\varvec{X}_0|\); then \(\varvec{w}\) follows an \(m\)-dimensional standard \(t\) distribution \(t_{m,\nu }(\mathbf{0},\varvec{I}_m)\) truncated in the space \(\varvec{w}>\mathbf{0}\) (i.e., the standard half-\(t\) distribution). Thus, a hierarchical representation of (18) is given by
$$\begin{aligned} \varvec{Y}|\varvec{w}\sim t_{m,\nu +m}(\varvec{\mu }+\varvec{\Delta }(\varvec{\delta })\varvec{w}, \omega \varvec{\Sigma }),\; \varvec{w}\sim t_{m,\nu }(\mathbf{0},\varvec{I}_m)\varvec{I}(\varvec{w}>\mathbf{0}), \end{aligned}$$
(19)
where \(\omega =(\nu +\varvec{w}^T\varvec{w})/(\nu +m)\).
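To make the use of (18) for random number generation concrete, the sketch below draws from the ST distribution with a diagonal \(\varvec{\Sigma }\). It rests on an assumption stated here rather than in the text: \(\varvec{X}_0\) and \(\varvec{X}_1\) are generated as a joint \(t\) vector by dividing independent standard normal vectors by the square root of one shared gamma variable \(\xi \sim \text{Gamma}(\nu /2,\ \text{rate}=\nu /2)\). This normal/gamma scale mixture is a standard way to obtain a joint \(t\) vector, and it is the construction consistent with the conditional law in (19). The function name `sample_st` and its arguments are hypothetical.

```python
import math
import random


def sample_st(mu, sigma2, delta, nu, rng):
    """Draw one vector from ST_{m,nu}(mu, diag(sigma2), diag(delta)).

    Assumption: (X0, X1) is jointly t, built from a shared mixing variable xi.
    """
    # random.gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
    xi = rng.gammavariate(nu / 2.0, 2.0 / nu)
    scale = 1.0 / math.sqrt(xi)
    y = []
    for mu_i, s2_i, d_i in zip(mu, sigma2, delta):
        x0 = scale * rng.gauss(0.0, 1.0)  # component of X0
        x1 = scale * rng.gauss(0.0, 1.0)  # component of X1
        # Representation (18): Y = mu + Delta |X0| + Sigma^{1/2} X1.
        y.append(mu_i + d_i * abs(x0) + math.sqrt(s2_i) * x1)
    return y


if __name__ == "__main__":
    rng = random.Random(2024)
    for _ in range(3):
        print(sample_st([0.0, 0.0], [1.0, 1.0], [3.0, -3.0], nu=5.0, rng=rng))
```

Passing an explicit `random.Random` instance keeps the draws reproducible, and averaging many draws gives a simple check against the mean in (17).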
Skew-normal distribution
This section briefly discusses the multivariate SN distribution introduced by Sahu et al. (2003). An \(m\)-dimensional random vector \(\varvec{Y}\) follows an \(m\)-variate SN distribution if its pdf is given by
$$\begin{aligned} f(\varvec{y}|\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }))= 2^m|\varvec{A}|^{-1/2}\phi _m \{\varvec{A}^{-1/2}(\varvec{y}-\varvec{\mu })\} P(\varvec{U}>\mathbf{0}), \end{aligned}$$
(20)
where \( \varvec{U}\sim N_m\{\varvec{\Delta }(\varvec{\delta }) \varvec{A}^{-1}(\varvec{y}-\varvec{\mu }), \varvec{I}_m-\varvec{\Delta }(\varvec{\delta }) \varvec{A}^{-1}\varvec{\Delta }(\varvec{\delta })\}\), and \(\phi _m (\cdot )\) is the pdf of \(N_m(\mathbf{0},\varvec{I}_m)\). We denote the above distribution by \(SN_m (\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }))\). An appealing feature of equation (20) is that it gives independent marginals when \(\varvec{\Sigma }=\textit{diag}(\sigma ^2_1, \sigma ^2_2,\ldots , \sigma ^2_m)\). In that case the pdf (20) reduces to
$$\begin{aligned} f(\varvec{y}|\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }))=\prod _{i=1}^{m}\left[ \frac{2}{\sqrt{\sigma ^2_i+\delta ^2_i}}\phi \left\{ \frac{y_i-\mu _i}{\sqrt{\sigma ^2_i+\delta ^2_i}}\right\} \Phi \left\{ \frac{\delta _i}{\sigma _i}\frac{y_i-\mu _i}{\sqrt{\sigma ^2_i+\delta ^2_i}}\right\} \right] \!,\qquad \end{aligned}$$
(21)
where \(\phi (\cdot )\) and \(\Phi (\cdot )\) are the pdf and cdf of the standard normal distribution, respectively.
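The product form (21) is straightforward to evaluate. The following minimal sketch uses only the Python standard library; the helper names (`std_normal_pdf`, `std_normal_cdf`, `sn_pdf_diag`) are hypothetical, and the code assumes a diagonal \(\varvec{\Sigma }\) passed as the list of variances \(\sigma ^2_i\).

```python
import math


def std_normal_pdf(z):
    """Standard normal pdf, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal cdf, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sn_pdf_diag(y, mu, sigma2, delta):
    """Evaluate the SN density (21) for Sigma = diag(sigma2)."""
    density = 1.0
    for y_i, mu_i, s2_i, d_i in zip(y, mu, sigma2, delta):
        a_i = math.sqrt(s2_i + d_i * d_i)      # sqrt(sigma_i^2 + delta_i^2)
        z_i = (y_i - mu_i) / a_i               # standardized residual
        skew = std_normal_cdf(d_i / math.sqrt(s2_i) * z_i)
        density *= (2.0 / a_i) * std_normal_pdf(z_i) * skew
    return density


if __name__ == "__main__":
    for d in (-3.0, 0.0, 3.0):
        print(d, sn_pdf_diag([1.0], [0.0], [1.0], [d]))
```

With \(\delta _i=0\) the skewing factor equals \(1/2\) and each term of the product is the ordinary normal density.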
The mean and covariance matrix are given by \(E(\varvec{Y}) = \varvec{\mu }+\sqrt{2/\pi }\,\varvec{\delta }\) and \(cov(\varvec{Y})=\varvec{\Sigma }+(1-2/\pi )\varvec{\Delta }^2(\varvec{\delta })\). Note that when \(\varvec{\delta }=\mathbf{0}\), the SN distribution reduces to the usual normal distribution. In addition, the SN distribution is a special case of the ST distribution: the ST distribution reduces to the SN distribution as the degrees of freedom \(\nu \rightarrow \infty \). To obtain a zero mean vector, the location parameter must be set to \(\varvec{\mu }=-\sqrt{2/\pi }\,\varvec{\delta }\). To illustrate the shape of an SN distribution, plots of the SN density for the skewness parameter values \(\delta = -3, 0\), and 3 are shown in Fig. 5b.
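The limiting relation between the two families can be checked on the moments in (17). The following is a sketch, assuming \(\varvec{\Sigma }=\sigma ^2\varvec{I}_m\) as in (17) and using the standard asymptotic relation \(\Gamma ((\nu -1)/2)/\Gamma (\nu /2)\sim (\nu /2)^{-1/2}\) as \(\nu \rightarrow \infty \):
$$\begin{aligned} (\nu /\pi )^{1/2}\frac{\Gamma ((\nu -1)/2)}{\Gamma (\nu /2)}&\longrightarrow (\nu /\pi )^{1/2}(2/\nu )^{1/2}=\sqrt{2/\pi },\nonumber \\ cov(\varvec{Y})&\longrightarrow \sigma ^2\varvec{I}_m+\varvec{\Delta }^2(\varvec{\delta })-\frac{2}{\pi }\varvec{\Delta }^2(\varvec{\delta })=\sigma ^2\varvec{I}_m+(1-2/\pi )\varvec{\Delta }^2(\varvec{\delta }), \end{aligned}$$
since \(\nu /(\nu -2)\rightarrow 1\). The ST mean and covariance therefore converge to the SN mean and covariance with \(\varvec{\Sigma }=\sigma ^2\varvec{I}_m\).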
According to Sahu et al. (2003), if \(\varvec{Y}\) follows \(SN_m(\varvec{\mu },\varvec{\Sigma },\varvec{\Delta }(\varvec{\delta }))\), it can be expressed by a convenient stochastic representation as follows.
$$\begin{aligned} \varvec{Y}=\varvec{\mu }+\varvec{\Delta }|\varvec{X}_0|+\varvec{\Sigma }^{1/2}\varvec{X}_1, \end{aligned}$$
(22)
where \(\varvec{X}_0\) and \(\varvec{X}_1\) are two independent random vectors, each following \(N_m(\mathbf{0},\varvec{I}_m)\). Let \(\varvec{w}=|\varvec{X}_0|\); then \(\varvec{w}\) follows an \(m\)-dimensional standard normal distribution \(N_m(\mathbf{0},\varvec{I}_m)\) truncated in the space \(\varvec{w}>\mathbf{0}\) (i.e., the standard half-normal distribution). Thus, a two-level hierarchical representation of (22) is given by
$$\begin{aligned} \varvec{Y}|\varvec{w}\sim N_m(\varvec{\mu }+\varvec{\Delta }\varvec{w}, \varvec{\Sigma }),\; \varvec{w}\sim N_m(\mathbf{0},\varvec{I}_m)\varvec{I}(\varvec{w}>\mathbf{0}). \end{aligned}$$
(23)
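The two-level form (23) translates directly into a sampler: draw the half-normal vector \(\varvec{w}\) first, then draw \(\varvec{Y}\) given \(\varvec{w}\). The sketch below assumes a diagonal \(\varvec{\Sigma }\) and uses only the Python standard library; the function name `sample_sn` and its arguments are hypothetical.

```python
import math
import random


def sample_sn(mu, sigma2, delta, rng):
    """Draw one vector from SN_m(mu, diag(sigma2), diag(delta)) via (23)."""
    y = []
    for mu_i, s2_i, d_i in zip(mu, sigma2, delta):
        # Level 2: w_i = |X0_i| is standard normal truncated to w_i > 0.
        w_i = abs(rng.gauss(0.0, 1.0))
        # Level 1: Y_i | w_i ~ N(mu_i + delta_i * w_i, sigma_i^2).
        y.append(rng.gauss(mu_i + d_i * w_i, math.sqrt(s2_i)))
    return y


if __name__ == "__main__":
    rng = random.Random(2024)
    delta = [3.0, -3.0]
    # Zero-mean location, mu = -sqrt(2/pi) * delta.
    mu = [-math.sqrt(2.0 / math.pi) * d for d in delta]
    for _ in range(3):
        print(sample_sn(mu, [1.0, 1.0], delta, rng))
```

Because \(\varvec{\Sigma }\) is taken to be diagonal here, the components can be drawn one at a time, in line with the product form (21).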