5.1. Introduction

The notations introduced in the preceding chapters will still be followed in this one. Lower-case letters such as x, y will be utilized to represent real scalar variables, whether mathematical or random. Capital letters such as X, Y  will be used to denote vector/matrix random or mathematical variables. A tilde placed on top of a letter will indicate that the variables are in the complex domain. However, the tilde will be omitted in the case of constant matrices such as A, B. The determinant of a square matrix A will be denoted as |A| or det(A) and, in the complex domain, the absolute value or modulus of the determinant of B will be denoted as |det(B)|. Square matrices appearing in this chapter will be assumed to be of dimension p × p unless otherwise specified.

We will first define the real matrix-variate gamma function, gamma integral and gamma density, wherefrom their counterparts in the complex domain will be developed. A particular case of the real matrix-variate gamma density known as the Wishart density is widely utilized in multivariate statistical analysis. Actually, the formulation of this distribution in 1928 constituted a significant advance in the early days of the discipline. A real matrix-variate gamma function, denoted by Γ p(α), will be defined in terms of a matrix-variate integral over a real positive definite matrix X > O. This integral representation of Γ p(α) will be explicitly evaluated with the help of the transformation of a real positive definite matrix in terms of a lower triangular matrix having positive diagonal elements in the form X = TT′ where T = (t ij) is a lower triangular matrix with positive diagonal elements, that is, t ij = 0, i < j and t jj > 0, j = 1, …, p. When the diagonal elements are positive, it can be shown that the transformation X = TT′ is unique. Its associated Jacobian is provided in Theorem 1.6.7. This result is now restated for ready reference: For a p × p real positive definite matrix X = (x ij) > O,

$$\displaystyle \begin{aligned} X=TT'\Rightarrow \text{d}X=2^p\big\{\prod_{j=1}^pt_{jj}^{p+1-j}\big\}\,\text{d}T {} \end{aligned} $$
(5.1.1)

where T = (t ij), t ij = 0, i < j and t jj > 0, j = 1, …, p. Consider the following integral representation of Γ p(α) where the integral is over a real positive definite matrix X and the integrand is a real-valued scalar function of X:

$$\displaystyle \begin{aligned} \varGamma_p(\alpha)=\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(X)}\text{d}X. {} \end{aligned} $$
(5.1.2)

Under the transformation in (5.1.1),

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{p+1}{2}}\text{d}X&=\big\{\prod_{j=1}^p(t_{jj}^2)^{\alpha-\frac{p+1}{2}}\big\}2^p\big\{\prod_{j=1}^pt_{jj}^{p+1-j}\big\}\,\text{d}T\\ &=2^p\big\{\prod_{j=1}^p(t_{jj}^2)^{\alpha-\frac{j}{2}}\big\}\,\text{d}T.\end{aligned} $$

Observe that tr(X) = tr(TT′) =  the sum of the squares of all the elements in T, which is \(\sum _{j=1}^pt_{jj}^2+\sum _{i>j}t_{ij}^2\). By letting \(t_{jj}^2=y_j\Rightarrow \text{d}t_{jj}=\frac {1}{2}y_j^{\frac {1}{2}-1}\text{d}y_j\), noting that t jj > 0, the integral over t jj gives

$$\displaystyle \begin{aligned}2\int_0^{\infty}(t_{jj}^2)^{\alpha-\frac{j}{2}}\text{e}^{-t_{jj}^2}\text{ d}t_{jj}=\varGamma\big(\alpha-\frac{j-1}{2}\big),~\Re\big(\alpha-\frac{j-1}{2}\big)>0,~j=1,\ldots ,p,\end{aligned}$$

the final condition being \(\Re (\alpha )>\frac {p-1}{2}\). Thus, we have the gamma product \(\varGamma (\alpha )\varGamma (\alpha -\frac {1}{2})\cdots \varGamma (\alpha -\frac {p-1}{2})\). Now for i > j, the integral over t ij gives

$$\displaystyle \begin{aligned}\prod_{i>j}\int_{-\infty}^{\infty}\text{e}^{-t_{ij}^2}\text{d}t_{ij}=\prod_{i>j}\sqrt{\pi}=\pi^{\frac{p(p-1)}{4}}.\end{aligned}$$

Therefore

$$\displaystyle \begin{aligned} \varGamma_p(\alpha)&=\pi^{\frac{p(p-1)}{4}}\varGamma(\alpha)\varGamma\big(\alpha-\frac{1}{2}\big)\cdots \varGamma\big(\alpha-\frac{p-1}{2}\big), \ \Re(\alpha)>\frac{p-1}{2},\\ &=\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(X)}\text{d}X, \ \Re(\alpha)>\frac{p-1}{2}.{} \end{aligned} $$
(5.1.3)

For example,

$$\displaystyle \begin{aligned}\varGamma_2(\alpha)=\pi^{\frac{(2)(1)}{4}}\varGamma(\alpha)\varGamma\big(\alpha-\frac{1}{2}\big)=\pi^{\frac{1}{2}}\varGamma(\alpha)\varGamma\big(\alpha-\frac{1}{2}\big),\ \Re(\alpha)>\frac{1}{2}. \end{aligned}$$

This Γ p(α) is known by different names in the literature. The first author calls it the real matrix-variate gamma function because of its association with a real matrix-variate gamma integral.
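
Since (5.1.3) reduces Γ p(α) to a finite product of ordinary gamma functions, it is easy to check numerically. The following minimal sketch assumes SciPy is available; its scipy.special.multigammaln computes the logarithm of this very function in the real case, and the helper name log_real_matrix_gamma is ours.

```python
# Check the product form of Gamma_p(alpha) in (5.1.3) against SciPy's
# multivariate gamma, and the value Gamma_2(3/2) = pi^{1/2}Gamma(3/2)Gamma(1) = pi/2.
import numpy as np
from scipy.special import gammaln, multigammaln

def log_real_matrix_gamma(alpha: float, p: int) -> float:
    """log Gamma_p(alpha) = (p(p-1)/4) log pi + sum_j log Gamma(alpha - j/2)."""
    return (p * (p - 1) / 4) * np.log(np.pi) + sum(
        gammaln(alpha - j / 2) for j in range(p))

assert np.isclose(log_real_matrix_gamma(1.5, 2), multigammaln(1.5, 2))
assert np.isclose(np.exp(log_real_matrix_gamma(1.5, 2)), np.pi / 2)
```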

5.1a. The Complex Matrix-variate Gamma

In the complex case, consider a p × p Hermitian positive definite matrix \(\tilde {X}=\tilde {X}^{*}>O\), where \(\tilde {X}^{*}\) denotes the conjugate transpose of \(\tilde {X}\). Let \(\tilde {T}=(\tilde {t}_{ij})\) be a lower triangular matrix with the diagonal elements being real and positive. In this case, it can be shown that the transformation \(\tilde {X}=\tilde {T}\tilde {T}^{*}\) is one-to-one. Then, as stated in Theorem 1.6a.7, the Jacobian is

$$\displaystyle \begin{aligned} \text{d}\tilde{X}=2^p\big\{\prod_{j=1}^pt_{jj}^{2(p-j)+1}\big\}\text{d}\,\tilde{T}. {} \end{aligned} $$
(5.1a.1)

With the help of (5.1a.1), we can evaluate the following integral over p × p Hermitian positive definite matrices where the integrand is a real-valued scalar function of \(\tilde {X}\). We will denote the integral by \(\tilde {\varGamma }_p(\alpha )\), that is,

$$\displaystyle \begin{aligned} \tilde{\varGamma}_p(\alpha)=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{tr}(\tilde{X})}\text{d}\tilde{X}. {} \end{aligned} $$
(5.1a.2)

Let us evaluate the integral in (5.1a.2) by making use of (5.1a.1). Parallel to the real case, we have

$$\displaystyle \begin{aligned} |\text{det}(\tilde{X})|{}^{\alpha-p}\text{d}\tilde{X}&=\big\{\prod_{j=1}^p(t_{jj}^2)^{\alpha-p}\big\}2^p\big\{\prod_{j=1}^pt_{jj}^{2(p-j)+1}\big\}\text{d}\tilde{T}\\ &=\big\{\prod_{j=1}^p2(t_{jj}^2)^{\alpha-j+\frac{1}{2}}\big\}\text{d}\tilde{T}. \end{aligned} $$

As well,

$$\displaystyle \begin{aligned}\text{e}^{-\text{tr}(\tilde{X})}=\text{e}^{-\sum_{j=1}^pt_{jj}^2-\sum_{i>j}|\tilde{t}_{ij}|{}^2}.\end{aligned}$$

Since t jj is real and positive, the integral over t jj gives the following:

$$\displaystyle \begin{aligned}2\int_0^{\infty}(t_{jj}^2)^{\alpha-j+\frac{1}{2}}\text{e}^{-t_{jj}^2}\text{d}t_{jj}=\varGamma(\alpha-(j-1)),~\Re(\alpha-(j-1))>0,~j=1,\ldots ,p, \end{aligned}$$

the final condition being \(\Re (\alpha )>p-1\). Note that the absolute value of \(\tilde {t}_{ij}\), namely, \(|\tilde {t}_{ij}|\) is such that \(|\tilde {t}_{ij}|{ }^2=t_{ij1}^2+t_{ij2}^2\) where \(\tilde {t}_{ij}=t_{ij1}+it_{ij2}\) with t ij1, t ij2 real and \(i=\sqrt {(-1)}\). Thus,

$$\displaystyle \begin{aligned} \prod_{i>j}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\text{e}^{-(t_{ij1}^2+t_{ij2}^2)}\text{d}t_{ij1}\wedge\text{ d}t_{ij2}=\prod_{i>j}\pi=\pi^{\frac{p(p-1)}{2}}.\end{aligned}$$

Then

$$\displaystyle \begin{aligned} \tilde{\varGamma}_p(\alpha)&=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{ tr}(\tilde{X})}\text{d}\tilde{X},~\Re(\alpha)>p-1,{}\\ &=\pi^{\frac{p(p-1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)\cdots \varGamma(\alpha-(p-1)),~\Re(\alpha)>p-1. \end{aligned} $$
(5.1a.3)

We will refer to \(\tilde {\varGamma }_p(\alpha )\) as the complex matrix-variate gamma because of its association with a complex matrix-variate gamma integral. As an example, consider

$$\displaystyle \begin{aligned}\tilde{\varGamma}_2(\alpha)=\pi^{\frac{(2)(1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)=\pi\varGamma(\alpha)\varGamma(\alpha-1),~\Re(\alpha)>1.\end{aligned}$$
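
The complex counterpart (5.1a.3) is an equally simple finite product; SciPy has no complex-case analogue of multigammaln, so a direct coding of the product can serve as a check. A minimal sketch, with the helper name log_complex_matrix_gamma ours:

```python
# Complex matrix-variate gamma of (5.1a.3):
# Gamma~_p(alpha) = pi^{p(p-1)/2} * prod_{j=0}^{p-1} Gamma(alpha - j).
import math

def log_complex_matrix_gamma(alpha: float, p: int) -> float:
    return (p * (p - 1) / 2) * math.log(math.pi) + sum(
        math.lgamma(alpha - j) for j in range(p))

# Gamma~_2(2) = pi * Gamma(2) * Gamma(1) = pi, as in the example above
assert math.isclose(math.exp(log_complex_matrix_gamma(2.0, 2)), math.pi)
```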

5.2. The Real Matrix-variate Gamma Density

In view of (5.1.3), we can define a real matrix-variate gamma density with shape parameter α as follows, where X is a p × p real positive definite matrix:

$$\displaystyle \begin{aligned} f_1(X)=\begin{cases}\frac{1}{\varGamma_p(\alpha)}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(X)},~X>O,~\Re(\alpha)>\frac{p-1}{2}\\ 0,\mbox{ elsewhere.}\end{cases}{} \end{aligned} $$
(5.2.1)

Example 5.2.1

Let

$$\displaystyle \begin{aligned}X=\begin{bmatrix}x_{11}&x_{12}\\ x_{12}&x_{22}\end{bmatrix}\ \ \mbox{ and }\ \ \tilde{X}=\begin{bmatrix}x_1&x_2+iy_2\\ x_2-iy_2&x_3\end{bmatrix}\end{aligned}$$

where x 11, x 12, x 22, x 1, x 2, y 2, x 3 are all real scalar variables, \(i=\sqrt {(-1)}\), \(x_{22}>0,~ x_{11}x_{22}-x_{12}^2>0\). While these are the conditions for the positive definiteness of the real matrix X, \(x_1>0,~x_3>0,~ x_1x_3-(x_2^2+y_2^2)>0\) are the conditions for the Hermitian positive definiteness of \(\tilde {X}\). Let us evaluate the following integrals, subject to the previously specified conditions on the elements of the matrix:

$$\displaystyle \begin{aligned} (1):~~\delta_1&=\int_{X>O}\text{e}^{-(x_{11}+x_{22})}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}x_{22}\\ (2):~~\delta_2&=\int_{\tilde{X}>O}\text{e}^{-(x_1+x_3)}\text{d}x_1\wedge\text{d}(x_2+iy_2)\wedge\text{d}x_3\\ (3):~~\delta_3&=\int_{X>O}|X|\text{e}^{-(x_{11}+x_{22})}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}x_{22}\\ (4):~~\delta_4&=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^2\text{e}^{-(x_1+x_3)}\text{d}x_1\wedge\text{d}(x_2+iy_2)\wedge\text{d}x_3. \end{aligned} $$

Solution 5.2.1

(1): Observe that δ 1 can be evaluated by treating the integral as a real matrix-variate integral, namely,

$$\displaystyle \begin{aligned}\delta_1=\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(X)}\text{d}X\ \mbox{ with }p=2,~ \frac{p+1}{2}=\frac{3}{2},~\alpha=\frac{3}{2}, \end{aligned}$$

and hence the integral is

$$\displaystyle \begin{aligned}\varGamma_2({3}/{2})=\pi^{\frac{2(1)}{4}}\varGamma({3}/{2})\varGamma(1)=\pi^{{1}/{2}}({1}/{2})\varGamma({1}/{2})=\frac{\pi}{2}. \end{aligned}$$

This result can also be obtained by direct integration as a multiple integral. In this case, the integration has to be done under the conditions \(x_{11}>0,~x_{22}>0,~ x_{11}x_{22}-x_{12}^2>0\), that is, \(x_{12}^2<x_{11}x_{22}\) or \(-\sqrt {x_{11}x_{22}}<x_{12}<\sqrt {x_{11}x_{22}}\). The integral over x 12 yields \(\int _{-\sqrt {x_{11}x_{22}}}^{\sqrt {x_{11}x_{22}}}\text{d}x_{12}=2\sqrt {x_{11}x_{22}}\), that over x 11 then gives

$$\displaystyle \begin{aligned}2\int_{x_{11}=0}^{\infty}\sqrt{x_{11}}\text{e}^{-{x_{11}}}\text{d}x_{11}=2\int_0^{\infty}x_{11}^{\frac{3}{2}-1}\text{e}^{-x_{11}}\text{ d}x_{11}=2\varGamma({3}/{2})=\pi^{\frac{1}{2}}, \end{aligned}$$

and on integrating with respect to x 22, we have

$$\displaystyle \begin{aligned}\int_0^{\infty}\sqrt{x_{22}}\text{e}^{-x_{22}}\text{d}x_{22}=\frac{1}{2}\pi^{\frac{1}{2}}, \end{aligned}$$

so that \(\delta _1=\frac {1}{2}\sqrt {\pi }\sqrt {\pi }=\frac {\pi }{2}\).

(2): On observing that δ 2 can be viewed as a complex matrix-variate integral, it is seen that

$$\displaystyle \begin{aligned}\delta_2=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{2-2}\text{e}^{-\text{tr}(\tilde{X})}\text{ d}\tilde{X}=\tilde{\varGamma}_2(2)=\pi^{\frac{2(1)}{2}}\varGamma(2)\varGamma(1)=\pi. \end{aligned}$$

This answer can also be obtained by evaluating the multiple integral. Since \(\tilde {X}>O\), we have \(x_1>0,~x_3>0,~ x_1x_3-(x_2^2+y_2^2)>0\), that is, \(x_1>\frac {(x_2^2+y_2^2)}{x_3}\). Integrating first with respect to x 1 and letting \(y=x_1-\frac {(x_2^2+y_2^2)}{x_3}\), we have

$$\displaystyle \begin{aligned}\int_{x_1>\frac{(x_2^2+y_2^2)}{x_3}}\text{e}^{-x_1}\text{d }x_1=\int_{y=0}^{\infty}\text{e}^{-y-\frac{(x_2^2+y_2^2)}{x_3}}\text{d}y=\text{ e}^{-\frac{(x_2^2+y_2^2)}{x_3}}. \end{aligned}$$

Now, the integrals over x 2 and y 2 give

$$\displaystyle \begin{aligned}\int_{-\infty}^{\infty}\text{e}^{-\frac{x_2^2}{x_3}}\text{d}x_2=\sqrt{x_3}\int_{-\infty}^{\infty}\text{e}^{-u^2}\text{d}u=\sqrt{x_3}\sqrt{\pi}\ \ \mbox{and}\ \ \int_{-\infty}^{\infty}\text{e}^{-\frac{y_2^2}{x_3}}\text{d}y_2=\sqrt{x_3}\sqrt{\pi}, \end{aligned}$$

the integral with respect to x 3 then yielding

$$\displaystyle \begin{aligned}\int_0^{\infty}x_3\text{e}^{-x_3}\text{d}x_3=\varGamma(2)=1, \end{aligned}$$

so that \(\delta _2=(1)\sqrt {\pi }\sqrt {\pi }=\pi .\)

(3): Observe that δ 3 can be evaluated as a real matrix-variate integral. Then

$$\displaystyle \begin{aligned} \delta_3&=\int_{X>O}|X|\text{e}^{-\text{tr}(X)}\text{d}X=\int_{X>O}|X|{}^{\frac{5}{2}-\frac{3}{2}}\text{e}^{-\text{tr}(X)}\text{d}X,~\text{with}~ \frac{p+1}{2}=\frac{3}{2}~\text{as}~p=2\\ &=\varGamma_2({5}/{2})=\pi^{\frac{2(1)}{4}}\varGamma({5}/{2})\varGamma({4}/{2})=\pi^{\frac{1}{2}}({3}/{2})({1}/{2})\pi^{{1}/{2}}(1)\\ &=\frac{3}{4}\pi.\end{aligned} $$

Let us proceed by direct integration:

$$\displaystyle \begin{aligned} \delta_3&=\int_{X>O}[x_{11}x_{22}-x_{12}^2]\text{e}^{-(x_{11}+x_{22})}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}x_{22}\\ &=\int_{X>O}x_{22}\Big[x_{11}-\frac{x_{12}^2}{x_{22}}\Big]\text{e}^{-(x_{11}+x_{22})}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}x_{22};\end{aligned} $$

letting \(y=x_{11}-\frac {x_{12}^2}{x_{22}}\), the integral over x 11 yields

$$\displaystyle \begin{aligned}\int_{x_{11}>\frac{x_{12}^2}{x_{22}}}\Big[x_{11}-\frac{x_{12}^2}{x_{22}}\Big]\text{e}^{-x_{11}}\text{d}x_{11}=\int_{y=0}^{\infty}y\text{ e}^{-y-\frac{x_{12}^2}{x_{22}}}\text{d}y=\text{e}^{-\frac{x_{12}^2}{x_{22}}}. \end{aligned}$$

Now, the integral over x 12 gives \(\sqrt {x_{22}}\sqrt {\pi }\) and finally, that over x 22 yields

$$\displaystyle \begin{aligned}\sqrt{\pi}\int_{x_{22}>0}x_{22}^{\frac{3}{2}}\text{e}^{-x_{22}}\text{d}x_{22}=\sqrt{\pi}\,\varGamma({5}/{2})=\sqrt{\pi}\,({3}/{2})({1}/{2})\sqrt{\pi}=\frac{3}{4}\pi. \end{aligned}$$

(4): Noting that we can treat δ 4 as a complex matrix-variate integral, we have

$$\displaystyle \begin{aligned} \delta_4&=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^2\text{e}^{-\text{tr}(\tilde{X})}\text{d}\tilde{X}=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{4-2}\text{e}^{-\text{ tr}(\tilde{X})}\text{d}\tilde{X}=\tilde{\varGamma}_2(4),~\alpha=4,~p=2,\\ &=\pi^{\frac{2(1)}{2}}\varGamma(4)\varGamma(3)=\pi(3!)(2!)=12\pi.\end{aligned} $$

Direct evaluation will be challenging in this case as the integrand involves \(|\text{det}(\tilde {X})|{ }^2\).
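
The real-case answers δ 1 = π∕2 and δ 3 = 3π∕4 are also easy to confirm by numerical iterated integration over the region X > O, that is, x 11 > 0, x 22 > 0, x 12² < x 11 x 22. A sketch assuming SciPy (δ 2 and δ 4 involve four real variables and are left out):

```python
# Numerical cross-check of delta_1 and delta_3 from Example 5.2.1.
# scipy.integrate.tplquad integrates func(z, y, x); here z = x12, y = x22,
# x = x11, with |x12| < sqrt(x11 * x22) describing X > O.
import numpy as np
from scipy.integrate import tplquad

lo = lambda x, y: -np.sqrt(x * y)
hi = lambda x, y: np.sqrt(x * y)

d1, _ = tplquad(lambda z, y, x: np.exp(-(x + y)), 0, np.inf, 0, np.inf, lo, hi)
d3, _ = tplquad(lambda z, y, x: (x * y - z * z) * np.exp(-(x + y)),
                0, np.inf, 0, np.inf, lo, hi)
print(d1, np.pi / 2)      # both approximately 1.5708
print(d3, 3 * np.pi / 4)  # both approximately 2.3562
```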

If a scale parameter matrix B > O is to be introduced in (5.2.1), then consider \(\text{tr}(BX)=\text{tr}(B^{\frac {1}{2}}XB^{\frac {1}{2}})\) where \(B^{\frac {1}{2}}\) is the positive definite square root of the real positive definite constant matrix B. On applying the transformation \(Y=B^{\frac {1}{2}}XB^{\frac {1}{2}}\Rightarrow \text{d}X=|B|{ }^{-\frac {(p+1)}{2}}\text{d}Y\), as stated in Theorem 1.6.5, we have

$$\displaystyle \begin{aligned} \int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)}\text{d}X&=\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(B^{\frac{1}{2}}XB^{\frac{1}{2}})}\text{ d}X\\ &=|B|{}^{-\alpha}\int_{Y>O}|Y|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(Y)}\text{d}Y\\ &=|B|{}^{-\alpha}\varGamma_p(\alpha).{} \end{aligned} $$
(5.2.2)

This equality brings about two results. First, the following identity which will turn out to be very handy in many of the computations:

$$\displaystyle \begin{aligned} |B|{}^{-\alpha}\equiv \frac{1}{\varGamma_p(\alpha)}\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)}\text{d}X,~B>O,~ \Re(\alpha)>\frac{p-1}{2}. {} \end{aligned} $$
(5.2.3)

As well, the following two-parameter real matrix-variate gamma density with shape parameter α and scale parameter matrix B > O can be constructed from (5.2.2):

$$\displaystyle \begin{aligned} f(X)=\begin{cases}\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)},~ X>O,~B>O,\ \Re(\alpha)>\frac{p-1}{2}\\ 0,\mbox{ elsewhere}.\end{cases} {} \end{aligned} $$
(5.2.4)
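
For simulation, (5.2.4) can be sampled through its Wishart special case: matching (5.2.4) with the usual Wishart density suggests that shape parameter α and scale parameter matrix B correspond to 2α degrees of freedom and Wishart scale matrix (2B)⁻¹, so that E[X] = αB⁻¹. The sketch below rests on that correspondence and should be read as an illustration under that assumption rather than a derivation:

```python
# Sample the two-parameter real matrix-variate gamma of (5.2.4) via the
# Wishart correspondence: df = 2*alpha, Sigma = (2B)^{-1}, mean = alpha*B^{-1}.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(0)
alpha = 2.5                       # scipy's sampler needs df = 2*alpha >= p
B = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # scale parameter matrix B > O
X = wishart(df=2 * alpha, scale=np.linalg.inv(2 * B)).rvs(200_000,
                                                          random_state=rng)
print(X.mean(axis=0))             # approaches alpha * B^{-1}
print(alpha * np.linalg.inv(B))
```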

5.2.1. The mgf of the real matrix-variate gamma distribution

Let us determine the mgf associated with the density given in (5.2.4), that is, the two-parameter real matrix-variate gamma density. Observing that \(X=X^{\prime}\), let T be a symmetric p × p real positive definite parameter matrix. Then, noting that

$$\displaystyle \begin{aligned} \text{tr}(TX)=\sum_{j=1}^pt_{jj}x_{jj}+2\sum_{i>j}t_{ij}x_{ij}, \end{aligned}$$
(i)

it is seen that the non-diagonal elements in X multiplied by the corresponding parameters will have twice the weight of the diagonal elements multiplied by the corresponding parameters. For instance, consider the 2 × 2 case:

$$\displaystyle \begin{aligned}TX=\begin{bmatrix}t_{11}&t_{12}\\ t_{12}&t_{22}\end{bmatrix}\begin{bmatrix}x_{11}&x_{12}\\ x_{12}&x_{22}\end{bmatrix}=\begin{bmatrix}t_{11}x_{11}+t_{12}x_{12}&\alpha_1\\ \alpha_2&t_{12}x_{12}+t_{22}x_{22}\end{bmatrix}\end{aligned}$$
(ii)

where α 1 and α 2 represent elements that are not involved in the evaluation of the trace. Note that due to the symmetry of T and X, t 21 = t 12 and x 21 = x 12, so that the cross product term t 12 x 12 in (ii) appears twice whereas each of the terms t 11 x 11 and t 22 x 22 appear only once.

However, in order to be consistent with the mgf in a real multivariate case, each variable need only be multiplied once by the corresponding parameter, the mgf being then obtained by taking the expected value of the resulting exponential sum. Accordingly, the parameter matrix has to be modified as follows: let \({{}_{*}T}=({{}_{*}t}_{ij})\) where \({{ }_{*}t_{jj}}=t_{jj},~ {{ }_{*}t_{ij}}=\frac {1}{2}t_{ij},~ i\ne j,\) and \(t_{ij}=t_{ji}\) for all i and j or, in other words, the non-diagonal elements of the symmetric matrix T are weighted by \(\frac {1}{2}\), such a matrix being denoted as \({{}_{*}T}\). Then,

$$\displaystyle \begin{aligned}\text{tr}({{}_{*}T}X)=\sum_{i\ge j}t_{ij}x_{ij}, \end{aligned}$$

and the mgf in the real matrix-variate two-parameter gamma density, denoted by \(M_X({{}_{*}T})\), is the following:

$$\displaystyle \begin{aligned} M_X({{}_{*}T})&=E[\text{e}^{\text{tr}({{}_{*}T}X)}]\\ &=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{\text{tr}({}_{*}TX-BX)}\text{d}X.\end{aligned} $$

Now, since

$$\displaystyle \begin{aligned}\text{tr}(BX-{{}_{*}T}X)=\text{tr}((B-{{}_{*}T})^{\frac{1}{2}}X(B-{{}_{*}T})^{\frac{1}{2}}) \end{aligned}$$

for \((B-{{}_{*}T})>O\), that is, \(~ (B-{{ }_{*}T})^{\frac {1}{2}}>O,\) which means that \(Y=(B-{{ }_{*}T})^{\frac {1}{2}}X(B-{{ }_{*}T})^{\frac {1}{2}}\Rightarrow \text{ d}X=|B-{{ }_{*}T}|{ }^{-(\frac {p+1}{2})}\text{d}Y\), we have

$$\displaystyle \begin{aligned} M_X({}_{*}T)&=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}((B-_{*}T)X)}\text{d}X\\ &=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}|B-{{}_{*}T}|{}^{-\alpha}\int_{Y>O}|Y|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(Y)}\text{d}Y\\ &=|B|{}^{\alpha}|B-{{}_{*}T}|{}^{-\alpha}\\ &=|I-B^{-1}{{}_{*}T}|{}^{-\alpha}\ \mbox{ for } I-B^{-1}{{}_{*}T}>O.{} \end{aligned} $$
(5.2.5)

When \({{}_{*}T}\) is replaced by \(-{{}_{*}T}\), (5.2.5) gives the Laplace transform of the two-parameter gamma density in the real matrix-variate case as specified by (5.2.4), which is denoted by \(L_f({{}_{*}T})\), that is,

$$\displaystyle \begin{aligned} L_f({{}_{*}T})=M_X(-{{}_{*}T})=|I+B^{-1}{{}_{*}T}|{}^{-\alpha}\ \mbox{ for }I+B^{-1}{{}_{*}T}>O. {} \end{aligned} $$
(5.2.6)
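
Both (5.2.5) and (5.2.6) can be checked by Monte Carlo under the same Wishart correspondence as above: for a concrete symmetric matrix S standing in for \({{}_{*}T}\), with B − S > O, the sample average of e^{tr(SX)} should approach |I − B⁻¹S|^{−α}. A sketch:

```python
# Monte Carlo check of the mgf formula (5.2.5) for a fixed symmetric S.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)
alpha = 2.5
B = np.array([[2.0, 0.5], [0.5, 1.0]])
S = np.array([[0.3, 0.1], [0.1, 0.2]])    # chosen so that B - S > O

X = wishart(df=2 * alpha, scale=np.linalg.inv(2 * B)).rvs(500_000,
                                                          random_state=rng)
mc = np.exp(np.einsum('kij,ji->k', X, S)).mean()          # E[exp(tr(S X))]
exact = np.linalg.det(np.eye(2) - np.linalg.inv(B) @ S) ** (-alpha)
print(mc, exact)    # agreement up to Monte Carlo error
```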

For example, if

then |B| = 5 and

If T is partitioned into sub-matrices and X is partitioned accordingly as

$$\displaystyle \begin{aligned}T=\begin{bmatrix}T_{11}&T_{12}\\ T_{21}&T_{22}\end{bmatrix},\ \ X=\begin{bmatrix}X_{11}&X_{12}\\ X_{21}&X_{22}\end{bmatrix}\end{aligned}$$
(iii)

where T 11 and X 11 are r × r, r ≤ p, then what can be said about the densities of the diagonal blocks X 11 and X 22? The mgf of X 11 is available from the definition by letting T 12 = O, T 21 = O and T 22 = O, as then \(E[\text{e}^{\text{tr}({{ }_{*}T}X)}]=E[\text{ e}^{\text{tr}({{ }_{*}T}_{11}X_{11})}].\) However, B −1 T is not positive definite since B −1 T is not symmetric, and thereby I − B −1 T cannot be positive definite when T 12 = O, T 21 = O, T 22 = O. Consequently, the mgf of X 11 cannot be determined from (5.2.6). As an alternative, we could rewrite (5.2.6) in the symmetric format and then try to evaluate the density of X 11. As it turns out, the densities of X 11 and X 22 can be readily obtained from the mgf in two situations: either when B = I or B is a block diagonal matrix, that is,

$$\displaystyle \begin{aligned}B=\begin{bmatrix}B_{11}&O\\ O&B_{22}\end{bmatrix}.\end{aligned}$$
(iv)

Hence we have the following results:

Theorem 5.2.1

Let the p × p matrices X > O and T > O be partitioned as in (iii). Let X have a p × p real matrix-variate gamma density with shape parameter α and scale parameter matrix I p . Then X 11 has an r × r real matrix-variate gamma density and X 22 has a (p − r) × (p − r) real matrix-variate gamma density with shape parameter α and scale parameters I r and I p−r , respectively.

Theorem 5.2.2

Let X be partitioned as in (iii). Let the p × p real positive definite parameter matrix B > O be partitioned as in (iv). Then X 11 has an r × r real matrix-variate gamma density with the parameters (α and B 11 > O) and X 22 has a (p − r) × (p − r) real matrix-variate gamma density with the parameters (α and B 22 > O).

Theorem 5.2.3

Let X be partitioned as in (iii). Then X 11 and X 22 are statistically independently distributed under the restrictions specified in Theorems 5.2.1 and 5.2.2.

In the general case of B, write the mgf as \(M_X({{}_{*}T})=|B|^{\alpha}|B-{{}_{*}T}|^{-\alpha}\), which corresponds to a symmetric format. Then, when \({{}_{*}T}_{12}=O,~{{}_{*}T}_{21}=O,~{{}_{*}T}_{22}=O\),

$$\displaystyle \begin{aligned} M_X({{}_{*}T})&=|B|^{\alpha}|B-{{}_{*}T}|^{-\alpha}\\ &=[|B_{22}|~|B_{11}-B_{12}B_{22}^{-1}B_{21}|]^{\alpha}~[|B_{22}|~|B_{11}-{{}_{*}T}_{11}-B_{12}B_{22}^{-1}B_{21}|]^{-\alpha}\\ &=|B_{11}-B_{12}B_{22}^{-1}B_{21}|^{\alpha}~|(B_{11}-B_{12}B_{22}^{-1}B_{21})-{{}_{*}T}_{11}|^{-\alpha}, \end{aligned} $$

which is obtained by making use of the representations of the determinant of a partitioned matrix, which are available from Sect. 1.3. Now, on comparing the last line with the first one, it is seen that X 11 has a real matrix-variate gamma distribution with shape parameter α and scale parameter matrix \(B_{11}-B_{12}B_{22}^{-1}B_{21}\). Hence, the following result:

Theorem 5.2.4

If the p × p real positive definite matrix X has a real matrix-variate gamma density with the shape parameter α and scale parameter matrix B and if X and B are partitioned as in (iii), then X 11 has a real matrix-variate gamma density with shape parameter α and scale parameter matrix \(B_{11}-B_{12}B_{22}^{-1}B_{21}\) , and the sub-matrix X 22 has a real matrix-variate gamma density with shape parameter α and scale parameter matrix \(B_{22}-B_{21}B_{11}^{-1}B_{12}\).
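
At the level of first moments, Theorem 5.2.4 is consistent with E[X] = αB⁻¹: the leading r × r block of B⁻¹ equals \((B_{11}-B_{12}B_{22}^{-1}B_{21})^{-1}\) by the partitioned-inverse formula, so the sample mean of X 11 should approach \(\alpha(B_{11}-B_{12}B_{22}^{-1}B_{21})^{-1}\), which is exactly the mean of the gamma density claimed for X 11. A simulation sketch, again assuming the Wishart correspondence used above:

```python
# Check that the sample mean of the leading block X11 matches
# alpha * (B11 - B12 B22^{-1} B21)^{-1}, as Theorem 5.2.4 implies.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(2)
alpha, r = 3.0, 2                 # X11 is the leading r x r block of a 3 x 3 X
B = np.array([[3.0, 0.5, 0.4],
              [0.5, 2.0, 0.3],
              [0.4, 0.3, 1.5]])
B11, B12, B21, B22 = B[:r, :r], B[:r, r:], B[r:, :r], B[r:, r:]

X = wishart(df=2 * alpha, scale=np.linalg.inv(2 * B)).rvs(200_000,
                                                          random_state=rng)
schur = B11 - B12 @ np.linalg.inv(B22) @ B21
print(X[:, :r, :r].mean(axis=0))       # sample mean of X11
print(alpha * np.linalg.inv(schur))    # its theoretical value
```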

5.2a. The Matrix-variate Gamma Function and Density, Complex Case

Let \(\tilde {X}=\tilde {X}^{*}>O\) be a p × p Hermitian positive definite matrix. When \(\tilde {X}\) is Hermitian, all its diagonal elements are real and hence \(\text{ tr}(\tilde {X})\) is real. Let \(\text{det}(\tilde {X})\) denote the determinant and \(|\text{det}(\tilde {X})|\) denote the absolute value of the determinant of \(\tilde {X}\). As a result, \(|\text{det}(\tilde {X})|{ }^{\alpha -p}\,\text{e}^{-\text{tr}(\tilde {X})}\) is a real-valued scalar function of \(\tilde {X}\). Let us consider the following integral, denoted by \(\tilde {\varGamma }_p(\alpha )\):

$$\displaystyle \begin{aligned} \tilde{\varGamma}_p(\alpha)=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{\alpha-p}\,\text{e}^{-\text{tr}(\tilde{X})}\text{d}\tilde{X}, {} \end{aligned} $$
(5.2a.1)

which was evaluated in Sect. 5.1a. In fact, (5.1a.3) provides two representations of the complex matrix-variate gamma function \(\tilde {\varGamma }_p(\alpha )\). With the help of (5.1a.3), we can define the complex p × p matrix-variate gamma density as follows:

$$\displaystyle \begin{aligned} \tilde{f}_1(\tilde{X})=\begin{cases}\frac{1}{\tilde{\varGamma}_p(\alpha)}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{tr}(\tilde{X})},~\tilde{X}>O,\ \Re(\alpha)>p-1\\ 0,\ \mbox{ elsewhere}.\end{cases} {} \end{aligned} $$
(5.2a.2)

For example, let us examine the 2 × 2 complex matrix-variate case. Let \(\tilde {X}\) be a matrix in the complex domain, \(\bar {\tilde {X}}\) denoting its complex conjugate and \(\tilde {X}^{*},\) its conjugate transpose. When \(\tilde {X}=\tilde {X}^{*},\) the matrix is Hermitian and its diagonal elements are real. In the 2 × 2 Hermitian case, let

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}x_1&x_2+iy_2\\ x_2-iy_2&x_3\end{bmatrix}.\end{aligned}$$

Then, the determinants are

$$\displaystyle \begin{aligned} \text{det}(\tilde{X})&=x_1x_3-(x_2-iy_2)(x_2+iy_2)=x_1x_3-(x_2^2+y_2^2)\\ &=\text{det}(\tilde{X}^{*}),~ x_1>0,~x_3>0,~x_1x_3-(x_2^2+y_2^2)>0,\end{aligned} $$

due to Hermitian positive definiteness of \(\tilde {X}\). As well,

$$\displaystyle \begin{aligned}|\text{det}(\tilde{X})|=+[(\text{det}(\tilde{X})(\text{det}(\tilde{X}^{*}))]^{\frac{1}{2}}=x_1x_3-(x_2^2+y_2^2)>0. \end{aligned}$$

Note that \(\text{tr}(\tilde {X})=x_1+x_3\) and \(\tilde {\varGamma }_2(\alpha )=\pi ^{\frac {2(1)}{2}}\varGamma (\alpha )\varGamma (\alpha -1),~ \Re (\alpha )>1,~ p=2\). The density is then of the following form:

$$\displaystyle \begin{aligned} f_1(\tilde{X})&=\frac{1}{\tilde{\varGamma}_2(\alpha)}|\text{det}(\tilde{X})|{}^{\alpha-2}\text{e}^{-\text{tr}(\tilde{X})}\\ &=\frac{1}{\pi \varGamma(\alpha)\varGamma(\alpha-1)}[x_1x_3-(x_2^2+y_2^2)]^{\alpha-2}\text{e}^{-(x_1+x_3)}\end{aligned} $$

for \(x_1>0,~x_3>0,~ x_1x_3-(x_2^2+y_2^2)>0,~ \Re (\alpha )>1\), and \(f_1(\tilde {X})=0\) elsewhere.

Now, consider a p × p parameter matrix \(\tilde {B}>O\). We can obtain the following identity corresponding to the identity in the real case:

$$\displaystyle \begin{aligned} |\text{det}(\tilde{B})|{}^{-\alpha}\equiv \frac{1}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{ tr}(\tilde{B}\tilde{X})}\text{d}\tilde{X},~ \Re(\alpha)>p-1. {} \end{aligned} $$
(5.2a.3)

A two-parameter gamma density in the complex domain can then be derived by proceeding as in the real case; it is given by

$$\displaystyle \begin{aligned} \tilde{f}(\tilde{X})=\begin{cases}\frac{|\text{det}(\tilde{B})|{}^{\alpha}}{\tilde{\varGamma}_p(\alpha)}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{ tr}(\tilde{B}\tilde{X})},\ \tilde{B}>O,~\tilde{X}>O,~ \Re(\alpha)>p-1\\ 0,\mbox{ elsewhere.} \end{cases}{} \end{aligned} $$
(5.2a.4)

5.2a.1. The mgf of the complex matrix-variate gamma distribution

The moment generating function in the complex domain is slightly different from that in the real case. Let \(\tilde {T}>O\) be a p × p parameter matrix and let \(\tilde {X}\) be p × p two-parameter gamma distributed as in (5.2a.4). Then \(\tilde {T}=T_1+iT_2\) and \(\tilde {X}=X_1+iX_2\), with T 1, T 2, X 1, X 2 real and \(i=\sqrt {(-1)}\). When \(\tilde {T}\) and \(\tilde {X}\) are Hermitian positive definite, T 1 and X 1 are real symmetric and T 2 and X 2 are real skew symmetric. Then consider

$$\displaystyle \begin{aligned}\text{tr}(\tilde{T}^{*}\tilde{X})=\text{tr}(T_1X_1)+\text{tr}(T_2X_2)+i[\text{tr}(T_1X_2)-\text{tr}(T_2X_1)]. \end{aligned}$$

Note that tr(T 1 X 1) + tr(T 2 X 2) contains all the real variables involved multiplied by the corresponding parameters, where the diagonal elements appear once and the off-diagonal elements each appear twice. Thus, as in the real case, \(\tilde {T}\) has to be replaced by \({{ }_{*}\tilde {T}}={{ }_{*}T}_1+i{{ }_{*}T}_2\). A term containing i still remains; however, as a result of the following properties, this term will disappear.

Lemma 5.2a.1

Let \(\tilde {T},~ \tilde {X},~ T_1,~T_2,~X_1,~X_2\) be as defined above. Then, tr(T 1 X 2) = 0, tr(T 2 X 1) = 0, tr(\({{}_{*}T}_1\) X 2) = 0, tr(\({{}_{*}T}_2\) X 1) = 0.

Proof

For any real square matrix A, tr(A) = tr(A′) and for any two matrices A and B where AB and BA are defined, tr(AB) = tr(BA). With the help of these two results, we have the following:

$$\displaystyle \begin{aligned} \text{tr}(T_1X_2)=\text{tr}(T_1X_2)^{\prime}=\text{tr}(X_2^{\prime}T_1^{\prime})=-\text{tr}(X_2T_1)=-\text{tr}(T_1X_2) \end{aligned}$$

as T 1 is symmetric and X 2 is skew symmetric. Now, tr(T 1 X 2) = −tr(T 1 X 2) ⇒tr(T 1 X 2) = 0 since it is a real quantity. It can be similarly established that the other results stated in the lemma hold.

We may now define the mgf in the complex case, denoted by \(M_{\tilde {X}}({{}_{*}\tilde{T}})\), as follows:

$$\displaystyle \begin{aligned} M_{\tilde{X}}({}_{*}\tilde{T})&=E[\text{e}^{\text{tr}({}_{*}\tilde{T}^{*}\tilde{X})}]=\int_{\tilde{X}>O}\text{e}^{\text{ tr}({}_{*}\tilde{T}^{*}\tilde{X})}\tilde{f}(\tilde{X})\text{d}\tilde{X}\\ &=\frac{|\text{det}(\tilde{B})|{}^{\alpha}}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{X}>O}\text{ e}^{-\text{tr}(\tilde{B}-_{*}\tilde{T}^{*})\tilde{X}}\text{d}\tilde{X}. \end{aligned} $$

Since \(\text{tr}(\tilde {X}(\tilde {B}-{{ }_{*}\tilde {T}}^{*}))=\text{ tr}(C\tilde {X}C^{*})\) for \(C=(\tilde {B}-{{ }_{*}\tilde {T}}^{*})^{\frac {1}{2}}\) and C > O, it follows from Theorem 1.6a.5 that \(\tilde {Y}=C\tilde {X}C^{*} \Rightarrow \text{d}\tilde {Y}=|\text{det}(CC^{*})|{ }^{p}\,\text{d}\tilde {X}\), that is, \(\text{d}\tilde {X}=|\text{det}(\tilde {B}-{{ }_{*}\tilde {T}}^{*})|{ }^{-p}\text{ d}\tilde {Y}\) for \(\tilde {B}-{{ }_{*}\tilde {T}}^{*}>O\). Then,

$$\displaystyle \begin{aligned} M_{\tilde{X}}({{}_{*}\tilde{T}})&=\frac{|\text{det}(\tilde{B})|{}^{\alpha}}{\tilde{\varGamma}_p(\alpha)}|\text{ det}(\tilde{B}-{{}_{*}\tilde{T}}^{*})|{}^{-\alpha}\int_{\tilde{Y}>O}|\text{det}(\tilde{Y})|{}^{\alpha-p}\text{e}^{-\text{tr}(\tilde{Y})}\text{d}\tilde{Y}\\ &=|\text{ det}(\tilde{B})|{}^{\alpha}|\text{det}(\tilde{B}-{{}_{*}\tilde{T}}^{*})|{}^{-\alpha}\ \mbox{ for }\tilde{B}-{{}_{*}\tilde{T}}^{*}>O\\ &=|\text{ det}(I-\tilde{B}^{-1}{{}_{*}\tilde{T}}^{*})|{}^{-\alpha},\ ~ I-\tilde{B}^{-1}{{}_{*}\tilde{T}}^{*}>O.{} \end{aligned} $$
(5.2a.5)

For example, let p = 2 and

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{x}_{11}&\tilde{x}_{12}\\ \tilde{x}_{21}&\tilde{x}_{22}\end{bmatrix},\ \ {{}_{*}\tilde{T}}=\begin{bmatrix}{{}_{*}\tilde{t}}_{11}&{{}_{*}\tilde{t}}_{12}\\ {{}_{*}\tilde{t}}_{21}&{{}_{*}\tilde{t}}_{22}\end{bmatrix},\ \ B=\begin{bmatrix}3&i\\ -i&2\end{bmatrix}\end{aligned}$$

with \(\tilde {x}_{21}=\tilde {x}_{12}^{*}\) and \( {{{ }_{*}\tilde t}}_{21}={{{ }_{*}\tilde t}}_{12}^{\,*}\). In this case, the conjugate transpose is only the conjugate since the quantities are scalar. Note that \(B=B^{*}\) and hence B is Hermitian. The leading minors of B being |(3)| = 3 > 0 and |B| = (3)(2) − (−i)(i) = 5 > 0, B is Hermitian positive definite. Accordingly,

$$\displaystyle \begin{aligned} M_{\tilde{X}}({{}_{*}{\tilde{T}}})&=|\text{det}(B)|{}^{\alpha}|\text{det}(B-{{}_{*}{\tilde{T}}}^{*})|{}^{-\alpha}\\ &=5^{\alpha}[(3-{{}_{*}{\tilde{t}}}_{11}^{*})(2-{{}_{*}{\tilde{t}}}_{22}^{*})+(i+{{}_{*}{\tilde{t}}}_{12}^{*})(i-{{}_{*}{\tilde{t}}}_{12})]^{-\alpha}.\end{aligned} $$
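
The expansion is easily confirmed with complex arithmetic; in the sketch below B is the matrix displayed above, while the numerical entries of \({{}_{*}\tilde{T}}\) are ours, chosen so that \(B-{{}_{*}\tilde{T}}^{*}\) stays Hermitian positive definite:

```python
# Verify the 2 x 2 determinant expansion of det(B - *T~*) numerically.
import numpy as np

B = np.array([[3, 1j], [-1j, 2]])        # Hermitian, det(B) = 5
T = np.array([[0.5, 0.2 + 0.1j],
              [0.2 - 0.1j, 0.3]])        # an illustrative Hermitian *T~
t11, t12, t22 = T[0, 0], T[0, 1], T[1, 1]

lhs = np.linalg.det(B - T.conj().T)
rhs = (3 - t11) * (2 - t22) + (1j + np.conj(t12)) * (1j - t12)
assert np.isclose(lhs, rhs)              # both equal 3.4 here
```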

Now, consider the partitioning of the following p × p matrices:

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{X}_{11}&\tilde{X}_{12}\\ \tilde{X}_{21}&\tilde{X}_{22}\end{bmatrix},\ \ {{}_{*}\tilde{T}}=\begin{bmatrix}{{}_{*}\tilde{T}}_{11}&{{}_{*}\tilde{T}}_{12}\\ {{}_{*}\tilde{T}}_{21}&{{}_{*}\tilde{T}}_{22}\end{bmatrix}\end{aligned}$$
(i)

where \(\tilde {X}_{11}\) and \({{ }_{*}\tilde {T}}_{11}\) are r × r, r ≤ p. Then, proceeding as in the real case, we have the following results:

Theorem 5.2a.1

Let \(\tilde {X}\) have a p × p complex matrix-variate gamma density with shape parameter α and scale parameter I p , and \(\tilde {X}\) be partitioned as in (i). Then, \(\tilde {X}_{11}\) has an r × r complex matrix-variate gamma density with shape parameter α and scale parameter I r and \(\tilde {X}_{22}\) has a (p − r) × (p − r) complex matrix-variate gamma density with shape parameter α and scale parameter I p−r.

Theorem 5.2a.2

Let the p × p complex matrix \(\tilde {X}\) have a p × p complex matrix-variate gamma density with the parameters ( \(\alpha ,\tilde {B}>O)\) and let \(\tilde {X}\) and \(\tilde {B}\) be partitioned as in (i) and \(\tilde {B}_{12}=O,~\tilde {B}_{21}=O\) . Then \(\tilde {X}_{11}\) and \(\tilde {X}_{22}\) have r × r and (p − r) × (p − r) complex matrix-variate gamma densities with shape parameter α and scale parameters \(\tilde {B}_{11}\) and \(\tilde {B}_{22},\) respectively.

Theorem 5.2a.3

Let \(\tilde {X},\ \tilde {X}_{11},\ \tilde {X}_{22}\) and \(\tilde {B}\) be as specified in Theorems 5.2a.1 or 5.2a.2 . Then, \(\tilde {X}_{11}\) and \(\tilde {X}_{22}\) are statistically independently distributed as complex matrix-variate gamma random variables on r × r and (p − r) × (p − r) matrices, respectively.

For a general matrix B whose sub-matrices B 12 and B 21 are not assumed to be null, the marginal densities of \(\tilde {X}_{11}\) and \(\tilde {X}_{22}\), which are given in the next result, can be determined by proceeding as in the real case.

Theorem 5.2a.4

Let \(\tilde {X}\) have a complex matrix-variate gamma density with shape parameter α and scale parameter matrix \(B=B^{*}>O\). Letting \(\tilde {X}\) and B be partitioned as in (i), then the sub-matrix \(\tilde {X}_{11}\) has a complex matrix-variate gamma density with shape parameter α and scale parameter matrix \(B_{11}-B_{12}B_{22}^{-1}B_{21}\) , and the sub-matrix \(\tilde {X}_{22}\) has a complex matrix-variate gamma density with shape parameter α and scale parameter matrix \(B_{22}-B_{21}B_{11}^{-1}B_{12}\).

Exercises 5.2

5.2.1

Show that

$$\displaystyle \begin{aligned} \pi^{\tfrac{(p-r)(p-r-1)}{4}}\pi^{\tfrac{r(r-1)}{4}}\pi^{\tfrac{2r(p-r)}{4}}=\pi^{\tfrac{p(p-1)}{4}}. \end{aligned} $$

5.2.2

Show that \(\pi^{\frac{r(p-r)}{2}}\varGamma _r(\alpha )\varGamma _{p-r}(\alpha -\frac {r}{2})=\varGamma _p(\alpha ).\)

5.2.3

Evaluate (1): \(\int_{X>O}\text{e}^{-\text{tr}(X)}\text{d}X\), (2): \(\int_{X>O}|X|\,\text{e}^{-\text{tr}(X)}\text{d}X\).

5.2.4

Write down (1): Γ 3(α), (2): Γ 4(α) explicitly in the real and complex cases.

5.2.5

Evaluate the integrals in Exercise 5.2.3 for the complex case. In (2) replace det(X) by |det(X)|.

5.3. Matrix-variate Type-1 Beta and Type-2 Beta Densities, Real Case

The p × p matrix-variate beta function denoted by B p(α, β) is defined as follows in the real case:

$$\displaystyle \begin{aligned} B_p(\alpha,\beta)=\frac{\varGamma_p(\alpha)\varGamma_p(\beta)}{\varGamma_p(\alpha+\beta)},~ \Re(\alpha)>\frac{p-1}{2},~\Re(\beta)>\frac{p-1}{2}. {} \end{aligned} $$
(5.3.1)

This function has the following integral representations in the real case where it is assumed that \(\Re (\alpha )>\frac {p-1}{2}\) and \(\Re (\beta )>\frac {p-1}{2}\):

$$\displaystyle \begin{aligned} B_p(\alpha,\beta)&=\int_{O<X<I}|X|{}^{\alpha-\frac{p+1}{2}}|I-X|{}^{\beta-\frac{p+1}{2}}\text{d}X,\ \mbox{a type-1 beta integral}{} \end{aligned} $$
(5.3.2)
$$\displaystyle \begin{aligned} B_p(\beta,\alpha)&=\int_{O<Y<I}|Y|{}^{\beta-\frac{p+1}{2}}|I-Y|{}^{\alpha-\frac{p+1}{2}}\text{d}Y,\ \mbox{a type-1 beta integral}{} \end{aligned} $$
(5.3.3)
$$\displaystyle \begin{aligned} B_p(\alpha,\beta)&=\int_{Z>O}|Z|{}^{\alpha-\frac{p+1}{2}}|I+Z|{}^{-(\alpha+\beta)}\text{d}Z,\ \mbox{a type-2 beta integral}{} \end{aligned} $$
(5.3.4)
$$\displaystyle \begin{aligned} B_p(\beta,\alpha)&=\int_{T>O}|T|{}^{\beta-\frac{p+1}{2}}|I+T|{}^{-(\alpha+\beta)}\text{d}T,\ \mbox{a type-2 beta integral}.{} \end{aligned} $$
(5.3.5)

For example, for p = 2, let

$$\displaystyle \begin{aligned}X=\begin{bmatrix}x_{11}&x_{12}\\ x_{12}&x_{22}\end{bmatrix},\ \ O<X<I.\end{aligned}$$

Then for example, (5.3.2) will be of the following form:

$$\displaystyle \begin{aligned} B_p(\alpha,\beta)&= \int_{O<X<I}|X|{}^{\alpha-\frac{p+1}{2}}|I-X|{}^{\beta-\frac{p+1}{2}}\text{d}X\\ &=\int\!\!\int\!\!\int[x_{11}x_{22}-x_{12}^2]^{\alpha-\frac{3}{2}}\\ & \qquad \qquad \qquad \qquad \qquad \times [(1-x_{11})(1-x_{22})-x_{12}^2]^{\beta-\frac{3}{2}}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}x_{22},\end{aligned} $$

the triple integral being taken over \(0<x_{11}<1,~0<x_{22}<1,~x_{11}x_{22}-x_{12}^2>0\) and \((1-x_{11})(1-x_{22})-x_{12}^2>0\).
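
For p = 2 this triple integral can be evaluated numerically and compared with (5.3.1). A sketch assuming SciPy, taking α = β = 2, for which B 2(2, 2) = Γ 2(2)²∕Γ 2(4) = π∕45:

```python
# Numerical check of the type-1 representation (5.3.2) for p = 2,
# alpha = beta = 2, over O < X < I.
import numpy as np
from scipy.integrate import tplquad
from scipy.special import multigammaln

def integrand(z, y, x):              # z = x12, y = x22, x = x11
    return np.sqrt((x * y - z * z) * ((1 - x) * (1 - y) - z * z))

bound = lambda x, y: np.sqrt(min(x * y, (1 - x) * (1 - y)))
val, _ = tplquad(integrand, 0, 1, 0, 1,
                 lambda x, y: -bound(x, y), bound)
exact = np.exp(2 * multigammaln(2, 2) - multigammaln(4, 2))
print(val, exact, np.pi / 45)        # all approximately 0.0698
```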

We will derive two of the integrals (5.3.2)–(5.3.5), the other ones being then directly obtained. Let us begin with the integral representations of Γ p(α) and Γ p(β) for \(\Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\):

$$\displaystyle \begin{aligned} \varGamma_p(\alpha)\varGamma_p(\beta)&=\Big[\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(X)}\text{ d}X\Big]\Big[\int_{Y>O}|Y|{}^{\beta-\frac{p+1}{2}}\text{e}^{-\text{tr}(Y)}\text{d}Y\Big]\\ &=\int_X\int_Y|X|{}^{\alpha-\frac{p+1}{2}}|Y|{}^{\beta-\frac{p+1}{2}}\text{e}^{-\text{tr}(X+Y)}\text{d}X\wedge\text{d}Y.\end{aligned} $$

Making the transformation U = X + Y, X = V , whose Jacobian is 1, taking out U from \(|U-V|=|U|~|I-U^{-\frac {1}{2}}VU^{-\frac {1}{2}}|\), and then letting \(W= U^{-\frac {1}{2}}VU^{-\frac {1}{2}}\Rightarrow \text{d}V=|U|{ }^{\frac {p+1}{2}}\text{d}W\), we have

$$\displaystyle \begin{aligned} \varGamma_p(\alpha)\varGamma_p(\beta)&=\int_U\int_V|V|{}^{\alpha-\frac{p+1}{2}}|U-V|{}^{\beta-\frac{p+1}{2}}\text{e}^{-\text{tr}(U)}\text{d}U\wedge\text{d}V\\ &=\Big\{\int_{U>O}|U|{}^{\alpha+\beta-\frac{p+1}{2}}\text{e}^{-\text{tr}(U)}\text{d}U\Big\}\\ &\ \ \ \ \ \ \times\Big\{\int_{O<W<I}|W|{}^{\alpha-\frac{p+1}{2}}|I-W|{}^{\beta-\frac{p+1}{2}}\text{d}W\Big\}\\ &=\varGamma_p(\alpha+\beta)\int_{O<W<I}|W|{}^{\alpha-\frac{p+1}{2}}|I-W|{}^{\beta-\frac{p+1}{2}}\text{d}W.\end{aligned} $$

Thus, on dividing both sides by Γ p(α + β), we have

$$\displaystyle \begin{aligned} B_p(\alpha,\beta)=\int_{O<W<I}|W|{}^{\alpha-\frac{p+1}{2}}|I-W|{}^{\beta-\frac{p+1}{2}}\text{d}W. \end{aligned}$$
(i)

This establishes (5.3.2). The initial conditions \(\Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\) are sufficient to justify all the steps above, and hence no conditions are listed at each stage. Now, take Y = I − W to obtain (5.3.3). Let us take the W of (i) above and consider the transformation

$$\displaystyle \begin{aligned}Z=(I-W)^{-\frac{1}{2}}W(I-W)^{-\frac{1}{2}}=(W^{-1}-I)^{-\frac{1}{2}}(W^{-1}-I)^{-\frac{1}{2}}=(W^{-1}-I)^{-1}\end{aligned}$$

which gives

$$\displaystyle \begin{aligned} Z^{-1}=W^{-1}-I\Rightarrow |Z|{}^{-(p+1)}\text{d}Z=|W|{}^{-(p+1)}\text{d}W. \end{aligned}$$
(ii)

Taking determinants and substituting in (ii) we have

$$\displaystyle \begin{aligned}\text{d}W=|I+Z|{}^{-(p+1)}\text{d}Z. \end{aligned}$$

On expressing W, I − W and dW in terms of Z, we have the result (5.3.4). Now, let \(T=Z^{-1}\) with the Jacobian \(\text{d}T=|Z|^{-(p+1)}\text{d}Z\); then (5.3.4) transforms into the integral (5.3.5). These establish all four integral representations of the real matrix-variate beta function. We may also observe that B p(α, β) = B p(β, α) or α and β can be interchanged in the beta function. Consider the function

$$\displaystyle \begin{aligned} f_3(X)=\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha)\varGamma_p(\beta)}|X|{}^{\alpha-\frac{p+1}{2}}|I-X|{}^{\beta-\frac{p+1}{2}} {} \end{aligned} $$
(5.3.6)

for \(O<X<I,~\Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\), and f 3(X) = 0 elsewhere. This is a type-1 real matrix-variate beta density with the parameters (α, β), where O < X < I means X > O, I − X > O so that all the eigenvalues of X are in the open interval (0, 1). As for

$$\displaystyle \begin{aligned} f_4(Z)=\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha)\varGamma_p(\beta)}|Z|{}^{\alpha-\frac{p+1}{2}}|I+Z|{}^{-(\alpha+\beta)} {} \end{aligned} $$
(5.3.7)

whenever \(Z>O,~ \Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\), and f 4(Z) = 0 elsewhere, this is a p × p real matrix-variate type-2 beta density with the parameters (α, β).

5.3.1. Some properties of real matrix-variate type-1 and type-2 beta densities

In the course of the above derivations, it was shown that the following results hold. If X is a p × p real positive definite matrix having a real matrix-variate type-1 beta density with the parameters (α, β), then: (1): Y 1 = I − X is real type-1 beta distributed with the parameters (β, α); (2): \(Y_2=(I-X)^{-\frac {1}{2}}X(I-X)^{-\frac {1}{2}}\) is real type-2 beta distributed with the parameters (α, β); (3): \(Y_3=(I-X)^{\frac {1}{2}}X^{-1}(I-X)^{\frac {1}{2}}\) is real type-2 beta distributed with the parameters (β, α). If Y  is real type-2 beta distributed with the parameters (α, β) then: (4): Z 1 = Y −1 is real type-2 beta distributed with the parameters (β, α); (5): \(Z_2=(I+Y)^{-\frac {1}{2}}Y(I+Y)^{-\frac {1}{2}}\) is real type-1 beta distributed with the parameters (α, β); (6): \(Z_3=I-(I+Y)^{-\frac {1}{2}}Y(I+Y)^{-\frac {1}{2}}=(I+Y)^{-1}\) is real type-1 beta distributed with the parameters (β, α).
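
These properties can be explored by simulation as well. The sketch below assumes the standard construction of a type-1 real matrix-variate beta matrix from two independent Wisharts, stated here without derivation: with S 1 ∼ W p(2α, I) and S 2 ∼ W p(2β, I) independent, \(X=(S_1+S_2)^{-\frac{1}{2}}S_1(S_1+S_2)^{-\frac{1}{2}}\) has the density (5.3.6). The determinant moment E[|X|] = B p(α + 1, β)∕B p(α, β), which follows directly from the integral (5.3.2), serves as the check:

```python
# Monte Carlo check of E[|X|] for the type-1 matrix-variate beta, p = 2.
import numpy as np
from scipy.stats import wishart
from scipy.special import multigammaln

rng = np.random.default_rng(3)
p, alpha, beta, n = 2, 3.0, 4.0, 100_000
S1 = wishart(df=2 * alpha, scale=np.eye(p)).rvs(n, random_state=rng)
S2 = wishart(df=2 * beta, scale=np.eye(p)).rvs(n, random_state=rng)
# |X| = det(S1) / det(S1 + S2), so no matrix square roots are needed
mc = (np.linalg.det(S1) / np.linalg.det(S1 + S2)).mean()

def log_Bp(a, b):                    # log B_p(a, b) via (5.3.1)
    return multigammaln(a, p) + multigammaln(b, p) - multigammaln(a + b, p)

print(mc, np.exp(log_Bp(alpha + 1, beta) - log_Bp(alpha, beta)))
```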

5.3a. Matrix-variate Type-1 and Type-2 Beta Densities, Complex Case

A matrix-variate beta function in the complex domain is defined as

$$\displaystyle \begin{aligned} \tilde{B}_p(\alpha,\beta)=\frac{\tilde{\varGamma}_p(\alpha)\tilde{\varGamma}_p(\beta)}{\tilde{\varGamma}_p(\alpha+\beta)},~ \Re(\alpha)>p-1,~\Re(\beta)>p-1 {} \end{aligned} $$
(5.3a.1)

with a tilde over B. As \(\tilde {B}_p(\alpha ,\beta )=\tilde {B}_p(\beta ,\alpha ),\) clearly α and β can be interchanged. Then, \(\tilde {B}_p(\alpha ,\beta )\) has the following integral representations, where \(\Re (\alpha )>p-1, \Re (\beta )>p-1\):

$$\displaystyle \begin{aligned} \tilde{B}_p(\alpha,\beta)&=\!\int_{O<\tilde{X}<I}\!\!|\text{det}(\tilde{X})|{}^{\alpha-p}|\text{det}(I-\tilde{X})|{}^{\beta-p}\text{d}\tilde{X},\ \mbox{a type-1 beta integral }{} \end{aligned} $$
(5.3a.2)
$$\displaystyle \begin{aligned} \tilde{B}_p(\beta,\alpha)&=\int_{O<\tilde{Y}<I}|\text{det}(\tilde{Y})|{}^{\beta-p}|\text{det}(I-\tilde{Y})|{}^{\alpha-p}\text{d}\tilde{Y},\ \mbox{a type-1 beta integral }{} \end{aligned} $$
(5.3a.3)
$$\displaystyle \begin{aligned} \tilde{B}_p(\alpha,\beta)&=\int_{\tilde{Z}>O}|\text{det}(\tilde{Z})|{}^{\alpha-p}|\text{det}(I+\tilde{Z})|{}^{-(\alpha+\beta)}\text{d}\tilde{Z},\ \mbox{a type-2 beta integral }{} \end{aligned} $$
(5.3a.4)
$$\displaystyle \begin{aligned} \tilde{B}_p(\beta,\alpha)&=\!\int_{\tilde{T}>O}\!\!|\text{det}(\tilde{T})|{}^{\beta-p}|\text{det}(I+\tilde{T})|{}^{-(\alpha+\beta)}\text{d}\tilde{T},\ \mbox{a type-2 beta integral. }{} \end{aligned} $$
(5.3a.5)

For instance, consider the integrand in (5.3a.2) for the case p = 2. Let

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}x_1&x_2+iy_2\\ x_2-iy_2&x_3\end{bmatrix},\end{aligned}$$

the diagonal elements of \(\tilde {X}\) being real; since \(\tilde {X}\) is Hermitian positive definite, we have \(x_1>0,~x_3>0,~x_1x_3-(x_2^2+y_2^2)>0\), \(\text{ det}(\tilde {X})=x_1x_3-(x_2^2+y_2^2)>0\) and \(\text{det}(I-\tilde {X})=(1-x_1)(1-x_3)-(x_2^2+y_2^2)>0\). The integrand in (5.3a.2) is then

$$\displaystyle \begin{aligned}{}[x_1x_3-(x_2^2+y_2^2)]^{\alpha-2}[(1-x_1)(1-x_3)-(x_2^2+y_2^2)]^{\beta-2}. \end{aligned}$$

The derivations of (5.3a.2)–(5.3a.5) being parallel to those provided in the real case, they are omitted. We will list one case for each of a type-1 and a type-2 beta density in the complex p × p matrix-variate case:

$$\displaystyle \begin{aligned} \tilde{f}_3(\tilde{X})=\frac{\tilde{\varGamma}_p(\alpha+\beta)}{\tilde{\varGamma}_p(\alpha)\tilde{\varGamma}_p(\beta)}|\text{det}(\tilde{X})|{}^{\alpha-p}|\text{ det}(I-\tilde{X})|{}^{\beta-p} {} \end{aligned} $$
(5.3a.6)

for \(O<\tilde {X}<I,~\Re (\alpha )>p-1,~ \Re (\beta )>p-1\) and \(\tilde {f}_3(\tilde {X})=0\) elsewhere;

$$\displaystyle \begin{aligned} \tilde{f}_4(\tilde{Z})=\frac{\tilde{\varGamma}_p(\alpha+\beta)}{\tilde{\varGamma}_p(\alpha)\tilde{\varGamma}_p(\beta)} |\text{det}(\tilde{Z})|{}^{\alpha-p}|\text{ det}(I+\tilde{Z})|{}^{-(\alpha+\beta)} {} \end{aligned} $$
(5.3a.7)

for \(\tilde {Z}>O,~ \Re (\alpha )>p-1,~ \Re (\beta )>p-1\) and \(\tilde {f}_4(\tilde {Z})=0\) elsewhere.

Properties parallel to (1) to (6) which are listed in Sect. 5.3.1 also hold in the complex case.
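
Like its real counterpart, \(\tilde{B}_p(\alpha,\beta)\) is a finite product of ordinary gamma functions and can be computed directly; a minimal sketch (helper names ours) checking \(\tilde{B}_2(2,2)=\pi\varGamma(2)\varGamma(1)\varGamma(2)\varGamma(1)/[\varGamma(4)\varGamma(3)]=\pi/12\):

```python
# Complex matrix-variate beta function via the product form of (5.1a.3).
import math

def log_complex_gamma_p(a: float, p: int) -> float:
    return (p * (p - 1) / 2) * math.log(math.pi) + sum(
        math.lgamma(a - j) for j in range(p))

def complex_beta_p(a: float, b: float, p: int) -> float:
    return math.exp(log_complex_gamma_p(a, p) + log_complex_gamma_p(b, p)
                    - log_complex_gamma_p(a + b, p))

assert math.isclose(complex_beta_p(2, 2, 2), math.pi / 12)
```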

5.3.2. Explicit evaluation of type-1 matrix-variate beta integrals, real case

A detailed evaluation of a type-1 matrix-variate beta integral as a multiple integral is presented in this section as the steps will prove useful in connection with other computations; the reader may also refer to Mathai (2014,b). The real matrix-variate type-1 beta function which is denoted by

$$\displaystyle \begin{aligned}B_p(\alpha,\beta)=\frac{\varGamma_p(\alpha)\varGamma_p(\beta)}{\varGamma_p(\alpha+\beta)},~\Re(\alpha)>\frac{p-1}{2}, ~\Re(\beta)>\frac{p-1}{2}, \end{aligned}$$

has the following type-1 beta integral representation:

$$\displaystyle \begin{aligned}B_p(\alpha,\beta)=\int_{O<X<I}|X|{}^{\alpha-\frac{p+1}{2}}|I-X|{}^{\beta-\frac{p+1}{2}}\text{d}X, \end{aligned}$$

for \(\Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\) where X is a real p × p symmetric positive definite matrix. The standard derivation of this integral relies on the properties of real matrix-variate gamma integrals after making suitable transformations, as was previously done. It is also possible to evaluate the integral directly and show that it is equal to Γ p(α)Γ p(β)∕Γ p(α + β) where, for example,

$$\displaystyle \begin{aligned}\varGamma_p(\alpha)=\pi^{\frac{p(p-1)}{4}}\varGamma(\alpha)\varGamma(\alpha-{1}/{2})\cdots \varGamma(\alpha-{(p-1)}/{2}), ~\Re(\alpha)>\frac{p-1}{2}. \end{aligned}$$

A convenient technique for evaluating a real matrix-variate gamma integral consists of making the transformation X = TT where T is a lower triangular matrix whose diagonal elements are positive. However, on applying this transformation, the type-1 beta integral does not simplify due to the presence of the factor \(|I-X|{ }^{\beta -\frac {p+1}{2}}\). Hence, we will attempt to evaluate this integral by appropriately partitioning the matrices and then, successively integrating out the variables. Letting X = (x ij) be a p × p real matrix, x pp can then be extracted from the determinants of |X| and |I − X| after partitioning the matrices. Thus, let

$$\displaystyle \begin{aligned}X=\begin{bmatrix}X_{11}&X_{12}\\ X_{21}&x_{pp}\end{bmatrix}\end{aligned}$$

where X 11 is the (p − 1) × (p − 1) leading sub-matrix, X 21 is 1 × (p − 1), X 22 = x pp and \(X_{12}=X_{21}^{\prime }\). Then \(|X|=|X_{11}||x_{pp}-X_{21}X_{11}^{-1}X_{12}|\) so that

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{p+1}{2}}=|X_{11}|{}^{\alpha-\frac{p+1}{2}}[x_{pp}-X_{21}X_{11}^{-1}X_{12}]^{\alpha-\frac{p+1}{2}}, \end{aligned}$$
(i)

and

$$\displaystyle \begin{aligned} |I-X|{}^{\beta-\frac{p+1}{2}}=|I-X_{11}|{}^{\beta-\frac{p+1}{2}}[(1-x_{pp})-X_{21}(I-X_{11})^{-1}X_{12}]^{\beta-\frac{p+1}{2}}. \end{aligned}$$
(ii)

It follows from (i) that \(x_{pp}>X_{21}X_{11}^{-1}X_{12}\) and, from (ii) that x pp < 1 − X 21(IX 11)−1 X 12; thus, we have \(X_{21}X_{11}^{-1}X_{12}<x_{pp}<1-X_{21}(I-X_{11})^{-1}X_{12}\). Let \(y=x_{pp}-X_{21}X_{11}^{-1}X_{12}\Rightarrow \text{d}y=\text{d}x_{pp}\) for fixed X 21, X 11, so that 0 < y < b where

$$\displaystyle \begin{aligned} b&=1-X_{21}X_{11}^{-1}X_{12}-X_{21}(I-X_{11})^{-1}X_{12}\\ &=1-X_{21}X_{11}^{-\frac{1}{2}}(I-X_{11})^{-\frac{1}{2}}(I-X_{11})^{-\frac{1}{2}}X_{11}^{-\frac{1}{2}}X_{12}\\ &=1-WW',~W=X_{21}X_{11}^{-\frac{1}{2}}(I-X_{11})^{-\frac{1}{2}}. \end{aligned} $$

The second factor on the right-hand side of (ii) then becomes

$$\displaystyle \begin{aligned}{}[b-y]^{\beta-\frac{p+1}{2}}=b^{\beta-\frac{p+1}{2}}[1-{y}/{b}]^{\beta-\frac{p+1}{2}}. \end{aligned}$$

Now letting \(u=\frac {y}{b}\) for fixed b, the terms containing u and b become \(b^{\alpha +\beta -(p+1)+1}u^{\alpha -\frac {p+1}{2}}\) \((1-u)^{\beta -\frac {p+1}{2}}\). Integration over u then gives

$$\displaystyle \begin{aligned}\int_0^1u^{\alpha-\frac{p+1}{2}}(1-u)^{\beta-\frac{p+1}{2}}\text{ d}u=\frac{\varGamma(\alpha-\frac{p-1}{2})\varGamma(\beta-\frac{p-1}{2})}{\varGamma(\alpha+\beta-(p-1))}\end{aligned}$$

for \(~\Re (\alpha )>\frac {p-1}{2}, ~\Re (\beta )>\frac {p-1}{2}.\) Letting \(W=X_{21}X_{11}^{-\frac {1}{2}}(I-X_{11})^{-\frac {1}{2}}\) for fixed X 11, \(\text{d}X_{21}=|X_{11}|{ }^{\frac {1}{2}}|I-X_{11}|{ }^{\frac {1}{2}}\text{d}W\) from Theorem 1.6.1 of Chap. 1 or Theorem 1.18 of Mathai (1997), where X 11 is a (p − 1) × (p − 1) matrix. Now, letting \(v=WW'\) and integrating out over the Stiefel manifold by applying Theorem 4.2.3 of Chap. 4 or Theorem 2.16 and Remark 2.13 of Mathai (1997), we have

$$\displaystyle \begin{aligned} \text{d}W=\frac{\pi^{\frac{p-1}{2}}}{\varGamma(\frac{p-1}{2})}v^{\frac{p-1}{2}-1}\text{d}v. \end{aligned}$$

Thus, the integral over b becomes

$$\displaystyle \begin{aligned} \int b^{\alpha+\beta-p}\text{d}X_{21}&=\int_0^1v^{\frac{p-1}{2}-1}(1-v)^{\alpha+\beta-p}\text{d}v\\ &=\frac{\varGamma(\frac{p-1}{2})\varGamma(\alpha+\beta-(p-1))}{\varGamma(\alpha+\beta-\frac{p-1}{2})},~\Re(\alpha+\beta)>p-1. \end{aligned} $$

Then, on multiplying all the factors together, we have

$$\displaystyle \begin{aligned}|X_{11}^{(1)}|{}^{\alpha+\frac{1}{2}-\frac{p+1}{2}}|I-X_{11}^{(1)}|{}^{\beta+\frac{1}{2}-\frac{p+1}{2}}\pi^{\frac{p-1}{2}} \frac{\varGamma(\alpha-\frac{p-1}{2})\varGamma(\beta-\frac{p-1}{2})}{\varGamma(\alpha+\beta-\frac{p-1}{2})}\end{aligned}$$

whenever \(\Re (\alpha )>\frac {p-1}{2},~\Re (\beta )>\frac {p-1}{2}\). In this case, \(X_{11}^{(1)}\) represents the (p − 1) × (p − 1) leading sub-matrix at the end of the first set of operations. At the end of the second set of operations, we will denote the (p − 2) × (p − 2) leading sub-matrix by \(X_{11}^{(2)}\), and so on. The second step of the operations begins by extracting x p−1,p−1 and writing

$$\displaystyle \begin{aligned}|X_{11}^{(1)}|=|X_{11}^{(2)}|~[x_{p-1,p-1}-X_{21}^{(2)}[X_{11}^{(2)}]^{-1}X_{12}^{(2)}] \end{aligned}$$

where \(X_{21}^{(2)}\) is a 1 × (p − 2) vector. We then proceed as in the first sequence of steps to obtain the final factors in the following form:

$$\displaystyle \begin{aligned}|X_{11}^{(2)}|{}^{\alpha+1-\frac{p+1}{2}}|I-X_{11}^{(2)}|{}^{\beta+1-\frac{p+1}{2}}\pi^{\frac{p-2}{2}} \frac{\varGamma(\alpha-\frac{p-2}{2})\varGamma(\beta-\frac{p-2}{2})}{\varGamma(\alpha+\beta-\frac{p-2}{2})}\end{aligned}$$

for \(\Re (\alpha )>\frac {p-2}{2},~\Re (\beta )>\frac {p-2}{2}\). Proceeding in such a manner, in the end, the exponent of π will be

$$\displaystyle \begin{aligned}\frac{p-1}{2}+\frac{p-2}{2}+\cdots +\frac{1}{2}=\frac{p(p-1)}{4},\end{aligned}$$

and the gamma product will be

$$\displaystyle \begin{aligned}\frac{\varGamma(\alpha-\frac{p-1}{2})\varGamma(\alpha-\frac{p-2}{2})\cdots \varGamma(\alpha)\varGamma(\beta-\frac{p-1}{2})\cdots \varGamma(\beta)} {\varGamma(\alpha+\beta-\frac{p-1}{2})\cdots \varGamma(\alpha+\beta)}. \end{aligned}$$

These gamma products, along with \(\pi ^{\frac {p(p-1)}{4}},\) can be written as \(\frac {\varGamma _p(\alpha )\varGamma _p(\beta )}{\varGamma _p(\alpha +\beta )}=B_p(\alpha ,~\beta )\); hence the result. It is thus possible to obtain the beta function in the real matrix-variate case by direct evaluation of a type-1 real matrix-variate beta integral.

A similar approach can yield the real matrix-variate beta function from a type-2 real matrix-variate beta integral of the form

$$\displaystyle \begin{aligned}\int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}|I+X|{}^{-(\alpha+\beta)}\text{d}X \end{aligned}$$

where X is a p × p positive definite symmetric matrix and it is assumed that \(\Re (\alpha )>\frac {p-1}{2}\) and \(\Re (\beta )>\frac {p-1}{2}\), the evaluation procedure being parallel.
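
For p = 2 this type-2 representation can be checked numerically in the same way as the type-1 integral was earlier; with α = β = 2 the value must again be B 2(2, 2) = π∕45. A sketch assuming SciPy:

```python
# Numerical check of the type-2 representation (5.3.4) for p = 2,
# alpha = beta = 2, over X > O (x12^2 < x11 * x22).
import numpy as np
from scipy.integrate import tplquad

def integrand(z, y, x):              # z = x12, y = x22, x = x11
    return np.sqrt(x * y - z * z) / ((1 + x) * (1 + y) - z * z) ** 4

val, _ = tplquad(integrand, 0, np.inf, 0, np.inf,
                 lambda x, y: -np.sqrt(x * y), lambda x, y: np.sqrt(x * y))
print(val, np.pi / 45)               # both approximately 0.0698
```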

Example 5.3.1

By direct evaluation as a multiple integral, show that

$$\displaystyle \begin{aligned}\int_{O<X<I}|X|{}^{\alpha-\frac{p+1}{2}}|I-X|{}^{\beta-\frac{p+1}{2}}\text{d}X=\frac{\varGamma_p(\alpha)\varGamma_p(\beta)}{\varGamma_p(\alpha+\beta)}\end{aligned}$$

for p = 2.

Solution 5.3.1

The integral to be evaluated will be denoted by δ. Let

$$\displaystyle \begin{aligned}X=\begin{bmatrix}x_{11}&x_{12}\\ x_{12}&x_{22}\end{bmatrix},\ \ O<X<I.\end{aligned}$$

Then

$$\displaystyle \begin{aligned} |X|&=x_{11}[x_{22}-x_{21}x_{11}^{-1}x_{12}]=x_{11}\Big[x_{22}-\frac{x_{12}^2}{x_{11}}\,\Big]\\ |I-X|&=[1-x_{11}][(1-x_{22})-x_{12}(1-x_{11})^{-1}x_{12}]=(1-x_{11})\Big[1-x_{22}-\frac{x_{12}^2}{1-x_{11}}\,\Big]. \end{aligned} $$
(i)

It is seen from (i) that

$$\displaystyle \begin{aligned}\frac{x_{12}^2}{x_{11}}\le x_{22}\le 1-\frac{x_{12}^2}{1-x_{11}}. \end{aligned}$$

Letting \(y=x_{22}-\frac {x_{12}^2}{x_{11}}\) so that \(0\le y\le b,\ \mbox{and}\ b=1-\frac {x_{12}^2}{x_{11}}-\frac {x_{12}^2}{1-x_{11}}=1-\frac {x_{12}^2}{x_{11}(1-x_{11})},\) we have

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{3}{2}}|I-X|{}^{\beta-\frac{3}{2}}\text{d}X&=x_{11}^{\alpha-\frac{3}{2}}(1-x_{11})^{\beta-\frac{3}{2}}y^{\alpha-\frac{3}{2}}\\ &\ \ \ \times (b-y)^{\beta-\frac{3}{2}}\text{d}x_{11}\wedge\text{d}x_{12}\wedge\text{d}y.\end{aligned} $$

Now, integrating out y, we have

$$\displaystyle \begin{aligned} \int_{y=0}^by^{\alpha-\frac{3}{2}}(b-y)^{\beta-\frac{3}{2}}\text{d}y&=b^{\alpha+\beta-3+1}\int_0^1v^{\alpha-\frac{3}{2}}(1-v)^{\beta-\frac{3}{2}}\text{ d}v,~v=\frac{y}{b}\\ &=b^{\alpha+\beta-2}\frac{\varGamma(\alpha-\frac{1}{2})\varGamma(\beta-\frac{1}{2})}{\varGamma(\alpha+\beta-1)} \end{aligned} $$
(ii)

whenever \(\Re (\alpha )>\frac {1}{2}\) and \(\Re (\beta )>\frac {1}{2},\) b being as previously defined. Letting \(w=\frac {x_{12}}{[x_{11}(1-x_{11})]^{\frac {1}{2}}}\), we have \(\text{d}x_{12}=[x_{11}(1-x_{11})]^{\frac {1}{2}}\text{d}w\) for fixed x 11. The exponents of x 11 and (1 − x 11) then become \(\alpha -\frac {3}{2}+\frac {1}{2}\) and \(\beta -\frac {3}{2}+\frac {1}{2}\), and the integral over w gives the following:

$$\displaystyle \begin{aligned} \int_{-1}^1(1-w^2)^{\alpha+\beta-2}\text{d}w&=2\int_0^1(1-w^2)^{\alpha+\beta-2}\text{d}w=\int_0^1z^{\frac{1}{2}-1}(1-z)^{\alpha+\beta-2}\text{d}z\\ &=\frac{\varGamma(\frac{1}{2})\varGamma(\alpha+\beta-1)}{\varGamma(\alpha+\beta-\frac{1}{2})}. \end{aligned} $$
(iii)

Now, integrating out x 11, we obtain

$$\displaystyle \begin{aligned} \int_0^1x_{11}^{\alpha-1}(1-x_{11})^{\beta-1}\text{d}x_{11}=\frac{\varGamma(\alpha)\varGamma(\beta)}{\varGamma(\alpha+\beta)}. \end{aligned}$$
(iv)

Then, on collecting the factors from (i) to (iv), we have

$$\displaystyle \begin{aligned}\delta=\varGamma({1}/{2})\frac{\varGamma(\alpha)\varGamma(\alpha-\frac{1}{2})\varGamma(\beta)\varGamma(\beta-\frac{1}{2})} {\varGamma(\alpha+\beta)\varGamma(\alpha+\beta-\frac{1}{2})}. \end{aligned}$$

Finally, noting that for \(p=2,~\pi ^{\frac {p(p-1)}{4}}=\pi ^{\frac {1}{2}}=\frac {\pi ^{\frac {1}{2}}\pi ^{\frac {1}{2}}}{\pi ^{\frac {1}{2}}}\), the desired result is obtained, that is,

$$\displaystyle \begin{aligned}\delta=\frac{\varGamma_2(\alpha)\varGamma_2(\beta)}{\varGamma_2(\alpha+\beta)}=B_2(\alpha,\beta). \end{aligned}$$

This completes the computations.

5.3a.1. Evaluation of matrix-variate type-1 beta integrals, complex case

The integral representation for B p(α, β) in the complex case is

$$\displaystyle \begin{aligned}\int_{O<\tilde{X}<I}|\text{det}(\tilde{X})|{}^{\alpha-p}|\text{det}(I-\tilde{X})|{}^{\beta-p}\text{d}\tilde{X}=\tilde{B}_p(\alpha,\beta) \end{aligned}$$

whenever \(\Re (\alpha )>p-1,~\Re (\beta )>p-1\) where det(⋅) denotes the determinant of (⋅) and |det(⋅)|, the absolute value (or modulus) of the determinant of (⋅). In this case, \(\tilde {X}=(\tilde {x}_{ij})\) is a p × p Hermitian positive definite matrix and accordingly, all of its diagonal elements are real and positive. As in the real case, let us extract x pp by partitioning \(\tilde {X}\) as follows:

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{X}_{11}&\tilde{X}_{12}\\ \tilde{X}_{21}&\tilde{X}_{22}\end{bmatrix}\end{aligned}$$

where \(\tilde {X}_{22}\equiv x_{pp}\) is a real scalar. Then, the absolute values of the determinants have the following representations:

$$\displaystyle \begin{aligned} |\text{det}(\tilde{X})|{}^{\alpha-p}=|\text{det}(\tilde{X}_{11})|{}^{\alpha-p}|x_{pp}-\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}^{*}|{}^{\alpha-p} \end{aligned}$$
(i)

where * indicates conjugate transpose, and

$$\displaystyle \begin{aligned} |\text{det}(I-\tilde{X})|{}^{\beta-p}=|\text{det}(I-\tilde{X}_{11})|{}^{\beta-p}|(1-x_{pp})-\tilde{X}_{21}(I-\tilde{X}_{11})^{-1}\tilde{X}_{12}^{*}|{}^{\beta-p}. \end{aligned}$$
(ii)

Note that whenever \(\tilde {X}\) and \(I-\tilde {X}\) are Hermitian positive definite, \(\tilde {X}_{11}^{-1}\) and \((I-\tilde {X}_{11})^{-1}\) are also Hermitian positive definite. Further, the Hermitian forms \(\tilde {X}_{21}\tilde {X}_{11}^{-1}\tilde {X}_{12}^{*}\) and \(\tilde {X}_{21}(I-\tilde {X}_{11})^{-1}\tilde {X}_{12}^{*}\) remain real and positive. It follows from (i) and (ii) that

$$\displaystyle \begin{aligned}\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}^{*}<x_{pp}<1-\tilde{X}_{21}(I-\tilde{X}_{11})^{-1}\tilde{X}_{12}^{*}. \end{aligned}$$

Since the traces of Hermitian forms are real, the lower and upper bounds of x pp are real as well. Let

$$\displaystyle \begin{aligned}\tilde{W}=\tilde{X}_{21}\tilde{X}_{11}^{-\frac{1}{2}}(I-\tilde{X}_{11})^{-\frac{1}{2}}\end{aligned}$$

for fixed \(\tilde {X}_{11}\). Then

$$\displaystyle \begin{aligned}\text{d}\tilde{X}_{21}=|\text{det}(\tilde{X}_{11})|~|\text{det}(I-\tilde{X}_{11})|\,\text{d}\tilde{W}\end{aligned}$$

and \(|\text{det}(\tilde {X}_{11})|{ }^{\alpha -p},~ |\text{det}(I-\tilde {X}_{11})|{ }^{\beta -p}\) will become \(|\text{det}(\tilde {X}_{11})|{ }^{\alpha +1-p},~ |\text{ det}(I-\tilde {X}_{11})|{ }^{\beta +1-p},\) respectively. Then, we can write

the following, letting \(y=x_{pp}-\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}^{*}\) and \(b=1-\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}^{*}-\tilde{X}_{21}(I-\tilde{X}_{11})^{-1}\tilde{X}_{12}^{*}\):

$$\displaystyle \begin{aligned} |(1-x_{pp})-\tilde{X}_{21}(I-\tilde{X}_{11})^{-1}\tilde{X}_{12}^{*}|{}^{\beta-p}=(b-y)^{\beta-p}=b^{\beta-p}[1-{y}/{b}\,]^{\beta-p}. \end{aligned} $$

Now, letting u = yb, the factors containing u and b will be of the form u αp (1 − u)βp b α+β−2p+1; the integral over u then gives

$$\displaystyle \begin{aligned}\int_0^1u^{\alpha-p}(1-u)^{\beta-p}\text{d}u=\frac{\varGamma(\alpha-(p-1))\varGamma(\beta-(p-1))}{\varGamma(\alpha+\beta-2(p-1))}, \end{aligned}$$

for \(\Re (\alpha )>p-1,~\Re (\beta )>p-1\). Letting \(v=\tilde {W}\tilde {W}^{*}\) and integrating out over the Stiefel manifold by making use of Theorem 4.2a.3 of Chap. 4 or Corollaries 4.5.2 and 4.5.3 of Mathai (1997), we have

$$\displaystyle \begin{aligned}\text{d}\tilde{W}=\frac{\pi^{p-1}}{\varGamma(p-1)}v^{(p-1)-1}\text{d}v. \end{aligned}$$

The integral over b gives

$$\displaystyle \begin{aligned} \int b^{\alpha+\beta-2p+1}\text{d}\tilde{X}_{21}&=\int_0^1v^{(p-1)-1}(1-v)^{\alpha+\beta-2p+1}\text{d}v\\ &=\frac{\varGamma(p-1)\varGamma(\alpha+\beta-2(p-1))}{\varGamma(\alpha+\beta-p+1)}, \end{aligned} $$

for \(\Re (\alpha )>p-1,~\Re (\beta )>p-1\). Now, taking the product of all the factors yields

$$\displaystyle \begin{aligned}|\text{det}(\tilde{X}_{11})|{}^{\alpha+1-p}|\text{det}(I-\tilde{X}_{11})|{}^{\beta+1-p}\pi^{p-1}\frac{\varGamma(\alpha-p+1)\varGamma(\beta-p+1)}{\varGamma(\alpha+\beta-p+1)}\end{aligned}$$

for \(\Re (\alpha )>p-1,~\Re (\beta )>p-1\). On extracting x p−1,p−1 from \(|\tilde {X}_{11}|\) and \(|I-\tilde {X}_{11}|\) and continuing this process, in the end, the exponent of π will be \((p-1)+(p-2)+\cdots +1=\frac {p(p-1)}{2}\) and the gamma product will be

$$\displaystyle \begin{aligned}\frac{\varGamma(\alpha-(p-1))\varGamma(\alpha-(p-2))\cdots \varGamma(\alpha)\varGamma(\beta-(p-1))\cdots \varGamma(\beta)} {\varGamma(\alpha+\beta-(p-1))\cdots \varGamma(\alpha+\beta)}. \end{aligned}$$

These factors, along with \(\pi ^{\frac {p(p-1)}{2}}\) give

$$\displaystyle \begin{aligned}\frac{{\tilde\varGamma_p}(\alpha){\tilde\varGamma_p}(\beta)}{{\tilde\varGamma_p}(\alpha+\beta)}=\tilde{B_p}(\alpha,~\beta),~ \Re(\alpha)>p-1,~\Re(\beta)>p-1. \end{aligned}$$

The procedure for evaluating a type-2 matrix-variate beta integral by partitioning matrices is parallel and hence will not be detailed here.
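Identities of this type are easy to check numerically. The following is a minimal sketch in Python, assuming the mpmath library is available; gamma_p_tilde and beta_p_tilde are our own helper names implementing the definitions used above, not standard library routines.

```python
# Sketch: numerical check of B_p~(alpha, beta) =
# Gamma_p~(alpha) Gamma_p~(beta) / Gamma_p~(alpha + beta), where
# Gamma_p~(alpha) = pi^{p(p-1)/2} Gamma(alpha) Gamma(alpha-1) ... Gamma(alpha-p+1).
from mpmath import mp, mpf, pi, gamma

mp.dps = 30  # working precision

def gamma_p_tilde(p, a):
    """Complex matrix-variate gamma function (helper, our own name)."""
    val = pi ** (p * (p - 1) / 2)
    for j in range(p):
        val *= gamma(a - j)
    return val

def beta_p_tilde(p, a, b):
    return gamma_p_tilde(p, a) * gamma_p_tilde(p, b) / gamma_p_tilde(p, a + b)

# For p = 2, compare with the explicit form derived in the example below:
a, b = mpf(3), mpf(4)
explicit = pi * gamma(a) * gamma(a - 1) * gamma(b) * gamma(b - 1) \
    / (gamma(a + b) * gamma(a + b - 1))
print(beta_p_tilde(2, a, b), explicit)  # the two values agree
```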

Example 5.3a.1

For p = 2, evaluate the integral

$$\displaystyle \begin{aligned}\int_{O<\tilde{X}<I}|\text{det}(\tilde{X})|{}^{\alpha-p}|\text{det}(I-\tilde{X})|{}^{\beta-p}\text{d}\tilde{X}\end{aligned}$$

as a multiple integral and show that it evaluates to \(\tilde {B}_2(\alpha ,\beta )\), the beta function in the complex domain.

Solution 5.3a.1

For p = 2, \(\pi ^{\frac {p(p-1)}{2}}=\pi ^{\frac {2(1)}{2}}=\pi \), and

$$\displaystyle \begin{aligned}\tilde{B}_2(\alpha,\beta)=\pi\frac{\varGamma(\alpha)\varGamma(\alpha-1)\varGamma(\beta)\varGamma(\beta-1)}{\varGamma(\alpha+\beta)\varGamma(\alpha+\beta-1)}\end{aligned}$$

whenever \(\Re (\alpha )>1 \) and \(\Re (\beta )>1\). For p = 2, our matrix is

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{x}_{11}&\tilde{x}_{12}\\ \tilde{x}_{12}^{*}&\tilde{x}_{22}\end{bmatrix}\end{aligned}$$

where \(\tilde {x}_{12}^{*}\) is simply the conjugate of \(\tilde {x}_{12}\) as it is a scalar quantity. By expanding the determinants of the partitioned matrices \(\tilde{X}\) and \(I-\tilde{X}\) as explained in Sect. 1.3, we have the following:

$$\displaystyle \begin{aligned} \text{det}(\tilde{X})&=\tilde{x}_{11}[\tilde{x}_{22}-\tilde{x}_{12}^{*}\tilde{x}_{11}^{-1}\tilde{x}_{12}] =\tilde{x}_{11}\Big[\tilde{x}_{22}-\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{\tilde{x}_{11}}\Big] \end{aligned} $$
(i)
$$\displaystyle \begin{aligned} \text{det}(I-\tilde{X})&=(1-\tilde{x}_{11})\Big[1-\tilde{x}_{22}-\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{1-\tilde{x}_{11}}\Big]. \end{aligned} $$
(ii)

From (i) and (ii), it is seen that

$$\displaystyle \begin{aligned}\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{\tilde{x}_{11}}\le \tilde{x}_{22}\le 1-\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{1-\tilde{x}_{11}}.\end{aligned}$$

Note that when \(\tilde {X}\) is Hermitian, \(\tilde {x}_{11}\) and \(\tilde {x}_{22}\) are real, and hence we need not place a tilde on these variables. Let \(y=x_{22}-{\tilde {x}_{12}\tilde {x}_{12}^{*}}/{x_{11}};\) y is also real since \(\tilde {x}_{12}\tilde {x}_{12}^{*}\) is real. As well, 0 ≤ y ≤ b, where

$$\displaystyle \begin{aligned}b=1-\Big[\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{x_{11}}\Big]-\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{1-x_{11}} =1-\frac{\tilde{x}_{12}\tilde{x}_{12}^{*}}{x_{11}(1-x_{11})}.\end{aligned}$$

Further, b is a real scalar of the form \(b=1-\tilde {w}\tilde {w}^{*}\) where \(\tilde {w}=\frac {\tilde {x}_{12}}{[x_{11}(1-x_{11})]^{\frac {1}{2}}}\Rightarrow \text{d}\tilde {x}_{12}=x_{11}(1-x_{11})\text{d}\tilde {w}\). This makes the exponents of x 11 and (1 − x 11) equal to α − p + 1 = α − 1 and β − p + 1 = β − 1, respectively. Now, on integrating out y, we have

$$\displaystyle \begin{aligned} \int_{y=0}^by^{\alpha-2}(b-y)^{\beta-2}\text{d}y=b^{\alpha+\beta-3}\frac{\varGamma(\alpha-1)\varGamma(\beta-1)}{\varGamma(\alpha+\beta-2)},\ \Re(\alpha)>1,\ \Re(\beta)>1. \end{aligned}$$
(iii)

Integrating out \(\tilde {w}\), we have the following:

$$\displaystyle \begin{aligned} \int_{\tilde{w}}(1-\tilde{w}\tilde{w}^{*})^{\alpha+\beta-3}\text{d}\tilde{w}=\pi\frac{\varGamma(\alpha+\beta-2)}{\varGamma(\alpha+\beta-1)}. \end{aligned}$$
(iv)

This integral is evaluated by writing \(z=\tilde {w}\tilde {w}^{*}\). Then, it follows from Theorem 4.2a.3 that

$$\displaystyle \begin{aligned}\text{d}\tilde{w}=\frac{\pi^{p-1}}{\varGamma(p-1)}z^{(p-1)-1}\text{d}z=\pi \text{d}z\mbox{ for }p=2.\end{aligned}$$

Now, collecting all relevant factors from (i) to (iv), the required representation of the initial integral, denoted by δ, is obtained:

$$\displaystyle \begin{aligned}\delta=\pi\frac{\varGamma(\alpha)\varGamma(\alpha-1)\varGamma(\beta)\varGamma(\beta-1)}{\varGamma(\alpha+\beta)\varGamma(\alpha+\beta-1)} =\frac{\tilde{\varGamma}_2(\alpha)\tilde{\varGamma}_2(\beta)}{\tilde{\varGamma}_2(\alpha+\beta)}=\tilde{B}_2(\alpha,\beta) \end{aligned}$$

whenever \(\Re (\alpha )>1\) and \(\Re (\beta )>1\). This completes the computations.
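The result can also be checked by brute force in a special case. When α = β = 2, the exponents α − p and β − p vanish, so the integral is simply the volume of the set \(\{\tilde{X}: O<\tilde{X}<I\}\) in the four real coordinates x 11, x 22, \(\Re(\tilde{x}_{12})\), \(\Im(\tilde{x}_{12})\), and the formula gives \(\tilde{B}_2(2,2)=\pi\varGamma(2)\varGamma(1)\varGamma(2)\varGamma(1)/(\varGamma(4)\varGamma(3))=\pi/12\). A minimal Monte Carlo sketch, assuming numpy is available:

```python
# Sketch: Monte Carlo estimate of the volume of {X~ : O < X~ < I} for a
# 2x2 Hermitian matrix; the target value is B_2~(2,2) = pi/12 = 0.26179...
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x11 = rng.uniform(0.0, 1.0, n)
x22 = rng.uniform(0.0, 1.0, n)
re12 = rng.uniform(-0.5, 0.5, n)   # on the region |x12|^2 < 1/4,
im12 = rng.uniform(-0.5, 0.5, n)   # so this unit-volume box encloses it
m12 = re12**2 + im12**2            # x12 times its conjugate

# X~ > O  <=>  x11 > 0 and x11*x22 - |x12|^2 > 0 (leading minors);
# I - X~ > O is the analogous condition on (1 - x11, 1 - x22).
inside = (x11 * x22 > m12) & ((1 - x11) * (1 - x22) > m12)
print(inside.mean(), np.pi / 12)   # the two values are close
```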

5.3.3. General partitions, real case

In Sect. 5.3.2, we have considered integrating one variable at a time by suitably partitioning the matrices. Would it also be possible to have a general partitioning and integrate a block of variables at a time, rather than integrating out individual variables? We will consider the real matrix-variate gamma integral first. Let the p × p positive definite matrix X be partitioned as follows:

$$\displaystyle \begin{aligned}X=\begin{bmatrix}X_{11}&X_{12}\\ X_{21}&X_{22}\end{bmatrix}\end{aligned}$$

where X 11 is p 1 × p 1 and X 22 is p 2 × p 2, so that X 12 is p 1 × p 2 with \(X_{21}=X_{12}^{\prime }\) and p 1 + p 2 = p. Without any loss of generality, let us assume that p 1 ≥ p 2. The determinant can be partitioned as follows:

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{p+1}{2}}&=|X_{11}|{}^{\alpha-\frac{p+1}{2}}|X_{22}-X_{21}X_{11}^{-1}X_{12}|{}^{\alpha-\frac{p+1}{2}}\\ &=|X_{11}|{}^{\alpha-\frac{p+1}{2}}|X_{22}|{}^{\alpha-\frac{p+1}{2}}|I-X_{22}^{-\frac{1}{2}}X_{21}X_{11}^{-1}X_{12} X_{22}^{-\frac{1}{2}}|{}^{\alpha-\frac{p+1}{2}}. \end{aligned} $$

Letting

$$\displaystyle \begin{aligned}Y=X_{22}^{-\frac{1}{2}}X_{21}X_{11}^{-\frac{1}{2}}\Rightarrow \text{d}Y=|X_{22}|{}^{-\frac{p_1}{2}}|X_{11}|{}^{-\frac{p_2}{2}}\text{d}X_{21}\end{aligned}$$

for fixed X 11 and X 22 by making use of Theorem 1.6.4 of Chap. 1 or Theorem 1.18 of Mathai (1997),

$$\displaystyle \begin{aligned}|X|{}^{\alpha-\frac{p+1}{2}}\text{d}X_{21}=|X_{11}|{}^{\alpha+\frac{p_2}{2}-\frac{p+1}{2}}|X_{22}|{}^{\alpha+\frac{p_1}{2}-\frac{p+1}{2}} |I-YY'|{}^{\alpha-\frac{p+1}{2}}\text{d}Y.\end{aligned}$$

Letting S = YY′ and integrating out over the Stiefel manifold, we have

$$\displaystyle \begin{aligned}\text{d}Y=\frac{\pi^{\frac{p_1p_2}{2}}}{\varGamma_{p_2}(\frac{p_1}{2})}|S|{}^{\frac{p_1}{2}-\frac{p_2+1}{2}}\text{d}S; \end{aligned}$$

refer to Theorem 2.16 and Remark 2.13 of Mathai (1997) or Theorem 4.2.3 of Chap. 4. Now, the integral over S gives

$$\displaystyle \begin{aligned}\int_{O<S<I}|S|{}^{\frac{p_1}{2}-\frac{p_2+1}{2}}|I-S|{}^{\alpha-\frac{p_1}{2}-\frac{p_2+1}{2}}\text{d}S=\frac{\varGamma_{p_2}(\frac{p_1}{2})\varGamma_{p_2}(\alpha-\frac{p_1}{2})}{\varGamma_{p_2}(\alpha)}, \end{aligned}$$

for \(\Re (\alpha )>\frac {p-1}{2}\). Collecting all the factors, we have

$$\displaystyle \begin{aligned}|X_{11}|{}^{\alpha-\frac{p_1+1}{2}}|X_{22}|{}^{\alpha-\frac{p_2+1}{2}}\pi^{\frac{p_1p_2}{2}}\frac{\varGamma_{p_2}(\alpha -\frac{p_1}{2})}{\varGamma_{p_2}(\alpha)}. \end{aligned}$$

One can observe from this result that the original determinant splits into functions of X 11 and X 22. This also shows that if we are considering a real matrix-variate gamma density, then the diagonal blocks X 11 and X 22 are statistically independently distributed, where X 11 will have a p 1-variate gamma distribution and X 22, a p 2-variate gamma distribution. Note that tr(X) = tr(X 11) + tr(X 22) and hence, the integral over X 22 gives \(\varGamma _{p_2}(\alpha )\) and the integral over X 11, \(\varGamma _{p_1}(\alpha )\). Thus, the total integral is available as

$$\displaystyle \begin{aligned}\varGamma_{p_1}(\alpha)\varGamma_{p_2}(\alpha)\pi^{\frac{p_1p_2}{2}}\frac{\varGamma_{p_2}(\alpha-\frac{p_1}{2})} {\varGamma_{p_2}(\alpha)} =\varGamma_p(\alpha)\end{aligned}$$

since \(\pi ^{\frac {p_1p_2}{2}}\varGamma _{p_1}(\alpha )\varGamma _{p_2}(\alpha -\frac {p_1}{2})=\varGamma _p(\alpha )\).

Hence, it is seen that instead of integrating out variables one at a time, we could have also integrated out blocks of variables at a time and verified the result. A similar procedure works for real matrix-variate type-1 and type-2 beta distributions, as well as the matrix-variate gamma and type-1 and type-2 beta distributions in the complex domain.
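The identity \(\pi ^{\frac {p_1p_2}{2}}\varGamma _{p_1}(\alpha )\varGamma _{p_2}(\alpha -\frac {p_1}{2})=\varGamma _p(\alpha )\) invoked above is itself easily verified numerically; the following sketch assumes mpmath, with gamma_p our own helper implementing the definition of Γ p(α):

```python
# Sketch: check pi^{p1 p2/2} Gamma_{p1}(a) Gamma_{p2}(a - p1/2) = Gamma_p(a),
# where Gamma_p(a) = pi^{p(p-1)/4} prod_{j=1}^{p} Gamma(a - (j-1)/2).
from mpmath import mp, mpf, pi, gamma

mp.dps = 30

def gamma_p(p, a):
    """Real matrix-variate gamma function (helper, our own name)."""
    val = pi ** (mpf(p) * (p - 1) / 4)
    for j in range(1, p + 1):
        val *= gamma(a - mpf(j - 1) / 2)
    return val

p1, p2, a = 3, 2, mpf("4.7")       # arbitrary test values
p = p1 + p2
lhs = pi ** (mpf(p1) * p2 / 2) * gamma_p(p1, a) * gamma_p(p2, a - mpf(p1) / 2)
print(lhs, gamma_p(p, a))          # the two values agree
```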

5.3.4. Methods avoiding integration over the Stiefel manifold

The general method of partitioning matrices previously described involves the integration over the Stiefel manifold as an intermediate step and relies on Theorem 4.2.3. We will consider another procedure whereby integration over the Stiefel manifold is not required. Let us consider the real gamma case first. Again, we begin with the decomposition

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{p+1}{2}}=|X_{11}|{}^{\alpha-\frac{p+1}{2}}|X_{22}-X_{21}X_{11}^{-1}X_{12}|{}^{\alpha-\frac{p+1}{2}}. {} \end{aligned} $$
(5.3.8)

Instead of integrating out X 21 or X 12, let us integrate out X 22. Let X 11 be a p 1 × p 1 matrix and X 22 be a p 2 × p 2 matrix, with p 1 + p 2 = p. In the above partitioning, we require that X 11 be nonsingular. However, when X is positive definite, both X 11 and X 22 will be positive definite, and thereby nonsingular. From the second factor in (5.3.8), \(X_{22}>X_{21}X_{11}^{-1}X_{12}\) as \(X_{22}-X_{21}X_{11}^{-1}X_{12}\) is positive definite. We will attempt to integrate out X 22 first. Let \(U=X_{22}-X_{21}X_{11}^{-1}X_{12}\) so that dU = dX 22 for fixed X 11 and X 12. Since tr(X) = tr(X 11) + tr(X 22), we have

$$\displaystyle \begin{aligned}\text{e}^{-\text{tr}(X_{22})}=\text{e}^{-\text{tr}(U)-\text{tr}(X_{21}X_{11}^{-1}X_{12})}. \end{aligned}$$

On integrating out U, we obtain

$$\displaystyle \begin{aligned}\int_{U>O}|U|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(U)}\text{d}U=\varGamma_{p_2}(\alpha-\frac{p_1}{2}),~\Re(\alpha)>\frac{p-1}{2}\end{aligned}$$

since \(\alpha -\frac {p+1}{2}=\alpha -\frac {p_1}{2}-\frac {p_2+1}{2}\). Letting

$$\displaystyle \begin{aligned}Y=X_{21}X_{11}^{-\frac{1}{2}}\Rightarrow \text{d}Y=|X_{11}|{}^{-\frac{p_2}{2}}\text{d}X_{21}\end{aligned}$$

for fixed X 11 (Theorem 1.6.1), we have

$$\displaystyle \begin{aligned}\int_{X_{21}}\text{e}^{-\text{tr}(X_{21}X_{11}^{-1}X_{12})}\text{d}X_{21}=|X_{11}|{}^{\frac{p_2}{2}}\int_{Y}\text{e}^{-\text{tr}(YY')}\text{d}Y. \end{aligned}$$

But tr(YY′) is the sum of the squares of the p 1 p 2 elements of Y  and each integral is of the form \(\int _{-\infty }^{\infty }\text{e}^{-z^2}\text{d}z=\sqrt {\pi }\). Hence,

$$\displaystyle \begin{aligned}\int_{Y}\text{e}^{-\text{tr}(YY')}\text{d}Y=\pi^{\frac{p_1p_2}{2}}. \end{aligned}$$

We may now integrate out X 11:

$$\displaystyle \begin{aligned} \int_{X_{11}>O}|X_{11}|{}^{\alpha+\frac{p_2}{2}-\frac{p+1}{2}}&\text{e}^{-\text{tr}(X_{11})}\text{d}X_{11}\\ &=\int_{X_{11}>O}|X_{11}|{}^{\alpha-\frac{p_1+1}{2}}\text{e}^{-\text{tr}(X_{11})}\text{d}X_{11}\\ &=\varGamma_{p_1}(\alpha). \end{aligned} $$

Thus, we have the following factors:

$$\displaystyle \begin{aligned}\pi^{\frac{p_1p_2}{2}}\varGamma_{p_2}(\alpha-{p_1}/{2})\varGamma_{p_1}(\alpha)=\varGamma_p(\alpha) \end{aligned}$$

since

$$\displaystyle \begin{aligned}\frac{p_1(p_1-1)}{4}+\frac{p_2(p_2-1)}{4}+\frac{p_1p_2}{2}=\frac{p(p-1)}{4},~\ p=p_1+p_2, \end{aligned}$$

and

$$\displaystyle \begin{aligned} \varGamma_{p_1}(\alpha)\varGamma_{p_2}(\alpha-{p_1}/{2})&=\pi^{\frac{p_1(p_1-1)}{4}+\frac{p_2(p_2-1)}{4}}\varGamma(\alpha)\varGamma(\alpha-{1}/{2}) \cdots \varGamma(\alpha-{(p_1-1)}/{2})\\ &\ \ \ \ \times\varGamma(\alpha-{p_1}/{2})\cdots \varGamma(\alpha-{(p_1+p_2-1)}/{2}). \end{aligned} $$

Hence the result. This procedure avoids integration over the Stiefel manifold and does not require that p 1 ≥ p 2. We could have integrated out X 11 first, if needed. In that case, we would have used the following expansion:

$$\displaystyle \begin{aligned}|X|{}^{\alpha-\frac{p+1}{2}}=|X_{22}|{}^{\alpha-\frac{p+1}{2}}|X_{11}-X_{12}X_{22}^{-1}X_{21}|{}^{\alpha-\frac{p+1}{2}}. \end{aligned}$$

We would have then proceeded as before by integrating out X 11 first and would have ended up with

$$\displaystyle \begin{aligned}\pi^{\frac{p_1p_2}{2}}\varGamma_{p_1}(\alpha-{p_2}/{2})\varGamma_{p_2}(\alpha)=\varGamma_p(\alpha),\ \, p=p_1+p_2.\end{aligned}$$

Note 5.3.1: If we are considering a real matrix-variate gamma density, such as the Wishart density, then from the above procedure, observe that after integrating out X 22, the only factor containing X 21 is the exponential function, which has the structure of a matrix-variate Gaussian density. Hence, for a given X 11, X 21 is matrix-variate Gaussian distributed. Similarly, for a given X 22, X 12 is matrix-variate Gaussian distributed. Further, the diagonal blocks X 11 and X 22 are independently distributed.
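Note 5.3.1 can be illustrated empirically in the Wishart special case discussed in Sect. 5.5, where X arises as a sum of Gaussian outer products. The sketch below, assuming numpy, merely confirms that log|X 11| and log|X 22| are uncorrelated, which is consistent with (though of course weaker than) the asserted independence:

```python
# Sketch: the diagonal blocks of a Wishart matrix are independent;
# here we check that log|X11| and log|X22| are empirically uncorrelated.
import numpy as np

rng = np.random.default_rng(1)
p, p1, m, n = 4, 2, 8, 20000       # block sizes p1 + p2 = p, m dof
logd11, logd22 = [], []
for _ in range(n):
    G = rng.standard_normal((p, m))
    X = G @ G.T                    # X ~ Wishart_p(m, I)
    logd11.append(np.linalg.slogdet(X[:p1, :p1])[1])
    logd22.append(np.linalg.slogdet(X[p1:, p1:])[1])
print(np.corrcoef(logd11, logd22)[0, 1])   # close to 0
```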

The same procedure also applies for the evaluation of the gamma integrals in the complex domain. Since the steps are parallel, they will not be detailed here.

5.3.5. Arbitrary moments of the determinants, real gamma and beta matrices

Let the p × p real positive definite matrix X have a real matrix-variate gamma density with the parameters (α, B > O). Then for an arbitrary h, we can evaluate the h-th moment of the determinant of X with the help of the matrix-variate gamma integral, namely,

$$\displaystyle \begin{aligned} \int_{X>O}|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)}\text{d}X=|B|{}^{-\alpha}\varGamma_p(\alpha). \end{aligned}$$
(i)

By making use of (i) and of the associated normalizing constant, we can evaluate the h-th moment of the determinant in a real matrix-variate gamma density with the parameters (α, B > O). Let u 1 = |X|. Then, the moments of u 1 can be obtained by integrating over the density of X:

$$\displaystyle \begin{aligned} E[u_1]^h&=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}\int_{X>O}u_1^h|X|{}^{\alpha-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)}\text{d}X\\ &=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}\int_{X>O}|X|{}^{\alpha+h-\frac{p+1}{2}}\text{e}^{-\text{tr}(BX)}\text{d}X\\ &=\frac{|B|{}^{\alpha}}{\varGamma_p(\alpha)}\varGamma_p(\alpha+h)|B|{}^{-(\alpha+h)},~\Re(\alpha+h)>\frac{p-1}{2}.\end{aligned} $$

Thus,

$$\displaystyle \begin{aligned}E[u_1]^h=|B|{}^{-h}\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)},~ \Re(\alpha+h)>\frac{p-1}{2}.\end{aligned}$$

This is evaluated by observing that when E[u 1]h is taken, α is replaced by α + h in the integrand and hence, the answer is obtained from equation (i). The same procedure enables one to evaluate the h-th moment of the determinants of type-1 beta and type-2 beta matrices. Let Y  be a p × p real positive definite matrix having a real matrix-variate type-1 beta density with the parameters (α, β) and u 2 = |Y |. Then, the h-th moment of u 2 is obtained as follows:

$$\displaystyle \begin{aligned} E[u_2]^h&=\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha)\varGamma_p(\beta)}\int_{O<Y<I}u_2^h|Y|{}^{\alpha-\frac{p+1}{2}}|I-Y|{}^{\beta-\frac{p+1}{2}}\text{d}Y\\ &=\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha)\varGamma_p(\beta)}\int_{O<Y<I}|Y|{}^{\alpha+h-\frac{p+1}{2}}|I-Y|{}^{\beta-\frac{p+1}{2}}\text{d}Y\\ &=\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha)\varGamma_p(\beta)}\frac{\varGamma_p(\alpha+h)\varGamma_p(\beta)} {\varGamma_p(\alpha+\beta+h)},~\Re(\alpha+h)>\frac{p-1}{2},\\ &=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha+\beta+h)},~ \Re(\alpha+h)>\frac{p-1}{2}.\end{aligned} $$

In a similar manner, let u 3 = |Z| where Z has a p × p real matrix-variate type-2 beta density with the parameters (α, β). In this case, take α + β = (α + h) + (β − h), replacing α by α + h and β by β − h. Then, considering the normalizing constant of a real matrix-variate type-2 beta density, we obtain the h-th moment of u 3 as follows:

$$\displaystyle \begin{aligned}E[u_3]^h=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\beta-h)}{\varGamma_p(\beta)},~ \Re(\alpha+h)>\frac{p-1}{2},~\Re(\beta-h)>\frac{p-1}{2}.\end{aligned}$$

Relatively few moments will exist in this case, as \(\Re (\alpha +h)>\frac {p-1}{2}\) implies that \(\Re (h)>-\Re (\alpha )+\frac {p-1}{2}\) and \(\Re (\beta -h)>\frac {p-1}{2}\) means that \(\Re (h)<\Re (\beta )-\frac {p-1}{2}\). Accordingly, only moments in the range \(-\Re (\alpha )+\frac {p-1}{2}<\Re (h)<\Re (\beta )-\frac {p-1}{2}\) will exist. We can summarize the above results as follows: When X is distributed as a real p × p matrix-variate gamma with the parameters (α, B > O),

$$\displaystyle \begin{aligned} E|X|{}^h=\frac{|B|{}^{-h}\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)},~ \Re(\alpha+h)>\frac{p-1}{2}. {} \end{aligned} $$
(5.3.9)

When Y  has a p × p real matrix-variate type-1 beta density with the parameters (α, β) and if u 2 = |Y | then

$$\displaystyle \begin{aligned} E[u_2]^h=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha+\beta+h)},~ \Re(\alpha+h)>\frac{p-1}{2}. {} \end{aligned} $$
(5.3.10)

When the p × p real positive definite matrix Z has a real matrix-variate type-2 beta density with the parameters (α, β), then letting u 3 = |Z|,

$$\displaystyle \begin{aligned} E[u_3]^h=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\beta-h)}{\varGamma_p(\beta)},~ -\Re(\alpha)+\frac{p-1}{2}<\Re(h)<\Re(\beta)-\frac{p-1}{2}. {} \end{aligned} $$
(5.3.11)
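As an illustration, (5.3.9) can be checked by simulation in the Wishart special case α = m∕2, B = ½Σ^{−1} of Sect. 5.5: taking Σ = I∕2 gives B = I, so that E|X|^h = Γ p(α + h)∕Γ p(α). A sketch assuming numpy and mpmath, with gamma_p the helper used earlier:

```python
# Sketch: Monte Carlo check of E|X|^h = Gamma_p(alpha+h)/Gamma_p(alpha)
# for the Wishart-type gamma with alpha = m/2 and B = I (Sigma = I/2).
import numpy as np
from mpmath import mpf, pi, gamma

def gamma_p(p, a):
    val = pi ** (mpf(p) * (p - 1) / 4)
    for j in range(1, p + 1):
        val *= gamma(a - mpf(j - 1) / 2)
    return val

p, m, h, n = 3, 6, 1, 200000
alpha = mpf(m) / 2
rng = np.random.default_rng(2)
dets = np.empty(n)
for i in range(n):
    G = rng.standard_normal((p, m)) / np.sqrt(2.0)   # columns ~ N_p(0, I/2)
    dets[i] = np.linalg.det(G @ G.T)                 # |X|, X ~ gamma(alpha, I)
print(dets.mean(), gamma_p(p, alpha + h) / gamma_p(p, alpha))  # both ~ 15
```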

Let us examine (5.3.9):

$$\displaystyle \begin{aligned} E|X|{}^h&=|B|{}^{-h}\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\\ &=|B|{}^{-h}\frac{\varGamma(\alpha+h)}{\varGamma(\alpha)}\frac{\varGamma(\alpha-\frac{1}{2}+h)}{\varGamma(\alpha-\frac{1}{2})} \ldots \frac{\varGamma(\alpha-\frac{p-1}{2}+h)}{\varGamma(\alpha-\frac{p-1}{2})}\\ &=E[x_1^h]E[x_2^h]\cdots E[x_p^h]\end{aligned} $$

where x j is a real scalar gamma random variable with shape parameter \(\alpha -\frac {j-1}{2}\) and scale parameter λ j > 0, the λ j's, j = 1, …, p, being the eigenvalues of B > O; this follows by observing that the determinant and the trace of a matrix are, respectively, the product and the sum of its eigenvalues λ 1, …, λ p. Further, x 1, …, x p are independently distributed. Hence, structurally, we have the following representation:

$$\displaystyle \begin{aligned} |X|=\prod_{j=1}^px_j {} \end{aligned} $$
(5.3.12)

where x j has the density

$$\displaystyle \begin{aligned}f_{1j}(x_j)=\frac{\lambda_j^{\alpha-\frac{j-1}{2}}}{\varGamma(\alpha-\frac{j-1}{2})}x_j^{\alpha-\frac{j-1}{2}-1}\text{e}^{-\lambda_jx_j},\ \ 0\le x_j<\infty, \end{aligned}$$

for \(\Re (\alpha )>\frac {j-1}{2},~ \lambda _j>0\) and zero otherwise. Similarly, when the p × p real positive definite matrix Y  has a real matrix-variate type-1 beta density with the parameters (α, β), the determinant, |Y |, has the structural representation

$$\displaystyle \begin{aligned} |Y|=\prod_{j=1}^py_j {} \end{aligned} $$
(5.3.13)

where y j is a real scalar type-1 beta random variable with the parameter \((\alpha -\frac {j-1}{2},~\beta )\) for j = 1, …, p. When the p × p real positive definite matrix Z has a real matrix-variate type-2 beta density, then |Z|, the determinant of Z, has the following structural representation:

$$\displaystyle \begin{aligned} |Z|=\prod_{j=1}^pz_j {} \end{aligned} $$
(5.3.14)

where z j has a real scalar type-2 beta density with the parameters \((\alpha -\frac {j-1}{2},~\beta -\frac {j-1}{2})\) for j = 1, …, p.
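The representations (5.3.12)–(5.3.14) are statements about distributions, not merely about moments. As a quick empirical check of (5.3.12), the following sketch (numpy and scipy assumed; the Wishart-type setting with α = m∕2 and B = I is as in the previous sketch, so that all λ j = 1) compares simulated determinants with products of independent scalar gamma variables through a two-sample Kolmogorov–Smirnov test:

```python
# Sketch: |X| and the product of independent gammas with shapes
# alpha - (j-1)/2 (unit scale) should have the same distribution.
import numpy as np
from scipy import stats

p, m, n = 3, 6, 50000
alpha = m / 2
rng = np.random.default_rng(3)

det_samples = np.empty(n)
for i in range(n):
    G = rng.standard_normal((p, m)) / np.sqrt(2.0)
    det_samples[i] = np.linalg.det(G @ G.T)

prod_samples = np.ones(n)
for j in range(1, p + 1):
    prod_samples *= rng.gamma(alpha - (j - 1) / 2, 1.0, size=n)

print(stats.ks_2samp(det_samples, prod_samples))  # p-value should not be small
```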

Example 5.3.2

Consider a real 2 × 2 matrix X having a real matrix-variate distribution. Derive the density of the determinant |X| if X has (a) a gamma distribution with the parameters (α, B = I); (b) a real type-1 beta distribution with the parameters \((\alpha =\frac {3}{2},~\beta =\frac {3}{2})\); (c) a real type-2 beta distribution with the parameters \((\alpha =\frac {3}{2},~\beta =\frac {3}{2})\).

Solution 5.3.2

We will derive the density in these three cases by using three different methods to illustrate the possibility of making use of various approaches for solving such problems. (a) Let u 1 = |X| in the gamma case. Then for an arbitrary h,

$$\displaystyle \begin{aligned}E[u_1^h]=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}=\frac{\varGamma(\alpha+h)\varGamma(\alpha+h-\frac{1}{2})}{\varGamma(\alpha)\varGamma(\alpha-\frac{1}{2})},~ \Re(\alpha)>\frac{1}{2}. \end{aligned}$$

Since the gammas differ by \(\frac {1}{2}\), they can be combined by utilizing the following identity:

$$\displaystyle \begin{aligned} \varGamma(mz)=(2\pi)^{\frac{1-m}{2}}m^{mz-\frac{1}{2}}\varGamma(z)\varGamma\big(z+\frac{1}{m}\big)\cdots \varGamma\big(z+\frac{m-1}{m}\big),\ \ m=1,2,\ldots, {} \end{aligned} $$
(5.3.15)

which is the multiplication formula for gamma functions. For m = 2, we have the duplication formula:

$$\displaystyle \begin{aligned}\varGamma(2z)=(2\pi)^{-\frac{1}{2}}2^{2z-\frac{1}{2}}\varGamma(z)\varGamma(z+{1}/{2}). \end{aligned}$$

Thus,

$$\displaystyle \begin{aligned}\varGamma(z)\varGamma(z+{1}/{2})=\pi^{\frac{1}{2}}2^{1-2z}\varGamma(2z). \end{aligned}$$

Now, by taking \(z=\alpha -\frac {1}{2}+h\) in the numerator and \(z=\alpha -\frac {1}{2}\) in the denominator, we can write

$$\displaystyle \begin{aligned}E[u_1^h]=\frac{\varGamma(\alpha+h)\varGamma(\alpha+h-\frac{1}{2})}{\varGamma(\alpha)\varGamma(\alpha-\frac{1}{2})} =\frac{\varGamma(2\alpha-1+2h)}{\varGamma(2\alpha-1)}2^{-2h}. \end{aligned}$$

Accordingly,

$$\displaystyle \begin{aligned}E[(4u_1)]^h=\frac{\varGamma(2\alpha-1+2h)}{\varGamma(2\alpha-1)}\Rightarrow E[2u_1^{\frac{1}{2}}]^{2h}=\frac{\varGamma(2\alpha-1+2h)}{\varGamma(2\alpha-1)}. \end{aligned}$$

This shows that \(v=2u_1^{\frac {1}{2}}\) has a real scalar gamma distribution with the parameters (2α − 1, 1) whose density is

$$\displaystyle \begin{aligned} f(v)\,\text{d}v&=\frac{v^{(2\alpha-1)-1}}{\varGamma(2\alpha-1)}\text{e}^{-v}\text{d}v=\frac{(2u_1^{\frac{1}{2}})^{{2\alpha-2}}}{\varGamma(2\alpha-1)}\text{ e}^{-2u_1^{\frac{1}{2}}}\text{d}(2u_1^{\frac{1}{2}})\\ &=\frac{2^{2\alpha-2}u_1^{\alpha-\frac{1}{2}-1}}{\varGamma(2\alpha-1)}\text{e}^{-2u_1^{\frac{1}{2}}}\text{d}u_1.\end{aligned} $$

Hence the density of u 1, denoted by f 1(u 1), is the following:

$$\displaystyle \begin{aligned}f_1(u_1)=\frac{2^{2\alpha-2}u_1^{\alpha-\frac{1}{2}-1}}{\varGamma(2\alpha-1)}\text{e}^{-2u_1^{\frac{1}{2}}},~0\le u_1<\infty \end{aligned}$$

and zero elsewhere. It can easily be verified that f 1(u 1) is a density.
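A quick numerical confirmation of part (a) — a sketch assuming mpmath, with arbitrary test values of α and h:

```python
# Sketch: f_1 integrates to one and reproduces
# E[u_1^h] = Gamma(alpha+h) Gamma(alpha+h-1/2) / (Gamma(alpha) Gamma(alpha-1/2)).
from mpmath import mp, mpf, gamma, exp, sqrt, quad, inf

mp.dps = 25
alpha, h = mpf("2.3"), 2

def f1(u):
    return 2**(2*alpha - 2) * u**(alpha - mpf(3)/2) * exp(-2*sqrt(u)) \
        / gamma(2*alpha - 1)

print(quad(f1, [0, inf]))                            # ~ 1
lhs = quad(lambda u: u**h * f1(u), [0, inf])
rhs = gamma(alpha + h) * gamma(alpha + h - mpf(1)/2) \
    / (gamma(alpha) * gamma(alpha - mpf(1)/2))
print(lhs, rhs)                                      # the two values agree
```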

(b) Let u 2 = |X|. Then for an arbitrary h, \(\alpha =\frac {3}{2}\) and \(\beta =\frac {3}{2}\),

$$\displaystyle \begin{aligned} E[u_2^h]&=\frac{\varGamma_p(\alpha+h)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\alpha+\beta)}{\varGamma_p(\alpha+\beta+h)}\\ &=\frac{\varGamma(3)\varGamma(\frac{5}{2})}{\varGamma(\frac{3}{2})\varGamma(\frac{2}{2})}\frac{\varGamma(\frac{3}{2}+h)\varGamma(1+h)}{\varGamma(3+h)\varGamma(\frac{5}{2}+h)}\\ &=3\Big\{\frac{1}{(2+h)(1+h)(\frac{3}{2}+h)}\Big\}=3\Big\{\frac{2}{2+h}+\frac{2}{1+h}-\frac{4}{\frac{3}{2}+h}\Big\},\end{aligned} $$

the last expression resulting from an application of the partial fraction technique. These are the h-th moments of the distribution of u 2, whose density, which is

$$\displaystyle \begin{aligned}f_2(u_2)=6\{1+u_2-2u_2^{\frac{1}{2}}\},~0\le u_2\le 1, \end{aligned}$$

and zero elsewhere, is readily seen to be bona fide.
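Part (b) can be confirmed in the same manner (mpmath assumed; the test value of h is arbitrary):

```python
# Sketch: the h-th moment of f_2(u) = 6(1 + u - 2 sqrt(u)) on [0, 1]
# equals 3 / ((2 + h)(1 + h)(3/2 + h)).
from mpmath import mp, mpf, sqrt, quad

mp.dps = 25
h = mpf("1.7")
lhs = quad(lambda u: u**h * 6 * (1 + u - 2*sqrt(u)), [0, 1])
rhs = 3 / ((2 + h) * (1 + h) * (mpf(3)/2 + h))
print(lhs, rhs)   # the two values agree
```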

(c) Let the density of u 3 = |X| be denoted by f 3(u 3). The Mellin transform of f 3(u 3), with Mellin parameter s, is

$$\displaystyle \begin{aligned} E[u_3^{s-1}]&=\frac{\varGamma_p(\alpha+s-1)}{\varGamma_p(\alpha)}\frac{\varGamma_p(\beta-s+1)}{\varGamma_p(\beta)} =\frac{\varGamma_2(\frac{3}{2}+s-1)}{\varGamma_2(\frac{3}{2})} \frac{\varGamma_2(\frac{3}{2}-s+1)}{\varGamma_2(\frac{3}{2})}\\ &=\frac{1}{[\varGamma({3}/{2})\varGamma(1)]^2}\varGamma({1}/{2}+s)\varGamma(s)\varGamma({5}/{2}-s)\varGamma(2-s),\end{aligned} $$

the corresponding density being available by taking the inverse Mellin transform, namely,

$$\displaystyle \begin{aligned} f_3(u_3)=\frac{4}{\pi}\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\varGamma(s)\varGamma(s+{1}/{2})\varGamma({5}/{2}-s)\varGamma(2-s)u_3^{-s}\text{d}s \end{aligned}$$
(i)

where \(i=\sqrt {(-1)}\) and c in the integration contour is such that 0 < c < 2. The integral in (i) is available as the sum of the residues at the poles of \(\varGamma (s)\varGamma (s+\frac {1}{2})\) for 0 ≤ u 3 ≤ 1 and the sum of the residues at the poles of \(\varGamma (2-s)\varGamma (\frac {5}{2}-s)\) for 1 < u 3 < ∞. We can also combine Γ(s) and \(\varGamma (s+\frac {1}{2})\) as well as Γ(2 − s) and \(\varGamma (\frac {5}{2}-s)\) by making use of the duplication formula for gamma functions. We will then be able to identify the functions in each of the sectors, 0 ≤ u 3 ≤ 1 and 1 < u 3 < ∞. These will be functions of \(u_3^{\frac {1}{2}}\), as in the case (a). In order to illustrate the method relying on the inverse Mellin transform, we will evaluate the density f 3(u 3) as a sum of residues. The poles of \(\varGamma (s)\varGamma (s+\frac {1}{2})\) are simple and hence two sums of residues are obtained for 0 ≤ u 3 ≤ 1. The poles of Γ(s) occur at s = −ν, ν = 0, 1, …, and those of \(\varGamma (s+\frac {1}{2})\) occur at \(s=-\frac {1}{2}-\nu ,~\nu =0,1,\ldots \). The residues and the sum thereof will be evaluated with the help of the following two lemmas.

Lemma 5.3.1

Consider a function \(\varGamma (\gamma +s)\phi (s)u^{-s}\) whose poles are simple. The residue at the pole s = −γ − ν, ν = 0, 1, …, denoted by R ν , is given by

$$\displaystyle \begin{aligned}R_{\nu}=\frac{(-1)^{\nu}}{\nu!}\phi(-\gamma-\nu)u^{\gamma+\nu}.\end{aligned}$$

Lemma 5.3.2

When Γ(δ) and Γ(δ − ν) are defined,

$$\displaystyle \begin{aligned}\varGamma(\delta-\nu)=\frac{(-1)^{\nu}\varGamma(\delta)}{(-\delta+1)_{\nu}}\end{aligned}$$

where, for example, (a)ν = a(a + 1)⋯(a + ν − 1), (a)0 = 1, a≠0, is the Pochhammer symbol.

Observe that Γ(α) is defined for all α≠0, −1, −2, …, and that an integral representation requires \(\Re (\alpha )>0\). As well, Γ(α + k) = Γ(α)(α)k, k = 1, 2, …. With the help of Lemmas 5.3.1 and 5.3.2, the sum of the residues at the poles of Γ(s) in the integral in (i), excluding the constant \(\frac {4}{\pi }\), is the following:

$$\displaystyle \begin{aligned} &\sum_{\nu=0}^{\infty}\frac{(-1)^{\nu}}{\nu!}\varGamma\Big(\frac{1}{2}-\nu\Big)\varGamma\Big(\frac{5}{2}+\nu\Big)\varGamma(2+\nu)u_3^{\nu}\\ &=\sum_{\nu=0}^{\infty}\frac{(-1)^{\nu}}{\nu!}\varGamma\Big(\frac{1}{2}\Big)\varGamma\Big(\frac{5}{2}\Big)\varGamma(2) \frac{(-1)^{\nu}}{(\frac{1}{2})_{\nu}}\Big(\frac{5}{2}\Big)_{\nu}(2)_{\nu}u_3^{\nu}\\ &=\frac{3}{4}\pi~{{}_2F_1}\Big(\frac{5}{2},2;\frac{1}{2};u_3\Big),~ 0\le u_3\le 1,\end{aligned} $$

where the 2 F 1(⋅) is Gauss’ hypergeometric function. The same procedure consisting of taking the sum of the residues at the poles \(s=-\frac {1}{2}-\nu ,~ \nu =0,1,\ldots , \) gives

$$\displaystyle \begin{aligned}-3\pi u_3^{\frac{1}{2}}~{{}_2F_1}\Big(3,\frac{5}{2};\frac{3}{2};u_3\Big),~0\le u_3\le 1. \end{aligned}$$

The inverse Mellin transform for the sector 1 < u 3 < ∞ is available as the sum of the residues at the poles of \(\varGamma (\frac {5}{2}-s)\) and Γ(2 − s), which occur at \(s=\frac {5}{2}+\nu \) and s = 2 + ν for ν = 0, 1, … . The sum of the residues at the poles of \(\varGamma (\frac {5}{2}-s)\) is the following:

$$\displaystyle \begin{aligned} &\sum_{\nu=0}^{\infty}\frac{(-1)^{\nu}}{\nu!}\varGamma\Big(\frac{5}{2}+\nu\Big)\varGamma(3+\nu)\varGamma\Big(\!-\frac{1}{2}-\nu\Big)u_3^{-\frac{5}{2}-\nu}\\ &=-3\pi u_3^{-\frac{5}{2}}~{{}_2F_1}\Big(\frac{5}{2},3;\frac{3}{2};\frac{1}{u_3}\Big),~1<u_3<\infty,\end{aligned} $$

and the sum of the residues at the poles of Γ(2 − s) is given by

$$\displaystyle \begin{aligned}\frac{3}{4}\pi u_3^{-2}~{{}_2F_1}\Big(2,\frac{5}{2};\frac{1}{2};\frac{1}{u_3}\Big),~1<u_3<\infty.\end{aligned}$$

Now, on combining all the hypergeometric series and multiplying the result by the constant \(\frac {4}{\pi },\) the final representation of the required density is obtained as

$$\displaystyle \begin{aligned}f_3(u_3)=\begin{cases}3~{{}_2F_1}(\frac{5}{2},2;\frac{1}{2};u_3)-12u_3^{\frac{1}{2}}{{}_2F_1}(3,\frac{5}{2};\frac{3}{2};u_3),\ 0\le u_3\le 1,\\ 3u_3^{-2}{{}_2F_1}(2,\frac{5}{2};\frac{1}{2};\frac{1}{u_3})-12u_3^{-\frac{5}{2}}{{}_2F_1}(\frac{5}{2},3;\frac{3}{2};\frac{1}{u_3}),\ 1<u_3<\infty. \end{cases}\end{aligned}$$

This completes the computations.
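Incidentally, the contour integral in (i) is a Meijer G-function in the notation of Sect. 5.4.1: since \(\varGamma(s)\varGamma(s+\frac{1}{2})\) corresponds to \(b=(0,\frac{1}{2})\) and \(\varGamma(\frac{5}{2}-s)\varGamma(2-s)\) to \(a=(-\frac{3}{2},-1)\), we have \(f_3(u_3)=\frac{4}{\pi}\,G^{2,2}_{2,2}\big[u_3\big\vert_{0,\,1/2}^{-3/2,\,-1}\big]\). The residue series can thus be checked against a direct numerical evaluation; a sketch assuming mpmath and its grouped-parameter convention for meijerg:

```python
# Sketch: compare the residue-series form of f_3 with (4/pi) times the
# Meijer G-function G^{2,2}_{2,2}[u | (-3/2, -1); (0, 1/2)].
from mpmath import mp, mpf, pi, sqrt, hyp2f1, meijerg

mp.dps = 25
H = mpf(1) / 2   # exact 1/2

def f3_G(u):
    return 4/pi * meijerg([[-3*H, -1], []], [[0, H], []], u)

def f3_series(u):   # the representation obtained above from the residues
    if u <= 1:
        return 3*hyp2f1(5*H, 2, H, u) - 12*sqrt(u)*hyp2f1(3, 5*H, 3*H, u)
    return 3*u**-2*hyp2f1(2, 5*H, H, 1/u) \
        - 12*u**(-5*H)*hyp2f1(5*H, 3, 3*H, 1/u)

for u in (mpf("0.3"), mpf("2.5")):   # one point in each sector
    print(f3_G(u), f3_series(u))     # the pairs agree
```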

5.3a.2. Arbitrary moments of the determinants in the complex case

In the complex matrix-variate case, one can consider the absolute value of the determinant, which will be real; however, the parameters will be different from those in the real case. For example, consider the complex matrix-variate gamma density. If \(\tilde {X}\) has a p × p complex matrix-variate gamma density with the parameters \((\alpha ,~\tilde {B}>O)\), then the h-th moment of the absolute value of the determinant of \(\tilde {X}\) is the following:

$$\displaystyle \begin{aligned} E[|\text{det}(\tilde{X})|]^h&=\frac{|\text{det}(\tilde{B})|{}^{-h}\tilde{\varGamma}_p(\alpha+h)}{\tilde{\varGamma}_p(\alpha)}\\ &=(\lambda_1\cdots \lambda_p)^{-h}\prod_{j=1}^p\frac{\varGamma(\alpha-(j-1)+h)}{\varGamma(\alpha-(j-1))} =\prod_{j=1}^pE[\tilde{x}_j]^h,\end{aligned} $$

that is, \(|\text{ det}(\tilde {X})|\) has the structural representation

$$\displaystyle \begin{aligned} |\text{det}(\tilde{X})|=\tilde{x}_1\tilde{x}_2\cdots \tilde{x}_p, {} \end{aligned} $$
(5.3a.8)

where \(\tilde {x}_j\) is a real scalar gamma random variable with the parameters (α − (j − 1), λ j), j = 1, …, p, and the \(\tilde {x}_j\)’s are independently distributed. Similarly, when \(\tilde {Y}\) is a p × p complex Hermitian positive definite matrix having a complex matrix-variate type-1 beta density with the parameters (α, β), the absolute value of the determinant of \(\tilde {Y}\), \(|{\text{det}(\tilde {Y})}|\), has the structural representation

$$\displaystyle \begin{aligned} |\text{det}(\tilde{Y})|=\prod_{j=1}^p\tilde{y}_j {} \end{aligned} $$
(5.3a.9)

where the \(\tilde {y}_j\)’s are independently distributed, \(\tilde {y}_j\) being a real scalar type-1 beta random variable with the parameters (α − (j − 1), β), j = 1, …, p. When \(\tilde {Z}\) is a p × p Hermitian positive definite matrix having a complex matrix-variate type-2 beta density with the parameters (α, β), then for arbitrary h, the h-th moment of the absolute value of the determinant is given by

$$\displaystyle \begin{aligned} E[|\text{det}(\tilde{Z})|]^h&=\frac{\tilde{\varGamma}_p(\alpha+h)}{\tilde{\varGamma}_p(\alpha)} \frac{\tilde{\varGamma}_p(\beta-h)}{\tilde{\varGamma}_p(\beta)}\\ &=\Big\{\prod_{j=1}^p\frac{\varGamma(\alpha-(j-1)+h)}{\varGamma(\alpha-(j-1))}\Big\}\Big\{\prod_{j=1}^p\frac{\varGamma(\beta-(j-1)-h)} {\varGamma(\beta-(j-1))}\Big\}\\ &=\prod_{j=1}^pE[\tilde{z}_j]^h,\end{aligned} $$

so that the absolute value of the determinant of \(\tilde {Z}\) has the following structural representation:

$$\displaystyle \begin{aligned} |\text{det}(\tilde{Z})|=\prod_{j=1}^p\tilde{z}_j {} \end{aligned} $$
(5.3a.10)

where the \(\tilde {z}_j\)’s are independently distributed real scalar type-2 beta random variables with the parameters (α − (j − 1), β − (j − 1)) for j = 1, …, p. Thus, in the real case, the determinant and, in the complex case, the absolute value of the determinant have structural representations in terms of products of independently distributed real scalar random variables. The following is the summary of what has been discussed so far:

$$\displaystyle \begin{array}{lll} \text{Distribution} &\text{Parameters, real case} &\text{Parameters, complex case}\\[4pt] \text{gamma} &(\alpha-\frac{j-1}{2},~\lambda_j) &(\alpha-(j-1),~\lambda_j)\\ \text{type-1 beta} &(\alpha-\frac{j-1}{2},~\beta) &(\alpha-(j-1),~\beta)\\ \text{type-2 beta} &(\alpha-\frac{j-1}{2},~\beta-\frac{j-1}{2}) &(\alpha-(j-1),~\beta-(j-1)) \end{array} $$

for j = 1, …, p. When we consider the determinant in the real case, the parameters differ by \(\frac {1}{2}\) whereas the parameters differ by 1 in the complex domain. Whether in the real or complex cases, the individual variables appearing in the structural representations are real scalar variables that are independently distributed.

Example 5.3a.2

Even when p = 2, some of the poles will be of order 2, since the gammas differ by integers in the complex case; when poles of order 2 or more are present, the series representation will contain logarithms as well as psi and zeta functions, so a general numerical example will not be attempted for such an instance. A simple illustrative example is considered instead. Let \(\tilde {X}\) be a 2 × 2 matrix having a complex matrix-variate type-1 beta distribution with the parameters (α = 2, β = 2). Evaluate the density of \(\tilde {u}=|\text{det}(\tilde {X})|\).

Solution 5.3a.2

Let us take the (s − 1)th moment of \(\tilde {u}\) which corresponds to the Mellin transform of the density of \(\tilde {u}\), with Mellin parameter s:

$$\displaystyle \begin{aligned} E[\tilde{u}^{s-1}]&=\frac{\tilde{\varGamma}_p(\alpha+s-1)}{\tilde{\varGamma}_p(\alpha)}\frac{\tilde{\varGamma}_p(\alpha+\beta)} {\tilde{\varGamma}_p(\alpha+\beta+s-1)}\\ &=\frac{\varGamma(\alpha+\beta)\varGamma(\alpha+\beta-1)}{\varGamma(\alpha)\varGamma(\alpha-1)}\frac{\varGamma(\alpha+s-1)\varGamma(\alpha+s-2)} {\varGamma(\alpha+\beta+s-1)\varGamma(\alpha+\beta+s-2)}\\ &=\frac{\varGamma(4)\varGamma(3)}{\varGamma(2)\varGamma(1)}\frac{\varGamma(1+s)\varGamma(s)}{\varGamma(3+s)\varGamma(2+s)}=\frac{12}{(2+s)(1+s)^2s}.\end{aligned} $$

The inverse Mellin transform then yields the density of \(\tilde {u}\), denoted by \(\tilde {g}(\tilde {u})\), which is

$$\displaystyle \begin{aligned} \tilde{g}(\tilde{u})=12\,\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{1}{(2+s)(1+s)^2s}u^{-s}\text{d}s \end{aligned}$$
(i)

where the c in the contour is any real number c > 0. There is a pole of order 1 at s = 0 and another pole of order 1 at s = −2, the residues at these poles being obtained as follows:

$$\displaystyle \begin{aligned}\lim_{s\to 0}\frac{u^{-s}}{(2+s)(1+s)^2}=\frac{1}{2},~~ \lim_{s\to -2}\frac{u^{-s}}{(1+s)^2s}=-\frac{u^2}{2}. \end{aligned}$$

The pole at s = −1 is of order 2 and hence the residue is given by

$$\displaystyle \begin{aligned} \lim_{s\to -1}\Big\{\frac{\text{d}}{\text{d}s}\frac{u^{-s}}{s(2+s)}\Big\}&=\lim_{s\to -1}\Big\{\frac{(-\ln u)u^{-s}}{s(2+s)}-\frac{u^{-s}}{s^2(2+s)}-\frac{u^{-s}}{s(2+s)^2}\Big\}\\ &=u\ln u-u+u=u\ln u.\end{aligned} $$

Hence the density is the following:

$$\displaystyle \begin{aligned}\tilde{g}(\tilde{u})=6-6u^2+12 u\ln u,~0\le u\le 1, \end{aligned}$$

and zero elsewhere, where u is real. It can readily be shown that \(\tilde {g}(\tilde {u})\ge 0\) and \(\int _0^1\tilde {g}(\tilde {u})\text{d}u=1\). This completes the computations.
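A numerical confirmation of the result — a sketch assuming mpmath; the test value of s is arbitrary:

```python
# Sketch: g(u) = 6 - 6u^2 + 12 u ln u integrates to one on [0, 1] and has
# Mellin transform 12 / ((2 + s)(1 + s)^2 s).
from mpmath import mp, mpf, log, quad

mp.dps = 25
g = lambda u: 6 - 6*u**2 + 12*u*log(u)
print(quad(g, [0, 1]))                         # ~ 1
s = mpf("1.8")
lhs = quad(lambda u: u**(s - 1) * g(u), [0, 1])
rhs = 12 / ((2 + s) * (1 + s)**2 * s)
print(lhs, rhs)                                # the two values agree
```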

Exercises 5.3

5.3.1

Evaluate the real p × p matrix-variate type-2 beta integral from first principles, that is, by direct evaluation, partitioning the matrix as in Sect. 5.3.3 (general partitioning).

5.3.2

Repeat Exercise 5.3.1 for the complex case.

5.3.3

In the 2 × 2 partitioning of a p × p real matrix-variate gamma density with shape parameter α and scale parameter I, where the first diagonal block X 11 is r × r, r < p, compute the density of the rectangular block X 12.

5.3.4

Repeat Exercise 5.3.3 for the complex case.

5.3.5

Let the p × p real matrices X 1 and X 2 have real matrix-variate gamma densities with the parameters (α 1, B > O) and (α 2, B > O), respectively, B being the same for both distributions. Compute the density of (1): \(U_1=X_2^{-\frac {1}{2}}X_1X_2^{-\frac {1}{2}}\), (2): \(U_2=X_1^{\frac {1}{2}}X_2^{-1}X_1^{\frac {1}{2}}\), (3): \(U_3=(X_1+X_2)^{-\frac {1}{2}}X_2(X_1+X_2)^{-\frac {1}{2}}\), when X 1 and X 2 are independently distributed.

5.3.6

Repeat Exercise 5.3.5 for the complex case.

5.3.7

In the transformation Y = I − X that was used in Sect. 5.3.1, the Jacobian is \(\text{d}Y=(-1)^{\frac {p(p+1)}{2}}\text{d}X\). What happened to the factor \((-1)^{\frac {p(p+1)}{2}}\)?

5.3.8

Consider X in the (a) 2 × 2, (b) 3 × 3 real matrix-variate case. If X is real matrix-variate gamma distributed, then derive the densities of the determinant of X in (a) and (b) if the parameters are \(\alpha =\frac {5}{2},~B=I\). Consider \(\tilde {X}\) in the (a) 2 × 2, (b) 3 × 3 complex matrix-variate case. Derive the distributions of \(|\text{det}(\tilde {X})|\) in (a) and (b) if \(\tilde {X}\) is complex matrix-variate gamma distributed with parameters (α = 2 + i, B = I).

5.3.9

Consider the real cases (a) and (b) in Exercise 5.3.8 except that the distribution is type-1 beta with the parameters \((\alpha =\frac {5}{2},~\beta =\frac {5}{2})\). Derive the density of the determinant of X.

5.3.10

Consider \(\tilde {X}\), (a) 2 × 2, (b) 3 × 3 complex matrix-variate type-1 beta distributed with parameters \((\alpha =\frac {5}{2}+i,~\beta =\frac {5}{2}-i)\). Then derive the density of \(|\text{det}(\tilde {X})|\) in the cases (a) and (b).

5.3.11

Consider X, (a) 2 × 2, (b) 3 × 3 real matrix-variate type-2 beta distributed with the parameters \((\alpha =\frac {3}{2},~\beta =\frac {3}{2})\). Derive the density of |X| in the cases (a) and (b).

5.3.12

Consider \(\tilde {X},\) (a) 2 × 2, (b) 3 × 3 complex matrix-variate type-2 beta distributed with the parameters \((\alpha =\frac {3}{2},~\beta =\frac {3}{2})\). Derive the density of \(|\text{det}(\tilde {X})|\) in the cases (a) and (b).

5.4. The Densities of Some General Structures

Three cases were examined in Section 5.3: the product of real scalar gamma variables, the product of real scalar type-1 beta variables and the product of real scalar type-2 beta variables, where in all these instances, the individual variables were mutually independently distributed. Let us now consider the corresponding general structures. Let x j be a real scalar gamma variable with shape parameter α j and scale parameter 1 for convenience and let the x j’s be independently distributed for j = 1, …, p. Then, letting v 1 = x 1x p,

$$\displaystyle \begin{aligned} E[v_1^h]=\prod_{j=1}^p\frac{\varGamma(\alpha_j+h)}{\varGamma(\alpha_j)},~ \Re(\alpha_j+h)>0,~ \Re(\alpha_j)>0. {} \end{aligned} $$
(5.4.1)

Now, let y 1, …, y p be independently distributed real scalar type-1 beta random variables with the parameters \((\alpha _j,~\beta _j), ~ \Re (\alpha _j)>0,~\Re (\beta _j)>0,~j=1,\ldots ,p,\) and let v 2 = y 1⋯y p. Then,

$$\displaystyle \begin{aligned} E[v_2^h]=\prod_{j=1}^p\frac{\varGamma(\alpha_j+h)}{\varGamma(\alpha_j)}\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j+\beta_j+h)} {} \end{aligned} $$
(5.4.2)

for \(\Re (\alpha _j)>0,~\Re (\beta _j)>0,~ \Re (\alpha _j+h)>0,~j=1,\ldots ,p\). Similarly, let z 1, …, z p, be independently distributed real scalar type-2 beta random variables with the parameters (α j, β j), j = 1, …, p, and let v 3 = z 1z p. Then, we have

$$\displaystyle \begin{aligned} E[v_3^h]=\prod_{j=1}^p\frac{\varGamma(\alpha_j+h)}{\varGamma(\alpha_j)}\frac{\varGamma(\beta_j-h)}{\varGamma(\beta_j)} {} \end{aligned} $$
(5.4.3)

for \(\Re (\alpha _j)>0,~ \Re (\beta _j)>0,~ \Re (\alpha _j+h)>0,~ \Re (\beta _j-h)>0,~j=1,\ldots ,p\). The corresponding densities of v 1, v 2, v 3, respectively denoted by g 1(v 1), g 2(v 2), g 3(v 3), are available from the inverse Mellin transforms by taking (5.4.1) to (5.4.3) as the Mellin transforms of g 1, g 2, g 3 with h = s − 1 for a complex variable s where s is the Mellin parameter. Then, for suitable contours L, the densities can be determined as follows:

$$\displaystyle \begin{aligned} g_1(v_1)&=\frac{1}{2\pi i}\int_LE[v_1^{s-1}]v_1^{-s}\text{d}s,~ i=\sqrt{(-1)},\\ &=\Big\{\prod_{j=1}^p\frac{1}{\varGamma(\alpha_j)}\Big\}\frac{1}{2\pi i}\int_L\Big\{\prod_{j=1}^p\varGamma(\alpha_j+s-1)\Big\}v_1^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^p\frac{1}{\varGamma(\alpha_j)}\Big\}G_{0,p}^{p,0}[v_1|{}_{\alpha_j-1,~j=1,\ldots ,p}],~0\le v_1<\infty,{} \end{aligned} $$
(5.4.4)

where \(\Re (\alpha _j+s-1)>0,~ j=1,\ldots ,p,\) and g 1(v 1) = 0 elsewhere. This last representation is expressed in terms of a G-function, which will be defined in Sect. 5.4.1.
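For p = 2, the G-function in (5.4.4) has a classical closed form, \(G^{2,0}_{0,2}(v\,|\,b_1,b_2)=2\,v^{\frac{b_1+b_2}{2}}K_{b_1-b_2}(2\sqrt{v})\), where K ν denotes the modified Bessel function of the second kind; thus the density of the product of two independent gamma variables with shapes α 1, α 2 and unit scale is \(2v^{\frac{\alpha_1+\alpha_2}{2}-1}K_{\alpha_1-\alpha_2}(2\sqrt{v})/(\varGamma(\alpha_1)\varGamma(\alpha_2))\). A sketch checking this numerically, assuming mpmath:

```python
# Sketch: for p = 2, g_1 from (5.4.4) equals a Bessel-K density.
from mpmath import mp, mpf, gamma, sqrt, besselk, meijerg, quad, inf

mp.dps = 25
a1, a2 = mpf("2.5"), mpf("1.2")    # arbitrary test shapes

def g1_G(v):
    return meijerg([[], []], [[a1 - 1, a2 - 1], []], v) / (gamma(a1)*gamma(a2))

def g1_bessel(v):
    return 2 * v**((a1 + a2)/2 - 1) * besselk(a1 - a2, 2*sqrt(v)) \
        / (gamma(a1) * gamma(a2))

v = mpf("0.8")
print(g1_G(v), g1_bessel(v))       # the two values agree
print(quad(g1_G, [0, inf]))        # ~ 1
```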

$$\displaystyle \begin{aligned} g_2(v_2)&=\frac{1}{2\pi i}\int_LE[v_2^{s-1}]v_2^{-s}\text{d}s, ~i=\sqrt{(-1)},\\ &=\Big\{\prod_{j=1}^p\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j)}\Big\}\frac{1}{2\pi i}\int_L\Big\{\prod_{j=1}^p\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j+\beta_j+s-1)}\Big\}v_2^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^p\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j)}\Big\} G^{p,0}_{p,p}\left[v_2\big\vert_{\alpha_j-1,~j=1,\ldots ,p}^{\alpha_j+\beta_j-1,~j=1,\ldots ,p}\right],~0\le v_2\le 1,{} \end{aligned} $$
(5.4.5)

where \(G_{p,p}^{p,0}\) is a G-function, \(\Re (\alpha _j+s-1)>0,~\Re (\alpha _j)>0,~ \Re (\beta _j)>0,~j=1,\ldots ,p,\) and g 2(v 2) = 0 elsewhere.

$$\displaystyle \begin{aligned} g_3(v_3)&=\frac{1}{2\pi i}\int_LE[v_3^{s-1}]v_3^{-s}\text{d}s, ~i=\sqrt{(-1)},\\ &=\Big\{\prod_{j=1}^p\frac{1}{\varGamma(\alpha_j)\varGamma(\beta_j)}\Big\}\frac{1}{2\pi i}\int_L\Big\{\prod_{j=1}^p\varGamma(\alpha_j+s-1) \varGamma(\beta_j-s+1)\Big\}v_3^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^p\frac{1}{\varGamma(\alpha_j)\varGamma(\beta_j)}\Big\} G_{p,p}^{p,p}\left[v_3\big\vert_{\alpha_j-1,~j=1,\ldots ,p}^{-\beta_j,~j=1,\ldots ,p}\right],~0\le v_3<\infty,{} \end{aligned} $$
(5.4.6)

where \(\Re (\alpha _j)>0,~\Re (\beta _j)>0,~\Re (\alpha _j+s-1)>0,~\Re (\beta _j-s+1)>0,~ j=1,\ldots ,p,\) and g 3(v 3) = 0 elsewhere.

5.4.1. The G-function

The G-function is defined in terms of the following Mellin-Barnes integral:

$$\displaystyle \begin{aligned} G(z)&=G_{p,q}^{m,n}(z)=G_{p,q}^{m,n}\left[z\big\vert_{b_1,\ldots ,b_q}^{a_1,\ldots ,a_p}\right]\\ &=\frac{1}{2\pi i}\int_L\phi(s)z^{-s}\text{d}s,~ i=\sqrt{(-1)}\\ \phi(s)&=\frac{\{\prod_{j=1}^m\varGamma(b_j+s)\}\{\prod_{j=1}^n\varGamma(1-a_j-s)\}}{\{\prod_{j=m+1}^q\varGamma(1-b_j-s)\}\{\prod_{j=n+1}^p\varGamma(a_j+s)\}}\end{aligned} $$

where the parameters a j, j = 1, …, p, b j, j = 1, …, q, can be complex numbers. There are three general contours L, say L 1, L 2, L 3: L 1 is a loop starting and ending at −∞ that contains all the poles of Γ(b j + s), j = 1, …, m, and none of those of Γ(1 − a j − s), j = 1, …, n; L 2 is a loop starting and ending at +∞ that encloses all the poles of Γ(1 − a j − s), j = 1, …, n; L 3 is the straight line contour from c − i∞ to c + i∞. In general, L separates the poles of Γ(b j + s), j = 1, …, m, from those of Γ(1 − a j − s), j = 1, …, n, which lie on either side of the contour. The existence of the contours, convergence conditions, explicit series forms for general parameters as well as applications are available in Mathai (1993). G-functions can readily be evaluated with symbolic computing packages such as MAPLE and Mathematica.
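For instance, mpmath’s meijerg routine, which groups the parameters as [[a 1, …, a n], [a n+1, …, a p]] and [[b 1, …, b m], [b m+1, …, b q]], can be used to confirm entries of the table of special cases given in Sect. 5.4.2 below; a minimal sketch:

```python
# Sketch: G^{1,0}_{0,1}(z | a) = z^a e^{-z} and
# G^{1,2}_{2,2}[z |^{1,1}_{1,0}] = ln(1 + z), checked numerically.
from mpmath import mp, mpf, exp, log, meijerg

mp.dps = 25
z, a = mpf("0.7"), mpf("1.3")
print(meijerg([[], []], [[a], []], z), z**a * exp(-z))    # agree
print(meijerg([[1, 1], []], [[1], [0]], z), log(1 + z))   # agree
```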

Example 5.4.1

Let x 1, x 2, x 3 be independently distributed real scalar random variables, x 1 being real gamma distributed with the parameters (α 1 = 3, β 1 = 2), x 2, real type-1 beta distributed with the parameters \((\alpha _2=\frac {3}{2}+2i,~\beta _2=\frac {1}{2})\) and x 3, real type-2 beta distributed with the parameters \((\alpha _3=\frac {5}{2}+i,~\beta _3=2-i)\). Let \(u_1=x_1x_2x_3,~ u_2=\frac {x_1}{x_2x_3}\) and \( u_3=\frac {x_2}{x_1x_3}\) with densities g j(u j), j = 1, 2, 3, respectively. Derive the densities g j(u j), j = 1, 2, 3, and represent them in terms of G-functions.

Solution 5.4.1

Observe that \(E\big [\frac {1}{x_j}\big ]^{s-1}=E[x_j^{-s+1}],~j=1,2,3\), and that g 1(u 1), g 2(u 2) and g 3(u 3) will share the same ‘normalizing constant’, say c, which is the product of the parts of the normalizing constants in the densities of x 1, x 2 and x 3 that do not cancel out when determining the moments, respectively denoted by c 1, c 2 and c 3; that is, c = c 1 c 2 c 3. Thus,

$$\displaystyle \begin{aligned} c&=\frac{1}{\varGamma(\alpha_1)}\frac{\varGamma(\alpha_2+\beta_2)}{\varGamma(\alpha_2)}\frac{1}{\varGamma(\alpha_3)\varGamma(\beta_3)}\\ &=\frac{1}{\varGamma(3)}\frac{\varGamma(2+2i)}{\varGamma(\frac{3}{2}+2i)}\frac{1}{\varGamma(\frac{5}{2}+i)\varGamma(2-i)}. \end{aligned} $$
(i)

The following are \(E[x_j^{s-1}]\) and \(E[x_j^{-s+1}]\) for j = 1, 2, 3:

$$\displaystyle \begin{aligned} E[x_1^{s-1}]&=c_1~2^{s-1}\varGamma(2+s),~E[x_1^{-s+1}]=c_1~2^{-s+1}\varGamma(4-s) \end{aligned} $$
(ii)
$$\displaystyle \begin{aligned} E[x_2^{s-1}]&=c_2~\frac{\varGamma(\frac{1}{2}+2i+s)}{\varGamma(1+2i+s)},~ E[x_2^{-s+1}]=c_2~\frac{\varGamma(\frac{5}{2}+2i-s)}{\varGamma(3+2i-s)} \end{aligned} $$
(iii)
$$\displaystyle \begin{aligned} E[x_3^{s-1}]&=c_3~\varGamma({3}/{2}+i+s)\varGamma(3-i-s),~E[x_3^{-s+1}]=c_3~\varGamma({7}/{2}+i-s)\varGamma(1-i+s).\end{aligned} $$
(iv)

Then from (i)-(iv),

$$\displaystyle \begin{aligned}E[u_1^{s-1}]=c\,2^{s-1}\varGamma(2+s)\frac{\varGamma(\frac{1}{2}+2i+s)}{\varGamma(1+2i+s)}\varGamma({3}/{2}+i+s)\varGamma(3-i-s).\end{aligned}$$

Taking the inverse Mellin transform and writing the density g 1(u 1) in terms of a G-function, we have

$$\displaystyle \begin{aligned}g_1(u_1)=\frac{c}{2}\,G^{3,1}_{2,3}\left[\frac{u_1}{2}\Big\vert_{2,~\frac{1}{2}+2i,~\frac{3}{2}+i}^{-2+i,~1+2i}\right].\end{aligned}$$

Using (i)-(iv) and rearranging the gamma functions so that those involving + s appear together in the numerator, we have the following:

$$\displaystyle \begin{aligned}E[u_2^{s-1}]=\frac{c}{2}\,2^s\,\varGamma(2+s)\frac{\varGamma(1-i+s)}{\varGamma(3+2i-s)}\varGamma({5}/{2}+2i-s)\varGamma({7}/{2}+i-s).\end{aligned}$$

Taking the inverse Mellin transform and expressing the result in terms of a G-function, we obtain the density g 2(u 2) as

$$\displaystyle \begin{aligned}g_2(u_2)=\frac{c}{2}\,G^{2,2}_{2,3}\left[\frac{u_2}{2}\Big\vert_{2,~1-i,~-2-2i}^{-\frac{3}{2}-2i,~-\frac{5}{2}-i}\right].\end{aligned}$$

Using (i)-(iv) and conveniently rearranging the gamma functions involving + s, we have

$$\displaystyle \begin{aligned}E[u_3^{s-1}]=2\,c\,2^{-s}\varGamma({1}/{2}+2i+s)\varGamma(1-i+s)\varGamma(4-s)\frac{\varGamma(\frac{7}{2}+i-s)}{\varGamma(1+2i+s)}.\end{aligned}$$

On taking the inverse Mellin transform, the following density is obtained:

$$\displaystyle \begin{aligned}g_3(u_3)=2\,c\,G^{2,2}_{3,2}\left[2u_3\Big\vert_{\frac{1}{2}+2i,~1-i}^{-3,~-\frac{5}{2}-i,~1+2i}\right].\end{aligned}$$

This completes the computations.

5.4.2. Some special cases of the G-function

Certain special cases of the G-function can be written in terms of elementary functions. Here are some of them:

$$\displaystyle \begin{aligned} G_{0,1}^{1,0}(z|a)&=z^a\text{e}^{-z},~z\ne 0\\ G_{1,1}^{1,1}\big[-z\big\vert_{0}^{1-a}\big]&=\varGamma(a)(1-z)^{-a},~ |z|<1\\ G_{1,1}^{1,0}\big[z\big\vert_{\alpha}^{\alpha+\beta+1}\big]&=\frac{1}{\varGamma(\beta+1)}z^{\alpha}(1-z)^{\beta},~ |z|<1\\ G_{1,1}^{1,1}\big[az^{\alpha}\big\vert_{\beta/\alpha}^{\beta/\alpha}\big]&= a^{\beta/\alpha}\Big[\frac{z^{\beta}}{1+az^{\alpha}}\Big], ~|az^{\alpha}|<1\end{aligned} $$
$$\displaystyle \begin{aligned} G_{1,1}^{1,1}\left[az^{\alpha}\big\vert_{\beta/\alpha}^{1-\gamma+\beta/\alpha}\right]& =\varGamma(\gamma)a^{\beta/\alpha}\Big[\frac{z^{\beta}}{(1+az^{\alpha})^{\gamma}}\Big], ~|az^{\alpha}|<1\\ G_{2,2}^{1,2}\left[-z^2\big\vert_{0,\frac{1}{2}}^{1-a,\frac{1}{2}-a}\right]&=\frac{\varGamma(2a)}{2^{2a}}[(1+z)^{-2a}+(1-z)^{-2a}], ~|z|<1\\ G_{2,2}^{1,2}\left[z\big\vert_{0,-2a}^{\frac{1}{2}-a,1-a}\right]&=\frac{\pi^{\frac{1}{2}}}{a}[1+(1+z)^{\frac{1}{2}}]^{-2a}, ~|z|<1\\ G_{0,2}^{1,0}\Big[\frac{z^2}{4}\big\vert_{\frac{1}{4},-\frac{1}{4}}\Big]&=\Big(\frac{2}{\pi z}\Big)^{\frac{1}{2}}\sin z\end{aligned} $$
$$\displaystyle \begin{aligned} G_{0,2}^{1,0}\Big[\frac{z^2}{4}\big\vert_{-\frac{1}{4},\frac{1}{4}}\Big]&=\Big(\frac{2} {\pi z}\Big)^{\frac{1}{2}}\cos z\\ G_{0,2}^{1,0}\Big[\!\!-\!\frac{z^2}{4}\big\vert_{0,-\frac{1}{2}}\Big]&=\frac{2}{z\pi^{\frac{1}{2}}}\text{sinh} z\\ G_{0,2}^{1,0}\Big[\!\!-\!\frac{z^2}{4}\big\vert_{0,\frac{1}{2}}\Big]&=\pi^{-\frac{1}{2}}\text{cosh} z\\ G_{2,2}^{1,2}\Big[\!\!\pm z\big\vert_{1,0}^{1,1}\Big]&=\ln (1\pm z), ~|z|<1\end{aligned} $$
$$\displaystyle \begin{aligned} G_{p,q+1}^{1,p}&\Big[z\big\vert_{0,1-b_1,\ldots 1-b_q}^{1-a_1,\ldots ,1-a_p}\Big]\\ &=\left[\frac{\varGamma(a_1)\cdots \varGamma(a_p)}{\varGamma(b_1)\cdots \varGamma(b_q)}\right]{{}_pF_q}(a_1,\ldots ,a_p;b_1,\ldots ,b_q;-z)\end{aligned} $$

for p ≤ q or p = q + 1 and |z| < 1.

5.4.3. The H-function

If we have a general structure corresponding to v 1, v 2 and v 3 of Sect. 5.4, say w 1, w 2 and w 3 of the form

$$\displaystyle \begin{aligned} w_1&=x_1^{\delta_1}x_2^{\delta_2}\cdots x_p^{\delta_p}{} \end{aligned} $$
(5.4.7)
$$\displaystyle \begin{aligned} w_2&=y_1^{\delta_1}y_2^{\delta_2}\cdots y_p^{\delta_p}{} \end{aligned} $$
(5.4.8)
$$\displaystyle \begin{aligned} w_3&=z_1^{\delta_1}z_2^{\delta_2}\cdots z_p^{\delta_p}{} \end{aligned} $$
(5.4.9)

for some δ j > 0, j = 1, …, p, the densities of w 1, w 2 and w 3 are then available in terms of a more general function known as the H-function. It is again a Mellin-Barnes type integral, defined and denoted as follows:

$$\displaystyle \begin{aligned} H(z)&=H_{p,q}^{m,n}(z)=H_{p,q}^{m,n}\left[z\big\vert_{(b_1,\beta_1),\ldots ,(b_q,\beta_q)}^{(a_1,\alpha_1),\ldots , (a_p,\alpha_p)}\right]\\ &=\frac{1}{2\pi i}\int_L\psi(s)z^{-s}\text{d}s,~ i=\sqrt{(-1)},\\ \psi(s)&=\frac{\{\prod_{j=1}^m\varGamma(b_j+\beta_js)\}\{\prod_{j=1}^n\varGamma(1-a_j-\alpha_js)\}}{\{\prod_{j=m+1}^q\varGamma(1-b_j-\beta_js)\}\{\prod_{j=n+1}^p \varGamma(a_j+\alpha_js)\}}{} \end{aligned} $$
(5.4.10)

where α j > 0, j = 1, …, p, β j > 0, j = 1, …, q, are real and positive, a j, j = 1, …, p, and b j, j = 1, …, q, are complex numbers. Three main contours L 1, L 2, L 3 are utilized, similarly to those described in connection with the G-function. Existence conditions, properties and applications of this generalized hypergeometric function are available from Mathai et al. (2010) among other monographs. Numerous special cases can be expressed in terms of known elementary functions.

Example 5.4.2

Let x 1 and x 2 be independently distributed real type-1 beta random variables with the parameters (α j > 0, β j > 0), j = 1, 2, respectively. Let \(y_1=x_1^{\delta _1},~\delta _1>0,\) and \(y_2=x_2^{\delta _2},~\delta _2>0\). Compute the density of u = y 1 y 2.

Solution 5.4.2

Arbitrary moments of y 1 and y 2 are available from those of x 1 and x 2:

$$\displaystyle \begin{aligned} E[x_j^h]&=\frac{\varGamma(\alpha_j+h)}{\varGamma(\alpha_j)}\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j+\beta_j+h)}, ~\Re(\alpha_j+h)>0,~j=1,2,\\ E[y_j^h]&=E[x_j^{\delta_jh}]=\frac{\varGamma(\alpha_j+\delta_jh)}{\varGamma(\alpha_j)} \frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j+\beta_j+\delta_jh)},~\Re(\alpha_j+\delta_jh)>0,\\ E[u^{s-1}]&=E[y_1^{s-1}]E[y_2^{s-1}]\\ &=\prod_{j=1}^2\frac{\varGamma(\alpha_j+\delta_j(s-1))}{\varGamma(\alpha_j)} \frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j+\beta_j+\delta_j(s-1))}.{} \end{aligned} $$
(5.4.11)

Accordingly, the density of u, denoted by g(u), is the following:

$$\displaystyle \begin{aligned} g(u)&=C\frac{1}{2\pi i}\int_L\Big\{\prod_{j=1}^2\frac{\varGamma(\alpha_j-\delta_j+\delta_js)}{\varGamma(\alpha_j+\beta_j-\delta_j+\delta_js)}\Big\}u^{-s}\text{d}s\\ &=C H_{2,2}^{2,0}\left[u\Big\vert_{(\alpha_1-\delta_1,~\delta_1),~(\alpha_2-\delta_2,~\delta_2)}^{(\alpha_1+\beta_1 -\delta_1,~\delta_1),~(\alpha_2+\beta_2-\delta_2,~\delta_2)}\right],\\ C&=\prod_{j=1}^2\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j)}{} \end{aligned} $$
(5.4.12)

where 0 ≤ u ≤ 1, \(\Re (\alpha _j-\delta _j+\delta _js)>0,~ \Re (\alpha _j)>0,~\Re (\beta _j)>0,~j=1,2\) and g(u) = 0 elsewhere.
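The Mellin transform (5.4.11), from which (5.4.12) follows, is readily checked by simulation; a sketch assuming numpy and mpmath, with arbitrary test parameters:

```python
# Sketch: Monte Carlo check of E[u^{s-1}] for u = x1^{d1} * x2^{d2},
# where x_j are independent type-1 beta(a_j, b_j) variables.
import numpy as np
from mpmath import mpf, gamma

a, b, d = [2.0, 3.5], [1.5, 2.0], [0.7, 1.3]
s = 2.4
rng = np.random.default_rng(4)
n = 10**6
u = rng.beta(a[0], b[0], n)**d[0] * rng.beta(a[1], b[1], n)**d[1]

exact = mpf(1)
for j in range(2):
    exact *= gamma(a[j] + d[j]*(s - 1)) / gamma(a[j]) \
        * gamma(a[j] + b[j]) / gamma(a[j] + b[j] + d[j]*(s - 1))
print(np.mean(u**(s - 1)), exact)   # the two values are close
```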

When the weights α 1 = ⋯ = α p = 1 and β 1 = ⋯ = β q = 1 in (5.4.10), the H-function reduces to a G-function. This G-function is frequently referred to as Meijer’s G-function and the H-function, as Fox’s H-function.

5.4.4. Some special cases of the H-function

Certain special cases of the H-function are listed next.

$$\displaystyle \begin{aligned} H_{0,1}^{1,0}[x|{}_{(b,\beta)}]&=\beta^{-1}x^{\frac{b}{\beta}}\text{e}^{-x^{\frac{1}{\beta}}};\\ H_{1,1}^{1,1}\big[z|{}_{(0,1)}^{(1-\nu,1)}\big]&=\varGamma(\nu)(1+z)^{-\nu}=\varGamma(\nu){{}_1F_0}(\nu;~~;-z),~|z|<1;\\ H_{0,2}^{1,0}\Big[\frac{z^2}{4}\Big\vert_{(\frac{a+\nu}{2},1),(\frac{a-\nu}{2},1)}\Big]&=\Big(\frac{z}{2}\Big)^aJ_{\nu}(z)\end{aligned} $$

where the Bessel function

$$\displaystyle \begin{aligned} J_{\nu}(z)&=\sum_{k=0}^{\infty}\frac{(-1)^k(z/2)^{\nu+2k}}{k!\varGamma(\nu+k+1)}=\frac{(z/2)^{\nu}}{\varGamma(\nu+1)}{{}_0F_1}(~~;1+\nu;-\frac{z^2}{4});\\ H_{1,2}^{1,1}\left[z\big\vert_{(0,1),(1-c,1)}^{(1-a,1)}\right]&=\frac{\varGamma(a)}{\varGamma(c)}{{}_1F_1}(a;c;-z);\\ H_{2,2}^{1,2}\left[z\big\vert_{(0,1),(1-c,1)}^{(1-a,1),(1-b,1)}\right]&=\frac{\varGamma(a)\varGamma(b)}{\varGamma(c)}{{}_2F_1}(a,b;c;-z);\\ H_{1,2}^{1,1}\left[-z\big\vert_{(0,1),(1-\beta,\alpha)}^{(1-\gamma,1)}\right]&=\varGamma(\gamma)E_{\alpha,\beta}^{\gamma}(z),~ \Re(\gamma)>0,\end{aligned} $$

where the generalized Mittag-Leffler function

$$\displaystyle \begin{aligned}E_{\alpha,\beta}^{\gamma}(z)=\sum_{k=0}^{\infty}\frac{(\gamma)_k}{k!}\frac{z^k}{\varGamma(\beta+\alpha k)},~\Re(\alpha)>0,~\Re(\beta)>0, \end{aligned}$$

where Γ(γ) is defined. For γ = 1, we have \(E_{\alpha ,\beta }^1(z)=E_{\alpha ,\beta }(z)\); when γ = 1, β = 1, \(E_{\alpha ,1}^1(z)=E_{\alpha }(z)\) and when γ = 1 = β = α, we have E 1(z) = ez.

$$\displaystyle \begin{aligned}H_{0,2}^{2,0}\Big[z\big\vert_{(0,1),(\frac{\nu}{\rho},\frac{1}{\rho})}\Big]=\rho K_{\rho}^{\nu}(z) \end{aligned}$$

where \(K_{\rho }^{\nu }(z)\) is the Krätzel function

$$\displaystyle \begin{aligned}K_{\rho}^{\nu}(z)=\int_0^{\infty}t^{\nu-1}\text{e}^{-t^{\rho}-\frac{z}{t}}\text{d}t,~ \Re(z)>0.\end{aligned}$$
$$\displaystyle \begin{aligned}H_{1,1}^{1,0}\left[z\Big\vert_{(\alpha,1)}^{(\alpha+\frac{1}{2},1)}\right]=\pi^{-\frac{1}{2}}z^{\alpha} (1-z)^{-\frac{1}{2}},~|z|<1;\end{aligned}$$
$$\displaystyle \begin{aligned} H_{2,2}^{2,0}&\left[z\Big\vert_{(\alpha,1),(\alpha,1)}^{(\alpha+\frac{1}{3},1),(\alpha+\frac{2}{3},1)}\right]\\ &=z^{\alpha}{{}_2F_1}\Big(\frac{2}{3},\frac{1}{3};1;1-z\Big),~|1-z|<1.\end{aligned} $$

Exercises 5.4

5.4.1

Show that

$$\displaystyle \begin{aligned}z^{\gamma}G_{0,1}^{1,0}\big[pz^{\alpha}|{}_{\beta/\alpha} \big]=p^{\beta/\alpha}z^{\beta+\gamma}\text{e}^{-pz^{\alpha}}.\end{aligned}$$

5.4.2

Show that

$$\displaystyle \begin{aligned}\text{e}^{-z}=G_{1,2}^{1,1}\left[z\big\vert_{0,1/3}^{1/3}\right]=G_{2,3}^{2,1}\left[z\big\vert_{0,\frac{1}{2},-\frac{1}{2}}^{-\frac{1}{2},\frac{1}{2}}\right].\end{aligned}$$

5.4.3

Show that

$$\displaystyle \begin{aligned}z^{\frac{1}{3}}(1-z)^{-\frac{5}{6}}=\varGamma({1}/{6})\,G_{1,1}^{1,0}\left[z\big\vert_{\frac{1}{3}}^{\frac{1}{2}}\right].\end{aligned}$$

5.4.4

Show that

$$\displaystyle \begin{aligned} \int_0^{\infty}x^{a-1}&(1+x)^{b-c}(1+x-zx)^{-b}\text{d}x\\ &=\frac{\varGamma(a)\varGamma(c-a)}{\varGamma(c)}{{}_2F_1}(a,b;c;z),~|z|<1,~\Re(a)>0,~ \Re(c-a)>0. \end{aligned} $$

5.4.5

Show that

$$\displaystyle \begin{aligned} (a_1-a_2)&H_{p,q}^{m,n}\left[z\big\vert_{(b_1,\beta_1),\ldots ,(b_q,\beta_q)}^{(a_1,\alpha_1),(a_2,\alpha_1),(a_3,\alpha_3),\ldots ,(a_p,\alpha_p)}\right]\\ &=H_{p,q}^{m,n}\left[z\big\vert_{(b_1,\beta_1),\ldots ,(b_q,\beta_q)}^{(a_1,\alpha_1),(a_2-1,\alpha_1),(a_3,\alpha_3),\ldots ,(a_p,\alpha_p)}\right]\\ &\ \ \ \ -H_{p,q}^{m,n}\left[z\big\vert_{(b_1,\beta_1),\ldots ,(b_q,\beta_q)}^{(a_1-1,\alpha_1),(a_2,\alpha_1),(a_3,\alpha_3),\ldots ,(a_p,\alpha_p)}\right], ~n\ge 2. \end{aligned} $$

5.5, 5.5a. The Wishart Density

A particular case of the real p × p matrix-variate gamma distribution, known as the Wishart distribution, is the preeminent distribution in multivariate statistical analysis. In the general p × p real matrix-variate gamma density with parameters (α, B > O), let \(\alpha =\frac {m}{2},~ B=\frac {1}{2}\varSigma ^{-1}\) and Σ > O; the resulting density is called a Wishart density with degrees of freedom m and parameter matrix Σ > O. This density, denoted by f w(W), is given by

$$\displaystyle \begin{aligned} f_w(W)=\frac{|W|{}^{\frac{m}{2}-\frac{p+1}{2}}}{2^{\frac{mp}{2}}|\varSigma|{}^{\frac{m}{2}}\varGamma_p(\frac{m}{2})}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma^{-1}W)},~W>O,~\varSigma>O, {} \end{aligned} $$
(5.5.1)

for m ≥ p, and f w(W) = 0 elsewhere. This will be denoted as W ∼ W p(m, Σ). Clearly, all the properties discussed in connection with the real matrix-variate gamma density still hold in this case. Algebraic evaluations of the marginal densities and explicit evaluations of the densities of sub-matrices will be considered, some aspects having already been discussed in Sects. 5.2 and 5.2.1.
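In the sampling situation alluded to in Sect. 5.5.1 below, W = X 1X 1′ + ⋯ + X mX m′ with independent X k ∼ N p(O, Σ) has the density (5.5.1); a quick consistency check is E[W] = mΣ. A minimal simulation sketch, assuming numpy:

```python
# Sketch: W = sum of m outer products of N_p(0, Sigma) vectors is
# Wishart_p(m, Sigma); check the first-moment identity E[W] = m * Sigma.
import numpy as np

rng = np.random.default_rng(5)
p, m, n = 3, 7, 20000
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(Sigma)
acc = np.zeros((p, p))
for _ in range(n):
    X = L @ rng.standard_normal((p, m))   # m independent N_p(0, Sigma) columns
    acc += X @ X.T
print(acc / n)       # ~ m * Sigma
print(m * Sigma)
```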

In the complex case, the density is the following, denoted by \(\tilde {f}_w(\tilde {W})\):

$$\displaystyle \begin{aligned} \tilde{f}_w(\tilde{W})=\frac{|\text{det}(\tilde{W})|{}^{m-p}\text{e}^{-\text{tr}(\varSigma^{-1}\tilde{W})}}{|\text{ det}(\varSigma)|{}^m\tilde{\varGamma}_p(m)},~\tilde{W}>O,~\varSigma>O,~m\ge p, {} \end{aligned} $$
(5.5a.1)

and \(\tilde {f}_w(\tilde {W})=0\) elsewhere. This will be denoted as \(\tilde {W}\sim \tilde {W}_p(m,\varSigma )\).

5.5.1. Explicit evaluations of the matrix-variate gamma integral, real case

Is it possible to evaluate the matrix-variate gamma integral explicitly by using conventional integration? We will now investigate some aspects of this question.

When the Wishart density is derived from samples coming from a Gaussian population, the basic technique relies on the triangularization process. When Σ = I, that is, W ∼ W p(m, I), can the integral of the right-hand side of (5.5.1) be evaluated by resorting to conventional methods or by direct evaluation? We will address this problem by making use of the technique of partitioning matrices. Let us partition

$$\displaystyle \begin{aligned}X=\begin{bmatrix}X_{11}&X_{12}\\ X_{21}&X_{22}\end{bmatrix}\end{aligned}$$

where X 22 = x pp, so that X 11 is (p − 1) × (p − 1) and \(X_{21}=(x_{p1},\ldots ,x_{p\,p-1}), ~X_{12}=X_{21}^{\prime }\). Then, on applying a result from Sect. 1.3, we have

$$\displaystyle \begin{aligned} |X|{}^{\alpha-\frac{p+1}{2}}=|X_{11}|{}^{\alpha-\frac{p+1}{2}}[x_{pp}-X_{21}X_{11}^{-1}X_{12}]^{\alpha-\frac{p+1}{2}}. {} \end{aligned} $$
(5.5.2)

Note that when X is positive definite, X 11 > O and x pp > 0, and the quadratic form \(X_{21}X_{11}^{-1}X_{12}>0\). As well,

$$\displaystyle \begin{aligned}{}[x_{pp}-X_{21}X_{11}^{-1}X_{12}]^{\alpha-\frac{p+1}{2}}=x_{pp}^{\alpha-\frac{p+1}{2}}[1-x_{pp}^{-\frac{1}{2}} X_{21}X_{11}^{-\frac{1}{2}}X_{11}^{-\frac{1}{2}}X_{12}x_{pp}^{-\frac{1}{2}}]^{\alpha-\frac{p+1}{2}}.{} \end{aligned} $$
(5.5.3)

Letting \(Y=x_{pp}^{-\frac {1}{2}}X_{21}X_{11}^{-\frac {1}{2}},\) then referring to Mathai (1997, Theorem 1.18) or Theorem 1.6.4 of Chap. 1, \(\text{d}Y=x_{pp}^{-\frac {p-1}{2}}|X_{11}|{ }^{-\frac {1}{2}}\text{d}X_{21}\) for fixed X 11 and x pp. The integral over x pp gives

$$\displaystyle \begin{aligned}\int_0^{\infty}x_{pp}^{\alpha+\frac{p-1}{2}-\frac{p+1}{2}}\text{e}^{-x_{pp}}\text{d}x_{pp}=\varGamma(\alpha),~\Re(\alpha)>0. \end{aligned}$$

If we let u = YY′, then from Theorem 2.16 and Remark 2.13 of Mathai (1997) or using Theorem 4.2.3, after integrating out over the Stiefel manifold, we have

$$\displaystyle \begin{aligned}\text{d}Y=\frac{\pi^{\frac{p-1}{2}}}{\varGamma(\frac{p-1}{2})}u^{\frac{p-1}{2}-1}\text{d}u. \end{aligned}$$

(Note that n in Theorem 2.16 of Mathai (1997) corresponds to p − 1 and p to 1.) Then, the integral over u gives

$$\displaystyle \begin{aligned}\int_0^1u^{\frac{p-1}{2}-1}(1-u)^{\alpha-\frac{p+1}{2}}\text{ d}u=\frac{\varGamma(\frac{p-1}{2})\varGamma(\alpha-\frac{p-1}{2})}{\varGamma(\alpha)},~\Re(\alpha)>\frac{p-1}{2}. \end{aligned}$$

Now, collecting all the factors, we have

$$\displaystyle \begin{aligned} |X_{11}|{}^{\alpha+\frac{1}{2}-\frac{p+1}{2}}\varGamma(\alpha)&\frac{\pi^{\frac{p-1}{2}}}{\varGamma(\frac{p-1}{2})} \frac{\varGamma(\frac{p-1}{2})\varGamma(\alpha-\frac{p-1}{2})}{\varGamma(\alpha)}\\ &=|X_{11}^{(1)}|{}^{\alpha+\frac{1}{2}-\frac{p+1}{2}} \pi^{\frac{p-1}{2}}\varGamma(\alpha-{(p-1)}/{2}) \end{aligned} $$

for \(\Re (\alpha )>\frac {p-1}{2}\). Note that \(X_{11}^{(1)}\) is (p − 1) × (p − 1); after the completion of the first stage of the operations, |X 11| is denoted by \(|X_{11}^{(1)}|\), the exponent having changed to \(\alpha +\frac {1}{2}-\frac {p+1}{2}\). Now repeat the process by separating x p−1,p−1, that is, by writing

$$\displaystyle \begin{aligned}X_{11}^{(1)}=\begin{bmatrix}X_{11}^{(2)}&X_{12}^{(2)}\\ X_{21}^{(2)}&x_{p-1,p-1}\end{bmatrix}.\end{aligned}$$

Here, \(X_{11}^{(2)}\) is of order (p − 2) × (p − 2) and \(X_{21}^{(2)}\) is of order 1 × (p − 2). As before, letting u = YY′ with \(Y=x_{p-1,p-1}^{-\frac {1}{2}}X_{21}^{(2)}[X_{11}^{(2)}]^{-\frac {1}{2}},\) \(\text{d}Y=x_{p-1,p-1}^{-\frac {p-2}{2}}|X_{11}^{(2)}|{ }^{-\frac {1}{2}}\text{d}X_{21}^{(2)}.\) The integral over the Stiefel manifold gives \(\frac {\pi ^{\frac {p-2}{2}}}{\varGamma (\frac {p-2}{2})}u^{\frac {p-2}{2}-1}\text{d}u\) and the factor containing (1 − u) is \((1-u)^{\alpha +\frac {1}{2}-\frac {p+1}{2}}\), the integral over u yielding

$$\displaystyle \begin{aligned}\int_0^1u^{\frac{p-2}{2}-1}(1-u)^{\alpha+\frac{1}{2}-\frac{p+1}{2}}\text{d}u=\frac{\varGamma(\frac{p-2}{2})\varGamma(\alpha-\frac{p-2}{2})}{\varGamma(\alpha)}\end{aligned}$$

and the integral over v = x p−1,p−1 gives

$$\displaystyle \begin{aligned}\int_0^{\infty}v^{\alpha+\frac{1}{2}+\frac{p-2}{2}-\frac{p+1}{2}}\text{e}^{-v}\text{d}v=\varGamma(\alpha),~\Re(\alpha)>0. \end{aligned}$$

The product of these factors is then

$$\displaystyle \begin{aligned}|X_{11}^{(2)}|{}^{\alpha+1-\frac{p+1}{2}}\pi^{\frac{p-2}{2}}\varGamma(\alpha-{(p-2)}/{2}),~\Re(\alpha)>\frac{p-2}{2}. \end{aligned}$$

Successive evaluations carried out by employing the same procedure yield the exponent of π as \(\frac {p-1}{2}+\frac {p-2}{2}+\cdots +\frac {1}{2}=\frac {p(p-1)}{4}\) and the gamma product, \(\varGamma (\alpha -\frac {p-1}{2})\varGamma (\alpha -\frac {p-2}{2})\cdots \varGamma (\alpha ),\) the final result being Γ p(α). The result is thus verified.
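The product form just obtained can be checked numerically. The following Python sketch (an illustration added here, not part of the derivation) computes \(\ln \varGamma_p(\alpha)\) from the product \(\pi^{p(p-1)/4}\prod_{j=1}^p\varGamma(\alpha-\frac{j-1}{2})\) and compares it with SciPy's `multigammaln`, which implements the logarithm of this same multivariate gamma function:

```python
import numpy as np
from scipy.special import gammaln, multigammaln

def log_gamma_p(alpha, p):
    # log Gamma_p(alpha) = (p(p-1)/4) log(pi) + sum_{j=1}^p log Gamma(alpha - (j-1)/2)
    j = np.arange(1, p + 1)
    return (p * (p - 1) / 4.0) * np.log(np.pi) + gammaln(alpha - (j - 1) / 2.0).sum()

for p in (1, 2, 5):
    for alpha in (3.0, 7.5):
        assert np.isclose(log_gamma_p(alpha, p), multigammaln(alpha, p))
print("product formula agrees with scipy.special.multigammaln")
```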

5.5a.1. Evaluation of matrix-variate gamma integrals in the complex case

The matrices and gamma functions belonging to the complex domain will be denoted with a tilde. As well, in the complex case, all matrices appearing in the integrals will be p × p Hermitian positive definite unless otherwise stated; for such a matrix \(\tilde{X}\), this will be denoted by \(\tilde {X}>O\). The integral of interest is

$$\displaystyle \begin{aligned} \tilde{\varGamma}_p(\alpha)=\int_{\tilde{X}>O}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{e}^{-\text{tr}(\tilde{X})}\text{d}\tilde{X}. {} \end{aligned} $$
(5.5a.2)

A standard procedure for evaluating the integral in (5.5a.2) consists of expressing the positive definite Hermitian matrix as \(\tilde {X}=\tilde {T}\tilde {T}^{*}\) where \(\tilde {T}\) is a lower triangular matrix with real and positive diagonal elements t jj > 0, j = 1, …, p, an asterisk indicating the conjugate transpose. Then, referring to Mathai (1997, Theorem 3.7) or Theorem 1.6.7 of Chap. 1, the Jacobian is seen to be as follows:

$$\displaystyle \begin{aligned} \text{d}\tilde{X}=2^p\Big\{\prod_{j=1}^pt_{jj}^{2(p-j)+1}\Big\}\text{d}\tilde{T} {} \end{aligned} $$
(5.5a.3)

and then

$$\displaystyle \begin{aligned} \text{tr}(\tilde{X})&=\text{tr}(\tilde{T}\tilde{T}^{*})\\ &=t_{11}^2+\cdots +t_{pp}^2+|\tilde{t}_{21}|{}^2+\cdots +|\tilde{t}_{p1}|{}^2+\cdots +|\tilde{t}_{p,p-1}|{}^2 \end{aligned} $$

and

$$\displaystyle \begin{aligned}|\text{det}(\tilde{X})|{}^{\alpha-p}\text{d}\tilde{X}=2^p\Big\{\prod_{j=1}^pt_{jj}^{2\alpha-2j+1}\Big\}\text{d}\tilde{T}. \end{aligned}$$

Now, integrating out over \(\tilde {t}_{jk}\) for j > k,

$$\displaystyle \begin{aligned}\int_{\tilde{t}_{jk}}\text{e}^{-|\tilde{t}_{jk}|{}^2}\text{d}\tilde{t}_{jk}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\text{ e}^{-(t_{jk1}^2+t_{jk2}^2)}\text{d}t_{jk1}\wedge\text{d}t_{jk2}=\pi \end{aligned}$$

and

$$\displaystyle \begin{aligned}\prod_{j>k}\pi=\pi^{\frac{p(p-1)}{2}}. \end{aligned}$$

As well,

$$\displaystyle \begin{aligned}2\int_0^{\infty}t_{jj}^{2\alpha-2j+1}\text{e}^{-t_{jj}^2}\text{ d}\,t_{jj}=\varGamma(\alpha-j+1),~\Re(\alpha)>j-1, \end{aligned}$$

for j = 1, …, p. Taking the product of all these factors then gives

$$\displaystyle \begin{aligned}\pi^{\frac{p(p-1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)\cdots \varGamma(\alpha-p+1)=\tilde{\varGamma}_p(\alpha),~\Re(\alpha)>p-1, \end{aligned}$$

and hence the result is verified.
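As a quick numerical aid (an illustrative sketch, not part of the text's derivation), the complex matrix-variate gamma can be evaluated from this product; the spot check below reproduces the value \(\tilde{\varGamma}_2(5)=\pi\,\varGamma(5)\varGamma(4)=144\pi\) that is used in Example 5.5a.1 further on:

```python
import numpy as np
from scipy.special import gammaln

def log_cgamma_p(alpha, p):
    # log of the complex matrix-variate gamma:
    # (p(p-1)/2) log(pi) + sum_{j=1}^p log Gamma(alpha - j + 1)
    j = np.arange(1, p + 1)
    return (p * (p - 1) / 2.0) * np.log(np.pi) + gammaln(alpha - j + 1).sum()

# spot check: Gamma~_2(5) = pi * Gamma(5) * Gamma(4) = 144 * pi
assert np.isclose(np.exp(log_cgamma_p(5, 2)), 144 * np.pi)
```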

An alternative method based on partitioned matrix, complex case

The approach discussed in this section relies on the successive extraction of the diagonal elements of \(\tilde {X}\), a p × p positive definite Hermitian matrix, all of these elements being necessarily real and positive, that is, x jj > 0, j = 1, …, p. Let

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{X}_{11}&\tilde{X}_{12}\\ \tilde{X}_{21}&x_{pp}\end{bmatrix}\end{aligned}$$

where \(\tilde {X}_{11}\) is (p − 1) × (p − 1) and

$$\displaystyle \begin{aligned}|\text{det}(\tilde{X})|{}^{\alpha-p}=|\text{det}(\tilde{X}_{11})|{}^{\alpha-p}|x_{pp}-\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}|{}^{\alpha-p}\end{aligned}$$

and

$$\displaystyle \begin{aligned}\text{tr}(\tilde{X})=\text{tr}(\tilde{X}_{11})+x_{pp}. \end{aligned}$$

Then,

$$\displaystyle \begin{aligned}|x_{pp}-\tilde{X}_{21}\tilde{X}_{11}^{-1}\tilde{X}_{12}|{}^{\alpha-p}=x_{pp}^{\alpha-p}|1-x_{pp}^{-\frac{1}{2}} \tilde{X}_{21}\tilde{X}_{11}^{-\frac{1}{2}}\tilde{X}_{11}^{-\frac{1}{2}}\tilde{X}_{12}x_{pp}^{-\frac{1}{2}}|{}^{\alpha-p}. \end{aligned}$$

Let

$$\displaystyle \begin{aligned}\tilde{Y}=x_{pp}^{-\frac{1}{2}}\tilde{X}_{21}\tilde{X}_{11}^{-\frac{1}{2}}\Rightarrow \text{d}\tilde{Y}=x_{pp}^{-(p-1)}|\text{det}(\tilde{X}_{11})|{}^{-1}\text{ d}\tilde{X}_{21}, \end{aligned}$$

referring to Theorem 1.6a.4 or Mathai (1997, Theorem 3.2(c)) for fixed x pp and \(\tilde{X}_{11}\). Now, the integral over x pp gives

$$\displaystyle \begin{aligned}\int_0^{\infty}x_{pp}^{\alpha-p+(p-1)}\text{e}^{-x_{pp}}\text{d}x_{pp}=\varGamma(\alpha),~\Re(\alpha)>0. \end{aligned}$$

Letting \(u=\tilde {Y}\tilde {Y}^{*}\), \(\text{ d}\tilde {Y}=u^{p-2}\frac {\pi ^{p-1}}{\varGamma (p-1)}\text{d}u\) by applying Theorem 4.2a.3 or Corollaries 4.5.2 and 4.5.3 of Mathai (1997), and noting that u is real and positive, the integral over u gives

$$\displaystyle \begin{aligned}\int_0^{1}u^{(p-1)-1}(1-u)^{\alpha-(p-1)-1}\text{d}u=\frac{\varGamma(p-1)\varGamma(\alpha-(p-1))}{\varGamma(\alpha)},~\Re(\alpha)>p-1. \end{aligned}$$

Taking the product, we obtain

$$\displaystyle \begin{aligned}&|\text{det}(\tilde{X}_{11}^{(1)})|{}^{\alpha+1-p}\,\varGamma(\alpha)\frac{\pi^{p-1}}{\varGamma(p-1)}\frac{\varGamma(p-1) \varGamma(\alpha-(p-1))}{\varGamma(\alpha)}\\ &=\pi^{p-1}\varGamma(\alpha-(p-1))|\text{det}(\tilde{X}_{11}^{(1)})|{}^{\alpha+1-p} \end{aligned} $$

where \(\tilde {X}_{11}^{(1)}\) stands for \(\tilde {X}_{11}\) after having completed the first set of integrations. In the second stage, we extract x p−1,p−1, the first (p − 2) × (p − 2) submatrix being denoted by \(\tilde {X}_{11}^{(2)}\) and we continue as previously explained to obtain \(|\text{det}(\tilde {X}_{11}^{(2)})|{ }^{\alpha +2-p}\pi ^{p-2}\varGamma (\alpha -(p-2))\). Proceeding successively in this manner, we have the exponent of π as (p − 1) + (p − 2) + ⋯ + 1 = p(p − 1)∕2 and the gamma product as Γ(α − (p − 1))Γ(α − (p − 2))⋯Γ(α) for \(\Re (\alpha )>p-1\). That is,

$$\displaystyle \begin{aligned}\pi^{\frac{p(p-1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)\cdots \varGamma(\alpha-(p-1))=\tilde{\varGamma}_p(\alpha).\end{aligned}$$

5.5.2. Triangularization of the Wishart matrix in the real case

Let W ∼ W p(m, Σ), Σ > O be a p × p matrix having a Wishart distribution with m degrees of freedom and parameter matrix Σ > O, that is, let W have a density of the following form for Σ = I:

$$\displaystyle \begin{aligned} f_w(W)=\frac{|W|{}^{\frac{m}{2}-\frac{p+1}{2}}\text{e}^{-\frac{1}{2}\text{tr}(W)}}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})},~W>O,~m\ge p, {} \end{aligned} $$
(5.5.4)

and f w(W) = 0 elsewhere. Let us consider the transformation W = TT′ where T is a lower triangular matrix with positive diagonal elements. Since W > O, the transformation W = TT′ with the diagonal elements of T being positive is one-to-one. We have already evaluated the associated Jacobian in Theorem 1.6.7, namely,

$$\displaystyle \begin{aligned} \text{d}W=2^p\Big\{\prod_{j=1}^pt_{jj}^{p+1-j}\Big\}\text{d}T. {} \end{aligned} $$
(5.5.5)

Under this transformation,

$$\displaystyle \begin{aligned} f(W)\text{d}W&=\frac{1}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})}\Big\{\prod_{j=1}^p(t_{jj}^2)^{\frac{m}{2}-\frac{p+1}{2}}\Big\}\,\text{e}^{-\frac{1}{2}\sum_{i\ge j}t_{ij}^2}\,2^p\Big\{\prod_{j=1}^pt_{jj}^{p+1-j}\Big\}\text{d }T\\ &=\frac{1}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})}2^p\Big\{\prod_{j=1}^p(t_{jj}^2)^{\frac{m}{2}-\frac{j}{2}}\Big\}\,\text{ e}^{-\frac{1}{2}\sum_{j=1}^pt_{jj}^2-\frac{1}{2}\sum_{i>j}t_{ij}^2}\,\text{d}T.{} \end{aligned} $$
(5.5.6)

In view of (5.5.6), it is evident that t jj, j = 1, …, p and the t ij’s, i > j are mutually independently distributed. The form of the function containing t ij, i > j, is \(\text{e}^{-\frac {1}{2}t_{ij}^2},\) and hence the t ij’s for i > j are mutually independently distributed real standard normal variables. It is also seen from (5.5.6) that the density of \(t_{jj}^2\) is of the form

$$\displaystyle \begin{aligned}c_jy_j^{\frac{m}{2}-\frac{j-1}{2}-1}\text{e}^{-\frac{1}{2}y_j},~y_j=t_{jj}^2, \end{aligned}$$

which is the density of a real chisquare variable having m − (j − 1) degrees of freedom for j = 1, …, p, where c j is the normalizing constant. Hence, the following result:

Theorem 5.5.1

Let the real p × p positive definite matrix W have a real Wishart density as specified in (5.5.4) and let W = TT′ where T = (t ij) is a lower triangular matrix whose diagonal elements are positive. Then, the non-diagonal elements t ij such that i > j are mutually independently distributed as real standard normal variables, the diagonal elements \(t_{jj}^2,~ j=1,\ldots ,p,\) are independently distributed as real chisquare variables having m − (j − 1) degrees of freedom for j = 1, …, p, and the \(t_{jj}^2\) ’s and t ij ’s are mutually independently distributed.
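Theorem 5.5.1 is also a practical recipe for simulating W p(m, I) (this construction is commonly called the Bartlett decomposition). The following sketch, assuming only NumPy and written solely as an illustration, builds T as in the theorem and checks two consequences: \(E[t_{jj}^2]=m-(j-1)\) and E[W] = mI:

```python
import numpy as np

rng = np.random.default_rng(7)
p, m, N = 4, 10, 50_000

# t_jj^2 ~ chisquare(m - (j-1)), j = 1,...,p; t_ij ~ N(0,1) for i > j
tjj2 = np.stack([rng.chisquare(m - j, size=N) for j in range(p)], axis=1)
print(tjj2.mean(axis=0))                 # approx [10, 9, 8, 7]

T = np.zeros((N, p, p))
i, j = np.tril_indices(p, -1)
T[:, i, j] = rng.standard_normal((N, len(i)))
T[:, np.arange(p), np.arange(p)] = np.sqrt(tjj2)
W = T @ np.swapaxes(T, 1, 2)             # W = T T'
print(np.round(W.mean(axis=0), 2))       # approx m * I
```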

Corollary 5.5.1

Let W ∼ W p(m, σ 2 I), where σ 2 > 0 is a real scalar quantity. Let W = TT′ where T = (t ij) is a lower triangular matrix whose diagonal elements are positive. Then, the t jj ’s are independently distributed for j = 1, …, p, the t ij ’s, i > j, are independently distributed, and all t jj ’s and t ij ’s are mutually independently distributed, where \(t_{jj}^2/{\sigma ^2}\) has a real chisquare distribution with m − (j − 1) degrees of freedom for j = 1, …, p, and t ij, i > j, has a real scalar Gaussian distribution with mean value zero and variance σ 2 , that is, \(t_{ij} \overset {iid}{\sim } N(0,\sigma ^2)\) for all i > j.

5.5a.2. Triangularization of the Wishart matrix in the complex domain

Let \(\tilde {W}\) have the following Wishart density in the complex domain:

$$\displaystyle \begin{aligned} \tilde{f}_w(\tilde{W})=\frac{1}{\tilde{\varGamma}_p(m)}|\text{det}(\tilde{W})|{}^{m-p}\text{e}^{-\text{tr}(\tilde{W})},~\tilde{W}>O,~ m\ge p, {} \end{aligned} $$
(5.5a.4)

and \(\tilde {f}_w(\tilde {W})=0\) elsewhere, which is denoted \(\tilde {W}\sim \tilde {W}_p(m,I)\). Consider the transformation \(\tilde {W}=\tilde {T}\tilde {T}^{*}\) where \(\tilde {T}\) is a lower triangular matrix whose diagonal elements are real and positive. The transformation \(\tilde {W}=\tilde {T}\tilde {T}^{*}\) is then one-to-one and its associated Jacobian, as given in Theorem 1.6a.7, is the following:

$$\displaystyle \begin{aligned} \text{d}\tilde{W}=2^p\Big\{\prod_{j=1}^pt_{jj}^{2(p-j)+1}\Big\}\text{d}\tilde{T}. {} \end{aligned} $$
(5.5a.5)

Then we have

$$\displaystyle \begin{aligned} \tilde{f}(\tilde{W})\text{d}\tilde{W}&=\frac{1}{\tilde{\varGamma}_p(m)}\Big\{\prod_{j=1}^p(t_{jj}^2)^{m-p}\Big\}\text{e}^{-\sum_{i\ge j}|\tilde{t}_{ij}|{}^2}2^p\Big\{\prod_{j=1}^pt_{jj}^{2(p-j)+1}\Big\}\text{d}\tilde{T}\\ &=\frac{1}{\tilde{\varGamma}_p(m)}2^p\Big\{\prod_{j=1}^p(t_{jj}^2)^{m-j+\frac{1}{2}}\Big\}\,\text{e}^{-\sum_{j=1}^pt_{jj}^2-\sum_{i>j}|\tilde{t}_{ij}|{}^2}\text{ d}\tilde{T}. {} \end{aligned} $$
(5.5a.6)

In light of (5.5a.6), it is clear that all the t jj’s and \(\tilde {t}_{ij}\)’s are mutually independently distributed where \(\tilde {t}_{ij},~i>j,\) has a complex standard Gaussian density and \(t_{jj}^2\) has a complex chisquare density with degrees of freedom m − (j − 1) or a real gamma density with the parameters (α = m − (j − 1), β = 1), for j = 1, …, p. Hence, we have the following result:

Theorem 5.5a.1

Let the complex Wishart density be as specified in (5.5a.4), that is, \(\tilde {W}\sim \tilde {W}_p(m,I)\) . Consider the transformation \(\tilde {W}=\tilde {T}\tilde {T}^{*}\) where \(\tilde {T}=(\tilde {t}_{ij})\) is a lower triangular matrix in the complex domain whose diagonal elements are real and positive. Then, for i > j, the \(\tilde {t}_{ij}\) ’s are standard Gaussian distributed in the complex domain, that is, \(\tilde {t}_{ij}\sim \tilde {N}_1(0,1),~i>j\), \(t_{jj}^2\) is real gamma distributed with the parameters (α = m − (j − 1), β = 1) for j = 1, …, p, and all the t jj ’s and \(\tilde {t}_{ij}\) ’s, i > j, are mutually independently distributed.
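A simulation consistent with Theorem 5.5a.1 can be run by generating \(\tilde{W}=\tilde{X}\tilde{X}^{*}\) from a p × m matrix \(\tilde{X}\) of independent standard complex Gaussian entries and reading off its Cholesky factor; in the sketch below (illustrative only; the standard complex Gaussian is taken with independent N(0, 1/2) real and imaginary parts so that E|x̃|² = 1) the check is that \(E[t_{jj}^2]=m-(j-1)\):

```python
import numpy as np

rng = np.random.default_rng(3)
p, m, N = 3, 8, 20_000

means = np.zeros(p)
for _ in range(N):
    X = (rng.standard_normal((p, m)) + 1j * rng.standard_normal((p, m))) / np.sqrt(2)
    W = X @ X.conj().T                    # complex Wishart, m degrees of freedom
    T = np.linalg.cholesky(W)             # lower triangular, real positive diagonal
    means += np.real(np.diag(T)) ** 2 / N

print(np.round(means, 2))                 # approx [8, 7, 6] = m - (j-1)
```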

Corollary 5.5a.1

Let \(\tilde {W}\sim \tilde {W}_p(m,\sigma ^2I)\) where σ 2 > 0 is a real positive scalar. Let \(\tilde {T},~ t_{jj},~ \tilde {t}_{ij},~i>j\) , be as defined in Theorem 5.5a.1 . Then, \(t_{jj}^2/{\sigma ^2}\) is a real gamma variable with the parameters (α = m − (j − 1), β = 1) for j = 1, …, p, \(\tilde {t}_{ij}\sim \tilde {N}_1(0,\sigma ^2)\) for all i > j, and the t jj ’s and \(\tilde {t}_{ij}\) ’s are mutually independently distributed.

5.5.3. Samples from a p-variate Gaussian population and the Wishart density

Let the p × 1 real vector X j be normally distributed, X j ∼ N p(μ, Σ), Σ > O. Let X 1, …, X n be a simple random sample of size n from this normal population and the p × n sample matrix be denoted in bold face lettering as X = (X 1, X 2, …, X n) where \(X_j^{\prime }=(x_{1j},x_{2j},\ldots ,x_{pj})\). Let the sample mean be \(\bar {X}=\frac {1}{n}(X_1+\cdots +X_n)\) and the matrix of sample means be denoted by the bold face \({\bar {\mathbf {X}}}=(\bar {X},\ldots ,\bar {X})\). Then, the p × p sample sum of products matrix S is given by

$$\displaystyle \begin{aligned}S=(\mathbf{X}-{\bar{\mathbf{X}}})(\mathbf{X}-{\bar{\mathbf{X}}})^{\prime}=(s_{ij}),~ s_{ij}=\sum_{k=1}^n(x_{ik}-\bar{x}_i)(x_{jk}-\bar{x}_j) \end{aligned}$$

where \(\bar {x}_r=\sum _{k=1}^nx_{rk}/n,~ r=1,\ldots ,p,\) are the averages of the components. It has already been shown in Sect. 3.5, for instance, that the joint density of the sample values X 1, …, X n, denoted by L, can be written as

$$\displaystyle \begin{aligned} L=\frac{1}{(2\pi)^{\frac{np}{2}}|\varSigma|{}^{\frac{n}{2}}}\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{-1}S)-\frac{n}{2}(\bar{X}-\mu)'\varSigma^{-1}(\bar{X}-\mu)}. {} \end{aligned} $$
(5.5.7)

But \((\mathbf {X}-{\bar {\mathbf {X}}})J=O,~ J^{\prime }=(1,\ldots ,1),\) which implies that the columns of \((\mathbf {X}-{\bar {\mathbf {X}}})\) are linearly related, and hence the elements in \((\mathbf {X}-{\bar {\mathbf {X}}})\) are not distinct. In light of equation (4.5.17), one can write the sample sum of products matrix S in terms of a p × (n − 1) matrix Z n−1 of distinct elements so that \(S=Z_{n-1}Z_{n-1}^{\prime }\). As well, according to Theorem 3.5.3 of Chap. 3, S and \(\bar {X}\) are independently distributed. The p × n matrix Z is obtained through the orthonormal transformation X P = Z, PP′ = I, P′P = I, where P is n × n. Then dX = dZ, ignoring the sign. Let the last column of P be p n. We can specify p n to be \(\frac {1}{\sqrt {n}}J\) so that \(\mathbf {X}p_n=\sqrt {n}\bar {X}\). Note that in light of (4.5.17), the deleted column in Z corresponds to \(\sqrt {n}\bar {X}\). The following considerations will be helpful to those who might need further confirmation of the validity of the above statement. Observe that \(\mathbf {X} -{\bar {\mathbf {X}}}=\mathbf {X}(I-B),\) with \(B=\frac {1}{n}JJ^{\prime }\) where J is an n × 1 vector of unities. Since I − B is idempotent and of rank n − 1, its eigenvalues are 1 repeated n − 1 times and a single zero. An eigenvector corresponding to the eigenvalue zero is J normalized, that is, \(\frac {1}{\sqrt {n}}J\). Taking this as the last column p n of P, we have \(\mathbf {X}p_n=\sqrt {n}\bar {X}\). The other columns of P, namely p 1, …, p n−1, correspond to the n − 1 orthonormal solutions of (I − B)Y = Y, that is, of the equation BY = O where Y is an n × 1 non-null vector. Hence we can write \(\text{d}Z=\text{d}Z_{n-1}\wedge \text{d}\bar {X}\). Now, integrating out \(\bar {X}\) from (5.5.7), we have

$$\displaystyle \begin{aligned} L~\text{d}Z_{n-1}=c~\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{-1}Z_{n-1}Z_{n-1}^{\prime})}\text{d}Z_{n-1},~ S=Z_{n-1}Z_{n-1}^{\prime}, {} \end{aligned} $$
(5.5.8)

where c is a constant. Since Z n−1 contains p(n − 1) distinct real variables, we may apply Theorems 4.2.1, 4.2.2 and 4.2.3, and write dZ n−1 in terms of dS as

$$\displaystyle \begin{aligned} \text{d}Z_{n-1}=\frac{\pi^{\frac{p(n-1)}{2}}}{\varGamma_p(\frac{n-1}{2})}|S|{}^{\frac{n-1}{2}-\frac{p+1}{2}}\text{d}S,~ n-1\ge p. {} \end{aligned} $$
(5.5.9)

Then, if the density of S is denoted by f(S),

$$\displaystyle \begin{aligned}f(S)\text{d}S=c_1\frac{|S|{}^{\frac{n-1}{2}-\frac{p+1}{2}}}{\varGamma_p(\frac{n-1}{2})}\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{-1}S)}\text{d}S \end{aligned}$$

where c 1 is a constant. Since f(S) must be a real matrix-variate gamma density, its normalizing constant provides the value of c 1. Hence

$$\displaystyle \begin{aligned} f(S)\text{d}S=\frac{|S|{}^{\frac{n-1}{2}-\frac{p+1}{2}}}{2^{\frac{(n-1)p}{2}}|\varSigma|{}^{\frac{n-1}{2}}\varGamma_p(\frac{n-1}{2})}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma^{-1}S)}\text{d}S {} \end{aligned} $$
(5.5.10)

for S > O, Σ > O, n − 1 ≥ p and f(S) = 0 elsewhere, Γ p(⋅) being the real matrix-variate gamma function given by

$$\displaystyle \begin{aligned}\varGamma_p(\alpha)=\pi^{\frac{p(p-1)}{4}}\varGamma(\alpha)\varGamma(\alpha-{1}/{2})\cdots \varGamma(\alpha-{(p-1)/}{2}),~\Re(\alpha)>{(p-1)}/{2}.\end{aligned}$$

Usually, the sample size is taken as N so that n = N − 1 is the number of degrees of freedom associated with the Wishart density in (5.5.10). Since we have taken the sample size as n, the number of degrees of freedom is n − 1 and the parameter matrix is Σ > O. Then S in (5.5.10) is written as S ∼ W p(m, Σ), with m = n − 1 ≥ p. Thus, the following result:

Theorem 5.5.2

Let X 1, …, X n be a simple random sample of size n from a N p(μ, Σ), Σ > O. Let \(X_j,~\mathbf {X},~\bar {X},~{\bar {\mathbf {X}}},~S\) be as defined in Sect. 5.5.3 . Then, the density of S is a real Wishart density with m = n − 1 degrees of freedom and parameter matrix Σ > O, as given in (5.5.10).
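Theorem 5.5.2 can be illustrated by simulation; since S ∼ W p(n − 1, Σ), one has E[S] = (n − 1)Σ. A minimal sketch, with an arbitrarily chosen Σ and μ for illustration:

```python
import numpy as np

rng = np.random.default_rng(11)
p, n, N = 3, 12, 20_000
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(Sigma)
mu = np.array([1.0, -2.0, 0.5])

S_mean = np.zeros((p, p))
for _ in range(N):
    X = mu[:, None] + L @ rng.standard_normal((p, n))   # p x n sample matrix
    Xc = X - X.mean(axis=1, keepdims=True)
    S_mean += (Xc @ Xc.T) / N                           # sample sum of products S

print(np.round(S_mean / (n - 1), 2))                    # approx Sigma
```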

5.5a.3. Sample from a complex Gaussian population and the Wishart density

Let \(\tilde {X}_j\sim \tilde {N}_p(\tilde {\mu },\varSigma ),~ \varSigma >O,~ j=1,\ldots ,n\) be independently distributed. Let \(\tilde {\mathbf {X}}=(\tilde {X}_1,\ldots ,\tilde {X}_n),~ \bar {\tilde {X}}=\frac {1}{n}(\tilde {X}_1+\cdots +\tilde {X}_n),~ {\bar {\tilde {\mathbf {X}}}}=(\bar {\tilde {X}},\ldots ,\bar {\tilde {X}})\) and let \(\tilde {S}=(\tilde {\mathbf {X}}-{\bar {\tilde {\mathbf {X}}}})(\tilde {\mathbf {X}}-{\bar {\tilde {\mathbf {X}}}})^{*}\) where a * indicates the conjugate transpose. We have already shown in Sect. 3.5a that the joint density of \(\tilde {X}_1,\ldots ,\tilde {X}_n\), denoted by \(\tilde {L}\), can be written as

$$\displaystyle \begin{aligned} \tilde{L}=\frac{1}{\pi^{np}|\text{det}(\varSigma)|{}^n}\text{e}^{-\text{ tr}(\varSigma^{-1}\tilde{S})-n(\bar{\tilde{X}}-\tilde{\mu})^{*}\varSigma^{-1}(\bar{\tilde{X}}-\tilde{\mu})}. {} \end{aligned} $$
(5.5a.7)

Then, following steps parallel to (5.5.7) to (5.5.10), we obtain the density of \(\tilde {S}\), denoted by \(\tilde {f}(\tilde {S})\), as the following:

$$\displaystyle \begin{aligned} \tilde{f}(\tilde{S})\text{d}\tilde{S}=\frac{|\text{det}(\tilde{S})|{}^{m-p}}{|\text{det}(\varSigma)|{}^{m}\tilde{\varGamma}_p(m)}\text{e}^{-\text{tr}(\varSigma^{-1}\tilde{S})}\text{ d}\tilde{S},~ m=n-1\ge p, {} \end{aligned} $$
(5.5a.8)

for \(\tilde {S}>O,~\varSigma >O,~ n-1\ge p,\) and \(\tilde {f}(\tilde {S})=0\) elsewhere, the complex matrix-variate gamma function being given by

$$\displaystyle \begin{aligned}\tilde{\varGamma}_p(\alpha)=\pi^{\frac{p(p-1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)\cdots \varGamma(\alpha-p+1),~\Re(\alpha)>p-1.\end{aligned}$$

Hence, we have the following result:

Theorem 5.5a.2

Let \(\tilde {X}_j\sim \tilde {N}_p(\mu ,\varSigma ),~\varSigma >O,~j=1,\ldots ,n\) , be independently and identically distributed. Let \(\tilde {\mathbf {X}},~ \bar {\tilde {X}}, ~{\bar {\tilde {\mathbf {X}}}},~\tilde {S}\) be as previously defined. Then, \(\tilde {S}\) has a complex matrix-variate Wishart density with m = n − 1 degrees of freedom and parameter matrix Σ > O, as given in (5.5a.8).

5.5.4. Some properties of the Wishart distribution, real case

If we have statistically independently distributed Wishart matrices with the same parameter matrix Σ, then it is easy to see that the sum is again a Wishart matrix. This can be noted by considering the Laplace transform of matrix-variate random variables discussed in Sect. 5.2. If S j ∼ W p(m j, Σ), j = 1, …, k, with the same parameter matrix Σ > O and the S j’s are statistically independently distributed, then from equation (5.2.6), the Laplace transform of the density of S j is

$$\displaystyle \begin{aligned} L_{ S_j}({{}_{*}T})=|I+2\varSigma {{}_{*}T}|{}^{-\frac{m_j}{2}},~ I+2\varSigma {{}_{*}T}>O,~ j=1,\ldots ,k, {} \end{aligned} $$
(5.5.11)

where \({}_{*}T\) is a slightly modified symmetric parameter matrix obtained from T = (t ij) = T′ > O by weighting the off-diagonal elements with \(\frac {1}{2}\). When the S j’s are independently distributed, the Laplace transform of the sum S = S 1 + ⋯ + S k is the product of the Laplace transforms:

$$\displaystyle \begin{aligned} \prod_{j=1}^k|I+2\varSigma {{}_{*}T}|{}^{-\frac{m_j}{2}}=|I+2\varSigma {{}_{*}T}|{}^{-\frac{1}{2}(m_1+\cdots +m_k)}\Rightarrow S\sim W_p(m_1+\cdots +m_k, \varSigma). {} \end{aligned} $$
(5.5.12)

Hence, the following result:

Theorems 5.5.3, 5.5a.3

Let S j ∼ W p(m j, Σ), Σ > O, j = 1, …, k, be statistically independently distributed real Wishart matrices with m 1, …, m k degrees of freedom and the same parameter matrix Σ > O. Then the sum S = S 1 + ⋯ + S k is real Wishart distributed with m 1 + ⋯ + m k degrees of freedom and the same parameter matrix Σ > O, that is, S ∼ W p(m 1 + ⋯ + m k, Σ), Σ > O. In the complex case, let \(\tilde {S}_j\sim \tilde {W}_p(m_j,\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O,~ j=1,\ldots ,k\) , be independently distributed with the same \(\tilde {\varSigma }\) . Then, the sum \(\tilde {S}=\tilde {S}_1+\cdots +\tilde {S}_k\sim \tilde {W}_p(m_1+\cdots +m_k,\tilde {\varSigma })\).
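A quick empirical illustration of the additivity in Theorem 5.5.3 (a sketch assuming SciPy's `wishart`, whose `df`/`scale` parameterization satisfies E[S] = df · scale; the numerical values are arbitrary):

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(5)
Sigma = np.array([[1.0, 0.4], [0.4, 2.0]])
m1, m2, N = 6, 9, 50_000

S1 = wishart(df=m1, scale=Sigma).rvs(N, random_state=rng)
S2 = wishart(df=m2, scale=Sigma).rvs(N, random_state=rng)
S = S1 + S2                                      # should behave as W_2(m1 + m2, Sigma)

print(np.round(S.mean(axis=0) / (m1 + m2), 2))   # approx Sigma
```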

We now consider linear functions of independent Wishart matrices. Let S j ∼ W p(m j, Σ), Σ > O, j = 1, …, k, be independently distributed and let S a = a 1 S 1 + ⋯ + a k S k where a 1, …, a k are real scalar constants; then the Laplace transform of the density of S a is

$$\displaystyle \begin{aligned} L_{S_a}({{}_{*}T})=\prod_{j=1}^k|I+2a_j\varSigma {{}_{*}T}|{}^{-\frac{m_j}{2}}, ~I+2a_j\varSigma {{}_{*}T}>O,~ j=1,\ldots ,k. \end{aligned}$$
(i)

The inverse is quite complicated and the corresponding density cannot be easily determined; moreover, the density is not a Wishart density unless a 1 = ⋯ = a k. The types of complications occurring can be apprehended from the real scalar case p = 1 which is discussed in Mathai and Provost (1992). Instead of real scalars, we can also consider p × p constant matrices as coefficients, in which case the inversion of the Laplace transform will be more complicated. We can also consider Wishart matrices with different parameter matrices. Let U j ∼ W p(m j, Σ j), Σ j > O, j = 1, …, k, be independently distributed and U = U 1 + ⋯ + U k. Then, the Laplace transform of the density of U, denoted by L U( T), is the following:

$$\displaystyle \begin{aligned} L_U({{}_{*}T})=\prod_{j=1}^k|I+2\varSigma_j {{}_{*}T}|{}^{-\frac{m_j}{2}}, I+2\varSigma_j {{}_{*}T}>O,~ j=1,\ldots ,k. \end{aligned}$$
(ii)

This case does not yield a Wishart density as an inverse Laplace transform either, unless Σ 1 = ⋯ = Σ k. In both (i) and (ii), we have linear functions of independent Wishart matrices; however, these linear functions do not have Wishart distributions.

Let us consider a symmetric transformation on a Wishart matrix S. Let S ∼ W p(m, Σ), Σ > O, and U = ASA′ where A is a p × p nonsingular constant matrix. Let us take the Laplace transform of the density of U:

$$\displaystyle \begin{aligned} L_U({{}_{*}T})&=E[\text{e}^{-\text{tr}({{}_{*}T}'U)}]=E[\text{e}^{-\text{tr}({{}_{*}T}'ASA')}]=E[\text{e}^{-\text{tr}(A'{{}_{*}T}AS)}]\\ &=E[\text{e}^{-\text{tr}[(A'{{}_{*}T}A)'S]}]=L_S(A'{{}_{*}T}A)=|I+2\varSigma (A'{{}_{*}T}A)|{}^{-\frac{m}{2}}\\ &=|I+2(A\varSigma A'){{}_{*}T}|{}^{-\frac{m}{2}}\\ &\Rightarrow U\sim W_p(m, A\varSigma A'),~ \varSigma>O,~ |A|\ne 0.\end{aligned} $$

Hence we have the following result:

Theorems 5.5.4, 5.5a.4

Let S ∼ W p(m, Σ), Σ > O, and U = ASA′, |A|≠0. Then, U ∼ W p(m, AΣA′), Σ > O, |A|≠0, that is, when U = ASA′ where A is a nonsingular p × p constant matrix, then U is Wishart distributed with degrees of freedom m and parameter matrix AΣA′ . In the complex case, the constant p × p nonsingular matrix A can be real or in the complex domain. Let \(\tilde {S}\sim \tilde {W}_p(m,\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O\) . Then \(\tilde {U}=A\tilde {S}A^{*}\sim \tilde {W}_p(m, A\tilde {\varSigma }A^{*}).\)
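The symmetric transformation of Theorem 5.5.4 is easily illustrated numerically; under U = ASA′ one has E[U] = m AΣA′. A sketch with an arbitrary nonsingular A (again assuming SciPy's `wishart`):

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(2)
Sigma = np.array([[1.0, 0.3], [0.3, 1.5]])
A = np.array([[2.0, 1.0], [0.0, -1.0]])     # nonsingular constant matrix
m, N = 7, 50_000

S = wishart(df=m, scale=Sigma).rvs(N, random_state=rng)
U = A @ S @ A.T                              # batched over the N draws

print(np.round(U.mean(axis=0), 2))           # approx m * A Sigma A'
print(np.round(m * A @ Sigma @ A.T, 2))
```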

If A is not a nonsingular matrix, is there a corresponding result? Let B be a constant q × p matrix, q ≤ p, which is of full rank q. Let X j ∼ N p(μ, Σ), Σ > O, j = 1, …, n, be iid so that we have a simple random sample of size n from a real p-variate Gaussian population. Let the q × 1 vectors Y j = BX j, j = 1, …, n, which are iid. Then E[Y j] = Bμ and Cov(Y j) = E[(Y j − E(Y j))(Y j − E(Y j))′] = BE[(X j − E(X j))(X j − E(X j))′]B′ = BΣB′ which is q × q. As well, Y j ∼ N q(Bμ, BΣB′), BΣB′ > O. Consider the sample matrix formed from the Y j’s, namely the q × n matrix Y = (Y 1, …, Y n) = (BX 1, …, BX n) = B(X 1, …, X n) = B X where X is the p × n sample matrix from X j. Then, the sample sum of products matrix in Y is \((\mathbf {Y}-{\bar {\mathbf {Y}}})(\mathbf {Y}-{\bar {\mathbf {Y}}})'=S_y\), say, where the usual notation is utilized, namely, \(\bar {Y}=\frac {1}{n}(Y_1+\cdots +Y_n)\) and \({\bar {\mathbf {Y}}}=(\bar {Y},\ldots ,\bar {Y})\). Now, the problem is equivalent to taking a simple random sample of size n from a q-variate real Gaussian population with mean value vector Bμ and positive definite covariance matrix BΣB′ > O. Hence, the following result:

Theorems 5.5.5, 5.5a.5

Let X j ∼ N p(μ, Σ), Σ > O, j = 1, …, n, be iid, and S be the sample sum of products matrix in this p-variate real Gaussian population. Let B be a q × p constant matrix, q ≤ p, which has full rank q. Then BSB′ is real Wishart distributed with m = n − 1 degrees of freedom, n being the sample size, and parameter matrix BΣB′ > O, that is, BSB′ ∼ W q(m, BΣB′). Similarly, in the complex case, let B be a q × p, q ≤ p, constant matrix of full rank q, where B may be in the real or complex domain. Then, \(B\tilde {S}B^{*}\) is Wishart distributed with degrees of freedom m and parameter matrix \(B\tilde {\varSigma }B^{*}\) , that is, \(B\tilde {S}B^{*}\sim \tilde {W}_q(m, B\tilde {\varSigma }B^{*}).\)

5.5.5. The generalized variance

Let \(X_j,~ X_j^{\prime }=(x_{1j},\ldots ,x_{pj}),\) be a real p × 1 vector random variable for j = 1, …, n, and the X j’s be iid (independently and identically distributed). Let the covariance matrix associated with X j be Cov(X j) = E[(X j − E(X j))(X j − E(X j))′] = Σ, Σ ≥ O, for j = 1, …, n in the real case and \(\tilde {\varSigma }=E[(\tilde {X}_j-E(\tilde {X}_j))(\tilde {X}_j-E(\tilde {X}_j))^{*}]\) in the complex case, where an asterisk indicates the conjugate transpose. Then, the diagonal elements of Σ represent the variances, that is, the squares of a measure of scatter, associated with the elements x 1j, …, x pj, and the off-diagonal elements of Σ provide the corresponding measure of joint dispersion or joint scatter in the pairs (x rj, x sj) for all r ≠ s. Thus, Σ gives a configuration of individual and joint squared scatter in all the elements x 1j, …, x pj. If we wish to have a single scalar quantity representing this configuration of individual and joint scatter in the elements x 1j, …, x pj, what should that measure be? Wilks had taken the determinant of Σ, |Σ|, as that measure and called it the generalized variance, or square of the scatter, representing the whole configuration of scatter in all the elements x 1j, …, x pj. If there is no scatter in one or in a few elements but there is scatter or dispersion in all the other elements, then the determinant is zero. More generally, if the matrix is singular, then the determinant is zero, but this does not mean that there is no scatter in these elements. Thus, the determinant, as a measure of scatter or dispersion, violates a very basic requirement: if the proposed measure is zero, then there should not be any scatter in any of the elements, that is, Σ should be a null matrix. Hence, the first author suggested taking a norm of Σ, ∥Σ∥, as a single measure of scatter in the whole configuration, such as ∥Σ∥ 1 =maxij|σ ij| or ∥Σ∥ 2 =  the largest eigenvalue of Σ, since Σ is at least positive semi-definite. Note that normality is not assumed in the above discussion.

If S ∼ W p(m, Σ), Σ > O, what is then the distribution of Wilks’ generalized variance in S, namely |S|, which can be referred to as the sample generalized variance? Let us determine the h-th moment of the sample generalized variance |S| for an arbitrary h. This has already been discussed for real and complex matrix-variate gamma distributions in Sect. 5.4.1 and can be obtained from the normalizing constant in the Wishart density:

$$\displaystyle \begin{aligned} E[|S|{}^h]&=\int_{S>O}\frac{|S|{}^{\frac{m}{2}+h-\frac{p+1}{2}}\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{-1}S)}} {2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})|\varSigma|{}^{\frac{m}{2}}}\text{d}S\\ &=2^{ph}|\varSigma|{}^h\frac{\varGamma_p(\frac{m}{2}+h)}{\varGamma_p(\frac{m}{2})},~ \Re(\frac{m}{2}+h)>\frac{p-1}{2}.{} \end{aligned} $$
(5.5.13)

Then

$$\displaystyle \begin{aligned} E[|(2\varSigma)^{-1}S|{}^h]&=\frac{\varGamma_p(\frac{m}{2}+h)}{\varGamma_p(\frac{m}{2})} =\prod_{j=1}^p\frac{\varGamma(\frac{m}{2}+h-\frac{j-1}{2})}{\varGamma(\frac{m}{2}-\frac{j-1}{2})}\\ &=E[y_1^h]E[y_2^h]\cdots E[y_p^h]{} \end{aligned} $$
(5.5.14)

where y 1, ⋯ , y p are independently distributed real scalar gamma random variables with the parameters \((\frac {m}{2}-\frac {j-1}{2},~ 1),~ j=1,\ldots ,p\). In the complex case

$$\displaystyle \begin{aligned} E[|\text{det}(\tilde{\varSigma}^{-1}\tilde{S})|{}^h]&=\frac{\tilde{\varGamma}_p(m+h)}{\tilde{\varGamma}_p(m)}=\prod_{j=1}^p\frac{\varGamma(m-(j-1)+h)} {\varGamma(m-(j-1))}\\ &=E[\tilde{y}_1^h]\cdots E[\tilde{y}_p^h]{} \end{aligned} $$
(5.5a.9)

where \(\tilde {y}_1,\ldots ,\tilde {y}_p\) are independently distributed real scalar gamma random variables with the parameters (m − (j − 1), 1), j = 1, …, p. Note that if we consider E[|Σ −1 S|h] instead of E[|(2Σ)−1 S|h] in (5.5.14), then the y j’s are independently distributed as real chisquare random variables having m − (j − 1) degrees of freedom for j = 1, …, p. This can be stated as a result.

Theorems 5.5.6, 5.5a.6

Let S ∼ W p(m, Σ), Σ > O, and |S| be the generalized variance associated with this Wishart matrix or the sample generalized variance in the corresponding p-variate real Gaussian population. Then, \(E[|(2\varSigma )^{-1}S|{ }^h]=E[y_1^h]\cdots E[y_p^h]\) so that |(2Σ)−1 S| has the structural representation |(2Σ)−1 S| = y 1⋯y p where the y j ’s are independently distributed real gamma random variables with the parameters \((\frac {m}{2}-\frac {j-1}{2},~ 1),~ j=1,\ldots ,p\) . Equivalently, \(E[|\varSigma ^{-1}S|{ }^h]=E[z_1^h] \cdots E[z_p^h]\) where the z j ’s are independently distributed real chisquare random variables having m − (j − 1), j = 1, …, p, degrees of freedom. In the complex case, if we let \(\tilde {S}\sim \tilde {W}_p(m,\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O,\) and \(|\mathit{\text{det}}(\tilde {S})|\) be the generalized variance, then \(|\mathit{\text{ det}}(\tilde {\varSigma }^{-1}\tilde {S})|\) has the structural representation \(|\mathit{\text{det}}(\tilde {\varSigma }^{-1}\tilde {S})|=\tilde {y}_1\cdots \tilde {y}_p\) where the \(\tilde {y}_j\) ’s are independently distributed real scalar gamma random variables with the parameters (m − (j − 1), 1), j = 1, …, p, or chisquare random variables in the complex domain having m − (j − 1), j = 1, …, p, degrees of freedom.
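The moment formula (5.5.13) behind Theorem 5.5.6 can be verified by simulation; the sketch below (illustrative, using `multigammaln` for \(\ln \varGamma_p\) and SciPy's `wishart` for sampling) compares the empirical h-th moment of |S| with \(2^{ph}|\varSigma|^h\varGamma_p(\frac{m}{2}+h)/\varGamma_p(\frac{m}{2})\):

```python
import numpy as np
from scipy.stats import wishart
from scipy.special import multigammaln

rng = np.random.default_rng(9)
p, m, h, N = 3, 10, 1.0, 100_000
Sigma = np.diag([1.0, 2.0, 0.5])

S = wishart(df=m, scale=Sigma).rvs(N, random_state=rng)
emp = np.mean(np.linalg.det(S) ** h)

theo = 2 ** (p * h) * np.linalg.det(Sigma) ** h \
       * np.exp(multigammaln(m / 2 + h, p) - multigammaln(m / 2, p))
print(emp / theo)    # approx 1
```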

5.5.6. Inverse Wishart distribution

When S ∼ W p(m, Σ), Σ > O, what is then the distribution of S −1? Since S has a real matrix-variate gamma distribution, that of its inverse is directly available from the transformation U = S −1. In light of Theorem 1.6.6, we have dS = |U|−(p+1)dU in the real case and \(\text{d}\tilde {S}=|\text{det}(\tilde {U}\tilde {U}^{*})|{ }^{-p}\text{d}\tilde {U}\) in the complex domain. Thus, denoting the density of U by g(U), we have the following result:

Theorems 5.5.7, 5.5a.7

Let the real Wishart matrix S ∼ W p(m, Σ), Σ > O, and the Wishart matrix in the complex domain \(\tilde {S}\sim \tilde {W}_p(m,~\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O\) . Let U = S −1 and \(\tilde {U}=\tilde {S}^{-1}\) . Letting the density of U be denoted by g(U) and that of \(\tilde {U}\) be denoted by \(\tilde {g}(\tilde {U})\),

$$\displaystyle \begin{aligned} g(U)=\frac{|U|{}^{-\frac{m}{2}-\frac{p+1}{2}}}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})|\varSigma|{}^{\frac{m}{2}}}\mathit{\text{e}}^{-\frac{1}{2}\mathit{\text{tr}}(\varSigma^{-1}U^{-1})},~U>O, ~\varSigma>O, {} \end{aligned} $$
(5.5.15)

and zero elsewhere, and

$$\displaystyle \begin{aligned} \tilde{g}(\tilde{U})=\frac{|\mathit{\text{det}}(\tilde{U})|{}^{-m-p}}{\tilde{\varGamma}_p(m)|\mathit{\text{det}}(\tilde{\varSigma})|{}^m}\mathit{\text{e}}^{-\mathit{\text{ tr}}(\tilde{\varSigma}^{-1}\tilde{U}^{-1})},~\tilde{U}=\tilde{U}^{*}>O,~ \tilde{\varSigma}=\tilde{\varSigma}^{*}>O, \end{aligned}$$
(5.5a.10)

and zero elsewhere.
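Formula (5.5.15) can be cross-checked against SciPy's inverse Wishart. Note that SciPy parameterizes the inverse Wishart by a scale matrix Ψ with density proportional to \(|U|^{-(\nu+p+1)/2}\mathrm{e}^{-\mathrm{tr}(\varPsi U^{-1})/2}\); matching (5.5.15) then requires Ψ = Σ −1 (this mapping is our reading of the SciPy convention, so the sketch is illustrative only):

```python
import numpy as np
from scipy.stats import invwishart
from scipy.special import multigammaln

p, m = 2, 6
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
U = np.array([[0.4, 0.1], [0.1, 0.3]])       # a test point, U > O

_, logdetU = np.linalg.slogdet(U)
_, logdetSig = np.linalg.slogdet(Sigma)
# log of g(U) in (5.5.15)
log_g = (-(m + p + 1) / 2) * logdetU \
        - 0.5 * np.trace(np.linalg.inv(Sigma) @ np.linalg.inv(U)) \
        - (m * p / 2) * np.log(2) - multigammaln(m / 2, p) - (m / 2) * logdetSig

print(log_g, invwishart(df=m, scale=np.linalg.inv(Sigma)).logpdf(U))  # should agree
```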

5.5.7. Marginal distributions of a Wishart matrix

At the beginning of this chapter, we explicitly evaluated real and complex matrix-variate gamma integrals and determined that the diagonal blocks again follow real and complex matrix-variate gamma distributions. Hence, the following results are already available from the discussion on the matrix-variate gamma distribution. We will now establish the results via Laplace transforms. Let S be Wishart distributed with degrees of freedom m and parameter matrix Σ > O, that is, S ∼ W p(m, Σ), Σ > O, m ≥ p. Let us partition S and Σ as follows:

$$\displaystyle \begin{aligned}S=\begin{bmatrix}S_{11}&S_{12}\\ S_{21}&S_{22}\end{bmatrix},\quad \varSigma=\begin{bmatrix}\varSigma_{11}&\varSigma_{12}\\ \varSigma_{21}&\varSigma_{22}\end{bmatrix}\end{aligned}$$

(i)

(referred to as a 2 × 2 partitioning) S 11, Σ 11 being r × r and S 22, Σ 22 being (p − r) × (p − r) – refer to Sect. 1.3 for results on partitioned matrices. Let \({}_{*}T\) be a similarly partitioned p × p parameter matrix with \({}_{*}T_{11}\) being r × r where

$$\displaystyle \begin{aligned}{{}_{*}T}=\begin{bmatrix}{{}_{*}T}_{11}&O\\ O&O\end{bmatrix}.\end{aligned}$$

(ii)

Observe that \({}_{*}T\) is a slightly modified parameter matrix obtained from T = (t ij) = T′ by weighting the t ij’s with \(\frac {1}{2}\) for i ≠ j. Noting that \(\text{tr}({{ }_{*}T}'S)=\text{tr}({{ }_{*}T}_{11}^{\prime }S_{11})\), the Laplace transform of the Wishart density W p(m, Σ), Σ > O, with \({}_{*}T\) as defined above, is given by

$$\displaystyle \begin{aligned} L_S({{}_{*}T})=E[\text{e}^{-\text{tr}({{}_{*}T}'S)}]=E[\text{e}^{-\text{tr}({{}_{*}T}_{11}^{\prime}S_{11})}]=|I+2\varSigma_{11}\,{{}_{*}T}_{11}|{}^{-\frac{m}{2}}. {} \end{aligned} $$
(5.5.16)

Thus, S 11 has a Wishart distribution with m degrees of freedom and parameter matrix Σ 11. It can be similarly established that S 22 is Wishart distributed with degrees of freedom m and parameter matrix Σ 22. Hence, the following result:

Theorems 5.5.8, 5.5a.8

Let S  W p(m, Σ), Σ > O. Let S and Σ be partitioned into a 2 × 2 partitioning as above. Then, the sub-matrices S 11 ∼ W r(m, Σ 11), Σ 11 > O, and S 22 ∼ W pr(m, Σ 22), Σ 22 > O. In the complex case, let \(\tilde {S}\sim \tilde {W}_p(m,~\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O\) . Letting \(\tilde {S}\) be partitioned as in the real case, \(\tilde {S}_{11}\sim \tilde {W}_r(m,~\tilde {\varSigma }_{11})\) and \(\tilde {S}_{22}\sim \tilde {W}_{p-r}(m,~\tilde {\varSigma }_{22}).\)

Corollaries 5.5.2, 5.5a.2

Let S ∼ W p(m, Σ), Σ > O. Suppose that Σ 12 = O in the 2 × 2 partitioning of Σ. Then S 11 and S 22 are independently distributed with S 11 ∼ W r(m, Σ 11) and S 22 ∼ W p−r(m, Σ 22). Consider a k × k partitioning of S and Σ, the order of the diagonal blocks S jj and Σ jj being p j × p j , p 1 + ⋯ + p k = p. If Σ ij = O for all i ≠ j, then the S jj ’s are independently distributed as Wishart matrices on p j components, with degrees of freedom m and parameter matrices Σ jj > O, j = 1, …, k. In the complex case, consider the same type of partitioning as in the real case. Then, if \(\tilde {\varSigma }_{ij}=O\) for all i ≠ j, \(\tilde {S}_{jj}, ~ j=1,\ldots ,k,\) are independently distributed as \(\tilde {S}_{jj}\sim \tilde {W}_{p_j}(m,~\tilde {\varSigma }_{jj}),~ j=1,\ldots ,k,~ p_1+\cdots +p_k=p\).

Let S be a p × p real Wishart matrix with m degrees of freedom and parameter matrix Σ > O. Consider the following 2 × 2 partitioning of S and Σ −1:

$$\displaystyle \begin{aligned}S=\begin{bmatrix}S_{11}&S_{12}\\ S_{21}&S_{22}\end{bmatrix},\quad \varSigma^{-1}=\begin{bmatrix}\varSigma^{11}&\varSigma^{12}\\ \varSigma^{21}&\varSigma^{22}\end{bmatrix},\end{aligned}$$

where S 11 and \(\varSigma^{11}\) are r × r.

Then, the density, denoted by f(S), can be written as

$$\displaystyle \begin{aligned} f(S)&=\frac{|S|{}^{\frac{m}{2}-\frac{p+1}{2}}}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2})|\varSigma|{}^{\frac{m}{2}}}\,\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{-1}S)}\\ &=\frac{|S_{11}|{}^{\frac{m}{2}-\frac{p+1}{2}}|S_{22}-S_{21}S_{11}^{-1}S_{12}|{}^{\frac{m}{2}-\frac{p+1}{2}}}{2^{\frac{mp}{2}}\varGamma_p(\frac{m}{2}) |\varSigma|{}^{\frac{m}{2}}}\\ &\ \ \ \ \times\text{e}^{-\frac{1}{2}[\text{tr}(\varSigma^{11}S_{11})+\text{tr}(\varSigma^{22}S_{22})+\text{tr}(\varSigma^{12}S_{21})+\text{ tr}(\varSigma^{21}S_{12})]}.\end{aligned} $$

In this case, dS = dS 11 ∧dS 22 ∧dS 12. Referring to Sect. 1.3, the coefficient of S 11 in the exponent is \(\varSigma ^{11}=(\varSigma _{11}-\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21})^{-1}\). Let \(U_2=S_{22}-S_{21}S_{11}^{-1}S_{12}\) so that \(S_{22}=U_2+S_{21}S_{11}^{-1}S_{12}\) and dS 22 = dU 2 for fixed S 11 and S 12. Then, the function of U 2 is of the form

$$\displaystyle \begin{aligned}|U_2|{}^{\frac{m}{2}-\frac{p+1}{2}}\text{e}^{-\frac{1}{2}\text{tr}(\varSigma^{22}U_2)}. \end{aligned}$$

However, U 2 is (p − r) × (p − r) and we can write \(\frac {m}{2}-\frac {p+1}{2}=\frac {m-r}{2}-\frac {p-r+1}{2}\). Therefore \(U_2\sim W_{p-r}(m-r,~\varSigma _{22}-\varSigma _{21}\varSigma _{11}^{-1}\varSigma _{12})\) as \(\varSigma ^{22}=(\varSigma _{22}-\varSigma _{21}\varSigma _{11}^{-1}\varSigma _{12})^{-1}\). From symmetry, \(U_1=S_{11}-S_{12}S_{22}^{-1}S_{21}\sim W_r(m-(p-r),~\varSigma _{11}-\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21})\). After replacing S 22 in the exponent by \(U_2+S_{21}S_{11}^{-1}S_{12}\), the exponent, excluding \(-\frac {1}{2}\), can be written as \(\text{tr}[\varSigma ^{11}S_{11}]+\text{tr}[\varSigma ^{22}S_{21}S_{11}^{-1}S_{12}]+\text{tr}[\varSigma ^{12}S_{21}]+\text{tr}[\varSigma ^{21}S_{12}]\). Let us try to integrate out S 12. To this end, let \(V=S_{11}^{-\frac {1}{2}}S_{12}\Rightarrow \text{d}S_{12}=|S_{11}|{ }^{\frac {p-r}{2}}\text{d}V\) for fixed S 11. Then the determinant of S 11 in f(S) becomes \(|S_{11}|{ }^{\frac {m}{2}-\frac {p+1}{2}}\times |S_{11}|{ }^{\frac {p-r}{2}}=|S_{11}|{ }^{\frac {m}{2}-\frac {r+1}{2}}\). The exponent, excluding \(-\frac {1}{2}\), becomes the following, denoting it by ρ:

$$\displaystyle \begin{aligned} \rho=\text{tr}(\varSigma^{12}V'S_{11}^{\frac{1}{2}})+\text{tr}(\varSigma^{21}S_{11}^{\frac{1}{2}}V)+\text{tr}(\varSigma^{22}V'V). \end{aligned}$$
(i)

Note that \(\text{tr}(\varSigma ^{22}V'V)=\text{tr}(V\varSigma ^{22}V')\) and

$$\displaystyle \begin{aligned} (V+C)\varSigma^{22}(V+C)'=V\varSigma^{22}V'+V\varSigma^{22}C'+C\varSigma^{22}V'+C\varSigma^{22}C'. \end{aligned}$$
(ii)

On comparing (i) and (ii), we have \(C'=(\varSigma ^{22})^{-1}\varSigma ^{21}S_{11}^{\frac {1}{2}}\). Substituting for C and C′ in ρ, the term containing S 11 in the exponent becomes \(-\frac {1}{2}\text{tr}(S_{11}(\varSigma ^{11}-\varSigma ^{12}(\varSigma ^{22})^{-1}\varSigma ^{21}))=-\frac {1}{2}\text{ tr}(S_{11}\varSigma _{11}^{-1})\). Collecting the factors containing S 11, we have S 11 ∼ W r(m, Σ 11) and, from symmetry, S 22 ∼ W p−r(m, Σ 22). Since the density f(S) splits into a function of U 2, S 11 and \(S_{11}^{-\frac {1}{2}}S_{12}\), these quantities are independently distributed. Similarly, U 1, S 22 and \( S_{22}^{-\frac {1}{2}}S_{21}\) are independently distributed. The exponent of |U 1| is \(\frac {m}{2}-\frac {p+1}{2}=(\frac {m}{2}-\frac {p-r}{2})-\frac {r+1}{2}\). Observing that U 1 is r × r, we have the density of \(U_1=S_{11}-S_{12}S_{22}^{-1}S_{21}\) as a real Wishart density on r components, with degrees of freedom m − (p − r) and parameter matrix \(\varSigma _{11}-\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21}\), whose density, denoted by f 1(U 1), is the following:

$$\displaystyle \begin{aligned} f_1(U_1)=\frac{|U_1|{}^{\frac{m-(p-r)}{2}-\frac{r+1}{2}}}{2^{\frac{r(m-(p-r))}{2}}\varGamma_r(\frac{m-(p-r)}{2})|\varSigma_{11}-\varSigma_{12}\varSigma_{22}^{-1} \varSigma_{21}|{}^{\frac{m-(p-r)}{2}}}\text{e}^{-\frac{1}{2}\text{tr}[U_1(\varSigma_{11}-\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21})^{-1}]}. \end{aligned}$$
(iii)

A similar expression can be obtained for the density of U 2. Thus, the following result:

Theorems 5.5.9, 5.5a.9

Let S ∼ W p(m, Σ), Σ > O, m ≥ p. Consider the 2 × 2 partitioning of S as specified above, S 11 being r × r. Let \(U_1=S_{11}-S_{12}S_{22}^{-1}S_{21}\) . Then,

$$\displaystyle \begin{aligned} U_1\sim W_{r}(m-(p-r),~ \varSigma_{11}-\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21}). {} \end{aligned} $$
(5.5.17)

In the complex case, let \(\tilde {S}\sim \tilde {W}_p(m,~\tilde {\varSigma }),~ \tilde {\varSigma }=\tilde {\varSigma }^{*}>O\) . Consider the same partitioning as in the real case and let \(\tilde {S}_{11}\) be r × r. Then, letting \(\tilde {U}_1=\tilde {S}_{11}-\tilde {S}_{12}\tilde {S}_{22}^{-1}\tilde {S}_{21}\), \(\tilde {U}_1\) is Wishart distributed as

$$\displaystyle \begin{aligned} \tilde{U}_1\sim\tilde{W}_r(m-(p-r),~ \tilde{\varSigma}_{11}-\tilde{\varSigma}_{12}\tilde{\varSigma}_{22}^{-1}\tilde{\varSigma}_{21}). \end{aligned} $$
(5.5a.11)

A similar density is obtained for \(\tilde {U}_2=\tilde {S}_{22}-\tilde {S}_{21}\tilde {S}_{11}^{-1}\tilde {S}_{12}\).
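Theorem 5.5.9 lends itself to a direct simulation check: the Schur complement \(U_1=S_{11}-S_{12}S_{22}^{-1}S_{21}\) should have mean \((m-(p-r))(\varSigma_{11}-\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21})\). A sketch with arbitrary illustrative values, assuming SciPy's `wishart`:

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(13)
p, r, m, N = 3, 2, 9, 50_000
Sigma = np.array([[2.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

S = wishart(df=m, scale=Sigma).rvs(N, random_state=rng)
S11, S12 = S[:, :r, :r], S[:, :r, r:]
S21, S22 = S[:, r:, :r], S[:, r:, r:]
U1 = S11 - S12 @ np.linalg.inv(S22) @ S21

B = Sigma[:r, :r] - Sigma[:r, r:] @ np.linalg.inv(Sigma[r:, r:]) @ Sigma[r:, :r]
print(np.round(U1.mean(axis=0), 3))            # approx (m - (p - r)) * B
print(np.round((m - (p - r)) * B, 3))
```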

Example 5.5.1

Let the 3 × 3 matrix S ∼ W 3(5, Σ), Σ > O. Determine the distributions of \(Y_1=S_{22}-S_{21}S_{11}^{-1}S_{12},~ Y_2=S_{22}\) and Y 3 = S 11 where S and Σ are partitioned as

$$\displaystyle \begin{aligned}S=\begin{bmatrix}S_{11}&S_{12}\\ S_{21}&S_{22}\end{bmatrix},\quad \varSigma=\begin{bmatrix}\varSigma_{11}&\varSigma_{12}\\ \varSigma_{21}&\varSigma_{22}\end{bmatrix},\end{aligned}$$

with S 11 and Σ 11 being 2 × 2.

Solution 5.5.1

Let the densities of Y j be denoted by f j(Y j), j = 1, 2, 3. We need the following matrix, denoted by B:

$$\displaystyle \begin{aligned}B=\varSigma_{22}-\varSigma_{21}\varSigma_{11}^{-1}\varSigma_{12}=\frac{13}{5}.\end{aligned}$$

From our usual notations, \(Y_1\sim W_{p-r}(m-r,~B)\). Observing that Y 1 is a real scalar, we denote it by y 1, its density being given by

$$\displaystyle \begin{aligned} f_1(y_1)&=\frac{y_1^{\frac{m-r}{2}-\frac{(p-r)+1}{2}}}{2^{\frac{(m-r)(p-r)}{2}}|B|{}^{\frac{m-r}{2}}\varGamma(\frac{m-r}{2})}\text{e}^{-\frac{1}{2}\text{ tr}(B^{-1}y_1)}\\ &=\frac{y_1^{\frac{3}{2}-1}\text{e}^{-\frac{5}{26}y_1}}{2^{\frac{3}{2}}(13/5)^{\frac{3}{2}}\varGamma(\frac{3}{2})},~ 0\le y_1<\infty,\end{aligned} $$

and zero elsewhere. Now, consider Y 2 which is also a real scalar that will be denoted by y 2. As per our notation, \(Y_2=S_{22}\sim W_{p-r}(m,~\varSigma_{22})\). Its density is then as follows, observing that Σ 22 = (3), |Σ 22| = 3 and \(\varSigma _{22}^{-1}=(\frac {1}{3})\):

$$\displaystyle \begin{aligned} f_2(y_2)&=\frac{y_2^{\frac{m}{2}-\frac{(p-r)+1}{2}}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma_{22}^{-1}y_2)}}{2^{\frac{m(p-r)}{2}}\varGamma_{p-r}(\frac{m}{2})|\varSigma_{22}|{}^{\frac{m}{2}}}\\ &=\frac{y_2^{\frac{5}{2}-1}\text{e}^{-\frac{1}{6}y_2}}{2^{\frac{5}{2}}\varGamma(\frac{5}{2})3^{\frac{5}{2}}},~ 0\le y_2<\infty,\end{aligned} $$

and zero elsewhere. Note that Y 3 = S 11 is 2 × 2. With our usual notations, p = 3, r = 2, m = 5 and |Σ 11| = 5; as well,

$$\displaystyle \begin{aligned}\varSigma_{11}^{-1}=\frac{1}{5}\begin{bmatrix}3&1\\ 1&2\end{bmatrix}.\end{aligned}$$

Thus, the density of Y 3 is

$$\displaystyle \begin{aligned} f_3(Y_3)&=\frac{|S_{11}|{}^{\frac{m}{2}-\frac{r+1}{2}}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma_{11}^{-1}S_{11})}}{2^{\frac{mr}{2}}\varGamma_r(\frac{m}{2})|\varSigma_{11}|{}^{\frac{m}{2}}}\\ &=\frac{[s_{11}s_{22}-s_{12}^2]\text{e}^{-\frac{1}{10}(3s_{11}+2s_{12}+2s_{22})}}{(3)(2^3)(5)^{\frac{5}{2}}\pi},~ Y_3>O,\end{aligned} $$

and zero elsewhere. This completes the calculations.

Example 5.5a.1

Let the 3 × 3 Hermitian positive definite matrix \(\tilde {S}\) have a complex Wishart density with degrees of freedom m = 5 and parameter matrix Σ > O. Determine the densities of \(\tilde {Y}_1=\tilde {S}_{22}-\tilde {S}_{21}\tilde {S}_{11}^{-1}\tilde {S}_{12}, \ \tilde {Y}_2=\tilde {S}_{22}\) and \(\tilde {Y}_3=\tilde {S}_{11}\) where \(\tilde{S}\) and Σ are partitioned as

$$\displaystyle \begin{aligned}\tilde{S}=\begin{bmatrix}\tilde{S}_{11}&\tilde{S}_{12}\\ \tilde{S}_{21}&\tilde{S}_{22}\end{bmatrix},\quad \varSigma=\begin{bmatrix}\varSigma_{11}&\varSigma_{12}\\ \varSigma_{21}&\varSigma_{22}\end{bmatrix},\end{aligned}$$

with \(\tilde{S}_{11}\) and Σ 11 being 2 × 2.

Solution 5.5a.1

Observe that Σ is Hermitian positive definite. We need the following numerical results:

$$\displaystyle \begin{aligned}B=\varSigma_{22}-\varSigma_{21}\varSigma_{11}^{-1}\varSigma_{12}=\frac{7}{5},\quad \varSigma_{22}=2,\quad \varSigma_{11}^{-1}=\frac{1}{5}\begin{bmatrix}2&i\\ -i&3\end{bmatrix},\quad |\text{det}(\varSigma_{11})|=5.\end{aligned}$$

Note that \(\tilde {Y}_1\) and \(\tilde {Y}_2\) are real scalar quantities which will be denoted as y 1 and y 2, respectively. Let the densities of y 1 and y 2 be f j(y j), j = 1, 2. Then, with our usual notations, f 1(y 1) is

$$\displaystyle \begin{aligned} f_1(y_1)&=\frac{|\text{det}(\tilde{y}_1)|{}^{(m-r)-(p-r)}\text{e}^{-\text{tr}(B^{-1}\tilde{y}_1)}}{|\text{det}(B)|{}^{m-r}\tilde{\varGamma}_{p-r}(m-r)}\\ &=\frac{y_1^2\,\text{e}^{-\frac{5}{7}y_1}}{(\frac{7}{5})^{3}\varGamma(3)},~0\le y_1<\infty,\end{aligned} $$

and zero elsewhere, and the density of y 2 is

$$\displaystyle \begin{aligned} f_2(y_2)&=\frac{|\text{det}(\tilde{y}_2)|{}^{m-(p-r)}\text{e}^{-\text{tr}(\varSigma_{22}^{-1}\tilde{y}_2)}}{|\text{det}(\varSigma_{22})|{}^m\tilde{\varGamma}_{p-r}(m)}\\ &=\frac{y_2^4\,\text{e}^{-\frac{1}{2}y_2}}{2^5\varGamma(5)},~0\le y_2<\infty,\end{aligned} $$

and zero elsewhere. Note that \(\tilde {Y}_3=\tilde {S}_{11}\) is 2 × 2. Letting

$$\displaystyle \begin{aligned}\tilde{Y}_3=\begin{bmatrix}s_{11}&\tilde{s}_{12}\\ \tilde{s}_{12}^{*}&s_{22}\end{bmatrix},\end{aligned}$$

we have \(|\text{det}(\tilde {Y}_3)|=[s_{11}s_{22}-\tilde {s}_{12}^{*}\tilde {s}_{12}]\). With our usual notations, the density of \(\tilde {Y}_3\), denoted by \(\tilde {f}_3(\tilde {Y}_3)\), is the following:

$$\displaystyle \begin{aligned} \tilde{f}_3(\tilde{Y}_3)&=\frac{|\text{det}(\tilde{Y}_3)|{}^{m-r}\text{e}^{-\text{tr}(\varSigma_{11}^{-1}\tilde{Y}_3)}}{|\text{ det}(\varSigma_{11})|{}^m\tilde{\varGamma}_r(m)}\\ &=\frac{[s_{11}s_{22}-\tilde{s}_{12}\tilde{s}_{12}^{*}]^3\text{e}^{-\frac{1}{5}[2s_{11}+3s_{22}+i\tilde{s}_{12}^{*}-i\tilde{s}_{12}]}}{5^5\tilde{\varGamma}_2(5)}, \ \tilde{Y}_3>O, \end{aligned} $$

and zero elsewhere, where \(5^5\tilde {\varGamma }_2(5)=3125(144)\pi \). This completes the computations.

5.5.8. Connections to geometrical probability problems

Consider the representation of the Wishart matrix \(S=Z_{n-1}Z_{n-1}^{\prime }\) given in (5.5.8) where the p rows are linearly independent 1 × (n − 1) vectors. These p linearly independent rows, taken in order, determine a p-parallelotope generated by the corresponding p points in the (n − 1)-dimensional Euclidean space, n − 1 ≥ p. Then, as explained in Mathai (1999), the volume content of this parallelotope is \(v=|Z_{n-1}Z_{n-1}^{\prime }|{ }^{\frac {1}{2}}=|S|{ }^{\frac {1}{2}}\), where S ∼ W p(n − 1, Σ), Σ > O. Thus, the volume content of this parallelotope is the positive square root of the generalized variance |S|. The distributions of this random volume when the p random points are uniformly, type-1 beta, type-2 beta and gamma distributed are provided in Chap. 4 of Mathai (1999).

5.6. The Distribution of the Sample Correlation Coefficient

Consider the real Wishart density or matrix-variate gamma in (5.5.10) for p = 2. For convenience, let us take the degrees of freedom parameter n − 1 = m. Then for p = 2, f(S) in (5.5.10), denoted by f 2(S), is the following, observing that |S| = s 11 s 22(1 − r 2) where r is the sample correlation coefficient:

$$\displaystyle \begin{aligned} f_2(S)=f_2(s_{11},~s_{22},~r)=\frac{[s_{11}s_{22}(1-r^2)]^{\frac{m}{2}-\frac{3}{2}}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma^{-1}S)}}{2^m[\sigma_{11}\sigma_{22}(1-\rho^2)]^{\frac{m}{2}}\varGamma_2(\frac{m}{2})} {} \end{aligned} $$
(5.6.1)

where ρ =  the population correlation coefficient, \(|\varSigma |=\sigma _{11}\sigma _{22}-\sigma _{12}^2=\sigma _{11}\sigma _{22}(1-\rho ^2)\), \(\varGamma _2(\frac {m}{2})=\pi ^{\frac {1}{2}}\varGamma (\frac {m}{2})\varGamma (\frac {m-1}{2}), ~-1<\rho <1\),

$$\displaystyle \begin{aligned}\varSigma^{-1}=\frac{1}{1-\rho^2}\begin{bmatrix}\frac{1}{\sigma_{11}}&-\frac{\rho}{\sqrt{\sigma_{11}\sigma_{22}}}\\ -\frac{\rho}{\sqrt{\sigma_{11}\sigma_{22}}}&\frac{1}{\sigma_{22}}\end{bmatrix}\end{aligned}$$
(i)
$$\displaystyle \begin{aligned} \text{tr}(\varSigma^{-1}S)&=\frac{1}{1-\rho^2}\Big\{\frac{s_{11}}{\sigma_{11}}-2\rho\frac{s_{12}}{\sqrt{\sigma_{11}\sigma_{22}}}+\frac{s_{22}}{\sigma_{22}}\Big\}\\ &=\frac{1}{1-\rho^2}\Big\{\frac{s_{11}}{\sigma_{11}}-2\rho r\frac{\sqrt{s_{11}s_{22}}}{\sqrt{\sigma_{11}\sigma_{22}}}+\frac{s_{22}}{\sigma_{22}}\Big\}. \end{aligned} $$
(ii)

Let us make the substitution \(x_1=\frac {s_{11}}{\sigma _{11}},~ x_2=\frac {s_{22}}{\sigma _{22}}\). Note that dS = ds 11 ∧ds 22 ∧ ds 12. But \(\text{d}s_{12}=\sqrt {s_{11}s_{22}}\,\text{d}r\) for fixed s 11 and s 22. In order to obtain the density of r, we must integrate out x 1 and x 2, observing that \(\sqrt {s_{11}s_{22}}\) is coming from ds 12:

$$\displaystyle \begin{aligned} &\int_{s_{11},s_{22}}f_2(S)\,\text{d}s_{11}\wedge\text{d}s_{22}=\int_{x_1>0}\int_{x_2>0}\ \, \frac{1}{2^m(\sigma_{11}\sigma_{22})^{\frac{m}{2}}(1-\rho^2)^{\frac{m}{2}}\pi^{\frac{1}{2}}\varGamma(\frac{m}{2}) \varGamma(\frac{m-1}{2})}\\ &\ \ \ \ \ \ \ \ \times (1-r^2)^{\frac{m-3}{2}}(\sigma_{11}\sigma_{22}x_1x_2)^{\frac{m}{2}-1}\text{e}^{-\frac{1}{2(1-\rho^2)}\{x_1-2r\rho \sqrt{x_1x_2}+x_2\}}\sigma_{11} \sigma_{22}\,\text{d}x_1\wedge\text{d}x_2.\end{aligned} $$
(iii)

For convenience, let us expand

$$\displaystyle \begin{aligned} \text{e}^{-\frac{1}{2(1-\rho^2)}(-2r\rho\sqrt{x_1x_2})}=\sum_{k=0}^{\infty}\Big(\frac{r\rho}{1-\rho^2}\Big)^k\frac{x_1^{\frac{k}{2}}\,x_2^{\frac{k}{2}}}{k!}. \end{aligned}$$
(iv)

Then the part containing x 1 gives the integral

$$\displaystyle \begin{aligned} \int_{x_1=0}^{\infty}x_1^{\frac{m}{2}-1+\frac{k}{2}}\text{e}^{-\frac{x_1}{2(1-\rho^2)}}\text{ d}x_1=[2(1-\rho^2)]^{\frac{m}{2}+\frac{k}{2}}\varGamma(\frac{m}{2}+\frac{k}{2}),~m\ge 2. \end{aligned}$$
(v)

By symmetry, the integral over x 2 gives \([2(1-\rho ^2)]^{\frac {m}{2}+\frac {k}{2}}\varGamma (\frac {m+k}{2}),~ m\ge 2.\) Collecting all the constants we have

$$\displaystyle \begin{aligned} \frac{(\sigma_{11}\sigma_{22})^{\frac{m}{2}}2^{m+k}(1-\rho^2)^{m+k}\varGamma^2(\frac{m+k}{2})} {2^m(\sigma_{11}\sigma_{22})^{\frac{m}{2}}(1-\rho^2)^{\frac{m}{2}}\pi^{\frac{1}{2}}\varGamma(\frac{m}{2})\varGamma(\frac{m-1}{2})} =\frac{(1-\rho^2)^{\frac{m}{2}+k}2^k\varGamma^2(\frac{m+k}{2})}{\pi^{\frac{1}{2}}\varGamma(\frac{m}{2})\varGamma(\frac{m-1}{2})}. \end{aligned}$$
(vi)

We can simplify \(\varGamma (\frac {m}{2})\varGamma (\frac {m}{2}-\frac {1}{2})\) by using the duplication formula for gamma functions, namely

$$\displaystyle \begin{aligned} \varGamma(2z)=\pi^{-\frac{1}{2}}2^{2z-1}\varGamma(z)\varGamma\Big(z+\frac{1}{2}\Big),~ z=\frac{m-1}{2}. {} \end{aligned} $$
(5.6.2)

Then,

$$\displaystyle \begin{aligned} \varGamma\Big(\frac{m}{2}\Big)\varGamma\Big(\frac{m-1}{2}\Big)=\frac{\varGamma(m-1)\pi^{\frac{1}{2}}}{2^{m-2}}. \end{aligned}$$
(vii)

Hence the density of r, denoted by f r(r), is the following:

$$\displaystyle \begin{aligned} f_r(r)=\frac{2^{m-2}(1-\rho^2)^{\frac{m}{2}}}{\varGamma(m-1)\pi}(1-r^2)^{\frac{1}{2}(m-3)}\sum_{k=0}^{\infty}\frac{(2r\rho)^k}{k!} \varGamma^2\Big(\frac{m+k}{2}\Big),~-1\le r\le 1,{} \end{aligned} $$
(5.6.3)

and zero elsewhere, m = n − 1, n being the sample size.

5.6.1. The special case ρ = 0

In this case, (5.6.3) becomes

$$\displaystyle \begin{aligned} f_r(r)&=\frac{2^{m-2}\varGamma^2(\frac{m}{2})}{\varGamma(m-1)\pi}(1-r^2)^{\frac{m-1}{2}-1},~-1\le r\le 1,~m=n-1{} \end{aligned} $$
(5.6.4)
$$\displaystyle \begin{aligned} &=\frac{\varGamma(\frac{m}{2})}{\sqrt{\pi}\varGamma(\frac{m-1}{2})}(1-r^2)^{\frac{m-1}{2}-1},~-1\le r\le 1, {} \end{aligned} $$
(5.6.5)

zero elsewhere, m = n − 1 ≥ 2, n being the sample size. The simplification is made by using the duplication formula and writing \(\varGamma (m-1)=\pi ^{-\frac {1}{2}}2^{m-2}\varGamma (\frac {m-1}{2})\varGamma (\frac {m}{2})\). For testing the hypothesis H o : ρ = 0, the test statistic is r and the null distribution, that is, the distribution under the null hypothesis H o, is given in (5.6.5). Numerical tables of percentage points obtained from (5.6.5) are available. If ρ≠0, the non-null distribution is available from (5.6.3); so, if we wish to test the hypothesis H o : ρ = ρ o where ρ o is a given quantity, we can compute the percentage points from (5.6.3). It can be shown from (5.6.5) that for ρ = 0, \( t_{m-1}=\sqrt {m-1}\,\frac {r}{\sqrt {1-r^2}}\) is distributed as a Student-t with m − 1 = n − 2 degrees of freedom, and hence for testing H o : ρ = 0 against H 1 : ρ≠0, the null hypothesis can be rejected if \(|t_{m-1}|=\sqrt {m-1}\,\big |\frac {r}{\sqrt {1-r^2}}\big |\ge t_{m-1,\frac {\alpha }{2}}\) where \(Pr\{|t_{m-1}|\ge t_{m-1,\frac {\alpha }{2}}\}=\alpha \). For tests that make use of the Student-t statistic, refer to Mathai and Haubold (2017b). Since the density given in (5.6.5) is an even function of r, when ρ = 0, all odd order moments are equal to zero and the even order moments can easily be evaluated from type-1 beta integrals.
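The null distribution in (5.6.5) and the resulting Student-t statistic can be checked by simulation; the sketch below (illustrative only) draws independent pairs, so that ρ = 0, and compares \(\sqrt{n-2}\,r/\sqrt{1-r^2}\) with a Student-t having n − 2 = m − 1 degrees of freedom via a Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(17)
n, N = 15, 20_000                 # sample size n; m = n - 1
r = np.empty(N)
for i in range(N):
    x, y = rng.standard_normal((2, n))           # independent, so rho = 0
    r[i] = np.corrcoef(x, y)[0, 1]

t = np.sqrt(n - 2) * r / np.sqrt(1 - r ** 2)
print(stats.kstest(t, stats.t(df=n - 2).cdf))    # large p-value expected
```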

5.6.2. The multiple and partial correlation coefficients

Let the p × 1 real vector X j with \(X_j^{\prime }=(x_{1j},\ldots ,x_{pj})\) have a p-variate distribution whose mean value E(X j) = μ and covariance matrix Cov(X j) = Σ, Σ > O, where μ is p × 1 and Σ is p × p, for j = 1, …, n, the X j’s being iid (independently and identically distributed). Consider the following partitioning of Σ:

$$\displaystyle \begin{aligned}\varSigma=\begin{bmatrix}\sigma_{11}&\varSigma_{12}\\ \varSigma_{21}&\varSigma_{22}\end{bmatrix},\end{aligned}$$

where σ 11 is 1 × 1 and Σ 22 is (p − 1) × (p − 1). Let

$$\displaystyle \begin{aligned} \rho_{1.(2\ldots p)}^2=\frac{\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21}}{\sigma_{11}}. {} \end{aligned} $$
(5.6.6)

Then, ρ 1.(2…p) is called the multiple correlation coefficient of x 1j on x 2j, …, x pj. The sample value corresponding to \(\rho _{1.(2\ldots p)}^2\) which is denoted by \(r_{1.(2\ldots p)}^2\) and referred to as the square of the sample multiple correlation coefficient, is given by

$$\displaystyle \begin{aligned} r_{1.(2\ldots p)}^2=\frac{S_{12}S_{22}^{-1}S_{21}}{s_{11}} {} \end{aligned} $$
(5.6.7)

where s 11 is 1 × 1, S 22 is (p − 1) × (p − 1), \(S=(\mathbf {X}-{\bar {\mathbf {X}}})(\mathbf {X}-{\bar {\mathbf {X}}})'\), X = (X 1, …, X n) is the p × n sample matrix, n being the sample size, \(\bar {X}=\frac {1}{n}(X_1+\cdots +X_n),~ {\bar {\mathbf {X}}}=(\bar {X},\ldots ,\bar {X})\) is p × n, the X j’s, j = 1, …, n, being iid according to a given p-variate population having mean value vector μ and covariance matrix Σ > O, which need not be Gaussian.

5.6.3. Different derivations of ρ 1.(2…p)

Consider a prediction problem involving real scalar variables where x 1 is predicted by making use of x 2, …, x p or linear functions thereof. Let \(A_2^{\prime }=(a_2,\ldots ,a_p)\) be a constant vector where a j, j = 2, …, p are real scalar constants. Letting \(X_{(2)}^{\prime }=(x_2,\ldots ,x_p)\), a linear function of X (2) is \(u=A_2^{\prime }X_{(2)}=a_2x_2+\cdots +a_px_p\). Then, the mean value and variance of this linear function are \(E[u]=E[A_2^{\prime }X_{(2)}]=A_2^{\prime }\mu _{(2)}\) and \(\text{Var}(u)=\text{Var}(A_2^{\prime }X_{(2)})=A_2^{\prime }\varSigma _{22}A_2\) where \(\mu _{(2)}^{\prime }=(\mu _2,\ldots ,\mu _p)=E[X_{(2)}]\) and Σ 22 is the covariance matrix associated with X (2), which is available from the partitioning of Σ specified in the previous subsection. Let us determine the correlation between x 1, the variable being predicted, and u, a linear function of the variables being utilized to predict x 1, denoted by ρ 1,u, that is,

$$\displaystyle \begin{aligned}\rho_{1,u}=\frac{\text{Cov}(x_1,u)}{\sqrt{\text{Var}(x_1)\text{Var}(u)}}, \end{aligned}$$

where Cov(x 1, u) = E[(x 1 − E(x 1))(u − E(u))] = E[(x 1 − E(x 1))(X (2) − E(X (2)))′A 2] = Cov(x 1, X (2))A 2 = Σ 12 A 2, Var(x 1) = σ 11, \(\text{Var}(u)=A_2^{\prime }\varSigma _{22}A_2>O\). Letting \(\varSigma _{22}^{-\frac {1}{2}}\) be the positive definite square root of Σ 22, we can write \(\varSigma _{12}A_2=(\varSigma _{12}\varSigma _{22}^{-\frac {1}{2}})(\varSigma _{22}^{\frac {1}{2}}A_2)\). Then, on applying the Cauchy-Schwarz inequality, we may write \(\varSigma _{12}A_2=(\varSigma _{12}\varSigma _{22}^{-\frac {1}{2}})(\varSigma _{22}^{\frac {1}{2}}A_2)\le \sqrt {(\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21})(A_2^{\prime }\varSigma _{22}A_2)}\). Thus,

$$\displaystyle \begin{aligned} \rho_{1,u}&\le \frac{\sqrt{(\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21})(A_2^{\prime}\varSigma_{22}A_2)}}{\sqrt{(\sigma_{11})(A_2^{\prime}\varSigma_{22}A_2)}} =\frac{\sqrt{\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21}}}{\sqrt{\sigma_{11}}},\mbox{ that is, }\\ \rho_{1,u}^2&\le \frac{\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21}}{\sigma_{11}}=\rho_{1.(2\ldots p)}^2.{} \end{aligned} $$
(5.6.8)

This establishes the following result:

Theorem 5.6.1

The multiple correlation coefficient ρ 1.(2…p) of x 1 on x 2, …, x p represents the maximum correlation between x 1 and an arbitrary linear function of x 2, …, x p.

This shows that among all linear functions u of (x 2, …, x p), the correlation between x 1 and u, which is a scale-free measure of their joint variation, attains a maximum value, and this maximum is the multiple correlation coefficient. Correlation measures a scale-free joint scatter in the variables involved, in this case x 1 and (x 2, …, x p). Correlation does not measure general relationships between the variables; counterexamples are provided in Mathai and Haubold (2017b). Hence “maximum correlation” should be interpreted as maximum joint scale-free variation or joint scatter in the variables.
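
Theorem 5.6.1 can be illustrated numerically. The following sketch (assuming NumPy; the covariance matrix and the coefficient vectors are randomly generated, purely for illustration) checks that the squared correlation between x 1 and A 2′X (2) computed from a given Σ never exceeds \(\rho _{1.(2\ldots p)}^2\).

```python
import numpy as np

rng = np.random.default_rng(2)
p = 4
B = rng.standard_normal((p, p))
Sigma = B @ B.T + p * np.eye(p)          # a positive definite covariance matrix
s11, S12, S22 = Sigma[0, 0], Sigma[0, 1:], Sigma[1:, 1:]
rho2 = S12 @ np.linalg.solve(S22, S12) / s11  # squared multiple correlation

for _ in range(5):
    A2 = rng.standard_normal(p - 1)      # an arbitrary coefficient vector
    cov = S12 @ A2                       # Cov(x_1, A_2'X_(2))
    corr2 = cov**2 / (s11 * (A2 @ S22 @ A2))
    assert corr2 <= rho2 + 1e-12         # the Cauchy-Schwarz bound of (5.6.8)
print(f"rho^2 = {rho2:.4f}; all sampled linear combinations respect the bound")
```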

For the next property, we will use the following two basic results on conditional expectations, referring also to Mathai and Haubold (2017b). Let x and y be two real scalar random variables having a joint distribution. Then,

$$\displaystyle \begin{aligned} E[y]=E[E(y|x)] \end{aligned}$$
(i)

whenever the expected values exist, where the inside expectation is taken in the conditional space of y, given x, for all x, that is, E y|x(y|x), and the outside expectation is taken in the marginal space of x, that is, E x(⋅). The other result states that

$$\displaystyle \begin{aligned} \text{Var}(y)=\text{Var}(E[y|x])+E[\text{Var}(y|x)] \end{aligned}$$
(ii)

where it is assumed that the expected value of the conditional variance and the variance of the conditional expectation exist. Situations where the results stated in (i) and (ii) are applicable or not applicable are described and illustrated in Mathai and Haubold (2017b). In result (i), x can be a scalar, vector or matrix variable. Now, let us examine the problem of predicting x 1 on the basis of x 2, …, x p. What is the “best” predictor function of x 2, …, x p for predicting x 1, “best” being construed in the minimum mean square sense? If ϕ(x 2, …, x p) is an arbitrary predictor, then at given values of x 2, …, x p, ϕ is a constant. Consider the squared distance (x 1 − b)2 between x 1 and b, where b is ϕ evaluated at the given values of x 2, …, x p. Then, “minimum in the mean square sense” means minimizing the expected value of (x 1 − b)2 over all b, that is, \(\min _b E(x_1-b)^2\). We have already established in Mathai and Haubold (2017b) that the minimizing value of b is the conditional expectation of x 1 at the given values of x 2, …, x p, that is, b = E[x 1|x 2, …, x p]. Hence, this “best” predictor is also called the regression of x 1 on (x 2, …, x p): E[x 1|x 2, …, x p] is the regression of x 1 on x 2, …, x p, or the best predictor of x 1 based on x 2, …, x p. Note that, in general, for any scalar variable y and a constant a,

$$\displaystyle \begin{aligned} E[y-a]^2&=E[y-E(y)+E(y)-a]^2=E[(y-E(y))^2]+2E[(y-E(y))(E(y)-a)]\\ &\ \ \ \ +E[(E(y)-a)^2]=\text{Var}(y)+0+ [E(y)-a]^2. \end{aligned} $$
(iii)

As the only term on the right-hand side containing a is [E(y) − a]2, the minimum is attained when this term is zero since it is a non-negative constant, zero occurring when a = E[y]. Thus, E[ya]2 is minimized when a = E[y]. If a = ϕ(X (2)) at given value of X (2), then the best predictor of x 1, based on X (2) is E[x 1|X (2)] or the regression of x 1 on X (2). Let us determine what happens when E[x 1|X (2)] is a linear function in X (2). Let the linear function be \(b_0+b_2x_2+\cdots +b_px_p=b_0+B_2^{\prime }X_{(2)},\ B_2^{\prime }=(b_2,\ldots ,b_p),\) where b 0, b 2, …, b p are real constants [Note that only real variables and real constants are considered in this section]. That is, for some constant b 0,

$$\displaystyle \begin{aligned} E[x_1|X_{(2)}]=b_0+b_2x_2+\cdots+b_px_p. \end{aligned}$$
(iv)

Taking expectation with respect to x 1, x 2, …, x p in (iv), it follows from (i) that the left-hand side becomes E[x 1], the right side being b 0 + b 2 E[x 2] + ⋯ + b p E[x p]; subtracting this from (iv), we have

$$\displaystyle \begin{aligned} E[x_1|X_{(2)}]-E[x_1]=b_2(x_2-E[x_2])+\cdots+b_p(x_p-E[x_p]). \end{aligned}$$
(v)

Multiplying both sides of (v) by x j − E[x j] and taking expectations throughout, the right-hand side becomes b 2 σ 2j + ⋯ + b p σ pj where σ ij = Cov(x i, x j) for i ≠ j, and σ jj is the variance of x j. The left-hand side is E[(x j − E(x j))(E[x 1|X (2)] − E(x 1))] = E[E(x 1 x j|X (2))] − E(x j)E(x 1) = E[x 1 x j] − E(x 1)E(x j) = Cov(x 1, x j). Three properties were utilized in the derivation, namely (i), the fact that Cov(u, v) = E[(u − E(u))(v − E(v))] = E[u(v − E(v))] = E[v(u − E(u))] and Cov(u, v) = E(uv) − E(u)E(v). As well, Var(u) = E[u − E(u)]2 = E[u(u − E(u))] as long as the second order moments exist. Thus, we have the following by combining all the linear equations for j = 2, …, p:

$$\displaystyle \begin{aligned} \varSigma_{21}=\varSigma_{22}b\ \Rightarrow\ b=\varSigma_{22}^{-1}\varSigma_{21}\mbox{ or }b'=\varSigma_{12}\varSigma_{22}^{-1} {} \end{aligned} $$
(5.6.9)

when Σ 22 is nonsingular, which is the case as it was assumed that Σ 22 > O. Now, the best predictor of x 1 based on a linear function of X (2) or the best predictor in the class of all linear functions of X (2) is

$$\displaystyle \begin{aligned} E[x_1|X_{(2)}]=b'X_{(2)}=\varSigma_{12}\varSigma_{22}^{-1}X_{(2)}. {} \end{aligned} $$
(5.6.10)

Let us consider the correlation between x 1 and its best linear predictor based on X (2) or the correlation between x 1 and the linear regression of x 1 on X (2). Observe that \(\text{Cov}(x_1,\varSigma _{12}\varSigma _{22}^{-1}X_{(2)})= \varSigma _{12}\varSigma _{22}^{-1}\text{ Cov}(X_{(2)},~x_1)=\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21},~ \varSigma _{21}=\varSigma _{12}^{\prime }\). Consider the variance of the best linear predictor: \(\text{ Var}(b'X_{(2)})=b'\text{Cov}(X_{(2)})b=\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{22}\varSigma _{22}^{-1}\varSigma _{21}=\varSigma _{12}\varSigma _{22}^{-1}\varSigma _{21}\). Thus, the square of the correlation between x 1 and its best linear predictor or the linear regression on X (2), denoted by \(\rho ^2_{x_1,b'X_{(2)}}\), is the following:

$$\displaystyle \begin{aligned} \rho^2_{x_1,b'X_{(2)}}=\frac{[\text{Cov}(x_1,b'X_{(2)})]^2}{\text{Var}(x_1)\text{ Var}(b'X_{(2)})}=\frac{\varSigma_{12}\varSigma_{22}^{-1}\varSigma_{21}}{\sigma_{11}}=\rho_{1.(2\ldots p)}^2. {} \end{aligned} $$
(5.6.11)

Hence, the following result:

Theorem 5.6.2

The multiple correlation ρ 1.(2…p) between x 1 and x 2, …, x p is also the correlation between x 1 and its best linear predictor or x 1 and its linear regression on x 2, …, x p.

Observe that normality has not been assumed for obtaining all of the above properties. Thus, the results hold for any population for which moments of order two exist. However, in the case of a nonsingular normal population, that is, X j ∼ N p(μ, Σ), Σ > O, it follows from equation (3.3.5) with r = 1 that \(E[x_1|X_{(2)}]=\varSigma _{12}\varSigma _{22}^{-1}X_{(2)}\) when E[X (2)] = μ (2) = O and E(x 1) = μ 1 = 0; otherwise, \(E[x_1|X_{(2)}]=\mu _{1}+\varSigma _{12}\varSigma _{22}^{-1}(X_{(2)}-\mu _{(2)})\).

5.6.4. Distributional aspects of the sample multiple correlation coefficient

From (5.6.7), we have

$$\displaystyle \begin{aligned} 1-r_{1.(2\ldots p)}^2=1-\frac{S_{12}S_{22}^{-1}S_{21}}{s_{11}}=\frac{s_{11}-S_{12}S_{22}^{-1}S_{21}}{s_{11}}=\frac{|S|}{|S_{22}|s_{11}}, {} \end{aligned} $$
(5.6.12)

which can be established from the expansion of the determinant \(|S|=|S_{22}|~|S_{11}-S_{12}S_{22}^{-1}S_{21}|\), which is available from Sect. 1.3. In our case, S 11 is 1 × 1 and is hence denoted s 11, so that \(|S_{11}-S_{12}S_{22}^{-1}S_{21}|\) is the 1 × 1 quantity \(s_{11}-S_{12}S_{22}^{-1}S_{21}\). Let \(u=1-r_{1.(2\ldots p)}^2=\frac {|S|}{|S_{22}|s_{11}}\). We can compute arbitrary moments of u by integrating over the density of S, namely the Wishart density with m = n − 1 degrees of freedom when the population is Gaussian, where n is the sample size. That is, for arbitrary h,

$$\displaystyle \begin{aligned} E[u^h]=\frac{1}{2^{\frac{mp}{2}}|\varSigma|{}^{\frac{m}{2}}\varGamma_p(\frac{m}{2})}\int_{S>O}u^h|S|{}^{\frac{m}{2}-\frac{p+1}{2}}\text{e}^{-\frac{1}{2}\text{ tr}(\varSigma^{-1}S)}\text{d}S. \end{aligned}$$
(i)

Note that \(u^h=|S|{ }^h|S_{22}|{ }^{-h}s_{11}^{-h}\). Among the three factors |S|h, |S 22|−h and \(s_{11}^{-h}\), the factors |S 22|−h and \(s_{11}^{-h}\) are problematic. We will replace them by equivalent integrals so that the problematic part is shifted to the exponent. Consider the identities

$$\displaystyle \begin{aligned} s_{11}^{-h}&=\frac{1}{\varGamma(h)}\int_{x=0}^{\infty}x^{h-1}\text{e}^{-s_{11}x}\text{d}x,~ x>0,~s_{11}>0,~ \Re(h)>0 \end{aligned} $$
(ii)
$$\displaystyle \begin{aligned} |S_{22}|{}^{-h}&=\frac{1}{\varGamma_{p_2}(h)}\int_{X_2>O}|X_2|{}^{h-\frac{p_2+1}{2}}\text{e}^{-\text{tr}(S_{22}X_2)}\text{d}X_2, \end{aligned} $$
(iii)

for \(X_2>O,~ S_{22}>O,~\Re (h)>\frac {p_2-1}{2}\) where X 2 > O is a p 2 × p 2 real positive definite matrix, p 2 = p − 1, p 1 = 1, p 1 + p 2 = p. Then, excluding \(-\frac {1}{2}\), the exponent in (i) becomes the following:

$$\displaystyle \begin{aligned} \text{tr}(\varSigma^{-1}S)+2s_{11}x+2\,\text{tr}(S_{22}X_2)=\text{tr}[S(\varSigma^{-1}+2Z)],~ Z=\begin{bmatrix}x&O\\ O&X_2\end{bmatrix}, \end{aligned}$$
(iv)

where Z is p × p, with x occupying the (1, 1)-th position and X 2 constituting the lower diagonal (p − 1) × (p − 1) block.

Noting that (Σ −1 + 2Z) = Σ −1(I + 2ΣZ), we are now in a position to integrate out S from (i) by using a real matrix-variate gamma integral, denoting the constant part in (i) as c 1:

$$\displaystyle \begin{aligned} E[u^h]&=c_1\frac{1}{\varGamma(h)\varGamma_{p_2}(h)}\int_{x=0}^{\infty}x^{h-1}\int_{X_2>O}|X_2|{}^{h-\frac{p_2+1}{2}}\\ &\qquad \qquad \qquad \qquad \qquad \times \Big[\int_{S>O}|S|{}^{\frac{m}{2}+h-\frac{p+1}{2}}\text{e}^{-\frac{1}{2}\text{tr}[S( \varSigma^{-1}+2Z)]}\text{d}S\Big]\text{ d}x\wedge\text{d}X_{2}\\ &=\frac{c_12^{p(\frac{m}{2}+h)}}{\varGamma(h)\varGamma_{p_2}(h)}\varGamma_p({m}/{2}+h)\int_{x=0}^{\infty}\int_{X_2>O}x^{h-1}|X_2|{}^{h-\frac{p_2+1}{2}}|\varSigma^{-1}+2Z|{}^{-(\frac{m}{2}+h)}\text{d}x\wedge\text{ d}X_2\\ &=\frac{c_12^{p(\frac{m}{2}+h)}}{\varGamma(h)\varGamma_{p_2}(h)}\varGamma_p({m}/{2}+h)|\varSigma|{}^{\frac{m}{2}+h}\int_{x=0}^{\infty}\int_{X_2>O}x^{h-1} |X_2|{}^{h-\frac{p_2+1}{2}}\\ &\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \times |I+2\varSigma Z|{}^{-(\frac{m}{2}+h)}\text{d}x\wedge\text{d}X_2. {} \end{aligned} $$
(5.6.13)

The integral in (5.6.13) can be evaluated for a general Σ, which will produce the non-null density of \(1-r_{1.(2\ldots p)}^2\)—non-null in the sense that the population multiple correlation ρ 1.(2…p)≠0. However, if ρ 1.(2…p) = 0, which we call the null case, the determinant part in (5.6.13) splits into two factors, one depending only on x and the other only involving X 2. So letting H o: ρ 1.(2…p) = 0,

$$\displaystyle \begin{aligned} E[u^h|H_o]&=\frac{c_12^{p(\frac{m}{2}+h)}}{\varGamma(h)\varGamma_{p_2}(h)}\varGamma_p({m}/{2}+h)|\varSigma|{}^{\frac{m}{2}+h}\int_0^{\infty}x^{h-1} [1+2\sigma_{11}x]^{-(\frac{m}{2}+h)}\text{d}x\\ &\qquad \qquad \qquad \qquad \qquad \times \int_{X_2>O}|X_2|{}^{h-\frac{p_2+1}{2}}|I+2\varSigma_{22}X_2|{}^{-(\frac{m}{2}+h)}\text{d}X_2. \end{aligned} $$
(v)

But the x-integral gives \(\frac {\varGamma (h)\varGamma (\frac {m}{2})}{\varGamma (\frac {m}{2}+h)}(2\sigma _{11})^{-h}\) for \(\Re (h)>0\) and the X 2-integral gives \(\frac {\varGamma _{p_2}(h)\varGamma _{p_2}(\frac {m}{2})}{\varGamma _{p_2}(\frac {m}{2}+h)}|2\varSigma _{22}|{ }^{-h}\) for \(\Re (h)>\frac {p_2-1}{2}\). Substituting all these in (v), we note that all the factors containing 2 and Σ, σ 11, Σ 22 cancel out, and then by using the fact that

$$\displaystyle \begin{aligned}\frac{\varGamma_p(\frac{m}{2}+h)}{\varGamma(\frac{m}{2}+h)\varGamma_{p-1}(\frac{m}{2}+h)}=\pi^{\frac{p-1}{2}}\frac{\varGamma(\frac{m}{2}-\frac{p-1}{2}+h)} {\varGamma(\frac{m}{2}+h)}, \end{aligned}$$

we have the following expression for the h-th null moment of u:

$$\displaystyle \begin{aligned} E[u^h|H_o]=\frac{\varGamma(\frac{m}{2})}{\varGamma(\frac{m}{2}-\frac{p-1}{2})}\frac{\varGamma(\frac{m}{2}-\frac{p-1}{2}+h)}{\varGamma(\frac{m}{2}+h)},~ \Re(h)>-\frac{m}{2}+\frac{p-1}{2},{} \end{aligned} $$
(5.6.14)

which happens to be the h-th moment of a real scalar type-1 beta random variable with the parameters \((\frac {m}{2}-\frac {p-1}{2},~ \frac {p-1}{2})\). Since h is arbitrary, this h-th moment uniquely determines the distribution, thus the following result:

Theorem 5.6.3

When the population has a p-variate Gaussian distribution with the parameters μ and Σ > O, and the population multiple correlation coefficient ρ 1.(2…p) = 0, the sample multiple correlation coefficient r 1.(2…p) is such that \(u=1-r^2_{1.(2\ldots p)}\) is distributed as a real scalar type-1 beta random variable with the parameters \((\frac {m}{2}-\frac {p-1}{2},~ \frac {p-1}{2})\) , and thereby \(v=\frac {u}{1-u}=\frac {1-r^2_{1.(2\ldots p)}}{r^2_{1.(2\ldots p)}}\) is distributed as a real scalar type-2 beta random variable with the parameters \((\frac {m}{2}-\frac {p-1}{2},~ \frac {p-1}{2})\) and \(w=\frac {1-u}{u}=\frac {r^2_{1.(2\ldots p)}}{1-r^2_{1.(2\ldots p)}}\) is distributed as a real scalar type-2 beta random variable with the parameters \((\frac {p-1}{2},~ \frac {m}{2}-\frac {p-1}{2})\) whose density is

$$\displaystyle \begin{aligned} f_w(w)=\frac{\varGamma(\frac{m}{2})}{\varGamma(\frac{m}{2}-\frac{p-1}{2})\varGamma(\frac{p-1}{2})}w^{\frac{p-1}{2}-1}(1+w)^{-(\frac{m}{2})},~0\le w<\infty, {} \end{aligned} $$
(5.6.15)

and zero elsewhere.

As F-tables are available, we may conveniently express the above real scalar type-2 beta density in terms of an F-density. It suffices to make the substitution \(w=\frac {p-1}{m-p+1}F\) where F is a real F random variable having p − 1 and m − p + 1 degrees of freedom, that is, an F p−1,mp+1 random variable, with m = n − 1, n being the sample size. The density of this F random variable, denoted by f F(F), is the following:

$$\displaystyle \begin{aligned}f_F(F)=\frac{\varGamma(\frac{m}{2})}{\varGamma(\frac{p-1}{2})\varGamma(\frac{m}{2}-\frac{p-1}{2})}\Big(\frac{p-1}{m-p+1}\Big)^{\frac{p-1}{2}}F^{\frac{p-1}{2}-1} \Big(1+\frac{p-1}{m-p+1}F\Big)^{-\frac{m}{2}},~ {} \end{aligned} $$
(5.6.16)

for 0 ≤ F < ∞, and zero elsewhere. In the above simplification, observe that \(\frac {(p-1)/2}{(\frac {m}{2}-\frac {p-1}{2})}=\frac {p-1}{m-p+1}\). Then, for taking a decision with respect to testing the hypothesis H o : ρ 1.(2…p) = 0, first compute \(F_{p-1,m-p+1}=\frac {m-p+1}{p-1}w,~ w=\frac {r^2_{1.(2\ldots p)}}{1-r^2_{1.(2\ldots p)}}\). Then, reject H o if the observed F p−1,m−p+1 ≥ F p−1,m−p+1,α for a given α. This will be a test at significance level α or, equivalently, a test whose critical region has size α. The non-null distribution for evaluating the power of this likelihood ratio test can be determined by evaluating the integral in (5.6.13) and identifying the distribution through the uniqueness property of arbitrary moments.
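
The testing procedure just outlined may be sketched as follows, assuming SciPy is available; the values of r², p and n below are illustrative placeholders.

```python
import numpy as np
from scipy import stats

r2, p, n = 0.18, 4, 50                   # illustrative inputs
m = n - 1
w = r2 / (1 - r2)
F = (m - p + 1) / (p - 1) * w            # ~ F_(p-1, m-p+1) under H_o
p_value = stats.f.sf(F, p - 1, m - p + 1)
print(f"F = {F:.4f}, p-value = {p_value:.4f}")  # reject H_o if p_value <= alpha
```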

Note 5.6.1

By making use of Theorem 5.6.3 as a starting point and exploiting various results connecting real scalar type-1 beta, type-2 beta, F and gamma variables, one can obtain numerous results on the distributional aspects of certain functions involving the sample multiple correlation coefficient.
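
For instance, a small simulation (a sketch assuming NumPy, sampling from Σ = I so that the null hypothesis holds) can be used to check the type-1 beta law of u = 1 − r² stated in Theorem 5.6.3, here by comparing the empirical mean of u with the beta mean a∕(a + b).

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, reps = 3, 25, 2000
m = n - 1
u = np.empty(reps)
for i in range(reps):
    X = rng.standard_normal((p, n))      # Sigma = I, so rho_(1.(2...p)) = 0
    Xbar = X.mean(axis=1, keepdims=True)
    S = (X - Xbar) @ (X - Xbar).T        # Wishart, m = n - 1 degrees of freedom
    r2 = S[0, 1:] @ np.linalg.solve(S[1:, 1:], S[0, 1:]) / S[0, 0]
    u[i] = 1 - r2
a, b = m / 2 - (p - 1) / 2, (p - 1) / 2  # type-1 beta parameters of Theorem 5.6.3
print(u.mean(), a / (a + b))             # the two values should be close
```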

5.6.5. The partial correlation coefficient

Partial correlation is a concept associated with the correlation between residuals in two variables after removing the effects of linear regression on a set of other variables. Consider the real vector \(X'=(x_1,x_2,x_3,\ldots ,x_p)=(x_1,x_2,X_{(3)}^{\prime }), ~X_{(3)}^{\prime }=(x_3,\ldots ,x_p)\) where x 1, …, x p are all real scalar variables. Let the covariance matrix of X be Σ > O and let it be partitioned as follows:

$$\displaystyle \begin{aligned} \varSigma=\begin{bmatrix}\sigma_{11}&\sigma_{12}&\varSigma_{13}\\ \sigma_{21}&\sigma_{22}&\varSigma_{23}\\ \varSigma_{31}&\varSigma_{32}&\varSigma_{33}\end{bmatrix} \end{aligned}$$

where σ 11, σ 12, σ 21, σ 22 are 1 × 1, Σ 13 and Σ 23 are 1 × (p − 2), \(\varSigma _{31}=\varSigma _{13}^{\prime },~\varSigma _{32}=\varSigma _{23}^{\prime }\) and Σ 33 is (p − 2) × (p − 2). Let E[X] = O without any loss of generality. Consider the problem of predicting x 1 by using a linear function of X (3). Then, the regression of x 1 on X (3) is \(E[x_1|X_{(3)}]=\varSigma _{13}\varSigma _{33}^{-1}X_{(3)}\) from (5.6.10), and the residual part, after removing this regression from x 1, is \(e_1=x_1-\varSigma _{13}\varSigma _{33}^{-1}X_{(3)}\). Similarly, the linear regression of x 2 on X (3) is \(E[x_2|X_{(3)}]=\varSigma _{23}\varSigma _{33}^{-1}X_{(3)}\) and the residual in x 2 after removing the effect of X (3) is \(e_2=x_2-\varSigma _{23}\varSigma _{33}^{-1}X_{(3)}\). What, then, are the variances of e 1 and e 2, the covariance between e 1 and e 2, and the scale-free covariance, namely the correlation between e 1 and e 2? Since e 1 and e 2 are both linear functions of the variables involved, we can utilize the expressions for variances of linear functions and covariances between linear functions, a basic discussion of such results being given in Mathai and Haubold (2017b). Thus,

$$\displaystyle \begin{aligned} \text{Var}(e_1)&=\text{Var}(x_1)+\text{Var}(\varSigma_{13}\varSigma_{33}^{-1}X_{(3)})-2\,\text{Cov}(x_1,~\varSigma_{13}\varSigma_{33}^{-1}X_{(3)})\\ &=\sigma_{11}+\varSigma_{13}\varSigma_{33}^{-1}\text{Cov}(X_{(3)})\varSigma_{33}^{-1}\varSigma_{31}-2\,\text{Cov}(x_1,X_{(3)})\varSigma_{33}^{-1}\varSigma_{31}\\ &=\sigma_{11}+\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{31}-2\,\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{31} =\sigma_{11}-\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{31}. \end{aligned} $$
(i)

It can be similarly shown that

$$\displaystyle \begin{aligned} \text{Var}(e_2)&=\sigma_{22}-\varSigma_{23}\varSigma_{33}^{-1}\varSigma_{32} \end{aligned} $$
(ii)
$$\displaystyle \begin{aligned} \text{Cov}(e_1,e_2)&=\sigma_{12}-\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{32}. \end{aligned} $$
(iii)

Then, the correlation between the residuals e 1 and e 2, which is called the partial correlation between x 1 and x 2 after removing the effects of linear regression on X (3) and is denoted by ρ 12.(3…p), is such that

$$\displaystyle \begin{aligned}\rho_{12.(3\ldots p)}^2=\frac{[\sigma_{12}-\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{32}]^2}{[\sigma_{11}-\varSigma_{13}\varSigma_{33}^{-1}\varSigma_{31}] [\sigma_{22}-\varSigma_{23}\varSigma_{33}^{-1}\varSigma_{32}]}.{} \end{aligned} $$
(5.6.17)

In the above simplifications, we have for instance used the fact that \(\varSigma _{13}\varSigma _{33}^{-1}\varSigma _{32}=\varSigma _{23}\varSigma _{33}^{-1}\varSigma _{31}\) since both are real 1 × 1 and one is the transpose of the other.

The corresponding sample partial correlation coefficient between x 1 and x 2 after removing the effects of linear regression on X (3), denoted by r 12.(3…p), is such that:

$$\displaystyle \begin{aligned} r_{12.(3\ldots p)}^2=\frac{[s_{12}-S_{13}S_{33}^{-1}S_{32}]^2}{[s_{11}-S_{13}S_{33}^{-1}S_{31}][s_{22}-S_{23}S_{33}^{-1}S_{32}]} {} \end{aligned} $$
(5.6.18)

where the sample sum of products matrix S is partitioned correspondingly, that is,

$$\displaystyle \begin{aligned} S=\begin{bmatrix}s_{11}&s_{12}&S_{13}\\ s_{21}&s_{22}&S_{23}\\ S_{31}&S_{32}&S_{33}\end{bmatrix},{} \end{aligned} $$
(5.6.19)

with s 11, s 12, s 21, s 22 being 1 × 1, S 13 and S 23 being 1 × (p − 2), \(S_{31}=S_{13}^{\prime },~ S_{32}=S_{23}^{\prime }\), and S 33 being (p − 2) × (p − 2). In all the above derivations, we did not use any assumption of an underlying Gaussian population. The results hold for any general population as long as product moments up to second order exist. However, if we assume a p-variate nonsingular Gaussian population, then we can obtain some interesting results on the distributional aspects of the sample partial correlation, as was done in the case of the sample multiple correlation. Such results will not be considered herein.
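
A numerical sketch of (5.6.18) (assuming NumPy; the simulated data are for illustration only) partitions S as in (5.6.19) and computes the squared sample partial correlation.

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 5, 100
X = rng.standard_normal((p, n))
Xbar = X.mean(axis=1, keepdims=True)
S = (X - Xbar) @ (X - Xbar).T            # p x p sample sum of products matrix

S33 = S[2:, 2:]                          # (p-2) x (p-2) block
# residual covariance s_ij - S_i3 S_33^{-1} S_3j for i, j in {1, 2} (0-indexed)
res = lambda i, j: S[i, j] - S[i, 2:] @ np.linalg.solve(S33, S[2:, j])
r2_partial = res(0, 1)**2 / (res(0, 0) * res(1, 1))   # squared r_(12.(3...p))
print(f"r^2_(12.(3...p)) = {r2_partial:.4f}")
```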

Exercises 5.6

5.6.1

Let the p × p real positive definite matrix W be distributed as W ∼ W p(m, Σ) with Σ = I. Consider the partitioning \(W=\begin{bmatrix}W_{11}&W_{12}\\ W_{21}&W_{22}\end{bmatrix}\) where W 11 is r × r, r < p. Evaluate explicitly the normalizing constant in the density of W by first integrating out (1): W 11, (2): W 22, (3): W 12.

5.6.2

Repeat Exercise 5.6.1 for the complex case.

5.6.3

Let the p × p real positive definite matrix W have a real Wishart density with degrees of freedom m ≥ p and parameter matrix Σ > O. Consider the transformation W = TT′ where T is lower triangular with positive diagonal elements. Evaluate the densities of the t jj’s and the t ij’s, i > j, if (1): Σ = diag(σ 11, …, σ pp), (2): Σ > O is a general matrix.

5.6.4

Repeat Exercise 5.6.3 for the complex case. In the complex case, the diagonal elements in T are real and positive.

5.6.5

Let S ∼ W p(m, Σ), Σ > O. Compute the density of S −1 in the real case, and repeat for the complex case.

5.7. Distributions of Products and Ratios of Matrix-variate Random Variables

In the real scalar case, one can easily interpret products and ratios of real scalar variables, whether these are random or mathematical variables. However, when it comes to matrices, products and ratios are to be carefully defined. Let X 1 and X 2 be independently distributed p × p real symmetric and positive definite matrix-variate random variables with density functions f 1(X 1) and f 2(X 2), respectively. By definition, f 1 and f 2 are respectively real-valued scalar functions of the matrices X 1 and X 2. Due to statistical independence of X 1 and X 2, their joint density, denoted by f(X 1, X 2), is the product of the marginal densities, that is, f(X 1, X 2) = f 1(X 1)f 2(X 2). Let us define a ratio and a product of matrices. Let \(U_2=X_2^{\frac {1}{2}}X_1X_2^{\frac {1}{2}}\) and \(U_1=X_2^{\frac {1}{2}}X_1^{-1}X_2^{\frac {1}{2}}\) be called the symmetric product and symmetric ratio of the matrices X 1 and X 2, where \(X_2^{\frac {1}{2}}\) denotes the positive definite square root of the positive definite matrix X 2. Let us consider the product U 2 first. We could have also defined a product by interchanging X 1 and X 2. When it comes to ratios, we could have considered the ratios X 1 to X 2 as well as X 2 to X 1. Nonetheless, we will start with U 1 and U 2 as defined above.

5.7.1. The density of a product of real matrices

Consider the transformation \(U_2=X_2^{\frac {1}{2}}X_1X_2^{\frac {1}{2}},~V=X_2\). Then, it follows from Theorem 1.6.5 that:

$$\displaystyle \begin{aligned} \text{d}X_1\wedge\text{d}X_2=|V|{}^{-\frac{p+1}{2}}\text{d}U_2\wedge\text{d}V. {} \end{aligned} $$
(5.7.1)

Letting the joint density of U 2 and V  be denoted by g(U 2, V ) and the marginal density of U 2, by g 2(U 2), we have

$$\displaystyle \begin{aligned} f_1(X_1)f_2(X_2)&\,\text{d}X_1\wedge\text{d}X_2=|V|{}^{-\frac{p+1}{2}}f_1(V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}}) f_2(V)\,\text{d}U_2\wedge\text{d}V\\ g_2(U_2)&=\int_V|V|{}^{-\frac{p+1}{2}}f_1(V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}})f_2(V)\text{d}V,{} \end{aligned} $$
(5.7.2)

g 2(U 2) being referred to as the density of the symmetric product U 2 of the matrices X 1 and X 2. For example, letting X 1 and X 2 be independently distributed two-parameter matrix-variate gamma random variables with the densities

$$\displaystyle \begin{aligned} f_{3j}(X_j)=\frac{|B_j|{}^{\alpha_j}}{\varGamma_p(\alpha_j)}|X_j|{}^{\alpha_j-\frac{p+1}{2}}\text{e}^{-\text{tr}(B_jX_j)},~j=1,2, \end{aligned}$$
(i)

for \(B_j>O,~ X_j>O,~ \Re (\alpha _j)>\frac {p-1}{2},~j=1,2\), and zero elsewhere, we have

$$\displaystyle \begin{aligned} g_2(U_2)=c|U_2|{}^{\alpha_1-\frac{p+1}{2}}\int_{V>O}|V|{}^{\alpha_2-\alpha_1-\frac{p+1}{2}}\text{e}^{-\text{tr}(B_2V+B_1V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}})}\text{ d}V, {} \end{aligned} $$
(5.7.3)

where c is the product of the normalizing constants of the densities specified in (i). On comparing (5.7.3) with the Krätzel integral defined in the real scalar case in Chap. 2, as well as in Mathai (2012) and Mathai and Haubold (1988, 2011a, 2017a), it is seen that (5.7.3) can be regarded as a real matrix-variate analogue of Krätzel’s integral. One could also obtain the real matrix-variate version of the inverse Gaussian density from the integrand.
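
For p = 1, (5.7.3) reduces to a Krätzel-type integral in a single variable. The following sketch (assuming SciPy; the parameter values are illustrative) checks numerically that the resulting g 2 integrates to 1, as a density must.

```python
import numpy as np
from scipy import integrate, special

a1, a2, b1, b2 = 2.0, 3.0, 1.5, 0.5     # alpha_1, alpha_2, B_1, B_2 for p = 1
c = b1**a1 * b2**a2 / (special.gamma(a1) * special.gamma(a2))

def g2(u):
    # Kraetzel-type inner integral of (5.7.3) for p = 1
    inner, _ = integrate.quad(
        lambda v: v**(a2 - a1 - 1) * np.exp(-b2 * v - b1 * u / v), 0, np.inf)
    return c * u**(a1 - 1) * inner

total, _ = integrate.quad(g2, 0, np.inf, limit=200)
print(total)                             # should be close to 1
```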

As another example, let f 1(X 1) be a real matrix-variate type-1 beta density as previously defined in this chapter, whose parameters are \((\gamma +\frac {p+1}{2},~\alpha )\) with \(\Re (\alpha )>\frac {p-1}{2},~ \Re (\gamma )>-1\), its density being given by

$$\displaystyle \begin{aligned} f_4(X_1)=\frac{\varGamma_p(\gamma+\frac{p+1}{2}+\alpha)}{\varGamma_p(\gamma+\frac{p+1}{2})\varGamma_p(\alpha)}|X_1|{}^{\gamma} |I-X_1|{}^{\alpha-\frac{p+1}{2}} \end{aligned}$$
(ii)

for \(O<X_1<I,~ \Re (\gamma )>-1,~\Re (\alpha )>\frac {p-1}{2}\), and zero elsewhere. Letting f 2(X 2) = f(X 2) be any other density, the density of U 2 is then

$$\displaystyle \begin{aligned} g_2(U_2)&=\int_V|V|{}^{-\frac{p+1}{2}}f_1(V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}})f_2(V)\text{d}V\\ &=\frac{\varGamma_p(\gamma+\frac{p+1}{2}+\alpha)}{\varGamma_p(\gamma+\frac{p+1}{2})\varGamma_p(\alpha)} \int_{V}|V|{}^{-\frac{p+1}{2}}|V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}}|{}^{\gamma}|I-V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}}|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V\\ &=\frac{\varGamma_p(\gamma+\frac{p+1}{2}+\alpha)}{\varGamma_p(\gamma+\frac{p+1}{2})}\frac{|U_2|{}^{\gamma}}{\varGamma_p(\alpha)} \int_{V>U_2>O}|V|{}^{-\alpha-\gamma} |V-U_2|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V\ \ \ \\ &=\frac{\varGamma_p(\gamma+\frac{p+1}{2}+\alpha)}{\varGamma_p(\gamma+\frac{p+1}{2})}K_{2,U_2,\gamma}^{-\alpha}f{} \end{aligned} $$
(5.7.4)

where

$$\displaystyle \begin{aligned} K_{2,U_2,\gamma}^{-\alpha}f=\frac{|U_2|{}^{\gamma}}{\varGamma_p(\alpha)}\int_{V>U_2>O}|V|{}^{-\alpha-\gamma} |V-U_2|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V,~ \Re(\alpha)>\frac{p-1}{2},{}\end{aligned} $$
(5.7.5)

is called the real matrix-variate Erdélyi-Kober fractional integral of the second kind (right-sided type) of order α and parameter γ, since for p = 1, that is, in the real scalar case, (5.7.5) corresponds to the Erdélyi-Kober fractional integral of the second kind of order α and parameter γ. This connection of the density of a symmetric product of matrices to a fractional integral of the second kind was established by Mathai (2009, 2010) and in subsequent papers.

5.7.2. M-convolution and fractional integral of the second kind

Mathai (1997) referred to the structure in (5.7.2) as the M-convolution of a product where f 1 and f 2 need not be statistical densities. Actually, they could be any functions provided the integral exists. However, if f 1 and f 2 are statistical densities, this M-convolution of a product can be interpreted as the density of a symmetric product. Thus, a physical interpretation of an M-convolution of a product is provided in terms of statistical densities. We have seen that (5.7.2) is connected to a fractional integral when f 1 is a real matrix-variate type-1 beta density and f 2 is an arbitrary density. From this observation, one can introduce a general definition for a fractional integral of the second kind in the real matrix-variate case. Let

$$\displaystyle \begin{aligned} f_1(X_1)=\phi_1(X_1)\frac{|I-X_1|{}^{\alpha-\frac{p+1}{2}}}{\varGamma_p(\alpha)},~ \Re(\alpha)>\frac{p-1}{2}, \end{aligned}$$
(iii)

and f 2(X 2) = ϕ 2(X 2)f(X 2) where ϕ 1 and ϕ 2 are specified functions and f is an arbitrary function. Then, consider the M-convolution of a product, again denoted by g 2(U 2):

$$\displaystyle \begin{aligned} g_2(U_2)&=\int_V|V|{}^{-\frac{p+1}{2}}\phi_1(V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}})\frac{|I-V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}}|{}^{\alpha-\frac{p+1}{2}}}{\varGamma_p(\alpha)}\\ &\ \ \ \ \ \ \ \times \phi_2(V)f(V)\text{d}V,~ \Re(\alpha)>\frac{p-1}{2}.{} \end{aligned} $$
(5.7.6)

The right-hand side of (5.7.6) will be called a fractional integral of the second kind of order α in the real matrix-variate case. By letting p = 1 and specifying ϕ 1 and ϕ 2, one can obtain all the fractional integrals of the second kind of order α that have previously been defined by various authors. Hence, for a general p, one has the corresponding real matrix-variate cases. For example, on letting ϕ 1(X 1) = |X 1|γ and ϕ 2(X 2) = 1, one recovers the Erdélyi-Kober fractional integral of the second kind of (5.7.5) in the real matrix-variate case, which for p = 1 is the Erdélyi-Kober fractional integral of the second kind of order α. Letting ϕ 1(X 1) = 1 and ϕ 2(X 2) = |X 2|α, (5.7.6) simplifies to the following integral, again denoted by g 2(U 2):

$$\displaystyle \begin{aligned} g_2(U_2)=\frac{1}{\varGamma_p(\alpha)}\int_{V>U_2>O}|V-U_2|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V. {} \end{aligned} $$
(5.7.7)

For p = 1, (5.7.7) is the Weyl fractional integral of the second kind of order α. Accordingly, (5.7.7) is the Weyl fractional integral of the second kind in the real matrix-variate case. For p = 1, (5.7.7) is also the Riemann-Liouville fractional integral of the second kind of order α in the real scalar case whenever there exists a finite upper bound for V . If V  is bounded above by a real positive definite constant matrix B > O in the integral in (5.7.7), then (5.7.7) is the Riemann-Liouville fractional integral of the second kind of order α for the real matrix-variate case. Connections to other fractional integrals of the second kind can be established by referring to Mathai and Haubold (2017).
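
A worked p = 1 check (a sketch assuming SciPy; the choice f(v) = e−v is illustrative): the second-kind Weyl integral of e−v equals e−u for every order α > 0, since \(\frac {1}{\varGamma (\alpha )}\int _u^{\infty }(v-u)^{\alpha -1}\text{e}^{-v}\text{d}v=\text{e}^{-u}\).

```python
import numpy as np
from scipy import integrate, special

def weyl_second_kind(f, u, alpha):
    # (1/Gamma(alpha)) * integral_u^inf (v - u)^(alpha - 1) f(v) dv, p = 1 case
    val, _ = integrate.quad(lambda v: (v - u)**(alpha - 1) * f(v), u, np.inf)
    return val / special.gamma(alpha)

f = lambda v: np.exp(-v)
for alpha in (0.5, 1.0, 2.5):
    print(weyl_second_kind(f, 1.2, alpha), np.exp(-1.2))   # pairs should agree
```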

The appeal of fractional integrals of the second kind resides in the fact that they can be given physical interpretations as the density of a symmetric product when f 1 and f 2 are densities, or as an M-convolution of a product, whether in the scalar or the matrix-variate case, and this holds in both the real and complex domains.

5.7.3. A pathway extension of fractional integrals

Consider the following modification to the general definition of a fractional integral of the second kind of order α in the real matrix-variate case given in (5.7.6). Let

$$\displaystyle \begin{aligned} f_1(X_1)=\phi_1(X_1)\frac{1}{\varGamma_p(\alpha)}|I-a(1-q)X_1|{}^{\frac{\eta}{1-q}-\frac{p+1}{2}},~ f_2(X_2)=\phi_2(X_2)f(X_2), \end{aligned}$$
(iv)

where \(\Re (\alpha )>\frac {p-1}{2}\) and q < 1, and a > 0, η > 0 are real scalar constants. For all q < 1, g 2(U 2) corresponding to f 1(X 1) and f 2(X 2) of (iv) will define a family of fractional integrals of the second kind. Observe that when X 1 > O and I − a(1 − q)X 1 > O, then \(O<X_1<\frac {1}{a(1-q)}I\). However, by writing (1 − q) = −(q − 1) for q > 1, one can switch to a type-2 beta form, namely, I + a(q − 1)X 1 > O for q > 1, which implies that X 1 > O, and the fractional nature is lost. As well, when q → 1,

$$\displaystyle \begin{aligned}|I+a(q-1)X_1|{}^{-\frac{\eta}{q-1}}\to \text{e}^{-a\,\eta\,\text{tr}(X_1)}\end{aligned}$$

which is the exponential form or gamma density form. In this case too, the fractional nature is lost. Thus, through q, one can obtain matrix-variate type-1 and type-2 beta families and a gamma family of functions from (iv). Then q is called the pathway parameter which generates three families of functions. However, the fractional nature of the integrals is lost for the cases q > 1 and q → 1. In the real scalar case, x 1 may have an exponent and making use of \([1-(1-q)x_1^{\delta }]^{\alpha -1}\) can lead to interesting fractional integrals for q < 1. However, raising X 1 to an exponent δ in the matrix-variate case will fail to produce results of interest as Jacobians will then take inconvenient forms that cannot be expressed in terms of the original matrices; this is for example explained in detail in Mathai (1997) for the case of a squared real symmetric matrix.
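
The limit just quoted can be observed numerically; a minimal p = 1 sketch follows (assuming NumPy; the constants a, η and x are arbitrary illustrative values).

```python
import numpy as np

a, eta, x = 2.0, 1.5, 0.7
for q in (1.5, 1.1, 1.01, 1.001):
    pathway = (1 + a * (q - 1) * x) ** (-eta / (q - 1))
    print(q, pathway, np.exp(-a * eta * x))   # approaches the exponential form
```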

5.7.4. The density of a ratio of real matrices

One can define a symmetric ratio in four different ways: \(X_2^{\frac {1}{2}}X_1^{-1}X_2^{\frac {1}{2}}\) with V = X 2 or V = X 1 and \(X_1^{\frac {1}{2}}X_2^{-1}X_1^{\frac {1}{2}}\) with V = X 2 or V = X 1. All four forms will produce different structures involving f 1(X 1)f 2(X 2). Since the form U 1 that was specified in Sect. 5.7 in terms of \(X_2^{\frac {1}{2}}X_1^{-1}X_2^{\frac {1}{2}}\) with V = X 2 provides connections to fractional integrals of the first kind, we will consider this one, whose density, denoted by g 1(U 1), is the following, observing that \(\text{d}X_1\wedge \text{ d}X_2=|V|{ }^{\frac {p+1}{2}}|U_1|{ }^{-(p+1)}\text{d}U_1\wedge \text{d}V\):

$$\displaystyle \begin{aligned} g_1(U_1)=\int_V|V|{}^{\frac{p+1}{2}}|U_1|{}^{-(p+1)}f_1(V^{\frac{1}{2}}U_1^{-1}V^{\frac{1}{2}})f_2(V)\text{d}V {} \end{aligned} $$
(5.7.8)

provided the integral exists. As in the fractional integral of the second kind in real matrix-variate case, we can give a general definition for a fractional integral of the first kind in the real matrix-variate case as follows: Let f 1(X 1) and f 2(X 2) be taken as in the case of fractional integral of the second kind with ϕ 1 and ϕ 2 as preassigned functions. Then

$$\displaystyle \begin{aligned} g_1(U_1)&=\int_V|V|{}^{\frac{p+1}{2}}|U_1|{}^{-(p+1)}\phi_1(V^{\frac{1}{2}}U_1^{-1}V^{\frac{1}{2}})\\ &\ \ \ \ \ \ \times \frac{1}{\varGamma_p(\alpha)}|I-V^{\frac{1}{2}}U_1^{-1}V^{\frac{1}{2}}|{}^{\alpha-\frac{p+1}{2}}\phi_2(V)f(V)\text{d}V, ~ \Re(\alpha)>\frac{p-1}{2}.{} \end{aligned} $$
(5.7.9)

As an example, letting

$$\displaystyle \begin{aligned}\phi_1(X_1)=\frac{\varGamma_p(\gamma+\alpha)}{\varGamma_p(\gamma)}|X_1|{}^{\gamma-\frac{p+1}{2}}\mbox{ and }\phi_2(X_2)=1,~ \Re(\gamma)>\frac{p-1}{2},\end{aligned}$$

we have

$$\displaystyle \begin{aligned} g_1(U_1)&=\frac{\varGamma_p(\gamma+\alpha)}{\varGamma_p(\gamma)}\frac{|U_1|{}^{-\alpha-\gamma}}{\varGamma_p(\alpha)} \int_{V<U_1}|V|{}^{\gamma}|U_1-V|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V\\ &=\frac{\varGamma_p(\gamma+\alpha)}{\varGamma_p(\gamma)}K_{1,U_1,\gamma}^{-\alpha}f{} \end{aligned} $$
(5.7.10)

for \(\Re (\alpha )>\frac {p-1}{2}, ~ \Re (\gamma )>\frac {p-1}{2}\), where

$$\displaystyle \begin{aligned} K_{1,U_1,\gamma}^{-\alpha}f=\frac{|U_1|{}^{-\alpha-\gamma}}{\varGamma_p(\alpha)}\int_{V<U_1}|V|{}^{\gamma}|U_1-V|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V {} \end{aligned} $$
(5.7.11)

for \(\Re (\alpha )>\frac {p-1}{2},~ \Re (\gamma )>\frac {p-1}{2},\) is the Erdélyi-Kober fractional integral of the first kind of order α and parameter γ in the real matrix-variate case. Since for p = 1, that is, in the real scalar case, \(K_{1,u_1,\gamma }^{-\alpha }f\) is the Erdélyi-Kober fractional integral of order α and parameter γ, the first author referred to \(K_{1,u_1,\gamma }^{-\alpha }f\) in (5.7.11) as the Erdélyi-Kober fractional integral of the first kind of order α in the real matrix-variate case.

By specializing ϕ 1 and ϕ 2 in the real scalar case, that is, for p = 1, one can obtain all the fractional integrals of the first kind of order α that have been previously introduced in the literature by various authors. One can similarly derive the corresponding results on fractional integrals of the first kind in the real matrix-variate case. Before concluding this section, we will consider one more special case. Let

$$\displaystyle \begin{aligned}\phi_1(X_1)=|X_1|{}^{-\alpha-\frac{p+1}{2}}\mbox{ and }\phi_2(X_2)=|X_2|{}^{\alpha}. \end{aligned}$$

In this case, g 1(U 1) is not a statistical density but it is the M-convolution of a ratio. Under the above substitutions, g 1(U 1) of (5.7.9) becomes

$$\displaystyle \begin{aligned} g_1(U_1)=\frac{1}{\varGamma_p(\alpha)}\int_{V<U_1}|U_1-V|{}^{\alpha-\frac{p+1}{2}}f(V)\text{d}V,~\Re(\alpha)>\frac{p-1}{2}. {} \end{aligned} $$
(5.7.12)

For p = 1, (5.7.12) is the Weyl fractional integral of the first kind of order α; accordingly, the first author refers to (5.7.12) as the Weyl fractional integral of the first kind of order α in the real matrix-variate case. Since we are considering only real positive definite matrices here, there is a natural lower bound for the integral, that is, the integral is over O < V < U 1. When there is a specific lower bound, such as O < V , then for p = 1, (5.7.12) is called the Riemann-Liouville fractional integral of the first kind of order α. Hence (5.7.12) will be referred to as the Riemann-Liouville fractional integral of the first kind of order α in the real matrix-variate case.
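
For p = 1 with lower bound zero, (5.7.12) is easy to verify on a power function, since \(\frac {1}{\varGamma (\alpha )}\int _0^{u}(u-v)^{\alpha -1}v^{\gamma }\text{d}v=\frac {\varGamma (\gamma +1)}{\varGamma (\gamma +\alpha +1)}u^{\gamma +\alpha }\). A sketch follows (assuming SciPy; the parameter values are illustrative).

```python
import numpy as np
from scipy import integrate, special

alpha, gam, u = 0.7, 2.0, 1.5
val, _ = integrate.quad(lambda v: (u - v)**(alpha - 1) * v**gam, 0, u)
lhs = val / special.gamma(alpha)
rhs = special.gamma(gam + 1) / special.gamma(gam + alpha + 1) * u**(gam + alpha)
print(lhs, rhs)                          # the two values should agree
```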

Example 5.7.1

Let X 1 and X 2 be independently distributed p × p real positive definite gamma matrix-variate random variables whose densities are

$$\displaystyle \begin{aligned}f_j(X_j)=\frac{1}{\varGamma_p(\alpha_j)}|X_j|{}^{\alpha_j-\frac{p+1}{2}}\text{e}^{-\text{tr}(X_j)},~X_j>O,~\Re(\alpha_j)>\frac{p-1}{2},~j=1,2, \end{aligned}$$

and zero elsewhere. Show that the densities of the symmetric ratios of matrices \(U_1=X_1^{-\frac {1}{2}}X_2X_1^{-\frac {1}{2}}\) and \(U_2=X_2^{\frac {1}{2}}X_1^{-1}X_2^{\frac {1}{2}}\) are identical.

Solution 5.7.1

Observe that for p = 1, that is, in the real scalar case, both U 1 and U 2 are the ratio of real scalar variables \(\frac {x_2}{x_1}\), but in the matrix-variate case, U 1 and U 2 are different matrices. Hence, we cannot a priori expect the densities of U 1 and U 2 to be the same. They happen to be identical because of a property called the functional symmetry of gamma densities. Consider U 1 and let V = X 1. Then, \(X_2=V^{\frac {1}{2}}U_1V^{\frac {1}{2}}\) and \(\text{ d}X_1\wedge \text{d}X_2=|V|{ }^{\frac {p+1}{2}}\text{d}V\wedge \text{d}U_1\). Due to the statistical independence of X 1 and X 2, their joint density is f 1(X 1)f 2(X 2) and the joint density of U 1 and V  is \(|V|{ }^{\frac {p+1}{2}}f_1(V)f_2(V^{\frac {1}{2}}U_1V^{\frac {1}{2}})\), the marginal density of U 1, denoted by g 1(U 1), being the following:

$$\displaystyle \begin{aligned}g_1(U_1)=\frac{|U_1|{}^{\alpha_2-\frac{p+1}{2}}}{\varGamma_p(\alpha_1)\varGamma_p(\alpha_2)}\int_{V>O}|V|{}^{\alpha_1+\alpha_2-\frac{p+1}{2}}\text{e}^{-\text{ tr}(V+V^{\frac{1}{2}}U_1V^{\frac{1}{2}})}\text{d}V. \end{aligned}$$

The exponent of e can be written as follows:

$$\displaystyle \begin{aligned}-\text{tr}(V)-\text{tr}(V^{\frac{1}{2}}U_1V^{\frac{1}{2}})\!=\!-\text{tr}(V)-\text{tr}(VU_1)\!=\!-\text{tr}(V(I+U_1))\!=\!-\!\text{ tr}[(I+U_1)^{\frac{1}{2}}V(I+U_1)^{\frac{1}{2}}].\end{aligned}$$

Letting \(Y=(I+U_1)^{\frac {1}{2}}V(I+U_1)^{\frac {1}{2}}\Rightarrow \text{d}Y=|I+U_1|{ }^{\frac {p+1}{2}}\text{d}V.\) Then carrying out the integration in g 1(U 1), we obtain the following density function:

$$\displaystyle \begin{aligned} g_1(U_1)=\frac{\varGamma_p(\alpha_1+\alpha_2)}{\varGamma_p(\alpha_1)\varGamma_p(\alpha_2)}|U_1|{}^{\alpha_2-\frac{p+1}{2}}|I+U_1|{}^{-(\alpha_1+\alpha_2)}, \end{aligned}$$
(i)

which is a real matrix-variate type-2 beta density with the parameters (α 2, α 1). The original conditions \(\Re (\alpha _j)>\frac {p-1}{2},~j=1,2,\) remain the same, no additional conditions being needed. Now, consider U 2 and let V = X 2 so that \(X_1=V^{\frac {1}{2}}U_2^{-1}V^{\frac {1}{2}}\Rightarrow \text{ d}X_1\wedge \text{d}X_2=|V|{ }^{\frac {p+1}{2}}|U_2|{ }^{-(p+1)}\text{d}V\wedge \text{d}U_2\). The marginal density of U 2 is then:

$$\displaystyle \begin{aligned} g_2(U_2)=\frac{|U_2|{}^{-\alpha_1+\frac{p+1}{2}}|U_2|{}^{-(p+1)}}{\varGamma_p(\alpha_1)\varGamma_p(\alpha_2)}\int_{V>O}|V|{}^{\alpha_1+\alpha_2-\frac{p+1}{2}} \text{e}^{-\text{tr}[V+V^{\frac{1}{2}}U_2^{-1}V^{\frac{1}{2}}]}\,\text{d}V. \end{aligned}$$
(ii)

As previously explained, the exponent in (ii) can be simplified to \(-\text{tr}[(I+U_2^{-1})^{\frac {1}{2}}V\) \((I+U_2^{-1})^{\frac {1}{2}}]\), which once integrated out yields \(\varGamma _p(\alpha _1+\alpha _2)|I+U_2^{-1}|{ }^{-(\alpha _1+\alpha _2)}\). Then,

$$\displaystyle \begin{aligned} |U_2|{}^{-\alpha_1+\frac{p+1}{2}}|U_2|{}^{-(p+1)}|I+U_2^{-1}|{}^{-(\alpha_1+\alpha_2)}=|U_2|{}^{\alpha_2-\frac{p+1}{2}}|I+U_2|{}^{-(\alpha_1+\alpha_2)}. \end{aligned}$$
(iii)

It follows from (i), (ii) and (iii) that g 1(U 1) = g 2(U 2). Thus, the densities of U 1 and U 2 are indeed one and the same, as had to be proved.
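
The conclusion of Example 5.7.1 can also be checked by simulation; for p = 1, x 2∕x 1 should follow a real scalar type-2 beta (beta-prime) distribution with parameters (α 2, α 1). A sketch follows (assuming NumPy and SciPy; the parameter values are illustrative).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
a1, a2, N = 2.5, 4.0, 100_000
x1 = rng.gamma(a1, size=N)               # gamma(alpha_1), unit scale
x2 = rng.gamma(a2, size=N)               # gamma(alpha_2), unit scale
u = x2 / x1
ks = stats.kstest(u, stats.betaprime(a2, a1).cdf)   # type-2 beta = beta-prime
print(ks.pvalue)                         # a large p-value supports the claim
```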

5.7.5. A pathway extension of first kind integrals, real matrix-variate case

As in the case of fractional integral of the second kind, we can also construct a pathway extension of the first kind integrals in the real matrix-variate case. Let

$$\displaystyle \begin{aligned} f_1(X_1)=\frac{\phi_1(X_1)}{\varGamma_p(\alpha)}|I-a(1-q)X_1|{}^{\alpha-\frac{p+1}{2}},~ \alpha=\frac{\eta}{1-q},~ \Re(\alpha)>\frac{p-1}{2}, {} \end{aligned} $$
(5.7.13)

and f 2(X 2) = ϕ 2(X 2)f(X 2) for the scalar parameters a > 0, η > 0, q < 1. When q < 1, (5.7.13) remains in the generalized type-1 beta family of functions. However, when q > 1, f 1 switches to the generalized type-2 beta family of functions and, when q → 1, (5.7.13) goes into a gamma family of functions. Since X 1 > O for q > 1 and q → 1, the fractional nature is lost in those instances. Hence, only the case q < 1 is relevant in this subsection. For various values of q < 1, one has a family of fractional integrals of the first kind coming from (5.7.13). For details on the concept of pathway, the reader may refer to Mathai (2005) and later papers. With the function f 1(X 1) as specified in (5.7.13) and the corresponding f 2(X 2) = ϕ 2(X 2)f(X 2), one can write down the M-convolution of a ratio, g 1(U 1), corresponding to (5.7.8). Thus, we have the pathway extended form of g 1(U 1).

5.7a. Density of a Product and Integrals of the Second Kind

The discussion in this section parallels that in the real matrix-variate case. Hence, only a summarized treatment will be provided. With respect to the density of a product when \(\tilde {f}_1\) and \(\tilde {f}_2\) are matrix-variate gamma densities in the complex domain, the results are parallel to those obtained in the real matrix-variate case. Hence, we will consider an extension of fractional integrals to the complex matrix-variate cases. Matrices in the complex domain will be denoted with a tilde. Let \(\tilde {X}_1\) and \(\tilde {X}_2\) be independently distributed Hermitian positive definite complex matrix-variate random variables whose densities are \(\tilde {f}_1(\tilde {X}_1)\) and \(\tilde {f}_2(\tilde {X}_2),\) respectively. Let \(\tilde {U}_2=\tilde {X}_2^{\frac {1}{2}}\tilde {X}_1\tilde {X}_2^{\frac {1}{2}}\) and \(\tilde {U}_1=\tilde {X}_2^{\frac {1}{2}}\tilde {X}_1^{-1}\tilde {X}_2^{\frac {1}{2}}\) where \(\tilde {X}_2^{\frac {1}{2}}\) denotes the Hermitian positive definite square root of the Hermitian positive definite matrix \(\tilde {X}_2\). Statistical densities are real-valued scalar functions whether the argument matrix is in the real or complex domain.

5.7a.1. Density of a product and fractional integral of the second kind, complex case

Let us consider the transformation \((\tilde {X}_1,~\tilde {X}_2)\to (\tilde {U}_2,~\tilde {V})\) and \((\tilde {X}_1,~\tilde {X}_2)\to (\tilde {U}_1,~\tilde {V})\), the Jacobians being available from Chap. 1 or Mathai (1997). Then,

$$\displaystyle \begin{aligned}\text{d}\tilde{X}_1\wedge\text{d}\tilde{X}_2=\begin{cases}|\text{det}(\tilde{V})|{}^{-p}\text{d}\tilde{U}_2\wedge\text{d}\tilde{V}\\ |\text{det}(\tilde{V})|{}^p|\text{det}(\tilde{U}_1)|{}^{-2p}\text{d}\tilde{U}_1\wedge\text{d}\tilde{V}.\end{cases}{}\end{aligned} $$
(5.7a.1)

When \(\tilde {f}_1\) and \(\tilde {f}_2\) are statistical densities, the density of the product, denoted by \(\tilde {g}_2(\tilde {U}_2)\), is the following:

$$\displaystyle \begin{aligned} \tilde{g}_2(\tilde{U}_2)=\int_{\tilde{V}}|\text{det}(\tilde{V})|{}^{-p}f_1(\tilde{V}^{-\frac{1}{2}}\tilde{U}_2\tilde{V}^{-\frac{1}{2}})f_2(\tilde{V})\,\text{ d}\tilde{V} {} \end{aligned} $$
(5.7a.2)

where |det(⋅)| is the absolute value of the determinant of (⋅). If f 1 and f 2 are not statistical densities, (5.7a.2) will be called the M-convolution of the product. As in the real matrix-variate case, we will give a general definition of a fractional integral of order α of the second kind in the complex matrix-variate case. Let

$$\displaystyle \begin{aligned}\tilde{f}_1(\tilde{X}_1)=\phi_1(\tilde{X}_1)\frac{1}{\tilde{\varGamma}_p(\alpha)}|\text{det}(I-\tilde{X}_1)|{}^{\alpha-p},~ \Re(\alpha)>p-1, \end{aligned}$$

and \(f_2(\tilde {X}_2)=\phi _2(\tilde {X}_2)\tilde {f}(\tilde {X}_2)\) where ϕ 1 and ϕ 2 are specified functions and f is an arbitrary function. Then, (5.7a.2) becomes

$$\displaystyle \begin{aligned} \tilde{g}_2(\tilde{U}_2)&=\int_{\tilde{V}}|\text{det}(\tilde{V})|{}^{-p}\phi_1(\tilde{V}^{-\frac{1}{2}}\tilde{U}_2\tilde{V}^{-\frac{1}{2}})\\ &\ \ \ \ \ \ \ \times \frac{1}{\tilde{\varGamma}_p(\alpha)}|\text{ det}(I-\tilde{V}^{-\frac{1}{2}}\tilde{U}_2\tilde{V}^{-\frac{1}{2}})|{}^{\alpha-p}\phi_2(\tilde{V})f(\tilde{V})\, \text{d}\tilde{V}{} \end{aligned} $$
(5.7a.3)

for \(\Re (\alpha )>p-1\). As an example, let

$$\displaystyle \begin{aligned}\phi_1(\tilde{X}_1)=\frac{\tilde{\varGamma}_p(\gamma+p+\alpha)}{\tilde{\varGamma}_p(\gamma+p)}|\text{det}(\tilde{X}_1)|{}^{\gamma}\, \mbox{ and }\ \phi_2(\tilde{X}_2)=1. \end{aligned}$$

Observe that \(\tilde {f}_1(\tilde {X}_1)\) has now become a complex matrix-variate type-1 beta density with the parameters (γ + p, α) so that (5.7a.3) can be expressed as follows:

$$\displaystyle \begin{aligned} \tilde{g}_2(\tilde{U}_2)&=\frac{\tilde{\varGamma}_p(\gamma+p+\alpha)}{\tilde{\varGamma}_p(\gamma+p)}\frac{|\text{ det}(\tilde{U}_2)|{}^{\gamma}}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{V}>\tilde{U}_2>O}|\text{det}(\tilde{V})|{}^{-\alpha-\gamma}|\text{ det}(\tilde{V}-\tilde{U}_2)|{}^{\alpha-p}\tilde{f}(\tilde{V})\,\text{d}\tilde{V}\\ &=\frac{\tilde{\varGamma}_p(\gamma+p+\alpha)}{\tilde{\varGamma}_p(\gamma+p)}\tilde{K}_{2,\tilde{U}_2,\gamma}^{-\alpha}f {} \end{aligned} $$
(5.7a.4)

where

$$\displaystyle \begin{aligned} \tilde{K}_{2,\tilde{U}_2,\gamma}^{-\alpha}f=\frac{|\text{det}(\tilde{U}_2)|{}^{\gamma}}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{V}>\tilde{U}_2>O}|\text{ det}(\tilde{V})|{}^{-\alpha-\gamma}|\text{det}(\tilde{V}-\tilde{U}_2)|{}^{\alpha-p}f(\tilde{V})\,\text{d}\tilde{V} {} \end{aligned} $$
(5.7a.5)

is Erdélyi-Kober fractional integral of the second kind of order α in the complex matrix-variate case, which is defined for \(\Re (\alpha )>p-1,~\Re (\gamma )>-1\). The extension of fractional integrals to complex matrix-variate cases was introduced in Mathai (2013). As a second example, let

$$\displaystyle \begin{aligned}\phi_1(\tilde{X}_1)=1\, \mbox{ and }\ \phi_2(\tilde{X}_2)=|\text{det}(\tilde{X}_2)|{}^{\alpha}. \end{aligned}$$

In that case, (5.7a.3) becomes

$$\displaystyle \begin{aligned} \tilde{g}_2(\tilde{U}_2)=\frac{1}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{V}>\tilde{U}_2>O}|\text{det}(\tilde{V}-\tilde{U}_2)|{}^{\alpha-p}f(\tilde{V})\,\text{d}\tilde{V},~ \Re(\alpha)>p-1. {} \end{aligned} $$
(5.7a.6)

The integral (5.7a.6) is Weyl fractional integral of the second kind of order α in the complex matrix-variate case. If V  is bounded above by a Hermitian positive definite constant matrix B > O, then (5.7a.6) is a Riemann-Liouville fractional integral of the second kind of order α in the complex matrix-variate case.

A pathway extension parallel to that developed in the real matrix-variate case can be similarly obtained. Accordingly, the details of the derivation are omitted.

5.7a.2. Density of a ratio and fractional integrals of the first kind, complex case

We will now derive the density of the symmetric ratio \(\tilde {U}_1\) defined in Sect. 5.7a. If \(\tilde {f}_1\) and \(\tilde {f}_2\) are statistical densities, then the density of \(\tilde {U}_1\), denoted by \(\tilde {g}_1(\tilde {U}_1)\), is given by

$$\displaystyle \begin{aligned}\tilde{g}_1(\tilde{U}_1)=\int_{\tilde{V}} |\text{det}(\tilde{V})|{}^{p}|\text{ det}(\tilde{U}_1)|{}^{-2p}\tilde{f}_1(\tilde{V}^{\frac{1}{2}}\tilde{U}_1^{-1}\tilde{V}^{\frac{1}{2}}) \tilde{f}_2(\tilde{V})\,\text{d}\tilde{V},{}\end{aligned} $$
(5.7a.7)

provided the integral is convergent. For the general definition, let us take

$$\displaystyle \begin{aligned}\tilde{f}_1(\tilde{X}_1)=\phi_1(\tilde{X}_1)\frac{1}{\tilde{\varGamma}_p(\alpha)}|\text{det}(I-\tilde{X}_1)|{}^{\alpha-p},~ \Re(\alpha)>p-1, \end{aligned}$$

and \(\tilde {f}_2(\tilde {X}_2)=\phi _2(\tilde {X}_2)\tilde {f}(\tilde {X}_2)\) where ϕ 1 and ϕ 2 are specified functions and \(\tilde {f}\) is an arbitrary function. Then \(\tilde {g}_1(\tilde {U}_1)\) is the following:

$$\displaystyle \begin{aligned} \tilde{g}_1(\tilde{U}_1)&=\int_{\tilde{V}}|\text{det}(\tilde{V})|{}^p|\text{ det}(\tilde{U}_1)|{}^{-2p}\frac{1}{\tilde{\varGamma}_p(\alpha)}\phi_1(\tilde{V}^{\frac{1}{2}}\tilde{U}_1^{-1}\tilde{V}^{\frac{1}{2}})\\ &\ \ \ \ \ \times |\text{det}(I-\tilde{V}^{\frac{1}{2}}\tilde{U}_1^{-1}\tilde{V}^{\frac{1}{2}})|{}^{\alpha-p}\phi_2(\tilde{V})f(\tilde{V})\,\text{ d}\tilde{V}.{} \end{aligned} $$
(5.7a.8)

As an example, let

$$\displaystyle \begin{aligned}\phi_1(\tilde{X}_1)=\frac{\tilde{\varGamma}_p(\gamma+\alpha)}{\tilde{\varGamma}_p(\gamma)}|\text{det}(\tilde{X}_1)|{}^{\gamma-p}\end{aligned}$$

and ϕ 2 = 1. Then,

$$\displaystyle \begin{aligned} \tilde{g}_1(\tilde{U}_1)&=\frac{\tilde{\varGamma}_p(\gamma+\alpha)}{\tilde{\varGamma}_p(\gamma)}\frac{|\text{det}(\tilde{U}_1)|{}^{-\alpha-\gamma}} {\tilde{\varGamma}_p(\alpha)}\int_{O<\tilde{V}<\tilde{U}_1}|\text{det}(\tilde{V})|{}^{\gamma}\\ &\qquad \qquad \qquad \quad \qquad \qquad \qquad \ \ \times |\text{det}(\tilde{U}_1-\tilde{V})|{}^{\alpha-p}f(\tilde{V}) \,\text{d}\tilde{V},~\Re(\alpha)>p-1\\ &=\frac{\tilde{\varGamma}_p(\gamma+\alpha)}{\tilde{\varGamma}_p(\gamma)}K_{1,\tilde{U}_1,\gamma}^{-\alpha}f {} \end{aligned} $$
(5.7a.9)

where

$$\displaystyle \begin{aligned} K_{1,\tilde{U}_1,\gamma}^{-\alpha}f=\frac{|\text{det}(\tilde{U}_1)|{}^{-\alpha-\gamma}}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{V}<\tilde{U}_1}|\text{ det}(\tilde{V})|{}^{\gamma}|\text{det}(\tilde{U}_1-\tilde{V})|{}^{\alpha-p}f(\tilde{V})\,\text{d}\tilde{V} {} \end{aligned} $$
(5.7a.10)

for \(\Re (\alpha )>p-1\), is the Erdélyi-Kober fractional integral of the first kind of order α and parameter γ in the complex matrix-variate case. We now consider a second example. On letting \(\phi _1(\tilde {X}_1)=|\text{det}(\tilde {X}_1)|{ }^{-\alpha -p}\) and \(\phi _2(\tilde {X}_2)=|\text{det}(\tilde {X}_2)|{ }^{\alpha }\), the density of \(\tilde {U}_1\) is

$$\displaystyle \begin{aligned} \tilde{g}_1(\tilde{U}_1)=\frac{1}{\tilde{\varGamma}_p(\alpha)}\int_{\tilde{V}<\tilde{U}_1}|\text{det}(\tilde{U}_1-\tilde{V})|{}^{\alpha-p}f(\tilde{V})\text{d}\tilde{V}, ~\Re(\alpha)>p-1. {} \end{aligned} $$
(5.7a.11)

The integral in (5.7a.11) is Weyl’s fractional integral of the first kind of order α in the complex matrix-variate case, denoted by \(\tilde {W}_{1,\tilde {U}_1}^{-\alpha }f\). Observe that we are considering only Hermitian positive definite matrices. Thus, there is a lower bound, the integral being over \(O<\tilde {V}<\tilde {U}_1\). Hence (5.7a.11) can also represent a Riemann-Liouville fractional integral of the first kind of order α in the complex matrix-variate case with a null matrix as its lower bound. For fractional integrals involving several matrices and fractional differential operators for functions of matrix argument, refer to Mathai (2014a, 2015); for pathway extensions, see Mathai and Haubold (2008, 2011).

Exercises 5.7

All the matrices appearing herein are p × p real positive definite, when real, and Hermitian positive definite, when in the complex domain. The M-transform of a real-valued scalar function f(X) of the p × p real matrix X, with the M-transform parameter ρ, is defined as

$$\displaystyle \begin{aligned}M_{f}(\rho)=\int_{X>O}|X|{}^{\rho-\frac{p+1}{2}}f(X)\,\text{d}X,~ \Re(\rho)>\frac{p-1}{2},\end{aligned}$$

whenever the integral is convergent. In the real case, the M-convolution of a product \(U_2=X_2^{\frac {1}{2}}X_1X_2^{\frac {1}{2}}\) with the corresponding functions f 1(X 1) and f 2(X 2), respectively, is

$$\displaystyle \begin{aligned}g_2(U_2)=\int_V|V|{}^{-\frac{p+1}{2}}f_1(V^{-\frac{1}{2}}U_2V^{-\frac{1}{2}})f_2(V)\,\text{d}V\end{aligned}$$

whenever the integral is convergent. The M-convolution of a ratio in the real case is g 1(U 1). The M-convolutions of a product and a ratio in the complex case are \(\tilde {g}_2(\tilde {U}_2)\) and \( \tilde {g}_1(\tilde {U}_1)\), respectively, as defined earlier in this section. If A −α f denotes a fractional integral operator of order α operating on f, then the semigroup property is that A −α A −β f = A −(α+β) f = A −β A −α f.

5.7.1

Show that the M-transform of the M-convolution of a product is the product of the M-transforms of the individual functions f 1 and f 2, both in the real and complex cases.

5.7.2

What are the M-transforms of the M-convolution of a ratio in the real and complex cases? Establish your assertions.

5.7.3

Show that the semigroup property holds for Weyl’s fractional integral of the (1): first kind, (2): second kind, in the real matrix-variate case.

5.7.4

Do (1) and (2) of Exercise 5.7.3 hold in the complex matrix-variate case? Prove your assertion.

5.7.5

Evaluate the M-transforms of the Erdélyi-Kober fractional integral of order α of (1): the first kind, (2): the second kind and state the conditions for their existence.

5.7.6

Repeat Exercise 5.7.5 for (1) Weyl’s fractional integral of order α, (2) the Riemann-Liouville fractional integral of order α.

5.7.7

Evaluate the Weyl fractional integral of order α of (a): the first kind, (b): the second kind, in the real matrix-variate case, if possible, when the arbitrary function is (1): \(\text{e}^{-\text{tr}(X)}\), (2): \(\text{e}^{\text{tr}(X)}\), and write down the conditions wherever the integral can be evaluated.

5.7.8

Repeat Exercise 5.7.7 for the complex matrix-variate case.

5.7.9

Evaluate the Erdélyi-Kober fractional integral of order α and parameter γ of the (a): first kind, (b): second kind, in the real matrix-variate case, if the arbitrary function is (1): \(|X|^{\delta }\), (2): \(|X|^{-\delta }\), wherever possible, and write down the necessary conditions.

5.7.10

Repeat Exercise 5.7.9 for the complex case. In the complex case, |X|, the determinant of X, is to be replaced by \(|\text{ det}(\tilde {X})|\), the absolute value of the determinant of \(\tilde {X}\).

5.8. Densities Involving Several Matrix-variate Random Variables, Real Case

We will start with real scalar variables. The most popular multivariate distribution, apart from the normal distribution, is the Dirichlet distribution, which is a generalization of the type-1 and type-2 beta distributions.

5.8.1. The type-1 Dirichlet density, real scalar case

Let x 1, …, x k be real scalar random variables having a joint density of the form

$$\displaystyle \begin{aligned} f_1(x_1,\ldots ,x_k)=c_k\,x_1^{\alpha_1-1}\cdots x_k^{\alpha_k-1}(1-x_1-\cdots -x_k)^{\alpha_{k+1}-1} {} \end{aligned} $$
(5.8.1)

for ω = {(x 1, …, x k)|0 ≤ x j ≤ 1, j = 1, …, k, 0 ≤ x 1 + ⋯ + x k ≤ 1}, \(\Re (\alpha _j)>0,~j=1,\ldots ,k+1\), and f 1 = 0 elsewhere. This is the type-1 Dirichlet density, where c k is the normalizing constant. Note that ω describes a simplex and hence the support of f 1 is the simplex ω. The evaluation of the normalizing constant can be achieved in different ways. One method relies on direct integration of the variables, one at a time. For example, integration over x 1 involves the two factors \(x_1^{\alpha _1-1}\) and \((1-x_1-\cdots -x_k)^{\alpha _{k+1}-1}\). Let I 1 be the integral over x 1. Observe that x 1 varies from 0 to 1 − x 2 −⋯ − x k. Then

$$\displaystyle \begin{aligned}I_1=\int_{x_1=0}^{1-x_2-\cdots -x_k}x_1^{\alpha_1-1}(1-x_1-\cdots-x_k)^{\alpha_{k+1}-1}\text{d}x_1. \end{aligned}$$

But

$$\displaystyle \begin{aligned}(1-x_1-\cdots -x_k)^{\alpha_{k+1}-1}=(1-x_2-\cdots -x_k)^{\alpha_{k+1}-1}\left[1-\frac{x_1}{1-x_2-\cdots -x_k}\right]^{\alpha_{k+1}-1}. \end{aligned}$$

Make the substitution \(y=\frac {x_1}{1-x_2-\cdots -x_k}\Rightarrow \text{d}x_1=(1-x_2-\cdots -x_k)\text{d}y\), which enables one to integrate out y by making use of a real scalar type-1 beta integral, giving

$$\displaystyle \begin{aligned}\int_0^1y^{\alpha_1-1}(1-y)^{\alpha_{k+1}-1}\text{d}y=\frac{\varGamma(\alpha_1)\varGamma(\alpha_{k+1})}{\varGamma(\alpha_1+\alpha_{k+1})}\end{aligned}$$

for \(\Re (\alpha _1)>0,~\Re (\alpha _{k+1})>0\). Now, proceed similarly by integrating out x 2 from \(\,x_2^{\alpha _2-1}(1-x_2-\cdots -x_k)^{\alpha _1+\alpha _{k+1}-1}\), and continue in this manner until x k is reached. Finally, after canceling out all common gamma factors, one obtains Γ(α 1)⋯Γ(α k+1)∕Γ(α 1 + ⋯ + α k+1) for \(\Re (\alpha _j)>0,~ j=1,\ldots ,k+1\). Thus, the normalizing constant is given by

$$\displaystyle \begin{aligned}c_k=\frac{\varGamma(\alpha_1+\cdots+\alpha_{k+1})}{\varGamma(\alpha_1)\cdots \varGamma(\alpha_{k+1})},~\Re(\alpha_j)>0,~ j=1,\ldots ,k+1.{}\end{aligned} $$
(5.8.2)
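
As a numerical sanity check of (5.8.2), the normalizing constant may be compared with a direct quadrature of the density kernel over the simplex. The following minimal sketch does this for k = 2, with illustrative parameter values assumed.

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.special import gammaln

# Illustrative parameters (alpha_1, alpha_2, alpha_3) for k = 2
a1, a2, a3 = 1.5, 2.0, 2.5
ck = np.exp(gammaln(a1 + a2 + a3) - gammaln(a1) - gammaln(a2) - gammaln(a3))

# Integrate the kernel over the simplex 0 <= x1, 0 <= x2, x1 + x2 <= 1
kernel = lambda x2, x1: x1**(a1 - 1) * x2**(a2 - 1) * (1 - x1 - x2)**(a3 - 1)
total, _ = dblquad(kernel, 0, 1, 0, lambda x1: 1 - x1)
print(ck * total)  # should print a value close to 1
```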

Another method for evaluating the normalizing constant c k consists of making the following transformation:

$$\displaystyle \begin{aligned} x_1&=y_1\\ x_2&=y_2(1-y_1)\\ x_j&=y_j(1-y_1)(1-y_2)\cdots (1-y_{j-1}),~ j=2,\ldots ,k {} \end{aligned} $$
(5.8.3)

It is then easily seen that

$$\displaystyle \begin{aligned} \text{d}x_1\wedge\ldots \wedge\text{d}x_k&=(1-y_1)^{k-1}(1-y_2)^{k-2}\cdots (1-y_{k-1})\, \text{d}y_1\wedge\ldots \wedge\text{d}y_k.{} \end{aligned} $$
(5.8.4)

Under this transformation, one has

$$\displaystyle \begin{aligned} 1-x_1&=1-y_1\\ 1-x_1-x_2&=(1-y_1)(1-y_2)\\ 1-x_1-\cdots -x_k&=(1-y_1)(1-y_2)\cdots (1-y_k).\end{aligned} $$

Then, we have

$$\displaystyle \begin{aligned} x_1^{\alpha_1-1}&\cdots x_k^{\alpha_k-1}(1-x_1-\cdots -x_k)^{\alpha_{k+1}-1} \text{d}x_1\wedge\ldots \wedge\text{d}x_k\\ &=y_1^{\alpha_1-1}\cdots y_k^{\alpha_k-1}(1-y_1)^{\alpha_2+\cdots+\alpha_{k+1}-1}\\ &\ \ \ \ \times (1-y_2)^{\alpha_3+\cdots+\alpha_{k+1}-1}\cdots (1-y_k)^{\alpha_{k+1}-1} \text{d}y_1\wedge\ldots \wedge\text{d}y_k.\end{aligned} $$

Now, all the y j’s are separated and each one can be integrated out by making use of a type-1 beta integral. For example, the integrals over y 1, y 2, …, y k give the following:

$$\displaystyle \begin{aligned} \int_0^1y_1^{\alpha_1-1}(1-y_1)^{\alpha_2+\cdots+\alpha_{k+1}-1}\text{ d}y_1&=\frac{\varGamma(\alpha_1)\varGamma(\alpha_2+\cdots+\alpha_{k+1})}{\varGamma(\alpha_1+\cdots+\alpha_{k+1})}\\ \int_0^1y_2^{\alpha_2-1}(1-y_2)^{\alpha_3+\cdots+\alpha_{k+1}-1}\text{ d}y_2&=\frac{\varGamma(\alpha_2)\varGamma(\alpha_3+\cdots+\alpha_{k+1})}{\varGamma(\alpha_2+\cdots+\alpha_{k+1})}\\ &\ \ \vdots\\ \int_0^1y_k^{\alpha_k-1}(1-y_k)^{\alpha_{k+1}-1}\text{d}y_k&=\frac{\varGamma(\alpha_k)\varGamma(\alpha_{k+1})}{\varGamma(\alpha_k+\alpha_{k+1})}\end{aligned} $$

for \(\Re (\alpha _j)>0,~j=1,\ldots ,k+1\). Taking the product produces \(c_k^{-1}\).
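
The transformation (5.8.3) also provides a convenient way of simulating the type-1 Dirichlet model: generate the y j's as independent type-1 beta variables with the parameters (α j, α j+1 + ⋯ + α k+1) and reassemble the x j's. The following minimal simulation sketch, with parameter values assumed for illustration, compares the sample means with the known Dirichlet means α j∕(α 1 + ⋯ + α k+1).

```python
import numpy as np
rng = np.random.default_rng(0)

alpha = np.array([1.5, 2.0, 2.5, 3.0])  # (alpha_1,...,alpha_{k+1}), k = 3, assumed values
k, n = len(alpha) - 1, 200_000
x = np.empty((n, k))
stick = np.ones(n)                      # running product (1 - y_1)...(1 - y_{j-1})
for j in range(k):
    # y_j ~ type-1 beta(alpha_j, alpha_{j+1} + ... + alpha_{k+1}), independent
    y = rng.beta(alpha[j], alpha[j + 1:].sum(), size=n)
    x[:, j] = y * stick                 # x_j = y_j (1 - y_1)...(1 - y_{j-1}), as in (5.8.3)
    stick *= 1.0 - y

print(x.mean(axis=0))                   # sample means
print(alpha[:k] / alpha.sum())          # Dirichlet means alpha_j / (alpha_1+...+alpha_{k+1})
```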

5.8.2. The type-2 Dirichlet density, real scalar case

Let x 1 > 0, …, x k > 0 be real scalar random variables having the joint density

$$\displaystyle \begin{aligned} f_2(x_1,\ldots ,x_k)=c_kx_1^{\alpha_1-1}\cdots x_k^{\alpha_k-1}(1+x_1+\cdots+x_k)^{-(\alpha_1+\cdots+\alpha_{k+1})} {} \end{aligned} $$
(5.8.5)

for \(x_j>0,~ j=1,\ldots ,k,~ \Re (\alpha _j)>0,~ j=1,\ldots ,k+1\) and f 2 = 0 elsewhere, where c k is the normalizing constant. This density is known as a type-2 Dirichlet density. It can be shown that the normalizing constant c k is the same as the one obtained in (5.8.2) for the type-1 Dirichlet distribution. This can be established by integrating out the variables one at a time, starting with x k or x 1. This constant can also be evaluated with the help of the following transformation:

$$\displaystyle \begin{aligned} x_1&=y_1\\ x_2&=y_2(1+y_1)\\ x_j&=y_j(1+y_1)(1+y_2)\cdots (1+y_{j-1}),\ j=2,\ldots ,k, {} \end{aligned} $$
(5.8.6)

whose Jacobian is given by

$$\displaystyle \begin{aligned} \text{d}x_1\wedge\ldots \wedge\text{d}x_k&=(1+y_1)^{k-1}(1+y_2)^{k-2}\cdots (1+y_{k-1})\,\text{d}y_1\wedge\ldots \wedge\text{d}y_k. {} \end{aligned} $$
(5.8.7)

5.8.3. Some properties of Dirichlet densities in the real scalar case

Let us determine the h-th moment of (1 − x 1 −⋯ − x k) in a type-1 Dirichlet density:

$$\displaystyle \begin{aligned}E[1-x_1-\cdots -x_k]^h=\int_{\omega}(1-x_1-\cdots -x_k)^hf_1(x_1,\ldots ,x_k)\,\text{d}x_1\wedge\ldots \wedge\text{d}x_k.\end{aligned}$$

In comparison with the total integral, the only change is that the parameter α k+1 is replaced by α k+1 + h; thus the result is available from the normalizing constant. That is,

$$\displaystyle \begin{aligned}E[1-x_1-\cdots -x_k]^h=\frac{\varGamma(\alpha_{k+1}+h)}{\varGamma(\alpha_{k+1})}\frac{\varGamma(\alpha_1+\cdots+\alpha_{k+1})} {\varGamma(\alpha_1+\cdots+\alpha_{k+1}+h)}.{}\end{aligned} $$
(5.8.8)

The additional condition needed is \(\Re (\alpha _{k+1}+h)>0\). Since the moments in (5.8.8) coincide with the h-th moments of a real scalar type-1 beta variable with the parameters (α k+1, α 1 + ⋯ + α k), and since such moments uniquely determine a distribution supported on [0, 1], u = 1 − x 1 −⋯ − x k is indeed a real scalar type-1 beta variable with those parameters. This is stated in the following result:

Theorem 5.8.1

Let x 1, …, x k have a real scalar type-1 Dirichlet density with the parameters (α 1, …, α k;α k+1). Then, u = 1 − x 1 −⋯ − x k has a real scalar type-1 beta distribution with the parameters (α k+1, α 1 + ⋯ + α k), and 1 − u = x 1 + ⋯ + x k has a real scalar type-1 beta distribution with the parameters (α 1 + ⋯ + α k, α k+1).

Some parallel results can also be obtained for type-2 Dirichlet variables. Consider a real scalar type-2 Dirichlet density with the parameters (α 1, …, α k;α k+1) and let v = (1 + x 1 + ⋯ + x k)−1. When taking the h-th moment of v, that is, E[v h], we see that, in comparison with the total integral, the only change is that α k+1 becomes α k+1 + h. Accordingly, v has a real scalar type-1 beta distribution with the parameters (α k+1, α 1 + ⋯ + α k), and \(1-v=\frac {x_1+\cdots +x_k}{1+x_1+\cdots +x_k}\) is a type-1 beta random variable with the parameters interchanged. Hence the following result:

Theorem 5.8.2

Let x 1, …, x k have a real scalar type-2 Dirichlet density with the parameters (α 1, …, α k;α k+1). Then v = (1 + x 1 + ⋯ + x k)−1 has a real scalar type-1 beta distribution with the parameters (α k+1, α 1 + ⋯ + α k) and \(1-v=\frac {x_1+\cdots +x_k}{1+x_1+\cdots +x_k}\) has a real scalar type-1 beta distribution with the parameters (α 1 + ⋯ + α k, α k+1).
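
Theorem 5.8.2 can also be checked by simulation. One standard construction, assumed here, generates a type-2 Dirichlet vector as x j = g j∕g k+1 where g 1, …, g k+1 are independent real gamma variables with the shape parameters α 1, …, α k+1; the sketch below, with illustrative parameter values, then tests v = (1 + x 1 + ⋯ + x k)−1 against the stated type-1 beta distribution.

```python
import numpy as np
from scipy import stats
rng = np.random.default_rng(1)

alpha = np.array([1.5, 2.0, 2.5, 3.0])   # (alpha_1,...,alpha_{k+1}), k = 3, assumed values
n = 100_000

# Assumed construction: with independent g_i ~ Gamma(alpha_i, 1),
# x_j = g_j / g_{k+1}, j = 1,...,k, has the type-2 Dirichlet density (5.8.5)
g = rng.gamma(shape=alpha, size=(n, len(alpha)))
x = g[:, :-1] / g[:, [-1]]

v = 1.0 / (1.0 + x.sum(axis=1))          # v = (1 + x_1 + ... + x_k)^(-1)
# Theorem 5.8.2: v ~ type-1 beta(alpha_{k+1}, alpha_1 + ... + alpha_k)
print(stats.kstest(v, stats.beta(alpha[-1], alpha[:-1].sum()).cdf))
```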

Observe that the joint product moments \(E[x_1^{h_1}\cdots x_k^{h_k}]\) can be determined both in the cases of real scalar type-1 Dirichlet and type-2 Dirichlet densities. This can be achieved by considering the corresponding normalizing constants. Since an arbitrary product moment will uniquely determine the corresponding distribution, one can show that all subsets of variables from the set {x 1, …, x k} are again real scalar type-1 Dirichlet and real scalar type-2 Dirichlet distributed, respectively; to identify the marginal joint density of a subset under consideration, it suffices to set the complementary set of h j’s equal to zero. Type-1 and type-2 Dirichlet densities enjoy many properties, some of which are mentioned in the exercises. As well, there exist several types of generalizations of the type-1 and type-2 Dirichlet models. The first author and his coworkers have developed several such models, one of which was introduced in connection with certain reliability problems.

5.8.4. Some generalizations of the Dirichlet models

Let the real scalar variables x 1, …, x k have a joint density of the following type, which is a generalization of the type-1 Dirichlet density:

$$\displaystyle \begin{aligned} g_1(x_1,\ldots ,x_k)&=b_k\,x_1^{\alpha_1-1}(1-x_1)^{\beta_1}x_2^{\alpha_2-1}(1-x_1-x_2)^{\beta_2}\cdots \\ &\ \ \ \ \times x_k^{\alpha_k-1}(1-x_1-\cdots -x_k)^{\beta_k+\alpha_{k+1}-1}{} \end{aligned} $$
(5.8.9)

for (x 1, …, x k) ∈ ω, \(\Re (\alpha _j)>0,~j=1,\ldots ,k+1,\) as well as other necessary conditions to be stated later, and g 1 = 0 elsewhere, where b k denotes the normalizing constant. This normalizing constant can be evaluated by integrating out the variables one at a time or by making the transformation (5.8.3) and taking into account its associated Jacobian as specified in (5.8.4). Under the transformation (5.8.3), y 1, …, y k will be independently distributed with y j having a type-1 beta density with the parameters (α j, γ j), γ j = α j+1 + ⋯ + α k+1 + β j + ⋯ + β k, j = 1, …, k, which yields the normalizing constant

$$\displaystyle \begin{aligned} b_k=\prod_{j=1}^k\frac{\varGamma(\alpha_j+\gamma_j)}{\varGamma(\alpha_j)\varGamma(\gamma_j)} {} \end{aligned} $$
(5.8.10)

for \(\Re (\alpha _j)>0,~ j=1,\ldots ,k+1,~ \Re (\gamma _j)>0,~ j=1,\ldots ,k,\) where

$$\displaystyle \begin{aligned} \gamma_j=\alpha_{j+1}+\cdots+\alpha_{k+1}+\beta_j+\cdots+\beta_k,~ j=1,\ldots ,k. {} \end{aligned} $$
(5.8.11)

Arbitrary moments \(E[x_1^{h_1}\cdots x_k^{h_k}]\) are available from the normalizing constant b k by replacing α j by α j + h j for j = 1, …, k and then taking the ratio. It can be observed from this arbitrary moment that all subsets of the type (x 1, …, x j) have a density of the type specified in (5.8.9). For other types of subsets, one has initially to rearrange the variables and the corresponding parameters by bringing them to the first j positions and then utilize the previous result on subsets.
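
For concreteness, the normalizing constant in (5.8.10)–(5.8.11) is easily computed numerically. The following sketch, with parameter values assumed for illustration, evaluates b k on the log scale to avoid overflow.

```python
import numpy as np
from scipy.special import gammaln

alpha = np.array([1.5, 2.0, 2.5, 3.0])  # (alpha_1,...,alpha_{k+1}), assumed values
beta  = np.array([0.5, 1.0, 1.5])       # (beta_1,...,beta_k)
k = len(beta)

# gamma_j = alpha_{j+1} + ... + alpha_{k+1} + beta_j + ... + beta_k, as in (5.8.11)
gam = np.array([alpha[j + 1:].sum() + beta[j:].sum() for j in range(k)])

# log b_k from (5.8.10)
log_bk = np.sum(gammaln(alpha[:k] + gam) - gammaln(alpha[:k]) - gammaln(gam))
print(np.exp(log_bk))
```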

The following model, which corresponds to (5.8.9) for the type-2 Dirichlet case, was introduced by the first author:

$$\displaystyle \begin{aligned} g_2(x_1,\ldots ,x_k)&=a_kx_1^{\alpha_1-1}(1+x_1)^{-\beta_1}x_2^{\alpha_2-1}(1+x_1+x_2)^{-\beta_2}\cdots \\ &\ \ \ \times x_k^{\alpha_k-1}(1+x_1+\cdots+x_k)^{-(\alpha_1+\cdots+\alpha_{k+1})-\beta_k}{} \end{aligned} $$
(5.8.12)

for \(x_j>0,~j=1,\ldots ,k,~ \Re (\alpha _j)>0,~ j=1,\ldots ,k+1,\) as well as other necessary conditions to be stated later, and g 2 = 0 elsewhere. In order to evaluate the normalizing constant a k, one can use the transformation (5.8.6) and its associated Jacobian given in (5.8.7). Then, y 1, …, y k become independently distributed real scalar type-2 beta variables with the parameters (α j, δ j), where

$$\displaystyle \begin{aligned} \delta_j=\alpha_1+\cdots+\alpha_{j-1}+\alpha_{k+1}+\beta_j+\cdots+\beta_k {} \end{aligned} $$
(5.8.13)

for \(\Re (\alpha _j)>0,~j=1,\ldots ,k+1,~ \Re (\delta _j)>0,~j=1,\ldots ,k\). Other generalizations are available in the literature.

5.8.5. A pseudo Dirichlet model

In the type-1 Dirichlet model, the support is the previously described simplex ω. We will now consider a model, which was recently introduced by the first author, wherein the variables can vary freely in a hypercube. Let us begin with the case k = 2. Consider the model

$$\displaystyle \begin{aligned}g_{12}(x_1,x_2)=c_{12}\,x_2^{\alpha_1}(1-x_1)^{\alpha_1-1}(1-x_2)^{\alpha_2-1}(1-x_1x_2)^{-(\alpha_1+\alpha_2-1)},~ 0\le x_j\le 1,{}\end{aligned} $$
(5.8.14)

for \(\Re (\alpha _j)>0,~j=1,2,\) and g 12 = 0 elsewhere. In this case, the variables are free to vary within the unit square. Let us evaluate the normalizing constant c 12. For this purpose, let us expand the last factor by making use of the binomial expansion since 0 < x 1 x 2 < 1. Then,

$$\displaystyle \begin{aligned} (1-x_1x_2)^{-(\alpha_1+\alpha_2-1)}=\sum_{k=0}^{\infty}\frac{(\alpha_1+\alpha_2-1)_k}{k!}x_1^kx_2^k \end{aligned}$$
(i)

where, for example, the Pochhammer symbol (a)k stands for

$$\displaystyle \begin{aligned}(a)_k=a(a+1)\cdots (a+k-1),~ a\ne 0,~ (a)_0=1.\end{aligned}$$

The integral over x 1 gives

$$\displaystyle \begin{aligned} \int_0^1x_1^k(1-x_1)^{\alpha_1-1}\text{d}x_1=\frac{\varGamma(k+1)\varGamma(\alpha_1)}{\varGamma(\alpha_1+k+1)},~\Re(\alpha_1)>0, \end{aligned}$$
(ii)

and the integral over x 2 yields

$$\displaystyle \begin{aligned} \int_0^1x_2^{\alpha_1+k}(1-x_2)^{\alpha_2-1}\text{d}x_2=\frac{\varGamma(\alpha_1+k+1)\varGamma(\alpha_2)}{\varGamma(\alpha_1+\alpha_2+k+1)},~\Re(\alpha_2)>0. \end{aligned}$$
(iii)

Taking the product of the right-hand side expressions in (ii) and (iii) and observing that Γ(α 1 + α 2 + k + 1) = Γ(α 1 + α 2 + 1)(α 1 + α 2 + 1)k and Γ(k + 1) = (1)k, we obtain the following total integral:

$$\displaystyle \begin{aligned} \frac{\varGamma(\alpha_1)\varGamma(\alpha_2)}{\varGamma(\alpha_1+\alpha_2+1)}&\sum_{k=0}^{\infty}\frac{(1)_k(\alpha_1+\alpha_2-1)_k}{k!(\alpha_1+\alpha_2+1)_k}\\ &=\frac{\varGamma(\alpha_1)\varGamma(\alpha_2)}{\varGamma(\alpha_1+\alpha_2+1)}{{}_2F_1}(1,\alpha_1+\alpha_2-1;\alpha_1+\alpha_2+1;1)\\ &=\frac{\varGamma(\alpha_1)\varGamma(\alpha_2)}{\varGamma(\alpha_1+\alpha_2+1)}\frac{\varGamma(\alpha_1+\alpha_2+1)\varGamma(1)}{\varGamma(\alpha_1+\alpha_2)\varGamma(2)}\\ &=\frac{\varGamma(\alpha_1)\varGamma(\alpha_2)}{\varGamma(\alpha_1+\alpha_2)},~\Re(\alpha_1)>0,~\Re(\alpha_2)>0,{} \end{aligned} $$
(5.8.15)

where the 2 F 1 hypergeometric function with argument 1 is evaluated with the following identity:

$$\displaystyle \begin{aligned} {{}_2F_1}(a,b;c;1)=\frac{\varGamma(c)\varGamma(c-a-b)}{\varGamma(c-a)\varGamma(c-b)} {} \end{aligned} $$
(5.8.16)

whenever the gamma functions are defined. Observe that (5.8.15) is a surprising result, as it is the total integral coming from a real type-1 beta density with the parameters (α 1, α 2); a quick numerical check of the identity (5.8.16) is sketched below.
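
The following minimal sketch, with illustrative parameter values assumed, compares SciPy's evaluation of the \({}_2F_1\) function at unit argument with the gamma-ratio form on the right-hand side of (5.8.16); here c − a − b = 1 > 0, so the series converges at 1.

```python
import numpy as np
from scipy.special import hyp2f1, gammaln

a1, a2 = 1.3, 2.2                      # illustrative parameter values
a, b, c = 1.0, a1 + a2 - 1, a1 + a2 + 1

lhs = hyp2f1(a, b, c, 1.0)             # 2F1(a, b; c; 1)
rhs = np.exp(gammaln(c) + gammaln(c - a - b) - gammaln(c - a) - gammaln(c - b))
print(lhs, rhs)                        # both sides of (5.8.16) should agree
```

Now, consider the general model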

$$\displaystyle \begin{aligned} g_{1k}(x_1,\ldots ,x_k)&=c_{1k}(1-x_1)^{\alpha_1-1}\cdots (1-x_k)^{\alpha_k-1}x_2^{\alpha_1}\cdots \\ &\ \ \ \times x_k^{\alpha_1+\cdots+\alpha_{k-1}}(1-x_1\ldots x_k)^{-(\alpha_1+\cdots+\alpha_k-1)},~0\le x_j\le 1,~j=1,\ldots ,k.{} \end{aligned} $$
(5.8.17)

Proceeding exactly as in the case of k = 2, one obtains the total integral as

$$\displaystyle \begin{aligned}{}[c_{1k}]^{-1}=\frac{\varGamma(\alpha_1)\ldots \varGamma(\alpha_k)}{\varGamma(\alpha_1+\cdots+\alpha_k)},~\Re(\alpha_j)>0,~j=1,\ldots ,k. {} \end{aligned} $$
(5.8.18)

This is the total integral coming from a (k − 1)-variate real type-1 Dirichlet model. Some properties of this distribution are pointed out in some of the assigned problems.

5.8.6. The type-1 Dirichlet model in real matrix-variate case

Direct generalizations of the real scalar variable Dirichlet models to the real as well as complex matrix-variate cases are possible. The type-1 model will be considered first. Let the p × p real positive definite matrices X 1, …, X k be such that X j > O, I − X j > O, that is, X j as well as I − X j are positive definite, for j = 1, …, k, and, in addition, I − X 1 −⋯ − X k > O. Let Ω = {(X 1, …, X k)| O < X j < I, j = 1, …, k, I − X 1 −⋯ − X k > O}. Consider the model

$$\displaystyle \begin{aligned} G_1(X_1,\ldots ,X_k)&=C_k|X_1|{}^{\alpha_1-\frac{p+1}{2}}\cdots |X_k|{}^{\alpha_k-\frac{p+1}{2}}\\ &\ \ \ \times |I-X_1-\cdots -X_k|{}^{\alpha_{k+1}-\frac{p+1}{2}},~ (X_1,\ldots ,X_k)\in\varOmega,{} \end{aligned} $$
(5.8.19)

for \(\Re (\alpha _j)>\frac {p-1}{2},~ j=1,\ldots ,k+1,\) and G 1 = 0 elsewhere. The normalizing constant C k can be determined by using real matrix-variate type-1 beta integrals to integrate out the matrices one at a time. The total integral can also be evaluated by means of the following transformation:

$$\displaystyle \begin{aligned} X_1&=Y_1\\ X_2&=(I-Y_1)^{\frac{1}{2}}Y_2(I-Y_1)^{\frac{1}{2}}\\ X_j&=(I-Y_1)^{\frac{1}{2}}\cdots (I-Y_{j-1})^{\frac{1}{2}}Y_j(I-Y_{j-1})^{\frac{1}{2}}\cdots (I-Y_1)^{\frac{1}{2}},~ j=2,\ldots ,k.{} \end{aligned} $$
(5.8.20)

The associated Jacobian can then be determined by making use of results on matrix transformations that are provided in Sect. 1.6. Then,

$$\displaystyle \begin{aligned} \text{d}X_1\wedge\ldots \wedge\text{d}X_k=|I-Y_1|{}^{(k-1)(\frac{p+1}{2})}\cdots |I-Y_{k-1}|{}^{\frac{p+1}{2}}\text{d}Y_1\wedge\ldots \wedge\text{d}Y_k. {} \end{aligned} $$
(5.8.21)

It can be seen that the Y j’s are independently distributed as real matrix-variate type-1 beta random variables and the product of the integrals gives the following final result:

$$\displaystyle \begin{aligned}C_k=\frac{\varGamma_p(\alpha_1+\cdots+\alpha_{k+1})}{\varGamma_p(\alpha_1)\cdots \varGamma_p(\alpha_{k+1})},~ \Re(\alpha_j)>\frac{p-1}{2},~j=1,\ldots ,k+1.{}\end{aligned} $$
(5.8.22)
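
In the real case, Γ p(α) is available in SciPy through multigammaln, the logarithm of the multivariate gamma function, so that (5.8.22) can be evaluated stably on the log scale. A minimal sketch with assumed parameter values follows.

```python
import numpy as np
from scipy.special import multigammaln

p = 3
alpha = np.array([3.0, 3.5, 4.0, 4.5])  # (alpha_1,...,alpha_{k+1}), each > (p - 1)/2, assumed

# log C_k from (5.8.22); multigammaln(a, p) = log Gamma_p(a) in the real case
log_Ck = multigammaln(alpha.sum(), p) - sum(multigammaln(a, p) for a in alpha)
print(log_Ck)
```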

By integrating out the matrices one at a time, we can show that the marginal densities of all subsets of {X 1, …, X k} also belong to the same real matrix-variate type-1 Dirichlet family and that individual matrices are real matrix-variate type-1 beta distributed. These results can be anticipated by taking the product moment of the determinants, \(E[|X_1|{ }^{h_1}\cdots |X_k|{ }^{h_k}]\); however, arbitrary moments of determinants need not uniquely determine the densities of the corresponding matrices, whereas in the real scalar case one can uniquely identify the density from arbitrary moments, very often under very mild conditions. In particular, the fact that I − X 1 −⋯ − X k has a real matrix-variate type-1 beta distribution is suggested by the arbitrary moments of its determinant, that is, E[|I − X 1 −⋯ − X k|h]; but evaluating this h-th moment and identifying it with the h-th moment of the determinant of a real matrix-variate type-1 beta matrix does not constitute a valid proof in this case. Instead, make a transformation of the type Y 1 = X 1, …, Y k−1 = X k−1, Y k = I − X 1 −⋯ − X k, so that X k = I − Y 1 −⋯ − Y k and the Jacobian is 1 in absolute value. We then end up with a real matrix-variate type-1 Dirichlet density of the same format but whose parameters α k and α k+1 are interchanged. Integrating out Y 1, …, Y k−1, we obtain a real matrix-variate type-1 beta density with the parameters (α k+1, α 1 + ⋯ + α k). Hence the result. Since Y k has a real matrix-variate type-1 beta distribution, I − Y k = X 1 + ⋯ + X k is also a type-1 beta random variable with the parameters interchanged.

The first author and his coworkers have proposed various types of generalizations of the matrix-variate type-1 and type-2 Dirichlet models in the real and complex cases. One of those extensions, which is defined in the real domain, is the following:

$$\displaystyle \begin{aligned} G_2(X_1,\ldots ,X_k)&=C_{1k}|X_1|{}^{\alpha_1-\frac{p+1}{2}}|I-X_1|{}^{\beta_1}|X_2|{}^{\alpha_2-\frac{p+1}{2}}\\ &\ \ \ \times |I-X_1-X_2|{}^{\beta_2}\cdots |X_k|{}^{\alpha_k-\frac{p+1}{2}}\\ &\ \ \ \times |I-X_1-\cdots -X_k|{}^{\alpha_{k+1}+\beta_k-\frac{p+1}{2}},{} \end{aligned} $$
(5.8.23)

for (X 1, …, X k) ∈ Ω, \(\Re (\alpha _j)>\frac {p-1}{2},~ j=1,\ldots ,k+1\), and G 2 = 0 elsewhere. The normalizing constant C 1k can be evaluated by integrating out the matrices one at a time or by using the transformation (5.8.20). Under this transformation, the matrices Y 1, …, Y k are independently distributed as real matrix-variate type-1 beta variables with the parameters (α j, γ j), γ j = α j+1 + ⋯ + α k+1 + β j + ⋯ + β k. The conditions will then be \(\Re (\alpha _j)>\frac {p-1}{2},~ j=1,\ldots ,k+1,\) and \( \Re (\gamma _j)>\frac {p-1}{2},~j=1,\ldots ,k\). Hence

$$\displaystyle \begin{aligned} C_{1k}=\prod_{j=1}^{k}\frac{\varGamma_p(\alpha_j+\gamma_j)}{\varGamma_p(\alpha_j)\varGamma_p(\gamma_j)}. {} \end{aligned} $$
(5.8.24)

5.8.7. The type-2 Dirichlet model in the real matrix-variate case

The type-2 Dirichlet density in the real matrix-variate case is the following:

$$\displaystyle \begin{aligned} \!\!\! G_3(X_1,\ldots ,X_k)&=C_k|X_1|{}^{\alpha_1-\frac{p+1}{2}}\cdots |X_k|{}^{\alpha_k-\frac{p+1}{2}}\\ &\ \ \times |I+X_1+\cdots+X_k|{}^{-(\alpha_1+\cdots+\alpha_{k+1})}\!,~ X_j>O,\!~j=1,\ldots ,k,{} \end{aligned} $$
(5.8.25)

for \(\Re (\alpha _j)>\frac {p-1}{2},~j=1,\ldots ,k+1\) and G 3 = 0 elsewhere, the normalizing constant C k being the same as that appearing in the type-1 Dirichlet case. This can be verified either by integrating out the matrices one at a time from (5.8.25) or by making the following transformation:

$$\displaystyle \begin{aligned} X_1&=Y_1\\ X_2&=(I+Y_1)^{\frac{1}{2}}Y_2(I+Y_1)^{\frac{1}{2}}\\ X_j&=(I+Y_1)^{\frac{1}{2}}\cdots (I+Y_{j-1})^{\frac{1}{2}}Y_j(I+Y_{j-1})^{\frac{1}{2}}\cdots (I+Y_1)^{\frac{1}{2}},\ j=2,\ldots ,k. {} \end{aligned} $$
(5.8.26)

Under this transformation, the Jacobian is as follows:

$$\displaystyle \begin{aligned} \text{d}X_1\wedge\ldots \wedge\text{d}X_k=|I+Y_1|{}^{(k-1)(\frac{p+1}{2})}\cdots |I+Y_{k-1}|{}^{\frac{p+1}{2}}\text{d}Y_1\wedge\ldots \wedge\text{d}Y_k. {} \end{aligned} $$
(5.8.27)

Thus, the Y j’s are independently distributed real matrix-variate type-2 beta variables and the product of the integrals produces [C k]−1. By integrating matrices one at a time, we can see that all subsets of matrices belonging to {X 1, …, X k} will have densities of the type specified in (5.8.25). Several properties can also be established for the model (5.8.25); some of them are included in the exercises.

Example 5.8.1

Evaluate the normalizing constant c explicitly if the function f(X 1, X 2) is a statistical density, where the p × p real matrices are such that X j > O, I − X j > O, j = 1, 2, and I − X 1 − X 2 > O, with

$$\displaystyle \begin{aligned}f(X_1,X_2)=c\,|X_1|{}^{\alpha_1-\frac{p+1}{2}}|I-X_1|{}^{\beta_1}|X_2|{}^{\alpha_2-\frac{p+1}{2}}|I-X_1-X_2|{}^{\beta_2-\frac{p+1}{2}}.\end{aligned}$$

Solution 5.8.1

Note that

$$\displaystyle \begin{aligned}|I-X_1-X_2|{}^{\beta_2-\frac{p+1}{2}}=|I-X_1|{}^{\beta_2-\frac{p+1}{2}}|I-(I-X_1)^{-\frac{1}{2}}X_2(I-X_1)^{-\frac{1}{2}}|{}^{\beta_2-\frac{p+1}{2}}. \end{aligned}$$

Now, let \(Y=(I-X_1)^{-\frac {1}{2}}X_2(I-X_1)^{-\frac {1}{2}}\Rightarrow \text{d}Y=|I-X_1|{ }^{-\frac {p+1}{2}}\text{d}X_2\); the integral over X 2 then gives the following:

$$\displaystyle \begin{aligned}|I-X_1|{}^{\alpha_2+\beta_2-\frac{p+1}{2}}\int_{O<Y<I} |Y|{}^{\alpha_2-\frac{p+1}{2}}|I-Y|{}^{\beta_2-\frac{p+1}{2}}\text{ d}Y=|I-X_1|{}^{\alpha_2+\beta_2-\frac{p+1}{2}}\frac{\varGamma_p(\alpha_2)\varGamma_p(\beta_2)}{\varGamma_p(\alpha_2+\beta_2)}\end{aligned}$$

for \(\Re (\alpha _2)>\frac {p-1}{2},~\Re (\beta _2)>\frac {p-1}{2}\). Then, the integral over X 1 can be evaluated as follows:

$$\displaystyle \begin{aligned}\int_{O<X_1<I}|X_1|{}^{\alpha_1-\frac{p+1}{2}}|I-X_1|{}^{\beta_1+\beta_2+\alpha_2-\frac{p+1}{2}}\text{ d}X_1=\frac{\varGamma_p(\alpha_1)\varGamma_p(\beta_1+\beta_2+\alpha_2)}{\varGamma_p(\alpha_1+\alpha_2+\beta_1+\beta_2)}\end{aligned}$$

for \(\Re (\alpha _1)>\frac {p-1}{2},~ \Re (\beta _1+\beta _2+\alpha _2)>\frac {p-1}{2}\). Collecting the results from the integrals over X 2 and X 1 and using the fact that the total integral is 1, we have

$$\displaystyle \begin{aligned}c=\frac{\varGamma_p(\alpha_2+\beta_2)\varGamma_p(\alpha_1+\alpha_2+\beta_1+\beta_2)}{\varGamma_p(\alpha_2)\varGamma_p(\beta_2)\varGamma_p(\alpha_1)\varGamma_p(\alpha_2+\beta_1+\beta_2)}\end{aligned}$$

for \(\Re (\alpha _j)>\frac {p-1}{2},~j=1,2,~\Re (\beta _2)>\frac {p-1}{2},\) and \(\Re (\beta _1+\beta _2+\alpha _2)>\frac {p-1}{2}\).
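
For p = 1, the matrix-variate gamma functions reduce to ordinary gamma functions, and the result of Solution 5.8.1 can be checked by direct quadrature. A minimal sketch, with parameter values assumed for illustration, follows.

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.special import gammaln

# Illustrative parameters; for p = 1, Gamma_p reduces to the ordinary gamma function
a1, a2, b1, b2 = 1.2, 1.7, 0.8, 1.4
log_c = (gammaln(a2 + b2) + gammaln(a1 + a2 + b1 + b2)
         - gammaln(a2) - gammaln(b2) - gammaln(a1) - gammaln(a2 + b1 + b2))

# For p = 1 the kernel of f(X1, X2) is x1^(a1-1) (1-x1)^b1 x2^(a2-1) (1-x1-x2)^(b2-1)
kernel = lambda x2, x1: x1**(a1 - 1) * (1 - x1)**b1 * x2**(a2 - 1) * (1 - x1 - x2)**(b2 - 1)
total, _ = dblquad(kernel, 0, 1, 0, lambda x1: 1 - x1)
print(np.exp(log_c) * total)  # should print a value close to 1
```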

The first author and his coworkers have established several generalizations to the type-2 Dirichlet model in (5.8.25). One such model is the following:

$$\displaystyle \begin{aligned} G_4(X_1,\ldots ,X_k)&=C_{2k}|X_1|{}^{\alpha_1-\frac{p+1}{2}}|I+X_1|{}^{-\beta_1}|X_2|{}^{\alpha_2-\frac{p+1}{2}}\\ &\ \ \ \times |I+X_1+X_2|{}^{-\beta_2}\cdots |X_k|{}^{\alpha_k-\frac{p+1}{2}}\\ &\ \ \ \times |I+X_1+\cdots+X_k|{}^{-(\alpha_1+\cdots+\alpha_{k+1}+\beta_k)}{} \end{aligned} $$
(5.8.28)

for \(\Re (\alpha _j)>\frac {p-1}{2},~ j=1,\ldots ,k+1,~ X_j>O,~ j=1,\ldots ,k\), as well as other necessary conditions to be stated later, and G 4 = 0 elsewhere. The normalizing constant C 2k can be evaluated by integrating matrices one at a time or by making the transformation (5.8.26). Under this transformation, the Y j’s are independently distributed real matrix-variate type-2 beta variables with the parameters (α j, δ j), where

$$\displaystyle \begin{aligned} \delta_j= \alpha_1+\cdots+\alpha_{j-1}+\alpha_{k+1}+\beta_j+\cdots+\beta_k {} \end{aligned} $$
(5.8.29)

The normalizing constant is then

$$\displaystyle \begin{aligned} C_{2k}=\prod_{j=1}^k\frac{\varGamma_p(\alpha_j+\delta_j)}{\varGamma_p(\alpha_j)\varGamma_p(\delta_j)} {} \end{aligned} $$
(5.8.30)

where δ j is given in (5.8.29). The marginal densities of the subsets, if taken in the order X 1, {X 1, X 2}, and so on, will belong to the same family of densities as that specified by (5.8.28). Several properties of the model (5.8.28) are available in the literature.

5.8.8. A pseudo Dirichlet model

We will now discuss the generalization of the model introduced in Sect. 5.8.5. Consider the density

$$\displaystyle \begin{aligned} G_{1k}(X_1,\ldots ,X_k)&=C_{1k}|I-X_1|{}^{\alpha_1-\frac{p+1}{2}}\cdots |I-X_k|{}^{\alpha_k-\frac{p+1}{2}}\\ &\ \ \ \times |X_2|{}^{\alpha_1}|X_3|{}^{\alpha_1+\alpha_2}\cdots |X_{k}|{}^{\alpha_1+\cdots+\alpha_{k-1}}\\ &\ \ \ \times |I-X_k^{\frac{1}{2}}\cdots X_2^{\frac{1}{2}}X_1X_2^{\frac{1}{2}}\cdots X_k^{\frac{1}{2}}|{}^{-(\alpha_1+\cdots+\alpha_k-\frac{p+1}{2})}. {} \end{aligned} $$
(5.8.31)

Then, by following steps parallel to those used in the real scalar variable case, one can show that the normalizing constant is given by

$$\displaystyle \begin{aligned} C_{1k}=\frac{\varGamma_p(\alpha_1+\cdots+\alpha_k)}{\varGamma_p(\alpha_1)\cdots \varGamma_p(\alpha_k)}\frac{\varGamma_p(p+1)} {[\varGamma_p(\frac{p+1}{2})]^2}.{}\end{aligned} $$
(5.8.32)

The binomial expansion of the last factor determinant in (5.8.31) is somewhat complicated as it involves zonal polynomials; this expansion is given in Mathai (1997). Compared to the real scalar case, the only change is the appearance of the constant \(\frac {\varGamma _p(p+1)}{[\varGamma _p(\frac {p+1}{2})]^2}\) which is 1 when p = 1. Apart from this constant, the rest is the normalizing constant in a real matrix-variate type-1 Dirichlet model in k − 1 variables instead of k variables.
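
The extra constant in (5.8.32) is easily evaluated through multigammaln. The following sketch confirms numerically that it reduces to 1 when p = 1.

```python
import numpy as np
from scipy.special import multigammaln

# The factor Gamma_p(p+1) / [Gamma_p((p+1)/2)]^2 appearing in (5.8.32)
for p in (1, 2, 3):
    log_val = multigammaln(p + 1, p) - 2 * multigammaln((p + 1) / 2, p)
    print(p, np.exp(log_val))  # equals 1 when p = 1
```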

5.8a. Dirichlet Models in the Complex Domain

All the matrices appearing in the remainder of this chapter are p × p Hermitian positive definite, that is, \(\tilde {X}_j=\tilde {X}_j^{*}\) and \(\tilde {X}_j>O\), where an asterisk indicates the conjugate transpose. Complex matrix-variate random variables will be denoted with a tilde. For a complex matrix \(\tilde {X}\), the determinant will be denoted by \(\text{det}(\tilde {X})\) and the absolute value or modulus of the determinant, by \(|\text{det}(\tilde {X})|\). For example, if \(\text{det}(\tilde {X})=a+ib, \ a\) and b being real and \(i=\sqrt {-1}\), the absolute value is \(|\text{det}(\tilde {X})|=+(a^2+b^2)^{\frac {1}{2}}\). The type-1 Dirichlet model in the complex domain, denoted by \(\tilde {G}_1\), is the following:

$$\displaystyle \begin{aligned} \tilde{G}_1(\tilde{X}_1,\ldots ,\tilde{X}_k)&=\tilde{C}_k|\text{det}(\tilde{X}_1)|{}^{\alpha_1-p}\cdots |\text{det}(\tilde{X}_k)|{}^{\alpha_k-p}\\ &\ \ \ \times |\text{det}(I-\tilde{X}_1-\cdots -\tilde{X}_k)|{}^{\alpha_{k+1}-p}{} \end{aligned} $$
(5.8a.1)

for \((\tilde {X}_1,\ldots ,\tilde {X}_k)\in \tilde {\varOmega },\ \tilde {\varOmega }=\{(\tilde {X}_1,\ldots ,\tilde {X}_k)|\,O<\tilde {X}_j<I,~ j=1,\ldots ,k,~O<\tilde {X}_1+\cdots +\tilde {X}_k<I\},~ \Re (\alpha _j)>p-1,~j=1,\ldots ,k+1,\) and \(\tilde {G}_1=0\) elsewhere. The normalizing constant \(\tilde {C}_k\) can be evaluated by integrating out matrices one at a time with the help of complex matrix-variate type-1 beta integrals. One can also employ a transformation of the type given in (5.8.20) where the real matrices are replaced by matrices in the complex domain and Hermitian positive definite square roots are used. The Jacobian is then as follows:

$$\displaystyle \begin{aligned} \text{d}\tilde{X}_1\wedge\ldots \wedge\text{d}\tilde{X}_k=|\text{det}(I-\tilde{Y}_1)|{}^{(k-1)p}\cdots |\text{det}(I-\tilde{Y}_{k-1})|{}^p\,\text{d}\tilde{Y}_1\wedge\ldots \wedge\text{d}\tilde{Y}_k. {} \end{aligned} $$
(5.8a.2)

The \(\tilde {Y}_j\)’s are then independently distributed as complex matrix-variate type-1 beta variables. On taking the product of the total integrals, one can verify that

$$\displaystyle \begin{aligned}\tilde{C}_k=\frac{\tilde{\varGamma}_p(\alpha_1+\cdots+\alpha_{k+1})}{\tilde{\varGamma}_p(\alpha_1)\cdots \tilde{\varGamma}_p(\alpha_{k+1})}{} \end{aligned} $$
(5.8a.3)

where, for example, \(\tilde {\varGamma }_p(\alpha )\) denotes the complex matrix-variate gamma function given by

$$\displaystyle \begin{aligned} \tilde{\varGamma}_p(\alpha)=\pi^{\frac{p(p-1)}{2}}\varGamma(\alpha)\varGamma(\alpha-1)\cdots \varGamma(\alpha-p+1),~ \Re(\alpha)>p-1.{} \end{aligned} $$
(5.8a.4)
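
SciPy has no built-in complex matrix-variate gamma function, but (5.8a.4) translates directly into a short helper. The following sketch, with the function name and parameter values assumed for illustration, computes \(\log \tilde {\varGamma }_p(\alpha )\) and hence the logarithm of \(\tilde {C}_k\) in (5.8a.3).

```python
import numpy as np
from scipy.special import gammaln

def log_cgammap(alpha, p):
    """log of the complex matrix-variate gamma (5.8a.4):
    Gamma~_p(alpha) = pi^{p(p-1)/2} Gamma(alpha) Gamma(alpha - 1) ... Gamma(alpha - p + 1)."""
    return 0.5 * p * (p - 1) * np.log(np.pi) + sum(gammaln(alpha - j) for j in range(p))

p = 3
alpha = np.array([3.5, 4.0, 4.5, 5.0])  # (alpha_1,...,alpha_{k+1}), each > p - 1, assumed

# log of C~_k in (5.8a.3)
log_Ck = log_cgammap(alpha.sum(), p) - sum(log_cgammap(a, p) for a in alpha)
print(log_Ck)
```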

The first author and his coworkers have also discussed various types of generalizations of Dirichlet models in the complex domain.

5.8a.1. A type-2 Dirichlet model in the complex domain

One can have a model parallel to the type-2 Dirichlet model in the real matrix-variate case. Consider the model

$$\displaystyle \begin{aligned} \tilde{G}_2&=\tilde{C}_k|\text{det}(\tilde{X}_1)|{}^{\alpha_1-p}\cdots |\text{det}(\tilde{X}_k)|{}^{\alpha_k-p}\\ &\ \ \ \times |\text{det}(I+\tilde{X}_1+\cdots+\tilde{X}_k)|{}^{-(\alpha_1+\cdots+\alpha_{k+1})} {} \end{aligned} $$
(5.8a.5)

for \(\tilde {X}_j>O,~ j=1,\ldots ,k,~ \Re (\alpha _j)>p-1,~j=1,\ldots ,k+1,\) and \(\tilde {G}_2=0\) elsewhere. By integrating out matrices one at a time with the help of complex matrix-variate type-2 beta integrals or by using a transformation parallel to that provided in (5.8.26) and then integrating out the independently distributed complex type-2 beta variables \(\tilde {Y}_j\), we can show that the normalizing constant \(\tilde {C}_k\) is the same as that obtained in the complex type-1 Dirichlet case. The first author and his coworkers have given various types of generalizations of the complex type-2 Dirichlet density as well.

Exercises 5.8

5.8.1

By integrating out the variables one at a time, derive the normalizing constant in the real scalar type-2 Dirichlet case.

5.8.2

By using the transformation (5.8.3), derive the normalizing constant in the real scalar type-1 Dirichlet case.

5.8.3

By using the transformation in (5.8.6), derive the normalizing constant in the real scalar type-2 Dirichlet case.

5.8.4

Derive the normalizing constants for the extended Dirichlet models in (5.8.9) and (5.8.12).

5.8.5

Evaluate \(E[x_1^{h_1}\cdots x_k^{h_k}]\) for the model specified in (5.8.12) and state the conditions for its existence.

5.8.6

Derive the normalizing constant given in (5.8.18).

5.8.7

With respect to the pseudo Dirichlet model in (5.8.17), show that the product u = x 1x k is uniformly distributed.

5.8.8

Derive the marginal distribution of (1): x 1; (2): (x 1, x 2); (3): (x 1, …, x r), r < k, and the conditional distribution of (x 1, …, x r) given (x r+1, …, x k) in the pseudo Dirichlet model in (5.8.17).

5.8.9

Derive the normalizing constant C k in (5.8.22) by completing the steps following the transformation (5.8.20), and then by integrating out matrices one at a time.

5.8.10

From the outline given after equation (5.8.22), derive the density of I − X 1 −⋯ − X k and therefrom the density of X 1 + ⋯ + X k when (X 1, …, X k) has a type-1 Dirichlet distribution.

5.8.11

Complete the derivation of C 1k in (5.8.24) and verify it by integrating out matrices one at a time from the density given in (5.8.23).

5.8.12

Show that U = (I + X 1 + ⋯ + X k)−1 in the type-2 Dirichlet model in (5.8.25) is real matrix-variate type-1 beta distributed, and specify its parameters.

5.8.13

Evaluate the normalizing constant C k in (5.8.25) by using the transformation provided in (5.8.26) as well as by integrating out matrices one at a time.

5.8.14

Derive the δ j in (5.8.29) and thus the normalizing constant C 2k in (5.8.28).

5.8.15

For the following model in the complex domain, evaluate C:

$$\displaystyle \begin{aligned} f(\tilde{X}_1,\ldots ,\tilde{X}_k)&=C|\text{det}(\tilde{X}_1)|{}^{\alpha_1-p}|\text{det}(I-\tilde{X}_1)|{}^{\beta_1}|\text{det}(\tilde{X}_2)|{}^{\alpha_2-p}|\text{det}(I-\tilde{X}_1-\tilde{X}_2)|{}^{\beta_2}\cdots \\ &\ \ \ \times |\text{det}(I-\tilde{X}_1-\cdots -\tilde{X}_k)|{}^{\alpha_{k+1}-p+\beta_k}.\end{aligned} $$

5.8.16

Evaluate the normalizing constant in the pseudo Dirichlet model in (5.8.31).

5.8.17

In the pseudo Dirichlet model specified in (5.8.31), show that \(U=X_k^{\frac {1}{2}}\cdots X_2^{\frac {1}{2}}X_1X_2^{\frac {1}{2}}\cdots X_k^{\frac {1}{2}}\) is uniformly distributed.

5.8.18

Show that the normalizing constant in the complex type-2 Dirichlet model specified in (5.8a.5) is the same as the one in the type-1 Dirichlet case. Establish the result by integrating out matrices one by one.

5.8.19

Show that the normalizing constant in the type-2 Dirichlet case in (5.8a.5) is the same as that in the type-1 case. Establish this by using a transformation parallel to (5.8.26) in the complex domain.

5.8.20

Construct a generalized model for the type-2 Dirichlet case for k = 3 parallel to the case in (5.8.28) in the complex domain.