2.1. Introduction

It is assumed that the reader has had adequate exposure to basic concepts in Probability, Statistics, Calculus and Linear Algebra. This chapter will serve as a review of the basic ideas about the univariate Gaussian, or normal, distribution as well as related distributions. We will begin with a discussion of the univariate Gaussian density. We will adopt the following notation: real scalar mathematical or random variables will be denoted by lower-case letters such as x, y, z, whereas vector or matrix-variate mathematical or random variables will be denoted by capital letters such as X, Y, Z, … . Statisticians usually employ the double notation X and x where it is claimed that x is a realization of X. Since x can vary, it is a variable in the mathematical sense. Treating mathematical and random variables the same way will simplify the notation and possibly reduce the confusion. Complex variables will be written with a tilde such as \(\tilde {x}, \tilde {y}, \tilde {X}, \tilde {Y}\), etc. Constant scalars and matrices will be written without a tilde, except when stressing that a constant matrix is in the complex domain. In such a case, a tilde will also be utilized for the constant. Constant matrices will be denoted by A, B, C, … .

The numbering will first indicate the chapter and then the section. For example, Eq. (2.1.9) will be the ninth equation in Sect. 2.1 of this chapter. Local numbering for sub-sections will be indicated as (i), (ii), and so on.

Let x 1 be a real univariate Gaussian, or normal, random variable whose parameters are μ 1 and \(\sigma _1^2\); this will be written as \(x_1\sim N_1(\mu _1,\sigma _1^2)\), the associated density being given by

$$\displaystyle \begin{aligned}f(x_1)=\frac{1}{\sigma_1\sqrt{2\pi}}\text{e}^{-\frac{1}{2\sigma_1^2}(x_1-\mu_1)^2},\ -\infty<x_1<\infty,\ -\infty<\mu_1<\infty,\ \sigma_1>0.\end{aligned}$$

In this instance, the subscript 1 in N 1(⋅) refers to the univariate case. Incidentally, a density is a real-valued scalar function of x such that f(x) ≥ 0 for all x and ∫x f(x)dx = 1. The moment generating function (mgf) of this Gaussian random variable x 1, with t 1 as its parameter, is given by the following expected value, where E[⋅] denotes the expected value of [⋅]:

$$\displaystyle \begin{aligned}E[\text{e}^{t_1x_1}]=\int_{-\infty}^{\infty}\text{e}^{t_1x_1}f(x_1)\text{d}x_1=\text{e}^{t_1\mu_1+\frac{1}{2}t_1^2\sigma_1^2}.{} \end{aligned} $$
(2.1.1)
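As a quick numerical illustration of (2.1.1), one may compare the empirical mean of \(\text{e}^{t_1x_1}\) with the closed form above. The following minimal sketch assumes NumPy is available; the parameter values, seed and sample size are arbitrary choices.

```python
# Monte Carlo check of the Gaussian mgf in Eq. (2.1.1); illustrative only.
import numpy as np

rng = np.random.default_rng(0)
mu1, sigma1, t1 = 1.5, 2.0, 0.3
x1 = rng.normal(mu1, sigma1, size=1_000_000)

empirical = np.exp(t1 * x1).mean()
theoretical = np.exp(t1 * mu1 + 0.5 * t1**2 * sigma1**2)
print(empirical, theoretical)  # the two values should agree closely
```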

2.1a. The Complex Scalar Gaussian Variable

Let \(\tilde {x}=x_1+ix_2,\ i=\sqrt {(-1)}\), where x 1, x 2 are real scalar variables. Let \(E[x_1]=\mu _1,\ E[x_2]=\mu _2,\ \text{Var}(x_1)=\sigma _1^2,\ \text{Var}(x_2)=\sigma _2^2,\ \text{Cov}(x_1,x_2)=\sigma _{12}\). The variance of the complex random variable \(\tilde {x}\) is defined as

$$\displaystyle \begin{aligned}\text{Var}(\tilde{x})=E[\tilde{x}-E(\tilde{x})][\tilde{x}-E(\tilde{x})]^{*} \end{aligned}$$

where * indicates a conjugate transpose in general; in this case, it simply means the conjugate since \(\tilde {x}\) is a scalar. Since \(\tilde {x}-E(\tilde {x})=x_1+ix_2-\mu _1-i\mu _2=(x_1-\mu _1)+i(x_2-\mu _2)\) and \([\tilde {x}-E(\tilde {x})]^{*}=(x_1-\mu _1)-i(x_2-\mu _2)\),

$$\displaystyle \begin{aligned} \text{Var}(\tilde{x})&=\!E[\tilde{x}-E(\tilde{x})][\tilde{x}-E(\tilde{x})]^{*}\\ &=\!E[(x_1-\mu_1)\!+\!i(x_2-\mu_2)][(x_1-\mu_1)\!-\!i(x_2-\mu_2)]\!=\!E[(x_1\!-\!\mu_1)^2\!+(x_2\!-\!\mu_2)^2]\\ &=\!\sigma_1^2+\sigma_2^2\equiv\sigma^2.\end{aligned} $$
(i)

Observe that Cov(x 1, x 2) does not appear in the scalar case. However, the covariance will be present in the vector/matrix case as will be explained in the coming chapters. The complex Gaussian density is given by

$$\displaystyle \begin{aligned}f(\tilde{x})=\frac{1}{\pi\sigma^2}\text{e}^{-\frac{1}{\sigma^2}(\tilde{x}-\tilde{\mu})^{*}(\tilde{x}-\tilde{\mu})} \end{aligned} $$
(ii)

for \(\tilde {x}=x_1+ix_2,\ \tilde {\mu }=\mu _1+i\mu _2,\ -\infty <x_j<\infty ,\ -\infty <\mu _j<\infty , \ \sigma ^2>0,\ j=1,2\). We will write this as \(\tilde {x}\sim \tilde {N}_1(\tilde {\mu },\sigma ^2)\). It can be shown that the two parameters appearing in the density in (ii) are the mean value of \(\tilde {x}\) and the variance of \(\tilde {x}\). We now establish that the density in (ii) is equivalent to a real bivariate Gaussian density with \(\sigma _1^2=\frac {1}{2}\sigma ^2,\ \sigma _2^2=\frac {1}{2}\sigma ^2\) and zero correlation. In the real bivariate normal density, the exponent is the following, with Σ as given below:

$$\displaystyle \begin{aligned}-\frac{1}{2}\begin{bmatrix}x_1-\mu_1\\ x_2-\mu_2\end{bmatrix}'\varSigma^{-1}\begin{bmatrix}x_1-\mu_1\\ x_2-\mu_2\end{bmatrix}=-\frac{1}{\sigma^2}[(x_1-\mu_1)^2+(x_2-\mu_2)^2]=-\frac{1}{\sigma^2}(\tilde{x}-\tilde{\mu})^{*}(\tilde{x}-\tilde{\mu}),\ \ \varSigma=\begin{bmatrix}\frac{\sigma^2}{2}&0\\ 0&\frac{\sigma^2}{2}\end{bmatrix}.\end{aligned}$$

This exponent agrees with that appearing in the complex case. Now, consider the constant part in the real bivariate case:

$$\displaystyle \begin{aligned}\frac{1}{(2\pi)\sigma_1\sigma_2}=\frac{1}{2\pi\sqrt{\tfrac{\sigma^2}{2}}\sqrt{\tfrac{\sigma^2}{2}}}=\frac{1}{\pi\sigma^2},\end{aligned}$$

which also coincides with that of the complex Gaussian. Hence, a complex scalar Gaussian is equivalent to a real bivariate Gaussian case whose parameters are as described above.

Let us consider the mgf of the complex Gaussian scalar case. Let \(\tilde {t}=t_1+it_2,i=\sqrt {(-1)}\), with t 1 and t 2 being real parameters, so that \(\tilde {t}^{*}=\bar {\tilde {t}}= t_1-it_2\) is the conjugate of \(\tilde {t}\). Then \(\tilde {t}^{*}\tilde {x}=t_1x_1+t_2x_2+i(t_1x_2-t_2x_1)\). Note that t 1 x 1 + t 2 x 2 contains the necessary number of parameters (that is, 2) and hence, to be consistent with the definition of the mgf in a real bivariate case, the imaginary part should not be taken into account; thus, we should define the mgf as \(E[\text{e}^{\Re (\tilde {t}^{*}\tilde {x})}]\), where \(\Re (\cdot )\) denotes the real part of (⋅). Accordingly, in the complex case, the mgf is obtained as follows:

$$\displaystyle \begin{aligned} M_{\tilde{x}}(\tilde{t})&=E[\text{e}^{\Re(\tilde{t}^{*}\tilde{x})}]\\ &=\frac{1}{\pi\sigma^2}\int_{\tilde{x}}\text{e}^{\Re(\tilde{t}^{*}\tilde{x})-\frac{1}{\sigma^2}(\tilde{x}-\tilde{\mu})^{*}(\tilde{x}- \tilde{\mu})}\text{d}\tilde{x}\\ &=\frac{\text{e}^{\Re(\tilde{t}^{*}\tilde{\mu})}}{\pi\sigma^2}\int_{\tilde{x}}\text{e}^{\Re(\tilde{t}^{*}(\tilde{x}-\tilde{\mu}))-\frac{1}{\sigma^2}(\tilde{x}-\tilde{\mu})^{*}(\tilde{x}-\tilde{\mu})}\text{d}\tilde{x}.\end{aligned} $$

Let us simplify the exponent:

$$\displaystyle \begin{aligned} \Re(\tilde{t}^{*}(\tilde{x}-\tilde{\mu}))&-\frac{1}{\sigma^2}(\tilde{x}-\tilde{\mu})^{*}(\tilde{x}-\tilde{\mu})\\ &=-\{\frac{1}{\sigma^2}(x_1-\mu_1)^2+\frac{1}{\sigma^2}(x_2-\mu_2)^2-t_1(x_1-\mu_1)-t_2(x_2-\mu_2)\}\\ &=\frac{\sigma^2}{4}(t_1^2+t_2^2)-\{(y_1-\frac{\sigma}{2}t_1)^2+(y_2-\frac{\sigma}{2}t_2)^2\}\end{aligned} $$

where \(y_1=\frac {x_1-\mu _1}{\sigma }, y_2=\frac {x_2-\mu _2}{\sigma }\), \(\text{d}y_j=\frac {1}{\sigma }\text{d}x_j, j=1,2\). But

$$\displaystyle \begin{aligned}\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\text{e}^{-(y_j-\frac{\sigma}{2}t_j)^2}\text{d}y_j=1,\ \ j=1,2.\end{aligned}$$

Hence,

$$\displaystyle \begin{aligned}M_{\tilde{x}}(\tilde{t})=\text{e}^{\Re(\tilde{t}^{*}\tilde{\mu})+\frac{\sigma^2}{4}\tilde{t}^{*}\tilde{t}}=\text{e}^{t_1\mu_1+t_2\mu_2+\frac{\sigma^2}{4}(t_1^2+t_2^2)},{} \end{aligned} $$
(2.1a.1)

which is the mgf of the equivalent real bivariate Gaussian distribution.

Note 2.1.1

A statistical density is invariably a real-valued scalar function of the variables involved, be they scalar, vector or matrix variables, real or complex.

2.1.1. Linear functions of Gaussian variables in the real domain

If x 1, …, x k are statistically independently distributed real scalar Gaussian variables with parameters \((\mu _j,\sigma _j^2),j=1,\ldots ,k\) and if a 1, …, a k are real scalar constants then the mgf of a linear function u = a 1 x 1 + ⋯ + a k x k is given by

$$\displaystyle \begin{aligned} M_u(t)&=E[\text{e}^{tu}]=E[\text{e}^{ta_1x_1+\cdots+ta_kx_k}]=M_{x_1}(a_1t)\cdots M_{x_k}(a_kt), \ \text{as}\ M_{ax}(t)=M_x(at),\\ &=\text{e}^{(ta_1\mu_1+\cdots+ta_k\mu_k)+\frac{1}{2}t^2(a_1^2\sigma_1^2+\cdots+a_k^2\sigma_k^2)}\\ &=\text{e}^{t(\sum_{j=1}^ka_j\mu_j)+\frac{1}{2}t^2(\sum_{j=1}^ka_j^2\sigma_j^2)},\end{aligned} $$

which is the mgf of a real normal random variable whose parameters are \((\sum _{j=1}^ka_j\mu _j,\) \(\sum _{j=1}^ka_j^2\sigma _j^2)\). Hence, the following result:

Theorem 2.1.1

Let the real scalar random variable x j have a real univariate normal (Gaussian) distribution, that is, \(x_j\sim N_1(\mu _j,\sigma _j^2),\ j=1,\ldots ,k\) and let x 1, …, x k be statistically independently distributed. Then, any linear function u = a 1 x 1 + ⋯ + a k x k , where a 1, …, a k are real constants, has a real normal distribution with mean value \(\sum _{j=1}^ka_j\mu _j\) and variance \(\sum _{j=1}^ka_j^2\sigma _j^2\) , that is, \(u\sim N_1(\sum _{j=1}^ka_j\mu _j,\sum _{j=1}^ka_j^2\sigma _j^2)\).

Vector/matrix notation enables one to express this result in a more convenient form. Let

$$\displaystyle \begin{aligned}L=\begin{bmatrix}a_1\\ \vdots\\ a_k\end{bmatrix},\ \ X=\begin{bmatrix}x_1\\ \vdots\\ x_k\end{bmatrix},\ \ \mu=E[X]=\begin{bmatrix}\mu_1\\ \vdots\\ \mu_k\end{bmatrix}.\end{aligned}$$

Then denoting the transposes by primes, u = L′X = X′L, E(u) = L′μ = μ′L, and

$$\displaystyle \begin{aligned} \text{Var}(u)&=E[(u-E(u))(u-E(u))']=L'E[(X-E(X))(X-E(X))']L\\ &=L'\text{Cov}(X)L=L'\varSigma L\end{aligned} $$

where, in this case, Σ is the diagonal matrix \(\text{diag}(\sigma _1^2,\ldots ,\sigma _k^2)\). If x 1, …, x k is a simple random sample from x 1, that is, from the normal population specified by the density of x 1 or, equivalently, if x 1, …, x k are iid (independently and identically distributed) random variables having as a common distribution that of x 1, then E(u) = μ 1 L′J = μ 1 J′L and \(\text{Var}(u)=\sigma _1^2L'L\) where u is the linear function defined in Theorem 2.1.1 and J′ = (1, 1, …, 1) is a vector of unities.

Example 2.1.1

Let x 1 ∼ N 1(−1, 1) and x 2 ∼ N 1(2, 2) be independently distributed real normal variables. Determine the density of the linear function u = 5x 1 − 2x 2 + 7.

Solution 2.1.1

Since u is a linear function of independently distributed real scalar normal variables, it is real scalar normal whose parameters E(u) and Var(u) are

$$\displaystyle \begin{aligned} E(u)&=5E(x_1)-2E(x_2)+7=5(-1)-2(2)+7=-2\\ \text{Var}(u)&=25\text{Var}(x_1)+4\text{Var}(x_2)+0=25(1)+4(2)=33,\end{aligned} $$

the covariance being zero since x 1 and x 2 are independently distributed. Thus, u ∼ N 1(−2, 33).
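As an illustrative check of Solution 2.1.1, the following sketch assumes NumPy; note that \(N_1(\mu ,\sigma ^2)\) is parametrized by the variance, so the standard deviations used below are 1 and \(\sqrt {2}\).

```python
# Simulation check that u = 5*x1 - 2*x2 + 7 behaves as N1(-2, 33); illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
x1 = rng.normal(-1.0, 1.0, size=n)            # x1 ~ N1(-1, 1)
x2 = rng.normal(2.0, np.sqrt(2.0), size=n)    # x2 ~ N1(2, 2), std = sqrt(2)
u = 5 * x1 - 2 * x2 + 7

print(u.mean(), u.var())  # should be close to -2 and 33, respectively
```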

2.1a.1. Linear functions in the complex domain

We can also look into the distribution of linear functions of independently distributed complex Gaussian variables. Let a be a constant and \(\tilde {x}\) a complex random variable, where a may be real or complex. Then, from the definition of the variance in the complex domain, one has

$$\displaystyle \begin{aligned} \text{Var}(a\tilde{x})&=E[(a\tilde{x}-E(a\tilde{x}))(a\tilde{x}-E(a\tilde{x}))^{*}]=aE[(\tilde{x}-E(\tilde{x}))(\tilde{x}-E(\tilde{x}))^{*}]a^{*}\\ &=a\text{Var}(\tilde{x})a^{*}=aa^{*}\text{Var}(\tilde{x})=|a|{}^2\text{Var}(\tilde{x})=|a|{}^2\sigma^2 \end{aligned} $$

when the variance of \(\tilde {x}\) is σ 2, where |a| denotes the absolute value of a. As well, \(E[a\tilde {x}]=aE[\tilde {x}]=a\tilde {\mu }\). Then, a companion to Theorem 2.1.1 is obtained.

Theorem 2.1a.1

Let \(\tilde {x}_1,\ldots ,\tilde {x}_k\) be independently distributed scalar complex Gaussian variables, \(\tilde {x}_j\sim \tilde {N}_1(\tilde {\mu }_j,\sigma _j^2)\) , j = 1, …, k. Let a 1, …, a k be real or complex constants and \(\tilde {u}=a_1\tilde {x}_1+\cdots +a_k\tilde {x}_k\) be a linear function. Then, \(\tilde {u}\) has a univariate complex Gaussian distribution given by \(\tilde {u}\sim \tilde {N}_1(\sum _{j=1}^ka_j\tilde {\mu }_j,\sum _{j=1}^k|a_j|{ }^2\mathit{\text{Var}}(\tilde {x}_j))\).

Example 2.1a.1

Let \(\tilde {x}_1,\ \tilde {x}_2,\ \tilde {x}_3\) be independently distributed complex Gaussian univariate random variables with expected values \(\tilde {\mu }_1=-1+2i, \ \tilde {\mu }_2=i,\ \tilde {\mu }_3=-1-i\) respectively. Let \(\tilde {x}_j=x_{1j}+ix_{2j},j=1,2,3\). Let [Var(x 1j), Var(x 2j)] = [(1, 1), (1, 2), (2, 3)], respectively. Let a 1 = 1 + i, a 2 = 2 − 3i, a 3 = 2 + i, a 4 = 3 + 2i. Determine the density of the linear function \(\tilde {u}=a_1\tilde {x}_1+a_2\tilde {x}_2+a_3\tilde {x}_3+a_4\).

Solution 2.1a.1

$$\displaystyle \begin{aligned} E(\tilde{u})&=a_1E(\tilde{x}_1)+a_2E(\tilde{x}_2)+a_3E(\tilde{x}_3)+a_4\\ &=(1+i)(-1+2i)+(2-3i)(i)+(2+i)(-1-i)+(3+2i)\\ &=(-3+i)+(3+2i)+(-1-3i)+(3+2i)=2+2i;\\ \text{Var}(\tilde{u})&=|a_1|{}^2\text{Var}(\tilde{x}_1)+|a_2|{}^2\text{Var}(\tilde{x}_2)+|a_3|{}^2\text{Var}(\tilde{x}_3)\end{aligned} $$

and the covariances are equal to zero since the variables are independently distributed. Note that \(\tilde {x}_1=x_{11}+ix_{21}\) and hence, for example,

$$\displaystyle \begin{aligned} \text{Var}(\tilde{x}_1)&=E[(\tilde{x}_1-E(\tilde{x}_1))(\tilde{x}_1-E(\tilde{x}_1))^{*}]\\ &=E\{[(x_{11}-E(x_{11}))+i(x_{21}-E(x_{21}))][(x_{11}-E(x_{11}))-i(x_{21}-E(x_{21}))]\}\\ &=E[(x_{11}-E(x_{11}))^2]-(i)^2E[(x_{21}-E(x_{21}))^2]\\ &=\text{Var}(x_{11})+\text{Var}(x_{21})=1+1=2.\end{aligned} $$

Similarly, \(\text{Var}(\tilde {x}_2)=1+2=3, \text{Var}(\tilde {x}_3)=2+3=5\). Moreover, |a 1|2 = (1)2 + (1)2 = 2, |a 2|2 = (2)2 + (3)2 = 13, |a 3|2 = (2)2 + (1)2 = 5. Accordingly, \(\text{Var}(\tilde {u})=2(2)+(13)(3)+(5)(5)=68\). Thus, \(\tilde {u}\sim \tilde {N}_1(2+2i,68)\). Note that the constant a 4 only affects the mean value. Had a 4 been absent from \(\tilde {u}\), its mean value would have been real and equal to − 1.
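A similar simulation sketch can be used to check Solution 2.1a.1; here each complex variable is generated from independent real and imaginary parts with the component variances stated in the example. This is an illustration assuming NumPy, with an arbitrary seed and sample size.

```python
# Simulation check of E(u) = 2 + 2i and Var(u) = 68 in Solution 2.1a.1; illustrative only.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

def cgauss(mu, var_re, var_im):
    # complex Gaussian built from independent real and imaginary parts
    return (mu + rng.normal(0.0, np.sqrt(var_re), n)
               + 1j * rng.normal(0.0, np.sqrt(var_im), n))

x1 = cgauss(-1 + 2j, 1, 1)
x2 = cgauss(0 + 1j, 1, 2)
x3 = cgauss(-1 - 1j, 2, 3)
u = (1 + 1j) * x1 + (2 - 3j) * x2 + (2 + 1j) * x3 + (3 + 2j)

print(u.mean())                           # should be close to 2 + 2i
print(np.mean(np.abs(u - u.mean())**2))   # should be close to 68
```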

2.1.2. The chisquare distribution in the real domain

Suppose that x 1 follows a real standard normal distribution, that is, x 1 ∼ N 1(0, 1), whose mean value is zero and variance, 1. What is then the density of \(x_1^2\), the square of a real standard normal variable? Let the distribution function or cumulative distribution function of x 1 be \(F_{x_1}(t)=Pr\{x_1\le t\}\) and that of \(y_1=x_1^2\) be \(F_{y_1}(t)=Pr\{y_1\le t\}\). Note that since y 1 > 0, t must be positive. Then,

$$\displaystyle \begin{aligned} F_{y_1}(t)&=Pr\{y_1\le t\}=Pr\{x_1^2\le t\}=Pr\{|x_1|\le \sqrt{t}\}=Pr\{-\sqrt{t}\le x_1\le \sqrt{t}\}\\ &=Pr\{x_1\le \sqrt{t}\}-Pr\{x_1\le -\sqrt{t}\}=F_{x_1}(\sqrt{t})-F_{x_1}(-\sqrt{t}).\end{aligned} $$
(i)

Denoting the density of y 1 by g(y 1), this density at y 1 = t is available by differentiating the distribution function \(F_{y_1}(t)\) with respect to t. As for the density of x 1, which is the standard normal density, it can be obtained by differentiating \(F_{x_1}(\sqrt {t})\) with respect to \(\sqrt {t}\). Thus, differentiating (i) throughout with respect to t, we have

$$\displaystyle \begin{aligned} g(t)|{}_{t=y_1}&=\Big[\frac{\text{d}}{\text{d}t}F_{x_1}(\sqrt{t})-\frac{\text{d}}{\text{d}t}F_{x_1}(-\sqrt{t})\Big]\Big|{}_{t=y_1}\\ &=\Big[\frac{\text{d}}{\text{d}\sqrt{t}}F_{x_1}(\sqrt{t})\frac{\text{d}\sqrt{t}}{\text{d}t}-\frac{\text{d}}{\text{d}\sqrt{t}}F_{x_1}(-\sqrt{t})\frac{\text{d}(-\sqrt{t})}{\text{d}t}\Big]\Big|{}_{t=y_1}\\ &=\frac{1}{2}t^{\frac{1}{2}-1}\frac{1}{\sqrt{(2\pi)}}\text{e}^{-\frac{1}{2}t}\Big|{}_{t=y_1}+\frac{1}{2}t^{\frac{1}{2}-1}\frac{1}{\sqrt{(2\pi)}}\text{e}^{-\frac{1}{2}t}\Big|{}_{t=y_1}\\ &=\frac{1}{2^{\frac{1}{2}}\varGamma({1}/{2})}y_1^{\frac{1}{2}-1}\text{e}^{-\frac{1}{2}y_1},0\le y_1<\infty,\ \text{with}\ \varGamma({1}/{2})=\sqrt{\pi}.\end{aligned} $$
(ii)

Accordingly, the density of \(y_1=x_1^2\) or the square of a real standard normal variable, is a two-parameter real gamma with \(\alpha =\frac {1}{2}\) and β = 2 or a real chisquare with one degree of freedom. A two-parameter real gamma density with the parameters (α, β) is given by

$$\displaystyle \begin{aligned}f_1(y_1)=\frac{1}{\beta^{\alpha}\varGamma(\alpha)}y_1^{\alpha-1}\text{e}^{-\frac{y_1}{\beta}},\ 0\le y_1<\infty,\ \alpha>0,\ \beta>0,{} \end{aligned} $$
(2.1.2)

and f 1(y 1) = 0 elsewhere. When \(\alpha =\frac {n}{2}\) and β = 2, we have a real chisquare density with n degrees of freedom. Hence, the following result:

Theorem 2.1.2

The square of a real scalar standard normal random variable is a real chisquare variable with one degree of freedom. A real chisquare with n degrees of freedom has the density given in (2.1.2) with \(\alpha =\frac {n}{2}\) and β = 2.

A real scalar chisquare random variable with m degrees of freedom is denoted as \(\chi _m^2\). From (2.1.2), by computing the mgf we can see that the mgf of a real scalar gamma random variable y is \(M_y(t)=(1-\beta t)^{-\alpha }\) for 1 − βt > 0. Hence, a real chisquare with m degrees of freedom has the mgf \(M_{\chi _m^2}(t)=(1-2t)^{-\frac {m}{2}}\) for 1 − 2t > 0. The condition 1 − βt > 0 is required for the convergence of the integral when evaluating the mgf of a real gamma random variable. If \(y_j\sim \chi _{m_j}^2,j=1,\ldots ,k\) and if y 1, …, y k are independently distributed, then the sum \(y=y_1+\cdots +y_k\sim \chi ^2_{m_1+\cdots +m_k}\), a real chisquare with m 1 + ⋯ + m k degrees of freedom, with mgf \(M_y(t)=(1-2t)^{-\frac {1}{2}(m_1+\cdots +m_k)}\) for 1 − 2t > 0.

Example 2.1.2

Let x 1 ∼ N 1(−1, 4), x 2 ∼ N 1(2, 2) be independently distributed. Let \(u=x_1^2+2x_2^2+2x_1-8x_2+5\). Compute the density of u.

Solution 2.1.2

$$\displaystyle \begin{aligned} u&=x_1^2+2x_2^2+2x_1-8x_2+5=(x_1+1)^2+2(x_2-2)^2-4\\ &=4\Big[\frac{(x_1+1)^2}{4}+\frac{(x_2-2)^2}{2}\Big]-4.\end{aligned} $$

Since x 1 ∼ N 1(−1, 4) and x 2 ∼ N 1(2, 2) are independently distributed, so are \(\frac {(x_1+1)^2}{4}\sim \chi ^2_1\) and \(\frac {(x_2-2)^2}{2}\sim \chi ^2_1\), and hence the sum is a real \(\chi ^2_2\) random variable. Then, u = 4y − 4 with \(y=\chi ^2_2\). But the density of y, denoted by f y(y), is

$$\displaystyle \begin{aligned}f_y(y)=\frac{1}{2}\text{e}^{-\frac{y}{2}},\ 0\le y<\infty, \end{aligned}$$

and f y(y) = 0 elsewhere. Then, z = 4y has the density

$$\displaystyle \begin{aligned}f_z(z)=\frac{1}{8}\text{e}^{-\frac{z}{8}},\ 0\le z<\infty, \end{aligned}$$

and f z(z) = 0 elsewhere. However, since u = z − 4, its density is

$$\displaystyle \begin{aligned}f_u(u)=\frac{1}{8}\text{e}^{-\frac{(u+4)}{8}},\ -4\le u<\infty, \end{aligned}$$

and zero elsewhere.
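Since \(f_u(u)=\frac {1}{8}\text{e}^{-(u+4)/8}\) for u ≥−4 means that u + 4 is exponentially distributed with mean 8, Solution 2.1.2 can also be checked numerically. The sketch below assumes NumPy and SciPy, with an arbitrary seed and sample size.

```python
# Simulation check of Solution 2.1.2: u + 4 should be Exponential with scale 8; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 500_000
x1 = rng.normal(-1.0, 2.0, size=n)           # x1 ~ N1(-1, 4), std = 2
x2 = rng.normal(2.0, np.sqrt(2.0), size=n)   # x2 ~ N1(2, 2)
u = x1**2 + 2 * x2**2 + 2 * x1 - 8 * x2 + 5

print(stats.kstest(u + 4, stats.expon(scale=8).cdf).statistic)  # should be small
```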

2.1a.2. The chisquare distribution in the complex domain

Let us consider the distribution of \(\tilde {z}_1\tilde {z}_1^{*}\) of a scalar standard complex normal variable \(\tilde {z}_1\). The density of \(\tilde {z}_1\) is given by

$$\displaystyle \begin{aligned}f_{\tilde{z}_1}(\tilde{z}_1)=\frac{1}{\pi}\text{e}^{-\tilde{z}_1^{*}\tilde{z}_1},\ \tilde{z}_1=z_{11}+iz_{12},\ -\infty<z_{1j}<\infty,\ j=1,2. \end{aligned}$$

Let \(\tilde {u}_1=\tilde {z}_1^{*}\tilde {z}_1\). Note that \(\tilde {z}_1^{*}\tilde {z}_1\) is real and hence we may associate a real parameter t to the mgf. Note that \(\tilde {z}_1\tilde {z}_1^{*}\) in the scalar complex case corresponds to z 2 in the real scalar case where z ∼ N 1(0, 1). Then, the mgf of \(\tilde {u}_1\) is given by

$$\displaystyle \begin{aligned}M_{\tilde{u}_1}(t)=E[\text{e}^{\Re(\tilde{t}\tilde{u}_1)}]=\frac{1}{\pi}\int_{\tilde{z}_1}\text{e}^{-\Re[(1-t)\tilde{z}_1^{*}\tilde{z}_1]}\text{d}\tilde{z}_1. \end{aligned}$$

However, \(\tilde {z}_1^{*}\tilde {z}_1=z_{11}^2+z_{12}^2\) as \( \tilde {z}_1=z_{11}+iz_{12},\ i=\sqrt {(-1)}\), where z 11 and z 12 are real. Thus, the above integral gives \((1-t)^{-\frac {1}{2}}(1-t)^{-\frac {1}{2}}=(1-t)^{-1}\) for 1 − t > 0, which is the mgf of a real scalar gamma variable with parameters α = 1 and β = 1. Let \(\tilde {z}_j\sim \tilde {N}_1(\tilde {\mu }_j,\sigma _j^2),\ j=1,\ldots ,k\), be scalar complex normal random variables that are independently distributed. Then

$$\displaystyle \begin{aligned}\tilde{u}=\sum_{j=1}^k\Big(\frac{\tilde{z}_j-\tilde{\mu}_j}{\sigma_j}\Big)^{*}\Big(\frac{\tilde{z}_j-\tilde{\mu}_j}{\sigma_j}\Big) \sim\mbox{ real scalar gamma with parameters }\alpha=k, \beta=1, \end{aligned}$$

whose density is

$$\displaystyle \begin{aligned}f_{\tilde{u}}(u)=\frac{1}{\varGamma(k)}u^{k-1}\text{e}^{-u},\ 0\le u<\infty, \ k=1,2,\ldots, {} \end{aligned} $$
(2.1a.2)

\(\tilde {u}\) is referred to as a scalar chisquare in the complex domain having k degrees of freedom, which is denoted \(\tilde {u}\sim \tilde {\chi }_k^2\). Hence, the following result:

Theorem 2.1a.2

Let \(\tilde {z}_j\sim \tilde {N}_1(\tilde {\mu }_j,\sigma _j^2),\ j=1,\ldots ,k,\) be independently distributed and \(\tilde {u}=\sum _{j=1}^k(\frac {\tilde {z}_j-\tilde {\mu }_j}{\sigma _j})^{*}(\frac {\tilde {z}_j-\tilde {\mu }_j}{\sigma _j})\) . Then \(\tilde {u}\) is called a scalar chisquare having k degrees of freedom in the complex domain whose density as given in (2.1a.2) is that of a real scalar gamma random variable with parameters α = k and β = 1.

Example 2.1a.2

Let \(\tilde {x}_1\sim \tilde {N}_1(i,2),\ \tilde {x}_2\sim \tilde {N}_1(1-i,1)\) be independently distributed complex Gaussian univariate random variables. Let \(\tilde {u}=\tilde {x}_1^{*}\tilde {x}_1+2\tilde {x}_2^{*}\tilde {x}_2-2\tilde {x}_2^{*}-2\tilde {x}_2+i(\tilde {x}_1+2\tilde {x}_2^{*})-i(\tilde {x}_1^{*} +2\tilde {x}_2)+5\). Evaluate the density of \(\tilde {u}\).

Solution 2.1a.2

Let us simplify \(\tilde {u}\), keeping in mind the parameters in the densities of \(\tilde {x}_1\) and \(\tilde {x}_2\). Since terms of the type \(\tilde {x}_1^{*}\tilde {x}_1\) and \(\tilde {x}_2^{*}\tilde {x}_2\) are present in \(\tilde {u}\), we may simplify into factors involving \(\tilde {x}_j^{*}\) and \(\tilde {x}_j\) for j = 1, 2. From the density of \(\tilde {x}_1\) we have

$$\displaystyle \begin{aligned}\frac{(\tilde{x}_1-i)^{*}(\tilde{x}_1-i)}{2}\sim\tilde{\chi}_1^2\end{aligned}$$

where

$$\displaystyle \begin{aligned}(\tilde{x}_1-i)^{*}(\tilde{x}_1-i)=(\tilde{x}_1^{*}+i)(\tilde{x}_1-i)=\tilde{x}_1^{*}\tilde{x}_1+i\tilde{x}_1-i\tilde{x}_1^{*}+1. \end{aligned} $$
(i)

After removing the elements in (i) from \(\tilde {u}\), the remainder is

$$\displaystyle \begin{aligned} 2\tilde{x}_2^{*}\tilde{x}_2&-2\tilde{x}_2^{*}-2\tilde{x}_2+2i\tilde{x}_2^{*}-2i\tilde{x}_2+4\\ &=2[(\tilde{x}_2-1)^{*}(\tilde{x}_2-1)-i\tilde{x}_2+i\tilde{x}_2^{*}+1]\\ &=2[(\tilde{x}_2-1+i)^{*}(\tilde{x}_2-1+i)].\end{aligned} $$

Accordingly,

$$\displaystyle \begin{aligned} \tilde{u}&=2\Big[\frac{(\tilde{x}_1-i)^{*}(\tilde{x}_1-i)}{2}+(\tilde{x}_2-1+i)^{*}(\tilde{x}_2-1+i)\Big]\\ &=2[\tilde{\chi}^2_1+\tilde{\chi}^2_1]=2\tilde{\chi}^2_2\end{aligned} $$

where \(\tilde {\chi }^2_2\) is a scalar chisquare of degree 2 in the complex domain or, equivalently, a real scalar gamma with parameters (α = 2, β = 1). Letting \(y=\tilde {\chi }^2_2\), the density of y, denoted by f y(y), is

$$\displaystyle \begin{aligned}f_y(y)=y\,\text{e}^{-y},\ 0\le y<\infty, \end{aligned}$$

and f y(y) = 0 elsewhere. Then, the density of u = 2y, denoted by f u(u), which is given by

$$\displaystyle \begin{aligned}f_{u}(u)=\frac{u}{4}\,\text{e}^{-\frac{u}{2}},\ 0\le u<\infty, \end{aligned}$$

and f u(u) = 0 elsewhere, is that of a real scalar gamma with the parameters (α = 2, β = 2).

2.1.3. The type-2 beta and F distributions in the real domain

What about the distribution of the ratio of two independently distributed real scalar chisquare random variables? Let \(y_1\sim \chi _m^2\) and \( y_2\sim \chi _n^2\), that is, y 1 and y 2 are real chisquare random variables with m and n degrees of freedom respectively, and assume that y 1 and y 2 are independently distributed. Let us determine the density of u = y 1y 2. Let v = y 2 and consider the transformation (y 1, y 2) onto (u, v). Noting that

$$\displaystyle \begin{aligned}\frac{\partial u}{\partial y_1}=\frac{1}{y_2},\ \frac{\partial v}{\partial y_2}=1,\ \frac{\partial v}{\partial y_1}=0,\end{aligned}$$

one has

$$\displaystyle \begin{aligned}\frac{\partial(u,v)}{\partial(y_1,y_2)}=\begin{vmatrix}\frac{\partial u}{\partial y_1}&\frac{\partial u}{\partial y_2}\\ \frac{\partial v}{\partial y_1}&\frac{\partial v}{\partial y_2}\end{vmatrix}=\begin{vmatrix}\frac{1}{y_2}&*\\ 0&1\end{vmatrix}=\frac{1}{y_2}=\frac{1}{v}\ \Rightarrow\ \text{d}y_1\wedge\text{d}y_2=v\,\text{d}u\wedge\text{d}v,\end{aligned}$$

where the asterisk indicates the presence of some element in which we are not interested owing to the triangular pattern for the Jacobian matrix. Letting the joint density of y 1 and y 2 be denoted by f 12(y 1, y 2), one has

$$\displaystyle \begin{aligned}f_{12}(y_1,y_2)=\frac{1}{2^{\frac{m+n}{2}}\varGamma(\frac{m}{2})\varGamma(\frac{n}{2})}y_1^{\frac{m}{2}-1}y_2^{\frac{n}{2}-1}\text{e}^{-\frac{y_1+y_2}{2}}\end{aligned} $$

for 0 ≤ y 1 < ∞, 0 ≤ y 2 < ∞, m, n = 1, 2, …, and f 12(y 1, y 2) = 0 elsewhere. Let the joint density of u and v be denoted by g 12(u, v) and the marginal density of u be denoted by g 1(u). Then,

$$\displaystyle \begin{aligned} g_{12}(u,v)&=c~v(uv)^{\frac{m}{2}-1}v^{\frac{n}{2}-1}\text{e}^{-\frac{1}{2}(uv+v)},\ c=\frac{1}{2^{\frac{m+n}{2}}\varGamma(\frac{m}{2})\varGamma(\frac{n}{2})},\\ g_1(u)&=c~u^{\frac{m}{2}-1}\int_{v=0}^{\infty}v^{\frac{m+n}{2}-1}\text{e}^{-v\frac{(1+u)}{2}}\text{d}v\\ &=c~u^{\frac{m}{2}-1}\varGamma\Big(\frac{m+n}{2}\Big)\Big(\frac{1+u}{2}\Big)^{-\frac{m+n}{2}}\\ &=\frac{\varGamma(\frac{m+n}{2})}{\varGamma(\frac{m}{2})\varGamma(\frac{n}{2})}u^{\frac{m}{2}-1}(1+u)^{-\frac{m+n}{2}} {} \end{aligned} $$
(2.1.3)

for m, n = 1, 2, …, 0 ≤ u < ∞ and g 1(u) = 0 elsewhere. Note that g 1(u) is a type-2 real scalar beta density. Hence, we have the following result:

Theorem 2.1.3

Let the real scalar \(y_1\sim \chi _m^2\) and \(y_2\sim \chi _n^2\) be independently distributed, then the ratio \(u=\frac {y_1}{y_2}\) is a type-2 real scalar beta random variable with the parameters \(\frac {m}{2}\) and \(\frac {n}{2}\) where m, n = 1, 2, …, whose density is provided in (2.1.3).

This result also holds for general real scalar gamma random variables x 1 > 0 and x 2 > 0 with parameters (α 1, β) and (α 2, β), respectively, where β is a common scale parameter and it is assumed that x 1 and x 2 are independently distributed. Then, \(u=\frac {x_1}{x_2}\) is a type-2 beta with parameters α 1 and α 2.
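This gamma-ratio form of the type-2 beta can be illustrated numerically. SciPy's betaprime distribution has the density in (2.1.3) with general parameters, so a Kolmogorov-Smirnov comparison of simulated ratios against it should show close agreement; the sketch below assumes NumPy and SciPy, with arbitrary parameter values.

```python
# Check that x1/x2 for independent gammas with a common scale is type-2 beta; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a1, a2, beta = 2.5, 3.0, 1.7
x1 = rng.gamma(shape=a1, scale=beta, size=500_000)
x2 = rng.gamma(shape=a2, scale=beta, size=500_000)
u = x1 / x2

# a type-2 beta with parameters (a1, a2) corresponds to SciPy's betaprime(a1, a2)
print(stats.kstest(u, stats.betaprime(a1, a2).cdf).statistic)  # should be small
```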

If u as defined in Theorem 2.1.3 is written as \(u=\frac {m}{n}F_{m,n}\), then \(F=F_{m,n}=\frac {\chi _m^2/m}{\chi _n^2/n}=\frac {n}{m}u\) is known as the F-random variable with m and n degrees of freedom, where the degrees of freedom indicate those of the numerator and denominator chisquare random variables, which are independently distributed. Denoting the density of F by f F(F), we have the following result:

Theorem 2.1.4

Letting \(F=F_{m,n}=\frac {\chi _m^2/m}{\chi _n^2/n}\) where the two real scalar chisquares are independently distributed, the real scalar F-density is given by

$$\displaystyle \begin{aligned}f_F(F)=\frac{\varGamma(\frac{m+n}{2})}{\varGamma(\frac{m}{2})\varGamma(\frac{n}{2})}\Big(\frac{m}{n}\Big)^{\frac{m}{2}} \frac{F^{\frac{m}{2}-1}}{(1+\frac{m}{n}F)^{\frac{m+n}{2}}}{} \end{aligned} $$
(2.1.4)

for 0 ≤ F < ∞, m, n = 1, 2, …, and f F(F) = 0 elsewhere.

Example 2.1.3

Let x 1 and x 2 be independently distributed real scalar gamma random variables with parameters (α 1, β) and (α 2, β), respectively, β being a common parameter, whose densities are as specified in (2.1.2). Let \(u_1=\frac {x_1}{x_1+x_2}, u_2=\frac {x_1}{x_2},u_3=x_1+x_2\). Show that (1): u 3 has a real scalar gamma density as given in (2.1.2) with the parameters (α 1 + α 2, β); (2): u 1 and u 3 as well as u 2 and u 3 are independently distributed; (3): u 2 is a real scalar type-2 beta with parameters (α 1, α 2) whose density is specified in (2.1.3); (4): u 1 has a real scalar type-1 beta density given as

$$\displaystyle \begin{aligned}f_1(u_1)=\frac{\varGamma(\alpha_1+\alpha_2)}{\varGamma(\alpha_1)\varGamma(\alpha_2)}u_1^{\alpha_1-1}(1-u_1)^{\alpha_2-1},\ 0\le u_1\le 1,{} \end{aligned} $$
(2.1.5)

for \(\Re (\alpha _1)>0,\Re (\alpha _2)>0\) and zero elsewhere. [In a statistical density, the parameters are usually real; however, since the integrals exist for complex parameters, the conditions are given for complex parameters as the real parts of α 1 and α 2, which must be positive. When they are real, the conditions will be simply α 1 > 0 and α 2 > 0.]

Solution 2.1.3

Since x 1 and x 2 are independently distributed, their joint density is the product of the marginal densities, which is given by

$$\displaystyle \begin{aligned}f_{12}(x_1,x_2)=c~x_1^{\alpha_1-1}x_2^{\alpha_2-1}\text{e}^{-\frac{1}{\beta}(x_1+x_2)},\ 0\le x_j<\infty,\ j=1,2, \end{aligned} $$
(i)

for \(\Re (\alpha _j)>0,\ \Re (\beta )>0,\ j=1,2\) and zero elsewhere, where

$$\displaystyle \begin{aligned}c=\frac{1}{\beta^{\alpha_1+\alpha_2}\varGamma(\alpha_1)\varGamma(\alpha_2)}. \end{aligned}$$

Since the sum x 1 + x 2 is present in the exponent and both x 1 and x 2 are positive, a convenient transformation is \(x_1=r~\cos ^2\theta , \ x_2=r~\sin ^2\theta , \ 0\le r<\infty ,\ 0\le \theta \le \frac {\pi }{2}\). Then, the Jacobian is available from the detailed derivation of Jacobian given in the beginning of Sect. 2.1.3 or from Example 1.6.1. That is,

$$\displaystyle \begin{aligned}\text{d}x_1\wedge\text{d}x_2=2r~\sin\theta\cos\theta~\text{d}r\wedge\text{d}\theta. \end{aligned} $$
(ii)

Then from (i) and (ii), the joint density of r and θ, denoted by f r,θ(r, θ), is the following:

$$\displaystyle \begin{aligned}f_{r,\theta}(r,\theta)=c~(\cos^2\theta)^{\alpha_1-1}(\sin^2\theta)^{\alpha_2-1}2\cos\theta\sin\theta~r^{\alpha_1+\alpha_2-1}\text{e}^{-\frac{1}{\beta}r} \end{aligned} $$
(iii)

and zero elsewhere. As f r,θ(r, θ) is a product of positive integrable functions involving solely r and θ, r and θ are independently distributed. Since \(u_3=x_1+x_2=r\cos ^2\theta +r\sin ^2\theta =r\) is solely a function of r and \(u_1=\frac {x_1}{x_1+x_2}=\cos ^2\theta \) and \(u_2=\frac {\cos ^2\theta }{\sin ^2\theta }\) are solely functions of θ, it follows that u 1 and u 3 as well as u 2 and u 3 are independently distributed. From (iii), upon multiplying and dividing by Γ(α 1 + α 2), we obtain the density of u 3 as

$$\displaystyle \begin{aligned}f_1(u_3)=\frac{1}{\beta^{\alpha_1+\alpha_2}\varGamma(\alpha_1+\alpha_2)}u_3^{\alpha_1+\alpha_2-1}\text{e}^{-\frac{u_3}{\beta}},\ 0\le u_3<\infty, \end{aligned} $$
(iv)

and zero elsewhere, which is a real scalar gamma density with parameters (α 1 + α 2, β). From (iii), the density of θ, denoted by f 2(θ), is

$$\displaystyle \begin{aligned}f_2(\theta)=\frac{\varGamma(\alpha_1+\alpha_2)}{\varGamma(\alpha_1)\varGamma(\alpha_2)}(\cos^2\theta)^{\alpha_1-1}(\sin^2\theta)^{\alpha_2-1}\,2\cos\theta\sin\theta,\ 0\le \theta\le \frac{\pi}{2}\end{aligned} $$
(v)

and zero elsewhere, for \(\Re (\alpha _j)>0,\ j=1,2\). From this result, we can obtain the density of \(u_1=\cos ^2\theta \). Then, \(\text{d}u_1=-2\cos \theta \sin \theta ~\text{d}\theta \). Moreover, when θ → 0, u 1 → 1 and when \(\theta \to \frac {\pi }{2}, u_1\to 0\). Hence, the minus sign in the Jacobian is needed to obtain the limits in the natural order, 0 ≤ u 1 ≤ 1. Substituting in (v), the density of u 1, denoted by f 3(u 1), is as given in (2.1.5), u 1 being a real scalar type-1 beta random variable with parameters (α 1, α 2). Now, observe that

$$\displaystyle \begin{aligned}u_2=\frac{\cos^2\theta}{\sin^2\theta}=\frac{\cos^2\theta}{1-\cos^2\theta}=\frac{u_1}{1-u_1}. \end{aligned} $$
(vi)

Given the density of u 1 as specified in (2.1.5), we can obtain the density of u 2 as follows. As \(u_2=\frac {u_1}{1-u_1}\), we have \(u_1=\frac {u_2}{1+u_2}\Rightarrow \text{d}u_1=\frac {1}{(1+u_2)^2}\text{d}u_2\); then substituting these values in the density of u 1, we have the following density for u 2:

$$\displaystyle \begin{aligned}f_4(u_2)=\frac{\varGamma(\alpha_1+\alpha_2)}{\varGamma(\alpha_1)\varGamma(\alpha_2)}u_2^{\alpha_1-1}(1+u_2)^{-(\alpha_1+\alpha_2)},\ 0\le u_2<\infty,{} \end{aligned} $$
(2.1.6)

and zero elsewhere, for \(\Re (\alpha _j)>0,\ j=1,2\), which is a real scalar type-2 beta density with parameters (α 1, α 2). The results associated with the densities (2.1.5) and (2.1.6) are now stated as a theorem.

Theorem 2.1.5

Let x 1 and x 2 be independently distributed real scalar gamma random variables with the parameters (α 1, β), (α 2, β), respectively, β being a common scale parameter. [If \(x_1\sim \chi ^2_m\ and \ x_2\sim \chi ^2_n\) , then \(\alpha _1=\frac {m}{2},\ \alpha _2=\frac {n}{2}\) and β = 2.] Then \(u_1=\frac {x_1}{x_1+x_2}\) is a real scalar type-1 beta whose density is as specified in (2.1.5) with the parameters (α 1, α 2), and \(u_2=\frac {x_1}{x_2}\) is a real scalar type-2 beta whose density is as given in (2.1.6) with the parameters (α 1, α 2).
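The statements of Theorem 2.1.5 and the independence claims of Example 2.1.3 can be illustrated by simulation; the sketch below assumes NumPy and SciPy and uses arbitrary parameter values.

```python
# Checks: u1 is type-1 beta, u3 = x1 + x2 is gamma, and u1, u3 are uncorrelated; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
a1, a2, beta = 2.0, 4.0, 3.0
x1 = rng.gamma(a1, beta, size=500_000)
x2 = rng.gamma(a2, beta, size=500_000)
u1, u3 = x1 / (x1 + x2), x1 + x2

print(stats.kstest(u1, stats.beta(a1, a2).cdf).statistic)                # type-1 beta(a1, a2)
print(stats.kstest(u3, stats.gamma(a1 + a2, scale=beta).cdf).statistic)  # gamma(a1 + a2, beta)
print(np.corrcoef(u1, u3)[0, 1])                                         # should be near zero
```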

2.1a.3. The type-2 beta and F distributions in the complex domain

It follows that in the complex domain, if \(\tilde {\chi }^2_m\) and \(\tilde {\chi }^2_n\) are independently distributed, then the sum is a chisquare with m + n degrees of freedom, that is, \(\tilde {\chi }^2_m+\tilde {\chi }^2_n=\tilde {\chi }^2_{m+n}.\) We now look into type-2 beta variables and F-variables and their connection to chisquare variables in the complex domain. Since, in the complex domain, the chisquares are actually real variables, the density of the ratio of two independently distributed chisquares with m and n degrees of freedom in the complex domain, remains the same as the density given in (2.1.3) with \(\frac {m}{2}\) and \(\frac {n}{2}\) replaced by m and n, respectively. Thus, letting \(\tilde {u}=\tilde {\chi }_m^2/\tilde {\chi }_n^2\) where the two chisquares in the complex domain are independently distributed, the density of \(\tilde {u}\), denoted by \(\tilde {g}_1(u)\), is

$$\displaystyle \begin{aligned}\tilde{g}_1(u)=\frac{\varGamma(m+n)}{\varGamma(m)\varGamma(n)}u^{m-1}(1+u)^{-(m+n)} {} \end{aligned} $$
(2.1a.3)

for 0 ≤ u < ∞, m, n = 1, 2, …, and \(\tilde {g}_1(u)=0\) elsewhere.

Theorem 2.1a.3

Let \(\tilde {y}_1\sim \tilde {\chi }^2_m\) and \( \tilde {y}_2\sim \tilde {\chi }^2_n\) be independently distributed where \(\tilde {y}_1\) and \(\tilde {y}_2\) are in the complex domain; then, \(\tilde {u}=\frac {\tilde {y}_1}{\tilde {y}_2}\) is a real type-2 beta whose density is given in (2.1a.3).

If the F random variable in the complex domain is defined as \(\tilde {F}_{m,n}=\frac {\tilde {\chi }_m^2/m}{\tilde {\chi }_n^2/n}\) where the two chisquares in the complex domain are independently distributed, then the density of \(\tilde {F}\) is that of the real F-density with m and n replaced by 2m and 2n in (2.1.4), respectively.

Theorem 2.1a.4

Let \(\tilde {F}=\tilde {F}_{m,n}=\frac {\tilde {\chi }_m^2/m}{\tilde {\chi }_n^2/n}\) where the two chisquares in the complex domain are independently distributed; then, \(\tilde {F}\) is referred to as an F random variable in the complex domain and it has a real F-density with the parameters m and n, which is given by

$$\displaystyle \begin{aligned}\tilde{g}_2(F)=\frac{\varGamma(m+n)}{\varGamma(m)\varGamma(n)}\Big(\frac{m}{n}\Big)^m F^{m-1}(1+\frac{m}{n}F)^{-(m+n)} {} \end{aligned} $$
(2.1a.4)

for 0 ≤ F < ∞, m, n = 1, 2, …, and \(\tilde {g}_2(F)=0\) elsewhere.

A type-1 beta representation in the complex domain can similarly be obtained from Theorem 2.1a.3. This will be stated as a theorem.

Theorem 2.1a.5

Let \(\tilde {x}_1\sim \tilde {\chi }^2_m\) and \(\tilde {x}_2\sim \tilde {\chi }^2_n\) be independently distributed scalar chisquare variables in the complex domain with m and n degrees of freedom, respectively. Let \(\tilde {u}_1=\frac {\tilde {x}_1}{\tilde {x}_1+\tilde {x}_2}\) , which is a real variable that we will call u 1 . Then, \(\tilde {u}_1\) is a scalar type-1 beta random variable in the complex domain with the parameters m, n, whose real scalar density is

$$\displaystyle \begin{aligned}\tilde{f}_1(\tilde{u}_1)=\frac{\varGamma(m+n)}{\varGamma(m)\varGamma(n)}u_1^{m-1}(1-u_1)^{n-1},\ 0\le u_1\le 1, {} \end{aligned} $$
(2.1a.5)

and zero elsewhere, for m, n = 1, 2, … .

2.1.4. Power transformation of type-1 and type-2 beta random variables

Let us make a power transformation of the type u 1 = ay δ, a > 0, δ > 0. Then, du 1 = aδy δ−1dy. For convenience, let the parameters in (2.1.5) be α 1 = α and α 2 = β. Then, the density given in (2.1.5) becomes

$$\displaystyle \begin{aligned}f_{11}(y)=\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}\delta a^{\alpha} y^{\alpha\delta-1}(1-ay^{\delta})^{\beta-1},\ 0\le y\le \frac{1}{a^{\frac{1}{\delta}}},{}\end{aligned} $$
(2.1.7)

and zero elsewhere, for \(a>0,\ \delta >0,\ \Re (\alpha )>0,\ \Re (\beta )>0\). We can extend the support to \(-a^{-\frac {1}{\delta }}\le y\le a^{-\frac {1}{\delta }}\) by replacing y by |y| and multiplying the normalizing constant by \(\frac {1}{2}\). Such power transformed models are useful in practical applications. Observe that a power transformation has the following effect: for y < 1, the density is reduced if δ > 1 or raised if δ < 1, whereas for y > 1, the density increases if δ > 1 or diminishes if δ < 1. For instance, the particular case α = 1 is highly useful in reliability theory and stress-strength analysis. Thus, letting α = 1 in the original real scalar type-1 beta density (2.1.7) and denoting the resulting density by f 12(y), one has

$$\displaystyle \begin{aligned}f_{12}(y)=a\delta \beta y^{\delta-1}(1-ay^{\delta})^{\beta-1},\ 0\le y\le a^{-\frac{1}{\delta}},{} \end{aligned} $$
(2.1.8)

for \(a>0,\ \delta >0,\ \Re (\beta )>0\), and zero elsewhere. In the model in (2.1.8), the reliability, that is, Pr{y ≥ t}, for some t, can be easily determined. As well, the hazard function \(\frac {f_{12}(y=t)}{Pr\{y\ge t\}}\) is readily available. Actually, the reliability or survival function is

$$\displaystyle \begin{aligned}Pr\{y\ge t\}=(1-at^{\delta})^{\beta},\ a>0,\ \delta>0,\ t>0,\ \beta>0,\end{aligned} $$
(i)

and the hazard function is

$$\displaystyle \begin{aligned}\frac{f_{12}(y=t)}{Pr\{y\ge t\}}=\frac{a\delta\beta t^{\delta-1}}{1-at^{\delta}}.\end{aligned} $$
(ii)

Observe that the free parameters a, δ and β allow for much versatility in model building situations. If β = 1, a = 1 and δ = 1 in the real scalar type-1 beta model in (2.1.7), then the density reduces to \(\alpha y^{\alpha -1},\ 0\le y\le 1,\ \alpha >0\), which is a simple power function. The most popular power function model in the statistical literature is the Weibull model, which is a power transformed exponential density. Consider the real scalar exponential density

$$\displaystyle \begin{aligned}g(x)=\theta\text{e}^{-\theta x},\ \theta>0,\ x\ge 0, \end{aligned} $$
(iii)

and zero elsewhere, and let x = y δ, δ > 0. Then the model in (iii) becomes the real scalar Weibull density, denoted by g 1(y):

$$\displaystyle \begin{aligned} g_1(y)=\theta\delta y^{\delta-1}\text{e}^{-\theta y^{\delta}},\ \theta>0,\ \delta>0,\ y\ge 0, \end{aligned} $$
(iv)

and zero elsewhere.
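The passage from (iii) to the Weibull form (iv) can be illustrated numerically: if x is exponential with rate θ and \(x=y^{\delta }\), then \(y=x^{1/\delta }\) should follow a Weibull distribution with shape δ and scale \(\theta ^{-1/\delta }\) in SciPy's parametrization. The sketch below assumes NumPy and SciPy, with arbitrary θ and δ.

```python
# Power-transformed exponential versus the Weibull density in (iv); illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
theta, delta = 2.0, 1.5
x = rng.exponential(scale=1 / theta, size=500_000)  # density theta * exp(-theta * x)
y = x ** (1 / delta)                                # i.e. x = y**delta

ref = stats.weibull_min(c=delta, scale=theta ** (-1 / delta))
print(stats.kstest(y, ref.cdf).statistic)  # should be small
```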

Now, let us consider power transformations in a real scalar type-2 beta density given in (2.1.6). For convenience let α 1 = α and α 2 = β. Letting \(u_2=ay^{\delta }\), a > 0, δ > 0, the model specified by (2.1.6) then becomes

$$\displaystyle \begin{aligned}f_{21}(y)=a^{\alpha}\delta\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}y^{\alpha\delta-1}(1+ay^{\delta})^{-(\alpha+\beta)} \end{aligned} $$
(v)

for \(a>0,\ \delta >0,\ \Re (\alpha )>0,\ \Re (\beta )>0\), and zero elsewhere. As in the type-1 beta case, the most interesting special case occurs when α = 1. Denoting the resulting density by f 22(y), we have

$$\displaystyle \begin{aligned}f_{22}(y)=a\delta\beta y^{\delta-1}(1+ay^{\delta})^{-(\beta+1)},\ 0\le y<\infty,{} \end{aligned} $$
(2.1.9)

for \(a>0,\ \delta >0,\ \Re (\beta )>0,\ \alpha =1\), and zero elsewhere. In this case as well, the reliability and hazard functions can easily be determined:

$$\displaystyle \begin{aligned} \mbox{Reliability function }&=Pr\{y\ge t\}=(1+at^{\delta})^{-\beta}, \end{aligned} $$
(vi)
$$\displaystyle \begin{aligned} \mbox{Hazard function }&=\frac{f_{22}(y=t)}{Pr\{y\ge t\}}=\frac{a\delta\beta t^{\delta-1}}{1+at^{\delta}}.\end{aligned} $$
(vii)

Again, for application purposes, the forms in (vi) and (vii) are seen to be very versatile due to the presence of the free parameters a, δ and β.

2.1.5. Exponentiation of real scalar type-1 and type-2 beta variables

Let us consider the real scalar type-1 beta model in (2.1.5) where, for convenience, we let α 1 = α and α 2 = β. Letting \(u_1=a\text{e}^{-by}\), we denote the resulting density by f 13(y) where

$$\displaystyle \begin{aligned}f_{13}(y)=a^{\alpha} b\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}\text{e}^{-b\,\alpha\, y} (1-a\text{e}^{-b\,y})^{\beta-1},\ y\ge \ln a^{\frac{1}{b}},{} \end{aligned} $$
(2.1.10)

for \(a>0,\ b>0,\ \Re (\alpha )>0,\ \Re (\beta )>0\), and zero elsewhere. Again, for practical application the special case α = 1 is the most useful one. Let the density corresponding to this special case be denoted by f 14(y). Then,

$$\displaystyle \begin{aligned}f_{14}(y)=ab\beta \text{e}^{-b\,y}(1-a\text{e}^{-b\,y})^{\beta-1},\ y\ge \ln a^{\frac{1}{b}}, \end{aligned} $$
(i)

for a > 0, b > 0, β > 0, and zero elsewhere. In this case,

$$\displaystyle \begin{aligned} \mbox{Reliability function }&=Pr\{y\ge t\}=1-(1-a\text{e}^{-bt})^{\beta}, \end{aligned} $$
(ii)
$$\displaystyle \begin{aligned} \mbox{Hazard function }&=\frac{f_{14}(y=t)}{Pr\{y\ge t\}}=\frac{ab\beta\,\text{e}^{-bt}(1-a\text{e}^{-bt})^{\beta-1}}{1-(1-a\text{e}^{-bt})^{\beta}}.\end{aligned} $$
(iii)

Now, consider exponentiating a real scalar type-2 beta random variable whose density is given in (2.1.6). For convenience, we will let α 1 = α and α 2 = β. Letting \(u_2=a\text{e}^{-by}\) in (2.1.6), we obtain the following density:

$$\displaystyle \begin{aligned}f_{21}(y)=a^{\alpha}b\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}\text{e}^{-b\alpha y}(1+a\text{e}^{-by})^{-(\alpha+\beta)},\ -\infty<y<\infty, {}\end{aligned} $$
(2.1.11)

for \(a>0,\ b>0,\ \Re (\alpha )>0,\ \Re (\beta )>0\), and zero elsewhere. The model in (2.1.11) is in fact the generalized logistic model introduced by Mathai and Provost (2006). For the special case α = 1, β = 1, a = 1, b = 1 in (2.1.11), we have the following density:

$$\displaystyle \begin{aligned}f_{22}(y)=\frac{\text{e}^{-y}}{(1+\text{e}^{-y})^2}=\frac{\text{e}^{y}}{(1+\text{e}^y)^2},\ -\infty<y<\infty.\end{aligned} $$
(iv)

This is the famous logistic model which is utilized in industrial applications.
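For the special case (iv), one can verify numerically that exponentiating a type-2 beta variable with α = β = 1 produces the standard logistic distribution: if \(u_2\) is type-2 beta(1, 1) and \(u_2=\text{e}^{-y}\), then \(y=-\ln u_2\) should be standard logistic. The sketch below assumes NumPy and SciPy.

```python
# Logistic distribution obtained by exponentiating a type-2 beta(1, 1) variable; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
u2 = stats.betaprime(1, 1).rvs(size=500_000, random_state=rng)  # type-2 beta(1, 1)
y = -np.log(u2)                                                 # u2 = exp(-y)

print(stats.kstest(y, stats.logistic().cdf).statistic)  # should be small
```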

2.1.6. The Student-t distribution in the real domain

A real Student-t variable with ν degrees of freedom, denoted by t ν, is defined as \(t_{\nu }=\frac {z}{\sqrt {\chi _{\nu }^2/\nu }}\) where z ∼ N 1(0, 1) and \(\chi _{\nu }^2\) is a real scalar chisquare with ν degrees of freedom, z and \(\chi _{\nu }^2\) being independently distributed. It follows from the definition of a real F m,n random variable, that \(t_{\nu }^2=\frac {z^2}{\chi _{\nu }^2/\nu }=F_{1,\nu }\), an F random variable with 1 and ν degrees of freedom. Thus, the density of \(t_{\nu }^2\) is available from that of an F 1,ν. On substituting the values m = 1, n = ν in the F-density appearing in (2.1.4), we obtain the density of t 2 = w, denoted by f w(w), as

$$\displaystyle \begin{aligned}f_w(w)=\frac{\varGamma(\frac{\nu+1}{2})}{\sqrt{\pi}\varGamma(\frac{\nu}{2})}\Big(\frac{1}{\nu}\Big)^{\frac{1}{2}} \frac{w^{\frac{1}{2}-1}}{(1+\frac{w}{\nu})^{\frac{\nu+1}{2}}},\ 0\le w<\infty,{}\end{aligned} $$
(2.1.12)

for w = t 2, ν = 1, 2, … and f w(w) = 0 elsewhere. Since w = t 2, then the part of the density for t > 0 is available from (2.1.12) by observing that \(\frac {1}{2}w^{\frac {1}{2}-1}\text{d}w=\text{d}t\) for t > 0. Hence for t > 0 that part of the Student-t density is available from (2.1.12) as

$$\displaystyle \begin{aligned}f_{1t}(t)=2\frac{\varGamma(\frac{\nu+1}{2})}{\sqrt{\pi\nu}\varGamma(\frac{\nu}{2})}(1+\frac{t^2}{\nu})^{-(\frac{\nu+1}{2})},\ 0\le t<\infty,{}\end{aligned} $$
(2.1.13)

and zero elsewhere. Since (2.1.13) is symmetric, we extend it over (−∞, ∞) and so obtain the real Student-t density, denoted by f t(t). This is stated in the next theorem.

Theorem 2.1.6

Consider a real scalar standard normal variable z, which is divided by the square root of a real chisquare variable with ν degrees of freedom divided by its number of degrees of freedom ν, that is, \(t=\frac {z}{\sqrt {\chi _{\nu }^2/\nu }}\) , where z and \(\chi _{\nu }^2\) are independently distributed; then t is known as the real scalar Student-t variable and its density is given by

$$\displaystyle \begin{aligned}f_t(t)=\frac{\varGamma(\frac{\nu+1}{2})}{\sqrt{\pi\nu}\varGamma(\frac{\nu}{2})}\Big(1+\frac{t^2}{\nu}\Big)^{-(\frac{\nu+1}{2})},\ -\infty<t<\infty,{} \end{aligned} $$
(2.1.14)

for ν = 1, 2, ….
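Theorem 2.1.6 is easily illustrated by simulation, since SciPy provides the Student-t distribution directly; the sketch below assumes NumPy and SciPy, with an arbitrary choice of ν.

```python
# Simulation check that z / sqrt(chi2_nu / nu) follows a Student-t with nu df; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
nu, n = 5, 500_000
z = rng.normal(size=n)
chi2 = rng.chisquare(nu, size=n)
t = z / np.sqrt(chi2 / nu)

print(stats.kstest(t, stats.t(df=nu).cdf).statistic)  # should be small
```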

2.1a.4. The Student-t distribution in the complex domain

Let \(\tilde {z}\sim \tilde {N}_1(0,1)\) and \(\tilde {y}\sim \tilde {\chi }_{\nu }^2\) in the complex domain or equivalently \(\tilde {y}\) is distributed as a real gamma with the parameters (α = ν, β = 1), and let these random variables be independently distributed. Then, we will define Student-t with ν degrees of freedom in the complex domain as follows:

$$\displaystyle \begin{aligned}\tilde{t}=\tilde{t}_{\nu}=\frac{|\tilde{z}|}{\sqrt{\tilde{\chi}^2_{\nu}/\nu}},\ |\tilde{z}| =(z_1^2+z_2^2)^{\frac{1}{2}},\ \tilde{z}=z_1+iz_2 \end{aligned}$$

with z 1, z 2 real and \(i=\sqrt {(-1)}\). What is then the density of \(\tilde {t}_{\nu }\)? The joint density of \(\tilde {z}\) and \(\tilde {y}\), denoted by \(\tilde {f}(\tilde {z},\tilde {y})\), is

$$\displaystyle \begin{aligned}\tilde{f}(\tilde{z},\tilde{y})\text{d}\tilde{y}\wedge\text{d}\tilde{z}=\frac{1}{\pi \varGamma(\nu)}y^{\nu-1}\text{e}^{-y-|\tilde{z}|{}^2}\text{d}\tilde{y}\wedge\text{d}\tilde{z}. \end{aligned}$$

Let \(\tilde {z}=z_1+iz_2,\ i=\sqrt {(-1)}\), where \(z_1=r\cos \theta \) and \( z_2=r\sin \theta ,\ 0\le r<\infty ,\ 0\le \theta \le 2\pi \). Then, dz 1 ∧dz 2 = r dr ∧dθ, and the joint density of r and \(\tilde {y}\), denoted by f 1(r, y), is the following after integrating out θ, observing that y has a real gamma density:

$$\displaystyle \begin{aligned}f_1(r,y)\text{d}r\wedge\text{d}y=\frac{2}{\varGamma(\nu)}y^{\nu-1}\text{e}^{-y-r^2}r\text{d}r\wedge\text{d}y. \end{aligned}$$

Let \(u=t^2=\frac {\nu r^2}{y}\) and y = w. Then, \(\text{d}u\wedge \text{d}w=\frac {2\nu r}{w}\text{d}r\wedge \text{d}y\) and so, \(r\text{d}r\wedge \text{d}y=\frac {w}{2\nu }\text{d}u\wedge \text{d}w\). Letting the joint density of u and w be denoted by f 2(u, w), we have

$$\displaystyle \begin{aligned}f_2(u,w)=\frac{1}{\nu\varGamma(\nu)}w^{\nu}\text{e}^{-(w+\frac{uw}{\nu})} \end{aligned}$$

and the marginal density of u, denoted by g(u), is as follows:

$$\displaystyle \begin{aligned}g(u)=\int_{w=0}^{\infty}f_2(u,w)\text{d}w=\int_0^{\infty}\frac{w^{\nu}}{\varGamma(\nu+1)}\text{e}^{-w(1+\frac{u}{\nu})}\text{d}w=\Big(1+\frac{u}{\nu}\Big)^{-(\nu+1)} \end{aligned}$$

for 0 ≤ u < ∞, ν = 1, 2, …, u = t 2, and zero elsewhere. Thus, the part of the density of t for t > 0, denoted by f 1t(t), is as follows, observing that du = 2t dt for t > 0:

$$\displaystyle \begin{aligned}f_{1t}(t)=2t\Big(1+\frac{t^2}{\nu}\Big)^{-(\nu+1)}, \ 0\le t<\infty,\ \nu=1,2,... {} \end{aligned} $$
(2.1a.6)

Extending this density over the real line, we obtain the following density of \(\tilde {t}\) in the complex case:

$$\displaystyle \begin{aligned}\tilde{f}_{\nu}(t)=|t|\Big(1+\frac{t^2}{\nu}\Big)^{-(\nu+1)},\ -\infty<t<\infty,\ \nu=1,.... {} \end{aligned} $$
(2.1a.7)

Thus, the following result:

Theorem 2.1a.6

Let \(\tilde {z}\sim \tilde {N}_1(0,1), \tilde {y}\sim \tilde {\chi }_{\nu }^2\) , a scalar chisquare in the complex domain and let \(\tilde {z}\) and \(\tilde {y}\) in the complex domain be independently distributed. Consider the real variable \(t=t_{\nu }=\frac {|\tilde {z}|}{\sqrt {\tilde {y}/\nu }}\) . Then this t will be called a Student-t with ν degrees of freedom in the complex domain and its density is given by (2.1a.7).

2.1.7. The Cauchy distribution in the real domain

We have already seen a ratio distribution in Sect. 2.1.3, namely the real type-2 beta distribution and, as particular cases, the real F-distribution and the real t 2 distribution. We now consider a ratio of two independently distributed real standard normal variables. Let z 1 ∼ N 1(0, 1) and z 2 ∼ N 1(0, 1) be independently distributed. The joint density of z 1 and z 2, denoted by f(z 1, z 2), is given by

$$\displaystyle \begin{aligned}f(z_1,z_2)=\frac{1}{2\pi}\text{e}^{-\frac{1}{2}(z_1^2+z_2^2)},\ -\infty<z_j<\infty,j=1,2. \end{aligned}$$

Consider the quadrant z 1 > 0, z 2 > 0 and the transformation \(u=\frac {z_1}{z_2}, \ v=z_2\). Then dz 1 ∧dz 2 = vdu ∧dv, see Sect. 2.1.3. Note that u > 0 covers the quadrants z 1 > 0, z 2 > 0 and z 1 < 0, z 2 < 0. The part of the density of u in the quadrant u > 0, v > 0, denoted as g(u, v), is given by

$$\displaystyle \begin{aligned}g(u,v)=\frac{v}{2\pi}\text{e}^{-\frac{1}{2}v^2(1+u^2)}\end{aligned}$$

and that part of the marginal density of u, denoted by g 1(u), is

$$\displaystyle \begin{aligned}g_1(u)=\frac{1}{2\pi}\int_0^{\infty}v\text{e}^{-v^2\frac{(1+u^2)}{2}}\text{d}v=\frac{1}{2\pi (1+u^2)}. \end{aligned}$$

The other two quadrants z 1 > 0, z 2 < 0 and z 1 < 0, z 2 > 0, which correspond to u < 0, will yield the same form as above. Accordingly, the density of the ratio \(u=\frac {z_1}{z_2}\), known as the real Cauchy density, is as specified in the next theorem.

Theorem 2.1.7

Consider the independently distributed real standard normal variables z 1 ∼ N 1(0, 1) and z 2 ∼ N 1(0, 1). Then the ratio \(u=\frac {z_1}{z_2}\) has the real Cauchy distribution having the following density:

$$\displaystyle \begin{aligned}g_u(u)=\frac{1}{\pi (1+u^2)},\ -\infty<u<\infty.{}\end{aligned} $$
(2.1.15)

By integrating over each of the intervals (−∞, 0) and (0, ∞), with the help of a type-2 beta integral, it can be established that (2.1.15) is indeed a density. Since g u(u) is symmetric, \(Pr\{u\le 0\}=Pr\{u\ge 0\}=\frac {1}{2}\), and one could posit that the mean value of u may be zero. However, observe that

$$\displaystyle \begin{aligned}\int_0^{\infty}\frac{u}{1+u^2}\text{d}u=\frac{1}{2}\ln(1+u^2)\big|{}_0^{\infty}\to\infty. \end{aligned}$$

Thus, E(u), the mean value of a real Cauchy random variable, does not exist, which implies that the higher moments do not exist either.
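Both the Cauchy form of the ratio in Theorem 2.1.7 and the non-existence of its mean can be seen in a simulation sketch, assuming NumPy and SciPy; note that the running sample mean fails to settle down as the sample grows.

```python
# Ratio of independent standard normals versus the Cauchy distribution; illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
z1 = rng.normal(size=500_000)
z2 = rng.normal(size=500_000)
u = z1 / z2

print(stats.kstest(u, stats.cauchy().cdf).statistic)  # should be small
# Running means do not converge, reflecting the non-existence of E(u):
print(np.cumsum(u)[[999, 99_999, 499_999]] / np.array([1_000, 100_000, 500_000]))
```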

Exercises 2.1

2.1.1

Consider someone throwing a dart at a board, aiming at a particular point on the board. Taking this target point as the origin, consider a rectangular coordinate system. If (x, y) is the point of hit, then compute the densities of x and y under the following assumptions: (1): There is no bias in the horizontal and vertical directions, that is, x and y are independently distributed; (2): The joint density is a function of the distance from the origin, \(\sqrt {x^2+y^2}\). That is, if f 1(x) and f 2(y) are the densities of x and y, then it is given that \(f_1(x)f_2(y)=g(\sqrt {x^2+y^2})\) where f 1, f 2, g are unknown functions. Show that f 1 and f 2 are identical real normal densities.

2.1.2

Generalize Exercise 2.1.1 to 3-dimensional Euclidean space or

$$\displaystyle \begin{aligned}g(\sqrt{x^2+y^2+z^2})=f_1(x)f_2(y)f_3(z).\end{aligned}$$

2.1.3

Generalize Exercise 2.1.2 to k-space, k ≥ 3.

2.1.4

Let f(x) be an arbitrary density. Then Shannon’s measure of entropy or uncertainty is \(S=-k\int _xf(x)\ln f(x)\text{d}x\) where k is a constant. Optimize S, subject to the conditions (a): \(\int _{-\infty }^{\infty }f(x)\text{d}x=1\); (b): Condition in (a) plus \(\int _{-\infty }^{\infty }xf(x)\text{d}x= \) given quantity; (c): The conditions in (b) plus \(\int _{-\infty }^{\infty }x^2f(x)\text{d}x=\) a given quantity. Show that under (a), f is a uniform density; under (b), f is an exponential density and under (c), f is a Gaussian density. Hint: Use Calculus of Variation.

2.1.5

Let the error of measurement 𝜖 satisfy the following conditions: (1): 𝜖 = 𝜖 1 + 𝜖 2 + ⋯, that is, it is a sum of infinitely many infinitesimal contributions 𝜖 j, where the 𝜖 j’s are independently distributed. (2): Suppose that 𝜖 j can only take the two values δ with probability \(\frac {1}{2}\) and − δ with probability \(\frac {1}{2}\) for all j. (3): Var(𝜖) = σ 2 < ∞. Then show that this error density is real Gaussian. Hint: Use the mgf. [This is Gauss’ derivation of the normal law, which is why the Gaussian density is also called the error curve.]

2.1.6

The pathway model of Mathai (2005) has the following form in the case of real positive scalar variable x:

$$\displaystyle \begin{aligned}f_1(x)=c_1x^{\gamma}[1-a(1-q)x^{\delta}]^{\frac{1}{1-q}},q<1,0\le x\le [a(1-q)]^{-\frac{1}{\delta}},\end{aligned}$$

for δ > 0, a > 0, γ > −1 and f 1(x) = 0 elsewhere. Show that this generalized type-1 beta form changes to generalized type-2 beta form for q > 1,

$$\displaystyle \begin{aligned}f_2(x)=c_2x^{\gamma}[1+a(q-1)x^{\delta}]^{-\frac{1}{q-1}},q>1,x\ge 0,\delta>0,a>0 \end{aligned}$$

and f 2(x) = 0 elsewhere, and for q → 1, the model goes into a generalized gamma form given by

$$\displaystyle \begin{aligned}f_3(x)=c_3x^{\gamma}\text{e}^{-ax^{\delta}},a>0,\delta>0,x\ge 0 \end{aligned}$$

and zero elsewhere. Evaluate the normalizing constants c 1, c 2, c 3. All models are available either from f 1(x) or from f 2(x) where q is the pathway parameter.

2.1.7

Make the transformation \(x=\text{e}^{t}\) in the generalized gamma model f 3(x) of Exercise 2.1.6. Show that an extreme-value density for t is obtained.

2.1.8

Consider the type-2 beta model

$$\displaystyle \begin{aligned}f(x)=\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}x^{\alpha-1}(1+x)^{-(\alpha+\beta)},x\ge 0, \Re(\alpha)>0,\Re(\beta)>0 \end{aligned}$$

and zero elsewhere. Make the transformation \(x=\text{e}^{y}\) and then show that y has a generalized logistic distribution, the logistic density being obtained as a particular case.

2.1.9

Show that for 0 ≤ x < ∞, β > 0, \(f(x)=c[1+\text{e}^{\alpha +\beta x}]^{-1}\) is a density, which is known as the Fermi-Dirac density. Evaluate the normalizing constant c.

2.1.10

Let \(f(x)=c[\text{e}^{\alpha +\beta x}-1]^{-1}\) for 0 ≤ x < ∞, β > 0. Show that f(x) is a density, known as the Bose-Einstein density. Evaluate the normalizing constant c.

2.1.11

Evaluate the incomplete gamma integral \(\gamma (\alpha ;b)=\int _0^bx^{\alpha -1}\text{e}^{-x}\text{d}x\) and show that it can be written in terms of the confluent hypergeometric series

$$\displaystyle \begin{aligned}{{}_1F_1}(\beta;\delta;y)=\sum_{k=0}^{\infty}\frac{(\beta)_k}{(\delta)_k}\frac{y^k}{k!},\end{aligned} $$

where (α)k = α(α + 1)⋯(α + k − 1), α≠0, (α)0 = 1, is the Pochhammer symbol. Evaluate the normalizing constant c if \(f(x)=c\,x^{\alpha -1}\text{e}^{-x},\ 0\le x\le a,\ \alpha >0\), and zero elsewhere, is a density.

2.1.12

Evaluate the incomplete beta integral \(b(\alpha ;\beta ;b)=\int _0^bx^{\alpha -1}(1-x)^{\beta -1}\text{d}x,\ \alpha >0,\ \beta >0,\ 0\le b\le 1\). Show that it is available in terms of a Gauss’ hypergeometric series of the form \({{ }_2F_1}(a,b;c;z)=\sum _{k=0}^{\infty }\frac {(a)_k(b)_k}{(c)_k}\frac {z^k}{k!},\ |z|<1.\)

2.1.13

For the pathway model in Exercise 2.1.6 compute the reliability function Pr{x ≥ t} when γ = 0 for all the cases q < 1, q > 1, q → 1.

2.1.14

Weibull density: In the generalized gamma density \(f(x)=cx^{\gamma -1}\text{e}^{-ax^{\delta }},x\ge 0,\gamma >0,a>0,\delta >0\) and zero elsewhere, if δ = γ then f(x) is called a Weibull density. For a Weibull density, evaluate the hazard function h(t) = f(t)∕Pr{x ≥ t}.

2.1.15

Consider a type-1 beta density \(f(x)=\frac {\varGamma (\alpha +\beta )}{\varGamma (\alpha )\varGamma (\beta )}x^{\alpha -1}(1-x)^{\beta -1},0\le x\le 1,\alpha >0,\beta >0\) and zero elsewhere. Let α = 1. Consider a power transformation x = y δ, δ > 0. Let this model be g(y). Compute the reliability function Pr{y ≥ t} and the hazard function h(t) = g(t)∕Pr{y ≥ t}.

2.1.16

Verify that if z is a real standard normal variable, \(E(\text{e}^{t\,z^2})=(1-2t)^{-1/2}\), t < 1∕2, which is the mgf of a chi-square random variable having one degree of freedom. Owing to the uniqueness of the mgf, this result establishes that \(z^2\sim \chi _1^2\).

2.2. Quadratic Forms, Chisquaredness and Independence in the Real Domain

Let x 1, …, x p be iid (independently and identically distributed) real scalar random variables distributed as N 1(0, 1) and X be a p × 1 vector whose components are x 1, …, x p, that is, X′ = (x 1, …, x p). Consider the real quadratic form u 1 = X′AX for some p × p real constant symmetric matrix A = A′. Then, we have the following result:

Theorem 2.2.1

The quadratic form u 1 = X′AX, A = A′, where the components of X are iid N 1(0, 1), is distributed as a real chisquare with r, r ≤ p, degrees of freedom if and only if A is idempotent, that is, A = A 2 , and A is of rank r.

Proof

When A = A′ is real, there exists an orthonormal matrix P, PP′ = I, P′P = I, such that P′AP = diag(λ 1, …, λ p), where the λ j’s are the eigenvalues of A. Consider the transformation X = PY  or Y = P′X. Then

$$\displaystyle \begin{aligned}X'AX=Y'P'APY=\lambda_1y_1^2+\lambda_2y_2^2+\cdots+\lambda_py_p^2 \end{aligned} $$
(i)

where y 1, …, y p are the components of Y  and λ 1, …, λ p are the eigenvalues of A. We have already shown in Theorem 2.1.1 that all linear functions of independent real normal variables are also real normal and hence, all the y j’s are normally distributed. The expectation of Y is E[Y ] = E[P′X] = P′E(X) = P′O = O and the covariance matrix associated with Y  is

$$\displaystyle \begin{aligned}\text{Cov}(Y)=E[Y-E(Y)][Y-E(Y)]'=E[YY']=P'\text{Cov}(X)P=P'IP=P'P=I \end{aligned}$$

which means that the y j’s are real standard normal variables that are mutually independently distributed. Hence, \(y_j^2\sim \chi _1^2\), that is, each \(y_j^2\) is a real chisquare with one degree of freedom, and the y j’s are all mutually independently distributed. If A = A 2 and the rank of A is r, then r of the eigenvalues of A are unities and the remaining ones are equal to zero as the eigenvalues of an idempotent matrix can only be equal to zero or one, the number of ones being equal to the rank of the idempotent matrix. Then the representation in (i) becomes a sum of r independently distributed real chisquares with one degree of freedom each, and hence the sum is a real chisquare with r degrees of freedom. Hence, the sufficiency of the result is proved. For the necessity, we assume that \(X'AX\sim \chi _r^2\) and we must prove that A = A 2 and A is of rank r. Note that it is assumed throughout that A = A′. If X′AX is a real chisquare having r degrees of freedom, then the mgf of u 1 = X′AX is given by \(M_{u_1}(t)=(1-2t)^{-\frac {r}{2}}\). From the representation given in (i), the mgf’s are as follows: \(M_{y_j^2}(t)=(1-2t)^{-\frac {1}{2}}\Rightarrow M_{\lambda _jy_j^2}(t)=(1-2\lambda _jt)^{-\frac {1}{2}},j=1,\ldots ,p\), the y j’s being independently distributed. Thus, the mgf of the right-hand side of (i) is \(M_{u_1}(t)=\prod _{j=1}^p(1-2\lambda _jt)^{-\frac {1}{2}}\). Hence, we have

$$\displaystyle \begin{aligned}(1-2t)^{-\frac{r}{2}}=\prod_{j=1}^p(1-2\lambda_jt)^{-\frac{1}{2}},\ 1-2t>0,\ 1-2\lambda_jt>0,\ j=1,\ldots,p. \end{aligned} $$
(ii)

Taking the natural logarithm of each side of (ii), expanding the terms and then comparing the coefficients of \(\frac {(2t)^n}{n}\) on both sides for n = 1, 2, …, we obtain equations of the type

$$\displaystyle \begin{aligned}r=\sum_{j=1}^p\lambda_j=\sum_{j=1}^p\lambda_j^2= \sum_{j=1}^p\lambda_j^3=\cdots\ \end{aligned} $$
(iii)

The only solution resulting from (iii) is that r of the λ j’s are unities and the remaining ones are zeros. This result, combined with the property that A = A′, guarantees that A is idempotent and of rank r.

Observe that the eigenvalues of a matrix being ones and zeros need not imply that the matrix is idempotent; take for instance triangular matrices whose diagonal elements are unities and zeros. However, this property combined with the symmetry assumption will guarantee that the matrix is idempotent.
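For a quick numerical illustration of this remark, the following Python sketch (the matrices are chosen here merely as examples) contrasts a non-symmetric triangular matrix whose eigenvalues are all zeros and ones, yet which is not idempotent, with a symmetric matrix having 0–1 eigenvalues, which necessarily is idempotent.

```python
import numpy as np

# Non-symmetric upper-triangular matrix: eigenvalues are 1, 1, 0, but T @ T != T
T = np.array([[1., 1., 0.],
              [0., 1., 0.],
              [0., 0., 0.]])
print(np.linalg.eigvals(T))       # [1. 1. 0.]
print(np.allclose(T @ T, T))      # False: 0-1 eigenvalues alone do not give idempotency

# Symmetric matrix with eigenvalues 0 and 1 (an orthogonal projection)
P = np.array([[0.5, 0.5, 0.],
              [0.5, 0.5, 0.],
              [0.,  0.,  1.]])
print(np.linalg.eigvals(P))       # eigenvalues 0, 1, 1 (in some order)
print(np.allclose(P @ P, P))      # True: symmetry plus 0-1 eigenvalues imply idempotency
```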

Corollary 2.2.1

If the simple random sample or the iid variables came from a real N 1(0, σ 2) distribution, then the modification needed in Theorem 2.2.1 is that \(\frac {1}{\sigma ^2}X'AX\sim \chi _r^2,\ A=A'\) , if and only if A = A 2 and A is of rank r.

The above result, Theorem 2.2.1, coupled with another result on the independence of quadratic forms, is quite useful in the areas of Design of Experiments, Analysis of Variance and Regression Analysis, as well as in model building and hypothesis testing situations. This result on the independence of quadratic forms is stated next.

Theorem 2.2.2

Let x 1, …, x p be iid variables from a real N 1(0, 1) population. Consider two real quadratic forms u 1 = X′AX, A = A′ and u 2 = X′BX, B = B′, where the components of the p × 1 vector X are the x 1, …, x p . Then, u 1 and u 2 are independently distributed if and only if AB = O.

Proof

Let us assume that AB = O. Then AB = O = O′ = (AB)′ = B′A′ = BA. When AB = BA, there exists a single orthonormal matrix P, PP′ = I, P′P = I, such that both the quadratic forms are reduced to their canonical forms by the same P. Let

$$\displaystyle \begin{aligned}u_1=X'AX=\lambda_1y_1^2+\cdots+\lambda_py_p^2\end{aligned} $$
(i)

and

$$\displaystyle \begin{aligned}u_2=X'BX=\nu_1y_1^2+\cdots+\nu_py_p^2\end{aligned} $$
(ii)

where λ 1, …, λ p are the eigenvalues of A and ν 1, …, ν p are the eigenvalues of B. Since A = A′, the eigenvalues λ j’s are all real. Moreover,

$$\displaystyle \begin{aligned}AB=O\Rightarrow P'ABP=(P'AP)(P'BP)=\text{diag}(\lambda_1\nu_1,\ldots,\lambda_p\nu_p)=O \end{aligned} $$
(iii)

which means that λ j ν j = 0 for all j = 1, …, p. Thus, whenever a λ j is not zero, the corresponding ν j is zero and vice versa. Accordingly, the λ j’s and ν j’s are separated in (i) and (ii), that is, the independent components are mathematically separated and hence u 1 and u 2 are statistically independently distributed. The converse which can be stated as follows: if u 1 and u 2 are independently distributed, A = A′, B = B′ and the x j’s are real iid N 1(0, 1), then AB = O, is more difficult to establish. The proof which requires additional properties of matrices, will not be herein presented. Note that there are several incorrect or incomplete “proofs” in the literature. A correct derivation may be found in Mathai and Provost (1992).

When x 1, …, x p are iid N 1(0, σ 2), the above result on the independence of quadratic forms still holds since the independence is not altered by multiplying the quadratic forms by \(\frac {1}{\sigma ^2}\).

Example 2.2.1

Construct two 3 × 3 matrices A and B such that A = A′, B = B′ [both are symmetric], A = A 2 [A is idempotent], AB = O [A and B are orthogonal to each other], and A has rank 2. Then (1): verify Theorem 2.2.1; (2): verify Theorem 2.2.2.

Solution 2.2.1

Consider the following matrices:

$$\displaystyle \begin{aligned}A=\begin{bmatrix}\frac{1}{2}&0&-\frac{1}{2}\\ 0&1&0\\ -\frac{1}{2}&0&\frac{1}{2}\end{bmatrix},\qquad B=\begin{bmatrix}1&0&1\\ 0&0&0\\ 1&0&1\end{bmatrix}.\end{aligned} $$

Note that both A and B are symmetric, that is, A = A′, B = B′. Further, the rank of A is 2 since the first and second row vectors are linearly independent and the third row is a multiple of the first one. Note that A 2 = A and AB = O. Now, consider the quadratic forms u = X′AX and v = X′BX. Then \(u=\frac {1}{2}x_1^2+x_2^2+\frac {1}{2}x_3^2-x_1x_3=x_2^2+[\frac {1}{\sqrt {2}}(x_1-x_3)]^2\). Our initial assumption is that x j ∼ N 1(0, 1), j = 1, 2, 3 and the x j’s are independently distributed. Let \(y_1=\frac {1}{\sqrt {2}}(x_1-x_3)\). Then, \(E[y_1]=0, \text{Var}(y_1)=+\frac {1}{2}[\text{Var}(x_1)+\text{Var}(x_3)]=\frac {1}{2}[1+1]=1\). Since y 1 is a linear function of normal variables, y 1 is normal with the parameters E[y 1] = 0 and Var(y 1) = 1, that is, y 1 ∼ N 1(0, 1), and hence \(y_1^2\sim \chi ^2_1\); as well, \(x_2^2\sim \chi ^2_1\). Thus, \(u\sim \chi ^2_2\) since x 2 and y 1 are independently distributed given that the variables are separated, noting that y 1 does not involve x 2. This verifies Theorem 2.2.1. Now, having already determined that AB = O, it remains to show that u and v are independently distributed where \(v=X'BX=x_1^2+x_3^2+2x_1x_3=(x_1+x_3)^2\). Let \(y_2=\frac {1}{\sqrt {2}}(x_1+x_3)\Rightarrow y_2\sim N_1(0,1)\) as y 2 is a linear function of normal variables and hence normal with parameters E[y 2] = 0 and Var(y 2) = 1. On noting that v does not contain x 2, we need only consider the parts of u and v containing x 1 and x 3. Thus, our question reduces to: are y 1 and y 2 independently distributed? Since both y 1 and y 2 are linear functions of normal variables, both y 1 and y 2 are normal. Since the covariance between y 1 and y 2, that is, \(\text{Cov}(y_1,y_2)=\frac {1}{2}\text{Cov}(x_1-x_3,x_1+x_3)=\frac {1}{2}[\text{Var}(x_1)-\text{Var}(x_3)]=\frac {1}{2}[1-1]=0,\) the two normal variables are uncorrelated and hence, independently distributed. That is, y 1 and y 2 are independently distributed, thereby implying that u and v are also independently distributed, which verifies Theorem 2.2.2.
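Readers who wish to double-check this solution numerically may use the following minimal Python sketch. It takes the matrices A and B displayed above (as implied by the quadratic forms u and v), verifies A = A², AB = O and rank(A) = 2, and then simulates u = X′AX and v = X′BX; the empirical mean and variance of u should be close to 2 and 4, the values for a \(\chi^2_2\) variable, and the sample correlation between u and v should be near zero, consistent with independence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matrices corresponding to u = X'AX and v = X'BX in Example 2.2.1
A = np.array([[0.5, 0.0, -0.5],
              [0.0, 1.0,  0.0],
              [-0.5, 0.0, 0.5]])
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 0.0, 0.0],
              [1.0, 0.0, 1.0]])

print(np.allclose(A @ A, A), np.allclose(A @ B, np.zeros((3, 3))))  # True True
print(np.linalg.matrix_rank(A))                                     # 2

X = rng.standard_normal((100_000, 3))       # rows are iid N_1(0,1) triples
u = np.einsum('ij,jk,ik->i', X, A, X)       # u = X'AX for each simulated X
v = np.einsum('ij,jk,ik->i', X, B, X)       # v = X'BX for each simulated X

# A chisquare with 2 degrees of freedom has mean 2 and variance 4
print(u.mean(), u.var())                    # approximately 2 and 4
print(np.corrcoef(u, v)[0, 1])              # approximately 0
```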

2.2a. Hermitian Forms, Chisquaredness and Independence in the Complex Domain

Let \(\tilde {x}_1,\tilde {x}_2,\ldots ,\tilde {x}_k\) be independently and identically distributed standard univariate Gaussian variables in the complex domain and let \(\tilde {X}\) be a k × 1 vector whose components are \(\tilde {x}_1,\ldots ,\tilde {x}_k\). Consider the Hermitian form \(\tilde {X}^{*}A\tilde {X},\ A=A^{*}\) (Hermitian) where A is a k × k constant Hermitian matrix. Then, there exists a unitary matrix Q, QQ* = I, Q*Q = I, such that Q*AQ = diag(λ 1, …, λ k). Note that the λ j’s are real since A is Hermitian. Consider the transformation \(\tilde {X}=Q\tilde {Y}\). Then,

$$\displaystyle \begin{aligned}\tilde{X}^{*}A\tilde{X}=\lambda_1|\tilde{y}_1|{}^2+\cdots+\lambda_k|\tilde{y}_k|{}^2\end{aligned} $$

where the \(\tilde {y}_j\)’s are iid standard normal in the complex domain, \(\tilde {y}_j\sim \tilde {N}_1(0,1),j=1,\ldots ,k\). Then, \(\tilde {y}_j^{*}\tilde {y}_j=|\tilde {y}_j|{ }^2\sim \tilde {\chi }^2_1\), a chisquare having one degree of freedom in the complex domain or, equivalently, a real gamma random variable with the parameters (α = 1, β = 1), the \(|\tilde {y}_j|{ }^2\)’s being independently distributed for j = 1, …, k. Thus, we can state the following result whose proof parallels that in the real case.

Theorem 2.2a.1

Let \(\tilde {x}_1,\ldots ,\tilde {x}_k\) be iid \(\tilde {N}_1(0,1)\) variables in the complex domain. Consider the Hermitian form \(u=\tilde {X}^{*}A\tilde {X}, \ A=A^{*}\) where \(\tilde {X}\) is a k × 1 vector whose components are \(\tilde {x}_1,\ldots ,\tilde {x}_k\) . Then, u is distributed as a chisquare in the complex domain with r degrees of freedom or a real gamma with the parameters (α = r, β = 1), if and only if A is of rank r and A = A 2.

Theorem 2.2a.2

Let the \(\tilde {x}_j\) ’s and \(\tilde {X}\) be as in Theorem 2.2a.1 . Consider two Hermitian forms \(u_1=\tilde {X}^{*}A\tilde {X}, A=A^{*}\) and \(u_2=\tilde {X}^{*}B\tilde {X}, B=B^{*}\) . Then, u 1 and u 2 are independently distributed if and only if AB = O (null matrix).

Example 2.2a.1

Construct two 3 × 3 Hermitian matrices A and B, that is, A = A*, B = B*, such that A = A 2 [idempotent] and is of rank 2 with AB = O. Then (1): verify Theorem 2.2a.1, (2): verify Theorem 2.2a.2.

Solution 2.2a.1

Consider the following matrices

$$\displaystyle \begin{aligned}A=\begin{bmatrix}\frac{1}{2}&0&-\frac{(1+i)}{\sqrt{8}}\\ 0&1&0\\ -\frac{(1-i)}{\sqrt{8}}&0&\frac{1}{2}\end{bmatrix},\qquad B=\begin{bmatrix}\frac{1}{2}&0&\frac{(1+i)}{\sqrt{8}}\\ 0&0&0\\ \frac{(1-i)}{\sqrt{8}}&0&\frac{1}{2}\end{bmatrix}.\end{aligned} $$

It can be readily verified that A = A*, B = B*, A = A 2, AB = O. Further, on multiplying the first row of A by \(-\frac {2(1-i)}{\sqrt {8}}\), we obtain the third row, and since the third row is a multiple of the first one and the first and second rows are linearly independent, the rank of A is 2. Our initial assumption is that \(\tilde {x}_j\sim \tilde {N}_1(0,1),\ j=1,2,3\), that is, they are univariate complex Gaussian, and they are independently distributed. Then, \(\tilde {x}_j^{*}\tilde {x}_j\sim \tilde {\chi }^2_1\), a chisquare with one degree of freedom in the complex domain or a real gamma random variable with the parameters (α = 1, β = 1) for each j = 1, 2, 3. Let us consider the Hermitian forms \(u=\tilde {X}^{*}A\tilde {X}\) and \(v=\tilde {X}^{*}B\tilde {X},\ \tilde {X}'=(\tilde {x}_1,\tilde {x}_2,\tilde {x}_3)\). Then

$$\displaystyle \begin{aligned} u&=\frac{1}{2}\tilde{x}_1^{*}\tilde{x}_1-\frac{(1+i)}{\sqrt{8}}\tilde{x}_1^{*}\tilde{x}_3-\frac{(1-i)}{\sqrt{8}}\tilde{x}_3^{*}\tilde{x}_1 +\frac{1}{2}\tilde{x}_3^{*}\tilde{x}_3+\tilde{x}_2^{*}\tilde{x}_2\\ &=\tilde{x}_2^{*}\tilde{x}_2+\frac{1}{2}[\tilde{x}_1^{*}\tilde{x}_1-4\frac{(1+i)}{\sqrt{8}}\tilde{x}_1^{*}\tilde{x}_3+\tilde{x}_3^{*}\tilde{x}_3]\\ &=\tilde{\chi}^2_1+\Big[\frac{1}{\sqrt{2}}(\tilde{y}_1-\tilde{x}_3)]^{*}[\frac{1}{\sqrt{2}}(\tilde{y}_1-\tilde{x}_3)\Big]\end{aligned} $$
(i)

where

$$\displaystyle \begin{aligned} \tilde{y}_1&=2\frac{(1+i)}{\sqrt{8}}\tilde{x}_1\Rightarrow E[\tilde{y}_1]=0, \ \text{Var}(\tilde{y}_1)=\Big|2\frac{(1+i)}{\sqrt{8}}\Big|{}^2\text{Var}(\tilde{x}_1)\\ \text{Var}(\tilde{y}_1)&=E\Big\{\Big[2\frac{(1+i)}{\sqrt{8}}\Big]^{*}\Big[2 \frac{(1+i)}{\sqrt{8}}\Big]\tilde{x}_1^{*}\tilde{x}_1\Big\}=E\{\tilde{x}_1^{*} \tilde{x}_1\}=\text{Var}(\tilde{x}_1)=1.\end{aligned} $$
(ii)

Since \(\tilde {y}_1\) is a linear function of \(\tilde {x}_1\), it is a univariate normal in the complex domain with parameters 0 and 1 or \(\tilde {y}_1\sim \tilde {N}_1(0,1)\). The part not containing the \(\tilde {\chi }^2_1\) in (i) can be written as follows:

$$\displaystyle \begin{aligned}\Big[\frac{1}{\sqrt{2}}(\tilde{y}_1-\tilde{x}_3)\Big]^{*}\Big[\frac{1}{\sqrt{2}}(\tilde{y}_1-\tilde{x}_3)\Big]\sim\tilde{\chi}^2_1 \end{aligned} $$
(iii)

since \(\tilde {y}_1-\tilde {x}_3\sim \tilde {N}_1(0,2)\) as \(\tilde {y}_1-\tilde {x}_3\) is a linear function of the normal variables \(\tilde {y}_1\) and \(\tilde {x}_3\). Therefore \(u=\tilde {\chi }^2_1+\tilde {\chi }^2_1=\tilde {\chi }^2_2\), that is, a chisquare having two degrees of freedom in the complex domain or a real gamma with the parameters (α = 2, β = 1). Observe that the two chisquares are independently distributed because one of them contains only \(\tilde {x}_2\) and the other, \(\tilde {x}_1\) and \(\tilde {x}_3\). This establishes (1). In order to verify (2), we first note that the Hermitian form v can be expressed as follows:

$$\displaystyle \begin{aligned}v=\frac{1}{2}\tilde{x}_1^{*}\tilde{x}_1+\frac{(1+i)}{\sqrt{8}}\tilde{x}_1^{*}\tilde{x}_3 +\frac{(1-i)}{\sqrt{8}}\tilde{x}_3^{*}\tilde{x}_1+\frac{1}{2}\tilde{x}_3^{*}\tilde{x}_3 \end{aligned}$$

which can be written in the following form by making use of steps similar to those leading to (iii):

$$\displaystyle \begin{aligned}v=\Big[2\frac{(1+i)}{\sqrt{8}}\frac{\tilde{x}_1}{\sqrt{2}}+\frac{\tilde{x}_3}{\sqrt{2}}\Big]^{*} \Big[2\frac{(1+i)}{\sqrt{8}}\frac{\tilde{x}_1}{\sqrt{2}} +\frac{\tilde{x}_3}{\sqrt{2}}\Big] =\tilde{\chi}^2_1 \end{aligned} $$
(iv)

or v is a chisquare with one degree of freedom in the complex domain. Observe that \(\tilde {x}_2\) is absent in (iv), so that we need only compare the terms containing \(\tilde {x}_1\) and \(\tilde {x}_3\) in (iii) and (iv). These terms are \(\tilde {y}_2=2\frac {(1+i)}{\sqrt {8}}\tilde {x}_1+\tilde {x}_3\) and \(\tilde {y}_3=2\frac {(1+i)}{\sqrt {8}}\tilde {x}_1-\tilde {x}_3\). Noting that the covariance between \(\tilde {y}_2\) and \(\tilde {y}_3\) is zero:

$$\displaystyle \begin{aligned}\text{Cov}(\tilde{y}_2,\tilde{y}_3)=\Big|2\frac{(1+i)}{\sqrt{8}}\Big|{}^2\text{Var}(\tilde{x}_1)-\text{Var}(\tilde{x}_3)=1-1=0, \end{aligned}$$

and that \(\tilde {y}_2\) and \(\tilde {y}_3\) are linear functions of normal variables and hence normal, the fact that they are uncorrelated implies that they are independently distributed. Thus, u and v are indeed independently distributed, which establishes (2).
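A similar numerical check can be performed in the complex domain. The sketch below assumes the convention used in this book that \(\tilde{x}\sim\tilde{N}_1(0,1)\) has independent real and imaginary parts, each N 1(0, 1/2), so that \(E|\tilde{x}|^2=1\); with the matrices A and B displayed in the solution (as implied by the Hermitian forms u and v), u should behave like a real gamma with parameters (α = 2, β = 1) (mean 2, variance 2), v like a real gamma with parameters (α = 1, β = 1), and u and v should be uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(1)
c = (1 + 1j) / np.sqrt(8)

# Hermitian matrices corresponding to u and v in Example 2.2a.1
A = np.array([[0.5, 0, -c],
              [0,   1,  0],
              [-np.conj(c), 0, 0.5]])
B = np.array([[0.5, 0,  c],
              [0,   0,  0],
              [np.conj(c), 0, 0.5]])

print(np.allclose(A.conj().T, A), np.allclose(A @ A, A),
      np.allclose(A @ B, np.zeros((3, 3))))              # True True True

n = 200_000
# Standard complex Gaussian in this convention: real, imaginary parts ~ N(0, 1/2)
X = (rng.normal(scale=np.sqrt(0.5), size=(n, 3))
     + 1j * rng.normal(scale=np.sqrt(0.5), size=(n, 3)))

u = np.einsum('ij,jk,ik->i', X.conj(), A, X).real        # X*AX, real valued
v = np.einsum('ij,jk,ik->i', X.conj(), B, X).real        # X*BX, real valued

# Gamma(2,1): mean 2, variance 2; Gamma(1,1): mean 1, variance 1
print(u.mean(), u.var())        # approximately 2 and 2
print(v.mean(), v.var())        # approximately 1 and 1
print(np.corrcoef(u, v)[0, 1])  # approximately 0
```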

2.2.1. Extensions of the results in the real domain

Let \(x_j\sim N_1(\mu _j,\sigma _j^2),\ j=1,\ldots ,k,\) be independently distributed. Then, \(\frac {x_j}{\sigma _j}\sim N_1(\frac {\mu _j}{\sigma _j},1),\ \sigma _j>0,\ j=1,\ldots ,k\). Let

$$\displaystyle \begin{aligned}X=\begin{bmatrix}x_1\\ \vdots\\ x_k\end{bmatrix},\quad \mu=\begin{bmatrix}\mu_1\\ \vdots\\ \mu_k\end{bmatrix},\quad \varSigma=\text{diag}(\sigma_1^2,\ldots,\sigma_k^2),\ \varSigma>O.\end{aligned} $$

Then, let

$$\displaystyle \begin{aligned}Y=\varSigma^{-\frac{1}{2}}X,\quad Y'=\Big(\frac{x_1}{\sigma_1},\ldots,\frac{x_k}{\sigma_k}\Big).\end{aligned} $$

If μ = O, it has already been shown that \(Y'Y\sim \chi _k^2\). If μ ≠ O, then \(Y'Y\sim \chi _k^2(\lambda ), \ \lambda =\frac {1}{2}\mu '\varSigma ^{-1}\mu \). It is assumed that the noncentral chisquare distribution has already been discussed in a basic course in Statistics. It is defined for instance in Mathai and Haubold (2017a, 2017b) and will be briefly discussed in Sect. 2.3.1. If μ = O, then for any k × k symmetric matrix \(A=A',\ Y'AY\sim \chi _r^2\) if and only if A = A 2 and A is of rank r. This result has already been established. Now, if μ = O, then \(X'AX=Y'\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}Y\sim \chi _r^2\) if and only if \(\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}=\varSigma ^{\frac {1}{2}}A\varSigma A\varSigma ^{\frac {1}{2}}\Rightarrow A=A\varSigma A\) and \(\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}\) is of rank r or A is of rank r since Σ > O. Hence, we have the following result:

Theorem 2.2.3

Let the real scalars \(x_j\sim N_1(\mu _j,\sigma _j^2),\ j=1,\ldots ,k,\) be independently distributed. Let

$$\displaystyle \begin{aligned}X'=(x_1,\ldots,x_k),\ \mu'=(\mu_1,\ldots,\mu_k),\ \varSigma=\mathit{\text{diag}}(\sigma_1^2,\ldots,\sigma_k^2),\ Y=\varSigma^{-\frac{1}{2}}X.\end{aligned} $$

Then for any k × k symmetric matrix A = A′,

$$\displaystyle \begin{aligned}X'AX=Y'\varSigma^{\frac{1}{2}}A\varSigma^{\frac{1}{2}}Y\sim\begin{cases}\chi_r^2\mathit{\mbox{ if }}\mu=O\\ \chi_r^2(\lambda)\mathit{\mbox{ if }}\mu\ne O,\ \lambda=\frac{1}{2}\mu'A\mu\end{cases}\end{aligned} $$

if and only if A = AΣA and A is of rank r.
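As an illustrative check of Theorem 2.2.3 by simulation, the following Python sketch uses one assumed choice of a diagonal Σ and of a matrix A satisfying A = AΣA with rank r = 2, and compares the empirical mean and variance of X′AX with ν + 2λ and 2ν + 8λ, the noncentral chisquare moments derived later in Sect. 2.3.1.1; note that scipy parametrizes the noncentrality as nc = 2λ.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

sigma2 = np.array([2.0, 3.0, 4.0])            # diagonal of Sigma (variances)
mu = np.array([1.0, -1.0, 2.0])
# One matrix with A = A Sigma A: A = diag(1/sigma_1^2, 1/sigma_2^2, 0), rank r = 2
A = np.diag([1 / sigma2[0], 1 / sigma2[1], 0.0])
Sigma = np.diag(sigma2)
print(np.allclose(A @ Sigma @ A, A))          # True

r = 2
lam = 0.5 * mu @ A @ mu                       # noncentrality parameter lambda
X = mu + rng.standard_normal((200_000, 3)) * np.sqrt(sigma2)
q = np.einsum('ij,jk,ik->i', X, A, X)         # X'AX for each simulated X

# Noncentral chisquare with r d.f.: mean r + 2*lam, variance 2r + 8*lam
print(q.mean(), r + 2 * lam)
print(q.var(), 2 * r + 8 * lam)
# scipy's noncentrality is nc = 2*lam in the present notation
print(stats.kstest(q, stats.ncx2(df=r, nc=2 * lam).cdf).statistic)  # small value
```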

Independence is not altered if the variables are relocated. Consider two quadratic forms X′AX and X′BX, A = A′, B = B′. Then, \(X'AX=Y'\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}Y\) and \(X'BX=Y'\varSigma ^{\frac {1}{2}}B\varSigma ^{\frac {1}{2}}Y,\) and we have the following result:

Theorem 2.2.4

Let x j, X, Y, Σ be as defined in Theorem 2.2.3 . Then, the quadratic forms \(X'AX=Y'\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}Y\) and \(X'BX=Y'\varSigma ^{\frac {1}{2}}B\varSigma ^{\frac {1}{2}}Y,\ A=A',\ B=B'\) , are independently distributed if and only if AΣB = O.

Let X, A and Σ be as defined in Theorem 2.2.3, Z be a standard normal vector whose components z i, i = 1, …, k are iid N 1(0, 1), and P be an orthonormal matrix such that \(P'\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}P=\text{diag}(\lambda _1,\ldots ,\lambda _k)\); then, a general quadratic form X′AX can be expressed as follows:

$$\displaystyle \begin{aligned}X'AX=(Z'\varSigma^{\frac{1}{2}}+\mu')A(\varSigma^{\frac{1}{2}}Z+\mu)= (Z+\varSigma^{-\frac{1}{2}}\mu)'PP'\varSigma^{\frac{1}{2}}A\varSigma^{\frac{1}{2}}PP'(Z+\varSigma^{-\frac{1}{2}}\mu) \end{aligned}$$

where λ 1, …, λ k are the eigenvalues of \(\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}\). Hence, the following decomposition of the quadratic form:

$$\displaystyle \begin{aligned}X'AX=\lambda_1(u_1+b_1)^2+\cdots+\lambda_k(u_k+b_k)^2, {}\end{aligned} $$
(2.2.1)

where \(U'=(u_1,\ldots,u_k)=Z'P\), that is, \(U=P'Z\), and \(b'=(b_1,\ldots,b_k)=\mu'\varSigma^{-\frac{1}{2}}P\), that is, \(b=P'\varSigma^{-\frac{1}{2}}\mu\), and hence the u i’s are iid N 1(0, 1).

Thus, X′AX can be expressed as a linear combination of independently distributed non-central chisquare random variables, each having one degree of freedom, whose non-centrality parameters are respectively \({b_j^2}/{2},\ j=1,\ldots ,k\). Of course, the k chisquares will be central when μ = O.
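The representation (2.2.1) can also be checked numerically. In the sketch below, Σ and A are assumed example values; the λ j’s and b j’s are computed exactly as described above, and the empirical mean of X′AX is compared with \(E[\sum_j\lambda_j(u_j+b_j)^2]=\sum_j\lambda_j(1+b_j^2)\).

```python
import numpy as np

rng = np.random.default_rng(3)

sigma2 = np.array([1.0, 2.0, 3.0])                    # Sigma = diag(sigma_j^2)
mu = np.array([0.5, -1.0, 1.5])
A = np.array([[2.0, 1.0, 0.0],                        # an arbitrary symmetric A
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0]])

S_half = np.diag(np.sqrt(sigma2))
lam, P = np.linalg.eigh(S_half @ A @ S_half)          # P' S^(1/2) A S^(1/2) P = diag(lam)
b = P.T @ np.linalg.inv(S_half) @ mu                  # b = P' Sigma^(-1/2) mu

# Mean implied by (2.2.1): sum_j lam_j * E[(u_j + b_j)^2] with u_j ~ N_1(0,1)
mean_theory = np.sum(lam * (1 + b ** 2))

X = mu + rng.standard_normal((200_000, 3)) * np.sqrt(sigma2)
q = np.einsum('ij,jk,ik->i', X, A, X)                 # X'AX for each simulated X
print(q.mean(), mean_theory)                          # the two values should be close
```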

2.2a.1. Extensions of the results in the complex domain

Let the complex scalar variables \(\tilde {x}_j\sim \tilde {N}_1(\tilde {\mu }_j,\sigma _j^2),\ j=1,\ldots ,k\), be independently distributed and \(\varSigma =\text{diag}(\sigma _1^2,\ldots ,\sigma _k^2)\). As well, let

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{x}_1\\ \vdots\\ \tilde{x}_k\end{bmatrix},\quad \tilde{\mu}=\begin{bmatrix}\tilde{\mu}_1\\ \vdots\\ \tilde{\mu}_k\end{bmatrix},\quad \tilde{Y}=\varSigma^{-\frac{1}{2}}\tilde{X},\end{aligned} $$

where \(\varSigma ^{\frac {1}{2}}\) is the Hermitian positive definite square root of Σ. In this case, \(\tilde {y}_j\sim \tilde {N}_1(\frac {\tilde {\mu }_j}{\sigma _j},1),\ j=1,\ldots ,k\), and the \(\tilde {y}_j\)’s are independently distributed. Hence, for any Hermitian form \(\tilde {X}^{*}A\tilde {X}, \ A=A^{*}\), we have \(\tilde {X}^{*}A\tilde {X}=\tilde {Y}^{*}\varSigma ^{\frac {1}{2}}A\varSigma ^{\frac {1}{2}}\tilde {Y}\). Thus, if \(\tilde {\mu }=O\) (the null vector), then from the previous result on chisquaredness, we have:

Theorem 2.2a.3

Let \(\tilde {X},\ \varSigma ,\tilde {Y},\ \tilde {\mu }\) be as defined above. Let \(u=\tilde {X}^{*}A\tilde {X},\ A=A^{*}\) be a Hermitian form. Then \(u\sim \tilde {\chi }^2_r\) in the complex domain if and only if A is of rank r, \(\tilde {\mu }=O\) and A = AΣA. [A chisquare with r degrees of freedom in the complex domain is a real gamma with parameters (α = r, β = 1).]

If \(\tilde {\mu }\ne O\), then we have a noncentral chisquare in the complex domain. A result on the independence of Hermitian forms can be obtained as well.

Theorem 2.2a.4

Let \(\tilde {X},\ \tilde {Y},\ \varSigma \) be as defined above. Consider the Hermitian forms \(u_1=\tilde {X}^{*}A\tilde {X}, \ A=A^{*}\) and \( u_2=\tilde {X}^{*}B\tilde {X},\ B=B^{*}\) . Then u 1 and u 2 are independently distributed if and only if AΣB = O.

The proofs of Theorems 2.2a.3 and 2.2a.4 parallel those presented in the real case and are hence omitted.

Exercises 2.2

2.2.1

Give a proof to the second part of Theorem 2.2.2, namely, given that X′AX, A = A′ and X′BX, B = B′ are independently distributed where the components of the p × 1 vector X are mutually independently distributed as real standard normal variables, then show that AB = O.

2.2.2

Let the real scalar x j ∼ N 1(0, σ 2), σ 2 > 0, j = 1, 2, …, k and be independently distributed. Let X′ = (x 1, …, x k) or X is the k × 1 vector where the elements are x 1, …, x k. Then the joint density of the real scalar variables x 1, …, x k, denoted by f(X), is

$$\displaystyle \begin{aligned}f(X)=\frac{1}{\sigma^k(\sqrt{2\pi})^k}\text{e}^{-\frac{1}{2\sigma^2}X'X},\ -\infty<x_j<\infty,\ j=1,\ldots,k. \end{aligned}$$

Consider the quadratic form u = X′AX, where A = A′ and X is as defined above. (1): Compute the mgf of u; (2): Compute the density of u if A is of rank r and all eigenvalues of A are equal to λ > 0; (3): If m of the eigenvalues are equal to λ 1 > 0 and the remaining n of them are equal to λ 2 < 0, m + n = r, compute the density of u.

2.2.3

In Exercise 2.2.2 compute the density of u if (1): r 1 of the eigenvalues are λ 1 each and r 2 of the eigenvalues are λ 2 each, r 1 + r 2 = r. Consider all situations λ 1 > 0, λ 2 > 0 etc.

2.2.4

In Exercise 2.2.2 compute the density of u for the general case with no restrictions on the eigenvalues.

2.2.5

Let x j ∼ N 1(0, σ 2), j = 1, 2 and be independently distributed. Let X′ = (x 1, x 2). Let u = X′AX where A = A′. Compute the density of u if the eigenvalues of A are (1): 2 and 1, (2): 2 and − 1; (3): Construct a real 2 × 2 matrix A = A′ where the eigenvalues are 2 and 1.

2.2.6

Show that the results on chisquaredness and independence need not hold if A ≠ A′, B ≠ B′ in the real domain, or if A ≠ A*, B ≠ B* in the complex domain.

2.2.7

Construct a 2 × 2 Hermitian matrix A = A* such that A = A 2 and verify Theorem 2.2a.3. Construct 2 × 2 Hermitian matrices A and B such that AB = O, and then verify Theorem 2.2a.4.

2.2.8

Let \(\tilde {x}_1,\ldots ,\tilde {x}_m\) be a simple random sample of size m from a complex normal population \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\). Let \(\tilde {y}_1,\ldots ,\tilde {y}_n\) be iid \(\tilde {N}(\tilde {\mu }_2,\sigma _2^2)\). Let the two complex normal populations be independent. Let

$$\displaystyle \begin{aligned} s_1^2&=\sum_{j=1}^m(\tilde{x}_j-\bar{\tilde{x}})^{*}(\tilde{x}_j-\bar{\tilde{x}})/\sigma_1^2, s_2^2=\sum_{j=1}^n( \tilde{y}_j-\bar{\tilde{y}})^{*}(\tilde{y}_j-\bar{\tilde{y}})/\sigma_2^2,\\ s_{11}^2&=\frac{1}{\sigma_1^2}\sum_{j=1}^m(\tilde{x}_j-\tilde{\mu}_1)^{*}(\tilde{x_j}-\tilde{\mu}_1), s_{21}^2=\frac{1}{\sigma_2^2}\sum_{j=1}^n(\tilde{y}_j-\tilde{\mu}_2)^{*}(\tilde{y}_j-\tilde{\mu}_2)\end{aligned} $$

Then, show that

$$\displaystyle \begin{aligned}\frac{s_{11}^2/m}{s_{21}^2/n}\sim\tilde{F}_{m,n}, \frac{s_1^2/(m-1)}{s_2^2/(n-1)}\sim\tilde{F}_{m-1,n-1}\end{aligned}$$

for \(\sigma _1^2=\sigma _2^2\).

2.2.9

In Exercise 2.2.8 show that \(\frac {s_{11}^2}{s_{21}^2}\) is a type-2 beta with the parameters m and n, and \(\frac {s_1^2}{s_2^2}\) is a type-2 beta with the parameters m − 1 and n − 1 for \(\sigma _1^2=\sigma _2^2\).

2.2.10

In Exercise 2.2.8 if \(\sigma _1^2=\sigma _2^2=\sigma ^2\) then show that

$$\displaystyle \begin{aligned}\frac{1}{\sigma^2}\Big[\sum_{j=1}^m(\tilde{x}_j-\bar{\tilde{x}})^{*}(\tilde{x}_j-\bar{\tilde{x}}) +\sum_{j=1}^n(\tilde{y}_j-\bar{\tilde{y}})^{*}(\tilde{y}_j-\bar{\tilde{y}})\Big]\sim\tilde{\chi}^2_{m+n-2}.\end{aligned}$$

2.2.11

In Exercise 2.2.10 if \(\bar {\tilde {x}}\) and \(\bar {\tilde {y}}\) are replaced by \(\tilde {\mu }_1\) and \(\tilde {\mu }_2\) respectively then show that the degrees of freedom of the chisquare is m + n.

2.2.12

Derive the representation of the general quadratic form X′AX given in (2.2.1).

2.3. Simple Random Samples from Real Populations and Sampling Distributions

For practical applications, an important result is that on the independence of the sample mean and sample variance when the sample comes from a normal (Gaussian) population. Let x 1, …, x n be a simple random sample of size n from a real \(N_1(\mu _1,\sigma _1^2)\) or, equivalently, x 1, …, x n are iid \(N_1(\mu _1,\sigma _1^2)\). Recall that we have established that any linear function L′X = X′L, L′ = (a 1, …, a n), X′ = (x 1, …, x n) remains normally distributed (Theorem 2.1.1). Now, consider two linear forms \(y_1=L_1^{\prime }X ,\ y_2=L_2^{\prime }X,\) with \( L_1^{\prime }=(a_1,\ldots ,a_n),\ L_2^{\prime }=(b_1,\ldots ,b_n)\) where a 1, …, a n, b 1, …, b n are real scalar constants. Let us examine the conditions that are required for assessing the independence of the linear forms y 1 and y 2. Since x 1, …, x n are iid, we can determine the joint mgf of x 1, …, x n. We take a n × 1 parameter vector T, T′ = (t 1, …, t n) where the t j’s are scalar parameters. Then, by definition, the joint mgf is given by

$$\displaystyle \begin{aligned}E[\text{e}^{T'X}]=\prod_{j=1}^nM_{x_j}(t_j)=\prod_{j=1}^n\text{e}^{t_j\mu_1+\frac{1}{2}t_j^2\sigma_1^2}=\text{e}^{\mu_1T'J+\frac{\sigma_1^2}{2}T'T}{}\end{aligned} $$
(2.3.1)

since the x j’s are iid, J′ = (1, …, 1). Since every linear function of x 1, …, x n is a univariate normal, we have \(y_1\sim N_1(\mu _1L_1^{\prime }J,\sigma _1^2L_1^{\prime }L_1)\) and hence the mgf of y 1, taking t 1 as the parameter for the mgf, is \(M_{y_1}(t_1)=\text{e}^{t_1\mu _1L_1^{\prime }J+\frac {\sigma _1^2t_1^2}{2}L_1^{\prime }L_1}.\) Now, let us consider the joint mgf of y 1 and y 2, taking t 1 and t 2 as the respective parameters. Let the joint mgf be denoted by \(M_{y_1,y_2}(t_1,t_2)\). Then,

$$\displaystyle \begin{aligned} M_{y_1,y_2}(t_1,t_2)&=E[\text{e}^{t_1y_1+t_2y_2}]=E[\text{e}^{(t_1L_1^{\prime}+t_2L_2^{\prime})X}]\\ &=\text{e}^{\mu_1(t_1L_1^{\prime}+t_2L_2^{\prime})J+\frac{\sigma_1^2}{2}(t_1L_1+t_2L_2)'(t_1L_1+t_2L_2)}\\ &=\text{e}^{\mu_1(t_1L_1^{\prime}+t_2L_2^{\prime})J+\frac{\sigma_1^2}{2}(t_1^2L_1^{\prime}L_1+t_2^2L_2^{\prime}L_2+2t_1t_2L_1^{\prime}L_2)}\\ &=M_{y_1}(t_1)M_{y_2}(t_2)\text{e}^{{\sigma_1^2}t_1t_2L_1^{\prime}L_2}.\end{aligned} $$

Hence, the exponent of the last factor on the right-hand side has to vanish, that is, the factor must equal 1, for y 1 and y 2 to be independently distributed, and this can happen if and only if \(L_1^{\prime }L_2=L_2^{\prime }L_1=0\) since t 1 and t 2 are arbitrary. Thus, we have the following result:

Theorem 2.3.1

Let x 1, …, x n be iid \(N_1(\mu _1,\sigma _1^2)\) . Let \(y_1=L_1^{\prime }X\) and \( y_2=L_2^{\prime }X\) where X′ = (x 1, …, x n), \(L_1^{\prime }=(a_1,\ldots ,a_n)\) and \( L_2^{\prime }=(b_1,\ldots ,b_n),\) the a j ’s and b j ’s being scalar constants. Then, y 1 and y 2 are independently distributed if and only if \(L_1^{\prime }L_2=L_2^{\prime }L_1=0\).

Example 2.3.1

Let x 1, x 2, x 3, x 4 be a simple random sample of size 4 from a real normal population N 1(μ = 1, σ 2 = 2). Consider the following statistics: (1): u 1, v 1, w 1, (2): u 2, v 2, w 2. Check the statistics in (1) and in (2) for pairwise independence, where

$$\displaystyle \begin{aligned} u_1&=\bar{x}=\frac{1}{4}(x_1+x_2+x_3+x_4),\ v_1=2x_1-3x_2+x_3+x_4,\ w_1=x_1-x_2+x_3-x_4;\\ u_2&=\bar{x}=\frac{1}{4}(x_1+x_2+x_3+x_4),\ v_2=x_1-x_2+x_3-x_4, \ w_2=x_1-x_2-x_3+x_4.\end{aligned} $$

Solution 2.3.1

Let X′ = (x 1, x 2, x 3, x 4) and let the coefficient vectors in (1) be denoted by L 1, L 2, L 3 and those in (2) be denoted by M 1, M 2, M 3. Thus, they are as follows:

$$\displaystyle \begin{aligned}L_1=\frac{1}{4}\begin{bmatrix}1\\ 1\\ 1\\ 1\end{bmatrix},\ L_2=\begin{bmatrix}2\\ -3\\ 1\\ 1\end{bmatrix},\ L_3=\begin{bmatrix}1\\ -1\\ 1\\ -1\end{bmatrix}\Rightarrow L_1^{\prime}L_2=\frac{1}{4}\ne 0,\ L_1^{\prime}L_3=0,\ L_2^{\prime}L_3=5\ne 0.\end{aligned} $$

This means that u 1 and w 1 are independently distributed and that the other pairs are not independently distributed. The coefficient vectors in (2) are

$$\displaystyle \begin{aligned}M_1=\frac{1}{4}\begin{bmatrix}1\\ 1\\ 1\\ 1\end{bmatrix},\ M_2=\begin{bmatrix}1\\ -1\\ 1\\ -1\end{bmatrix},\ M_3=\begin{bmatrix}1\\ -1\\ -1\\ 1\end{bmatrix}\Rightarrow M_1^{\prime}M_2=0,\ M_1^{\prime}M_3=0,\ M_2^{\prime}M_3=0.\end{aligned} $$

This means that all the pairs are independently distributed, that is, u 2, v 2 and w 2 are mutually independently distributed.
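The inner products used in this solution are easily verified numerically, for instance with the short Python sketch below.

```python
import numpy as np

L1 = np.array([1, 1, 1, 1]) / 4                 # coefficients of u_1 = x-bar
L2 = np.array([2, -3, 1, 1])                    # coefficients of v_1
L3 = np.array([1, -1, 1, -1])                   # coefficients of w_1
M1, M2, M3 = L1, L3, np.array([1, -1, -1, 1])   # coefficients of u_2, v_2, w_2

print(L1 @ L2, L1 @ L3, L2 @ L3)   # 0.25, 0.0, 5 -> only u_1 and w_1 are independent
print(M1 @ M2, M1 @ M3, M2 @ M3)   # 0.0, 0.0, 0 -> u_2, v_2, w_2 mutually independent
```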

We can extend Theorem 2.3.1 to sets of linear functions. Let Y 1 = A 1 X and Y 2 = A 2 X where A 1 of dimension m 1 × n, m 1 ≤ n, and A 2 of dimension m 2 × n, m 2 ≤ n, are constant matrices and X′ = (x 1, …, x n) where the x j’s are iid \(N_1(\mu _1,\sigma _1^2)\). Let the parameter vectors T 1 and T 2 be of dimensions m 1 × 1 and m 2 × 1, respectively. Then, the mgf of Y 1 is \(M_{Y_1}(T_1)=E[\text{e}^{T_1^{\prime }Y_1}]=E[\text{e}^{T_1^{\prime }A_1X}]\), which can be evaluated by integration over the joint density of x 1, …, x n, individually, or over the vector X′ = (x 1, …, x n) with E[X′] = [μ 1, μ 1, …, μ 1] = μ 1[1, 1, …, 1] = μ 1 J′, J′ = [1, …, 1] ⇒ E[Y 1] = μ 1 A 1 J. The mgf of Y 1 is then

$$\displaystyle \begin{aligned}M_{Y_1}(T_1)=E[\text{e}^{\mu_1T_1^{\prime}A_1J+T_1^{\prime}A_1[X-E(X)]}]=E[\text{e}^{\mu_1T_1^{\prime}A_1J+T_1^{\prime}A_1Z}], \ Z=X-E(X), \end{aligned} $$
(i)

and the exponent in the expected value, not containing μ 1, simplifies to

$$\displaystyle \begin{aligned}-\frac{1}{2\sigma_1^2}\{Z'Z-2\sigma_1^2T_1^{\prime}A_1Z\}=-\frac{1}{2\sigma_1^2}\{(Z-\sigma_1^2 A_1^{\prime}T_1)'(Z-\sigma_1^2 A_1^{\prime}T_1)-\sigma_1^4T_1^{\prime}A_1A_1^{\prime}T_1\}. \end{aligned}$$

Integration over Z or individually over the elements of Z, that is, z 1, …, z n, yields 1 since the total probability is 1, which leaves the factor not containing Z. Thus,

$$\displaystyle \begin{aligned} M_{Y_1}(T_1)=\text{e}^{\mu_1T_1^{\prime}A_1J+\frac{\sigma_1^2}{2}T_1^{\prime}A_1A_1^{\prime}T_1}, \end{aligned} $$
(ii)

and similarly,

$$\displaystyle \begin{aligned}M_{Y_2}(T_2)=\text{e}^{\mu_1T_2^{\prime}A_2J+\frac{\sigma_1^2}{2}T_2^{\prime}A_2A_2^{\prime}T_2}. \end{aligned} $$
(iii)

The joint mgf of Y 1 and Y 2 is then

$$\displaystyle \begin{aligned} M_{Y_1,Y_2}(T_1,T_2)&=\text{e}^{\mu_1(T_1^{\prime}A_1J+T_2^{\prime}A_2J)+\frac{\sigma_1^2}{2}(T_1^{\prime}A_1+T_2^{\prime}A_2)(T_1^{\prime}A_1+T_2^{\prime}A_2)'}\\ &=M_{Y_1}(T_1)M_{Y_2}(T_2)\text{e}^{\sigma_1^2T_1^{\prime}A_1A_2^{\prime}T_2}.\end{aligned} $$
(iv)

Accordingly, Y 1 and Y 2 will be independent if and only if \(A_1A_2^{\prime }=O\Rightarrow A_2A_1^{\prime }=O\) since T 1 and T 2 are arbitrary parameter vectors, the two null matrices having different orders. Then, we have

Theorem 2.3.2

Let Y 1 = A 1 X and Y 2 = A 2 X, with X′ = (x 1, …, x n), the x j ’s being iid \(N_1(\mu _1,\sigma _1^2),\ j=1,\ldots ,n\), be two sets of linear forms where A 1 (m 1 × n) and A 2 (m 2 × n), m 1 ≤ n, m 2 ≤ n, are constant matrices. Then, Y 1 and Y 2 are independently distributed if and only if \(A_1A_2^{\prime }=O\) or, equivalently, \(A_2A_1^{\prime }=O\).

Example 2.3.2

Consider a simple random sample of size 4 from a real scalar normal population \(N_1(\mu _1=0,\sigma _1^2=4)\). Let X′ = (x 1, x 2, x 3, x 4). Verify whether the sets of linear functions U = A 1 X, V = A 2 X, W = A 3 X are pairwise independent, where

Solution 2.3.2

Taking the products, we have \(A_1A_2^{\prime }\ne O, A_1A_3^{\prime }=O, A_2A_3^{\prime }\ne O\). Hence, the pair U and W are independently distributed and other pairs are not.

We can apply Theorems 2.3.1 and 2.3.2 to prove several results involving sample statistics. For instance, let x 1, …, x n be iid \(N_1(\mu _1,\sigma _1^2)\) or a simple random sample of size n from a real \(N_1(\mu _1,\sigma _1^2)\) and \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n).\) Consider the vectors

$$\displaystyle \begin{aligned}X=\begin{bmatrix}x_1\\ \vdots\\ x_n\end{bmatrix},\quad \bar{X}=\begin{bmatrix}\bar{x}\\ \vdots\\ \bar{x}\end{bmatrix}.\end{aligned} $$

Note that when the x j’s are iid \(N_1(\mu _1,\sigma _1^2),\) \(x_j-\mu _1\sim N_1(0,\sigma _1^2),\) and that since \(X-\bar {X}=(X-\mu )-(\bar {X}-\mu ),\) we may take the x j’s as coming from \(N_1(0,\sigma _1^2)\) for all operations involving \((X,\bar {X})\). Moreover, \(\bar {x}=\frac {1}{n}J'X, J'=(1,1,\ldots ,1)\) where J is an n × 1 vector of unities. Then,

$$\displaystyle \begin{aligned}\bar{X}=\frac{1}{n}JJ'X\ \ \text{and}\ \ X-\bar{X}=\Big[I-\frac{1}{n}JJ'\Big]X, \end{aligned} $$
(i)

and on letting \(A=\frac {1}{n}JJ'\), we have

$$\displaystyle \begin{aligned}A=A^2,\ I-A=(I-A)^2,\ A(I-A)=O.\end{aligned} $$
(ii)

Also note that

$$\displaystyle \begin{aligned}(X-\bar{X})'(X-\bar{X})=\sum_{j=1}^n(x_j-\bar{x})^2\ \text{and}\ s^2=\frac{1}{n}\sum_{j=1}^n(x_j-\bar{x})^2 \end{aligned} $$
(iii)

where s 2 is the sample variance and \(\frac {1}{n}J'X=\bar {x}\) is the sample mean. Now, observe that in light of Theorem 2.3.2, Y 1 = (I − A)X and Y 2 = AX are independently distributed, which implies that \(X-\bar {X}\) and \(\bar {X}\) are independently distributed. But \(\bar {X}\) contains only \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\) and hence \(X-\bar {X}\) and \(\bar {x}\) are independently distributed. We now will make use of the following result: If w 1 and w 2 are independently distributed real scalar random variables, then the pairs \((w_1,w_2^2),\ (w_1^2,w_2),\ (w_1^2,w_2^2)\) are independently distributed when w 1 and w 2 are real scalar random variables; the converses need not be true. For example, \(w_1^2\) and \(w_2^2\) being independently distributed need not imply the independence of w 1 and w 2. If w 1 and w 2 are real vectors or matrices and if w 1 and w 2 are independently distributed then the following pairs are also independently distributed wherever the quantities are defined: \((w_1,w_2w_2^{\prime }),\ (w_1,w_2^{\prime }w_2),\ (w_1w_1^{\prime },w_2), \ (w_1^{\prime }w_1,w_2),\ (w_1^{\prime }w_1,w_2^{\prime }w_2)\). It then follows from (iii) that \(\bar {x}\) and \((X-\bar {X})'(X-\bar {X})=\sum _{j=1}^n(x_j-\bar {x})^2\) are independently distributed. Hence, the following result:

Theorem 2.3.3

Let x 1, …, x n be iid \(N_1(\mu _1,\sigma _1^2)\) or a simple random sample of size n from a univariate real normal population \(N_1(\mu _1,\sigma _1^2)\) . Let \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\) be the sample mean and \(s^2=\frac {1}{n}\sum _{j=1}^n(x_j-\bar {x})^2\) be the sample variance. Then \(\bar {x}\) and s 2 are independently distributed.
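Theorem 2.3.3 lends itself to a simple simulation check, sketched below in Python for assumed values of n, μ 1 and σ 1: the sample correlation between \(\bar{x}\) and s 2 should be near zero, and \(ns^2/\sigma_1^2\) should have mean n − 1 and variance 2(n − 1), in agreement with Eq. (2.3.2) that follows.

```python
import numpy as np

rng = np.random.default_rng(4)
n, mu1, sig1 = 10, 2.0, 1.5
x = mu1 + sig1 * rng.standard_normal((200_000, n))   # 200,000 samples of size n

xbar = x.mean(axis=1)
s2 = x.var(axis=1)                      # divides by n, matching s^2 in the text

print(np.corrcoef(xbar, s2)[0, 1])      # approximately 0, consistent with independence
w = n * s2 / sig1**2                    # should behave like chisquare with n-1 d.f.
print(w.mean(), w.var())                # approximately 9 and 18 for n = 10
```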

This result has several corollaries. When x 1, …, x n are iid \(N_1(\mu _1,\sigma _1^2)\), then the sample sum of products, which is also referred to as the corrected sample sum of products (corrected in the sense that \(\bar {x}\) is subtracted), is given by

$$\displaystyle \begin{aligned}\sum_{j=1}^n(x_j-\bar{x})^2=(X-\bar{X})'(X-\bar{X})=X'(I-A)'(I-A)X=X'(I-A)X,\quad A=\frac{1}{n}JJ', \end{aligned} $$

where both A and I − A are idempotent. In this case, \(\text{tr}(A)=\frac {1}{n}(1+\cdots +1)=1\) and tr(I − A) = n − 1 and hence, the ranks of A and I − A are 1 and n − 1, respectively. When a matrix is idempotent, its eigenvalues are either zero or one, the number of ones corresponding to its rank. As has already been pointed out, when X and \(\bar {X}\) are involved, it can be equivalently assumed that the sample is coming from a \(N_1(0,\sigma _1^2)\) population. Hence

$$\displaystyle \begin{aligned}\frac{ns^2}{\sigma_1^2}=\frac{1}{\sigma_1^2}\sum_{j=1}^n(x_j-\bar{x})^2\sim\chi_{n-1}^2{} \end{aligned} $$
(2.3.2)

is a real chisquare with n − 1 (the rank of the idempotent matrix of the quadratic form) degrees of freedom as per Theorem 2.2.1. Observe that when the sample comes from a real \(N_1(\mu _1,\sigma _1^2)\) distribution, we have \(\bar {x}\sim N_1(\mu _1,\frac {\sigma _1^2}{n})\) so that \(z=\frac {\sqrt {n}(\bar {x}-\mu _1)}{\sigma _1}\sim N_1(0,1)\) or z is a real standard normal, and that \(\frac {(n-1)s_1^2}{\sigma _1^2}\sim \chi _{n-1}^2\) where \(s_1^2=\frac {\sum _{j=1}^n(x_j-\bar {x})^2}{n-1}\). Recall that \(\bar {x}\) and s 2 are independently distributed. Hence, the ratio

$$\displaystyle \begin{aligned}\frac{z}{s_1/\sigma_1}\sim t_{n-1}\end{aligned}$$

has a real Student-t distribution with n − 1 degrees of freedom, where \(z=\frac {\sqrt {n}(\bar {x}-\mu _1)}{\sigma _1}\) and \(\frac {z}{s_1/\sigma _1}=\frac {\sqrt {n}(\bar {x}-\mu _1)}{s_1}\). Hence, we have the following result:

Theorem 2.3.4

Let x 1, …, x n be iid \(N_1(\mu _1,\sigma _1^2)\) . Let \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\) and \(s_1^2=\frac {\sum _{j=1}^n(x_j-\bar {x})^2}{n-1}\) . Then,

$$\displaystyle \begin{aligned}\frac{\sqrt{n}(\bar{x}-\mu_1)}{s_1}\sim t_{n-1}{} \end{aligned} $$
(2.3.3)

where t n−1 is a real Student-t with n − 1 degrees of freedom.
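A quick numerical illustration of (2.3.3): simulate the statistic \(\sqrt{n}(\bar{x}-\mu_1)/s_1\) many times for an assumed sample size and compare it with a Student-t distribution having n − 1 degrees of freedom, as in the following Python sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, mu1, sig1 = 8, 1.0, 2.0
x = mu1 + sig1 * rng.standard_normal((100_000, n))

xbar = x.mean(axis=1)
s1 = x.std(axis=1, ddof=1)                      # divides by n-1, as in s_1^2
t_stat = np.sqrt(n) * (xbar - mu1) / s1

# Compare with a Student-t having n-1 degrees of freedom
print(stats.kstest(t_stat, stats.t(df=n - 1).cdf).statistic)   # small value
print(t_stat.var(), (n - 1) / (n - 3))                         # variance of t_{n-1}
```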

It should also be noted that when x 1, …, x n are iid \(N_1(\mu _1,\sigma _1^2)\), then

$$\displaystyle \begin{aligned}\frac{\sum_{j=1}^n(x_j-\mu_1)^2}{\sigma_1^2}\sim \chi_{n}^2, \ \frac{\sum_{j=1}^n(x_j-\bar{x})^2}{\sigma_1^2}\sim\chi_{n-1}^2,\ \frac{n(\bar{x}-\mu_1)^2}{\sigma_1^2}\sim\chi_1^2,\end{aligned}$$

wherefrom the following decomposition is obtained:

$$\displaystyle \begin{aligned}\frac{1}{\sigma_1^2}\sum_{j=1}^n(x_j-\mu_1)^2=\frac{1}{\sigma_1^2}\Big[\sum_{j=1}^n(x_j-\bar{x})^2+ n(\bar{x}-\mu_1)^2\Big]\Rightarrow \chi_n^2=\chi_{n-1}^2+\chi_1^2,{}\end{aligned} $$
(2.3.4)

the two chisquare random variables on the right-hand side of the last equation being independently distributed.

2.3a. Simple Random Samples from a Complex Gaussian Population

The definition of a simple random sample from any population remains the same as in the real case. A set of complex scalar random variables \(\tilde {x}_1,\ldots ,\tilde {x}_n,\) which are iid as \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\) is called a simple random sample from this complex Gaussian population. Let \(\tilde {X}\) be the n × 1 vector whose components are these sample variables, \(\bar {\tilde {x}}=\frac {1}{n}(\tilde {x}_1+\cdots +\tilde {x}_n)\) denote the sample average, and \(\bar {\tilde {X}}'=(\bar {\tilde {x}},\ldots ,\bar {\tilde {x}})\) be the 1 × n vector of sample means; then \(\tilde {X},\ \bar {\tilde {X}}\) and the sample sum of products matrix \(\tilde {s}\) are respectively,

$$\displaystyle \begin{aligned}\tilde{X}=\begin{bmatrix}\tilde{x}_1\\ \vdots\\ \tilde{x}_n\end{bmatrix},\quad \bar{\tilde{X}}=\begin{bmatrix}\bar{\tilde{x}}\\ \vdots\\ \bar{\tilde{x}}\end{bmatrix},\quad \tilde{s}=\sum_{j=1}^n(\tilde{x}_j-\bar{\tilde{x}})^{*}(\tilde{x}_j-\bar{\tilde{x}}).\end{aligned} $$

These quantities can be simplified as follows with the help of the vector of unities J′ = (1, 1, …, 1): \(\bar {\tilde {x}}=\frac {1}{n}J'\tilde {X}\), \(\tilde {X}-\bar {\tilde {X}}=[I-\frac {1}{n}JJ']\tilde {X}\), \(\bar {\tilde {X}}=\frac {1}{n}JJ'\tilde {X}\), \(\tilde {s}=\tilde {X}^{*}[I-\frac {1}{n}JJ']\tilde {X}\). Consider the Hermitian form

$$\displaystyle \begin{aligned}\tilde{X}^{*}[I-\frac{1}{n}JJ']\tilde{X}=\sum_{j=1}^n(\tilde{x}_j-\bar{\tilde{x}})^{*}(\tilde{x}_j-\bar{\tilde{x}}) =n\tilde{s}^2\end{aligned}$$

where \(\tilde {s}^2\) is the sample variance in the complex scalar case, given a simple random sample of size n.

Consider the linear forms \(\tilde {u}_1=L_1^{*}\tilde {X}=\bar {a}_1\tilde {x}_1+\cdots +\bar {a}_n\tilde {x}_n\) and \(\tilde {u}_2=L_2^{*}\tilde {X}=\bar {b}_1\tilde {x}_1+\cdots +\bar {b}_n\tilde {x}_n\) where the a j’s and b j’s are scalar constants that may be real or complex, \(\bar {a}_j\) and \( \bar {b}_j\) denoting the complex conjugates of a j and b j, respectively.

$$\displaystyle \begin{aligned}E[\tilde{X}]'=(\tilde{\mu}_1)[1,1,\ldots,1]=(\tilde{\mu}_1) J',\ J'=[1,\ldots,1], \end{aligned}$$

since the \(\tilde {x}_j\)’s are iid \(\tilde { N}_1(\tilde {\mu }_1,\sigma _1^2),\ j=1,\ldots ,n\). The mgf of \(\tilde {u}_1\) and \(\tilde {u}_2\), denoted by \(M_{\tilde {u}_j}(\tilde {t}_j),\ j=1,2\) and the joint mgf of \(\tilde {u}_1\) and \(\tilde {u}_2\), denoted by \(M_{\tilde {u}_1,\tilde {u}_2}(\tilde {t}_1,\tilde {t}_2)\) are the following, where \(\Re (\cdot )\) denotes the real part of (⋅):

$$\displaystyle \begin{aligned} M_{\tilde{u}_1}(\tilde{t}_1)&=E[\text{e}^{\Re(\tilde{t}_1^{*}\tilde{u}_1)}]=E[\text{e}^{\Re(\tilde{t}_1^{*}L_1^{*}\tilde{X})}]\\ &=\text{e}^{\Re(\tilde{\mu}_1\tilde{t}_1^{*}L_1^{*}J)}E[\text{e}^{\Re(\tilde{t}_1^{*}L_1^{*}(\tilde{X}-E(\tilde{X})))}] =\text{e}^{\Re(\tilde{\mu}_1\tilde{t}_1^{*}L_1^{*}J)}\text{e}^{\frac{\sigma_1^2}{4}\tilde{t}_1^{*}L_1^{*}L_1\tilde{t}_1} \end{aligned} $$
(i)
$$\displaystyle \begin{aligned} M_{\tilde{u}_2}(\tilde{t}_2)&=\text{e}^{\Re(\tilde{\mu}_1\tilde{t}_2^{*}L_2^{*}J)+\frac{\sigma_1^2}{4}\tilde{t}_2^{*}L_2^{*}L_2\tilde{t}_2} \end{aligned} $$
(ii)
$$\displaystyle \begin{aligned} M_{\tilde{u}_1,\tilde{u}_2}(\tilde{t}_1,\tilde{t}_2)&=M_{\tilde{u}_1}(\tilde{t}_1)M_{\tilde{u}_2}(\tilde{t}_2) \text{e}^{\frac{\sigma_1^2}{2}\tilde{t}_1^{*}L_1^{*}L_2\tilde{t}_2}.\end{aligned} $$
(iii)

Consequently, \(\tilde {u}_1\) and \(\tilde {u}_2\) are independently distributed if and only if the exponential part is 1 or equivalently \(\tilde {t}_1^{*}L_1^{*}L_2\tilde {t}_2=0\). Since \(\tilde {t}_1\) and \(\tilde {t}_2\) are arbitrary, this means \(L_1^{*}L_2=0\Rightarrow L_2^{*}L_1=0\). Then we have the following result:

Theorem 2.3a.1

Let \(\tilde {x}_1,\ldots ,\tilde {x}_n\) be a simple random sample of size n from a univariate complex Gaussian population \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\) . Consider the linear forms \(\tilde {u}_1=L_1^{*}\tilde {X}\) and \(\tilde {u}_2=L_2^{*}\tilde {X}\) where L 1, L 2 and \(\tilde {X}\) are the previously defined n × 1 vectors, and a star denotes the conjugate transpose. Then, \(\tilde {u}_1\) and \(\tilde {u}_2\) are independently distributed if and only if \(L_1^{*}L_2=0\).

Example 2.3a.1

Let \(\tilde {x}_j,\ j=1,2,3,4\) be iid univariate complex normal \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\). Consider the linear forms

$$\displaystyle \begin{aligned} \tilde{u}_1&=L_1^{*}\tilde{X}=(1+i)\tilde{x}_1+2i\tilde{x}_2-(1-i)\tilde{x}_3+2\tilde{x}_4\\ \tilde{u}_2&=L_2^{*}\tilde{X}=(1+i)\tilde{x}_1+(2+3i)\tilde{x}_2+(1-i)\tilde{x}_3-i\tilde{x}_4\\ \tilde{u}_3&=L_3^{*}\tilde{X}=-(1+i)\tilde{x}_1+i\tilde{x}_2+(1-i)\tilde{x}_3+\tilde{x}_4.\end{aligned} $$

Verify whether the three linear forms are pairwise independent.

Solution 2.3a.1

With the usual notations, the coefficient vectors are as follows:

$$\displaystyle \begin{aligned} L_1^{*}&=[1+i,2i,-1+i,2]\Rightarrow L_1^{\prime}=[1-i,-2i,-1-i,2]\\ L_2^{*}&=[1+i,2+3i,1-i,-i]\Rightarrow L_2^{\prime}=[1-i,2-3i,1+i,i]\\ L_3^{*}&=[-(1+i),i,1-i,1]\Rightarrow L_3^{\prime}=[-(1-i),-i,1+i,1].\end{aligned} $$

Taking the products we have \(L_1^{*}L_2=6+6i\ne 0, L_1^{*}L_3=0, L_2^{*}L_3=3-3i\ne 0\). Hence, only \(\tilde {u}_1\) and \(\tilde {u}_3\) are independently distributed.
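These complex inner products can be confirmed in a few lines of Python; numpy's vdot conjugates its first argument, so vdot(L_i, L_j) computes \(L_i^{*}L_j\) directly from the column vectors L_i listed above.

```python
import numpy as np

# Column vectors L_1, L_2, L_3 (the conjugates of the listed L_j^*)
L1 = np.array([1 - 1j, -2j, -1 - 1j, 2])
L2 = np.array([1 - 1j, 2 - 3j, 1 + 1j, 1j])
L3 = np.array([-(1 - 1j), -1j, 1 + 1j, 1])

# np.vdot conjugates its first argument, so vdot(Li, Lj) = Li* Lj
print(np.vdot(L1, L2))   # (6+6j): not zero
print(np.vdot(L1, L3))   # 0: u_1 and u_3 are independently distributed
print(np.vdot(L2, L3))   # (3-3j): not zero
```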

We can extend the result stated in Theorem 2.3a.1 to sets of linear forms. Let \(\tilde {U}_1=A_1\tilde {X}\) and \( \tilde {U}_2=A_2\tilde {X}\) where A 1 and A 2 are constant matrices that may or may not be in the complex domain, A 1 is m 1 × n and A 2 is m 2 × n, with m 1 ≤ n, m 2 ≤ n. As was previously the case, \(\tilde {X}'=(\tilde {x}_1,\ldots ,\tilde {x}_n),\ \tilde {x}_j,\ j=1,\ldots ,n,\) are iid \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\). Let \(\tilde {T}_1\) and \(\tilde {T}_2\) be parameter vectors of orders m 1 × 1 and m 2 × 1, respectively. Then, on following the steps leading to (iii), the mgf of \(\tilde {U}_1\) and \(\tilde {U}_2\) and their joint mgf are obtained as follows:

$$\displaystyle \begin{aligned} M_{\tilde{U}_1}(\tilde{T}_1)&=E[\text{e}^{\Re(\tilde{T}_1^{*}A_1\tilde{X})}] =\text{e}^{\Re(\tilde{\mu}_1\tilde{T}_1^{*}A_1J)+\frac{\sigma_1^2}{4} \tilde{T}_1^{*}A_1A_1^{*}\tilde{T}_1} \end{aligned} $$
(iv)
$$\displaystyle \begin{aligned} M_{\tilde{U}_2}(\tilde{T}_2)&=\text{e}^{\Re(\tilde{\mu}_1 \tilde{T}_2^{*}A_2J)+\frac{\sigma_1^2}{4}\tilde{T}_2^{*} A_2A_2^{*}\tilde{T}_2} \end{aligned} $$
(v)
$$\displaystyle \begin{aligned} M_{\tilde{U}_1,\tilde{U}_2}(\tilde{T}_1,\tilde{T}_2)&= M_{\tilde{U}_1}(\tilde{T}_1)M_{\tilde{U}_2}(\tilde{T}_2)\text{e}^{\frac{\sigma_1^2}{2} \tilde{T}_1^{*}A_1A_2^{*}\tilde{T}_2}.\end{aligned} $$
(vi)

Since \(\tilde {T}_1\) and \(\tilde {T}_2\) are arbitrary, the exponential part in (vi) is 1 if and only if \(A_1A_2^{*}=O\) or \(A_2A_1^{*}=O\), the two null matrices having different orders. Then, we have:

Theorem 2.3a.2

Let the \(\tilde {x}_j\) ’s and \(\tilde {X}\) be as defined in Theorem 2.3a.1 . Let A 1 be a m 1 × n constant matrix and A 2 be a m 2 × n constant matrix, m 1 ≤ n, m 2 ≤ n, and the constant matrices may or may not be in the complex domain. Consider the general linear forms \(\tilde {U}_1=A_1\tilde {X}\) and \( \tilde {U}_2=A_2\tilde {X}\) . Then \(\tilde {U}_1\) and \(\tilde {U}_2\) are independently distributed if and only if \(A_1A_2^{*}=O\) or, equivalently, \(A_2A_1^{*}=O\).

Example 2.3a.2

Let \(\tilde {x}_j,\ j=1,2,3,4,\) be iid univariate complex Gaussian \(\tilde {N}_1(\tilde {\mu }_1,\sigma _1^2)\). Consider the following sets of linear forms \(\tilde {U}_1=A_1\tilde {X},\ \tilde {U}_2=A_2\tilde {X},\ \tilde {U}_3=A_3\tilde {X}\) with \(\tilde{X}'=(\tilde{x}_1,\tilde{x}_2,\tilde{x}_3,\tilde{x}_4)\), where

Verify whether the pairs in \(\tilde {U}_1,\tilde {U}_2,\tilde {U}_3\) are independently distributed.

Solution 2.3a.2

Since the products \(A_1A_2^{*}=O, A_1A_3^{*}\ne O,A_2A_3^{*}\ne O\), only \(\tilde {U}_1\) and \(\tilde {U}_2\) are independently distributed.

As a corollary of Theorem 2.3a.2, one has that the sample mean \(\bar {\tilde {x}}\) and the sample sum of products \({\tilde s}\) are also independently distributed in the complex Gaussian case, a result parallel to the corresponding one in the real case. This can be seen by taking \(A_1=\frac {1}{n}JJ'\) and \(A_2=I-\frac {1}{n}JJ'\). Then, since \(A_1=A_1^2,\ A_2=A_2^2,\) and A 1 A 2 = O, we have \(\frac {1}{\sigma _1^2}\tilde {X}^{*}A_1\tilde {X}\sim \tilde {\chi }_1^2\) for \(\tilde {\mu }_1=0\) and \( \frac {1}{\sigma _1^2}\tilde {X}^{*}A_2\tilde {X}\sim \tilde {\chi }_{n-1}^2\), and both of these chisquares in the complex domain are independently distributed. Then,

$$\displaystyle \begin{aligned}\frac{1}{\sigma_1^2}n\tilde{s}=\frac{1}{\sigma_1^2}\sum_{j=1}^n(\tilde{x}_j-\bar{\tilde{x}})^{*}(\tilde{x}_j -\bar{\tilde{x}})\sim\tilde{\chi}_{n-1}^2.{} \end{aligned} $$
(2.3a.1)

The Student-t with n − 1 degrees of freedom can be defined in terms of the standardized sample mean and sample variance in the complex case.

2.3.1. Noncentral chisquare having n degrees of freedom in the real domain

Let \(x_j\sim N_1(\mu _j,\sigma _j^2),\ j=1,\ldots ,n\) and the x j’s be independently distributed. Then, \(\frac {x_j-\mu _j}{\sigma _j}\sim N_1(0,1)\) and \(\sum _{j=1}^n\frac {(x_j-\mu _j)^2}{\sigma _j^2}\sim \chi _{n}^2\) where \(\chi _n^2\) is a real chisquare with n degrees of freedom. Then, when at least one of the μ j’s is nonzero, \(\sum _{j=1}^n\frac {x_j^2}{\sigma _j^2}\) is referred to as a real non-central chisquare with n degrees of freedom and non-centrality parameter λ, which is denoted \(\chi _n^2(\lambda )\), where

$$\displaystyle \begin{aligned}\lambda=\frac{1}{2}\sum_{j=1}^n\frac{\mu_j^2}{\sigma_j^2}=\frac{1}{2}\mu'\varSigma^{-1}\mu,\quad \mu'=(\mu_1,\ldots,\mu_n),\ \varSigma=\text{diag}(\sigma_1^2,\ldots,\sigma_n^2).\end{aligned} $$

Let \(u=\sum _{j=1}^n\frac {x_j^2}{\sigma _j^2}\). In order to derive the distribution of u, let us determine its mgf. Since u is a function of the x j’s where \(x_j\sim N_1(\mu _j,\sigma _j^2),\ j=1,\ldots ,n,\) we can integrate out over the joint density of the x j’s. Then, with t as the mgf parameter,

$$\displaystyle \begin{aligned} M_u(t)&=E[\text{e}^{tu}]\\ &=\int_{-\infty}^{\infty}...\int_{-\infty}^{\infty}\frac{1}{(2\pi)^{\frac{n}{2}}|\varSigma|{}^{\frac{1}{2}}}\text{e}^{t\sum_{j=1}^n\frac{x_j^2}{\sigma_j^2}-\frac{1}{2}\sum_{j=1}^n\frac{(x_j-\mu_j)^2}{\sigma_j^2}}\text{d}x_1\wedge...\wedge\text{d}x_n.\end{aligned} $$

The exponent, excluding the factor \(-\frac {1}{2}\), can be simplified as follows:

$$\displaystyle \begin{aligned}-2t\sum_{j=1}^n\frac{x_j^2}{\sigma_j^2}+\sum_{j=1}^n\frac{(x_j-\mu_j)^2}{\sigma_j^2}=(1-2t)\sum_{j=1}^n\frac{x_j^2}{\sigma_j^2}-2\sum_{j=1}^n\frac{\mu_jx_j}{\sigma_j^2}+\sum_{j=1}^n\frac{\mu_j^2}{\sigma_j^2}. \end{aligned}$$

Let \(y_j=\sqrt {(1-2t)}x_j\). Then, \((1-2t)^{-\frac {n}{2}}\text{d}y_1\wedge \ldots \wedge \text{d}y_n=\text{d}x_1\wedge \ldots \wedge \text{d}x_n\), and

$$\displaystyle \begin{aligned} (1-2t)\sum_{j=1}^n\Big(\frac{x_j^2}{\sigma_j^2}\Big)-2\sum_{j=1}^n\Big(\frac{\mu_jx_j}{\sigma_j^2}\Big)&=\sum_{j=1}^n\frac{y_j^2}{\sigma_j^2}-2 \sum_{j=1}^n\frac{\mu_jy_j}{\sigma_j^2\sqrt{(1-2t)}}\\ &\quad +\sum_{j=1}^n\Big(\frac{\mu_j}{\sigma_j\sqrt{(1-2t)}}\Big)^2-\sum_{j=1}^n\frac{\mu_j^2}{\sigma_j^2(1-2t)}\\ &=\sum_{j=1}^n\Bigg(\frac{\Big(y_j-\frac{\mu_j}{\sqrt{(1-2t)}}\Big)}{\sigma_j}\Bigg)^2 -\sum_{j=1}^n\frac{\mu_j^2}{\sigma_j^2(1-2t)}.\end{aligned} $$

But

$$\displaystyle \begin{aligned}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\frac{1}{(2\pi)^{\frac{n}{2}}|\varSigma|{}^{\frac{1}{2}}}\text{e}^{-\sum_{j=1}^n\frac{1}{2\sigma_j^2}\Big(y_j-\frac{\mu_j}{\sqrt{(1-2t)}}\Big)^2}\text{d }y_1\wedge\ldots\wedge\text{d}y_n=1. \end{aligned}$$

Hence, for \(\lambda =\frac {1}{2}\sum _{j=1}^n\frac {\mu _j^2}{\sigma _j^2}=\frac {1}{2}\mu '\varSigma ^{-1}\mu \),

$$\displaystyle \begin{aligned} M_u(t)&=\frac{1}{(1-2t)^{\frac{n}{2}}}[\text{e}^{-\lambda+\frac{\lambda}{(1-2t)}}]{}\\ &=\sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\frac{1}{(1-2t)^{\frac{n}{2}+k}}. \end{aligned} $$
(2.3.5)

However, \((1-2t)^{-(\frac {n}{2}+k)}\) is the mgf of a real scalar gamma with parameters \((\alpha =\frac {n}{2}+k,\beta =2)\) or a real chisquare with n + 2k degrees of freedom or \(\chi _{n+2k}^2\). Hence, the density of a non-central chisquare with n degrees of freedom and non-centrality parameter λ, denoted by g u,λ(u), is obtained by term-by-term inversion as follows:

$$\displaystyle \begin{aligned} g_{u,\lambda}(u)&=\sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\frac{u^{\frac{n+2k}{2}-1}\text{e}^{-\frac{u}{2}}}{2^{\frac{n+2k}{2}}\varGamma(\frac{n}{2}+k)}{} \end{aligned} $$
(2.3.6)
$$\displaystyle \begin{aligned} &=\frac{u^{\frac{n}{2}-1}\text{e}^{-\frac{u}{2}}}{2^{\frac{n}{2}}\varGamma(\frac{n}{2})}\sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\frac{(u/2)^k}{(\frac{n}{2})_k}{}\end{aligned} $$
(2.3.7)

where \((\frac {n}{2})_k\) is the Pochhammer symbol given by

$$\displaystyle \begin{aligned}(a)_k=a(a+1)\cdots(a+k-1),\ a\ne 0, \ (a)_0\ \text{being}\ \text{equal}\ \text{to}\ \text{1,}\ {} \end{aligned} $$
(2.3.8)

and, in general, Γ(α + k) = Γ(α)(α)k for k = 1, 2, …, whenever the gamma functions are defined. Hence, provided (1 − 2t) > 0, (2.3.6) can be looked upon as a weighted sum of chisquare densities whose weights are Poisson distributed, that is, (2.3.6) is a Poisson mixture of chisquare densities. As well, we can view (2.3.7) as a chisquare density having n degrees of freedom appended with a Bessel series. In general, a Bessel series is of the form

$$\displaystyle \begin{aligned}{{}_0F_1}(~;b;x)=\sum_{k=0}^{\infty}\frac{1}{(b)_k}\frac{x^k}{k!},\ b\ne 0,-1,-2,\ldots,{}\end{aligned} $$
(2.3.9)

which is convergent for all x.
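For a numerical cross-check of the series representation (2.3.6), one can truncate the Poisson-weighted sum and compare it with a reference implementation of the noncentral chisquare density, as in the Python sketch below; the values of n, λ and u are arbitrary examples, and scipy's noncentrality parameter is nc = 2λ in the present notation.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

def ncx2_series(u, n, lam, terms=80):
    """Poisson mixture of central chisquare densities, Eq. (2.3.6), truncated."""
    k = np.arange(terms)
    log_w = k * np.log(lam) - lam - gammaln(k + 1)            # log Poisson weights
    df = n + 2 * k
    # log of the chisquare_{n+2k} density evaluated at u, for each k
    log_f = ((df / 2 - 1) * np.log(u) - u / 2
             - (df / 2) * np.log(2) - gammaln(df / 2))
    return np.exp(log_w + log_f).sum()

n, lam, u = 3, 17 / 12, 2.5
print(ncx2_series(u, n, lam))
print(stats.ncx2(df=n, nc=2 * lam).pdf(u))    # the two values should agree closely
```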

2.3.1.1. Mean value and variance, real central and non-central chisquare

The mgf of a real \(\chi _{\nu }^2\) is \((1-2t)^{-\frac {\nu }{2}}, 1-2t>0\). Thus,

$$\displaystyle \begin{aligned} E[\chi_{\nu}^2]&=\frac{\text{d}}{\text{d}t}(1-2t)^{-\frac{\nu}{2}}|{}_{t=0}=\Big(\!\!-\frac{\nu}{2}\Big)(-2)(1-2t)^{-\frac{\nu}{2}-1}|{}_{t=0}=\nu\\ E[\chi_{\nu}^2]^2&=\frac{\text{d}^2}{\text{d}t^2}(1-2t)^{-\frac{\nu}{2}}|{}_{t=0}=\Big(\!\!-\frac{\nu}{2}\Big)(-2)\Big(\!\!-\frac{\nu}{2}-1\Big)(-2)=\nu(\nu+2).\end{aligned} $$

That is,

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2]=\nu\mbox{ and }\text{Var}(\chi_{\nu}^2)=\nu(\nu+2)-\nu^2=2\nu.{}\end{aligned} $$
(2.3.10)

What are then the mean and the variance of a real non-central \(\chi _{\nu }^2(\lambda )\)? They can be derived either from the mgf or from the density. Making use of the density, we have

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2(\lambda)]=\sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\int_0^{\infty}u\frac{u^{\frac{\nu}{2}+k-1}\text{e}^{-\frac{u}{2}}}{2^{\frac{\nu}{2}+k}\varGamma(\frac{\nu}{2}+k)}\text{d}u, \end{aligned}$$

the integral part being equal to

$$\displaystyle \begin{aligned}\frac{\varGamma(\frac{\nu}{2}+k+1)}{\varGamma(\frac{\nu}{2}+k)}\frac{2^{\frac{\nu}{2}+k+1}}{2^{\frac{\nu}{2}+k}} =2\Big(\frac{\nu}{2}+k\Big)=\nu+2k. \end{aligned}$$

Now, the remaining summation over k can be looked upon as the expected value of ν + 2k in a Poisson distribution. In this case, we can write the expected values as the expected value of a conditional expectation: \(E[u]=E[E(u|k)], u=\chi ^2_{\nu }(\lambda )\). Thus,

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2(\lambda)]=\nu+2\sum_{k=0}^{\infty}k\frac{\lambda^k\text{e}^{-\lambda}}{k!}=\nu+2E[k]=\nu+2\lambda.\end{aligned}$$

Moreover,

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2(\lambda)]^2= \sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\int_0^{\infty}u^2\frac{u^{\frac{\nu}{2}+k-1}\text{e}^{-\frac{u}{2}}}{2^{\frac{\nu}{2}+k}\varGamma(\frac{\nu}{2}+k)}\text{d}u, \end{aligned}$$

the integral part being

$$\displaystyle \begin{aligned} \frac{\varGamma(\frac{\nu}{2}+k+2)}{\varGamma(\frac{\nu}{2}+k)}\frac{2^{\frac{\nu}{2}+k+2}}{2^{\frac{\nu}{2}+k}}&= 2^2(\frac{\nu}{2}+k+1)(\frac{\nu}{2}+k)\\ &=(\nu+2k+2)(\nu+2k)=\nu^2+2\nu k+2\nu (k+1)+4k(k+1).\end{aligned} $$

Since E[k] = λ, E[k 2] = λ 2 + λ for a Poisson distribution,

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2(\lambda)]^2=\nu^2+2\nu+4\nu\lambda+4(\lambda^2+2\lambda). \end{aligned}$$

Thus,

$$\displaystyle \begin{aligned} \text{Var}(\chi_{\nu}^2(\lambda))&=E[\chi_{\nu}^2(\lambda)]^2-[E(\chi_{\nu}^2(\lambda))]^2\\ &=\nu^2+2\nu+4\nu\lambda+4(\lambda^2+2\lambda)-(\nu+2\lambda)^2\\ &=2\nu+8\lambda.\end{aligned} $$

To summarize,

$$\displaystyle \begin{aligned}E[\chi_{\nu}^2(\lambda)]=\nu+2\lambda\ \text{and} \ \text{Var}(\chi_{\nu}^2(\lambda))=2\nu+8\lambda.{} \end{aligned} $$
(2.3.11)

Example 2.3.3

Let x 1 ∼ N 1(−1, 2), x 2 ∼ N 1(1, 3) and x 3 ∼ N 1(−2, 2) be independently distributed and \(u=\frac {x_1^2}{2}+\frac {x_2^2}{3}+\frac {x_3^2}{2}\). Provide explicit expressions for the density of u, E[u] and Var(u).

Solution 2.3.3

This u has a noncentral chisquare distribution with non-centrality parameter λ where

$$\displaystyle \begin{aligned}\lambda=\frac{1}{2}\Big[\frac{\mu_1^2}{\sigma_1^2}+\frac{\mu_2^2}{\sigma_2^2}+\frac{\mu_3^2}{\sigma_3^2}\Big] =\frac{1}{2}\Big[\frac{(-1)^2}{2}+\frac{(1)^2}{3}+\frac{(-2)^2}{2}\Big]=\frac{17}{12}, \end{aligned}$$

and the number of degrees of freedom is n = 3 = ν. Thus, \(u\sim \chi ^2_3(\lambda )\) or a real noncentral chisquare with ν = 3 degrees of freedom and non-centrality parameter \(\frac {17}{12}\). Then \(E[u]=E[\chi ^2_3(\lambda )]=\nu +2\lambda =3+2(\frac {17}{12})=\frac {35}{6}\). \(\text{Var}(u)=\text{Var}(\chi ^2_3(\lambda ))=2\nu +8\lambda =(2)(3)+8(\frac {17}{12})=\frac {52}{3}\). Let the density of u be denoted by g(u). Then

$$\displaystyle \begin{aligned} g(u)&=\frac{u^{\frac{n}{2}-1}\text{e}^{-\frac{u}{2}}}{2^{\frac{n}{2}}\varGamma(\frac{n}{2})}\sum_{k=0}^{\infty}\frac{\lambda^k\text{e}^{-\lambda}}{k!}\frac{(u/2)^k}{(\frac{n}{2})_k}\\ &=\frac{u^{\frac{1}{2}}\text{e}^{-\frac{u}{2}}}{\sqrt{2\pi}}\sum_{k=0}^{\infty}\frac{(17/12)^k\text{e}^{-17/12}}{k!}\frac{(u/2)^k}{(\frac{3}{2})_k},\ 0\le u<\infty,\end{aligned} $$

and zero elsewhere.
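The moments and the distribution obtained in this solution can be confirmed by simulation, as in the following Python sketch: the empirical mean and variance of u should be close to 35/6 ≈ 5.83 and 52/3 ≈ 17.33, and u can be compared directly with a noncentral chisquare having 3 degrees of freedom and noncentrality nc = 2λ = 17/6 in scipy's parametrization.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
N = 300_000
x1 = rng.normal(-1.0, np.sqrt(2.0), N)     # N_1(-1, 2): variance 2
x2 = rng.normal(1.0, np.sqrt(3.0), N)      # N_1(1, 3)
x3 = rng.normal(-2.0, np.sqrt(2.0), N)     # N_1(-2, 2)

u = x1**2 / 2 + x2**2 / 3 + x3**2 / 2
print(u.mean(), 35 / 6)                    # approximately 5.833
print(u.var(), 52 / 3)                     # approximately 17.333

lam = 17 / 12
print(stats.kstest(u, stats.ncx2(df=3, nc=2 * lam).cdf).statistic)   # small value
```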

2.3a.1. Noncentral chisquare having n degrees of freedom in the complex domain

Let us now consider independently Gaussian distributed variables in the complex domain. Let the complex scalar variables \(\tilde {x}_j\sim \tilde {N}_1(\tilde {\mu }_j,\sigma _j^2),\ j=1,\ldots ,n,\) be independently distributed. Then, we have already established that \(\sum _{j=1}^n(\frac {\tilde {x}_j-\tilde {\mu }_j}{\sigma _j})^{*}(\frac {\tilde {x}_j-\tilde {\mu }_j}{\sigma _j}) \sim \tilde {\chi }_n^2\), which is a chisquare variable having n degrees of freedom in the complex domain. If we let \(\tilde {u}=\sum _{j=1}^n\frac {\tilde {x}_j^{*}\tilde {x}_j}{\sigma _j^2}\), then this \(\tilde{u}\) will be said to have a noncentral chisquare distribution with n degrees of freedom and non-centrality parameter λ in the complex domain, where \(\tilde {x}_j^{*}\) is only the conjugate since it is a scalar quantity. Since, in this case, \(\tilde {u}\) is real, we may associate the mgf of \(\tilde {u}\) with a real parameter t. Now, proceeding as in the real case, we obtain the mgf of \(\tilde {u}\), denoted by \(M_{\tilde {u}}(t)\), as follows:

$$\displaystyle \begin{aligned} M_{\tilde{u}}(t)&=E[\text{e}^{t\tilde{u}}]=(1-t)^{-n}\text{e}^{-\lambda+\frac{\lambda}{1-t}},\ 1-t>0,\ \lambda=\sum_{j=1}^n\frac{\tilde{\mu}_j^{*}\tilde{\mu}_j}{\sigma_j^2}= \tilde{\mu}^{*}\varSigma^{-1}\tilde{\mu}\\ &=\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\text{e}^{-\lambda}(1-t)^{-(n+k)}. \end{aligned} $$
(2.3a.2)

Note that the inverse corresponding to (1 − t)−(n+k) is a chisquare density in the complex domain with n + k degrees of freedom, and that part of the density, denoted by f 1(u), is

$$\displaystyle \begin{aligned}f_1(u)=\frac{1}{\varGamma(n+k)}u^{n+k-1}\text{e}^{-u}=\frac{1}{\varGamma(n)(n)_k}u^{n-1}u^k\text{e}^{-u}. \end{aligned}$$

Thus, the noncentral chisquare density with n degrees of freedom in the complex domain, that is, \(u=\tilde {\chi }^2_n(\lambda )\), denoted by f u,λ(u), is

$$\displaystyle \begin{aligned}f_{u,\lambda}(u)=\frac{u^{n-1}}{\varGamma(n)}\text{e}^{-u}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\text{e}^{-\lambda}\frac{u^k}{(n)_k}, {} \end{aligned} $$
(2.3a.3)

which, referring to Eqs. (2.3.5)–(2.3.9) in connection with a non-central chisquare in the real domain, can also be represented in various ways.

Example 2.3a.3

Let \(\tilde {x}_1\sim \tilde {N}_1(1+i,2),\ \tilde {x}_2\sim \tilde {N}_1(2+i,4)\) and \(\tilde {x}_3\sim \tilde {N}_1(1-i,2)\) be independently distributed univariate complex Gaussian random variables and

$$\displaystyle \begin{aligned}\tilde{u}=\frac{\tilde{x}_1^{*}\tilde{x}_1}{\sigma_1^2}+\frac{\tilde{x}_2^{*}\tilde{x}_2}{\sigma_2^2}+\frac{\tilde{x}_3^{*}\tilde{x}_3}{\sigma_3^2} =\frac{\tilde{x}_1^{*}\tilde{x}_1}{2} +\frac{\tilde{x}_2^{*}\tilde{x}_2}{4}+\frac{\tilde{x}_3^{*}\tilde{x}_3}{2}.\end{aligned}$$

Compute \(E[\tilde {u}]\) and \(\text{Var}(\tilde {u})\) and provide an explicit representation of the density of \(\tilde {u}\).

Solution 2.3a.3

In this case, \(\tilde {u}\) has a noncentral chisquare distribution with degrees of freedom ν = n = 3 and non-centrality parameter λ given by

$$\displaystyle \begin{aligned}\lambda=\frac{\tilde{\mu}_1^{*}\tilde{\mu}_1}{\sigma_1^2}+\frac{\tilde{\mu}_2^{*}\tilde{\mu}_2}{\sigma_2^2} +\frac{\tilde{\mu}_3^{*}\tilde{\mu}_3}{\sigma_3^2}=\frac{[(1)^2+(1)^2]}{2}+\frac{[(2)^2+(1)^2]}{4}+\frac{[(1)^2+(-1)^2]}{2}=1+\frac{5}{4}+1 =\frac{13}{4}. \end{aligned}$$

The density, denoted by \(g_1(\tilde {u})\), is given in (i). In this case \(\tilde {u}\) will be a real gamma with the parameters (α = n, β = 1) to which a Poisson series is appended:

$$\displaystyle \begin{aligned}g_1(u)=\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\text{e}^{-\lambda}\frac{u^{n+k-1}\text{e}^{-u}}{\varGamma(n+k)},\ 0\le u<\infty, \end{aligned} $$
(i)

and zero elsewhere. Then, the expected value of u and the variance of u are available from (i) by direct integration.

$$\displaystyle \begin{aligned}E[u]=\int_0^{\infty}ug_1(u)\text{d}u=\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\text{e}^{-\lambda}\frac{\varGamma(n+k+1)}{\varGamma(n+k)}. \end{aligned}$$

But \(\frac {\varGamma (n+k+1)}{\varGamma (n+k)}=n+k\) and the summation over k can be taken as the expected value of n + k in a Poisson distribution. Thus, \(E[\tilde {\chi }_n^2(\lambda )]=n+E[k]=n+\lambda \). Now, in the expected value of

$$\displaystyle \begin{aligned}{}[\tilde{\chi}^2_n(\lambda)][\tilde{\chi}^2_n(\lambda)]^{*}=[\tilde{\chi}^2_n (\lambda)]^2=[u]^2, \end{aligned}$$

which is real in this case, the integral part over u gives

$$\displaystyle \begin{aligned}\frac{\varGamma(n+k+2)}{\varGamma(n+k)}=(n+k+1)(n+k)=n^2+2nk+k^2+n+k \end{aligned}$$

with expected value n 2 + 2nλ + n + λ + (λ 2 + λ). Hence,

$$\displaystyle \begin{aligned}\text{Var}(\tilde{\chi}_n^2(\lambda))=E[u-E(u)][u-E(u)]^{*}=\text{Var}(u)=n^2+2n\lambda+n+\lambda+(\lambda^2+\lambda) -(n+\lambda)^2, \end{aligned}$$

which simplifies to n + 2λ. Accordingly,

$$\displaystyle \begin{aligned}E[\tilde{\chi}_n^2(\lambda)]=n+\lambda \mbox{ and }\text{Var}(\tilde{\chi}_n^2(\lambda))=n+2\lambda, \end{aligned} $$
(ii)

so that

$$\displaystyle \begin{aligned}E[u]=n+\lambda = 3+\frac{13}{4}=\frac{25}{4} \mbox{ and } \text{Var}(u)=n+2\lambda=3+\frac{13}{2}=\frac{19}{2}. \end{aligned} $$
(iii)

The explicit form of the density is then

$$\displaystyle \begin{aligned}g_1(u)=\frac{u^2\text{e}^{-u}}{2}\sum_{k=0}^{\infty}\frac{(13/4)^k\text{e}^{-13/4}}{k!}\frac{u^k}{(3)_k},\ 0\le u<\infty, \end{aligned} $$
(iv)

and zero elsewhere.
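To make the numbers in (iii) tangible, the following is a minimal Monte Carlo sketch, not part of the text. It assumes the usual circular convention for the scalar complex Gaussian, namely that the real and imaginary parts of each \(\tilde{x}_j\) are independent with variance \(\sigma_j^2/2\), so that \(\text{Var}(\tilde{x}_j)=\sigma_j^2\); NumPy is used for the simulation, and the output should be close to E[u] = 25/4 and Var(u) = 19/2.

```python
# Monte Carlo sketch for Example 2.3a.3 (illustrative only, not part of the text).
# Assumption: each complex Gaussian x~_j has independent real and imaginary parts,
# each N(mu, sigma_j^2 / 2), so that Var(x~_j) = sigma_j^2 as in the text's convention.
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
means = [(1.0, 1.0), (2.0, 1.0), (1.0, -1.0)]   # (Re mu_j, Im mu_j)
sig2 = [2.0, 4.0, 2.0]                           # sigma_j^2

u = np.zeros(N)
for (m1, m2), s2 in zip(means, sig2):
    x_re = rng.normal(m1, np.sqrt(s2 / 2), N)
    x_im = rng.normal(m2, np.sqrt(s2 / 2), N)
    u += (x_re**2 + x_im**2) / s2                # x~_j* x~_j / sigma_j^2

print(u.mean())   # should be close to E[u] = 25/4 = 6.25
print(u.var())    # should be close to Var(u) = 19/2 = 9.5
```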

Exercises 2.3

2.3.1

Let x 1, …, x n be iid variables whose common density is a real gamma density with the parameters α and β, or equivalently with mgf \((1-\beta t)^{-\alpha},\ 1-\beta t>0,\ \alpha>0,\ \beta>0\). Let \(u_1=x_1+\cdots +x_n,\ u_2=\frac {1}{n}(x_1+\cdots +x_n),\ u_3=u_2-\alpha \beta ,\ u_4=\frac {\sqrt {n}u_3}{\beta \sqrt {\alpha }}\). Evaluate the mgfs and thereby the densities of u 1, u 2, u 3, u 4. Show that they are all gamma densities, possibly relocated, for all finite n. Show that when n → ∞, u 4 → N 1(0, 1), that is, u 4 converges to a real standard normal variable as n goes to infinity.

2.3.2

Let x 1, …, x n be a simple random sample of size n from a real population with mean value μ and variance σ 2 < ∞, σ > 0. Then the central limit theorem says that \(\frac {\sqrt {n}(\bar {x}-\mu )}{\sigma }\to N_1(0,1)\) as n → ∞, where \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\). Translate this statement for (1): the binomial probability law; (2): the negative binomial probability law; (3): the geometric probability law \(f_3(x)=p(1-p)^{x-1},\ x=1,2,\ldots,\ 0<p<1,\) and f 3(x) = 0 elsewhere; (4): the Poisson probability law \(f_4(x)=\frac {\lambda ^x}{x!}\text{e}^{-\lambda }, x=0,1,\ldots , \lambda >0\) and f 4(x) = 0 elsewhere.

2.3.3

Repeat Exercise 2.3.2 if the population is (1): \(g_1(x)=c_1x^{\gamma -1}\text{e}^{-ax^{\delta }},x\ge 0, \delta >0,a>0,\gamma >0\) and g 1(x) = 0 elsewhere; (2): The real pathway model \(g_2(x)= c_q x^{\gamma }[1-a(1-q)x^{\delta }]^{\frac {1}{1-q}},a>0,\delta >0, 1-a(1-q)x^{\delta }>0\) and for the cases q < 1, q > 1, q → 1, and g 2(x) = 0 elsewhere.

2.3.4

Let \(x\sim N_1(\mu _1,\sigma _1^2),y\sim N_1(\mu _2,\sigma _2^2)\) be real Gaussian and be independently distributed. Let \(x_1,\ldots ,x_{n_1},y_1,\ldots ,y_{n_2}\) be simple random samples from x and y respectively. Let \(u_1=\sum _{j=1}^{n_1}(x_j-\mu _1)^2, u_2=\sum _{j=1}^{n_2}(y_j-\mu _2)^2, u_3=2x_1-3x_2+y_1-y_2+2y_3\),

$$\displaystyle \begin{aligned}u_4=\frac{\sum_{j=1}^{n_1}(x_j-\mu_1)^2/\sigma_1^2}{\sum_{j=1}^{n_2}(y_j-\mu_2)^2/\sigma_2^2}, u_5=\frac{\sum_{j=1}^{n_1}(x_j-\bar{x})^2/\sigma_1^2}{\sum_{j=1}^{n_2}(y_j-\bar{y})^2/\sigma_2^2}. \end{aligned}$$

Compute the densities of u 1, u 2, u 3, u 4, u 5.

2.3.5

In Exercise 2.3.4, if \(\sigma _1^2=\sigma _2^2=\sigma ^2\), compute the densities of u 3, u 4, u 5 there, as well as that of \(u_6=\sum _{j=1}^{n_1}(x_j-\bar {x})^2+\sum _{j=1}^{n_2}(y_j-\bar {y})^2\), when (1): n 1 = n 2, (2): n 1 ≠ n 2.

2.3.6

For the noncentral chisquare in the complex case, discussed in (2.3a.5) evaluate the mean value and the variance.

2.3.7

For the complex case, starting with the mgf, derive the noncentral chisquare density and show that it agrees with that given in (2.3a.3).

2.3.8

Give the detailed proofs of the independence of linear forms and sets of linear forms in the complex Gaussian case.

2.4. Distributions of Products and Ratios and Connection to Fractional Calculus

Distributions of products and ratios of real scalar random variables are connected to numerous topics including Krätzel integrals and transforms, reaction-rate probability integrals in nuclear reaction-rate theory, the inverse Gaussian distribution, integrals occurring in fractional calculus, Kobayashi integrals and Bayesian structures. Let x 1 > 0 and x 2 > 0 be real scalar positive random variables that are independently distributed with density functions f 1(x 1) and f 2(x 2), respectively. We respectively denote the product and ratio of these variables by u 2 = x 1 x 2 and \( u_1=\frac {x_2}{x_1}\). What are then the densities of u 1 and u 2? We first consider the density of the product. Let u 2 = x 1 x 2 and v = x 2. Then \(x_1=\frac {u_2}{v} \) and x 2 = v, so that \(\text{d}x_1\wedge \text{d}x_2=\frac {1}{v}\text{d}u_2\wedge \text{d}v\). Let the joint density of u 2 and v be denoted by g(u 2, v) and the marginal density of u 2 by g 2(u 2). Then

$$\displaystyle \begin{aligned}g(u_2,v)=\frac{1}{v}f_1\Big(\frac{u_2}{v}\Big)f_2(v)\mbox{ and }g_2(u_2)=\int_v\frac{1}{v}f_2\Big(\frac{u_2}{v}\Big)f_1(v)\text{d}v.{}\end{aligned} $$
(2.4.1)

For example, let f 1 and f 2 be generalized gamma densities, in which case

$$\displaystyle \begin{aligned}f_j(x_j)=\frac{\delta_j\,a_j^{\frac{\gamma_j}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}x_j^{\gamma_j-1}\text{e}^{-a_jx_j^{\delta_j}},\ a_j>0,\ \delta_j>0,~\gamma_j>0,~x_j\ge 0,\ j=1,2,\end{aligned} $$

and f j(x j) = 0 elsewhere. Then,

$$\displaystyle \begin{aligned} g_2(u_2)&=c\int_0^{\infty}\Big(\frac{1}{v}\Big)\Big(\frac{u_2}{v}\Big)^{\gamma_1-1}v^{\gamma_2-1}\\ &\ \ \ \ \times \text{e}^{-a_2v^{\delta_2}-a_1(\frac{u_2}{v})^{\delta_1}}\text{d}v,\ c=\prod_{j=1}^2\frac{\delta_j\,a_j^{\frac{\gamma_j}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}\\ &=c~u_2^{\gamma_1-1}\int_0^{\infty}v^{\gamma_2-\gamma_1-1}\text{e}^{-a_2v^{\delta_2}-a_1({u_2^{\delta_1}}/{v^{\delta_1}})}\text{d}v.{} \end{aligned} $$
(2.4.2)

The integral in (2.4.2) is connected to several topics. For δ 1 = 1, δ 2 = 1, this integral is the basic Krätzel integral and Krätzel transform, see Mathai and Haubold (2020). When \(\delta _2=1,\ \delta _1=\frac {1}{2},\) the integral in (2.4.2) is the basic reaction-rate probability integral in nuclear reaction-rate theory, see Mathai and Haubold (1988). For δ 1 = 1, δ 2 = 1, the integrand in (2.4.2), once normalized, is the inverse Gaussian density for appropriate values of γ 2 − γ 1 − 1. Observe that (2.4.2) is also connected to the Bayesian structure of unconditional densities if the conditional and marginal densities belong to the generalized gamma family of densities. When δ 2 = 1, the integral is an mgf of the remaining part of the integrand with a 2 as the mgf parameter; it is therefore the Laplace transform of the remaining part of the function.
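As a numerical illustration (a sketch under the stated parametrization, not part of the text), the density in (2.4.2) can be evaluated by quadrature and compared with simulated products x 1 x 2. The parameter values below are arbitrary, and the sampling step uses the standard fact that if y follows a gamma distribution with shape \(\gamma_j/\delta_j\) and unit scale, then \((y/a_j)^{1/\delta_j}\) has the generalized gamma density \(f_j\) above.

```python
# Numerical sketch of the product density (2.4.1)-(2.4.2), illustrative parameters only.
import numpy as np
from scipy import integrate, special

g1p, d1, a1 = 2.5, 1.0, 1.0          # (gamma_1, delta_1, a_1)
g2p, d2, a2 = 3.0, 2.0, 0.5          # (gamma_2, delta_2, a_2)
c = (d1 * a1**(g1p / d1) / special.gamma(g1p / d1)) * \
    (d2 * a2**(g2p / d2) / special.gamma(g2p / d2))

def g2_density(u):
    # integral representation (2.4.2) of the density of u_2 = x_1 x_2
    integrand = lambda v: (1.0 / v) * (u / v)**(g1p - 1) * v**(g2p - 1) * \
                          np.exp(-a2 * v**d2 - a1 * (u / v)**d1)
    val, _ = integrate.quad(integrand, 0, np.inf)
    return c * val

rng = np.random.default_rng(1)
N = 200_000
x1 = (rng.gamma(g1p / d1, 1.0, N) / a1) ** (1.0 / d1)   # generalized gamma samples
x2 = (rng.gamma(g2p / d2, 1.0, N) / a2) ** (1.0 / d2)
u2 = x1 * x2

for t in (1.0, 3.0, 8.0):            # compare Pr{u_2 <= t} by quadrature and by simulation
    cdf_quad, _ = integrate.quad(g2_density, 0, t)
    print(t, cdf_quad, np.mean(u2 <= t))
```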

Now, let us consider different f 1 and f 2. Let f 1(x 1) be a real type-1 beta density with the parameters (γ + 1, α), \(\Re (\alpha )>0,\ \Re (\gamma )>-1\) (in statistical problems, the parameters are real but in this case the results hold as well for complex parameters; accordingly, the conditions are stated for complex parameters), that is, the density of x 1 is

$$\displaystyle \begin{aligned}f_1(x_1)=\frac{\varGamma(\gamma+1+\alpha)}{\varGamma(\gamma+1)\varGamma(\alpha)}x_1^{\gamma}(1-x_1)^{\alpha-1},\ 0\le x_1\le 1,\ \alpha>0,\ \gamma>-1, \end{aligned}$$

and f 1(x 1) = 0 elsewhere. Let f 2(x 2) = f(x 2) where f is an arbitrary density. Then, the density of u 2 is given by

$$\displaystyle \begin{aligned} g_2(u_2)&=c\,\frac{1}{\varGamma(\alpha)}\int_v\frac{1}{v}\Big(\frac{u_2}{v}\Big)^{\gamma}\Big(1-\frac{u_2}{v}\Big)^{\alpha-1}f(v)\text{d}v,\ c=\frac{\varGamma(\gamma+1+\alpha)}{\varGamma(\gamma+1)}\\ &=c\,\frac{u_2^{\gamma}}{\varGamma(\alpha)}\int_{v\ge u_2>0}v^{-\gamma-\alpha}(v-u_2)^{\alpha-1}f(v)\text{d}v{} \end{aligned} $$
(2.4.3)
$$\displaystyle \begin{aligned} &=c\,K_{2,u_2,\gamma}^{-\alpha}f{}\end{aligned} $$
(2.4.4)

where \(K_{2,u_2,\gamma }^{-\alpha }f\) is the Erdélyi-Kober fractional integral of order α of the second kind, with parameter γ in the real scalar variable case. Hence, if f is an arbitrary density, then \(K_{2,u_2,\gamma }^{-\alpha }f\) is \(\frac {\varGamma (\gamma +1)}{\varGamma (\gamma +1+\alpha )}g_2(u_2)\) or a constant multiple of the density of a product of independently distributed real scalar positive random variables where one of them has a real type-1 beta density and the other has an arbitrary density. When f 1 and f 2 are densities, then g 2(u 2) has the structure

$$\displaystyle \begin{aligned}g_2(u_2)=\int_v\frac{1}{v}f_1\big(\frac{u_2}{v}\big)f_2(v)\text{d}v. \end{aligned} $$
(i)

Whether or not f 1 and f 2 are densities, the structure in (i) is called the Mellin convolution of a product in the sense that if we take the Mellin transform of g 2, with Mellin parameter s, then

$$\displaystyle \begin{aligned}M_{g_2}(s)=M_{f_1}(s)M_{f_2}(s){} \end{aligned} $$
(2.4.5)

where \(M_{g_2}(s)=\int _0^{\infty }u_2^{s-1}g_2(u_2)\text{d}u_2\),

$$\displaystyle \begin{aligned}M_{f_1}(s)=\int_0^{\infty}x_1^{s-1}f_1(x_1)\text{d}x_1 \ \text{and }\ M_{f_2}(s)=\int_0^{\infty}x_2^{s-1}f_2(x_2)\text{d}x_2,\end{aligned}$$

whenever the Mellin transforms exist. Here (2.4.5) is the Mellin convolution of a product property. In statistical terms, when f 1 and f 2 are densities and when x 1 and x 2 are independently distributed, we have

$$\displaystyle \begin{aligned}E[u_2^{s-1}]=E[x_1^{s-1}]E[x_2^{s-1}]{} \end{aligned} $$
(2.4.6)

whenever the expected values exist. Taking different forms of f 1 and f 2, where f 1 has a factor \(\frac {(1-x_1)^{\alpha -1}}{\varGamma (\alpha )}\) for \(0\le x_1\le 1, \Re (\alpha )>0,\) it can be shown that the structure appearing in (i) produces all the various fractional integrals of the second kind of order α available in the literature for the real scalar variable case, such as the Riemann-Liouville fractional integral, Weyl fractional integral, etc. Connections of distributions of products and ratios to fractional integrals were established in a series of papers which appeared in Linear Algebra and its Applications, see Mathai (2013, 2014, 2015).
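The product property (2.4.5) and (2.4.6) is easy to check by simulation; the following is a brief sketch, not part of the text, with arbitrarily chosen positive distributions and an arbitrary Mellin parameter s.

```python
# Quick Monte Carlo illustration of (2.4.6): for independent positive x_1, x_2 and
# u_2 = x_1 x_2, E[u_2^{s-1}] = E[x_1^{s-1}] E[x_2^{s-1}].
import numpy as np

rng = np.random.default_rng(2)
N, s = 1_000_000, 2.7                  # s is an arbitrary Mellin parameter
x1 = rng.gamma(2.0, 1.5, N)            # any positive distributions will do
x2 = rng.beta(3.0, 4.0, N)
u2 = x1 * x2

print(np.mean(u2**(s - 1)))                          # left-hand side of (2.4.6)
print(np.mean(x1**(s - 1)) * np.mean(x2**(s - 1)))   # right-hand side
```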

Now, let us consider the density of a ratio. Again, let x 1 > 0, x 2 > 0 be independently distributed real scalar random variables with density functions f 1(x 1) and f 2(x 2), respectively. Let \(u_1=\frac {x_2}{x_1}\) and let v = x 2. Then \(\text{d}x_1\wedge \text{d}x_2=-\frac {v}{u_1^2}\text{d}u_1\wedge \text{d}v\). If we take x 1 = v instead, the Jacobian will be only v and not \(-\frac {v}{u_1^2}\), and the final structure will be different. However, the first transformation is required in order to establish connections to fractional integrals of the first kind. If f 1 and f 2 are generalized gamma densities as described earlier and if x 1 = v, then the marginal density of u 1, denoted by g 1(u 1), will be as follows:

$$\displaystyle \begin{aligned} g_1(u_1)&=c\int_vv\,v^{\gamma_1-1}(u_1v)^{\gamma_2-1}\text{e}^{-a_1v^{\delta_1}-a_2(u_1v)^{\delta_2}}\text{d}v,\ c=\prod_{j=1}^2\frac{\delta_j\,a_j^{\frac{\gamma_j}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})},\\ &=c\,u_1^{\gamma_2-1}\int_{v=0}^{\infty}v^{\gamma_1+\gamma_2-1}\text{e}^{-a_1v^{\delta_1}-a_2(u_1v)^{\delta_2}}\text{d}v{} \end{aligned} $$
(2.4.7)
$$\displaystyle \begin{aligned} &=\frac{c}{\delta}\,\varGamma\Big(\frac{\gamma_1+\gamma_2}{\delta}\Big)u_1^{\gamma_2-1}(a_1+a_2u_1^{\delta})^{-\frac{\gamma_1+\gamma_2}{\delta}}, \mbox{ for }\delta_1=\delta_2=\delta.{} \end{aligned} $$
(2.4.8)

On the other hand, if x 2 = v, then the Jacobian is \(-\frac {v}{u_1^2}\) and the marginal density, again denoted by g 1(u 1), will be as follows when f 1 and f 2 are gamma densities:

$$\displaystyle \begin{aligned} g_1(u_1)&=c\int_v\Big(\frac{v}{u_1^2}\Big)\Big(\frac{v}{u_1}\Big)^{\gamma_1-1}v^{\gamma_2-1}\text{e}^{-a_1(\frac{v}{u_1})^{\delta_1}-a_2v^{\delta_2}}\text{d}v\\ &=c\,u_1^{-\gamma_1-1}\int_{v=0}^{\infty}v^{\gamma_1+\gamma_2-1}\text{e}^{-a_2v^{\delta_2}-a_1(\frac{v}{u_1})^{\delta_1}}\text{d}v.\end{aligned} $$

This is one of the representations of the density of a product discussed earlier, which is also connected to the Krätzel integral, the reaction-rate probability integral, and so on. Now, let us consider a type-1 beta density for x 1 with the parameters (γ, α) having the following density:

$$\displaystyle \begin{aligned}f_1(x_1)=\frac{\varGamma(\gamma+\alpha)}{\varGamma(\gamma)\varGamma(\alpha)}x_1^{\gamma-1}(1-x_1)^{\alpha-1},\ 0\le x_1\le 1,\end{aligned}$$

for γ > 0, α > 0 and f 1(x 1) = 0 elsewhere. Let f 2(x 2) = f(x 2) where f is an arbitrary density. Letting \(u_1=\frac {x_2}{x_1}\) and x 2 = v, the density of u 1, again denoted by g 1(u 1), is

$$\displaystyle \begin{aligned} g_1(u_1)&=\frac{\varGamma(\gamma+\alpha)}{\varGamma(\gamma)\varGamma(\alpha)}\int_v\frac{v}{u_1^2} \Big(\frac{v}{u_1}\Big)^{\gamma-1}\Big(1-\frac{v}{u_1}\Big)^{\alpha-1}f(v)\text{d}v\\ &=\frac{\varGamma(\gamma+\alpha)}{\varGamma(\gamma)}\frac{u_1^{-\gamma-\alpha}}{\varGamma(\alpha)}\int_{v\le u_1}v^{\gamma}(u_1-v)^{\alpha-1}f(v)\text{d}v{} \end{aligned} $$
(2.4.9)
$$\displaystyle \begin{aligned} &=\frac{\varGamma(\gamma+\alpha)}{\varGamma(\gamma)}K_{1,u_1,\gamma}^{-\alpha}f, \ \Re(\alpha)>0,{}\end{aligned} $$
(2.4.10)

where \(K_{1,u_1,\gamma }^{-\alpha }f\) is Erdélyi-Kober fractional integral of the first kind of order α and parameter γ. If f 1 and f 2 are densities, this Erdélyi-Kober fractional integral of the first kind is a constant multiple of the density of a ratio g 1(u 1). In statistical terms,

$$\displaystyle \begin{aligned} u_1&=\frac{x_2}{x_1}\Rightarrow E[u_1^{s-1}]=E[x_2^{s-1}]E[x_1^{-s+1}] \ \text{with }\ E[x_1^{-s+1}]=E[x_1^{(2-s)-1}]\Rightarrow\\ M_{g_1}(s)&=M_{f_1}(2-s)M_{f_2}(s),{}\end{aligned} $$
(2.4.11)

which is the Mellin convolution of a ratio. Whether or not f 1 and f 2 are densities, (2.4.11) is taken as the Mellin convolution of a ratio and it cannot be given statistical interpretations when f 1 and f 2 are not densities. For example, let \(f_1(x_1)=x_1^{-\alpha }\frac {(1-x_1)^{\alpha -1}}{\varGamma (\alpha )}\) and \(f_2(x_2)=x_2^{\alpha }f(x_2)\) where f(x 2) is an arbitrary function. Then the Mellin convolution of a ratio, as in (2.4.11), again denoted by g 1(u 1), is given by

$$\displaystyle \begin{aligned}g_1(u_1)=\int_{v\le u_1}\frac{(u_1-v)^{\alpha-1}}{\varGamma(\alpha)}f(v)\text{d}v, \ \Re(\alpha)>0.{} \end{aligned} $$
(2.4.12)

This is Riemann-Liouville fractional integral of the first kind of order α if v is bounded below; when v is not bounded below, then it is Weyl fractional integral of the first kind of order α. An introduction to fractional calculus is presented in Mathai and Haubold (2018). The densities of u 1 and u 2 are connected to various problems in different areas for different functions f 1 and f 2.
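As a quick check of (2.4.12), here is a sketch that is not part of the text: taking the lower limit to be 0 and \(f(v)=v^{\delta}\), as in Exercise 2.4.5, the integral has the closed form \(u_1^{\alpha+\delta}\varGamma(\delta+1)/\varGamma(\alpha+\delta+1)\), which the following quadrature confirms for one arbitrary choice of α, δ and u 1.

```python
# Riemann-Liouville fractional integral of the first kind of f(v) = v**delta, lower limit 0:
# closed form u**(alpha+delta) * Gamma(delta+1) / Gamma(alpha+delta+1), checked by quadrature.
from scipy import integrate, special

alpha, delta, u = 1.5, 2.0, 3.5        # illustrative values only
quad_val, _ = integrate.quad(lambda v: (u - v)**(alpha - 1) * v**delta, 0, u)
quad_val /= special.gamma(alpha)
closed_form = u**(alpha + delta) * special.gamma(delta + 1) / special.gamma(alpha + delta + 1)
print(quad_val, closed_form)           # the two numbers should agree
```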

In the p × p matrix case in the complex domain, we will assume that the matrix is Hermitian positive definite. Note that when p = 1, Hermitian positive definite means a real positive variable. Hence in the scalar case, we will not discuss ratios and products in the complex domain since densities must be real-valued functions.

Exercises 2.4

2.4.1

Derive the density of (1): a real non-central F, where the numerator chisquare is non-central and the denominator chisquare is central, (2): a real doubly non-central F where both the chisquares are non-central with non-centrality parameters λ 1 and λ 2 respectively.

2.4.2

Let x 1 and x 2 be real gamma random variables with parameters (α 1, β) and (α 2, β) with the same β respectively and be independently distributed. Let \(u_1=\frac {x_1}{x_1+x_2}, u_2=\frac {x_1}{x_2}, u_3=x_1+x_2\). Compute the densities of u 1, u 2, u 3. Hint: Use the transformation \(x_1=r\cos ^2\theta , x_2=r\sin ^2\theta \).

2.4.3

Let x 1 and x 2 be as defined as in Exercise 2.4.2. Let u = x 1 x 2. Derive the density of u.

2.4.4

Let x j have a real type-1 beta density with the parameters (α j, β j), j = 1, 2 and be independently distributed. Let \(u_1=x_1x_2, u_2=\frac {x_1}{x_2}\). Derive the densities of u 1 and u 2. State the conditions under which these densities reduce to simpler known densities.

2.4.5

Evaluate (1): the Weyl fractional integral of the second kind of order α if the arbitrary function is \(f(v)=\text{e}^{-v}\); (2): the Riemann-Liouville fractional integral of the first kind of order α if the lower limit is 0 and the arbitrary function is \(f(v)=v^{\delta}\).

2.4.6

In Exercise 2.4.2 show that (1): u 1 and u 3 are independently distributed; (2): u 2 and u 3 are independently distributed.

2.4.7

In Exercise 2.4.2 show that for arbitrary h, \(E\left [\frac {x_1}{x_1+x_2}\right ]^h=\frac {E(x_1^h)}{E(x_1+x_2)^h}\) and state the conditions for the existence of the moments. [Observe that, in general, \(E(\frac {y_1}{y_2})^h\ne \frac {E(y_1^h)}{E(y_2^h)}\) even if y 1 and y 2 are independently distributed.]

2.4.8

Derive the corresponding densities in Exercise 2.4.1 for the complex domain by taking the chisquares in the complex domain.

2.4.9

Extend the results in Exercise 2.4.2 to the complex domain by taking chisquare variables in the complex domain instead of gamma variables.

2.5. General Structures

2.5.1. Product of real scalar gamma variables

Let x 1, …, x k be independently distributed real scalar gamma random variables with x j having the density \(f_j(x_j)=c_jx_j^{\alpha _j-1}\text{e}^{-\frac {x_j}{\beta _j}},\ 0\le x_j<\infty ,\ \alpha _j>0,\ \beta _j>0\) and f j(x j) = 0 elsewhere. Consider the product u = x 1 x 2x k. Such structures appear in many situations such as geometrical probability problems when we consider gamma distributed random points, see Mathai (1999). How can we determine the density of such a general structure? The transformation of variables technique is not a feasible procedure in this case. Since the x j’s are positive, we may determine the Mellin transforms of the x j’s with parameter s. Then, when f j(x j) is a density, the Mellin transform \(M_{f_j}(s)\), once expressed in terms of an expected value, is \(M_{f_j}(s)=E[x_j^{s-1}]\) whenever the expected value exists:

$$\displaystyle \begin{aligned} M_{f_j}(s)&=E[x_j^{s-1}]=\frac{1}{\beta_j^{\alpha_j}\varGamma(\alpha_j)}\int_0^{\infty}x_j^{s-1}x_j^{\alpha_j-1}\text{e}^{-\frac{x_j}{\beta_j}}\text{d}x_j\\ &=\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j)}\beta_j^{s-1},\ \Re(\alpha_j+s-1)>0.\end{aligned} $$

Hence,

$$\displaystyle \begin{aligned}E[u^{s-1}]=\prod_{j=1}^kE[x_j^{s-1}]= \prod_{j=1}^k\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j)}\beta_j^{s-1},\ \Re(\alpha_j+s-1)>0,\ j=1,\ldots,k,\end{aligned}$$

and the density of u is available from the inverse Mellin transform. If g(u) is the density of u, then

$$\displaystyle \begin{aligned}g(u)=\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\Big\{\prod_{j=1}^k\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j)}\beta_j^{s-1}\Big\}u^{-s}\text{d}s,\ i=\sqrt{(-1)}.{} \end{aligned} $$
(2.5.1)

This is a contour integral where c is any real number such that \(c>-\Re (\alpha _j-1), j=1,\ldots ,k\). The integral in (2.5.1) is available in terms of a known special function, namely Meijer’s G-function. The G-function can be defined as follows:

$$\displaystyle \begin{aligned} G(z)&=G_{p,q}^{m,n}(z)=G_{p,q}^{m,n}\left[z\big|{}_{b_1,\ldots,b_q}^{a_1,\ldots,a_p}\right]\\ &=\frac{1}{2\pi i}\int_L\frac{\{\prod_{j=1}^m\varGamma(b_j+s)\}\{\prod_{j=1}^n\varGamma(1-a_j-s)\}}{\{\prod_{j=m+1}^q\varGamma(1-b_j-s)\} \{\prod_{j=n+1}^p\varGamma(a_j+s)\}}z^{-s}\text{d}s,\ i=\sqrt{(-1)}.{}\end{aligned} $$
(2.5.2)

The existence conditions, different possible contours L, as well as properties and applications are discussed in Mathai (1993), Mathai and Saxena (1973, 1978), and Mathai et al. (2010). With the help of (2.5.2), we may now express (2.5.1) as follows in terms of a G-function:

$$\displaystyle \begin{aligned}g(u)=\Big\{\prod_{j=1}^k\frac{1}{\beta_j\varGamma(\alpha_j)}\Big\}\,G_{0,k}^{k,0}\left[\frac{u}{\beta_1\cdots\beta_k} \big|{}_{\alpha_j-1,j=1,\ldots,k}\right]{} \end{aligned} $$
(2.5.3)

for 0 ≤ u < ∞. Series and computable forms of a general G-function are provided in Mathai (1993). They are built-in functions in the symbolic computational packages Mathematica and MAPLE.
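As a concrete illustration (a sketch, not part of the text), the case k = 2 with β 1 = β 2 = 1 of (2.5.3) can be evaluated with the Meijer G-function implemented in the Python library mpmath; the values of α 1 and α 2 below are arbitrary.

```python
# Evaluate (2.5.3) for k = 2, beta_1 = beta_2 = 1, with mpmath's Meijer G-function,
# then check total probability and the mean against a Monte Carlo simulation.
import mpmath as mp
import numpy as np

a1, a2 = 2.0, 3.5                     # alpha_1, alpha_2 (illustrative values)

def g(u):
    # g(u) = G^{2,0}_{0,2}[u | alpha_1 - 1, alpha_2 - 1] / (Gamma(alpha_1) Gamma(alpha_2))
    return mp.meijerg([[], []], [[a1 - 1, a2 - 1], []], u) / (mp.gamma(a1) * mp.gamma(a2))

print(mp.quad(g, [0, mp.inf]))                    # total probability, should be 1
print(mp.quad(lambda u: u * g(u), [0, mp.inf]))   # E[u] = alpha_1 * alpha_2 = 7 here

rng = np.random.default_rng(3)
print(np.mean(rng.gamma(a1, 1.0, 10**6) * rng.gamma(a2, 1.0, 10**6)))  # Monte Carlo E[u]
```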

2.5.2. Product of real scalar type-1 beta variables

Let y 1, …, y k be independently distributed real scalar type-1 beta random variables with the parameters (α j, β j), α j > 0, β j > 0, j = 1, …, k. Consider the product u 1 = y 1⋯y k. Such a structure occurs in several contexts. It appears for instance in geometrical probability problems in connection with type-1 beta distributed random points. As well, when testing certain hypotheses on the parameters of one or more multivariate normal populations, the resulting likelihood ratio criteria, also known as λ-criteria, or one-to-one functions thereof, have the structure of a product of independently distributed real type-1 beta variables under the null hypothesis. The density of u 1 can be obtained by proceeding as in the previous section. Since the moment of u 1 of order s − 1 is

$$\displaystyle \begin{aligned} E[u_1^{s-1}]&=\prod_{j=1}^kE[y_j^{s-1}]\\ &=\prod_{j=1}^k\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j)}\frac{\varGamma(\alpha_j+\beta_j)} {\varGamma(\alpha_j+\beta_j+s-1)}{}\end{aligned} $$
(2.5.4)

for \(\Re (\alpha _j+s-1)>0,\ j=1,\ldots ,k\), the density of u 1, denoted by g 1(u 1), is given by

$$\displaystyle \begin{aligned} g_1(u_1)&=\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}[E(u_1^{s-1})]u_1^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^k\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j)}\Big\}\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\Big\{\prod_{j=1}^k\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j+\beta_j+s-1)}\Big\}u_1^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^k\frac{\varGamma(\alpha_j+\beta_j)}{\varGamma(\alpha_j)}\Big\}\,G_{k,k}^{k,0}\left[u_1 \big\vert_{\alpha_j-1,\ j=1,\ldots,k}^{\alpha_j+\beta_j-1,\ j=1,\ldots,k}\right]{} \end{aligned} $$
(2.5.5)

for \(0\le u_1\le 1,\ \Re (\alpha _j+s-1)>0,\ j=1,\ldots ,k\).
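The moment formula (2.5.4) can be verified by simulation; the following sketch, not part of the text, compares the Monte Carlo moment of u 1 with the product of gamma ratios for k = 2 and arbitrary parameter values.

```python
# Monte Carlo check of (2.5.4) for a product of two independent type-1 beta variables.
import numpy as np
from scipy import special

alpha, beta, s = np.array([2.0, 1.5]), np.array([3.0, 2.5]), 2.3
rng = np.random.default_rng(4)
u1 = rng.beta(alpha[0], beta[0], 10**6) * rng.beta(alpha[1], beta[1], 10**6)

mc = np.mean(u1**(s - 1))
exact = np.prod(special.gamma(alpha + s - 1) * special.gamma(alpha + beta) /
                (special.gamma(alpha) * special.gamma(alpha + beta + s - 1)))
print(mc, exact)   # the two values should be close
```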

2.5.3. Product of real scalar type-2 beta variables

Let u 2 = z 1 z 2z k where the z j’s are independently distributed real scalar type-2 beta random variables with the parameters (α j, β j), α j > 0, β j > 0, j = 1, …, k. Such products are encountered in several situations, including certain problems in geometrical probability that are discussed in Mathai (1999). Then,

$$\displaystyle \begin{aligned} E[z_j^{s-1}]=\frac{\varGamma(\alpha_j+s-1)}{\varGamma(\alpha_j)}\frac{\varGamma(\beta_j-s+1)}{\varGamma(\beta_j)},\ -\Re(\alpha_j-1)<\Re(s)<\Re(\beta_j+1), \end{aligned}$$

and

$$\displaystyle \begin{aligned} E[u_2^{s-1}]=\prod_{j=1}^kE[z_j^{s-1}]=\big\{ \prod_{j=1}^k[\varGamma(\alpha_j)\varGamma(\beta_j)]^{-1}\big\} \big\{\prod_{j=1}^k\varGamma(\alpha_j+s-1)\varGamma(\beta_j-s+1)\big\}.{}\end{aligned} $$
(2.5.6)

Hence, the density of u 2, denoted by g 2(u 2), is given by

$$\displaystyle \begin{aligned} g_2(u_2)&=\big\{\prod_{j=1}^k[\varGamma(\alpha_j) \varGamma(\beta_j)]^{-1}\big\}\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\big\{\prod_{j=1}^k\varGamma(\alpha_j+s-1)\varGamma(\beta_j-s+1)\big\}u_2^{-s}\text{d}s\\ &=\big\{\prod_{j=1}^k[\varGamma(\alpha_j) \varGamma(\beta_j)]^{-1}\big\}\,G_{k,k}^{k,k}\left[ u_2\big\vert_{\alpha_j-1,\ j=1,\ldots,k}^{-\beta_j,\ j=1,\ldots,k}\right],\ u_2\ge 0.{}\qquad \qquad \end{aligned} $$
(2.5.7)

2.5.4. General products and ratios

Let us consider a structure of the following form:

$$\displaystyle \begin{aligned}u_3=\frac{t_1\cdots t_r}{t_{r+1}\cdots t_k} \end{aligned}$$

where the t j’s are independently distributed real positive variables, such as real type-1 beta, real type-2 beta, and real gamma variables, where the expected values \(E[t_j^{s-1}]\) for j = 1, …, k, will produce various types of gamma products, some containing + s and others, − s, both in the numerator and in the denominator. Accordingly, we obtain a general structure such as that appearing in (2.5.2), and the density of u 3, denoted by g 3(u 3), will then be proportional to a general G-function.

2.5.5. The H-function

Let u = v 1 v 2v k where the v j’s are independently distributed generalized real gamma variables with densities

$$\displaystyle \begin{aligned}h_j(v_j)=\frac{\delta_j\,a_j^{\frac{\gamma_j}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}v_j^{\gamma_j-1}\text{e}^{-a_jv_j^{\delta_j}},\ v_j\ge 0,\end{aligned}$$

for a j > 0, δ j > 0, γ j > 0, and h j(v j) = 0 elsewhere for j = 1, …, k. Then,

$$\displaystyle \begin{aligned}E[v_j^{s-1}]=\frac{\varGamma(\frac{\gamma_j+s-1}{\delta_j})}{\varGamma(\frac{\gamma_j}{\delta_j})}\frac{1}{a_j^{\frac{s-1}{\delta_j}}}, \end{aligned}$$

for \(\Re (\gamma _j+s-1)>0, \ v_j\ge 0,\ \delta _j>0,\ a_j>0,\ \gamma _j>0, \ j=1,\ldots ,k\), and

$$\displaystyle \begin{aligned}E[u^{s-1}]=\Big\{\prod_{j=1}^k\frac{a_j^{\frac{1}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}\Big\} \Big\{\prod_{j=1}^k\varGamma(\frac{\gamma_j-1}{\delta_j}+\frac{s}{\delta_j})\,a_j^{-\frac{s}{\delta_j}}\Big\}.{} \end{aligned} $$
(2.5.8)

Thus, the density of u, denoted by g 3(u), is given by

$$\displaystyle \begin{aligned} g_3(u)&=\Big\{\prod_{j=1}^k\frac{a_j^{\frac{1}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}\Big\}\frac{1}{2\pi i}\int_L\Big\{\prod_{j=1}^k\varGamma(\frac{\gamma_j-1}{\delta_j}+\frac{s}{\delta_j})\,a_j^{-\frac{s}{\delta_j}}\Big\}u^{-s}\text{d}s\\ &=\Big\{\prod_{j=1}^k\frac{a_j^{\frac{1}{\delta_j}}}{\varGamma(\frac{\gamma_j}{\delta_j})}\Big\}H_{0,k}^{k,0}\left[ \Big\{\prod_{j=1}^ka_j^{\frac{1}{\delta_j}}\Big\}u\Big|{}_{(\frac{\gamma_j-1}{\delta_j},\frac{1}{\delta_j}),\ j=1,\ldots,k}\right] {}\end{aligned} $$
(2.5.9)

for \(u\ge 0,\ \Re (\gamma _j+s-1)>0,\ j=1,\ldots ,k,\) where L is a suitable contour and the general H-function is defined as follows:

$$\displaystyle \begin{aligned} H(z)&=H_{p,q}^{m,n}(z)=H_{p,q}^{m,n}\big[z\big|{}_{(b_1,\beta_1),\ldots,(b_q,\beta_q)}^{(a_1,\alpha_1),\ldots,(a_p,\alpha_p)}\big]\\ &=\frac{1}{2\pi i}\int_L\frac{\{\prod_{j=1}^m\varGamma(b_j+\beta_js)\}\{\prod_{j=1}^n\varGamma(1-a_j-\alpha_js)\}} {\{\prod_{j=m+1}^q\varGamma(1-b_j-\beta_js)\}\{\prod_{j=n+1}^p\varGamma(a_j+\alpha_js)\}}z^{-s}\text{d}s{} \end{aligned} $$
(2.5.10)

where α j > 0, j = 1, …, p, and β j > 0, j = 1, …, q, are real and positive, the b j’s and a j’s are complex numbers, and the contour L separates the poles of Γ(b j + β j s), j = 1, …, m, lying on one side of it, from the poles of Γ(1 − a j − α j s), j = 1, …, n, which must lie on the other side. The existence conditions and the various types of possible contours are discussed in Mathai and Saxena (1978) and Mathai et al. (2010). Observe that we can consider arbitrary powers of the variables present in u, u 1, u 2 and u 3 as introduced in Sects. 2.5.1 to 2.5.5; however, in this case, the densities of these various structures will be expressible in terms of H-functions rather than G-functions. In the G-function format as defined in (2.5.2), the complex variable s has ± 1 as its coefficients, whereas the coefficients of s in the H-function, that is, ± α j, α j > 0 and ± β j, β j > 0, are not restricted to unities.

We will give a simple illustrative example that requires the evaluation of an inverse Mellin transform. Let \(f(x)=\text{e}^{-x},\ x>0\). Then, the Mellin transform is

$$\displaystyle \begin{aligned}M_{f}(s)=\int_0^{\infty}x^{s-1}\text{e}^{-x}\text{d}x=\varGamma(s),\ \Re(s)>0, \end{aligned}$$

and it follows from the inversion formula that

$$\displaystyle \begin{aligned}f(x)=\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\varGamma(s)x^{-s}\text{d}s,\ \Re(s)>0,\ i=\sqrt{(-1)}.{} \end{aligned} $$
(2.5.11)

If f(x) is unknown and we are told that the Mellin transform of a certain function is Γ(s), are we going to retrieve f(x) as \(\text{e}^{-x}\) from the inversion formula? Let us explore this problem. The poles of Γ(s) occur at s = 0, −1, −2, …. Thus, if we take c > 0 in the contour of integration and close the contour to the left, all the poles of Γ(s) will be enclosed. We may now apply Cauchy’s residue theorem. By definition, the residue at s = −ν, denoted by R ν, is

$$\displaystyle \begin{aligned}R_{\nu}=\lim_{s\to -\nu}(s+\nu)\varGamma(s)x^{-s}.\end{aligned}$$

We cannot substitute s = −ν to obtain the limit in this case. However, noting that

$$\displaystyle \begin{aligned}(s+\nu)\varGamma(s)\, x^{-s}=\frac{(s+\nu)(s+\nu-1)\cdots s\,\varGamma(s)x^{-s}}{(s+\nu-1)\cdots s} =\frac{\varGamma(s+\nu+1)x^{-s}}{(s+\nu-1)\cdots s}, \end{aligned}$$

which follows from the recursive relationship, αΓ(α) = Γ(α + 1), the limit can be taken:

$$\displaystyle \begin{aligned} \lim_{s\to -\nu}(s+\nu)\varGamma(s)\,x^{-s}&=\lim_{s\to -\nu}\frac{\varGamma(s+\nu+1)x^{-s}}{(s+\nu-1)\cdots s}\\ &=\frac{\varGamma(1)x^{\nu}}{(-1)(-2)\cdots (-\nu)}=\frac{(-1)^{\nu}x^{\nu}}{\nu!}.{} \end{aligned} $$
(2.5.12)

Hence, the sum of the residues is

$$\displaystyle \begin{aligned}\sum_{\nu=0}^{\infty}R_{\nu}=\sum_{\nu=0}^{\infty}\frac{(-1)^{\nu}x^{\nu}}{\nu!}=\text{e}^{-x}, \end{aligned}$$

and the function is recovered.
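The same inversion can be reproduced symbolically; the following sketch, not part of the text, uses the Mellin transform routines available in the Python library SymPy.

```python
# Symbolic check: the Mellin transform of e^{-x} is Gamma(s), and inverting Gamma(s)
# on the strip (0, oo) recovers e^{-x}.
import sympy as sp

x, s = sp.symbols('x s', positive=True)
F, strip, cond = sp.mellin_transform(sp.exp(-x), x, s)
print(F)                                                             # gamma(s)
print(sp.inverse_mellin_transform(sp.gamma(s), s, x, (0, sp.oo)))    # exp(-x)
```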

Note 2.5.1

Distributions of products and ratios of random variables in the complex domain could as well be worked out. However, since they may not necessarily have practical applications, they will not be discussed herein. Certain product and ratio distributions for variables in the complex domain which reduce to real variables, such as a chisquare in the complex domain, have already been previously discussed.

Exercises 2.5

2.5.1

Evaluate the density of u = x 1 x 2 where the x j’s are independently distributed real type-1 beta random variables with the parameters (α j, β j), α j > 0, β j > 0, j = 1, 2 by using Mellin and inverse Mellin transform technique. Evaluate the density for the case α 1 − α 2≠ ± ν, ν = 0, 1, ... so that the poles are simple.

2.5.2

Repeat Exercise 2.5.1 if x j’s are (1): real type-2 beta random variables with parameters (α j, β j), α j > 0, β j > 0 and (2): real gamma random variables with the parameters (α j, β j), α j > 0, β j > 0, j = 1, 2.

2.5.3

Let \(u=\frac {u_1}{u_2}\) where u 1 and u 2 are real positive random variables. Then, for arbitrary h, \(E[\frac {u_1}{u_2}]^h\ne \frac {E[u_1^h]}{E[u_2^h]}\) in general. Give two examples where \(E[\frac {u_1}{u_2}]^h=\frac {E[u_1^h]}{E[u_2^h]}\).

2.5.4

\(E[\frac {1}{u}]^h=E[u^{-h}]\ne \frac {1}{E[u^h]}\) in general. Give two examples where \(E[\frac {1}{u}]=\frac {1}{E[u]}\).

2.5.5

Let \(u=\frac {x_1x_2}{x_3x_4}\) where the x j’s are independently distributed. Let x 1, x 3 be type-1 beta random variables, x 2 be a type-2 beta random variable, and x 4 be a gamma random variable with parameters (α j, β j), α j > 0, β j > 0, j = 1, 2, 3, 4. Determine the density of u.

2.6. A Collection of Random Variables

Let x 1, …, x n be iid (independently and identically distributed) real scalar random variables with a common density denoted by f(x), that is, assume that the sample comes from the population that is specified by f(x). Let the common mean value be μ and the common variance be σ 2 < ∞, that is, E(x j) = μ and Var(x j) = σ 2, j = 1, …, n, where E denotes the expected value. Denoting the sample average by \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\), what can be said about \(\bar {x}\) when n → ∞? This is the type of question that will be investigated in this section.

2.6.1. Chebyshev’s inequality

For some k > 0, let us examine the probability content of |x − μ| where μ = E(x) and the variance of x is σ 2 < ∞. Consider the probability that the random variable x lies outside the interval μ − kσ < x < μ + kσ, that is, more than k times the standard deviation σ away from the mean value μ. From the definition of the variance σ 2 for a real scalar random variable x,

$$\displaystyle \begin{aligned} \sigma^2&=\int_{-\infty}^{\infty}(x-\mu)^2f(x)\text{d}x\\ &=\int_{-\infty}^{\mu-k\sigma}(x-\mu)^2f(x)\text{d}x+\int_{\mu-k\sigma}^{\mu+k\sigma} (x-\mu)^2f(x)\text{d}x+\int_{\mu+k\sigma}^{\infty}(x-\mu)^2f(x)\text{d}x\\ &\ge \int_{-\infty}^{\mu-k\sigma}(x-\mu)^2f(x)\text{d}x+\int_{\mu+k\sigma}^{\infty}(x-\mu)^2f(x)\text{d}x\end{aligned} $$

since the probability content over the interval μ − kσ < x < μ + kσ is omitted. Over this interval, the probability is either positive or zero, and hence the inequality. The intervals (−∞, μ − kσ] and [μ + kσ, ∞), that is, −∞ < x ≤ μ − kσ and μ + kσ ≤ x < ∞, or equivalently −∞ < x − μ ≤−kσ and kσ ≤ x − μ < ∞, can thus be described as the set for which |x − μ|≥ kσ. On this set, the smallest value that |x − μ| can take on is kσ, k > 0, or equivalently, the smallest value that |x − μ|2 can assume is (kσ)2 = k 2 σ 2. Accordingly, the above inequality can be further sharpened as follows:

$$\displaystyle \begin{aligned} \sigma^2&\ge \int_{|x-\mu|\ge k\sigma}(x-\mu)^2f(x)\text{d}x\ge \int_{|x-\mu|\ge k\sigma}(k\sigma )^2f(x)\text{d}x\Rightarrow\\ \frac{\sigma^2}{k^2\sigma^2}&\ge \int_{|x-\mu|\ge k\sigma}f(x)\text{d}x\Rightarrow\\ \frac{1}{k^2}&\ge \int_{|x-\mu|\ge k\sigma}f(x)\text{d}x=Pr\{|x-\mu|\ge k\sigma\},\mbox{ that is, }\\ \frac{1}{k^2}&\ge Pr\{|x-\mu|\ge k\sigma\},\end{aligned} $$

which can be written as

$$\displaystyle \begin{aligned}Pr\{|x-\mu|\ge k\sigma\}\le \frac{1}{k^2}\mbox{ or }Pr\{|x-\mu|<k\sigma\}\ge 1-\frac{1}{k^2}.{}\end{aligned} $$
(2.6.1)

Setting kσ = k 1, so that \(k=\frac {k_1}{\sigma }\) and \(\frac{1}{k^2}=\frac{\sigma^2}{k_1^2}\), the above inequalities can be written as follows, with k 1 relabeled as k:

$$\displaystyle \begin{aligned}Pr\{|x-\mu|\ge k\}\le \frac{\sigma^2}{k^2}\mbox{ or }Pr\{|x-\mu|<k\}\ge 1-\frac{\sigma^2}{k^2}.{} \end{aligned} $$
(2.6.2)

The inequalities (2.6.1) and (2.6.2) are known as Chebyshev’s inequalities (also referred to as Chebycheff’s inequalities). For example, when k = 2, Chebyshev’s inequality states that \(Pr\{|x-\mu |<2\sigma \}\ge 1-\frac {1}{4}=0.75\), which is not a very sharp probability limit. If x ∼ N 1(μ, σ 2), then we know that

$$\displaystyle \begin{aligned}Pr\{|x-\mu|<1.96\sigma\}\approx 0.95\mbox{ and }Pr\{|x-\mu|<3\sigma\}\approx 0.99. \end{aligned}$$

Note that the bound 0.75 resulting from Chebyshev’s inequality seriously underestimates the actual probability for a Gaussian variable x. What is remarkable about this inequality, however, is that the stated probability bound holds for any distribution, whether it be continuous, discrete or mixed. Sharper bounds can of course be obtained for the probability content of the interval [μ − kσ, μ + kσ] when the exact distribution of x is known.
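A brief numerical sketch, not from the text, comparing Chebyshev's bound with the exact Gaussian coverage; SciPy supplies the standard normal distribution function.

```python
# Compare Chebyshev's bound 1 - 1/k**2 with the exact coverage Pr{|x - mu| < k sigma}
# for a Gaussian variable.
from scipy import stats

for k in (1.5, 2.0, 3.0):
    exact = stats.norm.cdf(k) - stats.norm.cdf(-k)   # Pr{|z| < k} for z ~ N(0,1)
    print(k, 1 - 1/k**2, exact)
# e.g. for k = 2: bound 0.75 versus exact coverage about 0.9545
```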

These inequalities can be expressed in terms of generalized moments. Let \(\mu _r^{\frac {1}{r}}=\{E|x-\mu |{ }^r\}^{\frac {1}{r}}, \ r=1,2,\ldots \ ,\) which happens to be a measure of scatter in x from the mean value μ. Given that

$$\displaystyle \begin{aligned}\mu_r=\int_{-\infty}^{\infty}|x-\mu|{}^rf(x)\text{d}x,\end{aligned}$$

consider the probability content of the intervals specified by \(|x-\mu |\ge k\mu _r^{\frac {1}{r}}\) for k > 0. Paralleling the derivations of (2.6.1) and (2.6.2), we have

$$\displaystyle \begin{aligned} \mu_r&\ge\int_{|x-\mu|\ge k\,\mu_r^{\frac{1}{r}}}|x-\mu|{}^rf(x)\text{d}x\ge \int_{|x-\mu|\ge k\,\mu_r^{\frac{1}{r}}}|(k\mu_r^{\frac{1}{r}})|{}^rf(x)\text{d}x\Rightarrow\qquad \qquad \\ Pr&\{|x-\mu|\ge k\mu_r^{\frac{1}{r}}\}\le \frac{1}{k^r}\ \mbox{ or }\ Pr\{|x-\mu|<k\mu_r^{\frac{1}{r}}\}\ge 1-\frac{1}{k^r},{}\end{aligned} $$
(2.6.3)

which can also be written as

$$\displaystyle \begin{aligned}Pr\{|x-\mu|\ge k\}\le \frac{\mu_r}{k^r}\ \mbox{ or }\ Pr\{|x-\mu|<k\}\ge 1-\frac{\mu_r}{k^r}, \ r=1,2,\ldots\ .{} \end{aligned} $$
(2.6.4)

Note that when r = 2, μ r = σ 2, and Chebyshev’s inequalities as specified in (2.6.1) and (2.6.2) are obtained from (2.6.3) and (2.6.4), respectively. If x is a real scalar positive random variable with f(x) = 0 for x ≤ 0, we can then obtain similar inequalities in terms of the first moment μ. For k > 0,

$$\displaystyle \begin{aligned} \mu=E(x)&=\int_0^{\infty}xf(x)\text{d}x\ \mbox{ since }f(x)=0\mbox{ for }x\le 0\\ &=\int_0^kxf(x)\text{d}x+\int_k^{\infty}xf(x)\text{d}x\ge \int_k^{\infty}xf(x)\text{d}x\ge \int_k^{\infty}k f(x)\text{d}x\ \Rightarrow\\ \frac{\mu}{k}&\ge \int_k^{\infty}f(x)\text{d}x=Pr\{x\ge k\}.\end{aligned} $$

Accordingly, we have the following inequality for any real positive random variable x:

$$\displaystyle \begin{aligned}Pr\{x\ge k\}\le \frac{\mu}{k}\mbox{ for }x>0,k>0.{} \end{aligned} $$
(2.6.5)

Suppose that our variable is \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n),\) where x 1, …, x n are iid variables with common mean value μ and common variance σ 2 < ∞. Then, since \(\text{Var}(\bar {x})=\frac {\sigma ^2}{n}\) and \(E(\bar {x})=\mu \), Chebyshev’s inequality states that

$$\displaystyle \begin{aligned}Pr\{|\bar{x}-\mu|<k\}\ge 1-\frac{\sigma^2}{nk^2}\to 1\mbox{ as }n\to\infty{} \end{aligned} $$
(2.6.6)

or \(Pr\{|\bar {x}-\mu |\ge k\}\to 0\) as n → ∞. Since a probability cannot be greater than 1, the lower bound in (2.6.6) forces \(Pr\{|\bar {x}-\mu |<k\}\to 1\) as n → ∞. In other words, \(\bar {x}\) tends to μ with probability 1 as n → ∞. This is referred to as the Weak Law of Large Numbers.

The Weak Law of Large Numbers

Let x 1, …, x n be iid with common mean value μ and common variance σ 2 < ∞. Then, as n → ∞,

$$\displaystyle \begin{aligned}Pr\{\bar{x}\to \mu\}\to 1.{} \end{aligned} $$
(2.6.7)

Another limiting property is known as the Central Limit Theorem. Let x 1, …, x n be iid real scalar random variables with common mean value μ and common variance σ 2 < ∞. Letting \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\) denote the sample mean, the standardized sample mean is

$$\displaystyle \begin{aligned}u=\frac{\bar{x}-E(\bar{x})}{\sqrt{\text{Var}(\bar{x})}}=\frac{\sqrt{n}}{\sigma}(\bar{x}-\mu)=\frac{1}{\sigma\sqrt{n}}[(x_1-\mu)+\cdots+(x_n-\mu)]. \end{aligned} $$
(i)

Consider the characteristic function of x − μ, that is,

$$\displaystyle \begin{aligned} \phi_{x-\mu}(t)&=E[\text{e}^{it(x-\mu)}]=1+\frac{it}{1!}E(x-\mu)+\frac{(it)^2}{2!}E(x-\mu)^2+\cdots \\ &=1+0-\frac{t^2}{2!}E(x-\mu)^2+\cdots\ =1+\frac{t}{1!}\phi^{(1)}(0)+\frac{t^2}{2!}\phi^{(2)}(0)+\cdots \end{aligned} $$
(ii)

where ϕ (r)(0) is the r-th derivative of ϕ(t) with respect to t, evaluated at t = 0. Let us consider the characteristic function of our standardized sample mean u.

Making use of the last representation of u in (i), we have \(\phi _{{\frac {\sum _{j=1}^n(x_j-\mu )}{\sigma \sqrt {n}}}}(t)=[\phi _{x_j-\mu }({\frac {t}{\sigma \sqrt {n}}})]^n\) so that \(\phi _u(t)=[\phi _{x_j-\mu }(\frac {t}{\sigma \sqrt {n}})]^n\) or \(\ln \phi _u(t)=n\ln \phi _{x_j-\mu }(\frac {t}{\sigma \sqrt {n}})\). It then follows from (ii) that

$$\displaystyle \begin{aligned}{}[\phi_{x_j-\mu}(\frac{t}{\sigma\sqrt{n}})]&=1+0-\frac{t^2}{2!}\frac{\sigma^2}{n\sigma^2}-i\frac{t^3}{3!}\frac{E(x_j-\mu)^3}{(\sigma\sqrt{n})^3}+...\\ &=1-\frac{t^2}{2n}+O\Big(\frac{1}{n^{\frac{3}{2}}}\Big).\end{aligned} $$
(iii)

Now noting that \(\ln (1-y)=-[y+\frac {y^2}{2}+\frac {y^3}{3}+\cdots \ ]\) whenever |y| < 1, we have

$$\displaystyle \begin{aligned}\ln\phi_{x_j-\mu}(\frac{t}{\sigma\sqrt{n}})=-\frac{t^2}{2n}+O\Big(\frac{1}{n^{\frac{3}{2}}}\Big)\Rightarrow n \ln\phi_{x_j-\mu}(\frac{t}{\sigma\sqrt{n}})=-\frac{t^2}{2}+O\Big(\frac{1}{n^{\frac{1}{2}}}\Big)\to -\frac{t^2}{2}\mbox{ as }n\to\infty. \end{aligned}$$

Consequently, as n → ∞,

$$\displaystyle \begin{aligned}\phi_u(t)\to\text{e}^{-\frac{t^2}{2}}\Rightarrow u\to N_1(0,1)\mbox{ as }n\to \infty.\end{aligned}$$

This is known as the central limit theorem.

Theorem The Central Limit Theorem.

Let x 1, …, x n be iid real scalar random variables having common mean value μ and common variance σ 2 < ∞. Let the sample mean be \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\) and u denote the standardized sample mean. Then

$$\displaystyle \begin{aligned}u=\frac{\bar{x}-E(\bar{x})}{\sqrt{\mathit{\text{Var}}(\bar{x})}}=\frac{\sqrt{n}}{\sigma}(\bar{x}-\mu)\to N_1(0,1)\mathit{\mbox{ as }}n\to\infty.{}\end{aligned} $$
(2.6.8)

Generalizations, extensions and more rigorous statements of this theorem are available in the literature. We have focussed on the substance of the result, assuming that a simple random sample is available and that the variance of the population is finite.
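The statement (2.6.8) can also be visualized by simulation; the following sketch, not part of the text, standardizes sample means of iid exponential variables, for which μ = θ and σ 2 = θ 2, and compares the result with the standard normal.

```python
# Simulation sketch of the central limit theorem for an exponential population.
import numpy as np
from scipy import stats

theta, n, reps = 2.0, 50, 200_000
rng = np.random.default_rng(5)
xbar = rng.exponential(theta, (reps, n)).mean(axis=1)
u = np.sqrt(n) * (xbar - theta) / theta           # standardized sample mean

print(u.mean(), u.var())                          # should be near 0 and 1
print(stats.kstest(u, 'norm').statistic)          # small KS distance from N(0,1)
```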

Exercises 2.6

2.6.1

For a binomial random variable with the probability function \(f(x)={n\choose x}p^x(1-p)^{n-x},\ 0<p<1,\ x=0,1,\ldots ,n,\ n=1,2,\ldots\) and zero elsewhere, show that the standardized binomial variable itself, namely \(\frac {x-np}{\sqrt {np(1-p)}}\), goes to the standard normal when n → ∞.

2.6.2

State the central limit theorem for the following real scalar populations by evaluating the mean value and variance there, assuming that a simple random sample is available: (1) Poisson random variable with parameter λ; (2) Geometric random variable with parameter p; (3) Negative binomial random variable with parameters (p, k); (4) Discrete hypergeometric probability law with parameters (a, b, n); (5) Uniform density over [a, b]; (6) Exponential density with parameter θ; (7) Gamma density with the parameters (α, β); (8) Type-1 beta random variable with the parameters (α, β); (9) Type-2 beta random variable with the parameters (α, β).

2.6.3

State the central limit theorem for the following probability/density functions: (1):

$$\displaystyle \begin{aligned}f(x)=\begin{cases}0.5, x=2\\ 0.5, x=5\end{cases}\end{aligned}$$

and f(x) = 0 elsewhere; (2): \(f(x)=2\text{e}^{-2x},\ 0\le x<\infty\) and zero elsewhere; (3): f(x) = 1, 0 ≤ x ≤ 1 and zero elsewhere. Assume that a simple random sample is available from each population.

2.6.4

Consider a real scalar gamma random variable x with the parameters (α, β) and show that E(x) = αβ and the variance of x is αβ 2. Assume a simple random sample x 1, …, x n from this population. Derive the densities of (1): x 1 + ⋯ + x n; (2): \(\bar {x}\); (3): \(\bar {x}-\alpha \beta \); (4): the standardized sample mean. Show that the densities in all these cases are still gamma densities, possibly relocated, for all finite values of n, however large n may be.

2.6.5

Consider the density \(f(x)=\frac {c}{x^{\alpha }},1\le x<\infty \) and zero elsewhere, where c is the normalizing constant. Evaluate c stating the relevant conditions. State the central limit theorem for this population, stating the relevant conditions.

2.7. Parameter Estimation: Point Estimation

There exist several methods for estimating the parameters of a given density/probability function, based on a simple random sample of size n (iid variables from the population designated by the density/probability function). The most popular methods of point estimation are the method of maximum likelihood and the method of moments.

2.7.1. The method of moments and the method of maximum likelihood

The likelihood function L(θ) is the joint density/probability function of the sample values, at an observed sample point, x 1, …, x n. As a function of θ, L(θ), or a one-to-one function thereof, is maximized in order to determine the most likely value of θ in terms of a function of the given sample. This estimation process is referred to as the method of maximum likelihood.

Let \(m_r=\frac {\sum _{j=1}^nx_j^r}{n}\) denote the r-th integer moment of the sample, where x 1, …, x n is the observed sample point, the corresponding population r-th moment being E[x r], where E denotes the expected value. According to the method of moments, the estimates of the parameters are obtained by solving m r = E[x r], r = 1, 2, … .

For example, consider a N 1(μ, σ 2) population with density

$$\displaystyle \begin{aligned}f(x)=\frac{1}{\sqrt{2\pi}\sigma}\text{e}^{-\frac{1}{2\sigma^2}(x-\mu)^2},\ -\infty<x<\infty,\ -\infty<\mu<\infty,\ \sigma>0,{} \end{aligned} $$
(2.7.1)

where μ and σ 2 are the parameters here. Let x 1, …, x n be a simple random sample from this population. Then, the joint density of x 1, …, x n, denoted by L = L(x 1, …, x n;μ, σ 2), is

$$\displaystyle \begin{aligned} L&=\frac{1}{[\sqrt{2\pi}\sigma]^n}\text{e}^{-\frac{1}{2\sigma^2}\sum_{j=1}^n(x_j-\mu)^2}=\frac{1}{[\sqrt{2\pi}\sigma]^n}\text{e}^{-\frac{1}{2\sigma^2}[\sum_{j=1}^n(x_j-\bar{x})^2+n(\bar{x}-\mu)^2]}\ \Rightarrow\\ \ln L&=-n\ln(\sqrt{2\pi}\sigma)-\frac{1}{2\sigma^2}\Big[\sum_{j=1}^n(x_j-\bar{x})^2+ n(\bar{x}-\mu)^2\Big],\ \bar{x}=\frac{1}{n}(x_1+\cdots+x_n).{} \end{aligned} $$
(2.7.2)

Maximizing L or \(\ln L\), which is equivalent since \(\ln L\) is a one-to-one function of L, with respect to μ and θ = σ 2, and solving for μ and σ 2, produces the maximum likelihood estimators (MLE’s). An observed value of the estimator is the corresponding estimate. It follows from a basic result in Calculus that the extrema of L can be determined by solving the equations

$$\displaystyle \begin{aligned}\frac{\partial}{\partial \mu}\ln L=0\end{aligned} $$
(i)

and

$$\displaystyle \begin{aligned}\frac{\partial}{\partial\theta}\ln L=0,\ \theta=\sigma^2. \end{aligned} $$
(ii)

Equation (i) produces the solution \(\mu =\bar {x}\) so that \(\bar {x}\) is the MLE of μ. Note that \(\bar {x}\) is a random variable and that \(\bar {x}\) evaluated at a sample point or at a set of observations on x 1, …, x n produces the corresponding estimate. We will denote both the estimator and estimate of μ by \(\hat {\mu }\). As well, we will utilize the same abbreviation, namely, MLE for the maximum likelihood estimator and the corresponding estimate. Solving (ii) and substituting \(\hat {\mu }\) for μ, we have \(\hat {\theta }={\hat {\sigma }}^2=\frac {1}{n}\sum _{j=1}^n(x_j-\bar {x})^2=s^2=\) the sample variance as an estimate of θ = σ 2. Does the point \((\bar {x},s^2)\) correspond to a local maximum, a local minimum or a saddle point? Since the matrix of second order partial derivatives at the point \((\bar {x},s^2)\) is negative definite, the critical point \((\bar {x},s^2)\) corresponds to a maximum. Thus, in this case, \(\hat {\mu }=\bar {x}\) and \({\hat {\sigma }}^2=s^2\) are the maximum likelihood estimators/estimates of the parameters μ and σ 2, respectively. If we were to differentiate with respect to σ instead of θ = σ 2 in (ii), we would obtain the same estimators, since for any differentiable function g(t), \(\frac {\text{d}}{\text{d}t}g(t)=0\Rightarrow \frac {\text{d}}{\text{d}\phi (t)}g(t)=0\) if \(\frac {\text{d}}{\text{d}t}\phi (t)\ne 0\). In this instance, ϕ(σ) = σ 2 and \(\frac {\text{d}}{\text{d}\sigma }\sigma ^2\ne 0\).

For obtaining the moment estimates, we equate the sample integer moments to the corresponding population moments, that is, we let m r = E[x r], r = 1, 2, two equations being required to estimate μ and σ 2. Note that \(m_1=\bar {x}\) and \(m_2=\frac {1}{n}\sum _{j=1}^nx_j^2\). Then, consider the equations

$$\displaystyle \begin{aligned}\bar{x}=E[x]=\mu \ \mbox{ and }\ \ \frac{1}{n}\sum_{j=1}^nx_j^2=E[x^2]\Rightarrow s^2=\sigma^2.\end{aligned}$$

Thus, the moment estimators/estimates of μ and σ 2, which are \(\hat {\mu }=\bar {x}\) and \( {\hat {\sigma }}^2=s^2\), happen to be identical to the MLE’s in this case.
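For a simulated Gaussian sample, the estimators just derived are immediate to compute; a minimal sketch, not part of the text:

```python
# MLE / moment estimates for a simulated N(mu, sigma^2) sample: the sample mean and the
# sample variance with divisor n.
import numpy as np

mu, sigma, n = 1.5, 2.0, 500
rng = np.random.default_rng(6)
x = rng.normal(mu, sigma, n)

mu_hat = x.mean()                      # \hat{mu} = \bar{x}
sigma2_hat = np.mean((x - mu_hat)**2)  # \hat{sigma}^2 = s^2 (divisor n, not n - 1)
print(mu_hat, sigma2_hat)
```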

Let us consider the type-1 beta population with parameters (α, β) whose density is

$$\displaystyle \begin{aligned}f_1(x)=\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1},\ 0\le x\le 1,\ \alpha>0,\ \beta>0,{}\end{aligned} $$
(2.7.3)

and zero otherwise. In this case, the likelihood function contains gamma functions and the derivatives of gamma functions involve psi and zeta functions. Accordingly, the maximum likelihood approach is not very convenient here. However, we can determine moment estimates without much difficulty from (2.7.3). The first two population integer moments are obtained directly from a representation of the h-th moment:

$$\displaystyle \begin{aligned} E[x^h]&=\frac{\varGamma(\alpha+h)}{\varGamma(\alpha)}\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha+\beta+h)},\ \Re(\alpha+h)>0\Rightarrow\\ E[x]&=\frac{\varGamma(\alpha+1)}{\varGamma(\alpha)}\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha+\beta+1)}=\frac{\alpha}{\alpha+\beta}{} \end{aligned} $$
(2.7.4)
$$\displaystyle \begin{aligned} E[x^2]&=\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}=E[x]\ \frac{\alpha+1}{\alpha+\beta+1}.{} \end{aligned} $$
(2.7.5)

Equating the sample moments to the corresponding population moments, that is, letting m 1 = E[x] and m 2 = E[x 2], it follows from (2.7.4) that

$$\displaystyle \begin{aligned}\bar{x}=\frac{\alpha}{\alpha+\beta}\Rightarrow \frac{\beta}{\alpha}=\frac{1-\bar{x}}{\bar{x}}; \end{aligned} $$
(iii)

Then, from (2.7.5), we have

$$\displaystyle \begin{aligned}\frac{\frac{1}{n}\sum_{j=1}^nx_j^2}{\bar{x}}=\frac{\alpha+1}{\alpha+\beta+1}=\frac{1}{1+\frac{\beta}{\alpha+1}}\Rightarrow \frac{\bar{x}-\sum_{j=1}^nx_j^2/n}{\sum_{j=1}^nx_j^2/n}=\frac{\beta}{\alpha+1}.\end{aligned} $$
(iv)

The parameter β can be eliminated from (iii) and (iv), which yields an estimate of α; \(\hat {\beta }\) is then obtained from (iii). Thus, the moment estimates are available from the equations m r = E[x r], r = 1, 2, even though these equations are nonlinear in the parameters α and β. The method of maximum likelihood or the method of moments can similarly yield parameter estimates for populations that are otherwise distributed.
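Eliminating β between (iii) and (iv) gives \(\hat{\alpha}=\bar{x}(\bar{x}-m_2)/(m_2-\bar{x}^2)\) and then \(\hat{\beta}=\hat{\alpha}(1-\bar{x})/\bar{x}\); the following sketch, not part of the text and using an arbitrary simulated sample, carries out this computation.

```python
# Method of moments for a type-1 beta population, solving (iii) and (iv).
import numpy as np

rng = np.random.default_rng(7)
x = rng.beta(2.0, 5.0, 10_000)         # simulated sample with alpha = 2, beta = 5

m1, m2 = x.mean(), np.mean(x**2)
alpha_hat = m1 * (m1 - m2) / (m2 - m1**2)
beta_hat = alpha_hat * (1 - m1) / m1
print(alpha_hat, beta_hat)             # should be close to 2 and 5
```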

2.7.2. Bayes’ estimates

This procedure is more relevant when the parameters in a given statistical density/probability function have their own distributions. For example, let the real scalar variable x be discrete and have a binomial probability law for the fixed (given) parameter p, that is, let \(f(x|p)={n\choose x}p^x(1-p)^{n-x},\ x=0,1,\ldots ,n,\ 0<p<1,\) and f(x|p) = 0 elsewhere be the conditional probability function. Let p have a prior type-1 beta density with known parameters α and β, that is, let the prior density of p be

$$\displaystyle \begin{aligned}g(p)=\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}p^{\alpha-1}(1-p)^{\beta-1},\ 0<p<1,\ \alpha>0,\ \beta>0\end{aligned} $$

and g(p) = 0 elsewhere. Then, the joint probability function is f(x, p) = f(x|p)g(p) and the unconditional probability function of x, denoted by f 1(x), is

$$\displaystyle \begin{aligned}f_1(x)=\int_0^1f(x|p)g(p)\text{d}p={n\choose x}\frac{\varGamma(\alpha+\beta)}{\varGamma(\alpha)\varGamma(\beta)}\frac{\varGamma(\alpha+x)\varGamma(\beta+n-x)}{\varGamma(\alpha+\beta+n)},\ x=0,1,\ldots,n.\end{aligned}$$

Thus, the posterior density of p, given x, denoted by g 1(p|x), is

$$\displaystyle \begin{aligned}g_1(p|x)=\frac{f(x,p)}{f_1(x)}=\frac{\varGamma(\alpha+\beta+n)}{\varGamma(\alpha+x)\varGamma(\beta+n-x)}p^{\alpha+x-1}(1-p)^{\beta+n-x-1}. \end{aligned}$$

Accordingly, the expected value of p in this conditional distribution of p given x, which is called the posterior density of p, is known as the Bayes estimate of p:

$$\displaystyle \begin{aligned} E[p|x]&=\frac{\varGamma(\alpha+\beta+n)}{\varGamma(\alpha+x)\varGamma(\beta+n-x)}\int_0^1p\, p^{\alpha+x-1}(1-p)^{\beta+n-x-1}\text{d}p\\ &=\frac{\varGamma(\alpha+\beta+n)}{\varGamma(\alpha+x)\varGamma(\beta+n-x)}\frac{\varGamma(\alpha+x+1)\varGamma(\beta+n-x)}{\varGamma(\alpha+\beta+n+1)}\\ &=\frac{\varGamma(\alpha+x+1)}{\varGamma(\alpha+x)}\frac{\varGamma(\alpha+\beta+n)}{\varGamma(\alpha+\beta+n+1)}=\frac{\alpha+x}{\alpha+\beta+n}.\end{aligned} $$

The prior estimate/estimator of p as obtained from the binomial distribution is \(\frac {x}{n}\) and the posterior estimate or the Bayes estimate of p is

$$\displaystyle \begin{aligned}E[p|x]=\frac{\alpha+x}{\alpha+\beta+n}, \end{aligned}$$

so that \(\frac {x}{n}\) is revised to \(\frac {\alpha +x}{\alpha +\beta +n}\). In general, if the conditional density/probability function of x given θ is f(x|θ) and the prior density/probability function of θ is g(θ), then the posterior density of θ is g 1(θ|x) and E[θ|x] or the expected value of θ in the conditional distribution of θ given x is the Bayes estimate of θ.
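A tiny numerical illustration, not from the text, of the shrinkage effected by the Bayes estimate, with arbitrary prior parameters and data:

```python
# Beta prior, binomial likelihood: the posterior mean (alpha + x) / (alpha + beta + n)
# pulls the estimate x/n toward the prior mean alpha / (alpha + beta).
alpha, beta = 2.0, 3.0
n, x = 20, 14

print(x / n)                               # estimate of p from the binomial alone
print((alpha + x) / (alpha + beta + n))    # Bayes estimate of p
```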

2.7.3. Interval estimation

Before concluding this section, the concept of confidence intervals or interval estimation of a parameter will be briefly touched upon. For example, let x 1, …, x n be iid N 1(μ, σ 2) and let \(\bar {x}=\frac {1}{n}(x_1+\cdots +x_n)\). Then, \(\bar {x}\sim N_1(\mu ,\frac {\sigma ^2}{n}),\ (\bar {x}-\mu )\sim N_1(0,\frac {\sigma ^2}{n}) \) and \( z=\frac {\sqrt {n}}{\sigma }(\bar {x}-\mu )\sim N_1(0,1)\). Since the standard normal density N 1(0, 1) is free of any parameter, one can select two percentiles, say, a and b, from a standard normal table and make a probability statement such as Pr{a < z < b} = 1 − α for every given α; for instance, Pr{−1.96 ≤ z ≤ 1.96}≈ 0.95 for α = 0.05. Let \(Pr\{-z_{\frac {\alpha }{2}}\le z\le z_{\frac {\alpha }{2}}\}=1-\alpha \) where \(z_{\frac {\alpha }{2}}\) is such that \(Pr(z>z_{\frac {\alpha }{2}})={\frac {\alpha }{2}}\). The following inequalities are mathematically equivalent and hence the probabilities associated with the corresponding intervals are equal:

$$\displaystyle \begin{aligned} -z_{\frac{\alpha}{2}}\le z\le z_{\frac{\alpha}{2}}&\Leftrightarrow -z_{\frac{\alpha}{2}}\le \frac{\sqrt{n}}{\sigma}(\bar{x}-\mu)\le z_{\frac{\alpha}{2}}\\ &\Leftrightarrow\mu-z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\le \bar{x}\le \mu+z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\\ &\Leftrightarrow \bar{x}-z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\le \mu\le \bar{x}+z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}.\end{aligned} $$
(i)

Accordingly,

$$\displaystyle \begin{aligned}Pr\{\mu-z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\le \bar{x}\le \mu+z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\}=1-\alpha\ \end{aligned} $$
(ii)

$$\displaystyle \begin{aligned}Pr\{\bar{x}-z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\le \mu\le \bar{x}+z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}\}=1-\alpha.\end{aligned} $$
(iii)

Note that (iii) is not a usual probability statement as opposed to (ii), which is a probability statement on a random variable. In (iii), the interval \([\bar {x}-z_{\frac {\alpha }{2}}\frac {\sigma }{\sqrt {n}}, \bar {x}+z_{\frac {\alpha }{2}}\frac {\sigma }{\sqrt {n}}]\) is random and μ is a constant. This can be given the interpretation that the random interval covers the parameter μ with probability 1 − α, which means that we are 100(1 − α)% confident that the random interval will cover the unknown parameter μ or that the random interval is an interval estimator and an observed value of the interval is the interval estimate of μ. We could construct such an interval only because the distribution of z is parameter-free, which enabled us to make a probability statement on μ.

In general, if u = u(x 1, …, x n, θ) is a function of the sample values and the parameter θ (which may be a vector of parameters) and if the distribution of u is free of all parameters, then such a quantity is referred to as a pivotal quantity. Since the distribution of the pivotal quantity is parameter-free, we can find two numbers a and b such that Pr{a ≤ u ≤ b} = 1 − α for every given α. If it is possible to convert the statement a ≤ u ≤ b into a mathematically equivalent statement of the type u 1 ≤ θ ≤ u 2, so that Pr{u 1 ≤ θ ≤ u 2} = 1 − α for every given α, then [u 1, u 2] is called a 100(1 − α)% confidence interval or interval estimate for θ, u 1 and u 2 being referred to as the lower confidence limit and the upper confidence limit, and 1 − α being called the confidence coefficient. Additional results on interval estimation and the construction of confidence intervals are, for instance, presented in Mathai and Haubold (2017b).
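A brief sketch, not part of the text, computing such an interval estimate for a simulated sample with known σ, using the pivotal quantity z:

```python
# 95% confidence interval for mu when sigma is known, based on z = sqrt(n)(xbar - mu)/sigma.
import numpy as np
from scipy import stats

sigma, n = 2.0, 40
rng = np.random.default_rng(8)
x = rng.normal(10.0, sigma, n)             # simulated sample with true mu = 10

z = stats.norm.ppf(0.975)                  # z_{alpha/2} for alpha = 0.05
half = z * sigma / np.sqrt(n)
print(x.mean() - half, x.mean() + half)    # observed interval estimate of mu
```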

Exercises 2.7

2.7.1

Obtain the method of moments estimators for the parameters (α, β) in a real type-2 beta population. Assume that a simple random sample of size n is available.

2.7.2

Obtain the estimate/estimator of the parameters by the method of moments and the method of maximum likelihood in the real (1): exponential population with parameter θ, (2): Poisson population with parameter λ. Assume that a simple random sample of size n is available.

2.7.3

Let x 1, …, x n be a simple random sample of size n from a point Bernoulli population \(f_2(x)=p^x(1-p)^{1-x},\ x=0,1,\ 0<p<1,\) and zero elsewhere. Obtain the MLE as well as the moment estimator for p. [Note: These will be the same estimators for p in all the populations based on Bernoulli trials, such as the binomial population, geometric population and negative binomial population].

2.7.4

If possible, obtain moment estimators for the parameters of a real generalized gamma population, \(f_3(x)=c~x^{\alpha -1}\text{e}^{-bx^{\delta }},\alpha >0,b>0,\delta >0, x\ge 0\) and zero elsewhere, where c is the normalizing constant.

2.7.5

If possible, obtain the MLE of the parameters a, b in the following real uniform population \(f_4(x)=\frac {1}{b-a}, b>a, a\le x<b\) and zero elsewhere. What are the MLE if a < x < b? What are the moment estimators in these two situations?

2.7.6

Construct the Bayes’ estimate/estimator of the parameter λ in a Poisson probability law if the prior density for λ is a gamma density with known parameters (α, β).

2.7.7

By selecting the appropriate pivotal quantities, construct a 95% confidence interval for (1): Poisson parameter λ; (2): Exponential parameter θ; (3): Normal parameter σ 2; (4): θ in a uniform density \(f(x)=\frac {1}{\theta }, 0\le x\le \theta \) and zero elsewhere.