Abstract
This paper derives the risk functions of a class of shrinkage estimators for the mean parameter matrix of a matrix variate elliptically contoured distribution. It is shown that the positive-rule shrinkage estimator outperforms the shrinkage and unrestricted (maximum likelihood) estimators. To illustrate the findings, the relative risk functions for different degrees of freedom are given for a multivariate t distribution. Shrinkage estimators for the matrix variate regression model under matrix normal, matrix t, or Pearson VII error distributions are special cases of the results in this paper.
References
Anderson TW, Fang KT (1990) Inference in multivariate elliptically contoured distribution based on maximum likelihood. In: Fang KT, Anderson TW (eds) Statistical inference in elliptically contoured and related distribution. Allerton Press, New York, pp 201–216
Arashi M (2009) Preliminary test estimation of the mean vector under balanced loss function. J Stat Res 43(2):55–65
Arashi M, Nadarajah S (2012) On singular elliptical models. Manuscript
Arashi M, Tabatabaey SMM (2010) A note on classical Stein-type estimators in elliptically contoured models. J Stat Plan Inference 140:1206–1213
Arashi M, Saleh AKMdE, Tabatabaey SMM (2010) Estimation of parameters of parallelism model with elliptically distributed errors. Metrika 71:79–100
Arashi M, Saleh AKMdE, Tabatabaey SMM (2012a) Regression model with elliptically contoured errors. Stat J Theor Appl Stat. doi:10.1080/02331888.2012.694442
Arashi M, Roux JJJ, Bekker A (2012b) Advance mathematical statistics for elliptical models, Technical Report, 2012/02 ISBN: 978-1-86854-983-2. University of Pretoria, South Africa
Bancroft TA (1944) On biases in estimation due to use of preliminary tests of significance. Ann Math Stat 15:190–204
Bancroft TA (1964) Analysis and inference for incompletely specified models involving the use of preliminary test(s) of significance. Biometrics 20:427–442
Benda N (1996) Pre-test estimation and design in the linear model. J Stat Plan Inference 52:225–240
Chu KC (1973) Estimation and decision for linear systems with elliptically random process. IEEE Trans Autom Control 18:499–505
Dawid AP (1977) Spherical matrix distributions and a multivariate model. J R Stat Soc B 39(2):254–261
Díaz-García JA, Leiva-Sánchez V, Galea M (2002) Singular elliptical distribution: density and applications. Commun Stat Theory Methods 31(5):665–681
Fang KT, Li R (1999) Bayesian statistical inference on elliptical matrix distributions. J Multivar Anal 70(1):66–85
Giles AJ (1991) Pretesting for linear restrictions in a regression model with spherically symmetric distributions. J Econom 50:377–398
Gnanadesikan R (1977) Methods for statistical data analysis of multivariate observations. Wiley, New York
Gupta AK, Varga T (1995) Normal mixture representations of matrix variate elliptically contoured distributions. Sankhyā 57:68–78
Gupta AK, Saleh AKMdE, Sen PK (1989) Improved estimation in a contingency table: independence structure. J Am Stat Assoc 84:525–532
Han C-P, Bancroft TA (1968) On Pooling means when variance is unknown. J Am Stat Assoc 63:1333–1342
James W, Stein C (1961) Estimation with quadratic loss. In: Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley, CA
Judge GG, Bock ME (1978) The statistical implications of pre-test and Stein-rule estimators in econometrics. North Holland, Amsterdam
Kibria BMG (1996) On shrinkage ridge regression estimators for restricted linear models with multivariate t disturbances. Students 1(3):177–188
Kibria BMG, Saleh AKMdE (2006) Optimum critical value for pretest estimators. Commun Stat Simul Comput 35(2):309–319
Kibria BMG, Saleh AKMdE (2011) Improving the estimators of the parameters of a probit regression model: a ridge regression approach. J Stat Plan Inference 142:1421–1435
Kollo T, von Rosen D (2005) Advanced multivariate statistics with matrices. Springer, Berlin
Nkurunziza S (2011) Shrinkage strategy in stratified random sample subject to measurement error. Stat Probab Lett 81(2):317–325
Nkurunziza S (2012) The risk of pretest and shrinkage estimators. Stat J Theor Appl Stat 46(3):305–312
Nkurunziza S, Ahmed SE (2011) Estimation strategies for the regression coefficient parameter matrix in multivariate multiple regression. Statistica Neerlandica 65(4):387–406
Ohtani K (1993) A comparison of the Stein-rule and positive part Stein-rule estimators in a misspecified linear regression models. Econom Theory 9:668–679
Provost SB, Cheong Y-H (2002) The distribution of Hermitian quadratic forms in elliptically contoured random vectors. J Stat Plan Inference 102:303–316
Saleh AKMdE (2006) Preliminary test and Stein-type estimation with applications. Wiley, New York
Saleh AKMdE, Sen PK (1985) On shrinkage M-estimators of location parameters. Commun Stat Theory Methods 24:2313–2329
Saleh AKMdE, Picek J, Kalian J (2010) Nonparametric estimation of regression parameters in measurement errors model. Metron LXVII:177–200
Saleh AKMdE, Picek J, Kalian J (2012) R-estimation of the parameters of a multiple regression model with measurement errors. Metrika 75:311–328
Sen PK, Saleh AKMdE (1985) On some shrinkage estimators of multivariate location. Ann Stat 13:272–281
Singh RS (1989) Estimation of error variance in linear regression models with errors having multivariate Student-t distribution with unknown degrees of freedom. Econ Lett 27:47–53
Singh RS (1991) James–Stein rule estimators in linear regression models with multivariate t distributed error. Aust J Stat 33:145–158
Tabatabaey SMM, Saleh AKMdE, Kibria BMG (2004) Estimation strategies for the parameters of the linear regression models under spherically symmetric distributions. J Stat Res 38:13–31
Wang S-D, Kuo T-S, Hsu C-F (1986) Trace bounds on the solution of the algebraic matrix Riccati and Lyapunov equation. IEEE Trans Autom Control 31(7):654–656
Zellner A (1976) Bayesian and non-Bayesian analysis of the regression model with multivariate Student \(t\) error terms. J Am Stat Assoc 71:400–405
Acknowledgments
The authors would like to thank Dr. Sévérien Nkurunziza at the University of Windsor for reading this manuscript and providing us with constructive suggestions which improved the quality and presentation of the paper greatly. We are also thankful to the Editor and two anonymous referees for their valuable and constructive comments which certainly improved the presentation and quality of the paper.
Appendix
Here we present some tools needed for mathematical computations with matrices. For the purpose of this appendix, we focus specifically on matrix variate elliptically contoured (MEC) distributions.
Let \(\varvec{X}\) be an \(n\times p\) random matrix, which can be expressed in terms of its elements, columns and rows as
Here \(\varvec{x}_{(1)},\ldots ,\varvec{x}_{(n)}\) can be regarded as a sample of size \(n\) from a \(p\)-dimensional population.
As pointed out by Fang and Li (1999), there are many (not completely different) ways to define elliptical matrix distributions. Four classes of elliptical matrix distributions are defined and discussed by Dawid (1977) and Anderson and Fang (1990). For the purpose of this appendix we specifically consider the following situation.
Definition 6.1
The \(n\times p\) random matrix \(\varvec{X}\) has a MEC distribution if its density has the form
where \(\varvec{\mu }\in \mathbb {R}^{n\times p}\), and \(\varvec{\Omega }\) and \(\varvec{\Sigma }\) are \(p\times p\) and \(n\times n\) positive definite matrices, respectively. This distribution is denoted by \(\varvec{X}\sim \mathcal {E}_{n,p}(\varvec{\mu },\varvec{\Sigma }\otimes \varvec{\Omega },f)\). For notational convenience we may also write \(\varvec{X}\sim \mathcal {E}_{n,p}(\varvec{\mu },\varvec{\Sigma },\varvec{\Omega },f)\) where needed.
Definition 6.1 imposes the condition \(f(\varvec{A}\varvec{B})=f(\varvec{B}\varvec{A})\) on the density generator \(f\) for any \(p\times p\) positive definite symmetric matrices \(\varvec{A}\) and \(\varvec{B}\). Throughout, without loss of generality, we take the \(\mathop {\mathrm{tr}}\nolimits \) operation from the argument of \(f(\cdot )\).
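The condition \(f(\varvec{A}\varvec{B})=f(\varvec{B}\varvec{A})\) can be illustrated numerically. The sketch below is only an illustration with hypothetical names and arbitrarily chosen \(2\times 2\) matrices. It checks two generator shapes of the kind that appear in the examples that follow: one depending on the trace and one depending on the determinant of the identity plus the argument. Both are invariant under swapping the factors even though \(\varvec{A}\varvec{B}\ne \varvec{B}\varvec{A}\).

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def add_identity(a):
    return [[a[i][j] + (1.0 if i == j else 0.0) for j in range(len(a))]
            for i in range(len(a))]

def f_trace(m):
    """Illustrative generator depending on its argument through the trace."""
    return math.exp(-0.5 * trace(m))

def f_det(m, power=3.0):
    """Illustrative generator depending on det(I + argument); power is arbitrary."""
    return det2(add_identity(m)) ** (-power)

# Two arbitrary symmetric positive definite matrices (assumed for illustration).
A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.5, -0.3], [-0.3, 0.8]]
AB, BA = matmul(A, B), matmul(B, A)

print("AB == BA:", AB == BA)
print("trace generator:", f_trace(AB), f_trace(BA))
print("determinant generator:", f_det(AB), f_det(BA))
```

The trace case follows from \(\mathop {\mathrm{tr}}\nolimits (\varvec{A}\varvec{B})=\mathop {\mathrm{tr}}\nolimits (\varvec{B}\varvec{A})\), and the determinant case from \(|\varvec{I}+\varvec{A}\varvec{B}|=|\varvec{I}+\varvec{B}\varvec{A}|\).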
Some examples of MEC distributions are given below.
(i) Matrix Variate Normal (MN) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an MN distribution with mean \(\varvec{M}\) and row and column covariance matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, denoted by \(\varvec{X}\sim MN_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega })\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{(2\pi )^{\frac{np}{2}}}\; \exp \left\{ -\frac{1}{2}\mathop {\mathrm{tr}}\nolimits \left[ \varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right] \right\} . \end{aligned}$$
(ii) Matrix Variate Student-t (MT) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an MT distribution with mean \(\varvec{M}\), row and column scale matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, and \(\nu \) d.f., denoted by \(\varvec{X}\sim MT_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega },\nu )\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{g_{n,p}}\; \left| \varvec{I}_p+\varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right| ^{-\frac{n+p+\nu -1}{2}}, \end{aligned}$$
where
$$\begin{aligned} g_{n,p}=\frac{(\nu \pi )^{\frac{np}{2}}\Gamma _p\left( \frac{\nu +p-1}{2}\right) }{\Gamma _p\left( \frac{\nu +n+p-1}{2}\right) }. \end{aligned}$$
(iii) Matrix Variate Pearson Type-VII (MPVII) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an MPVII distribution with mean \(\varvec{M}\), row and column scale matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, and parameters \(m\) and \(q\), denoted by \(\varvec{X}\sim MPVII_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega },m,q)\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{h_{n,p,m,q}}\; \left| \varvec{I}_p+\frac{1}{q}\varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right| ^{-m}, \end{aligned}$$
where
$$\begin{aligned} h_{n,p,m,q}=\frac{(q\pi )^{\frac{np}{2}}\Gamma _p\left( m-\frac{n}{2}\right) }{\Gamma _p\left( m\right) }. \end{aligned}$$
(iv) Matrix Variate Power Exponential (MPE) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an MPE distribution with mean \(\varvec{M}\), row and column scale matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, and parameters \(r\) and \(s\), denoted by \(\varvec{X}\sim MPE_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega },r,s)\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{j_{n,p,r,s}}\; \exp \left\{ -\frac{r}{2}\left( \mathop {\mathrm{tr}}\nolimits \left[ \varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right] \right) ^s\right\} , \end{aligned}$$
where
$$\begin{aligned} j_{n,p,r,s}=\frac{(2\pi )^{\frac{np}{2}}\Gamma \left( \frac{np}{2s}\right) }{s\Gamma \left( \frac{np}{2}\right) r^{\frac{np}{2s}}}. \end{aligned}$$
(v) Matrix Variate Laplace (ML) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an ML distribution with mean \(\varvec{M}\) and row and column scale matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, denoted by \(\varvec{X}\sim ML_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega })\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{l_{n,p}}\; \exp \left\{ -\frac{\sqrt{2}}{2}\left( \mathop {\mathrm{tr}}\nolimits \left[ \varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right] \right) ^{\frac{1}{2}}\right\} , \end{aligned}$$
where
$$\begin{aligned} l_{n,p}=\frac{2\pi ^{\frac{np}{2}}\Gamma \left( np\right) }{\Gamma \left( \frac{np}{2}\right) }. \end{aligned}$$
(vi) Matrix Variate Kotz-Type (MK) Distribution: \(\varvec{X}\in \mathbb {R}^{n\times p}\) has an MK distribution with mean \(\varvec{M}\), row and column scale matrices \(\varvec{\Omega }\) and \(\varvec{\Sigma }\), respectively, and parameter \(r\), denoted by \(\varvec{X}\sim MK_{n,p}(\varvec{M},\varvec{\Sigma },\varvec{\Omega },r)\), if its pdf is given by
$$\begin{aligned} f(\varvec{X})=\frac{|\varvec{\Omega }|^{-\frac{n}{2}}|\varvec{\Sigma }|^{-\frac{p}{2}}}{t_{n,p,r}}\; \exp \left\{ -\frac{r}{2}\left( \mathop {\mathrm{tr}}\nolimits \left[ \varvec{\Omega }^{-1}(\varvec{X}-\varvec{M})'\varvec{\Sigma }^{-1}(\varvec{X}-\varvec{M})\right] \right) \right\} , \end{aligned}$$
where
$$\begin{aligned} t_{n,p,r}=\frac{(2\pi )^{\frac{np}{2}}\Gamma \left( \frac{np}{2}\right) }{\Gamma \left( \frac{np}{2}\right) r^{\frac{np}{2}}}. \end{aligned}$$
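As a rough consistency check on the normalizing constants listed above, the following sketch evaluates them on the log scale using only the Python standard library. The function names are hypothetical. The multivariate gamma function is assumed to follow its usual definition, \(\Gamma _p(a)=\pi ^{p(p-1)/4}\prod _{j=1}^{p}\Gamma \left( a-\frac{j-1}{2}\right) \), which the text does not spell out. Under that assumption, the MPVII constant with \(q=\nu \) and \(m=\frac{\nu +n+p-1}{2}\) coincides with the MT constant, and the MPE and MK constants with \(r=s=1\) coincide with the MN constant \((2\pi )^{\frac{np}{2}}\).

```python
import math

def log_multigamma(a, p):
    """log of the multivariate gamma function Gamma_p(a) (standard definition, assumed)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - (j - 1) / 2.0) for j in range(1, p + 1))

def log_g(n, p, nu):
    """log of the MT constant g_{n,p} as printed in item (ii)."""
    return (0.5 * n * p * math.log(nu * math.pi)
            + log_multigamma((nu + p - 1) / 2.0, p)
            - log_multigamma((nu + n + p - 1) / 2.0, p))

def log_h(n, p, m, q):
    """log of the MPVII constant h_{n,p,m,q} as printed in item (iii)."""
    return (0.5 * n * p * math.log(q * math.pi)
            + log_multigamma(m - n / 2.0, p)
            - log_multigamma(m, p))

def log_j(n, p, r, s):
    """log of the MPE constant j_{n,p,r,s} as printed in item (iv)."""
    N = n * p
    return (0.5 * N * math.log(2.0 * math.pi) + math.lgamma(N / (2.0 * s))
            - math.log(s) - math.lgamma(N / 2.0) - (N / (2.0 * s)) * math.log(r))

def log_t(n, p, r):
    """log of the MK constant t_{n,p,r} as printed in item (vi)."""
    N = n * p
    return 0.5 * N * math.log(2.0 * math.pi) - 0.5 * N * math.log(r)

def log_mn(n, p):
    """log of the MN constant (2*pi)^{np/2} from item (i)."""
    return 0.5 * n * p * math.log(2.0 * math.pi)

n, p, nu = 4, 3, 5.0  # arbitrary illustrative dimensions and degrees of freedom
print(log_g(n, p, nu), log_h(n, p, (nu + n + p - 1) / 2.0, nu))
print(log_j(n, p, 1.0, 1.0), log_t(n, p, 1.0), log_mn(n, p))
```

Working with logarithms avoids overflow in the gamma functions when \(np\) is large.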
Theorem 6.1
Let \(\varvec{X}\sim \mathcal {E}_{n,p}(\varvec{\mu },\varvec{\Sigma }\otimes \varvec{\Omega }, f)\) where the p.d.f. \(g(\varvec{X})\) of \(\varvec{X}\) is defined by
If \(h(t),\,t\in [0,\infty )\) has the inverse Laplace transform (denoted by \(\mathcal {L}^{-1}[h(t)]\)), then we have
where \(f_{\mathcal {N}\left( \varvec{\mu },z^{-1}\varvec{\Sigma }\otimes \varvec{\Omega }\right) }(\varvec{X})\) stands for the p.d.f. of the \(n\times p\) matrix \(\varvec{X}\) distributed as matrix variate normal with the mean matrix \(\varvec{\mu }\) and the covariance matrix \(z^{-1}\varvec{\Sigma }\otimes \varvec{\Omega }\), and \(w(z)\) is the weight function given by
For the proof we refer the readers to Theorem 4.2.1 of Gupta and Varga (1995).
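To make the mixture representation concrete, the following sketch checks it numerically in the simplest special case, \(n=p=1\) with unit scale, where the elliptical law is the univariate Student-t. It assumes the well-known weight for that case, a gamma density with shape \(\nu /2\) and rate \(\nu /2\); this weight is a standard fact brought in for illustration and is not derived from the theorem's formula for \(w(z)\). The function names, the integration range and the step count are likewise arbitrary choices.

```python
import math

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gamma_weight(z, nu):
    """Assumed weight w(z): Gamma(shape=nu/2, rate=nu/2) density at z > 0."""
    a = nu / 2.0
    return math.exp(a * math.log(a) + (a - 1.0) * math.log(z) - a * z - math.lgamma(a))

def student_t_pdf(x, mu, nu):
    """Closed-form Student-t density with nu degrees of freedom and unit scale."""
    c = math.exp(math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0))
    c /= math.sqrt(nu * math.pi)
    return c * (1.0 + (x - mu) ** 2 / nu) ** (-(nu + 1.0) / 2.0)

def mixture_pdf(x, mu, nu, upper=60.0, steps=100_000):
    """Midpoint-rule approximation of the integral of w(z) * N(x; mu, 1/z) over z."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h
        total += gamma_weight(z, nu) * normal_pdf(x, mu, 1.0 / z)
    return total * h

x, mu, nu = 0.7, 0.0, 5.0
print(mixture_pdf(x, mu, nu), student_t_pdf(x, mu, nu))
```

In this special case the weight is a genuine probability density, so the representation is an ordinary scale mixture of normals.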
Remark 6.1
An important issue that arises when considering Theorem 6.1 is the mixture representation given by (1.3) and the form of the weighting function \(w(.)\). Note that \(w(.)\) is not always nonnegative [see examples in Chu (1973), Provost and Cheong (2002) and Arashi et al. (2012b)]. This distinguishes the representation from the class of multivariate scale mixtures of matrix normal distributions.
Remark 6.2
It is worth noting that in the special case where \(\varvec{\Sigma }\) and \(\varvec{\Omega }\) are not invertible, i.e. for the singular matrix elliptically contoured distribution, one may obtain a result similar to the one stated above by the method proposed in Díaz-García et al. (2002). Further discussion on how to obtain a result similar to Theorem 6.1 using rank decomposition is provided in Arashi and Nadarajah (2012).
Remark 6.3
Theorem 6.1 states that \(f\) is the density of \(\mathcal {N}_{n,p}(\varvec{\mu },z^{-1}\varvec{\Sigma }\otimes \varvec{\Omega })\). It follows from the proof that \(f\) can equally be taken as any one of the following densities:
(1) \(\mathcal {N}_{n,p}(\varvec{\mu },z^{-1}\varvec{\Sigma },\varvec{\Omega })\), or
(2) \(\mathcal {N}_{n,p}(\varvec{\mu },\varvec{\Sigma },z^{-1}\varvec{\Omega })\), or
(3) \(\mathcal {N}_{n,p}(\varvec{\mu },z^{-\frac{1}{2}}\varvec{\Sigma },z^{-\frac{1}{2}}\varvec{\Omega })\).
This fact enables us to adopt whichever representation is needed for practical use.
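The three parametrizations in Remark 6.3 can be compared through the covariance they induce on the vectorized matrix. The sketch below, with hypothetical helper names and arbitrarily chosen small matrices, checks that placing the factor \(z^{-1}\) on \(\varvec{\Sigma }\), on \(\varvec{\Omega }\), or splitting it as \(z^{-\frac{1}{2}}\) on each yields the same Kronecker product \(z^{-1}\varvec{\Sigma }\otimes \varvec{\Omega }\).

```python
def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def scale(c, a):
    """Multiply every entry of matrix a by the scalar c."""
    return [[c * v for v in row] for row in a]

def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two equally sized matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Arbitrary illustrative scale matrices: Sigma is n x n (n = 2), Omega is p x p (p = 3).
Sigma = [[2.0, 0.3], [0.3, 1.0]]
Omega = [[1.0, 0.2, 0.0], [0.2, 1.5, 0.4], [0.0, 0.4, 2.0]]
z = 4.0

target = scale(1.0 / z, kron(Sigma, Omega))           # z^{-1} (Sigma kron Omega)
form1 = kron(scale(1.0 / z, Sigma), Omega)            # representation (1)
form2 = kron(Sigma, scale(1.0 / z, Omega))            # representation (2)
form3 = kron(scale(z ** -0.5, Sigma), scale(z ** -0.5, Omega))  # representation (3)

print(max_abs_diff(target, form1),
      max_abs_diff(target, form2),
      max_abs_diff(target, form3))
```

The equality is a direct consequence of the bilinearity of the Kronecker product in its two factors.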
Arashi, M., Kibria, B.M.G. & Tajadod, A. On shrinkage estimators in matrix variate elliptical models. Metrika 78, 29–44 (2015). https://doi.org/10.1007/s00184-014-0488-6
DOI: https://doi.org/10.1007/s00184-014-0488-6