Abstract
From an observable \((X,U)\) in \(\mathbb R^p \times \mathbb R^k\), we consider estimation of an unknown location parameter \(\theta \in \mathbb R^p\) under two distributional settings: the density of \((X,U)\) is either spherically symmetric with an unknown scale parameter \(\sigma \) or elliptically symmetric with an unknown covariance matrix \(\Sigma \). Estimators of \(\theta \) are evaluated under the classical invariant losses \(\Vert d - \theta \Vert ^2 / \sigma ^2\) and \((d - \theta )^t \Sigma ^{-1} (d - \theta )\) as well as under two respective data-based losses \(\Vert d - \theta \Vert ^2 / \Vert U\Vert ^2\) and \((d - \theta )^t S^{-1} (d - \theta )\), where \(\Vert U\Vert ^2\) estimates \(\sigma ^2\) and \(S\) estimates \(\Sigma \). We provide new Stein and Stein–Haff identities that allow analysis of risk for these two new losses, including a new identity that gives rise to unbiased estimates of risk (up to a multiple of \(1 / \sigma ^2\)) in the spherical case for a larger class of estimators than in Fourdrinier et al. (J Multivar Anal 85:24–39, 2003). Minimax estimators of Baranchik form illustrate the theory. The range of shrinkage of these estimators is found to be slightly larger for the data-based losses than for the usual invariant losses. It is also found that \(X\) is minimax with finite risk with respect to the data-based losses for many distributions for which its risk is infinite when calculated under the classical invariant losses. In these cases, including the multivariate \(t\) and, in particular, the multivariate Cauchy, we find improved shrinkage estimators as well.
References
Baranchik AJ (1970) A family of minimax estimators of the mean of a multivariate normal distribution. Ann Math Stat 41(2):642–645
Berger JO (1975) Minimax estimation of location vectors for a wide class of densities. Ann Stat 3(6):1318–1328
Cellier D, Fourdrinier D, Robert C (1989) Robust shrinkage estimators of the location parameter for elliptically symmetric distributions. J Multivar Anal 29:39–52
Cellier D, Fourdrinier D (1995) Shrinkage estimators under spherical symmetry for the general linear model. J Multivar Anal 52:338–351
Fourdrinier D, Wells MT (1995) Loss estimation for spherically symmetric distributions. J Multivar Anal 53:311–331
Fourdrinier D, Strawderman WE (1996) A paradox concerning shrinkage estimators: Should a known scale parameter be replaced by an estimated value in the shrinkage factor? J Multivar Anal 59(2):109–140
Fourdrinier D, Strawderman WE, Wells MT (1998) Estimation robuste pour des lois à symétrie elliptique à matrice de covariance inconnue. Comptes Rendus de l’Académie des Sciences 326(1):1135–1140
Fourdrinier D, Strawderman WE, Wells MT (2003) Robust shrinkage estimation for elliptically symmetric distributions with unknown covariance matrix. J Multivar Anal 85:24–39
Fourdrinier D, Strawderman WE, Wells MT (2006) Estimation of a location parameter with restrictions or “vague information” for spherically symmetric distributions. Ann Inst Stat Math 58:73–92
Fourdrinier D, Strawderman WE (2010) Robust generalized Bayes minimax estimators of location vectors for spherically symmetric distribution with unknown scale. Borrowing strength: theory powering applications–A Festschrift for Lawrence D. Brown, vol 6. IMS Collections, pp 249–262
Fourdrinier D, Wells MT (2012) Matrix estimation under spherically symmetric distribution. Université de Rouen and Cornell University, Technical report
Fourdrinier D, Wells MT (2012) On improved loss estimation for shrinkage estimators. Stat Sci 27:61–81
Fourdrinier D, Mezoued F, Strawderman WE (2013) Bayes minimax estimation under power priors of location parameters for a wide class of spherically symmetric distributions. Electron J Stat 7:717–741
Fourdrinier D, Strawderman WE (2014) On the non existence of unbiased estimators of risk for spherically symmetric distributions. Stat Probab Lett 91:6–13
Fourdrinier D, Strawderman WE, Wells MT (2014) On completeness of the general linear model with spherically symmetric errors. Stat Methodol 20:91–104
Haff LR (1979) An identity for the Wishart distribution with applications. J Multivar Anal 9:531–544
Kubokawa T, Robert C, Saleh AK (1991) Robust estimation of common regression coefficients under spherical symmetry. Ann Inst Stat Math 43(4):677–688
Kubokawa T, Srivastava MS (1999) Robust improvement in estimation of a covariance matrix in an elliptically contoured distribution. Ann Stat 27(2):600–609
Kubokawa T (2009) Integral inequality for minimaxity in the Stein problem. J Jpn Stat Soc 39:1–21
Lehmann EL, Casella G (1998) Theory of point estimation. Springer, New York
Maruyama Y (2003) A robust generalized Bayes estimator improving on the James–Stein estimator for spherically symmetric distributions. Stat Decis 21:69–78
Muirhead RJ (1982) Aspects of multivariate statistical theory. Wiley, New York
Stein C (1981) Estimation of the mean of a multivariate normal distribution. Ann Stat 9:1135–1151
Acknowledgments
We would like to thank two referees for suggestions which helped us to improve on a first version of the paper. In particular, we thank one of the referees for suggesting the link to the paper of Kubokawa (2009) presented in Example 3.
Additional information
This work was partially supported by a Grant from the Simons Foundation (#209035 to William Strawderman).
Appendix
1.1 A link between the expectations involving f and F
Proof of Lemma 2.2 Denote by \(\eta \! \left( X,\left\| U\right\| ^2\right) \) the integrand of the second expectation, that is,
Then conditionally on \(X=x\), we have
where
Applying Fubini’s theorem, we obtain
where \(\bar{B}(\sqrt{s})=\left\{ u\in {\mathbb R}^k \mid \left\| u\right\| > \sqrt{s}\right\} \) is the complement of the ball of radius \(\sqrt{s}\) centered at \(0\) in \({\mathbb R}^k\). Since, in the innermost integral, the variable \(u\) enters only through its norm \(\left\| u\right\| \), we have, letting \(\varsigma _k = 2 \, \pi ^{k/2} / \Gamma ( k/2)\),
Hence
using the radial density of \(U | X=x\) as above. Consequently, unconditioning, we have
according to the definition of \(E_{\theta ,\sigma ^2}^{*}\). \(\square \)
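The polar-coordinate step in the proof above replaces an integral of a radial function over \(\bar{B}(\sqrt{s})\) by \(\varsigma _k\) times a one-dimensional integral against \(r^{k-1}\). The following is a minimal numerical sketch of that reduction, not part of the original proof. The function names, the test function `g`, and the truncation and step parameters are illustrative assumptions; the check uses a Gaussian kernel in dimension \(k = 2\), where the tail integral is known in closed form.

```python
import math

def sphere_area(k):
    """Surface area of the unit sphere in R^k: 2 * pi^(k/2) / Gamma(k/2)."""
    return 2.0 * math.pi ** (k / 2) / math.gamma(k / 2)

def radial_tail_integral(g, k, s, upper=40.0, steps=200_000):
    """Approximate the integral of g(||u||^2) over {||u|| > sqrt(s)} in R^k.

    Uses the polar-coordinate form
        varsigma_k * int_{sqrt(s)}^{upper} g(r^2) r^(k-1) dr
    with a midpoint rule; `upper` truncates the infinite range (assumption:
    g decays fast enough that the truncated tail is negligible).
    """
    a = math.sqrt(s)
    h = (upper - a) / steps
    total = 0.0
    for i in range(steps):
        r = a + (i + 0.5) * h
        total += g(r * r) * r ** (k - 1)
    return sphere_area(k) * total * h

# Gaussian kernel g(t) = exp(-t / 2) in dimension k = 2, where the tail
# integral over {||u|| > sqrt(s)} equals 2 * pi * exp(-s / 2).
approx = radial_tail_integral(lambda t: math.exp(-t / 2), k=2, s=1.0)
exact = 2.0 * math.pi * math.exp(-0.5)
```

Here `approx` and `exact` agree up to the discretization error of the midpoint rule, and `sphere_area(3)` recovers the familiar value \(4\pi \).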
1.2 Independence on f of the distribution of Z given S
The following lemma relies on the decomposition of an \(n \times p\) matrix \(Z = Z_{n \times p}\) of rank \(p\), which can be found in Theorem A9.8 of Muirhead (1982):
\[ Z = H_1 T \qquad (5.1) \]
uniquely, where \(T = T_{p \times p}\) is upper triangular with positive diagonal elements and \(H_1 = H_{1,n \times p}\) is \(n \times p\) with \(H_1^t H_1 = I_p\).
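As an illustration of this decomposition, the sketch below factors a small full-rank matrix into a factor with orthonormal columns and an upper triangular factor with positive diagonal, using modified Gram–Schmidt. This is one convenient way to compute the factors, chosen here for readability; it is not claimed to be the construction used in Muirhead (1982). The helper names and the example matrix are hypothetical.

```python
import math

def thin_qr(Z):
    """Return (H1, T) with Z = H1 T for a full-column-rank Z (list of rows).

    H1 has orthonormal columns and T is upper triangular with positive
    diagonal; computed by modified Gram-Schmidt on the columns of Z.
    """
    n, p = len(Z), len(Z[0])
    cols = [[Z[i][j] for i in range(n)] for j in range(p)]
    Q = []                                  # orthonormal columns built so far
    T = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = cols[j][:]
        for i, q in enumerate(Q):
            T[i][j] = sum(a * b for a, b in zip(q, v))
            v = [a - T[i][j] * b for a, b in zip(v, q)]
        T[j][j] = math.sqrt(sum(a * a for a in v))  # > 0 when rank(Z) = p
        Q.append([a / T[j][j] for a in v])
    H1 = [[Q[j][i] for j in range(p)] for i in range(n)]
    return H1, T

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

# Hypothetical 4 x 2 example of rank 2.
Z = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 9.0]]
H1, T = thin_qr(Z)
HtH = matmul(transpose(H1), H1)   # should be close to the 2 x 2 identity
Z_rebuilt = matmul(H1, T)         # should reproduce Z
```

Requiring the diagonal of `T` to be positive is what makes the factorization unique, which is the property the proof of the lemma relies on.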
Lemma 5.1
Let
with rank \(p\) and let \(S = S_{p \times p} = Z^t Z\). Then \(Z | S\) has the distribution of \(H_1 T\) where \(H_1 = H_{1,n \times p}\) has the unique invariant Haar distribution on the Stiefel manifold of \(n \times p\) matrices with orthonormal columns (independently of \(T = T_{p \times p}\)), and that distribution is independent of \(f\).
Proof
Consider (5.1). We have
Further the unique symmetric square root of \(S\) satisfies
by the same theorem so that
and uniqueness implies that
Hence, in the normal case, \(S = T^t T\) is complete and sufficient and, since \(T\) is uniquely determined by \(S\), \(T\) is also complete and sufficient.
Note that, if \(Z\) has density (5.2), then \(Z^* = O Z \sim Z\) for any \(n \times n\) orthogonal matrix \(O\) (since \(|J| = 1\) and \(Z^{*t} Z^* = Z^t O^t O Z = Z^t Z\)) and \(T\) is also sufficient for general \(f\). By Theorem 2.1.13 of Muirhead (1982), the Jacobian of \(Z \rightarrow (T,H_1)\) is
where \(H_1^t \, dH_1^t\) is the Haar measure on the Stiefel manifold. It follows that the joint density of \((H_1,T)\) is given by
so that \(T\) is independent of \(H_1\) and \(H_1\) has the invariant Haar probability density on the Stiefel manifold
where \(C_{p,n}\) is given in Theorem 2.1.15 of Muirhead (1982) as equal to
and where \(\Gamma _n(p/2)\) is the multivariate gamma function defined in Definition 2.1.10 as
Hence \(Z = H_1 T\) where \(H_1\) and \(T\) are independent and \(H_1\) has the invariant Haar probability density and
and the distribution is independent of \(f\).\(\square \)
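A simple sanity check consistent with the lemma (and not a substitute for its proof) is that \(H_1\) does not change when \(Z\) is multiplied by a positive scalar \(c\): by uniqueness of the decomposition, \(cZ = H_1 (cT)\), so only \(T\) carries the scale. The sketch below recovers \(T\) as the upper triangular Cholesky factor of \(S = Z^t Z\), sets \(H_1 = Z T^{-1}\), and compares the factors of \(Z\) and \(cZ\). The function names, the dimensions, the seed, and the value of `c` are illustrative assumptions.

```python
import math
import random

def gram(Z):
    """S = Z^t Z for Z given as a list of rows."""
    cols = list(zip(*Z))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def upper_cholesky(S):
    """Upper triangular T with positive diagonal such that S = T^t T."""
    p = len(S)
    T = [[0.0] * p for _ in range(p)]
    for i in range(p):
        d = S[i][i] - sum(T[k][i] ** 2 for k in range(i))
        T[i][i] = math.sqrt(d)
        for j in range(i + 1, p):
            acc = sum(T[k][i] * T[k][j] for k in range(i))
            T[i][j] = (S[i][j] - acc) / T[i][i]
    return T

def factor(Z):
    """Return (H1, T) with Z = H1 T, solving h T = z for each row z of Z."""
    T = upper_cholesky(gram(Z))
    p = len(T)
    H1 = []
    for z in Z:
        h = [0.0] * p
        for j in range(p):
            h[j] = (z[j] - sum(h[k] * T[k][j] for k in range(j))) / T[j][j]
        H1.append(h)
    return H1, T

rng = random.Random(0)                      # fixed seed for reproducibility
n, p, c = 6, 3, 7.3                         # hypothetical sizes and scale
Z = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
cZ = [[c * x for x in row] for row in Z]

H1, T = factor(Z)
H1_scaled, T_scaled = factor(cZ)

# Largest entrywise discrepancies: H1 should be unchanged, T multiplied by c.
max_diff_H = max(abs(a - b) for r, s in zip(H1, H1_scaled) for a, b in zip(r, s))
max_diff_T = max(abs(c * a - b) for r, s in zip(T, T_scaled) for a, b in zip(r, s))
```

Both `max_diff_H` and `max_diff_T` should be at the level of floating-point rounding, which matches the intuition that a purely radial change in the distribution of \(Z\) affects \(T\) only.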
1.3 Technical lemmas
Lemma 5.2
Let \(r\) be a differentiable function from \(\mathbb R_+\) into \(\mathbb R\) such that the function \(t \mapsto t \, r^\prime (t)\) is nondecreasing. Then
provided that \(n - p + 1 > 0\).
Proof
Note that, for any suitable function \(h\), we have
by Theorem 3.1 and the calculations of \(\mathrm{D}_{1/2}^*\). Set
and check that a solution for \(h(t)\) is given by
Then, to prove Inequality (5.3), it will suffice to show that \(\mathrm{div}_X \left\{ h(X^t S^{-1} X) \, X \right\} \ge 0\). Now, as we have
it suffices to show that
We can write
Hence, if \(t \mapsto t \, r^\prime (t)\) is nondecreasing (and nonnegative), then \(u \, r^\prime (u) \ge t \, r^\prime (t)\) for all \(u \ge t > 0\) and the last line of the above expression is bounded below by
provided \(n - p + 1 > 0\) and \(t > 0\) (recall that \(p \ge 3\)).\(\square \)
Cite this article
Fourdrinier, D., Strawderman, W. Robust minimax Stein estimation under invariant data-based loss for spherically and elliptically symmetric distributions. Metrika 78, 461–484 (2015). https://doi.org/10.1007/s00184-014-0512-x