Robust minimax Stein estimation under invariant data-based loss for spherically and elliptically symmetric distributions

Metrika

Abstract

From an observable \((X,U)\) in \(\mathbb R^p \times \mathbb R^k\), we consider estimation of an unknown location parameter \(\theta \in \mathbb R^p\) under two distributional settings: the density of \((X,U)\) is either spherically symmetric with an unknown scale parameter \(\sigma \) or elliptically symmetric with an unknown covariance matrix \(\Sigma \). Estimators of \(\theta \) are evaluated under the classical invariant losses \(\Vert d - \theta \Vert ^2 / \sigma ^2\) and \((d - \theta )^t \Sigma ^{-1} (d - \theta )\) as well as under the two respective data-based losses \(\Vert d - \theta \Vert ^2 / \Vert U\Vert ^2\) and \((d - \theta )^t S^{-1} (d - \theta )\), where \(\Vert U\Vert ^2\) estimates \(\sigma ^2\) and \(S\) estimates \(\Sigma \). We provide new Stein and Stein–Haff identities that allow analysis of risk for these two new losses, including a new identity that gives rise to unbiased estimates of risk (up to a multiple of \(1 / \sigma ^2\)) in the spherical case for a larger class of estimators than in Fourdrinier et al. (J Multivar Anal 85:24–39, 2003). Minimax estimators of Baranchik form illustrate the theory. The range of shrinkage of these estimators is found to be slightly larger for the data-based losses than for the usual invariant losses. It is also found that \(X\) is minimax with finite risk with respect to the data-based losses for many distributions for which its risk is infinite when calculated under the classical invariant losses. In these cases, which include the multivariate \(t\) and, in particular, the multivariate Cauchy, we find improved shrinkage estimators as well.

References

  • Baranchik AJ (1970) A family of minimax estimators of the mean of a multivariate normal distribution. Ann Math Stat 41(2):642–645

  • Berger JO (1975) Minimax estimation of location vectors for a wide class of densities. Ann Stat 3(6):1318–1328

  • Cellier D, Fourdrinier D, Robert C (1989) Robust shrinkage estimators of the location parameter for elliptically symmetric distributions. J Multivar Anal 29:39–52

  • Cellier D, Fourdrinier D (1995) Shrinkage estimators under spherical symmetry for the general linear model. J Multivar Anal 52:338–351

  • Fourdrinier D, Wells MT (1995) Loss estimation for spherically symmetric distributions. J Multivar Anal 53:311–331

  • Fourdrinier D, Strawderman WE (1996) A paradox concerning shrinkage estimators: Should a known scale parameter be replaced by an estimated value in the shrinkage factor? J Multivar Anal 59(2):109–140

  • Fourdrinier D, Strawderman WE, Wells MT (1998) Estimation robuste pour des lois à symétrie elliptique à matrice de covariance inconnue. Comptes Rendus de l’Académie des Sciences 326(1):1135–1140

  • Fourdrinier D, Strawderman WE, Wells MT (2003) Robust shrinkage estimation for elliptically symmetric distributions with unknown covariance matrix. J Multivar Anal 85:24–39

  • Fourdrinier D, Strawderman WE, Wells MT (2006) Estimation of a location parameter with restrictions or “vague information” for spherically symmetric distributions. Ann Inst Stat Math 58:73–92

  • Fourdrinier D, Strawderman WE (2010) Robust generalized Bayes minimax estimators of location vectors for spherically symmetric distribution with unknown scale. Borrowing strength: theory powering applications–A Festschrift for Lawrence D. Brown, vol 6. IMS Collections, pp 249–262

  • Fourdrinier D, Wells MT (2012) Matrix estimation under spherically symmetric distribution. Université de Rouen and Cornell University, Technical report

  • Fourdrinier D, Wells MT (2012) On improved loss estimation for shrinkage estimators. Stat Sci 27:61–81

  • Fourdrinier D, Mezoued F, Strawderman WE (2013) Bayes minimax estimation under power priors of location parameters for a wide class of spherically symmetric distributions. Electron J Stat 7:717–741

  • Fourdrinier D, Strawderman WE (2014) On the non existence of unbiased estimators of risk for spherically symmetric distributions. Stat Probab Lett 91:6–13

  • Fourdrinier D, Strawderman WE, Wells MT (2014) On completeness of the general linear model with spherically symmetric errors. Stat Methodol 20:91–104

  • Haff LR (1979) An identity for the Wishart distribution with applications. J Multivar Anal 9:531–544

  • Kubokawa T, Robert C, Saleh AK (1991) Robust estimation of common regression coefficients under spherical symmetry. Ann Inst Stat Math 43(4):677–688

  • Kubokawa T, Srivastava MS (1999) Robust improvement in estimation of a covariance matrix in an elliptically contoured distribution. Ann Stat 27(2):600–609

  • Kubokawa T (2009) Integral inequality for minimaxity in the Stein problem. J Jpn Stat Soc 39:1–21

  • Lehmann EL, Casella G (1998) Theory of point estimation. Springer, New York

  • Maruyama Y (2003) A robust generalized Bayes estimator improving on the James–Stein estimator for spherically symmetric distributions. Stat Decis 21:69–78

  • Muirhead RJ (1982) Aspects of multivariate statistical theory. Wiley, New York

  • Stein C (1981) Estimation of the mean of a multivariate normal distribution. Ann Stat 9:1135–1151

Acknowledgments

We would like to thank two referees for suggestions which helped us to improve on a first version of the paper. In particular, we thank one of the referees for suggesting the link to the paper of Kubokawa (2009) presented in Example 3.

Author information

Corresponding author

Correspondence to William Strawderman.

Additional information

This work was partially supported by a Grant from the Simons Foundation (#209035 to William Strawderman).

Appendix

1.1 A link between the expectations involving f and F

Proof of Lemma 2.2. Denote by \(\eta \! \left( X,\left\| U\right\| ^2\right) \) the integrand of the second expectation, that is,

$$\begin{aligned} \eta \! \left( X,\left\| U\right\| ^2\right) = \frac{1}{2} \, \frac{1}{\Vert U\Vert ^{k-2}} \int _0^{\Vert U\Vert ^2} \gamma (X,s) \, s^{k/2-1} \, ds \, . \end{aligned}$$

Then conditionally on \(X=x\), we have

$$\begin{aligned} E_{\theta ,\sigma ^2} \! \left[ \eta \! \left( X,\Vert U\Vert ^2\right) | X=x\right]&= \frac{1}{K(\theta , \sigma ^2 ,x)} \int _{\mathbb R^k} \frac{1}{2} \, \frac{1}{\Vert u\Vert ^{k-2}} \int _0^{\Vert u\Vert ^2} \gamma (x,s) \, s^{k/2-1} \, ds \, \\&\times \frac{1}{\sigma ^{p+k}} \, f \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert u\Vert ^2}{\sigma ^2}\right) du \end{aligned}$$

where

$$\begin{aligned} K(\theta , \sigma ^2 ,x)=\int _{{\mathbb R}^k} \frac{1}{\sigma ^{p+k}} \, f \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert u\Vert ^2}{\sigma ^2}\right) du \, . \end{aligned}$$

Applying Fubini’s theorem, we obtain

$$\begin{aligned} E_{\theta ,\sigma ^2} \! \left[ \eta \! \left( X,\Vert U\Vert ^2\right) | X=x\right]&= \frac{1}{K(\theta , \sigma ^2 ,x)} \int _0^\infty \int _{\bar{B}(\sqrt{s})} \frac{1}{2} \, \frac{1}{\Vert u\Vert ^{k-2}} \frac{1}{\sigma ^{p+k}} \,\nonumber \\&\times f \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert u\Vert ^2}{\sigma ^2}\right) du \gamma (x,s) \, s^{k/2-1} \, ds \end{aligned}$$

where \(\bar{B}(\sqrt{s})=\left\{ u\in {\mathbb R}^k : \left\| u\right\| > \sqrt{s}\right\} \) is the complement of the ball of radius \(\sqrt{s}\) centered at \(0\) in \({\mathbb R}^k\). Since, in the innermost integral, the variable \(u\) enters only through its norm \(\left\| u\right\| \), passing to polar coordinates gives, with \(\varsigma _k = 2 \, \pi ^{k/2} / \Gamma ( k/2)\) denoting the surface area of the unit sphere in \({\mathbb R}^k\),

$$\begin{aligned}&\int _{\bar{B}(\sqrt{s})} \frac{1}{\Vert u\Vert ^{k-2}} \, f \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert u\Vert ^2}{\sigma ^2}\right) du\\&\quad = \varsigma _k \int _{\sqrt{s}}^\infty \frac{1}{r^{k-2}} \, f \! \left( \frac{\Vert x-\theta \Vert ^2 + r^2}{\sigma ^2}\right) r^{k-1} \, dr \\&\quad = \frac{\varsigma _k}{2} \int _s^\infty f \! \left( \frac{\Vert x-\theta \Vert ^2 + t}{\sigma ^2}\right) dt \\&\quad = \varsigma _k \, \sigma ^2 \, F \! \left( \frac{\Vert x-\theta \Vert ^2 + s}{\sigma ^2}\right) . \end{aligned}$$

Hence

$$\begin{aligned}&E_{\theta ,\sigma ^2} \! \left[ \eta \! \left( X,\Vert U\Vert ^2\right) | X=x\right] =\frac{\sigma ^2}{2} \, \frac{\varsigma _k}{K(\theta , \sigma ^2 ,x)}\int _0^\infty \frac{1}{\sigma ^{p+k}} \, F \! \left( \frac{\Vert x-\theta \Vert ^2 + s}{\sigma ^2}\right) \\&\qquad \times \gamma (x,s) \, s^{k/2-1} \, ds \\&\quad =\sigma ^2 \int _0^\infty \frac{F \! \left( \frac{\Vert x-\theta \Vert ^2 + s}{\sigma ^2}\right) }{f \! \left( \!\frac{\Vert x-\theta \Vert ^2 + s}{\sigma ^2}\!\right) } \, \gamma (x,s) \, \frac{1}{2} \, \frac{\varsigma _k}{K(\theta , \sigma ^2 ,x)} \, s^{k/2-1} \, \frac{1}{\sigma ^{p+k}} \, f \! \left( \frac{\Vert x-\theta \Vert ^2 + s}{\sigma ^2}\right) ds \\&\quad =\sigma ^2 \, E_{\theta ,\sigma ^2} \! \left[ \frac{F \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert U\Vert ^2}{\sigma ^2}\right) }{f \! \left( \frac{\Vert x-\theta \Vert ^2 + \Vert U\Vert ^2}{\sigma ^2}\right) } \, \gamma (x,\Vert U\Vert ^2) \, | X=x \right] \end{aligned}$$

using the radial density of \(U | X=x\) as above. Consequently, unconditioning, we have

$$\begin{aligned} E_{\theta ,\sigma ^2} \! \left[ \eta \! \left( X,\Vert U\Vert ^2\right) \right]&= \sigma ^2 \,E_{\theta ,\sigma ^2} \! \left[ \frac{F \! \left( \frac{\Vert X-\theta \Vert ^2 + \Vert U\Vert ^2}{\sigma ^2}\right) }{f \! \left( \frac{\Vert X-\theta \Vert ^2 + \Vert U\Vert ^2}{\sigma ^2}\right) } \,\gamma (X,\Vert U\Vert ^2)\right] \\&= \sigma ^2 \, c \, E_{\theta ,\sigma ^2}^{*} \!\left[ \gamma \! \left( X,\left\| U\right\| ^2\right) \right] , \end{aligned}$$

according to the definition of \(E_{\theta ,\sigma ^2}^{*}\). \(\square \)
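As a sanity check on Lemma 2.2, the identity can be evaluated in closed form in the Gaussian case. The sketch below rests on assumptions that are not part of the proof: \(f(t) \propto \exp (-t/2)\), so that with \(F(t) = \frac{1}{2} \int _t^\infty f(u) \, du\) (the normalization implied by the last step of the polar-coordinate computation above) the ratio \(F/f\) is identically 1, and the illustrative choice \(\gamma (x,s) = s^a\), for which \(\eta (x,s) = s^{a+1}/(2a+k)\). The function names are hypothetical.

```python
import math

def chi2_moment(k, b):
    """E[(chi^2_k)^b] = 2^b * Gamma(k/2 + b) / Gamma(k/2), valid for k/2 + b > 0."""
    return 2.0 ** b * math.gamma(k / 2 + b) / math.gamma(k / 2)

def expected_eta(k, sigma2, a):
    """E[eta(X, ||U||^2)] for gamma(x, s) = s^a under U ~ N(0, sigma2 * I_k).

    Here eta(x, s) = (1/2) s^{-(k-2)/2} * int_0^s tau^{a + k/2 - 1} d tau
                   = s^{a+1} / (2a + k), which requires 2a + k > 0.
    """
    return sigma2 ** (a + 1) * chi2_moment(k, a + 1) / (2 * a + k)

def sigma2_times_expected_gamma(k, sigma2, a):
    """sigma^2 * E[gamma(X, ||U||^2)] for the same gamma and the same Gaussian U."""
    return sigma2 * sigma2 ** a * chi2_moment(k, a)

# Assumed illustrative values: k = 5 residual dimensions, sigma^2 = 2.5, a = 0.7.
lhs = expected_eta(5, 2.5, 0.7)
rhs = sigma2_times_expected_gamma(5, 2.5, 0.7)
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, as the lemma predicts when \(F/f \equiv 1\). For non-Gaussian \(f\) the ratio \(F/f\) is no longer constant and the right-hand side has to be computed with that weight.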

1.2 Independence from f of the distribution of Z given S

The following lemma relies on the decomposition of an \(n \times p\) matrix \(Z = Z_{n \times p}\) of rank \(p\), which can be found in Theorem A9.8 of Muirhead (1982):

$$\begin{aligned} Z = Z_{n \times p} = H_1 T = H_{1,n \times p} T_{p \times p} \end{aligned}$$
(5.1)

where the decomposition is unique, \(T = T_{p \times p}\) is upper triangular with positive diagonal elements, and \(H_1 = H_{1,n \times p}\) is \(n \times p\) with \(H_1^t H_1 = I_p\).
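The decomposition (5.1) can be computed by Gram–Schmidt orthogonalization of the columns of \(Z\). The following is a minimal sketch in plain Python, not the construction used in Muirhead (1982); the helper names and the random test matrix are assumptions made for illustration. It checks that \(H_1^t H_1 = I_p\), that \(H_1 T = Z\), and that \(Z^t Z = T^t T\).

```python
import math
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def factor_h1_t(Z):
    """Classical Gram-Schmidt: Z = H1 T, T upper triangular with t_jj > 0.

    Assumes Z (n x p, n >= p) has full column rank p.
    """
    n, p = len(Z), len(Z[0])
    H1 = [[0.0] * p for _ in range(n)]
    T = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = [Z[i][j] for i in range(n)]
        for l in range(j):
            # Coefficient of column j of Z on the l-th orthonormal column.
            T[l][j] = sum(H1[i][l] * Z[i][j] for i in range(n))
            v = [v[i] - T[l][j] * H1[i][l] for i in range(n)]
        T[j][j] = math.sqrt(sum(x * x for x in v))  # positive by construction
        for i in range(n):
            H1[i][j] = v[i] / T[j][j]
    return H1, T

random.seed(0)
n, p = 6, 3
Z = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
H1, T = factor_h1_t(Z)

identity = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
err_orth = max_abs_diff(matmul(transpose(H1), H1), identity)
err_recon = max_abs_diff(matmul(H1, T), Z)
err_gram = max_abs_diff(matmul(transpose(Z), Z), matmul(transpose(T), T))
print(err_orth, err_recon, err_gram)
```

Classical Gram–Schmidt is adequate for a small, well-conditioned example; for ill-conditioned matrices a Householder-based QR routine with a sign correction on the diagonal of \(T\) would be the more stable choice.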

Lemma 5.1

Let

$$\begin{aligned} Z = Z_{n \times p} \sim |\Sigma |^{-n/2} \, f \! \left( \mathrm{tr} \! \left[ \Sigma ^{-1} Z^t Z\right] \right) \end{aligned}$$
(5.2)

with rank \(p\) and let \(S = S_{p \times p} = Z^t Z\). Then \(Z | S\) has the distribution of \(H_1 T\) where \(H_1 = H_{1,n \times p}\) has the unique invariant Haar distribution on the Stiefel manifold of \(n \times p\) matrices with orthonormal columns (independently of \(T = T_{p \times p}\)), and that distribution does not depend on \(f\).

Proof

Consider (5.1). We have

$$\begin{aligned} S = Z^t Z = T^t H_1^t H_1 T = T^t T \, . \end{aligned}$$

Further the unique symmetric square root of \(S\) satisfies

$$\begin{aligned} S^{1/2}_{p \times p} = S^{1/2} = H^* T^* = H^*_{p \times p} T^*_{p \times p} \end{aligned}$$

by the same theorem so that

$$\begin{aligned} S = S^{1/2} S^{1/2} = T^{*t} T^* = T^t T \end{aligned}$$

and uniqueness implies that

$$\begin{aligned} T = T^* \, . \end{aligned}$$

Hence, in the normal case, \(S = T^t T\) is complete and sufficient and, since \(T\) is uniquely determined by \(S\), \(T\) is also complete and sufficient.

Note that, if \(Z\) has density (5.2), then, for any \(n \times n\) orthogonal matrix \(O\), \(Z^* = O Z \sim Z\) (since \(|J| = 1\) and \(Z^{*t} Z^* = Z^t O^t O Z = Z^t Z\)), and \(T\) is also sufficient for general \(f\). By Theorem 2.1.13 of Muirhead (1982), the Jacobian of \(Z \rightarrow (T,H_1)\) is

$$\begin{aligned} dZ = \prod _{i=1}^p t_{ii}^{n-i} \, dT \, (H_1^t \, dH_1) \, , \end{aligned}$$

where \((H_1^t \, dH_1)\) is the Haar measure on the Stiefel manifold. It follows that the joint density of \((H_1,T)\) is given by

$$\begin{aligned} g(H_1,T) = |\Sigma |^{-n/2} \, f \! \left( \mathrm{tr} \! \left[ T^t T \, \Sigma ^{-1} \right] \right) \prod _{i=1}^p t_{ii}^{n-i} \, dT \, (H_1^t \, dH_1) \end{aligned}$$

so that \(T\) is independent of \(H_1\) and \(H_1\) has the invariant Haar probability density on the Stiefel manifold

$$\begin{aligned} C_{p,n} \, (H_1^t \, dH_1) \end{aligned}$$

where \(C_{p,n}\) is given in Theorem 2.1.15 of Muirhead (1982) as equal to

$$\begin{aligned} \left( 2^p \, \frac{\pi ^{np/2}}{\Gamma _p(n/2)}\right) ^{-1} \end{aligned}$$

and where \(\Gamma _p(n/2)\) is the multivariate gamma function defined in Definition 2.1.10 of Muirhead (1982) as

$$\begin{aligned} \Gamma _m(a) = \int _{A>0} \mathrm{etr}(-A) \, |A|^{a-(m+1)/2} \, dA \, . \end{aligned}$$

Hence \(Z = H_1 T\) where \(H_1\) and \(T\) are independent and \(H_1\) has the invariant Haar probability density and

$$\begin{aligned} Z | S = Z | T \sim H_1 T \, , \end{aligned}$$

and the distribution is independent of \(f\).\(\square \)
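The normalizing constant of the Haar density can be evaluated numerically. The sketch below assumes two standard facts that are not restated in the proof: the product formula \(\Gamma _m(a) = \pi ^{m(m-1)/4} \prod _{i=1}^m \Gamma (a - (i-1)/2)\) for \(a > (m-1)/2\), and the volume \(2^p \, \pi ^{np/2} / \Gamma _p(n/2)\) of the Stiefel manifold of \(n \times p\) matrices with orthonormal columns, whose reciprocal is \(C_{p,n}\). The function names are illustrative only.

```python
import math

def multivariate_gamma(m, a):
    """Gamma_m(a) via the product formula; requires a > (m - 1) / 2."""
    if a <= (m - 1) / 2:
        raise ValueError("Gamma_m(a) requires a > (m - 1) / 2")
    value = math.pi ** (m * (m - 1) / 4)
    for i in range(1, m + 1):
        value *= math.gamma(a - (i - 1) / 2)
    return value

def stiefel_volume(p, n):
    """Total mass of the unnormalized invariant measure on n x p orthonormal frames."""
    return 2 ** p * math.pi ** (n * p / 2) / multivariate_gamma(p, n / 2)

def haar_constant(p, n):
    """C_{p,n}: the reciprocal of the Stiefel volume."""
    return 1.0 / stiefel_volume(p, n)

# For p = 1 the Stiefel manifold is the unit sphere in R^n.
circle = stiefel_volume(1, 2)   # circumference of the unit circle
sphere = stiefel_volume(1, 3)   # surface area of the unit sphere in R^3
print(circle, sphere, haar_constant(3, 6))
```

For \(p = 1\) the volume reduces to the surface area \(2 \, \pi ^{n/2} / \Gamma (n/2)\) of the unit sphere in \(\mathbb R^n\), which gives a quick consistency check of the formula.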

1.3 Technical lemmas

Lemma 5.2

Let \(r\) be a differentiable function from \(\mathbb R_+\) into \(\mathbb R\) such that the function \(t \mapsto t \, r^\prime (t)\) is nondecreasing. Then

$$\begin{aligned} E_{\theta ,\Sigma }[r^\prime (X^t S^{-1} X) \, X^t S^{-1} (X - \theta )] \ge 0 \end{aligned}$$
(5.3)

provided that \(n - p + 1 > 0\).

Proof

Note that, for any suitable function \(h\), we have

$$\begin{aligned} E_{\theta ,\Sigma }\left[ \mathrm{div}_Xh(X^t S^{-1} X) \, X \right]&= E_{\theta ,\Sigma }\left[ \left\{ - 2 \, h^\prime (X^t S^{-1} X) X^t S^{-1} X\right. \right. \nonumber \\&\left. \left. + (n - p + 1) \, h(X^t S^{-1} X) \right\} X^t S^{-1} (X - \theta )\right] \end{aligned}$$

by Theorem 3.1 and the calculations of \(\mathrm{D}_{1/2}^*\). Set

$$\begin{aligned} 2 \, r^\prime (t) = - 2 \, h^\prime (t) \, t + (n - p - 1) \, h(t) \end{aligned}$$

and check that a solution for \(h(t)\) is given by

$$\begin{aligned} h(t) = t^{(n - p - 1)/2} \int _t^\infty u^{-(n - p + 1)/2} \, r^\prime (u) \, du \, . \end{aligned}$$

Then, to prove Inequality (5.3), it will suffice to show that \(\mathrm{div}_Xh(X^t S^{-1} X) \, X \ge 0\). Now, as we have

$$\begin{aligned} \mathrm{div}_Xh(X^t S^{-1} X) \, X = p \, h(X^t S^{-1} X) + 2 \, h^\prime (X^t S^{-1} X) \, X^t S^{-1} X \, , \end{aligned}$$

it suffices to show that

$$\begin{aligned} p \, h(t) + 2 \, h^\prime (t) \, t > 0 \, . \end{aligned}$$

We can write

$$\begin{aligned} p \, h(t) + 2 \, h^\prime (t) \, t&= p \, t^{(n - p - 1)/2} \! \int _t^\infty \! u^{-(n - p + 1)/2} \, r^\prime (u) \, du\\&+\,(n - p - 1) \, t^{(n - p - 1)/2} \! \int _t^\infty \! u^{-(n - p + 1)/2} \, r^\prime (u) \, du - 2 \, r^\prime (t) \\&= (n - 1) \, t^{(n - p - 1)/2} \! \int _t^\infty \! u^{-(n - p + 1)/2} \, r^\prime (u) \, du - 2 \, r^\prime (t). \end{aligned}$$

Hence, if \(t \mapsto t \, r^\prime (t)\) is nondecreasing (and nonnegative), then \(u \, r^\prime (u) \ge t \, r^\prime (t)\) for all \(u \ge t > 0\) and the last line of the above expression is bounded below by

$$\begin{aligned}&(n - 1) \, t^{(n - p - 1)/2} (t \, r^\prime (t)) \! \int _t^\infty \! u^{-(n - p + 3)/2} \, du - 2 \, r^\prime (t) \\&\quad = (n - 1) \, t^{(n - p + 1)/2} \, r^\prime (t) \frac{t^{-(n - p + 1)/2}}{(n - p + 1)/2} - 2 \, r^\prime (t) \\&\quad = \left( \frac{2 \, (n-1)}{n - p + 1} - 2 \right) r^\prime (t) \\&\quad > 0 \, , \end{aligned}$$

provided \(n - p + 1 > 0\) and \(t > 0\): the coefficient simplifies to \(2 \, (p - 2) / (n - p + 1)\), which is positive since \(p \ge 3\).\(\square \)
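The steps of this proof can be traced on an explicit family. The sketch below assumes \(r^\prime (u) = u^{-b}\) with \(b \le 1\), an illustrative choice for which \(t \, r^\prime (t) = t^{1-b}\) is nondecreasing and nonnegative; then \(h(t) = t^{-b} / (m + b)\) with \(m = (n-p-1)/2\), provided \(m + b > 0\). The code checks numerically that this \(h\) solves the differential equation defining it and that \(p \, h(t) + 2 \, h^\prime (t) \, t\) dominates the lower bound obtained in the proof. All function names are hypothetical.

```python
import math

def r_prime(t, b):
    """Assumed example r'(t) = t^(-b); t * r'(t) is nondecreasing when b <= 1."""
    return t ** (-b)

def h(t, n, p, b):
    """Closed form of t^{(n-p-1)/2} * int_t^inf u^{-(n-p+1)/2} r'(u) du."""
    m = (n - p - 1) / 2
    if m + b <= 0:
        raise ValueError("the integral defining h diverges")
    return t ** (-b) / (m + b)

def h_prime(t, n, p, b, eps=1e-6):
    """Central finite-difference approximation of h'(t)."""
    return (h(t + eps, n, p, b) - h(t - eps, n, p, b)) / (2 * eps)

def ode_residual(t, n, p, b):
    """-2 h'(t) t + (n - p - 1) h(t) - 2 r'(t); should be close to 0."""
    return -2 * h_prime(t, n, p, b) * t + (n - p - 1) * h(t, n, p, b) - 2 * r_prime(t, b)

def divergence_term(t, n, p, b):
    """p h(t) + 2 h'(t) t, the quantity shown to be positive in the proof."""
    return p * h(t, n, p, b) + 2 * h_prime(t, n, p, b) * t

def lower_bound(t, n, p, b):
    """(2 (n - 1) / (n - p + 1) - 2) r'(t), the bound from the last display."""
    return (2 * (n - 1) / (n - p + 1) - 2) * r_prime(t, b)

# Assumed illustrative values: n = 10, p = 4, b = 0.5, evaluated on a small grid.
grid = [0.5, 1.0, 2.0, 5.0]
checks = [(ode_residual(t, 10, 4, 0.5),
           divergence_term(t, 10, 4, 0.5),
           lower_bound(t, 10, 4, 0.5)) for t in grid]
for residual, div, bound in checks:
    print(residual, div, bound)
```

With \(b = 1\), that is \(r(t) = \log t\), the function \(t \, r^\prime (t)\) is constant and the lower bound is attained, so the inequality in the proof cannot be improved for this family.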

Cite this article

Fourdrinier, D., Strawderman, W. Robust minimax Stein estimation under invariant data-based loss for spherically and elliptically symmetric distributions. Metrika 78, 461–484 (2015). https://doi.org/10.1007/s00184-014-0512-x
