On combining unbiased and possibly biased correlated estimators

Zinonos, Stavros; Strawderman, William E.

doi:10.1007/s42081-023-00194-2

Original Paper
Stein Estimation and Statistical Shrinkage Methods
Published: 07 March 2023

Japanese Journal of Statistics and Data Science (2023)

Abstract

We study estimators that combine an unbiased estimator with a possibly biased correlated estimator of a mean vector. The combined estimators are shrinkage-type estimators that shrink the unbiased estimator towards the biased estimator. Conditions under which the combined estimator dominates the original unbiased estimator are given. Models studied include normal models with a known covariance structure, scale mixtures of normals, and more generally elliptically symmetric models with a known covariance structure. Elliptically symmetric models with a covariance structure known up to a multiple are also considered.

References

Arslan, O. (2001). Family of multivariate generalized t distributions. Journal of Multivariate Analysis, 89, 329–251.
Berger, J. (1975). Minimax estimation of location vectors for a wide class of densities. The Annals of Statistics, 3, 1318–1328.
Berger, J. (1976). Admissible minimax estimation of a multivariate normal mean with arbitrary quadratic loss. The Annals of Statistics, 4, 223–226.
Casella, G., & Hwang, J. T. (1982). Limit expressions for the risk of James–Stein estimators. Canadian Journal of Statistics, 10, 305–309.
Cessie, S., Nagelkerke, N., Rosendal, F., Stralen, K., Pomp, E., & Houwelingen, H. (2008). Combining matched and unmatched control groups in case–control studies. American Journal of Epidemiology, 168, 1204–1210.
Fourdrinier, D., & Strawderman, W. (2008). Generalized Bayes minimax estimators of location vectors for spherically symmetric distributions. Journal of Multivariate Analysis, 99, 735–750.
Fourdrinier, D., Strawderman, W., & Wells, M. (2018). Shrinkage estimation (1st ed.). Springer Nature.
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., & Hothorn, T. (2020). mvtnorm: Multivariate normal and t distributions. Retrieved from https://CRAN.R-project.org/package=mvtnorm
Green, E., & Strawderman, W. (1991). A James–Stein type estimator for combining unbiased and possibly biased estimators. Journal of the American Statistical Association, 86, 1001–1006.
Judge, G., & Mittelhammer, R. (2004). A semiparametric basis for combining estimation problems under quadratic loss. Journal of the American Statistical Association, 99, 479–487.
Lehmann, E., & Romano, J. (2005). Testing statistical hypotheses (3rd ed.). Springer.
Mardia, K., Kent, J., & Bibby, J. (1979). Multivariate analysis. Academic Press Inc.
Stapleton, J. (2009). Linear statistical models (2nd ed.). Wiley.
Strawderman, W. (1974). Minimax estimation of location parameters for certain spherically symmetric distributions. Journal of Multivariate Analysis, 4, 255–264.
Strawderman, W. (2003). On minimax estimation of a normal mean vector for general quadratic loss. Lecture Notes-Monograph Series, 42, 223–226.
Venables, W., & Ripley, B. (2002). Modern applied statistics with S (4th ed.). Springer.

Author information

Authors and Affiliations

Cardiovascular Institute of New Jersey, RWJMS, New Brunswick, NJ, USA
Stavros Zinonos
Rutgers University, New Brunswick, NJ, USA
William E. Strawderman

Corresponding author

Correspondence to Stavros Zinonos.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partially supported by grants from the Simons Foundation (#209035 and #418098 to William Strawderman).

Appendices

Appendix A

A.1. Proof of Lemma 2 in Sect. 2

Proof

Let $Z\sim N_p(\mu , I_p)$. Then $Y=Z'Z$ has a $\chi ^2_p(\mu '\mu )$ distribution with density (Stapleton, 2009)

$$\begin{aligned} f(y|p,\mu '\mu )=\sum _{k=0}^{\infty } P_k\left( \frac{\mu '\mu }{2}\right) f_{p+2k}(y), \end{aligned}$$

where $P_k$ is the Poisson density, $\frac{e^{-\frac{\mu '\mu }{2}}(\frac{\mu '\mu }{2})^k }{k!}$, and $f_{p+2k}$ is the density of a $\chi ^2_{p+2k}$, $\frac{y^{\frac{p+2k}{2}-1}e^{-\frac{y}{2}}}{\Gamma (\frac{p+2k}{2})2^{\frac{p+2k}{2}}}$, so that Y is a Poisson mixture of central $\chi ^2$ densities. Supposing that $p\ge 3 $, by Tonelli’s theorem

$$\begin{aligned} E\left[ \frac{1}{Y}\right] &=\sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) E\left[ \frac{1}{\chi ^2_{p+2k}}\right] \\ &= \sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) \frac{1}{p+2k-2}=E_{\frac{\mu '\mu }{2}}\left[ \frac{1}{p+2k-2}\right] , \end{aligned}$$

(51)

where the expectation in (51) is taken with respect to a Poisson distribution with parameter $\frac{\mu '\mu }{2}$. In order for the expectation of the random variable $\frac{1}{Y}$ to exist, $p\ge 3 $ is necessary since for $k=0$, $E\left[ \frac{1}{\chi ^2_p}\right] < \infty $ implies $p\ge 3$. Furthermore for $p\ge 3$ and $l>0$,

$$\begin{aligned} E\left[ \frac{1}{\chi ^2_{p+l}}\right] &< E\left[ \frac{1}{\chi ^2_{p}}\right] =\frac{1}{p-2}, \\ \text {so that}\quad E\left[ \frac{1}{Y}\right] &\le \sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) \frac{1}{p-2} = \frac{1}{p-2}, \end{aligned}$$

which is finite. By the convexity of the function $g(k)=\frac{1}{2k+p-2}$, Jensen’s inequality implies

$$\begin{aligned} \frac{1}{p-2+\mu '\mu }=\frac{1}{p-2+2Ek} \le E\left[ \frac{1}{p+2k-2}\right] \le E\left[ \frac{1}{p-2}\right] =\frac{1}{p-2}. \end{aligned}$$

(52)

Casella and Hwang (1982) give the more refined upper bound

$$\begin{aligned} E_{\mu ' \mu }\left[ \frac{1}{Z'Z}\right] \le \frac{1}{p-2}\left( \frac{p+2}{p+2+\mu '\mu }\right) \end{aligned}$$

(53)

for $p \ge 3$. When $p>3$, a further refinement of the upper bound in expression (53) is given in Green and Strawderman (1991):

$$\begin{aligned} E_{\mu '\mu }\left[ \frac{1}{Z'Z}\right] \le \frac{1}{p-4+\mu '\mu }. \end{aligned}$$

(54)

Combining the lower bound in (52) with the upper bounds in (53) and (54) gives the following expression, established in Green and Strawderman (1991), when $p\ge 4$:

$$\begin{aligned} \frac{1}{p-2+\mu '\mu }\le E_{\mu '\mu }\left[ \frac{1}{Z'Z}\right] \le \text {min}\left\{ \frac{p+2}{(p-2)(p+2+\mu ' \mu )},\frac{1}{p-4+\mu '\mu }\right\} . \end{aligned}$$

(55)
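The chain of bounds in (52) to (55) can be checked numerically. The following is a minimal sketch, not part of the original proof: it sums the Poisson mixture in (51) directly and compares the result with the bounds. The function names, the truncation point of the series, and the test values of $p$ and $\mu '\mu $ (called `lam` below) are illustrative assumptions.

```python
import math


def inverse_moment(p, lam):
    """E[1/Y] for Y ~ chi^2_p(lam), summed from the Poisson mixture in (51)."""
    if p < 3:
        raise ValueError("E[1/Y] is finite only when p >= 3")
    m = lam / 2.0                      # Poisson mean mu'mu / 2
    weight = math.exp(-m)              # P_0(m); underflows for very large m
    # Truncation point of the series: an illustrative choice, far in the tail.
    k_max = int(m + 20.0 * math.sqrt(m) + 50)
    total = 0.0
    for k in range(k_max + 1):
        total += weight / (p + 2 * k - 2)   # P_k(m) * E[1/chi^2_{p+2k}]
        weight *= m / (k + 1)               # P_{k+1}(m) from P_k(m)
    return total


def bounds(p, lam):
    """Lower bound from (52); upper bound (53), combined with (54) when p >= 4."""
    lower = 1.0 / (p - 2 + lam)
    upper = (p + 2) / ((p - 2) * (p + 2 + lam))
    if p >= 4 and p - 4 + lam > 0:
        upper = min(upper, 1.0 / (p - 4 + lam))
    return lower, upper


if __name__ == "__main__":
    for p, lam in [(3, 0.5), (4, 1.0), (5, 4.0), (10, 25.0)]:
        lower, upper = bounds(p, lam)
        value = inverse_moment(p, lam)
        print(f"p={p:2d} lam={lam:5.1f}  {lower:.5f} <= {value:.5f} <= {upper:.5f}")
```

For $p=4$ the series has the closed form $(1-e^{-\mu '\mu /2})/\mu '\mu $, which gives a convenient independent check of the summation.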

Suppose now that $X\sim N_p(\mu ,\Sigma )$ and let $Y=\Sigma ^{-\frac{1}{2}}X\sim N_p(\Sigma ^{-\frac{1}{2}}\mu ,I_p) $. For any $p\times p$ orthogonal matrix U, $Z=U'Y\sim N_p(U'\Sigma ^{-\frac{1}{2}}\mu ,I_p).$ The spectral decomposition of $\Sigma $ gives

$$\begin{aligned} P'\Sigma P=\Lambda ={\text {Diag}}(\lambda _1,\lambda _2,\ldots ,\lambda _p), \end{aligned}$$

where P is the orthogonal matrix of eigenvectors of $\Sigma $ and $\Lambda $ is the diagonal matrix of eigenvalues of $\Sigma $. Denoting by $\lambda _{(1)}$ and $\lambda _{(p)}$ the smallest and largest eigenvalues of $\Sigma $, respectively, the following inequalities hold:

$$\begin{aligned} \lambda _{(1)}E[Z'Z]\le E[X'X]= E[X'\Sigma ^{-\frac{1}{2}}PP'\Sigma PP'\Sigma ^{-\frac{1}{2}}X] = E[Z'\Lambda Z]\le \lambda _{(p)}E[Z'Z] \end{aligned}$$

with $Z=P'\Sigma ^{-\frac{1}{2}}X$. Noting that the function $\phi (x)=\frac{1}{x}$ is convex for $x>0$,

$$\begin{aligned}{} & {} \text {max}\left\{ \frac{1}{\lambda _{(p)}}E\left[ \frac{1}{Z'Z}\right] ,\frac{1}{\text {tr}(\Lambda )+\mu '\mu }\right\} \\ {}{} & {} \quad \le E\left[ \frac{1}{Z'\Lambda Z}\right] = E\left[ \frac{1}{X'X}\right] \le \frac{1}{\lambda _{(1)}}E\left[ \frac{1}{Z'Z}\right] < \infty \end{aligned}$$

provided $p \ge 3$, where $Z \sim N_p(P'\Sigma ^{-\frac{1}{2}}\mu ,I_p)$.

From expression (55) then, when $p \ge 4 $,

$$\begin{aligned}{} & {} \text {max} \left\{ \frac{1}{\text {tr}(\Lambda ) + \mu '\mu },\frac{1}{\lambda _{(p)}}\left[ \frac{1}{p-2+\mu '\Sigma ^{-1}\mu }\right] \right\} \le E\left[ \frac{1}{X'X}\right] \end{aligned}$$

(56)

$$\begin{aligned}{} & {} \quad \le \text {min}\left\{ \frac{1}{\lambda _{(1)}}\left[ \frac{1}{p-4+\mu '\Sigma ^{-1}\mu }\right] ,\frac{1}{\lambda _{(1)}}\left[ \frac{p+2}{(p-2)(p+2+\mu '\Sigma ^{-1}\mu )}\right] \right\} , \end{aligned}$$

(57)

and for $p\ge 3$:

$$\begin{aligned}{} & {} \text {max} \left\{ \frac{1}{tr(\Lambda ) + \mu '\mu },\frac{1}{\lambda _{(p)}}\left[ \frac{1}{p-2+\mu '\Sigma ^{-1}\mu }\right] \right\} \le E\left[ \frac{1}{X'X}\right] \\{} & {} \quad \le \frac{1}{\lambda _{(1)}}\left[ \frac{p+2}{(p-2)(p+2+\mu '\Sigma ^{-1}\mu )}\right] . \end{aligned}$$

$\square $
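The bounds for $E[1/X'X]$ can be compared with a simulation. The sketch below is illustrative only: it takes $\Sigma $ to be diagonal (so that $\Lambda =\Sigma $ and $\mu $ is already expressed in the eigenbasis), and the eigenvalues, mean vector, sample size and seed are arbitrary assumptions. It uses $p=6$ so that $1/X'X$ has finite variance and the Monte Carlo average is stable.

```python
import math
import random


def lemma2_bounds(eigenvalues, mu):
    """Bounds on E[1/X'X] for X ~ N_p(mu, diag(eigenvalues)), p >= 3."""
    p = len(eigenvalues)
    lam_min, lam_max = min(eigenvalues), max(eigenvalues)
    mu_sq = sum(m * m for m in mu)                               # mu'mu
    maha = sum(m * m / lam for m, lam in zip(mu, eigenvalues))   # mu' Sigma^{-1} mu
    lower = max(1.0 / (sum(eigenvalues) + mu_sq),
                1.0 / (lam_max * (p - 2 + maha)))
    upper = (p + 2) / (lam_min * (p - 2) * (p + 2 + maha))
    if p >= 4 and p - 4 + maha > 0:
        upper = min(upper, 1.0 / (lam_min * (p - 4 + maha)))
    return lower, upper


def mc_inverse_norm(eigenvalues, mu, n_draws=20000, seed=0):
    """Monte Carlo estimate of E[1/X'X] for the same diagonal model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        sq = sum((m + math.sqrt(lam) * rng.gauss(0.0, 1.0)) ** 2
                 for m, lam in zip(mu, eigenvalues))
        total += 1.0 / sq
    return total / n_draws


if __name__ == "__main__":
    eig = [0.5, 1.0, 1.0, 1.0, 2.0, 2.0]   # illustrative eigenvalues
    mu = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]    # illustrative mean vector
    lower, upper = lemma2_bounds(eig, mu)
    print(lower, mc_inverse_norm(eig, mu), upper)
```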

A.2. Proof of Lemma 5 in Sect. 2

Proof

Let $q_{ij}$ denote the $(i,j)^{th}$ entry in Q.

$$\begin{aligned} \frac{\partial }{\partial x_i} \frac{r(\Vert x-y \Vert ^2_{Q})(x_i-y_i)}{\Vert x-y \Vert ^2_{Q}}= \end{aligned}$$

(by an application of Lemma 4 on $\Vert x-y\Vert ^2_{Q}$)

$$\begin{aligned}{} & {} \frac{\Vert x-y \Vert ^2_{Q}[2r'(\Vert x-y \Vert ^2_{Q})\sum _{j=1}^pq_{ij}(x_j-y_j)(x_i-y_i)+r(\Vert x-y \Vert ^2_{Q})]}{(\Vert x-y \Vert ^2_{Q})^2} \\{} & {} \quad - \frac{2r(\Vert x-y \Vert ^2_{Q})\sum _{j=1}^pq_{ij}(x_j-y_j)(x_i-y_i)}{(\Vert x-y \Vert ^2_{Q})^2}, \end{aligned}$$

so that

$$\begin{aligned}{} & {} \frac{r^2( \Vert x-y \Vert ^2_Q)}{\Vert x-y \Vert ^2_Q}-2\text {div}_x\left( \frac{r(\Vert x-y\Vert ^2_Q)}{\Vert x-y\Vert ^2_Q} (x-y)\right) \nonumber \\{} & {} \quad = \frac{r^2(\Vert x-y\Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}}-2\left[ \frac{(p-2)r(\Vert x-y \Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}}+2r'(\Vert x-y \Vert ^2_{Q}) \right] \nonumber \\{} & {} \quad \le \frac{r^2(\Vert x-y \Vert ^2_{Q})-2(p-2)r(\Vert x-y \Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}} \qquad \text {(by assumption ii)}\nonumber \\{} & {} \quad = \frac{-r(\Vert x-y \Vert ^2_{Q})[2(p-2)-r(\Vert x-y \Vert ^2_{Q})]}{\Vert x-y \Vert ^2_{Q}} \le 0. \end{aligned}$$

(58)

$\square $
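The divergence identity used in (58), $\text {div}_x\left( r(t)(x-y)/t\right) =(p-2)r(t)/t+2r'(t)$ with $t=\Vert x-y \Vert ^2_{Q}$, can be checked by finite differences. The sketch below is an illustration under stated assumptions: the particular function `r`, the matrix `Q` and the points `x`, `y` are hypothetical choices, with `r` picked to be nondecreasing and to satisfy $0\le r\le 2(p-2)$, which is what the last two steps of (58) use.

```python
def quad_form(d, Q):
    """d'Qd for a vector d and a symmetric matrix Q given as nested lists."""
    n = len(d)
    return sum(d[i] * Q[i][j] * d[j] for i in range(n) for j in range(n))


def r(t, p):
    """Hypothetical shrinkage function: nondecreasing, with 0 <= r <= p - 2."""
    return (p - 2) * t / (1.0 + t)


def r_prime(t, p):
    return (p - 2) / (1.0 + t) ** 2


def shrink_field(x, y, Q):
    """The vector field r(t) (x - y) / t with t = ||x - y||^2_Q."""
    d = [xi - yi for xi, yi in zip(x, y)]
    t = quad_form(d, Q)
    return [r(t, len(x)) * di / t for di in d]


def divergence_fd(x, y, Q, h=1e-6):
    """Central finite-difference approximation of div_x of shrink_field."""
    total = 0.0
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        total += (shrink_field(x_plus, y, Q)[i]
                  - shrink_field(x_minus, y, Q)[i]) / (2.0 * h)
    return total


def divergence_closed(x, y, Q):
    """Closed form (p - 2) r(t) / t + 2 r'(t) from the proof."""
    d = [xi - yi for xi, yi in zip(x, y)]
    t = quad_form(d, Q)
    p = len(x)
    return (p - 2) * r(t, p) / t + 2.0 * r_prime(t, p)


if __name__ == "__main__":
    Q = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]]  # symmetric, positive definite
    x, y = [1.0, -0.5, 2.0], [0.0, 0.3, 0.1]
    print(divergence_fd(x, y, Q), divergence_closed(x, y, Q))
```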

A.3. Proof of Corollary 2 in Sect. 2

Proof

Let $Z=B(Y-AX)$. Using expression (14), the risk of the estimator $\delta $, $R(\delta ,\theta ,\eta )$ can be expressed as:

$$\begin{aligned}{} & {} R(X,\theta ) +E\left[ E\left[ \frac{a^2}{\Vert X-\theta _0\Vert ^2_{Q^*}}-2\text {div}_x\left( \frac{a}{\Vert X-\theta _0 \Vert ^2_{Q^*}}(X-\theta _0)\right) \mid Z=\theta _0\right] \right] \nonumber \\{} & {} \quad = \text {tr}(Q\Sigma _{11})+E\left[ E\left[ \frac{a^2}{\Vert X-\theta _0 \Vert ^2_{Q^*}}-2\frac{(p-2)a}{\Vert X-\theta _0 \Vert ^2_{Q^*}}\mid Z=\theta _0\right] \right] \nonumber \\{} & {} \quad = \text {tr}(Q\Sigma _{11}) -a(2(p-2)-a)E\left[ \frac{1}{\Vert X-Z \Vert ^2_{Q^*}}\right] . \end{aligned}$$

(59)

Since $Q$ is symmetric and positive definite, $Q^*= \Sigma _{11}^{-1} Q^{-1}\Sigma _{11}^{-1}$ has a symmetric square root denoted by $Q^{*}{}^{\frac{1}{2}}$. Let

$$\begin{aligned} \Sigma ^*= \Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B'. \end{aligned}$$

Making the change of variables,

$$\begin{aligned} W=X-Z=X-B(Y-AX) \sim N_p(\mu _{\theta \eta }, \Sigma ^*), \end{aligned}$$

and

$$\begin{aligned} V=Q^*{}^{\frac{1}{2}}W \sim N_p(Q^*{}^{\frac{1}{2}}\mu _{\theta \eta }, Q^*{}^{\frac{1}{2}}\Sigma ^*Q^*{}^{\frac{1}{2}}), \end{aligned}$$

(60)

in (59) implies

$$\begin{aligned} R(\delta ,\theta ,\eta )= tr(Q\Sigma _{11})-a(2(p-2)-a)E\left[ \frac{1}{V'V}\right] . \end{aligned}$$

(61)

Since V has a multivariate normal distribution, whose parameters are given in (60), an application of Lemma 2 to the $E\left[ \frac{1}{V'V}\right] $ in (61) implies the result since any eigenvalue of $Q^*{}^{\frac{1}{2}}[\Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B']Q^*{}^{\frac{1}{2}}$ is also an eigenvalue of $[\Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B']Q^*$. $\square $

A.4. Proof of Lemma 6 in Sect. 3.1

Proof

Let $Y=\frac{1}{\sigma }\Sigma ^{-\frac{1}{2}}X$, and $Z = P'Y$ where $P'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}P =\Lambda $ where $\Lambda $ is the diagonal matrix of eigenvalues and P is the associated column matrix of eigenvectors of $\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}$. Then, $Y \sim N_p(\frac{1}{\sigma }\Sigma ^{-\frac{1}{2}}\mu , I)$ and $Z\sim N_p(\frac{1}{\sigma }P'\Sigma ^{-\frac{1}{2}}\mu ,I)$ so that:

$$\begin{aligned}{} & {} E\left[ \frac{X'QX}{\sigma ^2}\right] = E\left[ Y'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}Y\right] =E[(P'Y)'P'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}P(P'Y)]\\{} & {} \quad = E[Z'\Lambda Z] = E\left[ \sum _{i=1}^p \lambda _i Z_i^2\right] . \end{aligned}$$

Now, $\lbrace Z_i \rbrace _{i=1}^p$ is an independent collection of random variables, where

$$\begin{aligned} Z_i \sim N\left( \frac{\mu ^*_i}{\sigma }, 1\right) \end{aligned}$$

and where $\mu ^*=P'\Sigma ^{-\frac{1}{2}}\mu ,$ and thus $\lbrace Z_i^2 \rbrace _{i=1}^p$ is a collection of independent random variables with

$$\begin{aligned} Z_i^2 \sim \chi ^2_1\left( \frac{\mu _i^{*2}}{\sigma ^2}=\nu _i\right) . \end{aligned}$$

Since a non-central chi-squared random variable has a monotone likelihood ratio (Lehmann & Romano, 2005), $\chi ^2_1(\nu _i)$ is stochastically increasing in the parameter $\nu _i$, and thus is stochastically decreasing in $\sigma ^2$. Let

$$\begin{aligned} U(z_1^2,\ldots ,z_p^2) =\sum _{i=1}^p \lambda _iz_i^2. \end{aligned}$$

Since U is increasing in each of its coordinates, $\lbrace Z_i^2 \rbrace _{i=1}^p$ is an independent collection of random variables, and each $Z_i^2$ is stochastically decreasing in $\sigma ^2$, $U(Z_1^2,Z_2^2,\ldots ,Z_p^2)=\sum _{i=1}^p\lambda _iZ_i^2$ is stochastically decreasing in $\sigma ^2$, establishing the result. $\square $
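As a quick check on the direction of the monotonicity, the mean of $U$ can be written in closed form. This worked step is added for illustration and uses only the fact that a $\chi ^2_1(\nu _i)$ variable has mean $1+\nu _i$; it assumes $\lambda _i\ge 0$ for all $i$, which is what makes $U$ increasing in each coordinate:

$$\begin{aligned} E\left[ \sum _{i=1}^p \lambda _i Z_i^2\right] =\sum _{i=1}^p \lambda _i(1+\nu _i) =\text {tr}(\Lambda )+\frac{1}{\sigma ^2}\sum _{i=1}^p \lambda _i\mu _i^{*2}, \end{aligned}$$

which is nonincreasing in $\sigma ^2$, in agreement with the stochastic ordering established above.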

A.5. Proof of Lemma 7 in Sect. 3.2

Proof

$$\begin{aligned}{} & {} E[(X-\theta )'\Sigma _{11}^{-1}g(X,Y)]\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p\int \left[ \sum _{j=1}^p \sigma ^*_{ij}(x_j-\theta _j)g_i(x,y)\right] \right. \nonumber \\ {}{} & {} \quad \left. f(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}) {\text {d}}x\right] {\text {d}}y\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p \int g_i(x,y) \left\{ -\frac{\partial }{\partial x_i}(\frac{1}{2}\int _{t}^{\infty }f(u){\text {d}}u)\right\} {\text {d}}x\right] {\text {d}}y\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p \int \frac{\partial }{\partial x_i}g_i(x,y)F(t){\text {d}}x\right] {\text {d}}y, \end{aligned}$$

(62)

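where $t=\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}$, $\sigma _{ij}^{*}$ is the $(i,j)^{th}$ element of $\Sigma _{11}^{-1}$, and $F(t)=\frac{1}{2}\int _{t}^{\infty }f(u){\text {d}}u$; the last equality in (62) follows from integration by parts. Upon dividing and multiplying (62) by f(t),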

$$\begin{aligned}{} & {} \int \int \text {div}_x(g(x,y))\frac{F(t)}{f(t)}f(t){\text {d}}x{\text {d}}y\nonumber \\{} & {} \quad = E\left[ \text {div}_x(g(X,Y))\frac{F\left( \Vert X-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert Y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}\right) }{f\left( \Vert X-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert Y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}\right) }\right] \end{aligned}$$

(63)

establishing the result. $\square $
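As a sanity check, the identity can be specialized to the normal case. This worked step is an illustration, assuming the generating function is $f(t)=c\,e^{-t/2}$ for a normalizing constant $c$:

$$\begin{aligned} F(t)=\frac{1}{2}\int _t^{\infty }c\,e^{-\frac{u}{2}}{\text {d}}u=c\,e^{-\frac{t}{2}}=f(t), \qquad \text {so that}\qquad \frac{F(t)}{f(t)}=1, \end{aligned}$$

and (63) reduces to $E[(X-\theta )'\Sigma _{11}^{-1}g(X,Y)]=E[\text {div}_x(g(X,Y))]$, which is the familiar form of Stein's identity for a normal vector.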

A.6. Proof of Lemma 8 in Sect. 4

Proof

Let $t=\frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)$. Denoting

$$\begin{aligned} F(t) = \frac{1}{2}\int _t^{\infty }f(u){\text {d}}u, \end{aligned}$$

the partial derivative of F(t) with respect to the $i^{th}$ coordinate of u is

$$\begin{aligned} \frac{\partial }{\partial u_i}F(t)= \frac{\partial }{\partial u_i} \frac{1}{2}\int _t^{\infty }f(u){\text {d}}u = -f(t)\frac{u_i}{\sigma ^2}. \end{aligned}$$

(64)

Therefore,

$$\begin{aligned}{} & {} E\left[ \frac{h(X,Y,S)}{\sigma ^2}\right] \nonumber \\ {}{} & {} \quad =\int _{\Re ^{2p+k}} \frac{h\left( x,y,\Vert u \Vert ^2\right) }{\sigma ^2}f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) {\text {d}}u{\text {d}}x{\text {d}}y\nonumber \\{} & {} \quad = \int _{\Re ^{2p+k}} \frac{u'}{\sigma ^2} \frac{uh(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2}f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) \nonumber \\ {}{} & {} \quad {\text {d}}u{\text {d}}x{\text {d}}y. \end{aligned}$$

(65)

Let $g_{x,y}(u) = \frac{uh(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2}$ with $i^{th}$ coordinate

$$\begin{aligned} g_{x,y}(u)_i = \frac{u_ih(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2} \end{aligned}$$

in (65). By the weak differentiability of g, and the expression for the partial derivative of F(t) in (64), expression (65) satisfies

$$\begin{aligned}{} & {} \sum _{i=1}^k \int _{\Re ^{2p+k}} \frac{u_i}{\sigma ^2}g_{x,y}(u)_i f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) {\text {d}}u{\text {d}}x{\text {d}}y, \nonumber \\\end{aligned}$$

(66)

$$\begin{aligned}{} & {} \quad = \sum _{i=1}^k\int _{\Re ^{2p+k}}g_{x,y}(u)_i \left( -\frac{\partial }{\partial u_i} F(t)\right) {\text {d}}u{\text {d}}x{\text {d}}y, \end{aligned}$$

(67)

$$\begin{aligned}{} & {} \quad = \sum _{i=1}^k \int _{\Re ^{2p+k}} \left( \frac{\partial }{\partial u_i}g_{x,y}(u)_i\right) F(t){\text {d}}u{\text {d}}x{\text {d}}y \end{aligned}$$

(68)

$$\begin{aligned}{} & {} \quad = \int _{\Re ^{2p+k}} \text {div}_u(g_{x,y}(u))\frac{F(t)}{f(t)}f(t){\text {d}}u{\text {d}}x{\text {d}}y \end{aligned}$$

$$\begin{aligned}{} & {} \quad = E\left[ \text {div}_u(g_{x,y}(u))\frac{F(t)}{f(t)}\right] , \end{aligned}$$

(69)

where the equality from (67) to (68) is justified by the weak differentiability of $g_{x,y}(u)$ for all (x, y).

Since

$$\begin{aligned} \text {div}_u(g_{x,y}(u))=(k-2)\frac{h(x,y,s)}{s}+2\frac{\partial }{\partial s}h(x,y,s), \qquad s=\Vert u \Vert ^2, \end{aligned}$$

expression (69) is equivalent to

$$\begin{aligned} E\left[ \left( (k-2)\frac{h(X,Y,S)}{S}+2\frac{\partial }{\partial S}h(X,Y,S)\right) \frac{F(t)}{f(t)}\right] \end{aligned}$$

establishing the result. $\square $
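The divergence formula used in the last step can be verified coordinate by coordinate. The following worked derivation is added for illustration; it writes $s=\Vert u \Vert ^2$, so that $\frac{\partial s}{\partial u_i}=2u_i$, and abbreviates $h=h(x,y,s)$:

$$\begin{aligned} \frac{\partial }{\partial u_i}\left( \frac{u_i h}{s}\right)&=\frac{h}{s}+2u_i^2\left( \frac{1}{s}\frac{\partial h}{\partial s}-\frac{h}{s^2}\right) ,\\ \sum _{i=1}^k\frac{\partial }{\partial u_i}\left( \frac{u_i h}{s}\right)&=\frac{kh}{s}+2s\left( \frac{1}{s}\frac{\partial h}{\partial s}-\frac{h}{s^2}\right) =(k-2)\frac{h}{s}+2\frac{\partial h}{\partial s}. \end{aligned}$$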

Appendix B

B.1. Development of UMVUE Estimator

When $\eta =0$ the density of the pair of random variables $\begin{pmatrix} X\\ Y \end{pmatrix}$ is

$$\begin{aligned} f(x,y)= \vert 2\pi \Sigma \vert ^{-\frac{1}{2}}\exp \left( -\frac{1}{2}\left( \begin{pmatrix} x-\theta \\ y-\theta \end{pmatrix}'\Sigma ^{-1} \begin{pmatrix} x-\theta \\ y-\theta \end{pmatrix} \right) \right) . \end{aligned}$$

(70)

Let

$$\begin{aligned} X_{2.1}= Y-\Sigma _{21}\Sigma _{11}^{-1}X \end{aligned}$$

so that X is independent of $X_{2.1}$, and

$$\begin{aligned} Y^* =(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}X_{2.1} \end{aligned}$$

(71)

so that,

$$\begin{aligned} E[Y^*]=(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}(I-\Sigma _{21}\Sigma _{11}^{-1})\theta =\theta \end{aligned}$$

(72)

$Y^*$ is an unbiased estimator of $\theta $, with variance

$$\begin{aligned} \text {var}(Y^*)=\Sigma _{Y^*}=(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}[\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12}](I-\Sigma _{21}\Sigma _{11}^{-1})^{-t}. \end{aligned}$$

(73)

The joint density of $\begin{pmatrix} X\\ Y^*\end{pmatrix} $ is

$$\begin{aligned} f(x,y^*)=k\exp \left( -\frac{1}{2}\left( \begin{pmatrix}x-\theta \\ y^*-\theta \end{pmatrix}' \begin{pmatrix} \Sigma _{11}^{-1} &{} 0 \\ 0 &{} \Sigma _{Y^*}^{-1} \end{pmatrix} \begin{pmatrix} x-\theta \\ y^*-\theta \end{pmatrix}\right) \right) , \end{aligned}$$

(74)

where k is the normalizing constant in (74). Expanding the quadratic form in the exponent of density (74) identifies a complete sufficient statistic for the parameter $\theta $; an unbiased estimator of $\theta $ that is a function of this statistic is then the UMVUE. Expanding the quadratic form in (74) yields

$$\begin{aligned} x'\Sigma _{11}^{-1}x+\theta '\Sigma _{11}^{-1} \theta -2\theta '\Sigma _{11}^{-1}x +y^{*}{}'\Sigma _{Y^*}^{-1}y^*+ \theta '\Sigma _{Y^*}^{-1} \theta -2\theta '\Sigma _{Y^*}^{-1}y^*, \end{aligned}$$

(75)

which implies the density in (74) is an exponential family of the form

$$\begin{aligned} h(x,y^*)g(\theta )e^{\theta 'T(X,Y^*)} \end{aligned}$$

(76)

with complete sufficient statistic

$$\begin{aligned} T(X,Y^*)= \Sigma _{11}^{-1}X+\Sigma _{Y^*}^{-1}Y^*. \end{aligned}$$

(77)

Thus, the UMVUE estimator of $\theta $, $\delta _c$, is

$$\begin{aligned} \delta _c(X,Y)=(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}(\Sigma _{11}^{-1}X+\Sigma _{Y^*}^{-1}Y^*), \end{aligned}$$

(78)

with variance

$$\begin{aligned} \Sigma _{\delta _c}=\text {Cov}(\delta _c,\delta _c)=(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}. \end{aligned}$$

(79)

When the loss is of the form $L_Q(d,\theta )=(d-\theta )'Q(d-\theta )$, the risk of $\delta _c$ when $Y^*$ is unbiased for $\theta $ is

$$\begin{aligned}{} & {} E[(\delta _c-\theta )'Q(\delta _c-\theta )]=tr(Q\Sigma _{\delta _c})=tr(Q(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}) \\{} & {} \quad = \text {tr}(Q[\Sigma _{11}^{-1}+(I-\Sigma _{21}\Sigma _{11}^{-1})'[\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12}]^{-1}(I-\Sigma _{21}\Sigma _{11}^{-1})]^{-1} ). \end{aligned}$$

When the researcher is mistaken and $Y^*$ is biased for $\theta $ ($\eta \ne 0$), so that $\delta _c$ has bias

$$\begin{aligned}{} & {} E[\delta _c-\theta ]= E[(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}(\Sigma _{11}^{-1}\theta +\Sigma _{Y^*}^{-1}(\theta +\eta ))-\theta ] \\{} & {} \quad = (\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}\Sigma _{Y^*}^{-1}\eta , \end{aligned}$$

the risk of using $\delta _c$ as an estimator of $\theta $ will be

$$\begin{aligned} E[\Vert \delta _c(X,Y)-\theta \Vert ^2_{Q}]= \text {tr}(Q\Sigma _{\delta _c})+\eta '\Sigma _{Y^*}^{-1}(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1} )^{-1}Q(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1} )^{-1}\Sigma _{Y^*}^{-1}\eta , \end{aligned}$$

which is unbounded in $\eta $, unlike the estimators developed in Sect. 2, which have bounded risk.
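The formulas (73), (79) and the risk expression above can be traced through in a simple special case. The sketch below is an illustration, not the authors' code: it assumes $\Sigma _{11}=\sigma ^2I_p$, $\Sigma _{22}=\tau ^2I_p$, $\Sigma _{12}=\varrho \sigma \tau I_p$ and $Q=I_p$ (the structure suggested by the constants $A$ and $B$ of Sect. B.2), so that every matrix reduces to a scalar multiple of the identity. The function names and numerical values are hypothetical.

```python
def umvue_risk(p, sigma, tau, rho, eta_sq):
    """Risk of delta_c under the assumed scalar covariance structure.

    eta_sq is eta'eta; the bias term follows the expression printed in B.1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    shrink = 1.0 - rho * tau / sigma       # I - Sigma21 Sigma11^{-1}, a scalar here
    if shrink == 0.0:
        raise ValueError("Y* is undefined when sigma == rho * tau")
    var_x = sigma ** 2                                   # Sigma11 (per coordinate)
    var_ystar = tau ** 2 * (1.0 - rho ** 2) / shrink ** 2  # (73)
    var_c = 1.0 / (1.0 / var_x + 1.0 / var_ystar)        # (79)
    weight_y = var_c / var_ystar                         # weight given to Y*
    return p * var_c + weight_y ** 2 * eta_sq


def umvue_risk_closed_form(p, sigma, tau, rho, eta_sq):
    """The same risk written with the constants A and B of Sect. B.2."""
    A = sigma ** 2 * tau ** 2 * (1.0 - rho ** 2)
    B = (sigma ** 2 - rho * sigma * tau) ** 2
    return p * sigma ** 2 * A / (A + B) + eta_sq * B ** 2 / (A + B) ** 2


if __name__ == "__main__":
    for eta_sq in (0.0, 3.0, 30.0):
        print(eta_sq,
              umvue_risk(5, 1.0, 2.0, 0.3, eta_sq),
              umvue_risk_closed_form(5, 1.0, 2.0, 0.3, eta_sq))
```

The two functions agree, and the linear growth in `eta_sq` makes the unboundedness of the risk in $\eta $ concrete.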

B.2. Comparison of Risks Between $\delta _c$ (50) and $\delta _1$ (45)

To compare the risks of $\delta _1(X,Y)$ and $\delta _c(X,Y)$, we first consider the case $\eta =0$ and show that the risk of $\delta _c$ is no larger than that of $\delta _1$. We then give a sufficient condition under which $\delta _1$ has smaller risk than $\delta _c$, obtained by comparing the upper bound for the risk of $\delta _1(X,Y)$ developed in Corollary 2 with the exact risk of $\delta _c$. When $\eta '\eta =0$,

$$\begin{aligned} p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B} \le R(\delta _1(X,Y), \theta , \eta ) \le p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B}, \end{aligned}$$

where

$$\begin{aligned} A:= \sigma ^2\tau ^2(1-\varrho ^2) \end{aligned}$$

and

$$\begin{aligned} B:= (\sigma ^2-\varrho \sigma \tau )^2, \end{aligned}$$

by Corollary 2. In comparison, the exact risk of $\delta _c$ when $\eta =0$ is

$$\begin{aligned} R(\delta _c(X,Y),\theta ,0) = \frac{p\sigma ^2A}{A+B}=\frac{p\sigma ^2(\sigma ^2\tau ^2(1-\varrho ^2))}{\sigma ^2\tau ^2(1-\varrho ^2)+(\sigma ^2-\varrho \sigma \tau )^2} \end{aligned}$$

so that

$$\begin{aligned}{} & {} R(\delta _1(X,Y),\theta , 0)-R(\delta _c(X,Y),\theta , 0)=p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B}- \frac{p\sigma ^2A}{A+B} \\{} & {} \quad = p\sigma ^2\frac{B}{A+B}-(p-2)\sigma ^2\frac{B}{A+B}= \frac{2\sigma ^2B}{A+B} \ge 0 \end{aligned}$$

implying that $R(\delta _c(X,Y), \theta , 0 ) \le R(\delta _1(X,Y), \theta , 0)$, with strict inequality whenever $B>0$.

The upper bound for the risk of $\delta _1$ given by Corollary 2 is less than the risk of $\delta _c$ when

$$\begin{aligned}{} & {} p\sigma ^2-\frac{(p-2)\sigma ^2B}{(p-2)[B+A]+\sigma ^2\eta '\eta }< \frac{p\sigma ^2A}{A+B}+ \frac{\eta '\eta B^2}{[A+B]^2}\nonumber \\{} & {} \quad \Leftrightarrow \frac{p\sigma ^2B}{A+B} -\frac{(p-2)\sigma ^2B}{(p-2)[B+A]+\sigma ^2\eta '\eta } < \frac{\eta '\eta B^2}{[A+B]^2} \nonumber \\{} & {} \quad \Leftrightarrow p\sigma ^2B(A+B)[(p-2)(B+A)+\sigma ^2\eta '\eta ]- (p-2)\sigma ^2B[A+B]^2\nonumber \\ \end{aligned}$$

(80)

$$\begin{aligned}{} & {} \quad < \eta '\eta B^2[(p-2)(B+A)+\sigma ^2\eta '\eta ]. \end{aligned}$$

(81)

Let $x =\eta '\eta $ where $x \in [0,\infty )$ and

$$\begin{aligned} Q(x)= ax^2+bx+c. \end{aligned}$$

For

$$\begin{aligned} a= & {} \sigma ^2B^2,\\ c= & {} -(p-2)\sigma ^2B(A+B)^2[p-1], \end{aligned}$$

and

$$\begin{aligned} b= B(B+A)[(p-2)B-p\sigma ^4], \end{aligned}$$

expressions (80)–(81) are satisfied when $Q(x)>0$. Since $a>0$ and $c<0$ (for $p\ge 3$ and $B>0$), Q(x) has two real roots, $r_1 <0$ and $r_2>0$. Because $a>0$, $x >r_2$ implies $Q(x)>0$, and so a sufficient condition for $R(\delta _1(X,Y), \theta ,\eta ) < R(\delta _c(X,Y), \theta ,\eta ) $ is

$$\begin{aligned}{} & {} \eta ' \eta > \frac{-(\sigma ^2-\varrho \sigma \tau )^2[(\sigma ^2-\varrho \sigma \tau )^2+\sigma ^2\tau ^2(1-\varrho ^2)]((p-2)(\sigma ^2-\varrho \sigma \tau )^2-p\sigma ^4)}{2\sigma ^2(\sigma ^2-\varrho \sigma \tau )^4}\\{} & {} \quad + \frac{\sqrt{(\sigma ^2-\varrho \sigma \tau )^4( (\sigma ^2-\varrho \sigma \tau )^2+\sigma ^2\tau ^2(1-\varrho ^2))^2 [((p-2)(\sigma ^2-\varrho \sigma \tau )^2-p\sigma ^4 )^2+4\sigma ^4(p-2)(p-1)(\sigma ^2-\varrho \sigma \tau )^2]}}{2\sigma ^2(\sigma ^2-\varrho \sigma \tau )^4}. \end{aligned}$$
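The threshold above is the larger root $r_2=\frac{-b+\sqrt{b^2-4ac}}{2a}$ of $Q(x)$, and it can be evaluated numerically. The sketch below is illustrative: it uses the coefficients $a$, $b$, $c$ and the two sides of the first line of (80) exactly as printed, while the function names and the parameter values are assumptions.

```python
import math


def constants(sigma, tau, rho):
    """The constants A and B defined in Sect. B.2."""
    A = sigma ** 2 * tau ** 2 * (1.0 - rho ** 2)
    B = (sigma ** 2 - rho * sigma * tau) ** 2
    return A, B


def quadratic_coefficients(p, sigma, tau, rho):
    """Coefficients of Q(x) = a x^2 + b x + c as printed in the text."""
    A, B = constants(sigma, tau, rho)
    a = sigma ** 2 * B ** 2
    b = B * (B + A) * ((p - 2) * B - p * sigma ** 4)
    c = -(p - 2) * (p - 1) * sigma ** 2 * B * (A + B) ** 2
    return a, b, c


def bias_threshold(p, sigma, tau, rho):
    """Positive root r_2 of Q; requires p >= 3 and B > 0."""
    a, b, c = quadratic_coefficients(p, sigma, tau, rho)
    if p < 3 or a <= 0.0:
        raise ValueError("need p >= 3 and sigma != rho * tau")
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


def delta1_upper_bound(p, sigma, tau, rho, x):
    """Left-hand side of the first line of (80), with x = eta'eta."""
    A, B = constants(sigma, tau, rho)
    return p * sigma ** 2 - (p - 2) * sigma ** 2 * B / ((p - 2) * (B + A) + sigma ** 2 * x)


def delta_c_risk(p, sigma, tau, rho, x):
    """Right-hand side of the first line of (80), with x = eta'eta."""
    A, B = constants(sigma, tau, rho)
    return p * sigma ** 2 * A / (A + B) + x * B ** 2 / (A + B) ** 2


if __name__ == "__main__":
    p, sigma, tau, rho = 5, 1.0, 2.0, 0.3   # illustrative values
    r2 = bias_threshold(p, sigma, tau, rho)
    for x in (0.5 * r2, 2.0 * r2):
        print(x, delta1_upper_bound(p, sigma, tau, rho, x),
              delta_c_risk(p, sigma, tau, rho, x))
```

Below `r2` the bound for $\delta _1$ exceeds the risk of $\delta _c$, and above it the ordering reverses, which is the content of the sufficient condition.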

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zinonos, S., Strawderman, W.E. On combining unbiased and possibly biased correlated estimators. Jpn J Stat Data Sci (2023). https://doi.org/10.1007/s42081-023-00194-2

Received: 23 October 2022
Revised: 26 January 2023
Accepted: 06 February 2023
Published: 07 March 2023
DOI: https://doi.org/10.1007/s42081-023-00194-2
