
On combining unbiased and possibly biased correlated estimators

  • Original Paper
  • Stein Estimation and Statistical Shrinkage Methods
  • Japanese Journal of Statistics and Data Science

Abstract

We study estimators that combine an unbiased estimator with a possibly biased correlated estimator of a mean vector. The combined estimators are shrinkage-type estimators that shrink the unbiased estimator towards the biased estimator. Conditions under which the combined estimator dominates the original unbiased estimator are given. Models studied include normal models with a known covariance structure, scale mixtures of normals, and more generally elliptically symmetric models with a known covariance structure. Elliptically symmetric models with a covariance structure known up to a multiple are also considered.


References

  • Arslan, O. (2001). Family of multivariate generalized t distributions. Journal of Multivariate Analysis, 89, 329–251.


  • Berger, J. (1975). Minimax estimation of location vectors for a wide class of densities. The Annals of Statistics, 3, 1318–1328.


  • Berger, J. (1976). Admissible minimax estimation of a multivariate normal mean with arbitrary quadratic loss. The Annals of Statistics, 4, 223–226.


  • Casella, G., & Hwang, J. T. (1982). Limit expressions for the risk of James–Stein estimators. Canadian Journal of Statistics-revue Canadienne De Statistique, 10, 305–309.


  • Cessie, S., Nagelkerke, N., Rosendal, F., Stralen, K., Pomp, E., & Houwelingen, H. (2008). Combining matched and unmatched control groups in case–control studies. American Journal of Epidemiology, 168, 1204–1210.


  • Fourdrinier, D., & Strawderman, W. (2008). Generalized Bayes minimax estimators of location vectors for spherically symmetric distributions. Journal of Multivariate Analysis, 99, 735–750.


  • Fourdrinier, D., Strawderman, W., & Wells, M. (2018). Shrinkage estimation (1st ed.). Springer Nature.


  • Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., & Hothorn, T. (2020). mvtnorm: Multivariate normal and t distributions. Retrieved from https://CRAN.R-project.org/package=mvtnorm

  • Green, E., & Strawderman, W. (1991). A James–Stein type estimator for combining unbiased and possibly biased estimators. Journal of the American Statistical Association, 86, 1001–1006.

  • Judge, G., & Mittelhammer, R. (2004). A semiparametric basis for combining estimation problems under quadratic loss. Journal of the American Statistical Association, 99, 479–487.

  • Lehmann, E., & Romano, J. (2005). Testing statistical hypotheses (3rd ed.). Springer.

  • Mardia, K., Kent, J., & Bibby, J. (1979). Multivariate analysis. Academic Press Inc.


  • Stapleton, J. (2009). Linear statistical models (2nd ed.). Wiley.


  • Strawderman, W. (1974). Minimax estimation of location parameters for certain spherically symmetric distributions. Journal of Multivariate Analysis, 4, 255–264.


  • Strawderman, W. (2003). On minimax estimation of a normal mean vector for general quadratic loss. Lecture Notes-Monograph series, 42, 223–226.


  • Venables, W., & Ripley, B. (2002). Modern applied statistics with S (4th ed.). Springer.



Author information

Corresponding author

Correspondence to Stavros Zinonos.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partially supported by grants from the Simons Foundation (#209035 and #418098 to William Strawderman).

Appendices

Appendix A

1.1 A.1. Proof of Lemma 2 in Sect. 2

Proof

Let \(Z\sim N_p(\mu , I_p)\); then \(Y=Z'Z\) has a \(\chi ^2_p(\mu '\mu )\) distribution with density:

$$\begin{aligned} f(y|p,\mu '\mu )=\sum _{k=0}^{\infty } P_k\left( \frac{\mu '\mu }{2}\right) f_{p+2k}(y), \end{aligned}$$

where \(P_k\) is the Poisson density, \(\frac{e^{-\frac{\mu '\mu }{2}}(\frac{\mu '\mu }{2})^k }{k!}\), and \(f_{p+2k}\) is the density of a \(\chi ^2_{p+2k}\), \(\frac{y^{\frac{p+2k}{2}-1}e^{-\frac{y}{2}}}{\Gamma (\frac{p+2k}{2})2^{\frac{p+2k}{2}}}\), so that Y is a Poisson mixture of central \(\chi ^2\) densities. Supposing that \(p\ge 3 \), by Tonelli’s theorem

$$\begin{aligned} E\left[ \frac{1}{Y}\right] =\sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) E\left[ \frac{1}{\chi ^2_{p+2k}}\right] = \end{aligned}$$
$$\begin{aligned} \sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) \frac{1}{p+2k-2}=E_{\frac{\mu '\mu }{2}}\left[ \frac{1}{p+2k-2}\right] , \end{aligned}$$
(51)

where the expectation in (51) is taken with respect to a Poisson distribution with parameter \(\frac{\mu '\mu }{2}\). For the expectation of \(\frac{1}{Y}\) to exist, \(p\ge 3 \) is necessary, since the \(k=0\) term requires \(E\left[ \frac{1}{\chi ^2_p}\right] < \infty \), which holds only when \(p\ge 3\). Furthermore, for \(p\ge 3\) and \(l>0\),

$$\begin{aligned} E\left[ \frac{1}{\chi ^2_{p+l}}\right]< & {} E\left[ \frac{1}{\chi ^2_{p}}\right] =\frac{1}{p-2}\\ {\text {so that}} \\ E\left[ \frac{1}{Y}\right]< & {} \sum _{k=0}^{\infty }P_k\left( \frac{\mu '\mu }{2}\right) \left[ \frac{1}{p-2}\right] = \frac{1}{p-2} \end{aligned}$$

which is finite. By the convexity of the function \(g(k)=\frac{1}{2k+p-2}\), Jensen’s inequality implies

$$\begin{aligned} \frac{1}{p-2+\mu '\mu }=\frac{1}{p-2+2Ek} \le E\left[ \frac{1}{p+2k-2}\right] \le E\left[ \frac{1}{p-2}\right] =\frac{1}{p-2}. \end{aligned}$$
(52)

Casella and Hwang (1982) give the more refined upper bound

$$\begin{aligned} E_{\mu ' \mu }\left[ \frac{1}{Z'Z}\right] \le \frac{1}{p-2}\left( \frac{p+2}{p+2+\mu '\mu }\right) \end{aligned}$$
(53)

for \(p \ge 3\). When \(p>3\), a further refinement of the upper bound in expression (53) is given in Green and Strawderman (1991):

$$\begin{aligned} E_{\mu '\mu }\left[ \frac{1}{Z'Z}\right] \le \frac{1}{p-4+\mu '\mu }. \end{aligned}$$
(54)

Combining the bounds in expressions (53) and (54), Green and Strawderman (1991) established, for \(p\ge 4\),

$$\begin{aligned} \frac{1}{p-2+\mu '\mu }\le E_{\mu '\mu }\left[ \frac{1}{Z'Z}\right] \le \text {min}\left\{ \frac{p+2}{(p-2)(p+2+\mu ' \mu )},\frac{1}{p-4+\mu '\mu }\right\} . \end{aligned}$$
(55)
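As a quick numerical sanity check of the two-sided bound in (55) (not part of the original proof), the following Python sketch estimates \(E[1/Z'Z]\) by Monte Carlo; the dimension, mean vector, and sample size are purely illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    p, n = 6, 200_000
    mu = np.full(p, 0.8)                     # illustrative mean vector
    lam = mu @ mu                            # noncentrality mu'mu

    Z = rng.normal(size=(n, p)) + mu         # Z ~ N_p(mu, I_p)
    est = (1.0 / np.einsum('ij,ij->i', Z, Z)).mean()   # Monte Carlo E[1/Z'Z]

    lower = 1.0 / (p - 2 + lam)
    upper = min((p + 2) / ((p - 2) * (p + 2 + lam)), 1.0 / (p - 4 + lam))
    print(f"lower {lower:.5f} <= E[1/Z'Z] ~ {est:.5f} <= upper {upper:.5f}")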

Suppose now that \(X\sim N_p(\mu ,\Sigma )\) and let \(Y=\Sigma ^{-\frac{1}{2}}X\sim N_p(\Sigma ^{-\frac{1}{2}}\mu ,I_p) \). For any \(p\times p\) orthogonal matrix U, \(Z=U'Y\sim N_p(U'\Sigma ^{-\frac{1}{2}}\mu ,I_p).\) Using the spectral decomposition of \(\Sigma \) to get

$$\begin{aligned} P'\Sigma P=\Lambda ={\text {Diag}}(\lambda _1,\lambda _2,\ldots ,\lambda _p), \end{aligned}$$

where P is the orthogonal matrix of eigenvectors of \(\Sigma \) and \(\Lambda \) is the diagonal matrix of eigenvalues of \(\Sigma \), and denoting by \(\lambda _{(1)}\) and \(\lambda _{(p)}\) the smallest and largest eigenvalues of \(\Sigma \), the following inequalities hold:

$$\begin{aligned} \lambda _{(1)}E[Z'Z]\le E[X'X]= E[X'\Sigma ^{-\frac{1}{2}}PP'\Sigma PP'\Sigma ^{-\frac{1}{2}}X] = E[Z'\Lambda Z]\le \lambda _{(p)}E[Z'Z] \end{aligned}$$

with \(Z=P'\Sigma ^{-\frac{1}{2}}X\). Noting that the function \(\phi (x)=\frac{1}{x}\) is convex for \(x>0\),

$$\begin{aligned}{} & {} \text {max}\left\{ \frac{1}{\lambda _{(p)}}E\left[ \frac{1}{Z'Z}\right] ,\frac{1}{\text {tr}(\Lambda )+\mu '\mu }\right\} \\ {}{} & {} \quad \le E\left[ \frac{1}{Z'\Lambda Z}\right] = E\left[ \frac{1}{X'X}\right] \le \frac{1}{\lambda _{(1)}}E\left[ \frac{1}{Z'Z}\right] < \infty \end{aligned}$$

provided \(p \ge 3\), where \(Z \sim N_p(P'\Sigma ^{-\frac{1}{2}}\mu ,I_p)\).

From expression (55) then, when \(p \ge 4 \),

$$\begin{aligned}{} & {} \text {max} \left\{ \frac{1}{\text {tr}(\Lambda ) + \mu '\mu },\frac{1}{\lambda _{(p)}}\left[ \frac{1}{p-2+\mu '\Sigma ^{-1}\mu }\right] \right\} \le E\left[ \frac{1}{X'X}\right] , \end{aligned}$$
(56)
$$\begin{aligned}{} & {} \quad \le \text {min}\left\{ \frac{1}{\lambda _{(1)}}\left[ \frac{1}{p-4+\mu '\Sigma ^{-1}\mu }\right] ,\frac{1}{\lambda _{(1)}}\left[ \frac{p+2}{(p-2)(p+2+\mu '\Sigma ^{-1}\mu )}\right] \right\} , \end{aligned}$$
(57)

and for \(p\ge 3\):

$$\begin{aligned}{} & {} \text {max} \left\{ \frac{1}{tr(\Lambda ) + \mu '\mu },\frac{1}{\lambda _{(p)}}\left[ \frac{1}{p-2+\mu '\Sigma ^{-1}\mu }\right] \right\} \le E\left[ \frac{1}{X'X}\right] \\{} & {} \quad \le \frac{1}{\lambda _{(1)}}\left[ \frac{p+2}{(p-2)(p+2+\mu '\Sigma ^{-1}\mu )}\right] . \end{aligned}$$

\(\square \)
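To illustrate the final display of the lemma (the \(p \ge 3\) case), here is a minimal Monte Carlo sketch with an arbitrary positive definite \(\Sigma \) and mean \(\mu \) (both illustrative assumptions, not taken from the paper); it checks that the estimated \(E[1/X'X]\) falls between the stated bounds.

    import numpy as np

    rng = np.random.default_rng(1)
    p, n = 5, 400_000
    mu = np.array([1.0, -0.5, 0.3, 0.0, 0.7])         # illustrative mean
    A = rng.normal(size=(p, p))
    Sigma = A @ A.T + p * np.eye(p)                    # arbitrary positive definite covariance

    X = rng.multivariate_normal(mu, Sigma, size=n)
    est = (1.0 / np.einsum('ij,ij->i', X, X)).mean()   # Monte Carlo E[1/X'X]

    lam = np.linalg.eigvalsh(Sigma)                    # ascending: lam[0] smallest, lam[-1] largest
    ncp = mu @ np.linalg.solve(Sigma, mu)              # mu' Sigma^{-1} mu
    lower = max(1.0 / (np.trace(Sigma) + mu @ mu),
                (1.0 / lam[-1]) / (p - 2 + ncp))
    upper = (1.0 / lam[0]) * (p + 2) / ((p - 2) * (p + 2 + ncp))
    print(f"{lower:.5f} <= E[1/X'X] ~ {est:.5f} <= {upper:.5f}")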

1.2 A.2. Proof of Lemma 5 in Sect. 2

Proof

Let \(q_{ij}\) denote the \((i,j)^{th}\) entry in Q.

$$\begin{aligned} \frac{\partial }{\partial x_i} \frac{r(\Vert x-y \Vert ^2_{Q})(x_i-y_i)}{\Vert x-y \Vert ^2_{Q}}= \end{aligned}$$

(by an application of Lemma 4 on \(\Vert x-y\Vert ^2_{Q}\))

$$\begin{aligned}{} & {} \frac{\Vert x-y \Vert ^2_{Q}[2r'(\Vert x-y \Vert ^2_{Q})\sum _{j=1}^pq_{ij}(x_j-y_j)(x_i-y_i)+r(\Vert x-y \Vert ^2_{Q})]}{(\Vert x-y \Vert ^2_{Q})^2} \\{} & {} \quad - \frac{2r(\Vert x-y \Vert ^2_{Q})\sum _{j=1}^pq_{ij}(x_j-y_j)(x_i-y_i)}{(\Vert x-y \Vert ^2_{Q})^2}, \end{aligned}$$

so that

$$\begin{aligned}{} & {} \frac{r^2( \Vert x-y \Vert ^2_Q)}{\Vert x-y \Vert ^2_Q}-2\text {div}_x\left( \frac{r(\Vert x-y\Vert ^2_Q)}{\Vert x-y\Vert ^2_Q} (x-y)\right) \nonumber \\{} & {} \quad = \frac{r^2(\Vert x-y\Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}}-2\left[ \frac{(p-2)r(\Vert x-y \Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}}+2r'(\Vert x-y \Vert ^2_{Q}) \right] \nonumber \\{} & {} \qquad \text {(by assumption ii)}\nonumber \\{} & {} \quad \le \frac{r^2(\Vert x-y \Vert ^2_{Q})-2(p-2)r(\Vert x-y \Vert ^2_{Q})}{\Vert x-y \Vert ^2_{Q}} \nonumber \\{} & {} \quad = \frac{-r(\Vert x-y \Vert ^2_{Q})[2(p-2)-r(\Vert x-y \Vert ^2_{Q})]}{\Vert x-y \Vert ^2_{Q}} \le 0. \end{aligned}$$
(58)

\(\square \)
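The key computation above is the closed form \(\text {div}_x\big (r(\Vert x-y\Vert ^2_Q)(x-y)/\Vert x-y\Vert ^2_Q\big )=(p-2)r(t)/t+2r'(t)\) with \(t=\Vert x-y\Vert ^2_Q\). The following sketch (not part of the original proof) checks this identity at a random point by central differences; the particular r, Q, and points are illustrative assumptions only.

    import numpy as np

    def div_numeric(vfield, x, h=1e-6):
        """Central-difference divergence of a vector field R^p -> R^p at x."""
        p = x.size
        total = 0.0
        for i in range(p):
            e = np.zeros(p); e[i] = h
            total += (vfield(x + e)[i] - vfield(x - e)[i]) / (2 * h)
        return total

    rng = np.random.default_rng(2)
    p = 5
    A = rng.normal(size=(p, p))
    Q = A @ A.T + np.eye(p)                     # arbitrary symmetric positive definite Q
    x, y = rng.normal(size=p), rng.normal(size=p)

    r  = lambda t: t / (1.0 + t)                # an illustrative smooth, bounded, nondecreasing r
    rp = lambda t: 1.0 / (1.0 + t) ** 2         # its derivative

    field = lambda z: r((z - y) @ Q @ (z - y)) * (z - y) / ((z - y) @ Q @ (z - y))

    t = (x - y) @ Q @ (x - y)
    closed_form = (p - 2) * r(t) / t + 2 * rp(t)
    print(div_numeric(field, x), closed_form)   # the two values should agree closely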

1.3 A.3. Proof of Corollary 2 in Sect. 2

Proof

Let \(Z=B(Y-AX)\). Using expression (14), the risk of the estimator \(\delta \), \(R(\delta ,\theta ,\eta )\) can be expressed as:

$$\begin{aligned}{} & {} R(X,\theta ) +E\left[ E\left[ \frac{a^2}{\Vert X-\theta _0\Vert ^2_{Q^*}}-2\text {div}_x\left( \frac{a}{\Vert X-\theta _0 \Vert ^2_{Q^*}}(X-\theta _0)\right) \mid Z=\theta _0\right] \right] \nonumber \\{} & {} \quad = \text {tr}(Q\Sigma _{11})+E\left[ E\left[ \frac{a^2}{\Vert X-\theta _0 \Vert ^2_{Q^*}}-2\frac{(p-2)a}{\Vert X-\theta _0 \Vert ^2_{Q^*}}\mid Z=\theta _0\right] \right] \nonumber \\{} & {} \quad = \text {tr}(Q\Sigma _{11}) -a(2(p-2)-a)E\left[ \frac{1}{\Vert X-Z \Vert ^2_{Q^*}}\right] . \end{aligned}$$
(59)

Since \(Q>0\) and symmetric, \(Q^*= \Sigma _{11}^{-1} Q^{-1}\Sigma _{11}^{-1}\) has a symmetric square root denoted by \(Q^{*}{}^{\frac{1}{2}}\). Let

$$\begin{aligned} \Sigma ^*= \Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B'. \end{aligned}$$

Making the change of variables,

$$\begin{aligned} W=X-Z=X-B(Y-AX) \sim N_p(\mu _{\theta \eta }, \Sigma ^*), \end{aligned}$$

and

$$\begin{aligned} V=Q^*{}^{\frac{1}{2}}W \sim N_p(Q^*{}^{\frac{1}{2}}\mu _{\theta \eta }, Q^*{}^{\frac{1}{2}}\Sigma ^*Q^*{}^{\frac{1}{2}}), \end{aligned}$$
(60)

in (59) implies

$$\begin{aligned} R(\delta ,\theta ,\eta )= tr(Q\Sigma _{11})-a(2(p-2)-a)E\left[ \frac{1}{V'V}\right] . \end{aligned}$$
(61)

Since V has a multivariate normal distribution, whose parameters are given in (60), an application of Lemma 2 to the \(E\left[ \frac{1}{V'V}\right] \) in (61) implies the result since any eigenvalue of \(Q^*{}^{\frac{1}{2}}[\Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B']Q^*{}^{\frac{1}{2}}\) is also an eigenvalue of \([\Sigma _{11}+B(\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12})B']Q^*\). \(\square \)
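The last step relies on the fact that \(Q^{*\frac{1}{2}}MQ^{*\frac{1}{2}}\) and \(MQ^*\) share the same spectrum for symmetric positive definite matrices. A quick numerical confirmation (with arbitrary stand-in matrices, not the paper's \(\Sigma \) blocks):

    import numpy as np

    rng = np.random.default_rng(3)
    p = 4
    A1, A2 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
    Qstar = A1 @ A1.T + np.eye(p)        # stand-in for Q* (symmetric, positive definite)
    M     = A2 @ A2.T + np.eye(p)        # stand-in for Sigma_11 + B(...)B'

    # symmetric square root of Q*
    w, V = np.linalg.eigh(Qstar)
    Qhalf = V @ np.diag(np.sqrt(w)) @ V.T

    ev1 = np.sort(np.linalg.eigvals(Qhalf @ M @ Qhalf).real)
    ev2 = np.sort(np.linalg.eigvals(M @ Qstar).real)
    print(np.allclose(ev1, ev2))         # True: the two spectra coincide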

1.4 A.4. Proof of Lemma 6 in Sect. 3.1

Proof

Let \(Y=\frac{1}{\sigma }\Sigma ^{-\frac{1}{2}}X\) and \(Z = P'Y\), where \(P'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}P =\Lambda \), with \(\Lambda \) the diagonal matrix of eigenvalues and P the orthogonal matrix of associated eigenvectors of \(\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}\). Then, \(Y \sim N_p(\frac{1}{\sigma }\Sigma ^{-\frac{1}{2}}\mu , I)\) and \(Z\sim N_p(\frac{1}{\sigma }P'\Sigma ^{-\frac{1}{2}}\mu ,I)\) so that:

$$\begin{aligned}{} & {} E\left[ \frac{X'QX}{\sigma ^2}\right] = E\left[ Y'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}Y\right] =E[(P'Y)'P'\Sigma ^{\frac{1}{2}}Q\Sigma ^{\frac{1}{2}}P(P'Y)]\\{} & {} \quad = E[Z'\Lambda Z] = E\left[ \sum _{i=1}^p \lambda _i Z_i^2\right] . \end{aligned}$$

Now, \(\lbrace Z_i \rbrace _{i=1}^p\) is an independent collection of random variables, where

$$\begin{aligned} Z_i \sim N\left( \frac{\mu ^*_i}{\sigma }, 1\right) \end{aligned}$$

where \(\mu ^*=P'\Sigma ^{-\frac{1}{2}}\mu \); thus \(\lbrace Z_i^2 \rbrace _{i=1}^p\) is a collection of independent random variables with

$$\begin{aligned} Z_i^2 \sim \chi ^2_1\left( \nu _i\right) , \quad \nu _i=\frac{\mu _i^{*2}}{\sigma ^2}. \end{aligned}$$

Since a non-central chi-squared random variable has a monotone likelihood ratio (Lehmann & Romano, 2005), \(\chi ^2_1(\nu _i)\) is stochastically increasing in the parameter \(\nu _i\); since \(\nu _i=\mu _i^{*2}/\sigma ^2\) is decreasing in \(\sigma ^2\), each \(Z_i^2\) is stochastically decreasing in \(\sigma ^2\). Let

$$\begin{aligned} U(z_1^2,\ldots ,z_p^2) =\sum _{i=1}^p \lambda _iz_i^2. \end{aligned}$$

Since U is increasing in each of its coordinates, \(\lbrace Z_i^2 \rbrace _{i=1}^p\) is an independent collection of random variables, and each \(Z_i^2\) is stochastically decreasing in \(\sigma ^2\), \(U(Z_1^2,Z_2^2,\ldots ,Z_p^2)=\sum _{i=1}^p\lambda _iZ_i^2\) is stochastically decreasing in \(\sigma ^2\), establishing the result. \(\square \)
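The stochastic monotonicity in \(\sigma ^2\) can be visualized with a small simulation: for fixed \(\lambda _i\) and \(\mu ^*\) (illustrative values only), the exceedance probability \(P(\sum _i\lambda _iZ_i^2>c)\) decreases as \(\sigma ^2\) grows. A minimal Python sketch:

    import numpy as np

    rng = np.random.default_rng(4)
    p, n = 4, 200_000
    lam = np.array([0.5, 1.0, 2.0, 3.0])         # illustrative eigenvalues of Sigma^{1/2} Q Sigma^{1/2}
    mu_star = np.array([1.0, -2.0, 0.5, 1.5])    # illustrative mu* = P' Sigma^{-1/2} mu
    c = 10.0                                     # an arbitrary threshold

    for sigma2 in (0.5, 1.0, 2.0, 4.0):
        Z = rng.normal(size=(n, p)) + mu_star / np.sqrt(sigma2)   # Z_i ~ N(mu*_i / sigma, 1)
        U = (lam * Z**2).sum(axis=1)                               # sum_i lam_i Z_i^2
        print(f"sigma^2={sigma2}: P(U > {c}) ~ {(U > c).mean():.3f}")  # decreasing in sigma^2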

1.5 A.5. Proof of Lemma 7 in Sect. 3.2

Proof

$$\begin{aligned}{} & {} E[(X-\theta )'\Sigma _{11}^{-1}g(X,Y)]\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p\int \left[ \sum _{j=1}^p \sigma ^*_{ij}(x_j-\theta _j)g_i(x,y)\right] \right. \nonumber \\ {}{} & {} \quad \left. f(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}) {\text {d}}x\right] {\text {d}}y\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p \int g_i(x,y) \left\{ -\frac{\partial }{\partial x_i}(\frac{1}{2}\int _{t}^{\infty }f(u){\text {d}}u)\right\} {\text {d}}x\right] {\text {d}}y\nonumber \\{} & {} \quad =\int \left[ \sum _{i=1}^p \int \frac{\partial }{\partial x_i}g_i(x,y)F(t){\text {d}}x\right] {\text {d}}y, \end{aligned}$$
(62)

where \(t=\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}\), and \(\sigma _{ij}^{*}\) is the \((i,j)\)th element of \(\Sigma _{11}^{-1}\). Upon dividing and multiplying (62) by f(t),

$$\begin{aligned}{} & {} \int \int \text {div}_x(g(x,y))\frac{F(t)}{f(t)}f(t){\text {d}}x{\text {d}}y\nonumber \\{} & {} \quad = E\left[ \text {div}_x(g(X,Y))\frac{F\left( \Vert X-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert Y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}\right) }{f\left( \Vert X-\theta \Vert ^2_{\Sigma _{11}^{-1}}+\Vert Y-(\theta +\eta ) \Vert ^2_{\Sigma _{22}^{-1}}\right) }\right] \end{aligned}$$
(63)

establishing the result. \(\square \)
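In the normal special case \(f(t)\propto e^{-t/2}\) one has \(F(t)=f(t)\), so \(F/f\equiv 1\) and the identity reduces to the usual Stein identity \(E[(X-\theta )'\Sigma _{11}^{-1}g(X,Y)]=E[\text {div}_x\,g(X,Y)]\), with X and Y independent under the assumed product quadratic form. The sketch below (not part of the original proof) checks this special case by Monte Carlo for an arbitrary smooth g; all numerical choices are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(5)
    p, n = 3, 400_000
    theta = np.array([0.5, -1.0, 0.2]); eta = np.array([1.0, 0.0, -0.5])
    A1, A2 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
    S11 = A1 @ A1.T + np.eye(p); S22 = A2 @ A2.T + np.eye(p)

    # Normal special case of the elliptical model: F(t)/f(t) = 1, X and Y independent.
    X = rng.multivariate_normal(theta, S11, size=n)
    Y = rng.multivariate_normal(theta + eta, S22, size=n)

    g     = lambda x, y: np.sin(x) * np.cos(y)                   # an arbitrary smooth g
    div_g = lambda x, y: (np.cos(x) * np.cos(y)).sum(axis=1)     # div_x g

    lhs = np.einsum('ij,ij->i', (X - theta) @ np.linalg.inv(S11), g(X, Y)).mean()
    rhs = div_g(X, Y).mean()
    print(lhs, rhs)    # should agree up to Monte Carlo error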

1.6 A.6. Proof of Lemma 8 in Sect. 4

Proof

Let \(t=\frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\). Denoting

$$\begin{aligned} F(t) = \frac{1}{2}\int _t^{\infty }f(w){\text {d}}w, \end{aligned}$$

the partial derivative of F(t) with respect to the \(i^{th}\) coordinate of u is

$$\begin{aligned} \frac{\partial }{\partial u_i}F(t)= \frac{\partial }{\partial u_i} \frac{1}{2}\int _t^{\infty }f(w){\text {d}}w = -f(t)\frac{u_i}{\sigma ^2}. \end{aligned}$$
(64)

Therefore,

$$\begin{aligned}{} & {} E\left[ \frac{h(X,Y,S)}{\sigma ^2}\right] \nonumber \\ {}{} & {} \quad =\int _{\Re ^{2p+k}} \frac{h\left( x,y,\Vert u \Vert ^2\right) }{\sigma ^2}f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) {\text {d}}u{\text {d}}x{\text {d}}y\nonumber \\{} & {} \quad = \int _{\Re ^{2p+k}} \frac{u'}{\sigma ^2} \frac{uh(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2}f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) \nonumber \\ {}{} & {} \quad {\text {d}}u{\text {d}}x{\text {d}}y. \end{aligned}$$
(65)

Let \(g_{x,y}(u) = \frac{uh(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2}\) with \(i^{th}\) coordinate

$$\begin{aligned} g_{x,y}(u)_i = \frac{u_ih(x,y,\Vert u \Vert ^2)}{\Vert u \Vert ^2} \end{aligned}$$

in (65). By the weak differentiability of g and the expression (64) for the partial derivatives of F(t), expression (65) equals

$$\begin{aligned}{} & {} \sum _{i=1}^k \int _{\Re ^{2p+k}} \frac{u_i}{\sigma ^2}g_{x,y}(u)_i f\left( \frac{1}{\sigma ^2}(\Vert x-\theta \Vert ^2_{\Sigma _{11}^{-1}} + \Vert y-\eta ^* \Vert ^2_{\Sigma _{22}^{-1}}+\Vert u \Vert ^2)\right) {\text {d}}u{\text {d}}x{\text {d}}y, \nonumber \\\end{aligned}$$
(66)
$$\begin{aligned}{} & {} \quad = \sum _{i=1}^k\int _{\Re ^{2p+k}}g_{x,y}(u)_i \left( -\frac{\partial }{\partial u_i} F(t)\right) {\text {d}}u{\text {d}}x{\text {d}}y, \end{aligned}$$
(67)
$$\begin{aligned}{} & {} \quad = \sum _{i=1}^k \int _{\Re ^{2p+k}} \left( \frac{\partial }{\partial u_i}g_{x,y}(u)_i\right) F(t){\text {d}}u{\text {d}}x{\text {d}}y \end{aligned}$$
(68)
$$\begin{aligned}{} & {} \quad = \int _{\Re ^{2p+k}} \text {div}_u(g_{x,y}(u))\frac{F(t)}{f(t)}f(t){\text {d}}u{\text {d}}x{\text {d}}y \end{aligned}$$
$$\begin{aligned}{} & {} \quad = E\left[ \text {div}_u(g_{x,y}(u))\frac{F(t)}{f(t)}\right] , \end{aligned}$$
(69)

where the equality from (67) to (68) is justified by the weak differentiability of \(g_{x,y}(u)\) for all (x, y).

Since

$$\begin{aligned} \text {div}_u(g_{x,y}(u))=(k-2)\frac{h(x,y,s)}{s}+2\frac{\partial }{\partial s}h(x,y,s), \end{aligned}$$

expression (69) is equivalent to

$$\begin{aligned} E\left[ \left( (k-2)\frac{h(X,Y,S)}{S}+2\frac{\partial }{\partial S}h(X,Y,S)\right) \frac{F(t)}{f(t)}\right] \end{aligned}$$

establishing the result. \(\square \)
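Again specializing to the normal case, where \(F(t)/f(t)\equiv 1\), and taking an h that depends only on \(s=\Vert u\Vert ^2\) so that X and Y integrate out, the identity states \(E[h(S)/\sigma ^2]=E[(k-2)h(S)/S+2h'(S)]\). A minimal Monte Carlo check with the illustrative choice \(h(s)=s^2\) (an assumption for illustration, not the paper's h):

    import numpy as np

    rng = np.random.default_rng(6)
    k, n, sigma2 = 6, 500_000, 2.5                        # illustrative dimension and scale
    U = rng.normal(scale=np.sqrt(sigma2), size=(n, k))    # normal special case: F(t)/f(t) = 1
    S = (U**2).sum(axis=1)

    h     = lambda s: s**2                                # smooth h depending on s only
    dh_ds = lambda s: 2 * s

    lhs = (h(S) / sigma2).mean()
    rhs = ((k - 2) * h(S) / S + 2 * dh_ds(S)).mean()
    print(lhs, rhs)    # both approximate k*(k+2)*sigma^2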

Appendix B

1.1 B.1. Development of UMVUE Estimator

When \(\eta =0\) the density of the pair of random variables \(\begin{pmatrix} X\\ Y \end{pmatrix}\) is

$$\begin{aligned} f(x,y)= \vert 2\pi \Sigma \vert ^{-\frac{1}{2}}\exp \left( -\frac{1}{2}\left( \begin{pmatrix} x-\theta \\ y-\theta \end{pmatrix}'\Sigma ^{-1} \begin{pmatrix} x-\theta \\ y-\theta \end{pmatrix} \right) \right) . \end{aligned}$$
(70)

Let

$$\begin{aligned} X_{2.1}= Y-\Sigma _{21}\Sigma _{11}^{-1}X \end{aligned}$$

so that X is independent of \(X_{2.1}\), and

$$\begin{aligned} Y^* =(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}X_{2.1} \end{aligned}$$
(71)

so that,

$$\begin{aligned} E[Y^*]=(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}(I-\Sigma _{21}\Sigma _{11}^{-1})\theta =\theta \end{aligned}$$
(72)

\(Y^*\) is an unbiased estimator of \(\theta \), with variance

$$\begin{aligned} \text {var}(Y^*)=\Sigma _{Y^*}=(I-\Sigma _{21}\Sigma _{11}^{-1})^{-1}[\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12}](I-\Sigma _{21}\Sigma _{11}^{-1})^{-t}. \end{aligned}$$
(73)

The joint density of \(\begin{pmatrix} X\\ Y^*\end{pmatrix} \) is

$$\begin{aligned} f(x,y^*)=k\exp \left( -\frac{1}{2}\left( \begin{pmatrix}x-\theta \\ y^*-\theta \end{pmatrix}' \begin{pmatrix} \Sigma _{11}^{-1} &{} 0 \\ 0 &{} \Sigma _{Y^*}^{-1} \end{pmatrix} \begin{pmatrix} x-\theta \\ y^*-\theta \end{pmatrix}\right) \right) , \end{aligned}$$
(74)

where k is the normalizing constant in (74). Expanding the quadratic form in the exponent of (74) identifies a complete sufficient statistic for the parameter \(\theta \); an unbiased estimator that is a function of this statistic is then the UMVUE. The expansion yields

$$\begin{aligned} x'\Sigma _{11}^{-1}x+\theta '\Sigma _{11}^{-1} \theta -2\theta '\Sigma _{11}^{-1}x +y^{*}{}'\Sigma _{Y^*}^{-1}y^*+ \theta '\Sigma _{Y^*}^{-1} \theta -2\theta '\Sigma _{Y^*}^{-1}y^*, \end{aligned}$$
(75)

which implies the density in (74) is an exponential family of the form

$$\begin{aligned} h(x,y^*)g(\theta )e^{\theta 'T(X,Y^*)} \end{aligned}$$
(76)

with complete sufficient statistic

$$\begin{aligned} T(X,Y^*)= \Sigma _{11}^{-1}X+\Sigma _{Y^*}^{-1}Y^*. \end{aligned}$$
(77)

Thus, the UMVUE estimator of \(\theta \), \(\delta _c\), is

$$\begin{aligned} \delta _c(X,Y)=(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}(\Sigma _{11}^{-1}X+\Sigma _{Y^*}^{-1}Y^*), \end{aligned}$$
(78)

with variance

$$\begin{aligned} \Sigma _{\delta _c}=\text {Cov}(\delta _c,\delta _c)=(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}. \end{aligned}$$
(79)

When the loss is of the form \(L_Q(d,\theta )=(d-\theta )'Q(d-\theta )\), the risk of \(\delta _c\) when \(Y^*\) is unbiased for \(\theta \) is

$$\begin{aligned}{} & {} E[(\delta _c-\theta )'Q(\delta _c-\theta )]=tr(Q\Sigma _{\delta _c})=tr(Q(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}) \\{} & {} \quad = \text {tr}(Q[\Sigma _{11}^{-1}+(I-\Sigma _{21}\Sigma _{11}^{-1})'[\Sigma _{22}-\Sigma _{21}\Sigma _{11}^{-1}\Sigma _{12}]^{-1}(I-\Sigma _{21}\Sigma _{11}^{-1})]^{-1} ). \end{aligned}$$

When the researcher is mistaken and \(Y^*\) is biased for \(\theta \) (\(\eta \ne 0\)), with bias

$$\begin{aligned}{} & {} E[\delta _c-\theta ]= E[(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}(\Sigma _{11}^{-1}\theta +\Sigma _{Y^*}^{-1}(\theta +\eta ))-\theta ] \\{} & {} \quad = (\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1})^{-1}\Sigma _{Y^*}^{-1}\eta , \end{aligned}$$

the risk of using \(\delta _c\) as an estimator of \(\theta \) will be

$$\begin{aligned} E[\Vert \delta _c(X,Y)-\theta \Vert ^2_{Q}]= \text {tr}(Q\Sigma _{\delta _c})+\eta '\Sigma _{Y^*}^{-1}(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1} )^{-1}Q(\Sigma _{11}^{-1}+\Sigma _{Y^*}^{-1} )^{-1}\Sigma _{Y^*}^{-1}\eta , \end{aligned}$$

which is unbounded in \(\eta \), unlike the estimators developed in Sect. 2, whose risk is bounded.
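The construction of \(Y^*\), \(\delta _c\), and the covariance formulas (73) and (79) can be checked numerically. The sketch below uses the scalar covariance structure \(\Sigma _{11}=\sigma ^2I\), \(\Sigma _{22}=\tau ^2I\), \(\Sigma _{12}=\varrho \sigma \tau I\) of Sect. B.2 with illustrative values of \(\sigma ,\tau ,\varrho ,\theta \) (assumptions for illustration only), and verifies the unbiasedness of \(\delta _c\) when \(\eta =0\) together with expression (79).

    import numpy as np

    rng = np.random.default_rng(7)
    p, n = 3, 400_000
    sigma, tau, rho = 1.0, 1.5, 0.5                         # illustrative scalar covariance structure
    theta = np.array([1.0, -2.0, 0.5]); eta = np.zeros(p)   # eta = 0: Y* unbiased case

    S11 = sigma**2 * np.eye(p); S22 = tau**2 * np.eye(p)
    S12 = rho * sigma * tau * np.eye(p); S21 = S12.T
    Sigma = np.block([[S11, S12], [S21, S22]])

    XY = rng.multivariate_normal(np.concatenate([theta, theta + eta]), Sigma, size=n)
    X, Y = XY[:, :p], XY[:, p:]

    C = S21 @ np.linalg.inv(S11)                    # Sigma_21 Sigma_11^{-1}
    M = np.linalg.inv(np.eye(p) - C)
    Ystar = (Y - X @ C.T) @ M.T                     # Y* = (I - C)^{-1}(Y - C X), row-wise
    SY = M @ (S22 - C @ S12) @ M.T                  # var(Y*), expression (73)

    W = np.linalg.inv(np.linalg.inv(S11) + np.linalg.inv(SY))
    delta_c = (X @ np.linalg.inv(S11).T + Ystar @ np.linalg.inv(SY).T) @ W.T

    print(delta_c.mean(axis=0))                     # ~ theta (unbiasedness when eta = 0)
    print(np.diag(np.cov(delta_c.T)), np.diag(W))   # empirical variance vs. expression (79)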

1.2 B.2. Comparison of Risks Between \(\delta _c\) (50) and \(\delta _1\) (45)

To compare the risks of \(\delta _1(X,Y)\) and \(\delta _c(X,Y)\), we first show that \(\delta _c\) has uniformly smaller risk than \(\delta _1\) when \(\eta =0\). We then give a sufficient condition under which the risk of \(\delta _1\) is smaller than that of \(\delta _c\), by comparing the upper bound on the risk of \(\delta _1(X,Y)\) developed in Corollary 2 with the exact risk of \(\delta _c\). When \(\eta '\eta =0\),

$$\begin{aligned} p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B} \le R(\delta _1(X,Y), \theta , \eta ) \le p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B}, \end{aligned}$$

where

$$\begin{aligned} A:= \sigma ^2\tau ^2(1-\varrho ^2) \end{aligned}$$

and

$$\begin{aligned} B:= (\sigma ^2-\varrho \sigma \tau )^2, \end{aligned}$$

by Corollary 2; since the lower and upper bounds coincide, this is the exact risk of \(\delta _1\) when \(\eta =0\). In comparison, the exact risk of \(\delta _c\) is

$$\begin{aligned} R(\delta _c(X,Y),\theta ,\eta ) = \frac{p\sigma ^2A}{A+B}=\frac{p\sigma ^2(\sigma ^2\tau ^2(1-\varrho ^2))}{\sigma ^2\tau ^2(1-\varrho ^2)+(\sigma ^2-\varrho \sigma \tau )^2} \end{aligned}$$

so that

$$\begin{aligned}{} & {} R(\delta _1(X,Y),\theta , 0)-R(\delta _c(X,Y),\theta , 0)=p\sigma ^2-\frac{(p-2)\sigma ^2B}{A+B}- \frac{p\sigma ^2A}{A+B} \\{} & {} \quad = p\sigma ^2\frac{B}{A+B}-(p-2)\sigma ^2\frac{B}{A+B}= \frac{2\sigma ^2B}{A+B} \ge 0 \end{aligned}$$

implying that \(R(\delta _c(X,Y), \theta , 0 ) \le R(\delta _1(X,Y), \theta , 0)\), with strict inequality whenever \(B>0\).

When the upper bound for the risk of \(\delta _1\) given by Corollary 2 is less than the risk of \(\delta _c\)

$$\begin{aligned}{} & {} p\sigma ^2-\frac{(p-2)\sigma ^2B}{(p-2)[B+A]+\sigma ^2\eta '\eta }< \frac{p\sigma ^2A}{A+B}+ \frac{\eta '\eta B^2}{[A+B]^2}\nonumber \\{} & {} \quad \Leftrightarrow \frac{p\sigma ^2B}{A+B} -\frac{(p-2)\sigma ^2B}{(p-2)[B+A]+\sigma ^2\eta '\eta } < \frac{\eta '\eta B^2}{[A+B]^2} \nonumber \\{} & {} \quad \Leftrightarrow p\sigma ^2B(A+B)[(p-2)(B+A)+\sigma ^2\eta '\eta ]- (p-2)\sigma ^2B[A+B]^2\nonumber \\ \end{aligned}$$
(80)
$$\begin{aligned}{} & {} \quad < \eta '\eta B^2[(p-2)(B+A)+\sigma ^2\eta '\eta ]. \end{aligned}$$
(81)

Let \(x =\eta '\eta \) where \(x \in [0,\infty )\) and

$$\begin{aligned} Q(x)= ax^2+bx+c. \end{aligned}$$

For

$$\begin{aligned} a= & {} \sigma ^2B^2,\\ c= & {} -(p-2)\sigma ^2B(A+B)^2[p-1], \end{aligned}$$

and

$$\begin{aligned} b= B(B+A)[(p-2)B-p\sigma ^4], \end{aligned}$$

expressions (80)–(81) hold precisely when \(Q(x)>0\). Since \(a>0\) and \(c<0\), Q(x) has two real roots \(r_1\) and \(r_2\) with \(r_1<0<r_2\), and, since \(a>0\), \(x >r_2\) implies \(Q(x)>0\). A sufficient condition for \(R(\delta _1(X,Y), \theta ,\eta ) < R(\delta _c(X,Y), \theta ,\eta ) \) is therefore

$$\begin{aligned}{} & {} \eta ' \eta > \frac{-(\sigma ^2-\varrho \sigma \tau )^2[(\sigma ^2-\varrho \sigma \tau )^2+\sigma ^2\tau ^2(1-\varrho ^2)]((p-2)(\sigma ^2-\varrho \sigma \tau )^2-p\sigma ^4)}{2\sigma ^2(\sigma ^2-\varrho \sigma \tau )^4}\\{} & {} \quad + \frac{\sqrt{(\sigma ^2-\varrho \sigma \tau )^4( (\sigma ^2-\varrho \sigma \tau )^2+\sigma ^2\tau ^2(1-\varrho ^2))^2 [((p-2)(\sigma ^2-\varrho \sigma \tau )^2-p\sigma ^4 )^2+4\sigma ^4(p-2)(p-1)(\sigma ^2-\varrho \sigma \tau )^2]}}{2\sigma ^2(\sigma ^2-\varrho \sigma \tau )^4}. \end{aligned}$$
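For concreteness, the threshold \(r_2\) and the resulting risk comparison can be evaluated numerically. The sketch below computes \(r_2\) from the coefficients a, b, c above and checks that the Corollary 2 upper bound on the risk of \(\delta _1\) drops below the exact risk of \(\delta _c\) once \(\eta '\eta \) exceeds \(r_2\); the values of p, \(\sigma \), \(\tau \), \(\varrho \) are illustrative assumptions.

    import numpy as np

    p, sigma, tau, rho = 6, 1.0, 1.5, 0.5             # illustrative choices
    A = sigma**2 * tau**2 * (1 - rho**2)
    B = (sigma**2 - rho * sigma * tau)**2

    # coefficients of Q(x) = a x^2 + b x + c, with x = eta'eta
    a = sigma**2 * B**2
    b = B * (A + B) * ((p - 2) * B - p * sigma**4)
    c = -(p - 2) * sigma**2 * B * (A + B)**2 * (p - 1)
    r2 = (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)   # positive root: threshold on eta'eta

    def risk_delta_c(x):                              # exact risk of delta_c
        return p * sigma**2 * A / (A + B) + x * B**2 / (A + B)**2

    def risk_delta_1_upper(x):                        # Corollary 2 upper bound on the risk of delta_1
        return p * sigma**2 - (p - 2) * sigma**2 * B / ((p - 2) * (A + B) + sigma**2 * x)

    print(f"threshold eta'eta > {r2:.3f}")
    for x in (0.5 * r2, r2, 2.0 * r2):
        print(x, risk_delta_1_upper(x) < risk_delta_c(x))   # True once x exceeds the root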


About this article


Cite this article

Zinonos, S., Strawderman, W.E. On combining unbiased and possibly biased correlated estimators. Jpn J Stat Data Sci (2023). https://doi.org/10.1007/s42081-023-00194-2

