
General unbiased estimating equations for variance components in linear mixed models

  • Original Paper
  • Published in: Japanese Journal of Statistics and Data Science

Abstract

This paper introduces a general framework for estimating variance components in linear mixed models via general unbiased estimating equations, which include widely used estimators such as the restricted maximum likelihood (REML) estimator. We derive the asymptotic covariance matrices and second-order biases under general estimating equations without assuming normality of the underlying distributions, and we identify a class of second-order unbiased estimators of variance components. It is also shown that the asymptotic covariance matrices and second-order biases do not depend on whether the regression coefficients are estimated by the generalized or ordinary least squares method. We carry out numerical studies to check the performance of the proposed methods in typical linear mixed models.

References

  • Battese, G. E., Harter, R. M., & Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83, 28–36.

  • Borenstein, M., Hedges, L. V., & Higgins, J. P. T. (2009). Introduction to meta-analysis. Wiley.

  • Datta, G. S., & Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, 10, 613–627.

  • Doornik, J. A. (2007). Object-oriented matrix programming using Ox (3rd ed.). Timberlake Consultants Press and Oxford.

  • Fay, R. E., & Herriot, R. (1979). Estimates of income for small places: An application of James–Stein procedures to census data. Journal of the American Statistical Association, 74, 269–277.

  • Prasad, N. G. N., & Rao, J. N. K. (1990). The estimation of the mean squared error of small area estimators. Journal of the American Statistical Association, 85, 163–171.

  • Rao, C. R., & Kleffe, J. (1988). Estimation of variance components and applications. North-Holland.

  • Rao, J. N. K., & Molina, I. (2015). Small area estimation (2nd ed.). Wiley.

  • Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance components. Wiley.

  • Verbeke, G., & Molenberghs, G. (2006). Linear mixed models for longitudinal data. Springer.

Acknowledgements

We would like to thank the Associate Editor and the two reviewers for many valuable comments and helpful suggestions, which led to an improved version of this paper. This research was supported in part by Grant-in-Aid for Scientific Research (18K11188) from the Japan Society for the Promotion of Science.

Author information

Corresponding author

Correspondence to T. Kubokawa.

Appendices

Appendix A: Proofs

1.1 A.1 A preliminary lemma

For the proof, we use the following lemma:

Lemma A.1

Let \({\varvec{u}}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}\). Then, for symmetric matrices \({\varvec{C}}\) and \({\varvec{D}},\) it holds that

$$\begin{aligned} E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]=2\mathrm{tr}\,({\varvec{C}}{\varvec{{\Sigma }}}{\varvec{D}}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{C}}{\varvec{{\Sigma }}})\mathrm{tr}\,({\varvec{D}}{\varvec{{\Sigma }}}) + K_e h_e({\varvec{C}},{\varvec{D}}) +K_v h_v({\varvec{C}}, {\varvec{D}}), \end{aligned}$$
(8)

where \(h_e({\varvec{C}},{\varvec{D}})\) and \(h_v({\varvec{C}}, {\varvec{D}})\) are given in Theorem 2.1.

Proof

It is demonstrated that \(E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]=E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}] +E[{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}}{\varvec{v}}{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{v}}]+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+4\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\). Let \({\varvec{x}}=(x_1, \ldots , x_N)^\top ={\varvec{R}}_e^{-1/2}{\varvec{\varepsilon }}\), \({\widetilde{{\varvec{C}}}}={\varvec{R}}_e^{1/2}{\varvec{C}}{\varvec{R}}_e^{1/2}\) and \({\widetilde{{\varvec{D}}}}={\varvec{R}}_e^{1/2}{\varvec{D}}{\varvec{R}}_e^{1/2}\). Then, \(E[{\varvec{x}}]=\mathbf{0}\), \(E[{\varvec{x}}{\varvec{x}}^\top ]={\varvec{I}}_N\), \(E[x_a^4]=K_e+3\), \(a=1, \ldots , N\), and \(E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}]=E[{\varvec{x}}^\top {\widetilde{{\varvec{C}}}}{\varvec{x}}{\varvec{x}}^\top {\widetilde{{\varvec{D}}}}{\varvec{x}}]\). Let \({\delta }_{a=b=c=d}=1\) for \(a=b=c=d\), and otherwise, \({\delta }_{a=b=c=d}=0\). The notation \({\delta }_{a=b\not = c=d}\) is defined similarly. It is observed that for \(a, b, c, d =1, \ldots , N\),

$$\begin{aligned}&E[x_a({\widetilde{{\varvec{C}}}})_{ab}x_b x_c({\widetilde{{\varvec{D}}}})_{cd}x_d]\\&\quad =E[ x_a^4 ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+x_a^2x_c^2 ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b\not = c=d} + 2 x_a^2x_b^2({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c\not = b=d}]\\&\quad = (K_e+3) ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b\not = c=d} + 2 ({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c\not = b=d}\\&\quad = K_e({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b}{\delta }_{c=d} + 2 ({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c}{\delta }_{b=d}, \end{aligned}$$

which implies that

$$\begin{aligned} \sum _{a, b, c, d}E[x_a({\widetilde{{\varvec{C}}}})_{ab}x_b x_c({\widetilde{{\varvec{D}}}})_{cd}x_d]= & {} K_e \sum _{a=1}^N ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}+ \sum _{a=1}^N({\widetilde{{\varvec{C}}}})_{aa}\sum _{c=1}^N({\widetilde{{\varvec{D}}}})_{cc}\\&+ 2 \sum _{a=1}^N\sum _{b=1}^N({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}, \end{aligned}$$

or

$$\begin{aligned} E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}]=2\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{R}}_e)+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e) +K_e h_e({\varvec{C}}, {\varvec{D}}). \end{aligned}$$

Similarly,

$$\begin{aligned}&E[{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}}{\varvec{v}}{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{v}}]\\&\quad =2\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top ) + K_v h_v({\varvec{C}},{\varvec{D}}). \end{aligned}$$

Thus, we have

$$\begin{aligned} E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]&=2\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{R}}_e)+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e) \mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)+2\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top ) +\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+4\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +K_e h_e({\varvec{C}}, {\varvec{D}})+ K_v h_v({\varvec{C}},{\varvec{D}}), \end{aligned}$$

which can be rewritten as the expression in (8) for \({\varvec{{\Sigma }}}={\varvec{R}}_e+{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top \). \(\square \)
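The moment identity of Lemma A.1 can be checked exactly for a distribution with known excess kurtosis. The following Python sketch (our own illustration, not part of the paper) takes Rademacher components for both \({\varvec{\varepsilon }}\) and \({\varvec{v}}\) with \({\varvec{R}}_e={\varvec{I}}_N\) and \({\varvec{R}}_v={\varvec{I}}_q\), so that \(K_e=K_v=1-3=-2\), and uses the forms \(h_e({\varvec{C}},{\varvec{D}})=\sum _a ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}\) and \(h_v({\varvec{C}},{\varvec{D}})=\sum _a ({\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}})_{aa}({\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}})_{aa}\) read off from the proof. The expectation is computed by exact enumeration, so the two sides agree to machine precision.

```python
import itertools
import numpy as np

# Tiny instance: N = 3 observations, q = 2 random effects.
N, q = 3, 2
rng = np.random.default_rng(0)
Z = rng.standard_normal((N, q))
A = rng.standard_normal((N, N)); C = (A + A.T) / 2   # symmetric C
A = rng.standard_normal((N, N)); D = (A + A.T) / 2   # symmetric D

# Rademacher components: mean 0, variance 1, fourth moment 1,
# hence excess kurtoses K_e = K_v = -2; here R_e = I_N, R_v = I_q.
Sigma = np.eye(N) + Z @ Z.T
Ke = Kv = -2.0

# h_e and h_v in the form appearing in the proof (R_e = R_v = I).
h_e = float(np.sum(np.diag(C) * np.diag(D)))
ZCZ, ZDZ = Z.T @ C @ Z, Z.T @ D @ Z
h_v = float(np.sum(np.diag(ZCZ) * np.diag(ZDZ)))

# Exact E[u'Cu u'Du] by enumerating all 2^(N+q) sign patterns of (eps, v).
lhs = 0.0
for s in itertools.product([-1.0, 1.0], repeat=N + q):
    u = np.array(s[:N]) + Z @ np.array(s[N:])
    lhs += (u @ C @ u) * (u @ D @ u)
lhs /= 2 ** (N + q)

rhs = (2 * np.trace(C @ Sigma @ D @ Sigma)
       + np.trace(C @ Sigma) * np.trace(D @ Sigma)
       + Ke * h_e + Kv * h_v)
print(abs(lhs - rhs))  # agrees to machine precision
```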

1.2 A.2 Proof of Theorem 2.1

For \(a=1, \ldots , k\), let \(\ell _a={\varvec{y}}^\top {\varvec{C}}_a{\varvec{y}}- \mathrm{tr}\,({\varvec{D}}_a)\) for \({\varvec{C}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}\) and \({\varvec{D}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}{\varvec{{\Sigma }}}\). For \({\varvec{u}}={\varvec{y}}-{\varvec{X}}{\varvec{\beta }}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}\), \(\ell _a\) is rewritten as \(\ell _a={\varvec{u}}^\top {\varvec{C}}_a{\varvec{u}}-\mathrm{tr}\,({\varvec{D}}_a)\). By the Taylor series expansion,

$$\begin{aligned} 0= & {} {\mathbf{col}}_a(\ell _a) + {\mathbf{mat}}_{ab}(\ell _{a(b)})(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})\\&+{1\over 2}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \ell _{a(bc)} (\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} +O_p(N^{-1/2}), \end{aligned}$$

where \({\mathbf{mat}}_{ab}(x_{ab})\) is a \(k\times k\) matrix whose \((a,b)\)-th element is \(x_{ab}\). Then,

$$\begin{aligned} \widehat{{\varvec{\psi }}}-{\varvec{\psi }}= & {} - \{{\mathbf{mat}}_{ab}(\ell _{a(b)}) \}^{-1} \left[ {\mathbf{col}}_a(\ell _a) +{1\over 2}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \ell _{a(bc)}(\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} \right] \\&+O_p(N^{-3/2}). \end{aligned}$$

Since \(\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_a)=\mathrm{tr}\,({\varvec{D}}_a)\), we have \(\ell _a=\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\). In addition, \(\ell _{a(b)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})+\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\) and \(\ell _{a(bc)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})+\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\). Let \({\varvec{A}}_1={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})\}\) and \({\varvec{A}}_0={\mathbf{mat}}_{ab}[\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}]\). It is noted that \({\varvec{A}}_1=O(N)\), \({\varvec{A}}_0=O_p(N^{1/2})\), \(\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})=O(N)\) and \(\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}=O_p(N^{1/2})\). Then it can be seen that

$$\begin{aligned} \{{\mathbf{mat}}_{ab}(\ell _{a(b)}) \}^{-1} = ({\varvec{A}}_1+{\varvec{A}}_0)^{-1}={\varvec{A}}_1^{-1}-{\varvec{A}}_1^{-1}{\varvec{A}}_0{\varvec{A}}_1^{-1}+O_p(N^{-2}), \end{aligned}$$

so that

$$\begin{aligned} \widehat{{\varvec{\psi }}}-{\varvec{\psi }}=&-{\varvec{A}}_1^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] + {\varvec{A}}_1^{-1}{\varvec{A}}_0{\varvec{A}}_1^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] \nonumber \\&-{1\over 2}{\varvec{A}}_1^{-1}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})(\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} +O_p(N^{-3/2}). \end{aligned}$$

It is noted that \(({\varvec{C}}_a)_{ij}=({\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}})_{ij}=({\varvec{W}}_a)_{ij}+O(N^{-1})\), \(({\varvec{C}}_{a(b)})_{ij}=({\varvec{W}}_{a(b)})_{ij}+O(N^{-1})\) and \(({\varvec{C}}_{a(bc)})_{ij}=({\varvec{W}}_{a(bc)})_{ij}+O(N^{-1})\). Then, \(\mathrm{tr}\,({\varvec{C}}_a{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}})+O(1)\), \(\mathrm{tr}\,({\varvec{C}}_{a(b)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+O(1)\) and \(\mathrm{tr}\,({\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+O(1)\). Since \({\varvec{D}}_a={\varvec{C}}_a{\varvec{{\Sigma }}}\), \({\varvec{D}}_{a(b)}={\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(b)}\) and \({\varvec{D}}_{a(bc)}={\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}}+{\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}_{(c)}+{\varvec{C}}_{a(c)}{\varvec{{\Sigma }}}_{(b)}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(bc)}\), it is seen that \(\mathrm{tr}\,({\varvec{D}}_{a(b)})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)}) +O(1)\) and \(\mathrm{tr}\,({\varvec{D}}_{a(bc)})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_{a(c)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})+O(1)\). Thus,

$$\begin{aligned} \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})= & {} - \mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})+O(1),\nonumber \\ \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})= & {} -\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})-\mathrm{tr}\,({\varvec{W}}_{a(c)}{\varvec{{\Sigma }}}_{(b)})-\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)}) +O(1). \end{aligned}$$
(9)

Letting \({\varvec{A}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})\}\), we have \({\varvec{A}}_1=-{\varvec{A}}+O(1)\). Using Lemma A.1, we can approximate the covariance matrix of \(\widehat{{\varvec{\psi }}}\) as

$$\begin{aligned} E[(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})^\top ]&= {\varvec{A}}_1^{-1} {\mathbf{mat}}_{ab}( E[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\mathrm{tr}\,\{{\varvec{C}}_b({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] ) {\varvec{A}}_1^{-1} + O(N^{-3/2})\\&= 2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}+{\varvec{A}}^{-1}{\widetilde{{\varvec{B}}}}{\varvec{A}}^{-1} + O(N^{-3/2}), \end{aligned}$$

for \({\varvec{B}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})\}\) and \({\widetilde{{\varvec{B}}}}={\mathbf{mat}}_{ab}\{K_eh_e({\varvec{W}}_a,{\varvec{W}}_b)+K_vh_v({\varvec{W}}_a,{\varvec{W}}_b)\}\).

The bias of \(\widehat{{\varvec{\psi }}}\) is

$$\begin{aligned} E(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})&= -{1\over 2}{\varvec{A}}^{-1}{\mathbf{col}}_a\left[ \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})\} ({\varvec{A}}^{-1}(2 {\varvec{B}}+{\widetilde{{\varvec{B}}}}){\varvec{A}}^{-1})_{bc}\right] \\&\quad + E({\varvec{A}}^{-1}{\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] ) +O(N^{-3/2}). \end{aligned}$$

Concerning the second term on the right-hand side, the \(a\)-th element of \(E\{ ({\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_c[\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}])\}\) is

$$\begin{aligned}&E\{({\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_c[\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}])_a\}\\&\quad =\sum _{b=1}^k\sum _{c=1}^k E[\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}({\varvec{A}})^{bc}\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}]\\&\quad =\sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}}) +K_eh_e({\varvec{W}}_{a(b)},{\varvec{W}}_c)+K_vh_v({\varvec{W}}_{a(b)},{\varvec{W}}_c)\}({\varvec{A}})^{bc} + O(N^{-1}). \end{aligned}$$

Then,

$$\begin{aligned}&E(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})\nonumber \\&\quad ={\varvec{A}}^{-1}{\mathbf{col}}_a\left( \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}}) +K_eh_e({\varvec{W}}_{a(b)},{\varvec{W}}_c)+K_vh_v({\varvec{W}}_{a(b)},{\varvec{W}}_c)\}({\varvec{A}})^{bc}\right) \nonumber \\&\quad -{1\over 2}{\varvec{A}}^{-1}{\mathbf{col}}_a\left[ \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})\} ({\varvec{A}}^{-1} (2 {\varvec{B}}+{\widetilde{{\varvec{B}}}}){\varvec{A}}^{-1})_{bc}\right] +O(N^{-3/2}), \end{aligned}$$

which provides the expression in (3) in Theorem 2.1.

1.3 A.3 Proof of Proposition 3.1

Case of \({\varvec{W}}_a={\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}\). We have \({\varvec{W}}_{a(b)}=-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}+{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}\), which yields that \(\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}\) and \(({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}\). Thus, \({\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}={\varvec{A}}^{-1}\) and the covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2{\varvec{A}}^{-1}+O(N^{-3/2})\). Moreover, note that

$$\begin{aligned} ({\varvec{K}}_a)_{bc}=&\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}})= -2\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)}),\\ ({\varvec{H}}_a)_{bc}=&\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})=-2\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)}), \end{aligned}$$

which shows that \({\varvec{W}}_a^\mathrm{REML}\) satisfies (4).

Case of \({\varvec{W}}_a=({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}+{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1})/2\). From (2), it follows that \(({\varvec{A}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})\) and \(({\varvec{B}})_{ab}=\{\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)})\}/2\). The asymptotic covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}\), and the bias is derived from (3).

Case of \({\varvec{W}}_a={\varvec{{\Sigma }}}_{(a)}\). Straightforward calculation shows that \(({\varvec{A}})_{ab} = \mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})\) and \(({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}})\). The asymptotic covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2 {\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}+O(N^{-3/2})\). Moreover, since \({\varvec{W}}_{a(b)}={\varvec{0}}\), the condition (4) holds.

Appendix B: Summary of estimation methods in specific models

Here, we provide the specific forms of the REML-type and FH-type estimators, their OLS-based variants, the PR-type estimators, and the Prasad–Rao estimators in the Fay–Herriot model and the nested error regression model.

1.1 B.1 Fay–Herriot model

The marginal distribution of \({\varvec{y}}=(y_1, \ldots , y_m)^\top \) in the Fay–Herriot model has \(E[{\varvec{y}}]={\varvec{X}}{\varvec{\beta }}\) and \(\mathbf{Cov}\,({\varvec{y}})={\varvec{{\Sigma }}}=\psi _1{\varvec{I}}_m + {\varvec{D}}\), where \(p\) is the dimension of \({\varvec{\beta }}\) and \({\varvec{D}}=\mathrm{diag}\,(D_1, \ldots , D_m)\) consists of known sampling variances.

REML \(\widehat{\psi }_1^\mathrm{RE}\) corresponds to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})\) for \({\varvec{P}}={\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{X}}({\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}\).

OLS-based REML \(\widehat{\psi }_1^\mathrm{ORM}\) corresponds to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}})\) for \({\widetilde{{\varvec{P}}}}={\varvec{I}}-{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top \).

Fay–Herriot estimator \(\widehat{\psi }_1^\mathrm{FH}\) corresponds to \({\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=m-p\).

OLS-based FH estimator \(\widehat{\psi }_1^\mathrm{OFH}\) corresponds to \({\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=m-2p+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}}\}\).

Prasad–Rao estimator \(\widehat{\psi }_1^\mathrm{PR}\) corresponds to \({\varvec{W}}_1^\mathrm{Q}={\varvec{I}}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and it is given by \(\widehat{\psi }_1^\mathrm{PR}=[{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}- \mathrm{tr}\,({\varvec{D}})+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{D}}{\varvec{X}}\}]/(m-p)\).
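As a check on the Prasad–Rao form, note that \(E[{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}]=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}})=\psi _1(m-p)+\mathrm{tr}\,({\varvec{D}})-\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{D}}{\varvec{X}}\}\), so substituting the exact expectation into the formula returns \(\psi _1\); this is the unbiasedness of \(\widehat{\psi }_1^\mathrm{PR}\). The following Python sketch (our own illustration with an arbitrary simulated design) verifies this identity numerically.

```python
import numpy as np

# Hypothetical design; psi1 is the true random-effect variance.
rng = np.random.default_rng(1)
m, p = 20, 3
X = rng.standard_normal((m, p))
Dv = rng.uniform(0.5, 2.0, size=m)      # known sampling variances D_i
D = np.diag(Dv)
psi1 = 1.7
Sigma = psi1 * np.eye(m) + D

# OLS residual projector P-tilde = I - X (X'X)^{-1} X'.
Pt = np.eye(m) - X @ np.linalg.solve(X.T @ X, X.T)

# Substitute the exact expectation E[y' Pt y] = tr(Pt Sigma) into the
# Prasad-Rao formula; unbiasedness means the result equals psi1.
Ey_quad = np.trace(Pt @ Sigma)
psi1_hat = (Ey_quad - np.trace(D)
            + np.trace(np.linalg.solve(X.T @ X, X.T @ D @ X))) / (m - p)
print(psi1_hat)  # recovers psi1 = 1.7 up to rounding
```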

The asymptotic variances and second-order biases can be obtained from Proposition 3.2 as follows. The REML estimator \(\widehat{\psi }_1^{\mathrm{RE}}\) and the OLS-based REML estimator \(\widehat{\psi }_1^\mathrm{ORM}\) have the same asymptotic variance and second-order bias:

$$\begin{aligned} \mathrm{Var}(\widehat{\psi }_1^\mathrm{RE})&\approx {2 \over \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})} + {K_e\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}) \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^2},\\ \mathrm{Bias}(\widehat{\psi }_1^\mathrm{RE})&\approx -2{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-5}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-5})\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^2}\\&\quad + 2{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-3})\{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4})\}\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^3}. \end{aligned}$$

The Fay–Herriot estimator \(\widehat{\psi }_1^\mathrm{FH}\) and the OLS-based FH estimator \(\widehat{\psi }_1^\mathrm{OFH}\) have the same asymptotic variance and second-order bias:

$$\begin{aligned} \mathrm{Var}(\widehat{\psi }_1^\mathrm{FH})&\approx {2m \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2 } + {K_e\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}) \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2},\\ \mathrm{Bias}(\widehat{\psi }_1^\mathrm{FH})&\approx 2{m\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})-\{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2 \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^3}\\&\quad -{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-3}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-3})\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2} + {\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^3}, \end{aligned}$$

which implies that \(\widehat{\psi }_1^\mathrm{UFH}=\widehat{\psi }_1^\mathrm{FH}-2[m\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-2})-\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^2]/\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^3\) is unbiased up to second order under normality, where \(\widehat{{\varvec{{\Sigma }}}}=\widehat{\psi }_1^\mathrm{FH}{\varvec{I}}_m+{\varvec{D}}\).
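The correction term in \(\widehat{\psi }_1^\mathrm{UFH}\) involves only two traces of powers of \(\widehat{{\varvec{{\Sigma }}}}^{-1}\) and is nonnegative by the Cauchy–Schwarz inequality \(m\,\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-2})\ge \{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^2\), so the corrected estimator never exceeds \(\widehat{\psi }_1^\mathrm{FH}\). A short sketch with hypothetical values of \(m\), \(D_i\), and the FH estimate:

```python
import numpy as np

# Hypothetical inputs: m areas, known variances D_i, and an FH estimate.
m = 25
rng = np.random.default_rng(3)
Dv = rng.uniform(0.5, 2.0, size=m)
psi1_fh = 1.2

s = psi1_fh + Dv              # diagonal entries of Sigma-hat
t1 = np.sum(1.0 / s)          # tr(Sigma-hat^{-1})
t2 = np.sum(1.0 / s**2)       # tr(Sigma-hat^{-2})
bias = 2.0 * (m * t2 - t1**2) / t1**3   # >= 0 by Cauchy-Schwarz
psi1_ufh = psi1_fh - bias     # second-order bias-corrected estimate
print(psi1_ufh)
```

The correction is \(O(m^{-1})\), so it matters mainly when the number of areas is small.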

Prasad–Rao estimator \(\widehat{\psi }_1^\mathrm{PR}\) is second-order unbiased and has the asymptotic variance \(\mathrm{Var}(\widehat{\psi }_1^\mathrm{PR})\approx \{2\mathrm{tr}\,({\varvec{{\Sigma }}}^2)+K_e\mathrm{tr}\,({\varvec{D}}^2)+m\psi _1^2K_v\}/m^2\).

1.2 B.2 Nested error regression model

The NER model is written as \({\varvec{y}}_i={\varvec{X}}_i{\varvec{\beta }}+{\varvec{j}}_{n_i}v_i + {\varvec{\varepsilon }}_i\) for \(i=1, \ldots , m\), where \({\varvec{y}}_i\), \({\varvec{\beta }}\) and \({\varvec{\varepsilon }}_i\) are \(n_i\)-, \(p\)- and \(n_i\)-dimensional vectors, respectively, \({\varvec{X}}_i\) is an \(n_i\times p\) matrix, \(v_i\) is a scalar and \({\varvec{j}}_{n_i}=(1, \ldots , 1)^\top \in {\mathbb {R}}^{n_i}\). Here, \(v_i\) and \({\varvec{\varepsilon }}_i\) are independent random variables such that \(E[v_i]=0\), \(\mathrm{Var}(v_i)=\psi _1\), \(E[{\varvec{\varepsilon }}_i]=\mathbf{0}\) and \(\mathbf{Cov}\,({\varvec{\varepsilon }}_i)=\psi _2{\varvec{I}}_{n_i}\). Let \({\varvec{y}}=({\varvec{y}}_1^\top , \ldots , {\varvec{y}}_m^\top )^\top \), \({\varvec{X}}=({\varvec{X}}_1^\top , \ldots , {\varvec{X}}_m^\top )^\top \), \(N=\sum _{i=1}^m n_i\) and \({\varvec{G}}=\text {block diag}({\varvec{J}}_{n_1}, \ldots , {\varvec{J}}_{n_m})\) for \({\varvec{J}}_{n_i}={\varvec{j}}_{n_i}{\varvec{j}}_{n_i}^\top \). In addition, let \({\varvec{{\Sigma }}}=\text {block diag}({\varvec{{\Sigma }}}_1, \ldots , {\varvec{{\Sigma }}}_m)\) for \({\varvec{{\Sigma }}}_i=\psi _1{\varvec{J}}_{n_i}+\psi _2{\varvec{I}}_{n_i}\). Then, \({\varvec{{\Sigma }}}=\psi _1{\varvec{G}}+\psi _2{\varvec{I}}_N\), \({\varvec{{\Sigma }}}_{(1)}={\varvec{G}}\) and \({\varvec{{\Sigma }}}_{(2)}={\varvec{I}}_N\).

REML \(\widehat{\psi }_1^\mathrm{RE}\) and \(\widehat{\psi }_2^\mathrm{RE}\) correspond to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}\), \({\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\), and the estimating equations are \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}}{\varvec{G}})\) and \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})\).

OLS-based REML \(\widehat{\psi }_1^\mathrm{ORM}\) and \(\widehat{\psi }_2^\mathrm{ORM}\) correspond to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}\), \({\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimating equations are \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1})\) and \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2})\).

FH-type estimators \(\widehat{\psi }_1^\mathrm{FH}\) and \(\widehat{\psi }_2^\mathrm{FH}\) correspond to \({\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2\), \({\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\), and the estimating equations are

$$\begin{aligned}&\sum _{i=1}^m{n_i^2({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2\over n_i\psi _1+\psi _2}= N - \sum _{i=1}^m {n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\over n_i\psi _1+\psi _2},\\&\quad \psi _2={1\over N-p}\sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2-{1\over N-p}\sum _{i=1}^m{n_i^2\psi _1\over n_i\psi _1+\psi _2}({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2. \end{aligned}$$

OLS-based FH estimators \(\widehat{\psi }_1^\mathrm{OFH}\) and \(\widehat{\psi }_2^\mathrm{OFH}\) correspond to \({\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2\), \({\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimating equations are

$$\begin{aligned}&\sum _{i=1}^m{n_i^2({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2\over n_i\psi _1+\psi _2}= N - 2\sum _{i=1}^m n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\\&\quad +\sum _{i=1}^m {n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\over n_i\psi _1+\psi _2},\\&\sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2-\sum _{i=1}^m{n_i^2\psi _1\over n_i\psi _1+\psi _2}({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2 =\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-1}). \end{aligned}$$

The PR-type estimators \(\widehat{\psi }_1^\mathrm{Q}\) and \(\widehat{\psi }_2^\mathrm{Q}\) correspond to \({\varvec{W}}_1^\mathrm{Q}={\varvec{G}}\), \({\varvec{W}}_2^\mathrm{Q}={\varvec{I}}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimators are \(\widehat{\psi }_1^\mathrm{Q} = \{\sum _{i=1}^m n_i^2 ({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2-\widehat{\psi }_2^\mathrm{Q}\,\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})\}/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}\) and

$$\begin{aligned} \widehat{\psi }_2^\mathrm{Q} ={ \sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2- [\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}]\sum _{i=1}^m n_i^2 ({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2 \over N-p-\{\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})\}^2/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}}. \end{aligned}$$
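Equivalently, the PR-type estimators solve the pair of unbiased moment equations \(E[\widehat{{\varvec{e}}}^\top {\varvec{G}}\widehat{{\varvec{e}}}]\) and \(E[\widehat{{\varvec{e}}}^\top \widehat{{\varvec{e}}}]\) for the OLS residual vector \(\widehat{{\varvec{e}}}={\widetilde{{\varvec{P}}}}{\varvec{y}}\), which is a \(2\times 2\) linear system. A minimal sketch along that route (all names are ours):

```python
import numpy as np

def pr_type(y, X, groups):
    """PR-type moment estimators of (psi1, psi2): solve the 2x2 linear
    system implied by E[e'Ge] and E[e'e] for OLS residuals e = P~ y."""
    N, p = X.shape
    P = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)  # P~, idempotent
    # G = block diag(J_{n_i}) for the nested error regression model.
    G = np.zeros((N, N))
    for i in np.unique(groups):
        idx = np.where(groups == i)[0]
        G[np.ix_(idx, idx)] = 1.0
    e = P @ y
    PG = P @ G
    # E[e'Ge] = psi1*tr{(PG)^2} + psi2*tr(PG); E[e'e] = psi1*tr(PG) + psi2*(N-p).
    A = np.array([[np.trace(PG @ PG), np.trace(PG)],
                  [np.trace(PG),      N - p]])
    b = np.array([e @ G @ e, e @ e])
    psi1, psi2 = np.linalg.solve(A, b)
    return psi1, psi2
```

Since the coefficient matrix is nonrandom and the right-hand side has expectation \({\varvec{A}}{\varvec{\psi }}\), the resulting estimators are exactly unbiased.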

The Prasad–Rao estimators are \(\widehat{\psi }_1^\mathrm{PR}=\{{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}-(N-p)\widehat{\psi }_2\}/\{N-\sum _{i=1}^mn_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\}\) and \(\widehat{\psi }_2^\mathrm{PR}=\{ {\varvec{y}}^\top \{{\varvec{E}}-{\varvec{E}}{\varvec{X}}({\varvec{X}}^\top {\varvec{E}}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{E}}\}{\varvec{y}}\}/(N-k-p)\), where \({\varvec{E}}=\text {block diag}({\varvec{I}}_{n_1}-n_1^{-1}{\varvec{J}}_{n_1}, \ldots , {\varvec{I}}_{n_m}-n_m^{-1}{\varvec{J}}_{n_m})\).
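A direct implementation is sketched below. One practical caveat: \({\varvec{X}}^\top {\varvec{E}}{\varvec{X}}\) is singular whenever \({\varvec{X}}\) contains columns constant within areas (e.g. an intercept), so this sketch uses a least-squares solve and a rank-based residual degrees of freedom in place of the divisor \(N-k-p\); that substitution, and all names, are our own choices.

```python
import numpy as np

def prasad_rao(y, X, groups):
    """Prasad–Rao moment estimators of (psi1, psi2) in the nested error
    regression model (dense-matrix sketch)."""
    N, p = X.shape
    labels = np.unique(groups)
    XtX = X.T @ X
    # E = block diag(I_{n_i} - J_{n_i}/n_i) removes area means (and v_i).
    E = np.eye(N)
    for i in labels:
        idx = np.where(groups == i)[0]
        E[np.ix_(idx, idx)] -= 1.0 / len(idx)
    # psi2 from the within-area regression of Ey on EX; X'EX may be
    # singular, so use lstsq and a rank-based degrees of freedom.
    EX, Ey = E @ X, E @ y
    coef = np.linalg.lstsq(EX, Ey, rcond=None)[0]
    r = Ey - EX @ coef
    dof = N - len(labels) - np.linalg.matrix_rank(EX)
    psi2 = (r @ r) / dof
    # psi1 from E[y' P~ y] = psi1 * tr(P~ G) + psi2 * (N - p).
    P = np.eye(N) - X @ np.linalg.solve(XtX, X.T)
    trPG = N  # tr(P~ G) = N - sum_i n_i^2 xbar_i'(X'X)^{-1} xbar_i
    for i in labels:
        idx = np.where(groups == i)[0]
        xbar = X[idx].mean(axis=0)
        trPG -= len(idx) ** 2 * (xbar @ np.linalg.solve(XtX, xbar))
    psi1 = (y @ P @ y - (N - p) * psi2) / trPG
    return psi1, psi2
```

Because \({\varvec{E}}\) annihilates the area effects, the within-area residual sum of squares depends only on \(\psi _2\), which is why \(\widehat{\psi }_2^\mathrm{PR}\) needs no knowledge of \(\psi _1\).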

Hereafter we assume that \(K_e=K_v=0\) for simplicity. Note that \({\varvec{{\Sigma }}}{\varvec{G}}={\varvec{G}}{\varvec{{\Sigma }}}\), \(\psi _1{\varvec{G}}={\varvec{{\Sigma }}}-\psi _2{\varvec{I}}_N\), \(\psi _2{\varvec{{\Sigma }}}^{-1}={\varvec{I}}_N-\psi _1\text {block diag}({\gamma }_1{\varvec{J}}_{n_1}, \ldots , {\gamma }_m{\varvec{J}}_{n_m})\), \(\psi _2^2{\varvec{{\Sigma }}}^{-2}={\varvec{I}}_N-\psi _1\text {block diag}((1+\psi _2{\gamma }_1){\gamma }_1{\varvec{J}}_{n_1}, \ldots , (1+\psi _2{\gamma }_m){\gamma }_m{\varvec{J}}_{n_m})\) for \({\gamma }_i=1/(\psi _2+n_i\psi _1)\). Then the asymptotic covariance matrices and second-order biases can be obtained from Proposition 3.3 as follows. The REML estimator \(\widehat{{\varvec{\psi }}}^\mathrm{RE}\) and the OLS-based REML estimator \(\widehat{{\varvec{\psi }}}^\mathrm{ORM}\) are second-order unbiased and have the same asymptotic covariance matrix

$$\begin{aligned} \mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{RE})\approx&2 \begin{pmatrix}\mathrm{tr}\,\{({\varvec{{\Sigma }}}^{-1}{\varvec{G}})^2\} &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\end{pmatrix}^{-1} =2 \begin{pmatrix}\sum _{i=1}^mn_i^2{\gamma }_i^2 &{} \sum _{i=1}^mn_i{\gamma }_i^2\\ \sum _{i=1}^mn_i{\gamma }_i^2 &{} (N-m)/\psi _2^2+\sum _{i=1}^m{\gamma }_i^2\end{pmatrix}^{-1}, \end{aligned}$$

which was given in Datta and Lahiri (2000).
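The equality between the trace form and the closed-form entries above can be checked numerically; here is a short sketch (the function name is ours):

```python
import numpy as np

def reml_acov(n, psi1, psi2):
    """Closed-form asymptotic covariance of the REML estimator of
    (psi1, psi2), with gamma_i = 1/(psi2 + n_i*psi1)."""
    n = np.asarray(n, dtype=float)
    g = 1.0 / (psi2 + n * psi1)
    N, m = n.sum(), len(n)
    # Information-type matrix in the closed-form (sum) parameterization.
    F = np.array([
        [np.sum(n**2 * g**2), np.sum(n * g**2)],
        [np.sum(n * g**2), (N - m) / psi2**2 + np.sum(g**2)],
    ])
    return 2.0 * np.linalg.inv(F)
```

The test below builds \({\varvec{{\Sigma }}}\) and \({\varvec{G}}\) as dense block-diagonal matrices and confirms that \(2\,\{\mathrm{tr}((\Sigma ^{-1}G)^2),\ \mathrm{tr}(\Sigma ^{-2}G);\ \mathrm{tr}(\Sigma ^{-2}G),\ \mathrm{tr}(\Sigma ^{-2})\}^{-1}\) agrees with the closed form.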

The Fay–Herriot estimator \(\widehat{{\varvec{\psi }}}^\mathrm{FH}\) and the OLS-based FH estimator \(\widehat{{\varvec{\psi }}}^\mathrm{OFH}\) have the same asymptotic covariance matrix \(\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{FH})\approx 2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1}\), where

$$\begin{aligned} {\varvec{A}}_\mathrm{FH}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2{\gamma }_i &{} \sum _{i=1}^mn_i{\gamma }_i\\ \sum _{i=1}^mn_i{\gamma }_i &{} (N-m)/\psi _2+\sum _{i=1}^m{\gamma }_i\end{pmatrix},\\ {\varvec{B}}_\mathrm{FH}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}{\varvec{G}}+{\varvec{G}}^2)/2 &{} N\\ N &{} N\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2 &{} N\\ N &{} N\end{pmatrix}, \end{aligned}$$

and the same second-order bias

$$\begin{aligned} \mathbf{Bias}(\widehat{{\varvec{\psi }}}^\mathrm{FH})\approx 2{\varvec{A}}_\mathrm{FH}^{-1} \begin{pmatrix} \mathrm{tr}\,({\varvec{K}}_1{\varvec{A}}_\mathrm{FH}^{-1})-\mathrm{tr}\,({\varvec{H}}_1{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1})\\ \mathrm{tr}\,({\varvec{K}}_2{\varvec{A}}_\mathrm{FH}^{-1})-\mathrm{tr}\,({\varvec{H}}_2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1})\end{pmatrix}, \end{aligned}$$

where

$$\begin{aligned} {\varvec{K}}_1=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^3)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^3{\gamma }_i &{} \sum _i n_i^2{\gamma }_i \\ \sum _i n_i^2{\gamma }_i &{} \sum _i n_i{\gamma }_i\end{pmatrix},\\ {\varvec{K}}_2=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^2{\gamma }_i &{} \sum _i n_i{\gamma }_i \\ \sum _i n_i{\gamma }_i &{} (N-m)/\psi _2+\sum _i {\gamma }_i\end{pmatrix},\\ {\varvec{H}}_1=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^3{\gamma }_i^2 &{} \sum _i n_i^2{\gamma }_i^2 \\ \sum _i n_i^2{\gamma }_i^2 &{} \sum _i n_i{\gamma }_i^2\end{pmatrix},\\ {\varvec{H}}_2=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^2{\gamma }_i^2 &{} \sum _i n_i{\gamma }_i^2 \\ \sum _i n_i{\gamma }_i^2 &{} (N-m)/\psi _2^2+\sum _i {\gamma }_i^2\end{pmatrix}. \end{aligned}$$
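All of these quantities are simple sums over the areas, so the covariance and bias are mechanical to evaluate. A sketch (the function name is ours; note that \({\varvec{K}}_2={-}{\varvec{A}}_\mathrm{FH}\) here):

```python
import numpy as np

def fh_acov_bias(n, psi1, psi2):
    """Closed-form asymptotic covariance and second-order bias of the
    FH-type estimator of (psi1, psi2); gamma_i = 1/(psi2 + n_i*psi1)."""
    n = np.asarray(n, dtype=float)
    g = 1.0 / (psi2 + n * psi1)
    N, m = n.sum(), len(n)
    A = np.array([[np.sum(n**2 * g), np.sum(n * g)],
                  [np.sum(n * g), (N - m) / psi2 + np.sum(g)]])
    B = np.array([[np.sum(n**2), N], [N, N]])
    K1 = -np.array([[np.sum(n**3 * g), np.sum(n**2 * g)],
                    [np.sum(n**2 * g), np.sum(n * g)]])
    K2 = -A  # K_2 coincides with -A_FH in this model
    H1 = -np.array([[np.sum(n**3 * g**2), np.sum(n**2 * g**2)],
                    [np.sum(n**2 * g**2), np.sum(n * g**2)]])
    H2 = -np.array([[np.sum(n**2 * g**2), np.sum(n * g**2)],
                    [np.sum(n * g**2), (N - m) / psi2**2 + np.sum(g**2)]])
    Ainv = np.linalg.inv(A)
    C = Ainv @ B @ Ainv
    cov = 2.0 * C
    bias = 2.0 * Ainv @ np.array([
        np.trace(K1 @ Ainv) - np.trace(H1 @ C),
        np.trace(K2 @ Ainv) - np.trace(H2 @ C),
    ])
    return cov, bias
```

The bias vector is \(O(m^{-1})\) relative to the parameters, which is what one would subtract off to obtain a second-order unbiased corrected estimator.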

The PR-type estimator \(\widehat{{\varvec{\psi }}}^\mathrm{Q}\) and the Prasad–Rao estimator \(\widehat{{\varvec{\psi }}}^\mathrm{PR}\) are second-order unbiased and have the same asymptotic covariance matrix \(\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{Q})\approx 2{\varvec{A}}_\mathrm{Q}^{-1}{\varvec{B}}_\mathrm{Q}{\varvec{A}}_\mathrm{Q}^{-1}\), where

$$\begin{aligned} {\varvec{A}}_\mathrm{Q}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{G}})\\ \mathrm{tr}\,({\varvec{G}}) &{} \mathrm{tr}\,({\varvec{I}}_N)\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2 &{} N \\ N &{} N \end{pmatrix},\\ {\varvec{B}}_\mathrm{Q}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^2)\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2/{\gamma }_i^2 &{} \sum _{i=1}^mn_i/{\gamma }_i^2\\ \sum _{i=1}^mn_i/{\gamma }_i^2 &{} (N-m)\psi _2^2+\sum _{i=1}^m1/{\gamma }_i^2\end{pmatrix}. \end{aligned}$$
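As with the REML case, the closed-form entries of \({\varvec{A}}_\mathrm{Q}\) and \({\varvec{B}}_\mathrm{Q}\) can be verified against the trace form with dense matrices; a sketch (the function name is ours):

```python
import numpy as np

def q_acov(n, psi1, psi2):
    """Closed-form asymptotic covariance 2*A_Q^{-1} B_Q A_Q^{-1} of the
    PR-type estimator; note 1/gamma_i = psi2 + n_i*psi1."""
    n = np.asarray(n, dtype=float)
    ginv = psi2 + n * psi1           # 1/gamma_i
    N, m = n.sum(), len(n)
    A = np.array([[np.sum(n**2), N], [N, N]])
    B = np.array([[np.sum(n**2 * ginv**2), np.sum(n * ginv**2)],
                  [np.sum(n * ginv**2), (N - m) * psi2**2 + np.sum(ginv**2)]])
    Ainv = np.linalg.inv(A)
    return 2.0 * Ainv @ B @ Ainv
```

Comparing with the FH case above, \({\varvec{A}}_\mathrm{Q}={\varvec{B}}_\mathrm{FH}\), reflecting the unweighted quadratic forms used by the PR-type equations.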


Kubokawa, T., Sugasawa, S., Tamae, H. et al. General unbiased estimating equations for variance components in linear mixed models. Jpn J Stat Data Sci 4, 841–859 (2021). https://doi.org/10.1007/s42081-021-00138-8
