General unbiased estimating equations for variance components in linear mixed models

Kubokawa, T.; Sugasawa, S.; Tamae, H.; Chaudhuri, S.

doi:10.1007/s42081-021-00138-8

Original Paper
Published: 08 September 2021

Japanese Journal of Statistics and Data Science, Volume 4, pages 841–859 (2021)

Abstract

This paper introduces a general framework for estimating variance components in linear mixed models via general unbiased estimating equations, which include widely used estimators such as the restricted maximum likelihood estimator. We derive the asymptotic covariance matrices and second-order biases under general estimating equations without assuming normality of the underlying distributions, and identify a class of second-order unbiased estimators of variance components. We also show that the asymptotic covariance matrices and second-order biases do not depend on whether the regression coefficients are estimated by generalized or ordinary least squares. Numerical studies based on typical linear mixed models examine the performance of the proposed methods.

References

Battese, G. E., Harter, R. M., & Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83, 28–36.
Borenstein, M., Hedges, L. V., & Higgins, J. P. T. (2009). Introduction to meta-analysis. Wiley.
Datta, G. S., & Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, 10, 613–627.
Doornik, J. A. (2007). Object-oriented matrix programming using Ox (3rd ed.). Timberlake Consultants Press and Oxford.
Fay, R. E., & Herriot, R. (1979). Estimates of income for small places: An application of James–Stein procedures to census data. Journal of the American Statistical Association, 74, 269–277.
Prasad, N. G. N., & Rao, J. N. K. (1990). The estimation of the mean squared error of small area estimators. Journal of the American Statistical Association, 85, 163–171.
Rao, C. R., & Kleffe, J. (1988). Estimation of variance components and applications. North-Holland.
Rao, J. N. K., & Molina, I. (2015). Small area estimation (2nd ed.). Wiley.
Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance components. Wiley.
Verbeke, G., & Molenberghs, G. (2006). Linear mixed models for longitudinal data. Springer.

Acknowledgements

We would like to thank the Associate Editor and the two reviewers for many valuable comments and helpful suggestions, which led to an improved version of this paper. This research was supported in part by Grant-in-Aid for Scientific Research (18K11188) from the Japan Society for the Promotion of Science.

Author information

Authors and Affiliations

Faculty of Economics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
T. Kubokawa
Center for Spatial Information Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8568, Japan
S. Sugasawa
Nospare Inc., 2-7-13, Kita-aoyama, Minato-ku, Tokyo, 107-0061, Japan
H. Tamae
Department of Statistics and Applied Probability, National University of Singapore, Block S16, Level 7, 6 Science Drive 2, Singapore, 117546, Singapore
S. Chaudhuri

Corresponding author

Correspondence to T. Kubokawa.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Proofs

A.1 A preliminary lemma

For the proof, we use the following lemma:

Lemma A.1

Let ${\varvec{u}}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}$. Then, for matrices ${\varvec{C}}$ and ${\varvec{D}},$ it holds that

$$\begin{aligned} E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]=2\mathrm{tr}\,({\varvec{C}}{\varvec{{\Sigma }}}{\varvec{D}}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{C}}{\varvec{{\Sigma }}})\mathrm{tr}\,({\varvec{D}}{\varvec{{\Sigma }}}) + K_e h_e({\varvec{C}},{\varvec{D}}) +K_v h_v({\varvec{C}}, {\varvec{D}}), \end{aligned}\tag{8}$$

where $h_e({\varvec{C}},{\varvec{D}})$ and $h_v({\varvec{C}}, {\varvec{D}})$ are given in Theorem 2.1.

Proof

It is demonstrated that $E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]=E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}] +E[{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}}{\varvec{v}}{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{v}}]+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+4\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )$. Let ${\varvec{x}}=(x_1, \ldots , x_N)^\top ={\varvec{R}}_e^{-1/2}{\varvec{\varepsilon }}$, ${\widetilde{{\varvec{C}}}}={\varvec{R}}_e^{1/2}{\varvec{C}}{\varvec{R}}_e^{1/2}$ and ${\widetilde{{\varvec{D}}}}={\varvec{R}}_e^{1/2}{\varvec{D}}{\varvec{R}}_e^{1/2}$. Then, $E[{\varvec{x}}]=\mathbf{0}$, $E[{\varvec{x}}{\varvec{x}}^\top ]={\varvec{I}}_N$, $E[x_a^4]=K_e+3$, $a=1, \ldots , N$, and $E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}]=E[{\varvec{x}}^\top {\widetilde{{\varvec{C}}}}{\varvec{x}}{\varvec{x}}^\top {\widetilde{{\varvec{D}}}}{\varvec{x}}]$. Let ${\delta }_{a=b=c=d}=1$ for $a=b=c=d$, and otherwise, ${\delta }_{a=b=c=d}=0$. The notation ${\delta }_{a=b\not = c=d}$ is defined similarly. It is observed that for $a, b, c, d =1, \ldots , N$,

$$\begin{aligned}&E[x_a({\widetilde{{\varvec{C}}}})_{ab}x_b x_c({\widetilde{{\varvec{D}}}})_{cd}x_d]\\&\quad =E[ x_a^4 ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+x_a^2x_c^2 ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b\not = c=d} + 2 x_a^2x_b^2({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c\not = b=d}]\\&\quad = (K_e+3) ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b\not = c=d} + 2 ({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c\not = b=d}\\&\quad = K_e({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}{\delta }_{a=b=c=d}+({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{cc}{\delta }_{a=b}{\delta }_{c=d} + 2 ({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}{\delta }_{a=c}{\delta }_{b=d}, \end{aligned}$$

which implies that

$$\begin{aligned} \sum _{a, b, c, d}E[x_a({\widetilde{{\varvec{C}}}})_{ab}x_b x_c({\widetilde{{\varvec{D}}}})_{cd}x_d]= & {} K_e \sum _{a=1}^N ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}+ \sum _{a=1}^N({\widetilde{{\varvec{C}}}})_{aa}\sum _{c=1}^N({\widetilde{{\varvec{D}}}})_{cc}\\&+ 2 \sum _{a=1}^N\sum _{b=1}^N({\widetilde{{\varvec{C}}}})_{ab}({\widetilde{{\varvec{D}}}})_{ab}, \end{aligned}$$

or

$$\begin{aligned} E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}]=2\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{R}}_e)+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e) +K_e h_e({\varvec{C}}, {\varvec{D}}). \end{aligned}$$

Similarly,

$$\begin{aligned}&E[{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}}{\varvec{v}}{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{v}}]\\&\quad =2\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top ) + K_v h_v({\varvec{C}},{\varvec{D}}). \end{aligned}$$

Thus, we have

$$\begin{aligned} E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]&=2\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{R}}_e)+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e) \mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)+2\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top ) +\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+4\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\\&\quad +K_e h_e({\varvec{C}}, {\varvec{D}})+ K_v h_v({\varvec{C}},{\varvec{D}}), \end{aligned}$$

which can be rewritten as the expression in (8) for ${\varvec{{\Sigma }}}={\varvec{R}}_e+{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top $. $\square $
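The fourth-moment step for ${\varvec{\varepsilon }}$ can be checked exactly in a toy case. The sketch below is illustrative only and rests on assumptions that are not taken from the paper: ${\varvec{R}}_e={\varvec{I}}_N$ (so ${\widetilde{{\varvec{C}}}}={\varvec{C}}$), arbitrary symmetric integer matrices for ${\varvec{C}}$ and ${\varvec{D}}$, and independent Rademacher ($\pm 1$) components, for which $E[x_a^4]=1$ and hence $K_e=-2$. The quantity $h_e({\varvec{C}},{\varvec{D}})$ is taken to be $\sum _a ({\varvec{C}})_{aa}({\varvec{D}})_{aa}$, as the summation in the proof suggests for this special case. Because there are only $2^N$ equally likely outcomes, the expectation is computed by enumeration rather than simulation.

```python
from fractions import Fraction
from itertools import product

def quad(M, x):
    """Quadratic form x^T M x for a square matrix M given as nested lists."""
    n = len(x)
    return sum(x[a] * M[a][b] * x[b] for a in range(n) for b in range(n))

def trace_prod(M1, M2):
    """tr(M1 M2) without forming the product."""
    n = len(M1)
    return sum(M1[a][b] * M2[b][a] for a in range(n) for b in range(n))

def trace(M):
    return sum(M[a][a] for a in range(len(M)))

# Arbitrary symmetric test matrices (illustrative values only).
C = [[2, 1, 0], [1, -1, 3], [0, 3, 4]]
D = [[1, 2, -1], [2, 0, 5], [-1, 5, 3]]
N = 3

# Exact expectation over all 2^N equally likely sign vectors.
outcomes = list(product((-1, 1), repeat=N))
lhs = Fraction(sum(quad(C, x) * quad(D, x) for x in outcomes), len(outcomes))

K_e = -2  # E[x^4] - 3 for a Rademacher variable
h_e = sum(C[a][a] * D[a][a] for a in range(N))
rhs = 2 * trace_prod(C, D) + trace(C) * trace(D) + K_e * h_e

print(lhs, rhs)  # the two sides agree
```

Both sides evaluate to 88 for these matrices. The check relies on ${\varvec{C}}$ and ${\varvec{D}}$ being symmetric, which is also what the factor 2 in the third term of the elementwise expansion uses.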

A.2 Proof of Theorem 2.1

For $a=1, \ldots , k$, let $\ell _a={\varvec{y}}^\top {\varvec{C}}_a{\varvec{y}}- \mathrm{tr}\,({\varvec{D}}_a)$ for ${\varvec{C}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}$ and ${\varvec{D}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}{\varvec{{\Sigma }}}$. For ${\varvec{u}}={\varvec{y}}-{\varvec{X}}{\varvec{\beta }}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}$, $\ell _a$ is rewritten as $\ell _a={\varvec{u}}^\top {\varvec{C}}_a{\varvec{u}}-\mathrm{tr}\,({\varvec{D}}_a)$. By the Taylor series expansion,

$$\begin{aligned} 0= & {} {\mathbf{col}}_a(\ell _a) + {\mathbf{mat}}_{ab}(\ell _{a(b)})(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})\\&+{1\over 2}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \ell _{a(bc)} (\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} +O_p(N^{-1/2}), \end{aligned}$$

where ${\mathbf{mat}}_{ab}(x_{ab})$ is a $k\times k$ matrix with the (a, b)-th element $x_{ab}$. Then,

$$\begin{aligned} \widehat{{\varvec{\psi }}}-{\varvec{\psi }}= & {} - \{{\mathbf{mat}}_{ab}(\ell _{a(b)}) \}^{-1} \left[ {\mathbf{col}}_a(\ell _a) +{1\over 2}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \ell _{a(bc)}(\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} \right] \\&+O_p(N^{-3/2}). \end{aligned}$$

Since $\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_a)=\mathrm{tr}\,({\varvec{D}}_a)$, we have $\ell _a=\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}$. In addition, $\ell _{a(b)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})+\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}$ and $\ell _{a(bc)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})+\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}$. Let ${\varvec{A}}_1={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})\}$ and ${\varvec{A}}_0={\mathbf{mat}}_{ab}[\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}]$. It is noted that ${\varvec{A}}_1=O(N)$, ${\varvec{A}}_0=O_p(N^{1/2})$, $\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})=O(N)$ and $\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}=O_p(N^{1/2})$. Then it can be seen that

$$\begin{aligned} \{{\mathbf{mat}}_{ab}(\ell _{a(b)}) \}^{-1} = ({\varvec{A}}_1+{\varvec{A}}_0)^{-1}={\varvec{A}}_1^{-1}-{\varvec{A}}_1^{-1}{\varvec{A}}_0{\varvec{A}}_1^{-1}+O_p(N^{-2}), \end{aligned}$$

so that

$$\begin{aligned} \widehat{{\varvec{\psi }}}-{\varvec{\psi }}=&-{\varvec{A}}_1^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] + {\varvec{A}}_1^{-1}{\varvec{A}}_0{\varvec{A}}_1^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] \nonumber \\&-{1\over 2}{\varvec{A}}_1^{-1}{\mathbf{col}}_a\left\{ \sum _{b=1}^k\sum _{c=1}^k \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})(\widehat{\psi }_b-\psi _b)(\widehat{\psi }_c-\psi _c)\right\} +O_p(N^{-3/2}). \end{aligned}$$
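The first-order expansion of $({\varvec{A}}_1+{\varvec{A}}_0)^{-1}$ used in this step can be illustrated numerically. The following is a minimal sketch with arbitrary $2\times 2$ matrices standing in for ${\varvec{A}}_1$ and ${\varvec{A}}_0$ (made-up values, and the helper names are hypothetical): if the remainder is of second order in the perturbation, halving the perturbation should reduce the approximation error by a factor close to four.

```python
def inv2(M):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def approx_error(A1, A0, scale):
    """Max-abs error of A1^{-1} - A1^{-1} E A1^{-1} as an approximation
    to (A1 + E)^{-1}, where E = scale * A0."""
    pert = [[scale * v for v in row] for row in A0]
    exact = inv2([[A1[i][j] + pert[i][j] for j in range(2)] for i in range(2)])
    A1_inv = inv2(A1)
    correction = matmul(matmul(A1_inv, pert), A1_inv)
    return max(abs(exact[i][j] - (A1_inv[i][j] - correction[i][j]))
               for i in range(2) for j in range(2))

A1 = [[4.0, 1.0], [1.0, 3.0]]    # plays the role of the dominant matrix A_1
A0 = [[0.5, -0.2], [0.3, 0.4]]   # plays the role of the smaller random part A_0

err_full = approx_error(A1, A0, 0.1)
err_half = approx_error(A1, A0, 0.05)
ratio = err_full / err_half
print(err_full, err_half, ratio)
```

A ratio near four is consistent with a remainder that is quadratic in the perturbation, matching the $O_p(N^{-2})$ term when ${\varvec{A}}_1^{-1}=O(N^{-1})$ and ${\varvec{A}}_0=O_p(N^{1/2})$.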

It is noted that $({\varvec{C}}_a)_{ij}=({\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}})_{ij}=({\varvec{W}}_a)_{ij}+O(N^{-1})$, $({\varvec{C}}_{a(b)})_{ij}=({\varvec{W}}_{a(b)})_{ij}+O(N^{-1})$ and $({\varvec{C}}_{a(bc)})_{ij}=({\varvec{W}}_{a(bc)})_{ij}+O(N^{-1})$. Then, $\mathrm{tr}\,({\varvec{C}}_a{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}})+O(1)$, $\mathrm{tr}\,({\varvec{C}}_{a(b)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+O(1)$ and $\mathrm{tr}\,({\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+O(1)$. Since ${\varvec{D}}_a={\varvec{C}}_a{\varvec{{\Sigma }}}$, ${\varvec{D}}_{a(b)}={\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(b)}$ and ${\varvec{D}}_{a(bc)}={\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}}+{\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}_{(c)}+{\varvec{C}}_{a(c)}{\varvec{{\Sigma }}}_{(b)}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(bc)}$, it is seen that $\mathrm{tr}\,({\varvec{D}}_{a(b)})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)}) +O(1)$ and $\mathrm{tr}\,({\varvec{D}}_{a(bc)})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_{a(c)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})+O(1)$. Thus,

$$\begin{aligned} \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})= & {} - \mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})+O(1),\nonumber \\ \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})= & {} -\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})-\mathrm{tr}\,({\varvec{W}}_{a(c)}{\varvec{{\Sigma }}}_{(b)})-\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)}) +O(1). \end{aligned}\tag{9}$$

Letting ${\varvec{A}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})\}$, we have ${\varvec{A}}_1=-{\varvec{A}}+O(1)$. Using Lemma A.1, we can approximate the covariance matrix of $\widehat{{\varvec{\psi }}}$ as

$$\begin{aligned} E[(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})^\top ]&= {\varvec{A}}_1^{-1} {\mathbf{mat}}_{ab}( E[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\mathrm{tr}\,\{{\varvec{C}}_b({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] ) {\varvec{A}}_1^{-1} + O(N^{-3/2})\\&= 2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}+{\varvec{A}}^{-1}{\widetilde{{\varvec{B}}}}{\varvec{A}}^{-1} + O(N^{-3/2}), \end{aligned}$$

for ${\varvec{B}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})\}$ and ${\widetilde{{\varvec{B}}}}={\mathbf{mat}}_{ab}\{K_eh_e({\varvec{W}}_a,{\varvec{W}}_b)+K_vh_v({\varvec{W}}_a,{\varvec{W}}_b)\}$.

The bias of $\widehat{{\varvec{\psi }}}$ is

$$\begin{aligned} E(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})&= -{1\over 2}{\varvec{A}}^{-1}{\mathbf{col}}_a\left[ \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})\} ({\varvec{A}}^{-1}(2 {\varvec{B}}+{\widetilde{{\varvec{B}}}}){\varvec{A}}^{-1})_{bc}\right] \\&\quad + E({\varvec{A}}^{-1}{\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_a[\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}] ) +O(N^{-3/2}). \end{aligned}$$

Concerning the second term on the right-hand side, the $a$-th element of $E\{ ({\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_c[\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}])\}$ is

$$\begin{aligned}&E\{({\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_c[\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}])_a\}\\&\quad =\sum _{b=1}^k\sum _{c=1}^k E[\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}({\varvec{A}})^{bc}\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}]\\&\quad =\sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}}) +K_eh_e({\varvec{W}}_{a(b)},{\varvec{W}}_c)+K_vh_v({\varvec{W}}_{a(b)},{\varvec{W}}_c)\}({\varvec{A}})^{bc} + O(N^{-1}). \end{aligned}$$

Then,

$$\begin{aligned}&E(\widehat{{\varvec{\psi }}}-{\varvec{\psi }})\nonumber \\&\quad ={\varvec{A}}^{-1}{\mathbf{col}}_a\left( \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}}) +K_eh_e({\varvec{W}}_{a(b)},{\varvec{W}}_c)+K_vh_v({\varvec{W}}_{a(b)},{\varvec{W}}_c)\}({\varvec{A}})^{bc}\right) \nonumber \\&\quad -{1\over 2}{\varvec{A}}^{-1}{\mathbf{col}}_a\left[ \sum _{b=1}^k\sum _{c=1}^k \{2\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})\} ({\varvec{A}}^{-1} (2 {\varvec{B}}+{\widetilde{{\varvec{B}}}}){\varvec{A}}^{-1})_{bc}\right] +O(N^{-3/2}), \end{aligned}$$

which provides the expression in (3) in Theorem 2.1.

A.3 Proof of Proposition 3.1

Case of ${\varvec{W}}_a={\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}$. We have ${\varvec{W}}_{a(b)}=-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}+{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}$, which yields that $\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}$ and $({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}$. Thus, ${\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}={\varvec{A}}^{-1}$ and the covariance matrix of $\widehat{{\varvec{\psi }}}$ is $2{\varvec{A}}^{-1}+O(N^{-3/2})$. Moreover, note that

$$\begin{aligned} ({\varvec{K}}_a)_{bc}=&\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}{\varvec{W}}_c{\varvec{{\Sigma }}})= -2\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)}),\\ ({\varvec{H}}_a)_{bc}=&\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})=-2\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(c)}), \end{aligned}$$

which shows that ${\varvec{W}}_a^\mathrm{REML}$ satisfies (4).

Case of ${\varvec{W}}_a=({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}+{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1})/2$. From (2), it follows that $({\varvec{A}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})$ and $({\varvec{B}})_{ab}=\{\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)})\}/2$. The asymptotic covariance matrix of $\widehat{{\varvec{\psi }}}$ is $2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}$, and the bias is derived from (3).

Case of ${\varvec{W}}_a={\varvec{{\Sigma }}}_{(a)}$. Straightforward calculation shows that $({\varvec{A}})_{ab} = \mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})$ and $({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}})$. The asymptotic covariance matrix of $\widehat{{\varvec{\psi }}}$ is $2 {\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}+O(N^{-3/2})$. Moreover, since ${\varvec{W}}_{a(b)}=\mathbf{0}$, the condition (4) holds.
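The matrices ${\varvec{A}}$ and ${\varvec{B}}$ for the three weight choices can be computed directly from their definitions, $({\varvec{A}})_{ab}=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})$ and $({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})$, and compared with the closed forms above. The sketch below does this in exact rational arithmetic for an arbitrary $2\times 2$ symmetric positive definite ${\varvec{{\Sigma }}}$ and two arbitrary symmetric matrices standing in for ${\varvec{{\Sigma }}}_{(1)}$ and ${\varvec{{\Sigma }}}_{(2)}$. These values are made up for illustration and do not correspond to a model in the paper; the function names are hypothetical.

```python
from fractions import Fraction as F

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tr(*Ms):
    """Trace of the product of the given square matrices."""
    prod = Ms[0]
    for M in Ms[1:]:
        prod = matmul(prod, M)
    return sum(prod[i][i] for i in range(len(prod)))

def half_sum(P, Q):
    n = len(P)
    return [[(P[i][j] + Q[i][j]) / 2 for j in range(n)] for i in range(n)]

# Arbitrary symmetric positive definite Sigma with its exact inverse.
S = [[F(2), F(1)], [F(1), F(3)]]
S_inv = [[F(3, 5), F(-1, 5)], [F(-1, 5), F(2, 5)]]
dS = [[[F(1), F(0)], [F(0), F(0)]],   # stands in for Sigma_(1)
      [[F(0), F(1)], [F(1), F(0)]]]   # stands in for Sigma_(2)

W_reml = [matmul(matmul(S_inv, d), S_inv) for d in dS]
W_fh = [half_sum(matmul(S_inv, d), matmul(d, S_inv)) for d in dS]

def A_and_B(W):
    """A_ab = tr(W_a Sigma_(b)) and B_ab = tr(W_a Sigma W_b Sigma)."""
    k = len(W)
    A = [[tr(W[a], dS[b]) for b in range(k)] for a in range(k)]
    B = [[tr(W[a], S, W[b], S) for b in range(k)] for a in range(k)]
    return A, B

A_re, B_re = A_and_B(W_reml)
A_fh, B_fh = A_and_B(W_fh)

# Closed forms stated in the proof for the FH-type weights.
A_fh_closed = [[tr(S_inv, dS[a], dS[b]) for b in range(2)] for a in range(2)]
B_fh_closed = [[(tr(dS[a], dS[b]) + tr(S_inv, dS[a], S, dS[b])) / 2
                for b in range(2)] for a in range(2)]

print(A_re == B_re, A_fh == A_fh_closed, B_fh == B_fh_closed)
```

With REML-type weights the computed ${\varvec{A}}$ and ${\varvec{B}}$ coincide, which is what reduces $2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}$ to $2{\varvec{A}}^{-1}$. The FH-type closed forms rely on all matrices involved being symmetric.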

Appendix B: Summary of estimation methods in specific models

Here, we provide specific forms of the REML-type, FH-type, and their OLS-based estimators, the PR-type estimator and the Prasad–Rao estimator in the Fay–Herriot model and the nested error regression model.

B.1 Fay–Herriot model

The marginal distribution of ${\varvec{y}}=(y_1, \ldots , y_m)^\top $ in the Fay–Herriot model has $E[{\varvec{y}}]={\varvec{X}}{\varvec{\beta }}$ and $\mathbf{Cov}\,({\varvec{y}})={\varvec{{\Sigma }}}=\psi _1{\varvec{I}}_m + {\varvec{D}}$, where $p$ is the dimension of ${\varvec{\beta }}$ and ${\varvec{D}}=\mathrm{diag}\,(D_1, \ldots , D_m)$.

REML $\widehat{\psi }_1^\mathrm{RE}$ corresponds to ${\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}$ and the estimating equation is $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})$ for ${\varvec{P}}={\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{X}}({\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}$.

OLS-based REML $\widehat{\psi }_1^\mathrm{ORM}$ corresponds to ${\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$ and the estimating equation is $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}})$ for ${\widetilde{{\varvec{P}}}}={\varvec{I}}-{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top $.

Fay–Herriot estimator $\widehat{\psi }_1^\mathrm{FH}$ corresponds to ${\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}$ and the estimating equation is $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=m-p$.

OLS-based FH estimator $\widehat{\psi }_1^\mathrm{OFH}$ corresponds to ${\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$ and the estimating equation is $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=m-2p+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}}\}$.

Prasad–Rao estimator $\widehat{\psi }_1^\mathrm{PR}$ corresponds to ${\varvec{W}}_1^\mathrm{Q}={\varvec{I}}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$ and it is given by $\widehat{\psi }_1^\mathrm{PR}=[{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}- \mathrm{tr}\,({\varvec{D}})+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{D}}{\varvec{X}}\}]/(m-p)$.
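To make two of these estimating equations concrete, here is a minimal sketch for the special case of an intercept-only model, that is, ${\varvec{X}}$ equal to a column of ones and $p=1$. This simplification, the made-up data, the function names, the bisection solver and the truncation of the Fay–Herriot solution at zero are all assumptions of the sketch rather than details given in the paper. With ${\varvec{X}}$ a column of ones, $\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{D}}{\varvec{X}}\}$ is the average of the $D_i$.

```python
def gls_mean(y, D, psi):
    """GLS estimate of a common mean when Var(y_i) = psi + D_i."""
    w = [1.0 / (psi + d) for d in D]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def fh_gap(y, D, psi):
    """Left side minus right side (m - p, with p = 1) of the FH equation."""
    beta = gls_mean(y, D, psi)
    lhs = sum((yi - beta) ** 2 / (psi + d) for yi, d in zip(y, D))
    return lhs - (len(y) - 1)

def fay_herriot(y, D, upper=1e6, iterations=200):
    """Bisection on psi in [0, upper]; assumes all D_i > 0."""
    if fh_gap(y, D, 0.0) <= 0.0:
        return 0.0  # truncation at zero is an assumption of this sketch
    lo, hi = 0.0, upper
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fh_gap(y, D, mid) > 0.0:
            lo = mid   # weighted residual sum still too large: increase psi
        else:
            hi = mid
    return 0.5 * (lo + hi)

def prasad_rao(y, D):
    """Closed-form Prasad-Rao estimator for X = column of ones (p = 1)."""
    m = len(y)
    ybar = sum(y) / m
    rss = sum((yi - ybar) ** 2 for yi in y)        # y^T P~ y
    return (rss - sum(D) + sum(D) / m) / (m - 1)   # left untruncated here

y = [1.2, -0.7, 3.1, 0.4, 2.5, -1.9]   # made-up direct estimates
D_equal = [0.5] * 6                    # equal sampling variances
psi_fh = fay_herriot(y, D_equal)
psi_pr = prasad_rao(y, D_equal)
print(psi_fh, psi_pr)
```

Bisection is valid because the minimized weighted residual sum of squares is decreasing in $\psi _1$. When all $D_i$ are equal, the two estimators coincide, which gives a simple consistency check; with unequal $D_i$ they generally differ.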

The asymptotic variances and second-order biases follow from Proposition 3.2. REML $\widehat{\psi }_1^{\mathrm{RE}}$ and OLS-based REML $\widehat{\psi }_1^\mathrm{ORM}$ have the same asymptotic variance and second-order bias

$$\begin{aligned} \mathrm{Var}(\widehat{\psi }_1^\mathrm{RE})&\approx {2 \over \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})} + {K_e\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}) \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^2},\\ \mathrm{Bias}(\widehat{\psi }_1^\mathrm{RE})&\approx -2{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-5}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-5})\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^2}\\&\quad + 2{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-3})\{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-4}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-4})\}\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}^3}. \end{aligned}$$

Fay–Herriot estimator $\widehat{\psi }_1^\mathrm{FH}$ and OLS-based FH estimator $\widehat{\psi }_1^\mathrm{OFH}$ have the same asymptotic variance and second-order bias

$$\begin{aligned} \mathrm{Var}(\widehat{\psi }_1^\mathrm{FH})&\approx {2m \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2 } + {K_e\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}) \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2},\\ \mathrm{Bias}(\widehat{\psi }_1^\mathrm{FH})&\approx 2{m\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})-\{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2 \over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^3}\\&\quad -{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-3}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-3})\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2} + {\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\{K_e \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{D}}^2)+\psi _1^2K_v\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\}\over \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^3}, \end{aligned}$$

which implies that $\widehat{\psi }_1^\mathrm{UFH}=\widehat{\psi }_1^\mathrm{FH}-2[m\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-2})-\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^2]/\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^3$ is unbiased up to second order under normality, where $\widehat{{\varvec{{\Sigma }}}}=\widehat{\psi }_1^\mathrm{FH}{\varvec{I}}_m+{\varvec{D}}$.

Prasad–Rao estimator $\widehat{\psi }_1^\mathrm{PR}$ is second-order unbiased and has the asymptotic variance $\mathrm{Var}(\widehat{\psi }_1^\mathrm{PR})\approx \{2\mathrm{tr}\,({\varvec{{\Sigma }}}^2)+K_e\mathrm{tr}\,({\varvec{D}}^2)+m\psi _1^2K_v\}/m^2$.
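Under normality ($K_e=K_v=0$) the expressions above reduce to functions of the diagonal entries $\psi _1+D_i$ of ${\varvec{{\Sigma }}}$, so they are easy to tabulate. The following sketch evaluates the leading variance terms of the three estimators and the second-order bias of the Fay–Herriot estimator; the function name and the numerical inputs are made up for illustration.

```python
def fh_model_summaries(psi, D):
    """Leading-order variances and FH bias under normality (K_e = K_v = 0)."""
    m = len(D)
    s = [psi + d for d in D]                   # diagonal of Sigma
    tr_inv1 = sum(1.0 / v for v in s)          # tr(Sigma^{-1})
    tr_inv2 = sum(1.0 / v ** 2 for v in s)     # tr(Sigma^{-2})
    tr_sq = sum(v ** 2 for v in s)             # tr(Sigma^2)
    return {
        "var_reml": 2.0 / tr_inv2,
        "var_fh": 2.0 * m / tr_inv1 ** 2,
        "var_pr": 2.0 * tr_sq / m ** 2,
        "bias_fh": 2.0 * (m * tr_inv2 - tr_inv1 ** 2) / tr_inv1 ** 3,
    }

unequal = fh_model_summaries(psi=1.0, D=[0.2, 0.5, 1.0, 2.0, 4.0])
equal = fh_model_summaries(psi=1.0, D=[1.0] * 5)
print(unequal)
print(equal)
```

In the normal case the Cauchy–Schwarz inequality gives the ordering REML, then FH, then Prasad–Rao from smallest to largest leading variance, with equality when all $D_i$ are equal; the FH bias term is nonnegative and vanishes in the equal-variance case. These orderings need not hold when $K_e$ or $K_v$ is nonzero.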

B.2 Nested error regression model

The NER model is written as ${\varvec{y}}_i={\varvec{X}}_i{\varvec{\beta }}+{\varvec{j}}_{n_i}v_i + {\varvec{\varepsilon }}_i$ for $i=1, \ldots , m$, where ${\varvec{y}}_i$, ${\varvec{\beta }}$ and ${\varvec{\varepsilon }}_i$ are $n_i$, p and $n_i$ dimensional vectors, ${\varvec{X}}_i$ is an $n_i\times p$ matrix, $v_i$ is scalar and ${\varvec{j}}_{n_i}=(1, \ldots , 1)^\top \in {\mathbb {R}}^{n_i}$. Here, $v_i$ and ${\varvec{\varepsilon }}_i$ are independent random variables such that $E[v_i]=0$, $\mathrm{Var}(v_i)=\psi _1$, $E[{\varvec{\varepsilon }}_i]=\mathbf{0}$ and $\mathbf{Cov}\,({\varvec{\varepsilon }}_i)=\psi _2{\varvec{I}}_{n_i}$. Let ${\varvec{y}}=({\varvec{y}}_1^\top , \ldots , {\varvec{y}}_m^\top )^\top $, ${\varvec{X}}=({\varvec{X}}_1^\top , \ldots , {\varvec{X}}_m^\top )^\top $, $N=\sum _{i=1}^m n_i$ and ${\varvec{G}}=\text {block diag}({\varvec{J}}_{n_1}, \ldots , {\varvec{J}}_{n_m})$ for ${\varvec{J}}_{n_i}={\varvec{j}}_{n_i}{\varvec{j}}_{n_i}^\top $. In addition, let ${\varvec{{\Sigma }}}=\text {block diag}({\varvec{{\Sigma }}}_1, \ldots , {\varvec{{\Sigma }}}_m)$ for ${\varvec{{\Sigma }}}_i=\psi _1{\varvec{J}}_{n_i}+\psi _2{\varvec{I}}_{n_i}$. Then, ${\varvec{{\Sigma }}}=\psi _1{\varvec{G}}+\psi _2{\varvec{I}}_N$, ${\varvec{{\Sigma }}}_{(1)}={\varvec{G}}$ and ${\varvec{{\Sigma }}}_{(2)}={\varvec{I}}_N$.

REML estimators $\widehat{\psi }_1^\mathrm{RE}$ and $\widehat{\psi }_2^\mathrm{RE}$ correspond to ${\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}$, ${\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}$, and the estimating equations are $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}}{\varvec{G}})$ and $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})$.

OLS-based REML $\widehat{\psi }_1^\mathrm{ORM}$ and $\widehat{\psi }_2^\mathrm{ORM}$ correspond to ${\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}$, ${\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$, and the estimating equations are $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1})$ and $({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2})$.

FH-type estimators $\widehat{\psi }_1^\mathrm{FH}$ and $\widehat{\psi }_2^\mathrm{FH}$ correspond to ${\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2$, ${\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}$, and the estimating equations are

$$\begin{aligned}&\sum _{i=1}^m{n_i^2({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2\over n_i\psi _1+\psi _2}= N - \sum _{i=1}^m {n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\over n_i\psi _1+\psi _2},\\&\quad \psi _2={1\over N-p}\sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2-{1\over N-p}\sum _{i=1}^m{n_i^2\psi _1\over n_i\psi _1+\psi _2}({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{G})^2. \end{aligned}$$

OLS-based FH estimators $\widehat{\psi }_1^\mathrm{OFH}$ and $\widehat{\psi }_2^\mathrm{OFH}$ correspond to ${\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2$, ${\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$, and the estimating equations are

$$\begin{aligned}&\sum _{i=1}^m{n_i^2({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2\over n_i\psi _1+\psi _2}= N - 2\sum _{i=1}^m n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\\&\quad +\sum _{i=1}^m {n_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\over n_i\psi _1+\psi _2},\\&\sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2-\sum _{i=1}^m{n_i^2\psi _1\over n_i\psi _1+\psi _2}({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2 =\psi _2\,\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-1}). \end{aligned}$$

PR-type estimators $\widehat{\psi }_1^\mathrm{Q}$ and $\widehat{\psi }_2^\mathrm{Q}$ correspond to ${\varvec{W}}_1^\mathrm{Q}={\varvec{G}}$, ${\varvec{W}}_2^\mathrm{Q}={\varvec{I}}$ and $\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}$, and the estimators are $\widehat{\psi }_1^\mathrm{Q} = \{\sum _{i=1}^m n_i^2 ({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2-\widehat{\psi }_2^\mathrm{Q}\,\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})\}/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}$ and

$$\begin{aligned} \widehat{\psi }_2^\mathrm{Q} ={ \sum _{i=1}^m\sum _{j=1}^{n_i}(y_{ij}-{\varvec{x}}_{ij}^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2- [\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}]\sum _{i=1}^m n_i^2 ({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2 \over N-p-\{\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})\}^2/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}}. \end{aligned}$$

Prasad–Rao estimators are $\widehat{\psi }_1^\mathrm{PR}=\{{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}-(N-p)\widehat{\psi }_2^\mathrm{PR}\}/\{N-\sum _{i=1}^mn_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\}$ and $\widehat{\psi }_2^\mathrm{PR}=\{ {\varvec{y}}^\top \{{\varvec{E}}-{\varvec{E}}{\varvec{X}}({\varvec{X}}^\top {\varvec{E}}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{E}}\}{\varvec{y}}\}/(N-k-p)$, where ${\varvec{E}}=\text {block diag}({\varvec{I}}_{n_1}-n_1^{-1}{\varvec{J}}_{n_1}, \ldots , {\varvec{I}}_{n_m}-n_m^{-1}{\varvec{J}}_{n_m})$.
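The closed forms of the PR-type estimators $\widehat{\psi }_1^\mathrm{Q}$ and $\widehat{\psi }_2^\mathrm{Q}$ can be recovered from the two moment equations with ${\varvec{W}}_1^\mathrm{Q}={\varvec{G}}$ and ${\varvec{W}}_2^\mathrm{Q}={\varvec{I}}$. The following is only a sketch of that step: the shorthand ${\varvec{r}}$, $S_G$, $S_I$, $t_1$ and $t_2$ is introduced here for brevity and is not part of the original notation, and it is assumed that ${\varvec{X}}$ has full column rank $p$, so that $\mathrm{tr}\,({\widetilde{{\varvec{P}}}})=N-p$.

$$\begin{aligned}&{\varvec{r}}={\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O}={\widetilde{{\varvec{P}}}}{\varvec{y}},\quad S_G={\varvec{r}}^\top {\varvec{G}}{\varvec{r}}=\sum _{i=1}^m n_i^2({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2,\quad S_I={\varvec{r}}^\top {\varvec{r}},\quad t_1=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}}),\quad t_2=\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\},\\&E[S_G]=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{G}})=\psi _1 t_2+\psi _2 t_1,\qquad E[S_I]=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}})=\psi _1 t_1+\psi _2 (N-p),\\&\widehat{\psi }_1^\mathrm{Q}={S_G-\widehat{\psi }_2^\mathrm{Q}\,t_1\over t_2},\qquad \widehat{\psi }_2^\mathrm{Q}={S_I-(t_1/t_2)S_G\over N-p-t_1^2/t_2}. \end{aligned}$$

The second line uses ${\varvec{{\Sigma }}}=\psi _1{\varvec{G}}+\psi _2{\varvec{I}}_N$ together with ${\widetilde{{\varvec{P}}}}^2={\widetilde{{\varvec{P}}}}$; the third line replaces the expectations by the observed $S_G$ and $S_I$ and solves the resulting $2\times 2$ linear system.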

Hereafter we assume that $K_e=K_v=0$ for simplicity. Note that ${\varvec{{\Sigma }}}{\varvec{G}}={\varvec{G}}{\varvec{{\Sigma }}}$, $\psi _1{\varvec{G}}={\varvec{{\Sigma }}}-\psi _2{\varvec{I}}_N$, $\psi _2{\varvec{{\Sigma }}}^{-1}={\varvec{I}}_N-\psi _1\text {block diag}({\gamma }_1{\varvec{J}}_{n_1}, \ldots , {\gamma }_m{\varvec{J}}_{n_m})$, and $\psi _2^2{\varvec{{\Sigma }}}^{-2}={\varvec{I}}_N-\psi _1\text {block diag}((1+\psi _2{\gamma }_1){\gamma }_1{\varvec{J}}_{n_1}, \ldots , (1+\psi _2{\gamma }_m){\gamma }_m{\varvec{J}}_{n_m})$ for ${\gamma }_i=1/(\psi _2+n_i\psi _1)$. The asymptotic covariance matrices and second-order biases then follow from Proposition 3.3. REML $\widehat{{\varvec{\psi }}}^\mathrm{RE}$ and OLS-based REML $\widehat{{\varvec{\psi }}}^\mathrm{ORM}$ are second-order unbiased and have the same asymptotic covariance matrix

$$\begin{aligned} \mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{RE})\approx&2 \begin{pmatrix}\mathrm{tr}\,\{({\varvec{{\Sigma }}}^{-1}{\varvec{G}})^2\} &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\end{pmatrix}^{-1} =2 \begin{pmatrix}\sum _{i=1}^mn_i^2{\gamma }_i^2 &{} \sum _{i=1}^mn_i{\gamma }_i^2\\ \sum _{i=1}^mn_i{\gamma }_i^2 &{} (N-m)/\psi _2^2+\sum _{i=1}^m{\gamma }_i^2\end{pmatrix}^{-1}, \end{aligned}$$

which was given in Datta and Lahiri (2000).
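As a numerical sanity check, the closed-form entries of the REML matrix can be compared with traces computed block by block from the stated expression for ${\varvec{{\Sigma }}}^{-1}$. The sketch below does this in plain Python under the assumption $K_e=K_v=0$ used in the text; all function names (`block_matrices`, `reml_information`, `reml_cov`, and so on) are hypothetical helpers written for this illustration, not part of any library.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def block_matrices(n, psi1, psi2):
    """Return (Sigma_i, closed-form Sigma_i^{-1}, J_n) for one area of size n."""
    gamma = 1.0 / (psi2 + n * psi1)
    eye = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    ones = [[1.0] * n for _ in range(n)]
    # Sigma_i = psi2 * I + psi1 * J
    sigma = [[psi2 * eye[r][c] + psi1 for c in range(n)] for r in range(n)]
    # psi2 * Sigma_i^{-1} = I - psi1 * gamma_i * J
    sigma_inv = [[(eye[r][c] - psi1 * gamma) / psi2 for c in range(n)]
                 for r in range(n)]
    return sigma, sigma_inv, ones


def reml_information(sizes, psi1, psi2):
    """Closed-form 2x2 matrix whose inverse, times 2, is the REML covariance."""
    g = [1.0 / (psi2 + n * psi1) for n in sizes]
    a11 = sum(n * n * gi * gi for n, gi in zip(sizes, g))
    a12 = sum(n * gi * gi for n, gi in zip(sizes, g))
    a22 = (sum(sizes) - len(sizes)) / psi2 ** 2 + sum(gi * gi for gi in g)
    return [[a11, a12], [a12, a22]]


def reml_information_bruteforce(sizes, psi1, psi2):
    """Same matrix from explicit per-block traces, as a cross-check."""
    a11 = a12 = a22 = 0.0
    for n in sizes:
        _, si, j = block_matrices(n, psi1, psi2)
        sij = matmul(si, j)          # Sigma_i^{-1} J
        si2 = matmul(si, si)         # Sigma_i^{-2}
        a11 += trace(matmul(sij, sij))
        a12 += trace(matmul(si2, j))
        a22 += trace(si2)
    return [[a11, a12], [a12, a22]]


def reml_cov(sizes, psi1, psi2):
    """Approximate covariance: 2 times the inverse of the 2x2 matrix above."""
    (a, b), (c, d) = reml_information(sizes, psi1, psi2)
    det = a * d - b * c
    return [[2 * d / det, -2 * b / det], [-2 * c / det, 2 * a / det]]
```

For the balanced toy design `reml_cov([2, 2], 1.0, 1.0)` the result is approximately `[[2.5, -0.5], [-0.5, 1.0]]`, and the two information functions agree to rounding error for unbalanced sizes such as `[2, 3, 5]`.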

Fay–Herriot estimator $\widehat{{\varvec{\psi }}}^\mathrm{FH}$ and OLS-based FH estimator $\widehat{{\varvec{\psi }}}^\mathrm{OFH}$ have the same asymptotic covariance matrix $\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{FH})\approx 2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1}$, where

$$\begin{aligned} {\varvec{A}}_\mathrm{FH}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2{\gamma }_i &{} \sum _{i=1}^mn_i{\gamma }_i\\ \sum _{i=1}^mn_i{\gamma }_i &{} (N-m)/\psi _2+\sum _{i=1}^m{\gamma }_i\end{pmatrix},\\ {\varvec{B}}_\mathrm{FH}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}{\varvec{G}}+{\varvec{G}}^2)/2 &{} N\\ N &{} N\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2 &{} N\\ N &{} N\end{pmatrix}, \end{aligned}$$

and the same second-order bias

$$\begin{aligned} \mathbf{Bias}(\widehat{{\varvec{\psi }}}^\mathrm{FH})\approx 2{\varvec{A}}_\mathrm{FH}^{-1} \begin{pmatrix} \mathrm{tr}\,({\varvec{K}}_1{\varvec{A}}_\mathrm{FH}^{-1})-\mathrm{tr}\,({\varvec{H}}_1{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1})\\ \mathrm{tr}\,({\varvec{K}}_2{\varvec{A}}_\mathrm{FH}^{-1})-\mathrm{tr}\,({\varvec{H}}_2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1})\end{pmatrix}, \end{aligned}$$

where

$$\begin{aligned} {\varvec{K}}_1=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^3)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^3{\gamma }_i &{} \sum _i n_i^2{\gamma }_i \\ \sum _i n_i^2{\gamma }_i &{} \sum _i n_i{\gamma }_i\end{pmatrix},\\ {\varvec{K}}_2=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^2{\gamma }_i &{} \sum _i n_i{\gamma }_i \\ \sum _i n_i{\gamma }_i &{} (N-m)/\psi _2+\sum _i {\gamma }_i\end{pmatrix},\\ {\varvec{H}}_1=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}}^2)&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^3{\gamma }_i^2 &{} \sum _i n_i^2{\gamma }_i^2 \\ \sum _i n_i^2{\gamma }_i^2 &{} \sum _i n_i{\gamma }_i^2\end{pmatrix},\\ {\varvec{H}}_2=&-\begin{pmatrix} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2}{\varvec{G}})&{} \mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\end{pmatrix} =-\begin{pmatrix} \sum _i n_i^2{\gamma }_i^2 &{} \sum _i n_i{\gamma }_i^2 \\ \sum _i n_i{\gamma }_i^2 &{} (N-m)/\psi _2^2+\sum _i {\gamma }_i^2\end{pmatrix}. \end{aligned}$$
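The sandwich covariance $2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1}$ and the bias vector for the FH-type estimators involve only $2\times 2$ matrices built from the area sizes $n_i$ and ${\gamma }_i$, so they are easy to evaluate numerically. The following is a minimal sketch in plain Python, assuming $K_e=K_v=0$ as in the text; the function name `fh_cov_and_bias` and its helpers are hypothetical and written only for this illustration. The (2,2) entry of ${\varvec{H}}_2$ is coded as $\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})=(N-m)/\psi _2^2+\sum _i{\gamma }_i^2$, the same closed form that appears in the REML covariance.

```python
def inv2(mat):
    """Inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = mat
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def tr2(x):
    """Trace of a 2x2 matrix."""
    return x[0][0] + x[1][1]


def fh_cov_and_bias(sizes, psi1, psi2):
    """Approximate covariance matrix and second-order bias of the FH-type
    estimators, evaluated from the closed-form sums over areas."""
    g = [1.0 / (psi2 + n * psi1) for n in sizes]
    big_n, m = sum(sizes), len(sizes)

    def s(p, q):
        # sum_i n_i^p * gamma_i^q
        return sum(n ** p * gi ** q for n, gi in zip(sizes, g))

    a_fh = [[s(2, 1), s(1, 1)],
            [s(1, 1), (big_n - m) / psi2 + s(0, 1)]]
    b_fh = [[s(2, 0), big_n], [big_n, big_n]]
    k1 = [[-s(3, 1), -s(2, 1)], [-s(2, 1), -s(1, 1)]]
    k2 = [[-v for v in row] for row in a_fh]
    h1 = [[-s(3, 2), -s(2, 2)], [-s(2, 2), -s(1, 2)]]
    # (2,2) entry is -tr(Sigma^{-2}) = -((N - m)/psi2^2 + sum gamma_i^2)
    h2 = [[-s(2, 2), -s(1, 2)],
          [-s(1, 2), -((big_n - m) / psi2 ** 2 + s(0, 2))]]

    a_inv = inv2(a_fh)
    sandwich = mul2(mul2(a_inv, b_fh), a_inv)      # A^{-1} B A^{-1}
    cov = [[2 * v for v in row] for row in sandwich]

    # Inner vector: tr(K_a A^{-1}) - tr(H_a A^{-1} B A^{-1}), a = 1, 2
    inner = [tr2(mul2(k, a_inv)) - tr2(mul2(h, sandwich))
             for k, h in ((k1, h1), (k2, h2))]
    bias = [2 * (a_inv[i][0] * inner[0] + a_inv[i][1] * inner[1])
            for i in range(2)]
    return cov, bias
```

For the balanced toy input `fh_cov_and_bias([2, 2], 1.0, 2.0)` the covariance is approximately `[[5, -2], [-2, 4]]` and both bias components are numerically zero; for unbalanced sizes the bias is in general nonzero.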

PR-type estimator $\widehat{{\varvec{\psi }}}^\mathrm{Q}$ is second-order unbiased and has the asymptotic covariance matrix $\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{Q})\approx 2{\varvec{A}}_\mathrm{Q}^{-1}{\varvec{B}}_\mathrm{Q}{\varvec{A}}_\mathrm{Q}^{-1}$, where

$$\begin{aligned} {\varvec{A}}_\mathrm{Q}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{G}})\\ \mathrm{tr}\,({\varvec{G}}) &{} \mathrm{tr}\,({\varvec{I}}_N)\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2 &{} N \\ N &{} N \end{pmatrix},\\ {\varvec{B}}_\mathrm{Q}=&\begin{pmatrix}\mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}}^2) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}})\\ \mathrm{tr}\,({\varvec{{\Sigma }}}^2{\varvec{G}}) &{} \mathrm{tr}\,({\varvec{{\Sigma }}}^2)\end{pmatrix} =\begin{pmatrix}\sum _{i=1}^mn_i^2/{\gamma }_i^2 &{} \sum _{i=1}^mn_i/{\gamma }_i^2\\ \sum _{i=1}^mn_i/{\gamma }_i^2 &{} (N-m)\psi _2^2+\sum _{i=1}^m1/{\gamma }_i^2\end{pmatrix}. \end{aligned}$$
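The PR-type covariance $2{\varvec{A}}_\mathrm{Q}^{-1}{\varvec{B}}_\mathrm{Q}{\varvec{A}}_\mathrm{Q}^{-1}$ can be evaluated the same way from the area sizes. The sketch below is a plain-Python illustration under the assumption $K_e=K_v=0$; `pr_type_cov` is a hypothetical name chosen here, and the function assumes at least one $n_i>1$, since otherwise $\sum _i n_i^2=N$ and ${\varvec{A}}_\mathrm{Q}$ is singular.

```python
def pr_type_cov(sizes, psi1, psi2):
    """Approximate covariance 2 * A_Q^{-1} B_Q A_Q^{-1} of the PR-type
    estimators, using the closed-form entries of A_Q and B_Q."""
    big_n, m = sum(sizes), len(sizes)
    inv_gamma_sq = [(psi2 + n * psi1) ** 2 for n in sizes]   # 1 / gamma_i^2

    sum_n2 = sum(n * n for n in sizes)
    a_q = [[sum_n2, big_n], [big_n, big_n]]
    b11 = sum(n * n * w for n, w in zip(sizes, inv_gamma_sq))
    b12 = sum(n * w for n, w in zip(sizes, inv_gamma_sq))
    b22 = (big_n - m) * psi2 ** 2 + sum(inv_gamma_sq)
    b_q = [[b11, b12], [b12, b22]]

    # Explicit 2x2 inverse of A_Q; det = N * (sum n_i^2 - N)
    det = a_q[0][0] * a_q[1][1] - a_q[0][1] * a_q[1][0]
    a_inv = [[a_q[1][1] / det, -a_q[0][1] / det],
             [-a_q[1][0] / det, a_q[0][0] / det]]

    def mul2(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    sandwich = mul2(mul2(a_inv, b_q), a_inv)
    return [[2 * v for v in row] for row in sandwich]
```

For the balanced toy input `pr_type_cov([2, 2], 1.0, 1.0)` the result is approximately `[[2.5, -0.5], [-0.5, 1.0]]`.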

About this article

Cite this article

Kubokawa, T., Sugasawa, S., Tamae, H. et al. General unbiased estimating equations for variance components in linear mixed models. Jpn J Stat Data Sci 4, 841–859 (2021). https://doi.org/10.1007/s42081-021-00138-8

Received: 16 May 2021
Revised: 25 August 2021
Accepted: 29 August 2021
Published: 08 September 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s42081-021-00138-8
