Abstract
This paper introduces a general framework for estimating variance components in linear mixed models via general unbiased estimating equations, which include widely used estimators such as the restricted maximum likelihood estimator. We derive the asymptotic covariance matrices and second-order biases under general estimating equations without assuming normality of the underlying distributions, and identify a class of second-order unbiased estimators of variance components. It is also shown that the asymptotic covariance matrices and second-order biases do not depend on whether the regression coefficients are estimated by the generalized or ordinary least squares methods. We carry out numerical studies to check the performance of the proposed methods based on typical linear mixed models.
References
Battese, G. E., Harter, R. M., & Fuller, W. A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83, 28–36.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Wiley.
Datta, G. S., & Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, 10, 613–627.
Doornik, J. A. (2007). Object-oriented matrix programming using Ox (3rd ed.). Timberlake Consultants Press and Oxford.
Fay, R. E., & Herriot, R. (1979). Estimates of income for small places: An application of James–Stein procedures to census data. Journal of the American Statistical Association, 74, 269–277.
Prasad, N. G. N., & Rao, J. N. K. (1990). The estimation of the mean squared error of small area estimators. Journal of the American Statistical Association, 85, 163–171.
Rao, C. R., & Kleffe, J. (1988). Estimation of variance components and applications. North-Holland.
Rao, J. N. K., & Molina, I. (2015). Small area estimation (2nd ed.). Wiley.
Searle, S. R., Casella, G., & McCulloch, C. E. (1992). Variance components. Wiley.
Verbeke, G., & Molenberghs, G. (2006). Linear mixed models for longitudinal data. Springer.
Acknowledgements
We would like to thank the Associate Editor and the two reviewers for many valuable comments and helpful suggestions, which led to an improved version of this paper. This research was supported in part by Grant-in-Aid for Scientific Research (18K11188) from the Japan Society for the Promotion of Science.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Proofs
A.1 A preliminary lemma
For the proof, we use the following lemma:
Lemma A.1
Let \({\varvec{u}}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}\). Then, for matrices \({\varvec{C}}\) and \({\varvec{D}},\) it holds that
where \(h_e({\varvec{C}},{\varvec{D}})\) and \(h_v({\varvec{C}}, {\varvec{D}})\) are given in Theorem 2.1.
Proof
It is demonstrated that \(E[{\varvec{u}}^\top {\varvec{C}}{\varvec{u}}{\varvec{u}}^\top {\varvec{D}}{\varvec{u}}]=E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}] +E[{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{C}}{\varvec{Z}}{\varvec{v}}{\varvec{v}}^\top {\varvec{Z}}^\top {\varvec{D}}{\varvec{Z}}{\varvec{v}}]+\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+\mathrm{tr}\,({\varvec{D}}{\varvec{R}}_e)\mathrm{tr}\,({\varvec{C}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )+4\mathrm{tr}\,({\varvec{C}}{\varvec{R}}_e{\varvec{D}}{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top )\). Let \({\varvec{x}}=(x_1, \ldots , x_N)^\top ={\varvec{R}}_e^{-1/2}{\varvec{\varepsilon }}\), \({\widetilde{{\varvec{C}}}}={\varvec{R}}_e^{1/2}{\varvec{C}}{\varvec{R}}_e^{1/2}\) and \({\widetilde{{\varvec{D}}}}={\varvec{R}}_e^{1/2}{\varvec{D}}{\varvec{R}}_e^{1/2}\). Then, \(E[{\varvec{x}}]=\mathbf{0}\), \(E[{\varvec{x}}{\varvec{x}}^\top ]={\varvec{I}}_N\), \(E[x_a^4]=K_e+3\), \(a=1, \ldots , N\), and \(E[{\varvec{\varepsilon }}^\top {\varvec{C}}{\varvec{\varepsilon }}{\varvec{\varepsilon }}^\top {\varvec{D}}{\varvec{\varepsilon }}]=E[{\varvec{x}}^\top {\widetilde{{\varvec{C}}}}{\varvec{x}}{\varvec{x}}^\top {\widetilde{{\varvec{D}}}}{\varvec{x}}]\). Let \({\delta }_{a=b=c=d}=1\) for \(a=b=c=d\), and otherwise, \({\delta }_{a=b=c=d}=0\). The notation \({\delta }_{a=b\not = c=d}\) is defined similarly. It is observed that for \(a, b, c, d =1, \ldots , N\),
which implies that
or
Similarly,
Thus, we have
which can be rewritten as the expression in (9) for \({\varvec{{\Sigma }}}={\varvec{R}}_e+{\varvec{Z}}{\varvec{R}}_v{\varvec{Z}}^\top \). \(\square \)
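The fourth-moment step for the standardized vector \({\varvec{x}}\) can be checked exactly in a small special case. The sketch below is an illustration under assumptions that are not taken from the paper: the entries of \({\varvec{x}}\) are i.i.d. Rademacher variables (values \(\pm 1\) with equal probability), so that \(E[x_a^4]=1\) and hence \(K_e=-2\), and the matrices are symmetric. In that setting the identity being tested is \(E[{\varvec{x}}^\top {\widetilde{{\varvec{C}}}}{\varvec{x}}{\varvec{x}}^\top {\widetilde{{\varvec{D}}}}{\varvec{x}}]=2\mathrm{tr}\,({\widetilde{{\varvec{C}}}}{\widetilde{{\varvec{D}}}})+\mathrm{tr}\,({\widetilde{{\varvec{C}}}})\mathrm{tr}\,({\widetilde{{\varvec{D}}}})+K_e\sum _a ({\widetilde{{\varvec{C}}}})_{aa}({\widetilde{{\varvec{D}}}})_{aa}\). The function names `exact_fourth_moment` and `closed_form` are hypothetical.

```python
import itertools
import numpy as np

def exact_fourth_moment(C, D):
    """E[x'Cx x'Dx] by enumerating all sign vectors x in {-1,+1}^N (uniform)."""
    N = C.shape[0]
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=N):
        x = np.array(signs)
        total += (x @ C @ x) * (x @ D @ x)
    return total / 2 ** N

def closed_form(C, D, K):
    """2 tr(CD) + tr(C) tr(D) + K * sum_a C_aa D_aa, for symmetric C and D."""
    return (2.0 * np.trace(C @ D) + np.trace(C) * np.trace(D)
            + K * np.sum(np.diag(C) * np.diag(D)))

# Assumed test matrices: random symmetric C and D of size N = 5.
rng = np.random.default_rng(0)
N = 5
C = rng.normal(size=(N, N))
C = (C + C.T) / 2
D = rng.normal(size=(N, N))
D = (D + D.T) / 2

lhs = exact_fourth_moment(C, D)      # exact expectation over 2^N sign vectors
rhs = closed_form(C, D, K=-2.0)      # Rademacher entries: E[x^4] = 1, so K = -2
```

Because the expectation is computed by full enumeration rather than simulation, `lhs` and `rhs` agree up to floating-point rounding, which isolates the kurtosis term from Monte Carlo noise.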
A.2 Proof of Theorem 2.1
For \(a=1, \ldots , k\), let \(\ell _a={\varvec{y}}^\top {\varvec{C}}_a{\varvec{y}}- \mathrm{tr}\,({\varvec{D}}_a)\) for \({\varvec{C}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}\) and \({\varvec{D}}_a={\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}}{\varvec{{\Sigma }}}\). For \({\varvec{u}}={\varvec{y}}-{\varvec{X}}{\varvec{\beta }}={\varvec{\varepsilon }}+{\varvec{Z}}{\varvec{v}}\), \(\ell _a\) is rewritten as \(\ell _a={\varvec{u}}^\top {\varvec{C}}_a{\varvec{u}}-\mathrm{tr}\,({\varvec{D}}_a)\). By the Taylor series expansion,
where \({\mathbf{mat}}_{ab}(x_{ab})\) is a \(k\times k\) matrix with the (a, b)-th element \(x_{ab}\). Then,
Since \(\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_a)=\mathrm{tr}\,({\varvec{D}}_a)\), we have \(\ell _a=\mathrm{tr}\,\{{\varvec{C}}_a({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\). In addition, \(\ell _{a(b)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})+\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\) and \(\ell _{a(bc)}= \mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})+\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}\). Let \({\varvec{A}}_1={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(b)}-{\varvec{D}}_{a(b)})\}\) and \({\varvec{A}}_0={\mathbf{mat}}_{ab}[\mathrm{tr}\,\{{\varvec{C}}_{a(b)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}]\). It is noted that \({\varvec{A}}_1=O(N)\), \({\varvec{A}}_0=O_p(N^{1/2})\), \(\mathrm{tr}\,({\varvec{{\Sigma }}}{\varvec{C}}_{a(bc)}-{\varvec{D}}_{a(bc)})=O(N)\) and \(\mathrm{tr}\,\{{\varvec{C}}_{a(bc)}({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}=O_p(N^{1/2})\). Then it can be seen that
so that
It is noted that \(({\varvec{C}}_a)_{ij}=({\varvec{Q}}^\top {\varvec{W}}_a{\varvec{Q}})_{ij}=({\varvec{W}}_a)_{ij}+O(N^{-1})\), \(({\varvec{C}}_{a(b)})_{ij}=({\varvec{W}}_{a(b)})_{ij}+O(N^{-1})\) and \(({\varvec{C}}_{a(bc)})_{ij}=({\varvec{W}}_{a(bc)})_{ij}+O(N^{-1})\). Then, \(\mathrm{tr}\,({\varvec{C}}_a{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}})+O(1)\), \(\mathrm{tr}\,({\varvec{C}}_{a(b)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+O(1)\) and \(\mathrm{tr}\,({\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+O(1)\). Since \({\varvec{D}}_a={\varvec{C}}_a{\varvec{{\Sigma }}}\), \({\varvec{D}}_{a(b)}={\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(b)}\) and \({\varvec{D}}_{a(bc)}={\varvec{C}}_{a(bc)}{\varvec{{\Sigma }}}+{\varvec{C}}_{a(b)}{\varvec{{\Sigma }}}_{(c)}+{\varvec{C}}_{a(c)}{\varvec{{\Sigma }}}_{(b)}+{\varvec{C}}_a{\varvec{{\Sigma }}}_{(bc)}\), it is seen that \(\mathrm{tr}\,({\varvec{D}}_{a(b)})=\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)}) +O(1)\) and \(\mathrm{tr}\,({\varvec{D}}_{a(bc)})=\mathrm{tr}\,({\varvec{W}}_{a(bc)}{\varvec{{\Sigma }}})+\mathrm{tr}\,({\varvec{W}}_{a(b)}{\varvec{{\Sigma }}}_{(c)})+\mathrm{tr}\,({\varvec{W}}_{a(c)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(bc)})+O(1)\). Thus,
Letting \({\varvec{A}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})\}\), we have \({\varvec{A}}_1=-{\varvec{A}}+O(1)\). Using Lemma A.1, we can approximate the covariance matrix of \(\widehat{{\varvec{\psi }}}\) as
for \({\varvec{B}}={\mathbf{mat}}_{ab}\{\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})\}\) and \({\widetilde{{\varvec{B}}}}={\mathbf{mat}}_{ab}\{K_eh_e({\varvec{W}}_a,{\varvec{W}}_b)+K_vh_v({\varvec{W}}_a,{\varvec{W}}_b)\}\).
The bias of \(\widehat{{\varvec{\psi }}}\) is
Concerning the second term on the right-hand side, the a-th element of \(E\{ ({\varvec{A}}_0{\varvec{A}}^{-1} {\mathbf{col}}_c[\mathrm{tr}\,\{{\varvec{C}}_c({\varvec{u}}{\varvec{u}}^\top -{\varvec{{\Sigma }}})\}])\}\) is
Then,
which provides the expression in (3) in Theorem 2.1.
A.3 Proof of Proposition 3.1
Case of \({\varvec{W}}_a={\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}\). We have \({\varvec{W}}_{a(b)}=-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}}^{-1}+{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(ab)}{\varvec{{\Sigma }}}^{-1}\), which yields that \(\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}_{(b)})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}\) and \(({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{W}}_a{\varvec{{\Sigma }}}{\varvec{W}}_b{\varvec{{\Sigma }}})=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(b)})=({\varvec{A}})_{ab}\). Thus, \({\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}={\varvec{A}}^{-1}\) and the covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2{\varvec{A}}^{-1}+O(N^{-3/2})\). Moreover, note that
which shows that \({\varvec{W}}_a^\mathrm{REML}\) satisfies (4).
Case of \({\varvec{W}}_a=({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}+{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}^{-1})/2\). From (2), it follows that \(({\varvec{A}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})\) and \(({\varvec{B}})_{ab}=\{\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})+\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1}{\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)})\}/2\). The asymptotic covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}\), and the bias is derived from (3).
Case of \({\varvec{W}}_a={\varvec{{\Sigma }}}_{(a)}\). Straightforward calculation shows that \(({\varvec{A}})_{ab} = \mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}_{(b)})\) and \(({\varvec{B}})_{ab}=\mathrm{tr}\,({\varvec{{\Sigma }}}_{(a)}{\varvec{{\Sigma }}}{\varvec{{\Sigma }}}_{(b)}{\varvec{{\Sigma }}})\). The asymptotic covariance matrix of \(\widehat{{\varvec{\psi }}}\) is \(2 {\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}+O(N^{-3/2})\). Moreover, since \({\varvec{W}}_{a(b)}={\varvec{0}}\), condition (4) holds.
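The matrices \({\varvec{A}}\) and \({\varvec{B}}\) in the first and third cases can be evaluated numerically. The following sketch assumes, purely for illustration, a small nested-error covariance \({\varvec{{\Sigma }}}=\psi _1{\varvec{G}}+\psi _2{\varvec{I}}_N\) with made-up group sizes and parameter values, and takes \(K_e=K_v=0\) so that only the leading terms \(2{\varvec{A}}^{-1}\) and \(2{\varvec{A}}^{-1}{\varvec{B}}{\varvec{A}}^{-1}\) appear. The helper names `block_ones` and `A_and_B` are hypothetical.

```python
import numpy as np

def block_ones(sizes):
    """G = block diag(J_{n_1}, ..., J_{n_m}), J_n being the n x n matrix of ones."""
    N = sum(sizes)
    G = np.zeros((N, N))
    start = 0
    for n in sizes:
        G[start:start + n, start:start + n] = 1.0
        start += n
    return G

def A_and_B(W, dSigma, Sigma):
    """(A)_ab = tr(W_a Sigma_(b)) and (B)_ab = tr(W_a Sigma W_b Sigma)."""
    k = len(W)
    A = np.array([[np.trace(W[a] @ dSigma[b]) for b in range(k)]
                  for a in range(k)])
    B = np.array([[np.trace(W[a] @ Sigma @ W[b] @ Sigma) for b in range(k)]
                  for a in range(k)])
    return A, B

# Assumed illustrative values (not from the paper).
sizes = [2, 3, 4]
psi1, psi2 = 1.5, 0.7
G = block_ones(sizes)
I_N = np.eye(G.shape[0])
Sigma = psi1 * G + psi2 * I_N
dSigma = [G, I_N]                      # Sigma_(1) = G, Sigma_(2) = I_N
Sinv = np.linalg.inv(Sigma)

# Case W_a = Sigma^{-1} Sigma_(a) Sigma^{-1}: B coincides with A.
W_re = [Sinv @ S @ Sinv for S in dSigma]
A_re, B_re = A_and_B(W_re, dSigma, Sigma)
cov_re = 2.0 * np.linalg.inv(A_re)

# Case W_a = Sigma_(a): sandwich form 2 A^{-1} B A^{-1}.
A_q, B_q = A_and_B(dSigma, dSigma, Sigma)
A_q_inv = np.linalg.inv(A_q)
cov_q = 2.0 * A_q_inv @ B_q @ A_q_inv
```

In this example `A_re` and `B_re` agree, so the sandwich collapses to `2 * inv(A_re)`, while for \({\varvec{W}}_a={\varvec{{\Sigma }}}_{(a)}\) the two matrices differ and the full sandwich is needed.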
Appendix B: Summary of estimation methods in specific models
Here, we provide specific forms of the REML-type, FH-type, and their OLS-based estimators, the PR-type estimator and the Prasad–Rao estimator in the Fay–Herriot model and the nested error regression model.
B.1 Fay–Herriot model
The marginal distribution of \({\varvec{y}}=(y_1, \ldots , y_m)^\top \) in the Fay–Herriot model has \(E[{\varvec{y}}]={\varvec{X}}{\varvec{\beta }}\) and \(\mathbf{Cov}\,({\varvec{y}})={\varvec{{\Sigma }}}=\psi _1{\varvec{I}}_m + {\varvec{D}}\), where \(p\) is the dimension of \({\varvec{\beta }}\) and \({\varvec{D}}=\mathrm{diag}\,(D_1, \ldots , D_m)\).
REML \(\widehat{\psi }_1^\mathrm{RE}\) corresponds to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})\) for \({\varvec{P}}={\varvec{{\Sigma }}}^{-1}-{\varvec{{\Sigma }}}^{-1}{\varvec{X}}({\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}\).
OLS-based REML \(\widehat{\psi }_1^\mathrm{ORM}\) corresponds to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}})\) for \({\widetilde{{\varvec{P}}}}={\varvec{I}}-{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top \).
Fay–Herriot estimator \(\widehat{\psi }_1^\mathrm{FH}\) corresponds to \({\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=m-p\).
OLS-based FH estimator \(\widehat{\psi }_1^\mathrm{OFH}\) corresponds to \({\varvec{W}}_1^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and the estimating equation is \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=m-2p+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}{\varvec{X}}({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{{\Sigma }}}^{-1}{\varvec{X}}\}\).
Prasad–Rao estimator \(\widehat{\psi }_1^\mathrm{PR}\) corresponds to \({\varvec{W}}_1^\mathrm{Q}={\varvec{I}}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\) and it is given by \(\widehat{\psi }_1^\mathrm{PR}=[{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}- \mathrm{tr}\,({\varvec{D}})+\mathrm{tr}\,\{({\varvec{X}}^\top {\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{D}}{\varvec{X}}\}]/(m-p)\).
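Two of the estimators listed above can be sketched in code. The block below is a minimal illustration, not the authors' implementation: it solves the Fay–Herriot estimating equation by bisection, which relies on the left-hand side being non-increasing in \(\psi _1\), and evaluates the closed-form Prasad–Rao estimator. Truncating the Fay–Herriot solution at zero when no positive root exists, the search bound `upper`, the simulated data, and all function names are assumptions made for this example. The diagonal of \({\varvec{D}}\) is passed as a vector.

```python
import numpy as np

def gls_residual(y, X, sigma_diag):
    """Residual y - X beta_G for a diagonal Sigma = diag(sigma_diag)."""
    w = 1.0 / sigma_diag
    XtW = X.T * w                                  # X' Sigma^{-1}
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return y - X @ beta

def fh_equation(psi1, y, X, D):
    """(y - X beta_G)' Sigma^{-1} (y - X beta_G) - (m - p), Sigma = psi1 I + diag(D)."""
    m, p = X.shape
    s = psi1 + D
    r = gls_residual(y, X, s)
    return np.sum(r ** 2 / s) - (m - p)

def fay_herriot(y, X, D, upper=1e6, tol=1e-10):
    """Bisection root of fh_equation on [0, upper]; returns 0 if there is no positive root."""
    if fh_equation(0.0, y, X, D) <= 0.0:
        return 0.0                                 # assumed convention: truncate at zero
    lo, hi = 0.0, upper
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        if fh_equation(mid, y, X, D) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def prasad_rao(y, X, D):
    """[y' P~ y - tr(D) + tr{(X'X)^{-1} X' D X}] / (m - p)."""
    m, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ (X.T @ y))          # P~ y, the OLS residual
    corr = np.trace(XtX_inv @ (X.T * D) @ X)       # tr{(X'X)^{-1} X' D X}
    return (resid @ resid - np.sum(D) + corr) / (m - p)

# Simulated data with assumed values: m = 30 areas, psi1 = 2, D_i in [0.5, 1.5].
rng = np.random.default_rng(1)
m = 30
X = np.column_stack([np.ones(m), rng.normal(size=m)])
D = rng.uniform(0.5, 1.5, size=m)
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=np.sqrt(2.0 + D))

psi_fh = fay_herriot(y, X, D)
psi_pr = prasad_rao(y, X, D)
```

Note that the Prasad–Rao value is not truncated here and can be negative in small samples; whether and how to truncate is a design choice left open in this sketch.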
The asymptotic variances and second-order biases can be provided from Proposition 3.2 as follows: REML \(\widehat{\psi }_1^{\mathrm{RE}}\) and OLS-based REML \(\widehat{\psi }_1^\mathrm{ORM}\) have the same asymptotic variance and the second-order bias
Fay–Herriot estimator \(\widehat{\psi }_1^\mathrm{FH}\) and OLS-based FH estimator \(\widehat{\psi }_1^\mathrm{OFH}\) have the same asymptotic variance and the second-order bias
which implies that \(\widehat{\psi }_1^\mathrm{UFH}=\widehat{\psi }_1^\mathrm{FH}-2[m\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-2})-\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^2]/\{\mathrm{tr}\,(\widehat{{\varvec{{\Sigma }}}}^{-1})\}^3\) is unbiased up to second order under normality, where \(\widehat{{\varvec{{\Sigma }}}}=\widehat{\psi }_1^\mathrm{FH}{\varvec{I}}_m+{\varvec{D}}\).
Prasad–Rao estimator \(\widehat{\psi }_1^\mathrm{PR}\) is second-order unbiased and has the asymptotic variance \(\mathrm{Var}(\widehat{\psi }_1^\mathrm{PR})\approx \{2\mathrm{tr}\,({\varvec{{\Sigma }}}^2)+K_e\mathrm{tr}\,({\varvec{D}}^2)+m\psi _1^2K_v\}/m^2\).
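The bias-corrected estimator \(\widehat{\psi }_1^\mathrm{UFH}\) and the Prasad–Rao variance approximation above are simple functions of the diagonal of \({\varvec{{\Sigma }}}\). The sketch below transcribes both formulas; the function names are hypothetical, and \({\varvec{D}}\) is again passed as a vector of its diagonal entries.

```python
import numpy as np

def fh_bias_correction(psi_fh, D):
    """2 [m tr(S^{-2}) - {tr(S^{-1})}^2] / {tr(S^{-1})}^3 with S = psi_fh I + diag(D)."""
    m = D.size
    s1 = np.sum(1.0 / (psi_fh + D))          # tr(Sigma^{-1})
    s2 = np.sum(1.0 / (psi_fh + D) ** 2)     # tr(Sigma^{-2})
    return 2.0 * (m * s2 - s1 ** 2) / s1 ** 3

def unbiased_fh(psi_fh, D):
    """psi_UFH = psi_FH minus the estimated second-order bias."""
    return psi_fh - fh_bias_correction(psi_fh, D)

def pr_asymptotic_variance(psi1, D, K_e=0.0, K_v=0.0):
    """{2 tr(Sigma^2) + K_e tr(D^2) + m psi1^2 K_v} / m^2."""
    m = D.size
    return (2.0 * np.sum((psi1 + D) ** 2)
            + K_e * np.sum(D ** 2) + m * psi1 ** 2 * K_v) / m ** 2

# Assumed illustrative inputs.
D_equal = np.full(10, 0.8)
D_unequal = np.array([0.3, 0.6, 1.2, 2.4])
corr_equal = fh_bias_correction(1.0, D_equal)
corr_unequal = fh_bias_correction(1.0, D_unequal)
var_pr = pr_asymptotic_variance(1.0, D_equal)
```

By the Cauchy–Schwarz inequality, \(m\,\mathrm{tr}\,({\varvec{{\Sigma }}}^{-2})\ge \{\mathrm{tr}\,({\varvec{{\Sigma }}}^{-1})\}^2\), with equality exactly when all \(D_i\) are equal, so the correction is non-negative and vanishes in the balanced case.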
B.2 Nested error regression model
The NER model is written as \({\varvec{y}}_i={\varvec{X}}_i{\varvec{\beta }}+{\varvec{j}}_{n_i}v_i + {\varvec{\varepsilon }}_i\) for \(i=1, \ldots , m\), where \({\varvec{y}}_i\), \({\varvec{\beta }}\) and \({\varvec{\varepsilon }}_i\) are \(n_i\), p and \(n_i\) dimensional vectors, \({\varvec{X}}_i\) is an \(n_i\times p\) matrix, \(v_i\) is scalar and \({\varvec{j}}_{n_i}=(1, \ldots , 1)^\top \in {\mathbb {R}}^{n_i}\). Here, \(v_i\) and \({\varvec{\varepsilon }}_i\) are independent random variables such that \(E[v_i]=0\), \(\mathrm{Var}(v_i)=\psi _1\), \(E[{\varvec{\varepsilon }}_i]=\mathbf{0}\) and \(\mathbf{Cov}\,({\varvec{\varepsilon }}_i)=\psi _2{\varvec{I}}_{n_i}\). Let \({\varvec{y}}=({\varvec{y}}_1^\top , \ldots , {\varvec{y}}_m^\top )^\top \), \({\varvec{X}}=({\varvec{X}}_1^\top , \ldots , {\varvec{X}}_m^\top )^\top \), \(N=\sum _{i=1}^m n_i\) and \({\varvec{G}}=\text {block diag}({\varvec{J}}_{n_1}, \ldots , {\varvec{J}}_{n_m})\) for \({\varvec{J}}_{n_i}={\varvec{j}}_{n_i}{\varvec{j}}_{n_i}^\top \). In addition, let \({\varvec{{\Sigma }}}=\text {block diag}({\varvec{{\Sigma }}}_1, \ldots , {\varvec{{\Sigma }}}_m)\) for \({\varvec{{\Sigma }}}_i=\psi _1{\varvec{J}}_{n_i}+\psi _2{\varvec{I}}_{n_i}\). Then, \({\varvec{{\Sigma }}}=\psi _1{\varvec{G}}+\psi _2{\varvec{I}}_N\), \({\varvec{{\Sigma }}}_{(1)}={\varvec{G}}\) and \({\varvec{{\Sigma }}}_{(2)}={\varvec{I}}_N\).
REML \(\widehat{\psi }_1^\mathrm{RE}\) and \(\widehat{\psi }_2^\mathrm{RE}\) correspond to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}\), \({\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\), and the estimating equations are \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}}{\varvec{G}})\) and \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{G})=\mathrm{tr}\,({\varvec{P}})\).
OLS-based REML \(\widehat{\psi }_1^\mathrm{ORM}\) and \(\widehat{\psi }_2^\mathrm{ORM}\) correspond to \({\varvec{W}}_1^\mathrm{RE}={\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}\), \({\varvec{W}}_2^\mathrm{RE}={\varvec{{\Sigma }}}^{-2}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimating equations are \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}{\varvec{{\Sigma }}}^{-1})\) and \(({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})^\top {\varvec{{\Sigma }}}^{-2}({\varvec{y}}-{\varvec{X}}\widehat{{\varvec{\beta }}}^\mathrm{O})=\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}{\widetilde{{\varvec{P}}}}{\varvec{{\Sigma }}}^{-2})\).
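The right-hand sides of the REML and OLS-based REML equations are the expectations of the corresponding quadratic forms, which can be confirmed numerically. The sketch below uses the fact that the GLS residual equals \({\varvec{{\Sigma }}}{\varvec{P}}{\varvec{y}}\) and the OLS residual equals \({\widetilde{{\varvec{P}}}}{\varvec{y}}\), so that \(E[({\varvec{R}}{\varvec{y}})^\top {\varvec{W}}({\varvec{R}}{\varvec{y}})]=\mathrm{tr}\,({\varvec{R}}^\top {\varvec{W}}{\varvec{R}}{\varvec{{\Sigma }}})\) whenever \({\varvec{R}}{\varvec{X}}={\varvec{0}}\). The group sizes, parameter values, design matrix and the name `expected_quadratic` are assumptions made for illustration.

```python
import numpy as np

def expected_quadratic(R, W, Sigma):
    """E[(R y)' W (R y)] = tr(R' W R Sigma) when R X = 0 and Cov(y) = Sigma."""
    return np.trace(R.T @ W @ R @ Sigma)

# Assumed illustrative NER set-up.
sizes = [2, 3, 4]
N = sum(sizes)
psi1, psi2 = 1.5, 0.7
G = np.zeros((N, N))
start = 0
for n in sizes:
    G[start:start + n, start:start + n] = 1.0      # block J_{n_i}
    start += n
Sigma = psi1 * G + psi2 * np.eye(N)
Sinv = np.linalg.inv(Sigma)
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(N), rng.normal(size=N)])

# P for GLS and P~ for OLS, as defined for the Fay-Herriot model above.
P = Sinv - Sinv @ X @ np.linalg.solve(X.T @ Sinv @ X, X.T @ Sinv)
P_ols = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)
R_gls = Sigma @ P                                   # y - X beta_G = Sigma P y

W1 = Sinv @ G @ Sinv                                # W_1^RE
W2 = Sinv @ Sinv                                    # W_2^RE

gls_lhs = [expected_quadratic(R_gls, W, Sigma) for W in (W1, W2)]
gls_rhs = [np.trace(P @ G), np.trace(P)]
ols_lhs = [expected_quadratic(P_ols, W, Sigma) for W in (W1, W2)]
ols_rhs = [np.trace(P_ols @ Sigma @ P_ols @ W1),
           np.trace(P_ols @ Sigma @ P_ols @ W2)]
```

The GLS case rests on the identity \({\varvec{P}}{\varvec{{\Sigma }}}{\varvec{P}}={\varvec{P}}\), which is why the traces simplify to \(\mathrm{tr}\,({\varvec{P}}{\varvec{G}})\) and \(\mathrm{tr}\,({\varvec{P}})\); no such simplification is available for the OLS projection.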
FH-type estimators \(\widehat{\psi }_1^\mathrm{FH}\) and \(\widehat{\psi }_2^\mathrm{FH}\) correspond to \({\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2\), \({\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{G}\), and the estimating equations are
OLS-based FH estimators \(\widehat{\psi }_1^\mathrm{OFH}\) and \(\widehat{\psi }_2^\mathrm{OFH}\) correspond to \({\varvec{W}}_1^\mathrm{FH}=({\varvec{{\Sigma }}}^{-1}{\varvec{G}}+{\varvec{G}}{\varvec{{\Sigma }}}^{-1})/2\), \({\varvec{W}}_2^\mathrm{FH}={\varvec{{\Sigma }}}^{-1}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimating equations are
PR-type estimators \(\widehat{\psi }_1^\mathrm{Q}\) and \(\widehat{\psi }_2^\mathrm{Q}\) correspond to \({\varvec{W}}_1^\mathrm{Q}={\varvec{G}}\), \({\varvec{W}}_2^\mathrm{Q}={\varvec{I}}\) and \(\widehat{{\varvec{\beta }}}=\widehat{{\varvec{\beta }}}^\mathrm{O}\), and the estimators are \(\widehat{\psi }_1^\mathrm{Q} = \{\sum _{i=1}^m n_i^2 ({\overline{y}}_i-{\overline{{\varvec{x}}}}_i^\top \widehat{{\varvec{\beta }}}^\mathrm{O})^2-\widehat{\psi }_2\mathrm{tr}\,({\widetilde{{\varvec{P}}}}{\varvec{G}})\}/\mathrm{tr}\,\{({\widetilde{{\varvec{P}}}}{\varvec{G}})^2\}\) and
Prasad–Rao estimators are \(\widehat{\psi }_1^\mathrm{PR}=\{{\varvec{y}}^\top {\widetilde{{\varvec{P}}}}{\varvec{y}}-(N-p)\widehat{\psi }_2\}/\{N-\sum _{i=1}^mn_i^2{\overline{{\varvec{x}}}}_i^\top ({\varvec{X}}^\top {\varvec{X}})^{-1}{\overline{{\varvec{x}}}}_i\}\) and \(\widehat{\psi }_2^\mathrm{PR}=\{ {\varvec{y}}^\top \{{\varvec{E}}-{\varvec{E}}{\varvec{X}}({\varvec{X}}^\top {\varvec{E}}{\varvec{X}})^{-1}{\varvec{X}}^\top {\varvec{E}}\}{\varvec{y}}\}/(N-k-p)\), where \({\varvec{E}}=\text {block diag}({\varvec{I}}_{n_1}-n_1^{-1}{\varvec{J}}_{n_1}, \ldots , {\varvec{I}}_{n_m}-n_m^{-1}{\varvec{J}}_{n_m})\).
Hereafter we assume that \(K_e=K_v=0\) for simplicity. Note that \({\varvec{{\Sigma }}}{\varvec{G}}={\varvec{G}}{\varvec{{\Sigma }}}\), \(\psi _1{\varvec{G}}={\varvec{{\Sigma }}}-\psi _2{\varvec{I}}_N\), \(\psi _2{\varvec{{\Sigma }}}^{-1}={\varvec{I}}_N-\psi _1\text {block diag}({\gamma }_1{\varvec{J}}_{n_1}, \ldots , {\gamma }_m{\varvec{J}}_{n_m})\), \(\psi _2^2{\varvec{{\Sigma }}}^{-2}={\varvec{I}}_N-\psi _1\text {block diag}((1+\psi _2{\gamma }_1){\gamma }_1{\varvec{J}}_{n_1}, \ldots , (1+\psi _2{\gamma }_m){\gamma }_m{\varvec{J}}_{n_m})\) for \({\gamma }_i=1/(\psi _2+n_i\psi _1)\). Then the asymptotic variances and second-order biases can be provided from Proposition 3.3 as follows: REML \(\widehat{{\varvec{\psi }}}^\mathrm{RE}\) and OLS-based REML \(\widehat{{\varvec{\psi }}}^\mathrm{ORM}\) are second-order unbiased and have the same asymptotic variance
which was given in Datta and Lahiri (2000).
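The closed forms for \({\varvec{{\Sigma }}}^{-1}\) and \({\varvec{{\Sigma }}}^{-2}\) noted above follow from \({\varvec{J}}_{n_i}^2=n_i{\varvec{J}}_{n_i}\) and \(n_i\psi _1{\gamma }_i=1-\psi _2{\gamma }_i\). As a quick numerical confirmation, the sketch below compares them with a direct matrix inverse for assumed group sizes and parameter values; the helper name `block_diag_J` is hypothetical.

```python
import numpy as np

def block_diag_J(sizes, coefs):
    """block diag(c_1 J_{n_1}, ..., c_m J_{n_m}) for coefficients c_i."""
    N = sum(sizes)
    out = np.zeros((N, N))
    start = 0
    for n, c in zip(sizes, coefs):
        out[start:start + n, start:start + n] = c
        start += n
    return out

# Assumed illustrative values.
sizes = [2, 3, 4]
psi1, psi2 = 1.5, 0.7
N = sum(sizes)
I_N = np.eye(N)
Sigma = psi1 * block_diag_J(sizes, [1.0] * len(sizes)) + psi2 * I_N
gamma = [1.0 / (psi2 + n * psi1) for n in sizes]

# psi2 Sigma^{-1} = I - psi1 block diag(gamma_i J_{n_i})
Sigma_inv_closed = (I_N - psi1 * block_diag_J(sizes, gamma)) / psi2

# psi2^2 Sigma^{-2} = I - psi1 block diag((1 + psi2 gamma_i) gamma_i J_{n_i})
coefs2 = [(1.0 + psi2 * g) * g for g in gamma]
Sigma_inv2_closed = (I_N - psi1 * block_diag_J(sizes, coefs2)) / psi2 ** 2

Sigma_inv = np.linalg.inv(Sigma)
```

These forms avoid inverting an \(N\times N\) matrix, which matters when the traces in the variance and bias formulas are evaluated repeatedly over many values of \((\psi _1, \psi _2)\).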
Fay–Herriot estimator \(\widehat{{\varvec{\psi }}}^\mathrm{FH}\) and OLS-based FH estimator \(\widehat{{\varvec{\psi }}}^\mathrm{OFH}\) have the same asymptotic covariance matrix \(\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{FH})\approx 2{\varvec{A}}_\mathrm{FH}^{-1}{\varvec{B}}_\mathrm{FH}{\varvec{A}}_\mathrm{FH}^{-1}\), where
and the same second-order bias
where
PR-type estimator \(\widehat{{\varvec{\psi }}}^\mathrm{Q}\) is second-order unbiased and has the asymptotic covariance matrix \(\mathbf{Cov}\,(\widehat{{\varvec{\psi }}}^\mathrm{Q})\approx 2{\varvec{A}}_\mathrm{Q}^{-1}{\varvec{B}}_\mathrm{Q}{\varvec{A}}_\mathrm{Q}^{-1}\), where
Cite this article
Kubokawa, T., Sugasawa, S., Tamae, H. et al. General unbiased estimating equations for variance components in linear mixed models. Jpn J Stat Data Sci 4, 841–859 (2021). https://doi.org/10.1007/s42081-021-00138-8
DOI: https://doi.org/10.1007/s42081-021-00138-8