Abstract
For testing a linear hypothesis about fixed effects in a normal mixed linear model, a popular approach is to use a Wald test, in which the test statistic is assumed to have a null distribution that is approximately chi-squared. This approximation is questionable, however, for small samples. In 1997 Kenward and Roger constructed a test that addresses this problem. They altered the Wald test in three ways: (a) adjusting the test statistic, (b) approximating the null distribution by a scaled F distribution, and (c) modifying the formulas to achieve an exact F test in two special cases. Alterations (a) and (b) lead to formulas that are somewhat complicated but can be explained by using Taylor series approximations and a few convenient assumptions. The modified formulas used in alteration (c), however, are more mysterious. Restricting attention to models with linear variance–covariance structure, we provide details of a derivation that justifies these formulas. We show that similar but different derivations lead to different formulas that also produce exact F tests in the two special cases and are equally justifiable. A simulation study was done for testing the equality of treatment effects in block-design models. Tests based on the different derivations performed very similarly. Moreover, the simulations confirm that alteration (c) is worthwhile. The Kenward–Roger test showed greater accuracy in its p values than did the unmodified version of the test.
References
Arnau J, Bono R, Vallejo G (2009) Analyzing small samples of repeated measures data with the mixed-model adjusted F test. Commun Stat Simul 38:1083–1103
Chen X (2006) The adjustment of random baseline measurements in treatment effect estimation. J Stat Plan Infer 136:4161–4175
Cochran W, Cox G (1957) Experimental designs. Wiley, New York
Green P (1974) On the design of choice experiments involving multifactor alternatives. J Consum Res 1:61–68
Guiard V, Spilke J, Danicke S (2003) Evaluation and interpretation of results for three cross-over designs. Arch Anim Nutr 57:177–195
Halekoh U, Højsgaard S (2014) A Kenward–Roger approximation and parametric bootstrap methods for tests in linear mixed models—the R package pbkrtest. J Stat Softw 59(9):1–32
Harville DA, Jeske DR (1992) Mean squared error of estimation or prediction under a general linear model. J Am Stat Assoc 87:724–731
Jiang J (2007) Linear and generalized linear mixed models and their applications. Springer, New York
John PWM (1971) Statistical design and analysis of experiments. Macmillan, New York
Kackar RN, Harville DA (1984) Approximations for standard errors of estimators of fixed and random effects in mixed linear models. J Am Stat Assoc 79:853–862
Kenward MG, Roger JH (1997) Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53:983–997
Khuri AI, Mathew T, Sinha BK (1998) Statistical tests for mixed linear models. Wiley, New York
Kowalchuk RK, Keselman HJ, Algina J, Wolfinger RD (2004) The analysis of repeated measurements with mixed-model adjusted F tests. Educ Psychol Meas 64:224–242
Kuehl RO (2000) Design of experiments: statistical principles of research design and analysis, 2nd edn. Duxbury Press, Pacific Grove
Livacic-Rojas P, Vallejo G, Fernandez P (2010) Analysis of type I error rates of univariate and multivariate procedures in repeated measures designs. Commun Stat Simul 39:624–640
Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, London
SAS Institute Inc (2015) SAS/STAT®14.1 User’s guide. SAS Institute Inc., Cary
Schaalje GB, McBride JB, Fellingham GW (2002) Adequacy of approximations to distributions of test statistics in complex mixed linear models. J Agric Biol Environ Stat 7:512–524
Searle SR, Casella G, McCulloch CE (1992) Variance components. Wiley, New York
Spilke J, Piepho H, Hu X (2005) A simulation study on tests of hypotheses and confidence intervals for fixed effects in mixed models for blocked experiments with missing data. J Agric Biol Environ Stat 10:374–389
Stroup WW (1999) On using proc mixed for longitudinal data. In: Annual conference on applied statistics in agriculture, Paper 5. http://newprairiepress.org/agstatconference/1999/proceedings/5. Accessed 4 July 2016
VanLeeuwen DM, Birkes DS, Seely JF (1999) Balance and orthogonality in designs for mixed classification models. Ann Stat 27:1927–1947
Wimmer G, Witkovsky V (2007) Univariate linear calibration via replicated errors-in-variables model. J Stat Comput Simul 77:213–227
Witkovsky V (2012) Estimation, testing, and prediction regions of the fixed and random effects by solving the Henderson’s mixed model equations. Meas Sci Rev 12:234–248
Wulff SS, Robinson TJ (2009) Assessing the uncertainty of regression estimates in a response surface model for repeated measures. Qual Technol Quant Manag 6:309–324
Zyskind G (1967) On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Ann Math Stat 38:1092–1109
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Details
1.1 The ANOVA testing problem in Sect. 5 is a special case of Sect. 2
Testing the equality of group means in the balanced one-way ANOVA fixed-effects model in Sect. 5 is a special case of the testing problem in Sect. 2 with \(\mathbf {y}= [ \begin{array}{*{7}{c}} y_{11}&y_{12}&\cdots&y_{1v}&y_{21}&\cdots&y_{tv} \end{array} ]' \), \({\varvec{\beta }}= [ \begin{array}{*{4}{c}} \mu _1&\mu _2&\cdots&\mu _t \end{array} ]' \), \(\mathbf {X}= \mathbf {I}_t \otimes \mathbf {1}_v\), \({\varvec{\Sigma }}= \sigma ^2\mathbf {I}_n\), \({\varvec{\theta }}= \sigma ^2\), \(\mathbf {G}_1 = \mathbf {I}_n\), \(\mathbf {L}' = [ \begin{array}{*{2}{c}} \mathbf {I}_{t - 1}&-\mathbf {1}_{t - 1} \end{array} ]. \)
1.2 Calculations for the ANOVA testing problem
One can calculate \({\hat{{\varvec{\beta }}}} = [ \begin{array}{*{4}{c}} \bar{y}_{1\cdot }&\bar{y}_{2\cdot }&\cdots&\bar{y}_{t\cdot } \end{array} ]' \), \({\hat{{\varvec{\Phi }}}}=({\hat{\sigma }}^2/v)\mathbf {I}_t\), \({\hat{w}}_{11}=2{\hat{\sigma }}^4/(n-t)\), \({\hat{\mathbf {P}}}_1=-(v/{\hat{\sigma }}^4)\mathbf {I}_t\), \({\hat{{\varvec{\Psi }}}} =({\hat{\sigma }}^2/v)(\mathbf {I}_t - t^{-1}\mathbf {1}_t\mathbf {1}'_t)\), \(A_1 = 2(t - 1)^2/(n - t)\), \(A_2 = 2(t - 1)/(n - t)\), \(B = (t + 5)/(n - t)\). Moreover, \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\) by the following lemma.
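These closed forms can be checked numerically on a small balanced design. The sketch below is not from the paper: it takes \({\hat{\sigma }}^2\) to be the within-group mean square (the REML estimate here), and assumes the standard readings \({\varvec{\Gamma }}= {\varvec{\Sigma }}^{-1} - {\varvec{\Sigma }}^{-1}\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1}\) and \({\varvec{\Psi }}= {\varvec{\Phi }}\mathbf {L}(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\), consistent with the values stated above.

```python
import numpy as np

t, v = 3, 4                                  # t groups, v replicates each
n = t * v
rng = np.random.default_rng(0)
y = rng.normal(size=n)

X = np.kron(np.eye(t), np.ones((v, 1)))      # X = I_t ⊗ 1_v
J = X @ np.linalg.solve(X.T @ X, X.T)        # projection onto col(X)
s2 = y @ (np.eye(n) - J) @ y / (n - t)       # REML σ̂²: within mean square

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(beta_hat, y.reshape(t, v).mean(axis=1))   # group means

Phi_hat = s2 * np.linalg.inv(X.T @ X)
assert np.allclose(Phi_hat, (s2 / v) * np.eye(t))            # (σ̂²/v) I_t

Gamma = (np.eye(n) - J) / s2                 # Γ = Σ̂⁻¹ − Σ̂⁻¹XΦ̂X'Σ̂⁻¹
w11 = 1.0 / (0.5 * np.trace(Gamma @ Gamma))  # inverse expected information
assert np.isclose(w11, 2 * s2**2 / (n - t))  # ŵ₁₁ = 2σ̂⁴/(n − t)

L = np.vstack([np.eye(t - 1), -np.ones((1, t - 1))])         # L' as above
Psi_hat = Phi_hat @ L @ np.linalg.solve(L.T @ Phi_hat @ L, L.T @ Phi_hat)
assert np.allclose(Psi_hat, (s2 / v) * (np.eye(t) - np.ones((t, t)) / t))
```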
Lemma 6
If the model satisfies Zyskind’s condition, then \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\).
Note that every fixed-effects linear model satisfies Zyskind’s condition, because the column space of \({\varvec{\Sigma }}\mathbf {X}= \sigma ^2\mathbf {X}\) is contained in the column space of \(\mathbf {X}\).
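This triviality also makes the GLSE coincide with the LSE, which is easy to confirm numerically — a minimal sketch with an arbitrary full-rank design (our example, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))            # any full-rank fixed-effects design
y = rng.normal(size=12)
Sigma = 1.7 * np.eye(12)                # Σ = σ²I, so col(ΣX) = col(X)

Si = np.linalg.inv(Sigma)
glse = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)
lse = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(glse, lse)           # Zyskind's condition holds trivially
```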
1.3 Proof of Lemma 6
We can show \({\hat{{\varvec{\Lambda }}}} = \mathbf {0}\) by showing \(\mathbf {Q}_{ij} - \mathbf {P}_i{\varvec{\Phi }}\mathbf {P}_j = \mathbf {0}\). Zyskind’s condition is equivalent to the condition that \({\varvec{\Sigma }}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}\) for all allowable \({\varvec{\Sigma }}\) where \(\mathbf {J}\) is the orthogonal projection operator on the column space of \(\mathbf {X}\) (Zyskind 1967, Theorem 2). The two technical assumptions mentioned in Sect. 2 imply that \(\mathbf {G}_i\mathbf {J}= \mathbf {J}\mathbf {G}_i\) for all i. Moreover, \({\varvec{\Sigma }}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}\) implies \({\varvec{\Sigma }}^{-1}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}^{-1}\). Next, recall that the GLSE is \((\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})^{-1}\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {y}\) and the LSE is \((\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'\mathbf {y}\). It follows from the remark in Sect. 3 that Zyskind’s condition is also equivalent to the condition that \((\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})^{-1}\mathbf {X}'{\varvec{\Sigma }}^{-1} = (\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'\) for all allowable \({\varvec{\Sigma }}\), which implies \(\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1} = \mathbf {X}(\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'= \mathbf {J}\). Now
\(\mathbf {P}_i{\varvec{\Phi }}\mathbf {P}_j = \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i\,{\varvec{\Sigma }}^{-1}\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1}\,\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {J}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}= \mathbf {Q}_{ij},\) using \(\mathbf {J}\mathbf {G}_j = \mathbf {G}_j\mathbf {J}\), \(\mathbf {J}{\varvec{\Sigma }}^{-1} = {\varvec{\Sigma }}^{-1}\mathbf {J}\), and \(\mathbf {J}\mathbf {X}= \mathbf {X}\).
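The commutation relations used in this proof can be checked numerically on a concrete model satisfying Zyskind's condition. The balanced random-intercept structure below is our illustrative choice, not an example from the paper:

```python
import numpy as np

t, v = 3, 4
n = t * v
X = np.kron(np.eye(t), np.ones((v, 1)))                 # X = I_t ⊗ 1_v
J = X @ np.linalg.solve(X.T @ X, X.T)                   # proj. onto col(X)

# Σ = σ_e² I_n + σ_b² (I_t ⊗ 1_v 1_v'): col(ΣX) ⊆ col(X) here
sigma_e2, sigma_b2 = 1.0, 0.5
Sigma = sigma_e2 * np.eye(n) + sigma_b2 * np.kron(np.eye(t), np.ones((v, v)))

assert np.allclose(Sigma @ J, J @ Sigma)                # ΣJ = JΣ
Si = np.linalg.inv(Sigma)
assert np.allclose(Si @ J, J @ Si)                      # hence Σ⁻¹J = JΣ⁻¹
Phi = np.linalg.inv(X.T @ Si @ X)
assert np.allclose(X @ Phi @ X.T @ Si, J)               # XΦX'Σ⁻¹ = J
```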
1.4 \(m^\#\) and \(\lambda ^\#\) in the ANOVA case
Quantities from A.2 above can be substituted into (4.1) to obtain \(E^\#\) and \(V^\#\) shown in (5.2), which in turn can be used in (4.3) to obtain (5.3). The inequality in (5.3b) holds because \((m^\# - 2)(n - t + 2) = m^\#(n - t) + 2[m^\# - (n - t + 2)] > m^\#(n - t)\), where the final inequality follows since formula (5.3a) implies \(m^\# > n - t + 4 > n - t + 2\).
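This algebraic step is easy to verify by brute force (a quick check of ours, not part of the paper), sweeping \(n - t\) and any \(m^\#\) exceeding \(n - t + 4\):

```python
# If m > n - t + 4 then (m - 2)(n - t + 2) > m(n - t), via the identity
# (m - 2)(n - t + 2) = m(n - t) + 2[m - (n - t + 2)].
ok = True
for nt in range(1, 50):                      # nt plays the role of n - t
    for extra in (0.001, 0.5, 1.0, 10.0):    # any amount by which m > nt + 4
        m = nt + 4 + extra
        lhs = (m - 2) * (nt + 2)
        identity = m * nt + 2 * (m - (nt + 2))
        ok = ok and abs(lhs - identity) < 1e-9 and lhs > m * nt
assert ok
```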
1.5 The Hotelling testing problem in Sect. 6 is a special case of Sect. 2
The one-sample Hotelling T-squared test in Sect. 6 is a special case of the testing problem in Sect. 2 with \(\mathbf {y}= [ \begin{array}{*{4}{c}} \mathbf {y}'_1&\mathbf {y}'_2&\cdots&\mathbf {y}'_v \end{array} ]' \), \({\varvec{\beta }}= {\varvec{\mu }}\), \(\mathbf {X}= \mathbf {1}_v \otimes \mathbf {I}_p\), \({\varvec{\Sigma }}= \mathbf {I}_v \otimes {\varvec{\Sigma }}_p\), \({\varvec{\Sigma }}_p =\) an arbitrary positive definite \(p \times p\) matrix, \({\varvec{\theta }}=\) an \(r \times 1\) vector whose entries are the entries on and above the main diagonal of \({\varvec{\Sigma }}_p = [\sigma _{ij}]_{p \times p}\). To express the linear structure of the variance–covariance matrix it is convenient to use double subscripting: \({\varvec{\Sigma }}= \sum \{\sigma _{ij}(\mathbf {I}_v \otimes \mathbf {G}_{pij}):1 \le i \le j \le p\}\) where \(\mathbf {G}_{pii} = \mathbf {u}_i\mathbf {u}'_i\), \(\mathbf {G}_{pij} = \mathbf {u}_i\mathbf {u}'_j + \mathbf {u}_j\mathbf {u}'_i\) for \(i \ne j\), \(\mathbf {u}_i =\) a \(p \times 1\) vector with a 1 in the ith position and 0’s elsewhere. Also, \(\mathbf {L}= \mathbf {I}_p\).
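The claimed linear structure is straightforward to confirm numerically — a sketch with an arbitrary positive definite \({\varvec{\Sigma }}_p\) of our choosing:

```python
import numpy as np

p, v = 3, 5
rng = np.random.default_rng(2)
A = rng.normal(size=(p, p))
Sigma_p = A @ A.T + p * np.eye(p)            # arbitrary positive definite Σ_p

u = np.eye(p)                                # u_i = ith standard basis vector
total = np.zeros((v * p, v * p))
for i in range(p):
    for j in range(i, p):
        G = np.outer(u[i], u[i]) if i == j else \
            np.outer(u[i], u[j]) + np.outer(u[j], u[i])
        total += Sigma_p[i, j] * np.kron(np.eye(v), G)   # Σ σ_ij (I_v ⊗ G_pij)

assert np.allclose(total, np.kron(np.eye(v), Sigma_p))   # equals I_v ⊗ Σ_p
```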
1.6 Calculations for the Hotelling testing problem
One can calculate \({\hat{{\varvec{\beta }}}}=\bar{\mathbf {y}}_{\cdot }\), \({\hat{{\varvec{\Phi }}}}=(1/v)\mathbf {S}\), \({\hat{w}}_{ij,fg}=({\hat{\sigma }}_{if}{\hat{\sigma }}_{jg} + {\hat{\sigma }}_{ig}{\hat{\sigma }}_{jf})/(v-1)\) where \({\hat{\sigma }}_{ij}\) denotes an entry of the matrix \(\mathbf {S}= {\hat{{\varvec{\Sigma }}}}_{p\text {REML}}\), \({\hat{\mathbf {P}}}_{ii} = - v\mathbf {S}^{-1}\mathbf {u}_i\mathbf {u}'_i\mathbf {S}^{-1}\), \({\hat{\mathbf {P}}}_{ij} = - v\mathbf {S}^{-1}(\mathbf {u}_i\mathbf {u}'_j + \mathbf {u}_j\mathbf {u}'_i)\mathbf {S}^{-1}\) for \(i < j\), \({\hat{{\varvec{\Psi }}}} = (1/v)\mathbf {S}\), \(A_1 =2p/(v - 1)\), \(A_2 = p(p + 1)/(v - 1)\), \(B = (3p + 4)/(v - 1)\). In calculating \(A_1\) and \(A_2\) we use the identities \({{\mathrm{tr}}}(\mathbf {u}_i\mathbf {u}'_j\mathbf {S}^{-1}) = \mathbf {u}'_j\mathbf {S}^{-1}\mathbf {u}_i = {\hat{\sigma }}^{ji} = \) the (j, i) entry of the matrix \(\mathbf {S}^{-1}\) and \(\sum _{j = 1}^p {\hat{\sigma }}_{ij}{\hat{\sigma }}^{jg} =\delta _{ig} =\) the Kronecker delta \(=\) the (i, g) entry of the matrix \(\mathbf {I}\) (because \(\mathbf {S}\mathbf {S}^{-1} = \mathbf {I}\)). Lemma 6 implies \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\) because the model satisfies Zyskind’s condition: the column space of \({\varvec{\Sigma }}\mathbf {X}= (\mathbf {I}_v \otimes {\varvec{\Sigma }}_p)(\mathbf {1}_v \otimes \mathbf {I}_p) = \mathbf {1}_v \otimes {\varvec{\Sigma }}_p = (\mathbf {1}_v \otimes \mathbf {I}_p){\varvec{\Sigma }}_p = \mathbf {X}{\varvec{\Sigma }}_p\) is contained in the column space of \(\mathbf {X}\).
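The two identities, and \({\hat{{\varvec{\Phi }}}} = (1/v)\mathbf {S}\), admit a short numerical check; below, \(\mathbf {S}\) is simply a sample covariance matrix standing in for \({\hat{{\varvec{\Sigma }}}}_{p\text {REML}}\):

```python
import numpy as np

p, v = 3, 10
rng = np.random.default_rng(3)
Y = rng.normal(size=(v, p))                  # v i.i.d. p-variate observations
S = np.cov(Y, rowvar=False)                  # sample covariance, divisor v − 1
Si = np.linalg.inv(S)

u = np.eye(p)                                # u_i = ith standard basis vector
for i in range(p):
    for j in range(p):
        # tr(u_i u_j' S⁻¹) = u_j' S⁻¹ u_i = the (j, i) entry of S⁻¹
        assert np.isclose(np.trace(np.outer(u[i], u[j]) @ Si), Si[j, i])
assert np.allclose(S @ Si, np.eye(p))        # Σ_j σ̂_ij σ̂^{jg} = δ_ig

X = np.kron(np.ones((v, 1)), np.eye(p))      # X = 1_v ⊗ I_p
Phi_hat = np.linalg.inv(X.T @ np.kron(np.eye(v), Si) @ X)
assert np.allclose(Phi_hat, S / v)           # Φ̂ = (1/v) S
```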
1.7 \(m^\#\) and \(\lambda ^\#\) in the Hotelling case
Quantities from A.6 above can be substituted into (4.1) to obtain \(E^\#\) and \(V^\#\) shown in (6.2), which in turn can be used in (4.3) to obtain (6.3).
1.8 Proof of Lemma 1(a)
Suppose \(A_1/A_2=\ell \) as in special case 1, which implies \(B=[(\ell +6)A_2]/(2\ell )\). Formulas (7.2) and (7.3) were deliberately derived so that when \(A_1/A_2=\ell \), then \(\mathbf {d}= (\ell -2,2,4)/(\ell +6)\), hence \(\mathbf {d}B= (\ell -2,2,4)A_2/(2\ell )\). Thus (7.4) becomes
Via the formulas in (4.3) with superscripts # replaced by *, we obtain \(\ell \rho ^* = \{1 + [(\ell - 2)/(2\ell )]A_2\}/ \{1-(2/\ell )A_2\}\), \(m^* = 2\ell /A_2\) and \(\lambda ^* = 1\).
Formulas (8.2) and (8.3) give us \(e_0 = 0\), \(e_1 = \ell + 6\), \(m^\dag = (\ell + 6)/(QA_2) = 2\ell /A_2\). Formula (9.3) yields \(m^\ddag = 2\ell (\ell + 2)/[(\ell + 2)A_2] = 2\ell /A_2\). Thus \(m^* = m^\dag = m^\ddag \). Because \(E^* = E^\dag = E^\ddag \), this implies \(\lambda ^* = \lambda ^\dag = \lambda ^\ddag \).
1.9 Proof of Lemma 1(b)
Suppose \(A_1/A_2 = 2/(\ell + 1)\) as in special case 2, which implies \(B = [(3\ell + 4)A_2]/[\ell (\ell + 1)]\). Formulas (7.2) and (7.3) were deliberately derived so that when \(A_1/A_2 = 2/(\ell + 1)\), then \(\mathbf {d}= (-1,\ell +1,\ell +3)/(3\ell +4)\), hence \(\mathbf {d}B= (-1,\ell +1,\ell +3)A_2/[\ell (\ell + 1)]\). Thus (7.4) becomes
Via the formulas in (4.3) with superscripts # replaced by *, we obtain \(\ell \rho ^* = \{1-[1/[\ell (\ell + 1)]]A_2\}/\{1-[(\ell + 3)/[\ell (\ell + 1)]]A_2\}\), and \(m^*\) and \(\lambda ^*\) are as displayed in the lemma.
Formulas (8.2) and (8.3) give us \(e_0 = 1 - \ell \), \(e_1 = 3\ell + 4\), \(m^\dag = (1 - \ell ) + (3\ell + 4)/(QA_2) = m^*\). We have \(A_1 = [2/(\ell + 1)]A_2\) and so formula (9.3) yields \(m^\ddag = \{2\ell (\ell + 2) + 2[2/(\ell + 1) - \ell ]A_2\} / \{[2/(\ell +1) +2]A_2\} = m^*\). So we see \(m^* = m^\dag = m^\ddag \), which then implies \(\lambda ^* = \lambda ^\dag = \lambda ^\ddag \).
1.10 Proof of Lemma 2
It suffices to show \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)\). For any \(p \times p\) matrix \(\mathbf {M}\), \(\mathbf {L}'\mathbf {M}\mathbf {L}\) is a \(1 \times 1\) matrix and hence is a scalar. Therefore
because
1.11 Proof of Lemma 3
It suffices to show \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = \ell {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)\). The assumptions of the lemma imply
We are assuming \(\mathbf {L}' = [ \begin{array}{*{2}{c}} \mathbf {0}&\mathbf {L}'_2 \end{array} ]\), so
Note that \(\mathbf {P}_i\) in Sect. 3 can be expressed as \(\mathbf {P}_i = (\partial /\partial \theta _i)(\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})\). Under the assumptions of the lemma,
where \(g_i({\varvec{\theta }}) = f({\varvec{\theta }})^{-1}(\partial /\partial \theta _i)f({\varvec{\theta }})\) and \(\mathbf {M}= \mathbf {C}^{-1}\mathbf {L}_2(\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2)^{-1}\mathbf {L}'_2\). Now \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i) = g_i({\varvec{\theta }}){{\mathrm{tr}}}(\mathbf {M})\) and \({{\mathrm{tr}}}(\mathbf {M}) = {{\mathrm{tr}}}[(\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2)^{-1}\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2] = {{\mathrm{tr}}}(\mathbf {I}_\ell ) = \ell \). Therefore \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = \ell ^2g_i({\varvec{\theta }})g_j({\varvec{\theta }})\). Also,
and \(\mathbf {M}^2 = \mathbf {M}\), so \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j) = \ell g_i({\varvec{\theta }})g_j({\varvec{\theta }})\).
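Both facts about \(\mathbf {M}\) used here, \({{\mathrm{tr}}}(\mathbf {M}) = \ell \) and \(\mathbf {M}^2 = \mathbf {M}\), hold for any positive definite \(\mathbf {C}\) and full-column-rank \(\mathbf {L}_2\); a quick numerical confirmation with arbitrary instances:

```python
import numpy as np

rng = np.random.default_rng(4)
q, ell = 6, 3                                # size of C; ℓ = columns of L₂
A = rng.normal(size=(q, q))
C = A @ A.T + q * np.eye(q)                  # arbitrary positive definite C
L2 = rng.normal(size=(q, ell))               # full column rank (a.s.)

Ci = np.linalg.inv(C)
M = Ci @ L2 @ np.linalg.solve(L2.T @ Ci @ L2, L2.T)   # C⁻¹L₂(L₂'C⁻¹L₂)⁻¹L₂'

assert np.isclose(np.trace(M), ell)          # tr(M) = ℓ
assert np.allclose(M @ M, M)                 # M is idempotent
```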
1.12 Proof of Theorem 2
The theorem follows from Lemmas 1(a) and 3, if we show that a BIBD model satisfies the conditions of Lemma 3. First note that \(\mathbf {B}\mathbf {B}' = k\mathbf {B}(\mathbf {B}'\mathbf {B})^{-1}\mathbf {B}' = k\mathbf {P}_\mathbf {B}\) where \(\mathbf {P}_\mathbf {B}\) denotes the orthogonal projection matrix on the column space of \(\mathbf {B}\). Now we can write
Write \(\mathbf {T}^* = \mathbf {T}\mathbf {U}\) where \(\mathbf {U}\) is the \(t \times (t - 1)\) matrix obtained by subtracting the last column of \(\mathbf {I}_t\) from each of the first \(t - 1\) columns of \(\mathbf {I}_t\). Then
because (1) \(\mathbf {1}'_n\mathbf {P}_\mathbf {B}= \mathbf {1}'_n\) and (2) \(\mathbf {1}'_n\mathbf {T}\mathbf {U}= \mathbf {0}\). (1) is true because \(\mathbf {1}_n\) is in the column space of \(\mathbf {B}\), and (2) is true because \(\mathbf {1}'_n\mathbf {T}= r\mathbf {1}'_t\) and \(\mathbf {1}'_t\mathbf {U}= \mathbf {0}\). Next write
It can be shown (Khuri et al. 1998, p. 176) that \(\mathbf {N}\mathbf {N}' = (r - g)\mathbf {I}_t + g\mathbf {1}_t\mathbf {1}'_t\). Then
and hence \(\mathbf {X}'_2{\varvec{\Sigma }}^{-1}\mathbf {X}_2 = f({\varvec{\theta }})\mathbf {C}\) where \(f({\varvec{\theta }}) = \sigma _e^{-2}\{r - (r - g)\sigma _b^2 / (k\sigma _b^2 + \sigma _e^2)\}\) and \(\mathbf {C}= \mathbf {U}'\mathbf {U}\).
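The incidence-matrix identity \(\mathbf {N}\mathbf {N}' = (r - g)\mathbf {I}_t + g\mathbf {1}_t\mathbf {1}'_t\) can be checked on a concrete design; the unreduced BIBD with \(t = 4\), \(k = 2\) (all pairs as blocks, so \(r = 3\), \(g = 1\)) is our example, not the paper's:

```python
import numpy as np
from itertools import combinations

t, k = 4, 2
blocks = list(combinations(range(t), k))     # all pairs: a BIBD with b = 6
b = len(blocks)
N = np.zeros((t, b))                          # treatment-by-block incidence
for j, blk in enumerate(blocks):
    for trt in blk:
        N[trt, j] = 1

r = int(N.sum(axis=1)[0])                     # replications per treatment
g = int((N @ N.T)[0, 1])                      # concurrence of any two treatments
assert (r, g) == (3, 1)
assert np.allclose(N @ N.T, (r - g) * np.eye(t) + g * np.ones((t, t)))
```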
1.13 Proof of Lemma 5
We will assume that the REML estimator \({\hat{{\varvec{\theta }}}}(\mathbf {y})\) can be characterized as the unique solution of the REML equations (see Jiang 2007, p. 13):
\({{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = \mathbf {y}'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {y},\quad i = 1, \ldots , r, \qquad (*)\)
where \({\varvec{\Gamma }}= {\varvec{\Gamma }}({\varvec{\theta }}) = {\varvec{\Sigma }}^{-1} - {\varvec{\Sigma }}^{-1}\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1}\). The REML estimator is known to be location-invariant (Kackar and Harville 1984, p. 854); that is, \({\hat{{\varvec{\theta }}}}(\mathbf {y}+ \mathbf {X}\mathbf {b}) = {\hat{{\varvec{\theta }}}}(\mathbf {y})\) for all \(\mathbf {b}\in \mathbb {R}^p\). Moreover, it is scale-equivariant in the sense that \({\hat{{\varvec{\theta }}}}(c\mathbf {y}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\) for \(c \ne 0\), which can be verified as follows. The estimate \({\hat{{\varvec{\theta }}}}(c\mathbf {y})\) is the unique solution to the REML equations (*) when the data vector is \(c\mathbf {y}\): \({{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})]\mathbf {G}_i\} = (c\mathbf {y})'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})](c\mathbf {y})\). If we can show that \(c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\) is also a solution, then we can conclude \({\hat{{\varvec{\theta }}}}(c\mathbf {y}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\). So we want to show \({{\mathrm{tr}}}\{{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = (c\mathbf {y})'{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})](c\mathbf {y})\). First note that \({\varvec{\Gamma }}(c{\varvec{\theta }}) = c^{-1}{\varvec{\Gamma }}({\varvec{\theta }})\) because \({\varvec{\Sigma }}(c{\varvec{\theta }}) = c{\varvec{\Sigma }}({\varvec{\theta }})\) and \({\varvec{\Phi }}(c{\varvec{\theta }}) = c{\varvec{\Phi }}({\varvec{\theta }})\). Now
\({{\mathrm{tr}}}\{{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = c^{-2}{{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = c^{-2}\mathbf {y}'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {y}= (c\mathbf {y})'{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})](c\mathbf {y}),\) as required.
For convenience one can combine the properties of location-invariance and scale-equivariance in a single equation: \({\hat{{\varvec{\theta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\).
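In the special case \({\varvec{\Sigma }}= \sigma ^2\mathbf {I}\), the REML estimator has the closed form \({\hat{\sigma }}^2 = \mathbf {y}'(\mathbf {I}- \mathbf {J})\mathbf {y}/(n - p)\), and the combined invariance equation can be verified directly (an illustrative sketch of ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n, pdim = 15, 3
X = rng.normal(size=(n, pdim))               # arbitrary full-rank design
y = rng.normal(size=n)
J = X @ np.linalg.solve(X.T @ X, X.T)        # projection onto col(X)

def reml_sigma2(z):
    # REML variance estimate when Σ = σ²I: z'(I − J)z / (n − p)
    return z @ (np.eye(n) - J) @ z / (n - pdim)

c, bvec = 2.5, rng.normal(size=pdim)
# θ̂(c y + X b) = c² θ̂(y): location-invariant and scale-equivariant
assert np.isclose(reml_sigma2(c * y + X @ bvec), c**2 * reml_sigma2(y))
```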
By its definition \(F_{{\mathrm {KR}}} = \frac{1}{\ell }[\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]'[\mathbf {L}'{\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(\mathbf {y})\mathbf {L}]^{-1}[\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]\). To compare this with \(F_{{\mathrm {KR}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})\), first observe that \({\hat{{\varvec{\Sigma }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = {\varvec{\Sigma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})] = {\varvec{\Sigma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})] = c^2{\varvec{\Sigma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})] = c^2{\hat{{\varvec{\Sigma }}}}(\mathbf {y})\). Therefore
and \(\mathbf {L}'{\hat{{\varvec{\beta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0) = \mathbf {L}'[c{\hat{{\varvec{\beta }}}}(\mathbf {y}) + \mathbf {b}_0] = c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})\) when \(\mathbf {L}'\mathbf {b}_0 = \mathbf {0}\).
Next check that \({\hat{{\varvec{\Phi }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Phi }}}}(\mathbf {y})\), \({\hat{\mathbf {P}}}_i(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-4}{\hat{\mathbf {P}}}_i(\mathbf {y})\), \({\hat{\mathbf {Q}}}_{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-6}{\hat{\mathbf {Q}}}_{ij}(\mathbf {y})\), \({\hat{{\varvec{\Gamma }}}}(c\mathbf {y}+\mathbf {X}\mathbf {b}) = c^{-2}{\hat{{\varvec{\Gamma }}}}(\mathbf {y})\). Recall that \({\hat{\mathbf {W}}} = [{\hat{w}}_{ij}]_{r\times r} = {\tilde{\mathbf {W}}}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\) where \({\tilde{\mathbf {W}}} = [{\tilde{w}}_{ij}]_{r\times r} = \mathfrak {I}^{-1}\) and \(\mathfrak {I}\) is the expected information matrix. We can write \(\mathfrak {I} = [{\tilde{w}}^{ij}]_{r\times r}\) and \({\tilde{w}}^{ij} = \frac{1}{2}{{\mathrm{tr}}}({\varvec{\Gamma }}\mathbf {G}_i{\varvec{\Gamma }}\mathbf {G}_j)\) (see (1.21) in Jiang 2007). One can see that \({\hat{w}}^{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-4}{\hat{w}}^{ij}(\mathbf {y})\), \({\hat{w}}_{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^4{\hat{w}}_{ij}(\mathbf {y})\), \({\hat{{\varvec{\Lambda }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Lambda }}}}(\mathbf {y})\), and \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(\mathbf {y})\). Now
\(F_{{\mathrm {KR}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0) = \frac{1}{\ell }[c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]'[c^2\mathbf {L}'{\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(\mathbf {y})\mathbf {L}]^{-1}[c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})] = F_{{\mathrm {KR}}}(\mathbf {y}),\) which proves the lemma.
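The invariance of \(F_{{\mathrm {KR}}}\) can also be confirmed numerically in the fixed-effects case, where Lemma 6 gives \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}} = {\hat{\sigma }}^2(\mathbf {X}'\mathbf {X})^{-1}\) — a sketch of this special case, not the general mixed-model computation:

```python
import numpy as np

t, v = 3, 4
n = t * v
rng = np.random.default_rng(6)
y = rng.normal(size=n)
X = np.kron(np.eye(t), np.ones((v, 1)))                 # X = I_t ⊗ 1_v
J = X @ np.linalg.solve(X.T @ X, X.T)
Lt = np.hstack([np.eye(t - 1), -np.ones((t - 1, 1))])   # L'
ell = t - 1

def F_KR(z):
    # Fixed-effects case: Φ̂_A = Φ̂ = σ̂²(X'X)⁻¹ (Lemma 6)
    beta = np.linalg.solve(X.T @ X, X.T @ z)
    s2 = z @ (np.eye(n) - J) @ z / (n - t)
    Phi = s2 * np.linalg.inv(X.T @ X)
    Lb = Lt @ beta
    return Lb @ np.linalg.solve(Lt @ Phi @ Lt.T, Lb) / ell

c, b0 = 3.0, np.ones(t)          # L'b₀ = 0: shift all group means equally
assert np.allclose(Lt @ b0, 0)
assert np.isclose(F_KR(c * y + X @ b0), F_KR(y))        # invariance
```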
1.14 Tables of average values of simulated m and \(\lambda \)
The average values of the simulated m and \(\lambda \) are shown in Tables 4 and 5.
Cite this article
Alnosaier, W., Birkes, D. Inner workings of the Kenward–Roger test. Metrika 82, 195–223 (2019). https://doi.org/10.1007/s00184-018-0669-9