Inner workings of the Kenward–Roger test

Abstract

For testing a linear hypothesis about fixed effects in a normal mixed linear model, a popular approach is to use a Wald test, in which the test statistic is assumed to have a null distribution that is approximately chi-squared. This approximation is questionable, however, for small samples. In 1997 Kenward and Roger constructed a test that addresses this problem. They altered the Wald test in three ways: (a) adjusting the test statistic, (b) approximating the null distribution by a scaled F distribution, and (c) modifying the formulas to achieve an exact F test in two special cases. Alterations (a) and (b) lead to formulas that are somewhat complicated but can be explained by using Taylor series approximations and a few convenient assumptions. The modified formulas used in alteration (c), however, are more mysterious. Restricting attention to models with linear variance–covariance structure, we provide details of a derivation that justifies these formulas. We show that similar but different derivations lead to different formulas that also produce exact F tests in the two special cases and are equally justifiable. A simulation study was done for testing the equality of treatment effects in block-design models. Tests based on the different derivations performed very similarly. Moreover, the simulations confirm that alteration (c) is worthwhile. The Kenward–Roger test showed greater accuracy in its p values than did the unmodified version of the test.

References

  • Arnau J, Bono R, Vallejo G (2009) Analyzing small samples of repeated measures data with the mixed-model adjusted F test. Commun Stat Simul 38:1083–1103

  • Chen X (2006) The adjustment of random baseline measurements in treatment effect estimation. J Stat Plan Infer 136:4161–4175

  • Cochran W, Cox G (1957) Experimental designs. Wiley, New York

  • Green P (1974) On the design of choice experiments involving multifactor alternatives. J Consum Res 1:61–68

  • Guiard V, Spilke J, Danicke S (2003) Evaluation and interpretation of results for three cross-over designs. Arch Anim Nutr 57:177–195

  • Halekoh U, Højsgaard S (2014) A Kenward–Roger approximation and parametric bootstrap methods for tests in linear mixed models—the R package pbkrtest. J Stat Softw 59(9):1–32

  • Harville DA, Jeske DR (1992) Mean squared error of estimation or prediction under a general linear model. J Am Stat Assoc 87:724–731

  • Jiang J (2007) Linear and generalized linear mixed models and their applications. Springer, New York

  • John PWM (1971) Statistical design and analysis of experiments. Macmillan, New York

  • Kackar RN, Harville DA (1984) Approximations for standard errors of estimators of fixed and random effects in mixed linear models. J Am Stat Assoc 79:853–862

  • Kenward MG, Roger JH (1997) Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53:983–997

  • Khuri AI, Mathew T, Sinha BK (1998) Statistical tests for mixed linear models. Wiley, New York

  • Kowalchuk RK, Keselman HJ, Algina J, Wolfinger RD (2004) The analysis of repeated measurements with mixed-model adjusted F tests. Educ Psychol Meas 64:224–242

  • Kuehl RO (2000) Design of experiments: statistical principles of research design and analysis, 2nd edn. Duxbury Press, Pacific Grove

  • Livacic-Rojas P, Vallejo G, Fernandez P (2010) Analysis of type I error rates of univariate and multivariate procedures in repeated measures designs. Commun Stat Simul 39:624–640

  • Mardia KV, Kent JT, Bibby JM (1979) Multivariate analysis. Academic Press, London

  • SAS Institute Inc (2015) SAS/STAT®14.1 User’s guide. SAS Institute Inc., Cary

  • Schaalje GB, McBride JB, Fellingham GW (2002) Adequacy of approximations to distributions of test statistics in complex mixed linear models. J Agric Biol Environ Stat 7:512–524

  • Searle SR, Casella G, McCulloch CE (1992) Variance components. Wiley, New York

  • Spilke J, Piepho H, Hu X (2005) A simulation study on tests of hypotheses and confidence intervals for fixed effects in mixed models for blocked experiments with missing data. J Agric Biol Environ Stat 10:374–389

  • Stroup WW (1999) On using proc mixed for longitudinal data. In: Annual conference on applied statistics in agriculture, Paper 5. http://newprairiepress.org/agstatconference/1999/proceedings/5. Accessed 4 July 2016

  • VanLeeuwen DM, Birkes DS, Seely JF (1999) Balance and orthogonality in designs for mixed classification models. Ann Stat 27:1927–1947

  • Wimmer G, Witkovsky V (2007) Univariate linear calibration via replicated errors-in-variables model. J Stat Comput Simul 77:213–227

  • Witkovsky V (2012) Estimation, testing, and prediction regions of the fixed and random effects by solving the Henderson’s mixed model equations. Meas Sci Rev 12:234–248

  • Wulff SS, Robinson TJ (2009) Assessing the uncertainty of regression estimates in a response surface model for repeated measures. Qual Technol Quant Manag 6:309–324

  • Zyskind G (1967) On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. Ann Math Stat 38:1092–1109

Author information

Corresponding author

Correspondence to David Birkes.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Details

A.1 The ANOVA testing problem in Sect. 5 is a special case of Sect. 2

Testing the equality of group means in the balanced one-way ANOVA fixed-effects model in Sect. 5 is a special case of the testing problem in Sect. 2 with \(\mathbf {y}= [ \begin{array}{*{7}{c}} y_{11}&y_{12}&\cdots&y_{1v}&y_{21}&\cdots&y_{tv} \end{array} ]' \), \({\varvec{\beta }}= [ \begin{array}{*{4}{c}} \mu _1&\mu _2&\cdots&\mu _t \end{array} ]' \), \(\mathbf {X}= \mathbf {I}_t \otimes \mathbf {1}_v\), \({\varvec{\Sigma }}= \sigma ^2\mathbf {I}_n\), \({\varvec{\theta }}= \sigma ^2\), \(\mathbf {G}_1 = \mathbf {I}_n\), \(\mathbf {L}' = [ \begin{array}{*{2}{c}} \mathbf {I}_{t - 1}&-\mathbf {1}_{t - 1} \end{array} ]. \)
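
For concreteness, here is a minimal numpy sketch (purely illustrative; the sizes \(t = 3\), \(v = 4\) are arbitrary and not from the paper) that assembles these matrices and shows that \(\mathbf {L}'{\varvec{\beta }}= \mathbf {0}\) is exactly the hypothesis \(\mu _1 = \cdots = \mu _t\):

```python
import numpy as np

t, v = 3, 4                                   # arbitrary illustrative sizes
n = t * v

X = np.kron(np.eye(t), np.ones((v, 1)))       # X = I_t (x) 1_v,  n x t
G1 = np.eye(n)                                # G_1 = I_n, so Sigma = sigma^2 G_1
L = np.vstack([np.eye(t - 1),                 # L' = [I_{t-1}  -1_{t-1}]
               -np.ones((1, t - 1))])

mu_equal = np.full(t, 2.5)                    # all group means equal
mu_unequal = np.array([1.0, 2.0, 3.0])
print(L.T @ mu_equal)                         # zero vector: L'beta = 0 holds
print(L.T @ mu_unequal)                       # nonzero: hypothesis violated
```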

A.2 Calculations for the ANOVA testing problem

One can calculate \({\hat{{\varvec{\beta }}}} = [ \begin{array}{*{4}{c}} \bar{y}_{1\cdot }&\bar{y}_{2\cdot }&\cdots&\bar{y}_{t\cdot } \end{array} ]' \), \({\hat{{\varvec{\Phi }}}}=({\hat{\sigma }}^2/v)\mathbf {I}_t\), \({\hat{w}}_{11}=2{\hat{\sigma }}^4/(n-t)\), \({\hat{\mathbf {P}}}_1=-(v/{\hat{\sigma }}^4)\mathbf {I}_t\), \({\hat{{\varvec{\Psi }}}} =({\hat{\sigma }}^2/v)(\mathbf {I}_t - t^{-1}\mathbf {1}_t\mathbf {1}'_t)\), \(A_1 = 2(t - 1)^2/(n - t)\), \(A_2 = 2(t - 1)/(n - t)\), \(B = (t + 5)/(n - t)\). Moreover, \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\) by the following lemma.
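
These closed forms are easy to confirm numerically. The following sketch is illustrative only (random data, arbitrary sizes); it uses the fact that in this fixed-effects model the REML estimate of \(\sigma ^2\) is the residual mean square \(\mathrm {RSS}/(n-t)\):

```python
import numpy as np

rng = np.random.default_rng(0)
t, v = 3, 4
n = t * v
X = np.kron(np.eye(t), np.ones((v, 1)))
y = rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                 # GLS = OLS since Sigma = sigma^2 I
print(np.allclose(beta_hat, y.reshape(t, v).mean(axis=1)))   # group means

rss = np.sum((y - X @ beta_hat) ** 2)
sigma2_hat = rss / (n - t)                                   # REML estimate of sigma^2
Phi_hat = sigma2_hat * np.linalg.inv(X.T @ X)                # (X' Sigma^-1 X)^(-1) at sigma2_hat
print(np.allclose(Phi_hat, (sigma2_hat / v) * np.eye(t)))    # Phi-hat = (sigma^2/v) I_t

A1 = 2 * (t - 1) ** 2 / (n - t)                              # constants quoted in the text
A2 = 2 * (t - 1) / (n - t)
B = (t + 5) / (n - t)
```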

Lemma 6

If the model satisfies Zyskind’s condition, then \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\).

Note that every fixed-effects linear model satisfies Zyskind’s condition, because the column space of \({\varvec{\Sigma }}\mathbf {X}= \sigma ^2\mathbf {X}\) is contained in the column space of \(\mathbf {X}\).

A.3 Proof of Lemma 6

We can show \({\hat{{\varvec{\Lambda }}}} = \mathbf {0}\) by showing \(\mathbf {Q}_{ij} - \mathbf {P}_i{\varvec{\Phi }}\mathbf {P}_j = \mathbf {0}\). Zyskind’s condition is equivalent to the condition that \({\varvec{\Sigma }}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}\) for all allowable \({\varvec{\Sigma }}\) where \(\mathbf {J}\) is the orthogonal projection operator on the column space of \(\mathbf {X}\) (Zyskind 1967, Theorem 2). The two technical assumptions mentioned in Sect. 2 imply that \(\mathbf {G}_i\mathbf {J}= \mathbf {J}\mathbf {G}_i\) for all i. Moreover, \({\varvec{\Sigma }}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}\) implies \({\varvec{\Sigma }}^{-1}\mathbf {J}= \mathbf {J}{\varvec{\Sigma }}^{-1}\). Next, recall that the GLSE is \((\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})^{-1}\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {y}\) and the LSE is \((\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'\mathbf {y}\). It follows from the remark in Sect. 3 that Zyskind’s condition is also equivalent to the condition that \((\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})^{-1}\mathbf {X}'{\varvec{\Sigma }}^{-1} = (\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'\) for all allowable \({\varvec{\Sigma }}\), which implies \(\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1} = \mathbf {X}(\mathbf {X}'\mathbf {X})^{-1}\mathbf {X}'= \mathbf {J}\). Now

$$\begin{aligned} \mathbf {P}_i{\varvec{\Phi }}\mathbf {P}_j&= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}\\&= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {J}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {J}\mathbf {X}\\&= \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}=\mathbf {Q}_{ij}. \end{aligned}$$
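
As an illustrative check of this identity (not part of the paper), one can verify \(\mathbf {P}_i{\varvec{\Phi }}\mathbf {P}_j = \mathbf {Q}_{ij}\) numerically for a balanced one-way random-effects model, which satisfies Zyskind’s condition. The sketch takes \(\mathbf {P}_i = -\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {X}\) and \(\mathbf {Q}_{ij} = \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {G}_i{\varvec{\Sigma }}^{-1}\mathbf {G}_j{\varvec{\Sigma }}^{-1}\mathbf {X}\); the sign of \(\mathbf {P}_i\) is immaterial here because the two minus signs cancel in the product.

```python
import numpy as np

t, v = 3, 4
n = t * v
X = np.ones((n, 1))                            # mean-only fixed effects
G = [np.kron(np.eye(t), np.ones((v, v))),      # G_1: random group effect
     np.eye(n)]                                # G_2: residual error
theta = [0.7, 1.3]                             # arbitrary positive variance components

Sigma = theta[0] * G[0] + theta[1] * G[1]
Si = np.linalg.inv(Sigma)
Phi = np.linalg.inv(X.T @ Si @ X)

for i in range(2):
    for j in range(2):
        P_i = -X.T @ Si @ G[i] @ Si @ X
        P_j = -X.T @ Si @ G[j] @ Si @ X
        Q_ij = X.T @ Si @ G[i] @ Si @ G[j] @ Si @ X
        assert np.allclose(P_i @ Phi @ P_j, Q_ij)   # Q_ij - P_i Phi P_j = 0
print("Lemma 6 identity verified")
```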

A.4 \(m^\#\) and \(\lambda ^\#\) in the ANOVA case

Quantities from A.2 above can be substituted into (4.1) to obtain \(E^\#\) and \(V^\#\) shown in (5.2), which in turn can be used in (4.3) to obtain (5.3). The inequality in (5.3b) holds because \((m^\# - 2)(n - t + 2) = m^\#(n - t) + 2[m^\# - (n - t + 2)] > m^\#(n - t)\); the final inequality follows because formula (5.3a) implies \(m^\# > 4 + n - t > n - t + 2\).

A.5 The Hotelling testing problem in Sect. 6 is a special case of Sect. 2

The one-sample Hotelling T-squared test in Sect. 6 is a special case of the testing problem in Sect. 2 with \(\mathbf {y}= [ \begin{array}{*{4}{c}} \mathbf {y}'_1&\mathbf {y}'_2&\cdots&\mathbf {y}'_v \end{array} ]' \), \({\varvec{\beta }}= {\varvec{\mu }}\), \(\mathbf {X}= \mathbf {1}_v \otimes \mathbf {I}_p\), \({\varvec{\Sigma }}= \mathbf {I}_v \otimes {\varvec{\Sigma }}_p\), \({\varvec{\Sigma }}_p =\) an arbitrary positive definite \(p \times p\) matrix, \({\varvec{\theta }}=\) an \(r \times 1\) vector (with \(r = p(p+1)/2\)) whose entries are the entries on and above the main diagonal of \({\varvec{\Sigma }}_p = [\sigma _{ij}]_{p \times p}\). To express the linear structure of the variance–covariance matrix it is convenient to use double subscripting: \({\varvec{\Sigma }}= \sum \{\sigma _{ij}(\mathbf {I}_v \otimes \mathbf {G}_{pij}):1 \le i \le j \le p\}\) where \(\mathbf {G}_{pii} = \mathbf {u}_i\mathbf {u}'_i\), \(\mathbf {G}_{pij} = \mathbf {u}_i\mathbf {u}'_j + \mathbf {u}_j\mathbf {u}'_i\) for \(i \ne j\), and \(\mathbf {u}_i =\) a \(p \times 1\) vector with a 1 in the ith position and 0’s elsewhere. Also, \(\mathbf {L}= \mathbf {I}_p\).
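
A minimal numpy sketch (illustrative; \(p = 3\), \(v = 5\) are arbitrary) that assembles \({\varvec{\Sigma }}\) from this double-subscripted linear structure and confirms it equals \(\mathbf {I}_v \otimes {\varvec{\Sigma }}_p\):

```python
import numpy as np

rng = np.random.default_rng(1)
p, v = 3, 5
A = rng.normal(size=(p, p))
Sigma_p = A @ A.T + p * np.eye(p)              # arbitrary positive definite p x p matrix

def G_p(i, j):
    """Basis matrix G_pij of the linear variance-covariance structure."""
    u_i = np.eye(p)[:, [i]]
    u_j = np.eye(p)[:, [j]]
    return u_i @ u_i.T if i == j else u_i @ u_j.T + u_j @ u_i.T

Sigma = sum(Sigma_p[i, j] * np.kron(np.eye(v), G_p(i, j))
            for i in range(p) for j in range(i, p))
print(np.allclose(Sigma, np.kron(np.eye(v), Sigma_p)))   # True

X = np.kron(np.ones((v, 1)), np.eye(p))        # X = 1_v (x) I_p
L = np.eye(p)                                  # L = I_p
```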

A.6 Calculations for the Hotelling testing problem

One can calculate \({\hat{{\varvec{\beta }}}}=\bar{\mathbf {y}}_{\cdot }\), \({\hat{{\varvec{\Phi }}}}=(1/v)\mathbf {S}\), \({\hat{w}}_{ij,fg}=({\hat{\sigma }}_{if}{\hat{\sigma }}_{jg} + {\hat{\sigma }}_{ig}{\hat{\sigma }}_{jf})/(v-1)\) where \({\hat{\sigma }}_{ij}\) denotes an entry of the matrix \(\mathbf {S}= {\hat{{\varvec{\Sigma }}}}_{p\text {REML}}\), \({\hat{\mathbf {P}}}_{ii} = - v\mathbf {S}^{-1}\mathbf {u}_i\mathbf {u}'_i\mathbf {S}^{-1}\), \({\hat{\mathbf {P}}}_{ij} = - v\mathbf {S}^{-1}(\mathbf {u}_i\mathbf {u}'_j + \mathbf {u}_j\mathbf {u}'_i)\mathbf {S}^{-1}\) for \(i < j\), \({\hat{{\varvec{\Psi }}}} = (1/v)\mathbf {S}\), \(A_1 =2p/(v - 1)\), \(A_2 = p(p + 1)/(v - 1)\), \(B = (3p + 4)/(v - 1)\). In calculating \(A_1\) and \(A_2\) we use the identities \({{\mathrm{tr}}}(\mathbf {u}_i\mathbf {u}'_j\mathbf {S}^{-1}) = \mathbf {u}'_j\mathbf {S}^{-1}\mathbf {u}_i = {\hat{\sigma }}^{ji} = \) the (ji) entry of the matrix \(\mathbf {S}^{-1}\) and \(\sum _{j = 1}^p {\hat{\sigma }}_{ij}{\hat{\sigma }}^{jg} =\delta _{ig} =\) the Kronecker delta \(=\) the (ig) entry of the matrix \(\mathbf {I}\) (because \(\mathbf {S}\mathbf {S}^{-1} = \mathbf {I}\)). Lemma 6 implies \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}} = {\hat{{\varvec{\Phi }}}}\) because the model satisfies Zyskind’s condition: the column space of \({\varvec{\Sigma }}\mathbf {X}= (\mathbf {I}_v \otimes {\varvec{\Sigma }}_p)(\mathbf {1}_v \otimes \mathbf {I}_p) = \mathbf {1}_v \otimes {\varvec{\Sigma }}_p = (\mathbf {1}_v \otimes \mathbf {I}_p){\varvec{\Sigma }}_p = \mathbf {X}{\varvec{\Sigma }}_p\) is contained in the column space of \(\mathbf {X}\).
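
An illustrative numerical check (not the paper's code) of two of these facts: with \(\mathbf {S}\), the sample covariance matrix with divisor \(v-1\), in the role of \({\hat{{\varvec{\Sigma }}}}_{p\text {REML}}\), the GLS formula gives \({\hat{{\varvec{\Phi }}}} = \mathbf {S}/v\), and \({\hat{{\varvec{\Sigma }}}}\mathbf {X}= \mathbf {X}\mathbf {S}\) exhibits Zyskind’s condition.

```python
import numpy as np

rng = np.random.default_rng(2)
p, v = 3, 6
Y = rng.normal(size=(v, p))                    # v observations of a p-vector
ybar = Y.mean(axis=0)                          # beta-hat
S = (Y - ybar).T @ (Y - ybar) / (v - 1)        # REML estimate of Sigma_p

X = np.kron(np.ones((v, 1)), np.eye(p))
Sigma_hat = np.kron(np.eye(v), S)
Si = np.linalg.inv(Sigma_hat)
Phi_hat = np.linalg.inv(X.T @ Si @ X)
print(np.allclose(Phi_hat, S / v))             # Phi-hat = (1/v) S

print(np.allclose(Sigma_hat @ X, X @ S))       # col(Sigma X) lies in col(X)
```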

A.7 \(m^\#\) and \(\lambda ^\#\) in the Hotelling case

Quantities from A.6 above can be substituted into (4.1) to obtain \(E^\#\) and \(V^\#\) shown in (6.2), which in turn can be used in (4.3) to obtain (6.3).

A.8 Proof of Lemma 1(a)

Suppose \(A_1/A_2=\ell \) as in special case 1, which implies \(B=[(\ell +6)A_2]/(2\ell )\). Formulas (7.2) and (7.3) were deliberately derived so that when \(A_1/A_2=\ell \), then \(\mathbf {d}= (\ell -2,2,4)/(\ell +6)\), hence \(\mathbf {d}B= (\ell -2,2,4)A_2/(2\ell )\). Thus (7.4) becomes

$$\begin{aligned} V^* = \frac{2}{\ell }\frac{\left( 1 + \frac{\ell - 2}{2\ell }A_2\right) }{\left( 1 - \frac{1}{\ell }A_2\right) ^2\left( 1 - \frac{2}{\ell }A_2\right) } = \frac{2(E^*)^2}{\ell }\frac{\left( 1 + \frac{\ell - 2}{2\ell }A_2\right) }{\left( 1 - \frac{2}{\ell }A_2\right) }. \end{aligned}$$

Via the formulas in (4.3) with superscripts # replaced by *, we obtain \(\ell \rho ^* = \{1 + [(\ell - 2)/(2\ell )]A_2\}/ \{1-(2/\ell )A_2\}\), \(m^* = 2\ell /A_2\) and \(\lambda ^* = 1\).

Formulas (8.2) and (8.3) give us \(e_0 = 0\), \(e_1 = \ell + 6\), \(m^\dag = (\ell + 6)/(QA_2) = 2\ell /A_2\). Formula (9.3) yields \(m^\ddag = 2\ell (\ell + 2)/[(\ell + 2)A_2] = 2\ell /A_2\). Thus \(m^* = m^\dag = m^\ddag \). Because \(E^* = E^\dag = E^\ddag \), this implies \(\lambda ^* = \lambda ^\dag = \lambda ^\ddag \).

A.9 Proof of Lemma 1(b)

Suppose \(A_1/A_2 = 2/(\ell + 1)\) as in special case 2, which implies \(B = [(3\ell + 4)A_2]/[\ell (\ell + 1)]\). Formulas (7.2) and (7.3) were deliberately derived so that when \(A_1/A_2 = 2/(\ell + 1)\), then \(\mathbf {d}= (-1,\ell +1,\ell +3)/(3\ell +4)\), hence \(\mathbf {d}B= (-1,\ell +1,\ell +3)A_2/[\ell (\ell + 1)]\). Thus (7.4) becomes

$$\begin{aligned} V^* = \frac{2(E^*)^2}{\ell } \frac{\left[ 1 - \frac{1}{\ell (\ell + 1)}A_2\right] }{\left[ 1 -\frac{\ell + 3}{\ell (\ell + 1)}A_2\right] }. \end{aligned}$$

Via the formulas in (4.3) with superscripts # replaced by *, we obtain \(\ell \rho ^* = \{1-[1/[\ell (\ell + 1)]]A_2\}/\{1-[(\ell + 3)/[\ell (\ell + 1)]]A_2\}\), and \(m^*\) and \(\lambda ^*\) are as displayed in the lemma.

Formulas (8.2) and (8.3) give us \(e_0 = 1 - \ell \), \(e_1 = 3\ell + 4\), \(m^\dag = (1 - \ell ) + (3\ell + 4)/(QA_2) = m^*\). We have \(A_1 = [2/(\ell + 1)]A_2\) and so formula (9.3) yields \(m^\ddag = \{2\ell (\ell + 2) + 2[2/(\ell + 1) - \ell ]A_2\} / \{[2/(\ell +1) +2]A_2\} = m^*\). So we see \(m^* = m^\dag = m^\ddag \), which then implies \(\lambda ^* = \lambda ^\dag = \lambda ^\ddag \).

A.10 Proof of Lemma 2

It suffices to show \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)\). For any \(p \times p\) matrix \(\mathbf {M}\), \(\mathbf {L}'\mathbf {M}\mathbf {L}\) is a \(1 \times 1\) matrix and hence is a scalar. Therefore

$$\begin{aligned} {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)&= {{\mathrm{tr}}}[{\varvec{\Phi }}\mathbf {L}(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_j]\\&= (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-2}{{\mathrm{tr}}}({\varvec{\Phi }}\mathbf {L}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_j)\\&= (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-2}{{\mathrm{tr}}}(\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_j{\varvec{\Phi }}\mathbf {L})\\&= (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-2}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_j{\varvec{\Phi }}\mathbf {L}\\&= [(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}][(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_j{\varvec{\Phi }}\mathbf {L}] = {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j), \end{aligned}$$

because

$$\begin{aligned} {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i)&= {{\mathrm{tr}}}[{\varvec{\Phi }}\mathbf {L}(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i] = (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}{{\mathrm{tr}}}({\varvec{\Phi }}\mathbf {L}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i)\\&= (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}{{\mathrm{tr}}}(\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}) = (\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}\mathbf {P}_i{\varvec{\Phi }}\mathbf {L}. \end{aligned}$$
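
Here is an illustrative numerical confirmation of the identity just proved; it relies only on \(\mathbf {L}\) having a single column, so that \(\mathbf {L}'\mathbf {M}\mathbf {L}\) is a scalar (all matrices below are otherwise arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4
A = rng.normal(size=(p, p))
Phi = A @ A.T + np.eye(p)                          # arbitrary positive definite Phi
L = rng.normal(size=(p, 1))                        # single-column L (the ell = 1 case)
P_i = rng.normal(size=(p, p)); P_i = P_i + P_i.T   # arbitrary symmetric matrices
P_j = rng.normal(size=(p, p)); P_j = P_j + P_j.T

Psi = Phi @ L @ np.linalg.inv(L.T @ Phi @ L) @ L.T @ Phi
lhs = np.trace(Psi @ P_i) * np.trace(Psi @ P_j)
rhs = np.trace(Psi @ P_i @ Psi @ P_j)
print(np.isclose(lhs, rhs))                        # True
```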

A.11 Proof of Lemma 3

It suffices to show \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = \ell {{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)\). The assumptions of the lemma imply

$$\begin{aligned} \mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X}= \begin{bmatrix} \mathbf {X}'_1{\varvec{\Sigma }}^{-1}\mathbf {X}_1&\quad \mathbf {0} \\ \mathbf {0}&\quad f({\varvec{\theta }})\mathbf {C}\end{bmatrix} \quad \text {and}\quad {\varvec{\Phi }}= (\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})^{-1} = \begin{bmatrix} *&\quad \mathbf {0} \\ \mathbf {0}&\quad f({\varvec{\theta }})^{-1}\mathbf {C}^{-1} \end{bmatrix}. \end{aligned}$$

We are assuming \(\mathbf {L}' = [ \begin{array}{*{2}{c}} \mathbf {0}&\mathbf {L}'_2 \end{array} ]\), so

$$\begin{aligned} {\varvec{\Psi }}= {\varvec{\Phi }}\mathbf {L}(\mathbf {L}'{\varvec{\Phi }}\mathbf {L})^{-1}\mathbf {L}'{\varvec{\Phi }}= f({\varvec{\theta }})^{-1} \begin{bmatrix} \mathbf {0}&\quad \mathbf {0} \\ \mathbf {0}&\quad \mathbf {C}^{-1}\mathbf {L}_2(\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2)^{-1}\mathbf {L}'_2\mathbf {C}^{-1} \end{bmatrix}. \end{aligned}$$

Note that \(\mathbf {P}_i\) in Sect. 3 can be expressed as \(\mathbf {P}_i = (\partial /\partial \theta _i)(\mathbf {X}'{\varvec{\Sigma }}^{-1}\mathbf {X})\). Under the assumptions of the lemma,

$$\begin{aligned} \mathbf {P}_i = \begin{bmatrix} *&\quad \mathbf {0} \\ \mathbf {0}&\quad \frac{\partial f}{\partial \theta _i}({\varvec{\theta }})\mathbf {C}\end{bmatrix} \quad \text {and}\quad {\varvec{\Psi }}\mathbf {P}_i = g_i({\varvec{\theta }}) \begin{bmatrix} \mathbf {0}&\quad \mathbf {0} \\ \mathbf {0}&\quad \mathbf {M}\\ \end{bmatrix} \end{aligned}$$

where \(g_i({\varvec{\theta }}) = f({\varvec{\theta }})^{-1}(\partial /\partial \theta _i)f({\varvec{\theta }})\) and \(\mathbf {M}= \mathbf {C}^{-1}\mathbf {L}_2(\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2)^{-1}\mathbf {L}'_2\). Now \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i) = g_i({\varvec{\theta }}){{\mathrm{tr}}}(\mathbf {M})\) and \({{\mathrm{tr}}}(\mathbf {M}) = {{\mathrm{tr}}}[(\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2)^{-1}\mathbf {L}'_2\mathbf {C}^{-1}\mathbf {L}_2] = {{\mathrm{tr}}}(\mathbf {I}_\ell ) = \ell \). Therefore \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i){{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = \ell ^2g_i({\varvec{\theta }})g_j({\varvec{\theta }})\). Also,

$$\begin{aligned} {\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j = g_i({\varvec{\theta }})g_j({\varvec{\theta }}) \begin{bmatrix} \mathbf {0}&\quad \mathbf {0} \\ \mathbf {0}&\quad \mathbf {M}^2 \\ \end{bmatrix} \end{aligned}$$

and \(\mathbf {M}^2 = \mathbf {M}\), so \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j) = \ell g_i({\varvec{\theta }})g_j({\varvec{\theta }})\).
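
The block-matrix argument can also be checked numerically. The sketch below is illustrative (the block sizes, \(\mathbf {C}\), \(\mathbf {L}_2\), \(f\) and its partial derivatives are all arbitrary); it builds \({\varvec{\Phi }}\), \(\mathbf {P}_i\) and \(\mathbf {L}\) with exactly the structure used in the proof and confirms \({{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i)\,{{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_j) = \ell \,{{\mathrm{tr}}}({\varvec{\Psi }}\mathbf {P}_i{\varvec{\Psi }}\mathbf {P}_j)\):

```python
import numpy as np

def block_diag(a, b):
    """2 x 2 block-diagonal matrix [[a, 0], [0, b]]."""
    return np.block([[a, np.zeros((a.shape[0], b.shape[1]))],
                     [np.zeros((b.shape[0], a.shape[1])), b]])

rng = np.random.default_rng(4)
p1, p2, ell = 2, 4, 3                          # block sizes and number of columns of L_2
A = rng.normal(size=(p2, p2))
C = A @ A.T + np.eye(p2)                       # positive definite C
L2 = rng.normal(size=(p2, ell))
L = np.vstack([np.zeros((p1, ell)), L2])       # L' = [0  L_2']

f, df = 1.7, [0.4, -0.9]                       # f(theta) and two partial derivatives
upper = rng.normal(size=(p1, p1))              # the unspecified "*" block of Phi
Phi = block_diag(upper @ upper.T + np.eye(p1), np.linalg.inv(C) / f)
Psi = Phi @ L @ np.linalg.inv(L.T @ Phi @ L) @ L.T @ Phi

for i in range(2):
    for j in range(2):
        P_i = block_diag(rng.normal(size=(p1, p1)), df[i] * C)
        P_j = block_diag(rng.normal(size=(p1, p1)), df[j] * C)
        lhs = np.trace(Psi @ P_i) * np.trace(Psi @ P_j)
        rhs = ell * np.trace(Psi @ P_i @ Psi @ P_j)
        assert np.isclose(lhs, rhs)
print("Lemma 3 identity verified")
```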

A.12 Proof of Theorem 2

The theorem follows from Lemmas 1(a) and 3, if we show that a BIBD model satisfies the conditions of Lemma 3. First note that \(\mathbf {B}\mathbf {B}' = k\mathbf {B}(\mathbf {B}'\mathbf {B})^{-1}\mathbf {B}' = k\mathbf {P}_\mathbf {B}\) where \(\mathbf {P}_\mathbf {B}\) denotes the orthogonal projection matrix on the column space of \(\mathbf {B}\). Now we can write

$$\begin{aligned}&{\varvec{\Sigma }}= (k\sigma _b^2 + \sigma _e^2)\mathbf {P}_\mathbf {B}+ \sigma _e^2(\mathbf {I}_n - \mathbf {P}_\mathbf {B}) \quad \text {and}\\&{\varvec{\Sigma }}^{-1} = (k\sigma _b^2 +\sigma _e^2)^{-1}\mathbf {P}_\mathbf {B}+ (\sigma _e^2)^{-1}(\mathbf {I}_n - \mathbf {P}_\mathbf {B}). \end{aligned}$$

Write \(\mathbf {T}^* = \mathbf {T}\mathbf {U}\) where \(\mathbf {U}\) is the \(t \times (t - 1)\) matrix obtained by subtracting the last column of \(\mathbf {I}_t\) from each of the first \(t - 1\) columns of \(\mathbf {I}_t\). Then

$$\begin{aligned} \mathbf {X}'_1{\varvec{\Sigma }}^{-1}\mathbf {X}_2 = (k\sigma _b^2 +\sigma _e^2)^{-1}\mathbf {1}'_n\mathbf {P}_\mathbf {B}\mathbf {T}\mathbf {U}+ (\sigma _e^2)^{-1}\mathbf {1}'_n(\mathbf {I}_n -\mathbf {P}_\mathbf {B})\mathbf {T}\mathbf {U}= \mathbf {0}, \end{aligned}$$

because (1) \(\mathbf {1}'_n\mathbf {P}_\mathbf {B}= \mathbf {1}'_n\) and (2) \(\mathbf {1}'_n\mathbf {T}\mathbf {U}= \mathbf {0}\). (1) is true because \(\mathbf {1}_n\) is in the column space of \(\mathbf {B}\), and (2) is true because \(\mathbf {1}'_n\mathbf {T}= r\mathbf {1}'_t\) and \(\mathbf {1}'_t\mathbf {U}= \mathbf {0}\). Next write

$$\begin{aligned} {\varvec{\Sigma }}^{-1}= & {} [(k\sigma _b^2 + \sigma _e^2)^{-1} -(\sigma _e^2)^{-1}]\mathbf {P}_\mathbf {B}+ (\sigma _e^2)^{-1}\mathbf {I}_n \\= & {} \sigma _e^{-2}\{\mathbf {I}_n - [\sigma _b^2 / (k\sigma _b^2 + \sigma _e^2)]\mathbf {B}\mathbf {B}'\}. \end{aligned}$$

It can be shown (Khuri et al. 1998, p. 176) that \(\mathbf {N}\mathbf {N}' = (r - g)\mathbf {I}_t + g\mathbf {1}_t\mathbf {1}'_t\). Then

$$\begin{aligned} \mathbf {X}'_2{\varvec{\Sigma }}^{-1}\mathbf {X}_2&= \sigma _e^{-2}\{\mathbf {U}'\mathbf {T}'\mathbf {T}\mathbf {U}- [\sigma _b^2 / (k\sigma _b^2 + \sigma _e^2)]\mathbf {U}'\mathbf {T}'\mathbf {B}\mathbf {B}'\mathbf {T}\mathbf {U}\},\\ \mathbf {U}'\mathbf {T}'\mathbf {T}\mathbf {U}&= r\mathbf {U}'\mathbf {U},\\ \mathbf {U}'\mathbf {T}'\mathbf {B}\mathbf {B}'\mathbf {T}\mathbf {U}&= \mathbf {U}'\mathbf {N}\mathbf {N}'\mathbf {U}= (r - g)\mathbf {U}'\mathbf {U}, \end{aligned}$$

and hence \(\mathbf {X}'_2{\varvec{\Sigma }}^{-1}\mathbf {X}_2 = f({\varvec{\theta }})\mathbf {C}\) where \(f({\varvec{\theta }}) = \sigma _e^{-2}\{r - (r - g)\sigma _b^2 / (k\sigma _b^2 + \sigma _e^2)\}\) and \(\mathbf {C}= \mathbf {U}'\mathbf {U}\).
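
As a concrete illustration (not from the paper), the following sketch uses the unreduced BIBD with \(t = 4\) treatments in \(b = 6\) blocks of size \(k = 2\) (so \(r = 3\) and concurrence \(g = 1\)) and confirms both conclusions: \(\mathbf {X}'_1{\varvec{\Sigma }}^{-1}\mathbf {X}_2 = \mathbf {0}\) and \(\mathbf {X}'_2{\varvec{\Sigma }}^{-1}\mathbf {X}_2 = f({\varvec{\theta }})\mathbf {U}'\mathbf {U}\) with the \(f({\varvec{\theta }})\) displayed above:

```python
import numpy as np
from itertools import combinations

t, k = 4, 2
blocks = list(combinations(range(t), k))       # all 6 pairs: the unreduced BIBD
b = len(blocks)
n = b * k
r, g = 3, 1                                    # replications and pairwise concurrence

B = np.zeros((n, b))                           # plot-by-block incidence
T = np.zeros((n, t))                           # plot-by-treatment incidence
for blk, pair in enumerate(blocks):
    for slot, trt in enumerate(pair):
        row = blk * k + slot
        B[row, blk] = 1.0
        T[row, trt] = 1.0

U = np.vstack([np.eye(t - 1), -np.ones((1, t - 1))])   # columns e_i - e_t
sigma_b2, sigma_e2 = 0.8, 1.5                  # arbitrary variance components

Sigma = sigma_b2 * B @ B.T + sigma_e2 * np.eye(n)
Si = np.linalg.inv(Sigma)
X1, X2 = np.ones((n, 1)), T @ U

print(np.allclose(X1.T @ Si @ X2, 0))          # X_1' Sigma^{-1} X_2 = 0
f_theta = (r - (r - g) * sigma_b2 / (k * sigma_b2 + sigma_e2)) / sigma_e2
print(np.allclose(X2.T @ Si @ X2, f_theta * U.T @ U))   # = f(theta) U'U
```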

A.13 Proof of Lemma 5

We will assume that the REML estimator \({\hat{{\varvec{\theta }}}}(\mathbf {y})\) can be characterized as the unique solution of the REML equations (see Jiang 2007, p. 13):

$$\begin{aligned} {{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = \mathbf {y}'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {y}\quad \text {for } i = 1, \ldots ,r \end{aligned}$$
(*)

where \({\varvec{\Gamma }}= {\varvec{\Gamma }}({\varvec{\theta }}) = {\varvec{\Sigma }}^{-1} - {\varvec{\Sigma }}^{-1}\mathbf {X}{\varvec{\Phi }}\mathbf {X}'{\varvec{\Sigma }}^{-1}\). The REML estimator is known to be location-invariant (Kackar and Harville 1984, p. 854); that is, \({\hat{{\varvec{\theta }}}}(\mathbf {y}+ \mathbf {X}\mathbf {b}) = {\hat{{\varvec{\theta }}}}(\mathbf {y})\) for all \(\mathbf {b}\in \mathbb {R}^p\). Moreover, it is scale-equivariant in the sense that \({\hat{{\varvec{\theta }}}}(c\mathbf {y}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\) for \(c \ne 0\), which can be verified as follows. The estimate \({\hat{{\varvec{\theta }}}}(c\mathbf {y})\) is the unique solution to the REML equations (*) when the data vector is \(c\mathbf {y}\): \({{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})]\mathbf {G}_i\} = (c\mathbf {y})'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y})](c\mathbf {y})\). If we can show that \(c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\) is also a solution, then we can conclude \({\hat{{\varvec{\theta }}}}(c\mathbf {y}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\). So we want to show \({{\mathrm{tr}}}\{{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = (c\mathbf {y})'{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})](c\mathbf {y})\). First note that \({\varvec{\Gamma }}(c{\varvec{\theta }}) = c^{-1}{\varvec{\Gamma }}({\varvec{\theta }})\) because \({\varvec{\Sigma }}(c{\varvec{\theta }}) = c{\varvec{\Sigma }}({\varvec{\theta }})\) and \({\varvec{\Phi }}(c{\varvec{\theta }}) = c{\varvec{\Phi }}({\varvec{\theta }})\). Now

$$\begin{aligned}&(c\mathbf {y})'{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})](c\mathbf {y}) = c^2\mathbf {y}'c^{-2}{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_ic^{-2}{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {y}\\&\quad = c^{-2}\mathbf {y}'{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {y}{\mathop {=}\limits ^{\text {by (*)}}} c^{-2}{{\mathrm{tr}}}\{{\varvec{\Gamma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\} = {{\mathrm{tr}}}\{{\varvec{\Gamma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})]\mathbf {G}_i\}. \end{aligned}$$

For convenience one can combine the properties of location-invariance and scale-equivariance in a single equation: \({\hat{{\varvec{\theta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})\).
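
As an illustrative special case (not part of the proof), this combined property is easy to verify when there is a single variance component and \({\varvec{\Sigma }}= \sigma ^2\mathbf {I}_n\), where the REML estimator has the closed form \({\hat{\sigma }}^2 = \mathbf {y}'(\mathbf {I}- \mathbf {J})\mathbf {y}/[n - {{\mathrm{rank}}}(\mathbf {X})]\):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 12, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

def reml_sigma2(y, X):
    """Closed-form REML estimate of sigma^2 when Sigma = sigma^2 I."""
    J = X @ np.linalg.solve(X.T @ X, X.T)      # orthogonal projection onto col(X)
    resid = y - J @ y
    return resid @ resid / (len(y) - np.linalg.matrix_rank(X))

c, b = -2.5, rng.normal(size=p)
lhs = reml_sigma2(c * y + X @ b, X)
rhs = c ** 2 * reml_sigma2(y, X)
print(np.isclose(lhs, rhs))                    # theta-hat(c y + X b) = c^2 theta-hat(y)
```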

By its definition \(F_{{\mathrm {KR}}} = \frac{1}{\ell }[\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]'[\mathbf {L}'{\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(\mathbf {y})\mathbf {L}]^{-1}[\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]\). To compare this with \(F_{{\mathrm {KR}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})\), first observe that \({\hat{{\varvec{\Sigma }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = {\varvec{\Sigma }}[{\hat{{\varvec{\theta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})] = {\varvec{\Sigma }}[c^2{\hat{{\varvec{\theta }}}}(\mathbf {y})] = c^2{\varvec{\Sigma }}[{\hat{{\varvec{\theta }}}}(\mathbf {y})] = c^2{\hat{{\varvec{\Sigma }}}}(\mathbf {y})\). Therefore

$$\begin{aligned} {\hat{{\varvec{\beta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})&= \{\mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})]^{-1}\mathbf {X}\}^{-1} \mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b})]^{-1}(c\mathbf {y}+ \mathbf {X}\mathbf {b})\\&= \{\mathbf {X}'[c^2{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {X}\}^{-1} \mathbf {X}'[c^2{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}(c\mathbf {y}+ \mathbf {X}\mathbf {b})\\&= \{\mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {X}\}^{-1} \mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}(c\mathbf {y}+ \mathbf {X}\mathbf {b})\\&= c\{\mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {X}\}^{-1} \mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {y}+ \{\mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {X}\}^{-1} \mathbf {X}'[{\hat{{\varvec{\Sigma }}}}(\mathbf {y})]^{-1}\mathbf {X}\mathbf {b}\\&= c{\hat{{\varvec{\beta }}}}(\mathbf {y}) + \mathbf {b}\end{aligned}$$

and \(\mathbf {L}'{\hat{{\varvec{\beta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0) = \mathbf {L}'[c{\hat{{\varvec{\beta }}}}(\mathbf {y}) + \mathbf {b}_0] = c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})\) when \(\mathbf {L}'\mathbf {b}_0 = \mathbf {0}\).

Next check that \({\hat{{\varvec{\Phi }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Phi }}}}(\mathbf {y})\), \({\hat{\mathbf {P}}}_i(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-4}{\hat{\mathbf {P}}}_i(\mathbf {y})\), \({\hat{\mathbf {Q}}}_{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-6}{\hat{\mathbf {Q}}}_{ij}(\mathbf {y})\), \({\hat{{\varvec{\Gamma }}}}(c\mathbf {y}+\mathbf {X}\mathbf {b}) = c^{-2}{\hat{{\varvec{\Gamma }}}}(\mathbf {y})\). Recall that \({\hat{\mathbf {W}}} = [{\hat{w}}_{ij}]_{r\times r} = {\tilde{\mathbf {W}}}[{\hat{{\varvec{\theta }}}}(\mathbf {y})]\) where \({\tilde{\mathbf {W}}} = [{\tilde{w}}_{ij}]_{r\times r} = \mathfrak {I}^{-1}\) and \(\mathfrak {I}\) is the expected information matrix. We can write \(\mathfrak {I} = [{\tilde{w}}^{ij}]_{r\times r}\) and \({\tilde{w}}^{ij} = \frac{1}{2}{{\mathrm{tr}}}({\varvec{\Gamma }}\mathbf {G}_i{\varvec{\Gamma }}\mathbf {G}_j)\) (see (1.21) in Jiang 2007). One can see that \({\hat{w}}^{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^{-4}{\hat{w}}^{ij}(\mathbf {y})\), \({\hat{w}}_{ij}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^4{\hat{w}}_{ij}(\mathbf {y})\), \({\hat{{\varvec{\Lambda }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Lambda }}}}(\mathbf {y})\), and \({\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}) = c^2{\hat{{\varvec{\Phi }}}}_{\mathrm {A}}(\mathbf {y})\). Now

$$\begin{aligned} F_{{\mathrm {KR}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0)&= \frac{1}{\ell } [\mathbf {L}'{\hat{{\varvec{\beta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0)]' [\mathbf {L}'{\hat{{\varvec{\Phi }}}}_\mathrm {A}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0)\mathbf {L}]^{-1} [\mathbf {L}'{\hat{{\varvec{\beta }}}}(c\mathbf {y}+ \mathbf {X}\mathbf {b}_0)]\\&= \frac{1}{\ell } [c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]' [\mathbf {L}'c^2{\hat{{\varvec{\Phi }}}}_\mathrm {A}(\mathbf {y})\mathbf {L}]^{-1} [c\mathbf {L}'{\hat{{\varvec{\beta }}}}(\mathbf {y})]\\&= F_{{\mathrm {KR}}}(\mathbf {y}). \end{aligned}$$

A.14 Tables of average values of simulated m and \(\lambda \)

The average values of the simulated m and \(\lambda \) are shown in Tables 4 and 5.

Table 4 The observed averages of the denominator degrees of freedom \(m^\#\), \(m^*\), \(m^\dag \) and \(m^\ddag \) for the four test procedures applied to the well-behaved data sets from among 10,000 data sets generated from each of 40 models (8 designs \(\times \) 5 values of \(\rho \))
Table 5 The observed averages of the scale factors \(\lambda ^\#\), \(\lambda ^*\), \(\lambda ^\dag \) and \(\lambda ^\ddag \) for the four test procedures applied to the well-behaved data sets from among 10,000 data sets generated from each of 40 models (8 designs \(\times \) 5 values of \(\rho \))


Cite this article

Alnosaier, W., Birkes, D. Inner workings of the Kenward–Roger test. Metrika 82, 195–223 (2019). https://doi.org/10.1007/s00184-018-0669-9
