Testing for center effects on survival and competing risks outcomes using pseudo-value regression

Wang, Yanzhi; Logan, Brent R.

doi:10.1007/s10985-018-9443-6

Published: 05 July 2018

Lifetime Data Analysis, volume 25, pages 206–228 (2019)

Abstract

In multi-center studies, the presence of a cluster effect leads to correlation among outcomes within a center and requires different techniques to handle such correlation. Testing for a cluster effect can serve as a pre-screening step to help guide the researcher towards the appropriate analysis. With time to event data, score tests have been proposed which test for the presence of a center effect on the hazard function. However, sometimes researchers are interested in directly modeling other quantities such as survival probabilities or cumulative incidence at a fixed time. We propose a test for the presence of a center effect acting directly on the quantity of interest using pseudo-value regression, and derive the asymptotic properties of our proposed test statistic. We examine the performance of our proposed test through simulation studies in both survival and competing risks settings. The proposed test may be more powerful than tests based on the hazard function in settings where the center effect is time-varying. We illustrate the test using a multicenter registry study of survival and competing risks outcomes after hematopoietic cell transplantation.
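As a minimal sketch of what a pseudo-value is in the survival setting, the block below computes jackknife pseudo-values for the survival probability at a fixed time, using the usual pseudo-observation construction $n\hat{\theta} - (n-1)\hat{\theta}^{(-i)}$ with the Kaplan-Meier estimator. The function names `km_survival` and `pseudo_values` are hypothetical, and this is not the authors' supplementary R code; in the competing risks setting one would presumably replace the Kaplan-Meier estimate with an estimate of the cumulative incidence.

```python
import numpy as np

def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & event)
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_values(time, event, t):
    """Jackknife pseudo-values n*theta_hat - (n-1)*theta_hat_(-i) for S(t)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t)
    keep = np.ones(n, dtype=bool)
    out = np.empty(n)
    for i in range(n):
        keep[i] = False  # leave observation i out
        out[i] = n * full - (n - 1) * km_survival(time[keep], event[keep], t)
        keep[i] = True
    return out
```

Without censoring, each pseudo-value reduces to the indicator that the subject survives beyond $t$, which is why pseudo-values can stand in for the incompletely observed outcome in a regression model.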

References

Andersen PK, Hansen MG, Klein JP (2004) Regression analysis of restricted mean survival time based on pseudo-observations. Lifetime Data Anal 10:335–350. https://doi.org/10.1007/s10985-004-4771-0
Andersen PK, Klein JP, Rosthøj S (2003) Generalised linear models for correlated pseudo-observations, with applications to multi-state models. Biometrika 90(1):15–27. https://doi.org/10.1093/biomet/90.1.15
Andersen PK, Perme MP (2010) Pseudo-observations in survival analysis. Stat Methods Med Res 19(1):71–99. https://doi.org/10.1177/0962280209105020
Commenges D, Andersen PK (1995) Score test of homogeneity for survival data. Lifetime Data Anal 1:145–156
Fine JP, Gray RJ (1999) A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc 94(446):496–509
Graw F, Gerds TA, Schumacher M (2009) On pseudo-values for regression analysis in multi-state models. Lifetime Data Anal 15:241–255
Gray RJ (1995) Tests for variation over groups in survival data. J Am Stat Assoc 90:198–203
Jacqmin-Gadda H, Commenges D (1995) Tests of homogeneity for generalized linear models. J Am Stat Assoc 90:1237–1246
Katsahian S, Boudreau C (2011) Estimating and testing for center effects in competing risks. Stat Med 30:1608–1617
Klein JP, Andersen PK (2005) Regression modeling of competing risks data based on pseudovalues of the cumulative incidence function. Biometrics 61(1):223–229. https://doi.org/10.1111/j.0006-341X.2005.031209.x
Klein JP, Logan RB, Harhoff M, Andersen PK (2007) Analyzing survival curves at a fixed point in time. Stat Med 26:4505–4519. https://doi.org/10.1002/sim.2864
Liang KY (1987) A locally most powerful test for homogeneity with many strata. Biometrika 74(2):259–264. https://doi.org/10.1093/biomet/74.2.259
Lin X (1997) Variance component testing in generalised linear models with random effects. Biometrika 84(2):309–326. https://doi.org/10.1093/biomet/84.2.309
Logan BR, Wang T (2014) Chap 10: Pseudo-value regression models. In: Klein JP, van Houwelingen HC, Ibrahim JG, Scheike TH (eds) Handbook of survival analysis. CRC Press, Boca Raton
Logan BR, Zhang MJ, Klein JP (2011) Marginal models for clustered time-to-event data with competing risks using pseudovalues. Biometrics 67:1–7
Satten GA, Datta S (2001) The Kaplan–Meier estimator as an inverse-probability-of-censoring weighted average. Am Stat 55:207–210
Scheike TH, Zhang M-J, Gerds T (2008) Predicting cumulative incidence probability by direct binomial regression. Biometrika 95:205–220
van Houwelingen H, Putter H (2008) Dynamic predicting by landmarking as an alternative for multi-state modeling: an application to acute lymphoid leukemia data. Lifetime Data Anal 14:447–463
van Houwelingen H, Putter H (2015) Comparison of stopped Cox regression with direct methods such as pseudo-values and binomial regression. Lifetime Data Anal 21:180–196

Author information

Authors and Affiliations

Division of Research Services/Department of Medicine, University of Illinois College of Medicine at Peoria, 1 Illini Dr., Peoria, IL, 61605, USA
Yanzhi Wang
Division of Biostatistics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI, 53226, USA
Brent R. Logan

Corresponding author

Correspondence to Yanzhi Wang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 3 KB)

Appendix: Asymptotic distribution of the proposed test statistic

Here we assume that the cluster sizes $n_k$ and the covariate vectors ${\varvec{x}}_{kj}$ belong to a bounded set. We also assume standard regularity conditions on the risk set, namely that there exists a function $r^c(s)$ defined on $[0, t]$ with $\inf _{s \in [0,t]} r^c(s) >0$ such that

$$\begin{aligned} \sup _{s \in [0,t]}\left| a_n^{-2}{R^c}^{(n)}(s)-r^c(s)\right| \rightarrow _p 0 \qquad \text {as} \quad n \rightarrow \infty , \end{aligned}$$

where ${R^c}^{(n)}(s)$ denotes the number at risk at time $s$ and $\{a_n\}$ is a sequence of positive constants.

We derive the asymptotic distribution of the proposed test for the competing risks setting, and note that the survival setting can be obtained as a special case. Before we derive the mean, variance, and distribution of the pseudo-value test statistic under $H_0$, we first present two lemmas characterizing regularity conditions on the $\varphi _{kj}$.

Lemma 1

Under $H_0$, ${\text {var}}(\varphi _{kj})$ is bounded.

Proof of Lemma 1

$$\begin{aligned} {\text {var}}(\varphi _{kj})=&{\text {var}}\left\{ \frac{\Delta _{kj}N_{kj}(t)}{G(T_{kj})}\right\} \end{aligned} \tag{3}$$

$$\begin{aligned}&+{\text {var}}\left\{ \int _0^{T_{kj}}\frac{P(T\le t, \epsilon = 1 | T\ge u)}{G(u)} d M_{kj}^c (u) \right\} \end{aligned} \tag{4}$$

$$\begin{aligned}&+ 2{\text {cov}}\left\{ \frac{\Delta _{kj}N_{kj}(t)}{G(T_{kj})} , \int _0^{T_{kj}}\frac{P(T\le t, \epsilon = 1 | T\ge u)}{G(u)} d M_{kj}^c (u) \right\} . \end{aligned} \tag{5}$$

The term in (3) is bounded since $\Delta _{kj}N_{kj}(t)$ is either 0 or 1 and $1/G(T_{kj})$ is bounded under the regularity conditions on the risk set. The term in (4) can be written as

$$\begin{aligned}&{\text {var}}\left\{ \int _0^{T_{kj}}\frac{P(T\le t, \epsilon = 1 | T\ge u)}{G(u)} d M_{kj}^c (u) \right\} \\&\quad =\int _0^{T_{kj}}\frac{\{P(T\le t, \epsilon = 1 | T\ge u)\}^2}{G^2(u)} I(T_{kj} \ge u) \lambda ^c(u)d u \\&\quad \le \frac{1}{G^2(T_{kj})}\Lambda ^c(T_{kj}), \end{aligned}$$

which is bounded due to the regularity conditions on the risk set.
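The inequality in the last line can be spelled out as follows. This is a sketch of the omitted step, assuming only that the conditional probability is at most 1, that $G$ is non-increasing so $G(u) \ge G(T_{kj})$ for $u \le T_{kj}$, and that $\Lambda ^c(s)=\int _0^s \lambda ^c(u)du$ is the cumulative hazard associated with $\lambda ^c$:

$$\begin{aligned} \int _0^{T_{kj}}\frac{\{P(T\le t, \epsilon = 1 | T\ge u)\}^2}{G^2(u)} \lambda ^c(u)d u \le \frac{1}{G^2(T_{kj})}\int _0^{T_{kj}} \lambda ^c(u)d u = \frac{\Lambda ^c(T_{kj})}{G^2(T_{kj})}. \end{aligned}$$

The indicator $I(T_{kj} \ge u)$ equals 1 over the whole range of integration and therefore drops out.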

Finally, because the martingale integral has mean zero, the covariance in (5) reduces to the expectation of the product, which can be rewritten as

$$\begin{aligned}&{\text {E}}\left\{ \frac{\Delta _{kj}N_{kj}(t)}{G(T_{kj})} \int _0^{T_{kj}}\frac{P(T\le t, \epsilon = 1 | T\ge u)}{G(u)} d M_{kj}^c (u) \right\} \\&\quad =-{\text {E}}\bigg \{\frac{I(T_{kj} \le t)I(\widetilde{T_{kj}} \le C_{kj})}{G(T_{kj})} \int _0^{T_{kj}}\frac{P(T\le t, \epsilon = 1 | T\ge u)}{G(u)}I(T_{kj}\ge u)\lambda ^c(u)du \bigg \}, \end{aligned} \tag{6}$$

which is bounded in absolute value by $\Lambda ^c(T_{kj})/G^2(T_{kj})$. Therefore ${\text {var}}(\varphi _{kj})$ is bounded. $\square $

The second lemma builds on Lemma 1 to establish the additional regularity conditions needed for the asymptotic distribution of the proposed test statistic; because the results are straightforward given Lemma 1, it is stated without proof.

Lemma 2

Defining

$$\begin{aligned} W_k =\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}\left( \varphi _{kj}-\mu _{kj}^0\right) \left( \varphi _{kj'}-\mu _{kj'}^0\right) , \end{aligned}$$

the following conditions hold:

(a) $\sum _{k=1}^{\infty }\left\{ {\text {var}}(\partial W_k/\partial \beta _l)/k^2\right\} < \infty $ for all $l=1,\dots ,p$.
(b) ${\text {E}}\{\partial ^2W_k/\partial \beta _l \partial \beta _m\}$ is bounded for all $l,m=1,\dots ,p$ and for all $k$.
(c) $\sum _{k=1}^{\infty }\left\{ {\text {var}}(\partial ^2W_k/\partial \beta _l \partial \beta _m)/k^2\right\} < \infty $ for all $l,m=1,\dots ,p$.
(d) For all $\epsilon >0$, $\sum _{k=1}^K\int _{\vert z\vert \ge \epsilon }z^2 dF_k(z) \rightarrow 0$ as $K \rightarrow \infty $, where $z_k=W_k/I^{1/2}$ has distribution function $F_k$ (Lindeberg’s condition).
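To make the definition of $W_k$ concrete, the following sketch computes the within-cluster cross-product sums from a vector of residuals $\varphi _{kj}-\mu _{kj}^0$ and a vector of cluster labels. The function name `cluster_cross_products` and its inputs are hypothetical; the code relies on the algebraic identity $\sum _{j}\sum _{j'\ne j} r_j r_{j'} = (\sum _j r_j)^2 - \sum _j r_j^2$ rather than on any implementation from the paper.

```python
import numpy as np

def cluster_cross_products(resid, cluster):
    """Return {cluster label: W_k}, where W_k = sum over j != j' of r_kj * r_kj'."""
    resid = np.asarray(resid, dtype=float)
    cluster = np.asarray(cluster)
    out = {}
    for k in np.unique(cluster):
        r = resid[cluster == k]
        # Sum over ordered pairs j != j' equals (sum r)^2 - sum r^2.
        out[k.item()] = float(r.sum() ** 2 - np.sum(r ** 2))
    return out

def scaled_sum(w_by_cluster):
    """K^{-1/2} times the sum of the W_k over the K clusters."""
    values = list(w_by_cluster.values())
    return sum(values) / np.sqrt(len(values))
```

A cluster with a single member contributes $W_k = 0$, since it has no pairs with $j \ne j'$.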

Based on the conditions established in Lemmas 1 and 2, we can prove the following theorem on the asymptotic distribution of the score test statistic under $H_0$.

Theorem

Under $H_0$ and the regularity conditions described above,

$$\begin{aligned} \frac{T_{PC}({{\widehat{\varvec{\beta }}}})}{\sqrt{I}} \overset{D}{\rightarrow } N(0,1), \end{aligned}$$

where ${\widehat{\varvec{\beta }}}$ is a consistent estimator of $\varvec{\beta }$ under $H_0$.

Proof

First we show that $T_{PC}$ is asymptotically equivalent to $K^{-1/2}W$, where

$$\begin{aligned} W=\sum _{k=1}^K W_k=\sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}\left( \varphi _{kj}-\mu _{kj}^0\right) \left( \varphi _{kj'}-\mu _{kj'}^0\right) . \end{aligned}$$

The test statistic can be expanded as

$$\begin{aligned} T_{PC}&= K^{-1/2}\sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}\left\{ \varphi _{kj}-\mu _{kj}^0+O_p(K^{-1/2})\right\} \left\{ \varphi _{kj'}-\mu _{kj'}^0+O_p(K^{-1/2})\right\} \\&= K^{-1/2}W + O_p(1)2K^{-1} \sum _{k=1}^K(n_k-1)\sum _{j=1}^{n_k}\left( \varphi _{kj}-\mu _{kj}^0\right) +O_p\left( K^{-1/2}\right) . \end{aligned}$$

Note that ${\text {E}}\{\sum _{j=1}^{n_k}(\varphi _{kj}-\mu _{kj}^0)\}=0$. Then by the law of large numbers, since $n_k$ is bounded and, by Lemma 1, ${\text {var}}(\varphi _{kj}-\mu _{kj}^0)$ is bounded,

$$\begin{aligned} K^{-1} \sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}\left( \varphi _{kj'}-\mu _{kj'}^0\right) =o_p(1). \end{aligned}$$
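The double sum in this display is the same quantity as the cross term in the expansion of $T_{PC}$. As a brief check, each index $j'$ appears in the inner sum for exactly $n_k-1$ values of $j$, so that

$$\begin{aligned} \sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}\left( \varphi _{kj'}-\mu _{kj'}^0\right) =(n_k-1)\sum _{j=1}^{n_k}\left( \varphi _{kj}-\mu _{kj}^0\right) . \end{aligned}$$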

Therefore we have

$$\begin{aligned} T_{PC}=K^{-1/2}W+o_p(1), \end{aligned}$$

and the two statistics will have the same limiting distribution. The mean of the limiting distribution of $K^{-1/2}W$ is 0 because under $H_0$, $\varphi _{kj}$ is independent of $\varphi _{kj'}$ for $j \ne j'$, so that

$$\begin{aligned} E(K^{-1/2}W)=K^{-1/2}\sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}E\left( \varphi _{kj}-\mu _{kj}^0\right) E\left( \varphi _{kj'}-\mu _{kj'}^0\right) =0. \end{aligned}$$

To derive the variance of the limiting distribution first note that since $\varphi _{kj}$ and $\varphi _{kj'}$ are independent under $H_0$ for $j \ne j'$, $ {\text {E}}[(\varphi _{kj}-\mu _{kj}^0)(\varphi _{kj'}-\mu _{kj'}^0)]=0$ and

$$\begin{aligned} {\text {cov}}\left[ \left( \varphi _{kj}-\mu _{kj}^0\right) \left( \varphi _{kj'}-\mu _{kj'}^0\right) , \left( \varphi _{kl}-\mu _{kl}^0\right) \left( \varphi _{kl'}-\mu _{kl'}^0\right) \right] =0, \end{aligned}$$

for $(l,l') \ne (j,j')$. Then

$$\begin{aligned} {\text {var}}(K^{-1/2}W)&=K^{-1}\sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}{\text {var}}\left[ \left( \varphi _{kj}- \mu _{kj}^0\right) \left( \varphi _{kj'}-\mu _{kj'}^0\right) \right] \\&=K^{-1}\sum _{k=1}^K\sum _{j=1}^{n_k}\sum _{j'\ne j}^{n_k}{\text {E}}\left[ \left( \varphi _{kj}-\mu _{kj}^0\right) ^2\left( \varphi _{kj'}-\mu _{kj'}^0\right) ^2\right] \\&=I. \end{aligned}$$

The proof of asymptotic normality of the test statistic closely follows the derivation in Jacqmin-Gadda and Commenges (1995), applied to $W$ using the regularity conditions in Lemma 2, so the details are omitted. $\square $
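As an illustration of how the limiting result might be used, the sketch below standardizes a value of the statistic by $\sqrt{I}$ and refers it to the standard normal distribution. The function names and inputs are hypothetical, and the choice of a one-sided upper-tail p-value is an assumption made here (it is common for variance-component score tests) rather than something stated in this appendix.

```python
import math

def standardized_statistic(t_pc, info):
    """Return T_PC / sqrt(I); `info` plays the role of the variance term I."""
    if info <= 0:
        raise ValueError("info must be positive")
    return t_pc / math.sqrt(info)

def upper_tail_p(z):
    """Upper-tail standard normal probability P(Z >= z).

    Assumption: the test rejects for large positive values only.
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

In use, one would pass the observed statistic and an estimate of $I$ evaluated at ${\widehat{\varvec{\beta }}}$ to `standardized_statistic` and then convert the result with `upper_tail_p`.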

About this article

Cite this article

Wang, Y., Logan, B.R. Testing for center effects on survival and competing risks outcomes using pseudo-value regression. Lifetime Data Anal 25, 206–228 (2019). https://doi.org/10.1007/s10985-018-9443-6

Received: 28 March 2017
Accepted: 29 June 2018
Published: 05 July 2018
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s10985-018-9443-6
