Quantifying the average of the time-varying hazard ratio via a class of transformations

Lifetime Data Analysis

Abstract

The hazard ratio derived from the Cox model is a commonly used summary statistic to quantify a treatment effect with a time-to-event outcome. The proportional hazards assumption of the Cox model, however, is frequently violated in practice and many alternative models have been proposed in the statistical literature. Unfortunately, the regression coefficients obtained from different models are often not directly comparable. To overcome this problem, we propose a family of weighted hazard ratio measures that are based on the marginal survival curves or marginal hazard functions, and can be estimated using readily available output from various modeling approaches. The proposed transformation family includes the transformations considered by Schemper et al. (Statist Med 28:2473–2489, 2009) as special cases. In addition, we propose a novel estimate of the weighted hazard ratio based on the maximum departure from the null hypothesis within the transformation family, and develop a Kolmogorov–Smirnov type of test statistic based on this estimate. Simulation studies show that when the hazard functions of two groups either converge or diverge, this new estimate yields a more powerful test than tests based on the individual transformations recommended in Schemper et al. (Statist Med 28:2473–2489, 2009), with a similar magnitude of power loss when the hazards cross. The proposed estimates and test statistics are applied to a colorectal cancer clinical trial.

References

  • Abadi A, Amanpour F, Bajdik C, Yavari P (2012) Breast cancer survival analysis: applying the generalized gamma distribution under different conditions of the proportional hazards and accelerated failure time assumptions. Int J Prev Med 3(9):644

  • Abrahamowicz M, Mackenzie T, Esdaile JM (1996) Time-dependent hazard ratio: modeling and hypothesis testing with application in lupus nephritis. J Am Statist Assoc 91(436):1432–1439

  • Banerjee T, Chen MH, Dey DK, Kim S (2007) Bayesian analysis of generalized odds-rate hazards models for survival data. Lifetime Data Anal 13(2):241–260

  • Cox D (1972) Regression models and life-tables (with Discussion). J Royal Statist Soc Ser B 34:187–220

  • Fan J, Yao Q (2003) Nonlinear time series. Springer, Berlin

  • Gill RD, van der Vaart AW (1993) Non- and semi-parametric maximum likelihood estimators and the von Mises method: II. Scand J Statist 20(4):271–288

  • Gray RJ (1992) Flexible methods for analyzing survival data using splines, with applications to breast cancer prognosis. J Am Statist Assoc 87(420):942–951

  • Hastie T, Tibshirani R (1993) Varying-coefficient models. J Royal Statist Soc Ser B (Methodological) 55(4):757–796

  • Hess KR (1994) Assessing time-by-covariate interactions in proportional hazards regression models using cubic spline functions. Statist Med 13(10):1045–1062

  • Kalbfleisch JD, Prentice RL (1981) Estimation of the average hazard ratio. Biometrika 68:105–112

  • Kooperberg C, Stone CJ, Truong YK (1995) Hazard regression. J Am Statist Assoc 90(429):78–94

  • Kosorok MR (2007) Introduction to empirical processes and semiparametric inference. Springer, Berlin

  • Lininger L, Gail MH, Green SB, Byar DP (1979) Comparison of four tests for equality of survival curves in the presence of stratification and censoring. Biometrika 66:417–428

  • Moeschberger ML, Klein JP (2003) Survival analysis: techniques for censored and truncated data. Springer, Berlin

  • Pepe MS, Fleming TR (1989) Weighted Kaplan–Meier statistics: a class of distance tests for censored survival data. Biometrics 45(2):497–507

  • Robins J, Tsiatis A (1992) Semiparametric estimation of an accelerated failure time model with time-dependent covariates. Biometrika 79:311–320

  • Royston P, Parmar M (2002) Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statist Med 21:2175–2197

  • Schemper M, Wakounig S, Heinze G (2009) The estimation of average hazard ratios by weighted Cox regression. Statist Med 28:2473–2489

  • Shen Y, Fleming TR (1997) Weighted mean survival test statistics: a class of distance tests for censored survival data. J Royal Statist Soc 59(1):269–280

  • Therneau T, Grambsch P (2000) Modeling survival data: extending the Cox model. Springer, New York

  • Wei L (1992) The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Statist Med 11:1871–1879

  • Xu R, O’Quigley J (2000) Estimating average regression effect under non-proportional hazards. Biostatistics 1:423–439

  • Zeng D, Chen Q, Chen MH, Ibrahim J, Group AR (2012) Estimating treatment effects with treatment crossovers via semi-competing risks models: an application to a colorectal cancer study. Biometrika 99:167–184

Author information

Corresponding author

Correspondence to Joseph G. Ibrahim.

Appendix

We state here the conditions needed to establish consistency of \(\hat{\theta }_G\) and to derive its asymptotic distribution.

  1. (C.1) \(G\) is thrice-continuously differentiable and strictly increasing. Additionally, \(\varOmega (t;s,v)\) is twice-continuously differentiable in \((t,s,v)\).

  2. (C.2) \(\sqrt{n}(\hat{S}_1-S_{10},\hat{S}_0-S_{00})\) converges in distribution to a bivariate mean-zero Gaussian process, denoted by \((\mathcal{G}_1, \mathcal{G}_0)\), in \(BV[0,\tau ]\times BV[0,\tau ]\), where \(BV[0,\tau ]\) denotes the space of functions that have finite total variation on \([0,\tau ]\) and \(S_{k0}\) is the true survival function in treatment arm \(k\). Here, \(\tau \) is the study duration.

  3. (C.3) \(S_{k0}\) is strictly decreasing and thrice-continuously differentiable on \([0,\tau ]\). Moreover, \(S_{k0}(\tau )>0\).

  4. (C.4) The kernel function \(K(x)\) is differentiable, symmetric with respect to 0, and has compact support on \([-1,1]\).

  5. (C.5) The bandwidth \(a_n\) satisfies \(n a_n^2\rightarrow \infty \) and \(na_n^4\rightarrow 0\).

  6. (C.6) Conditional on the data, \(\sqrt{n} (\hat{S}_1^*-\hat{S}_1, \hat{S}_0^*-\hat{S}_0)\) converges in distribution to \((\mathcal{G}_1, \mathcal{G}_0)\), where \((\hat{S}_1^*, \hat{S}_0^*)\) are resampled statistics for \((\hat{S}_1, \hat{S}_0)\).
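As an illustration of condition (C.5), consider a polynomial bandwidth \(a_n=c\,n^{-\gamma }\) with a constant \(c>0\). This parametric form is an assumption made here for concreteness and is not prescribed by the conditions. Then

$$\begin{aligned} na_n^2=c^2n^{1-2\gamma }\rightarrow \infty \iff \gamma <1/2, \qquad na_n^4=c^4n^{1-4\gamma }\rightarrow 0 \iff \gamma >1/4, \end{aligned}$$

so any \(\gamma \in (1/4,1/2)\), for example \(a_n=n^{-1/3}\), satisfies both requirements.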

Condition (C.6) is an assumption regarding the consistency and asymptotic distribution of the estimates generated by the bootstrap procedure. Chapter 20 of Kosorok (2007) validates this assumption for several survival modeling techniques including the Cox proportional hazards model.
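To make the resampled statistics \((\hat{S}_1^*, \hat{S}_0^*)\) in (C.6) concrete, the following Python sketch draws bootstrap replicates of a survival curve within one treatment arm. It assumes a Kaplan–Meier estimator and nonparametric resampling of subjects with replacement. Both choices, and the function names `kaplan_meier` and `bootstrap_survival`, are illustrative assumptions; the conditions above allow other estimators and resampling schemes.

```python
import numpy as np

def kaplan_meier(time, event, eval_times):
    """Kaplan-Meier survival estimate evaluated at eval_times."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])           # sorted distinct event times
    at_risk = np.array([(time >= u).sum() for u in event_times])
    deaths = np.array([((time == u) & event).sum() for u in event_times])
    steps = np.cumprod(1.0 - deaths / at_risk)     # product-limit values
    # Number of event times <= each evaluation time (0 means none yet).
    idx = np.searchsorted(event_times, eval_times, side="right")
    return np.concatenate(([1.0], steps))[idx]

def bootstrap_survival(time, event, eval_times, n_boot=200, seed=0):
    """Hypothetical helper: one row of resampled survival values per replicate."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    rng = np.random.default_rng(seed)
    n = len(time)
    out = np.empty((n_boot, len(eval_times)))
    for b in range(n_boot):
        pick = rng.integers(0, n, size=n)          # resample subjects with replacement
        out[b] = kaplan_meier(time[pick], event[pick], eval_times)
    return out
```

Applying `bootstrap_survival` separately to each arm gives replicates from which the spread of \(\sqrt{n}(\hat{S}_k^*-\hat{S}_k)\) can be examined empirically.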

Lemma 1

Under Conditions (C.1)–(C.5), \(\sup _{t\in [0,\tau ]}\vert \hat{h}_k(t)-h_{k0}(t)\vert \rightarrow _p 0, \ \ k=0,1,\) where \(h_{k0}\) denotes the true hazard function in treatment arm \(k\).

Proof (of Lemma 1)

First, we note that \(\hat{h}_k(t)=-a_n^{-1}\int K(x) d\log \hat{S}_k(t+a_n x).\) By carrying out an integration by parts, we can rewrite \(\hat{h}_k(t)\) as \(\hat{h}_k(t)=\int a_n^{-1}\log \hat{S}_k(t+a_n x) K'(x)dx.\) Moreover, we can continuously extend the definitions of \(\hat{S}_k\) and \(S_{k0}\) to \([-a_n,\tau +a_n]\) so that (C.2) still holds. Thus, (C.2) implies \(\sup _{t\in [-a_n, \tau +a_n]} \vert \hat{S}_k(t)- S_{k0}(t)\vert =O_p(n^{-1/2}).\) (C.3) further gives \(\sup _{t\in [-a_n, \tau +a_n]} \vert \log \hat{S}_k(t)-\log S_{k0}(t)\vert =O_p(n^{-1/2}).\) Therefore,

$$\begin{aligned} \sup _{t\in [0,\tau ]}\vert \hat{h}_k(t)-h_{k0}(t)\vert&\le \int a_n^{-1}\sup _{t\in [-a_n, \tau +a_n]} \vert \log \hat{S}_k(t)- \log S_{k0}(t)\vert \,\vert K'(x)\vert dx\\&\le O_p(1/\sqrt{na_n^2}), \end{aligned}$$

which goes to 0 by condition (C.5). This proves the lemma. \(\square \)
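The kernel-smoothed hazard estimate used in Lemma 1, \(\hat{h}_k(t)=-\int K_{a_n}(s-t)d\log \hat{S}_k(s)\) with \(K_{a_n}(u)=a_n^{-1}K(u/a_n)\), can be sketched numerically as follows. The sketch assumes the survival estimate is available on a time grid and uses the biweight kernel, which is one kernel satisfying (C.4). The kernel choice and the function names are assumptions made for illustration.

```python
import numpy as np

def biweight(x):
    """Biweight kernel: differentiable, symmetric about 0, support [-1, 1]."""
    return np.where(np.abs(x) <= 1.0, (15.0 / 16.0) * (1.0 - x**2) ** 2, 0.0)

def smoothed_hazard(t, grid, surv, bandwidth, kernel=biweight):
    """Approximate h(t) = -int K_a(s - t) d log S(s), K_a(u) = K(u / a) / a.

    grid : increasing time points at which the survival estimate is known
    surv : survival estimate on grid (must be strictly positive)
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    grid = np.asarray(grid, dtype=float)
    # Increments of the cumulative hazard -log S between adjacent grid points.
    increments = -np.diff(np.log(surv))
    mids = 0.5 * (grid[:-1] + grid[1:])
    # Kernel weight of each increment relative to each evaluation time.
    weights = kernel((mids[None, :] - t[:, None]) / bandwidth) / bandwidth
    return weights @ increments

# Usage: an exponential survival curve with rate 0.5 on a fine grid.
grid = np.linspace(0.0, 10.0, 20001)
surv = np.exp(-0.5 * grid)
print(smoothed_hazard([2.0, 5.0], grid, surv, bandwidth=0.5))
```

Evaluation times closer than one bandwidth to either end of the grid lose part of the kernel mass, which is why the proof extends \(\hat{S}_k\) and \(S_{k0}\) to \([-a_n,\tau +a_n]\).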

Proof (of Theorem 1)

Using Lemma 1 and noting \(\inf _{t\in [0,\tau ]} h_{00}(t)>0\), we obtain that, uniformly in \(t\in [0,\tau ]\), as \(n \rightarrow \infty \), \(G\left\{ \frac{\hat{h}_1(t)}{\hat{h}_0(t)}\right\} \rightarrow _p G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \) and \(\varOmega \{t; \hat{S}_0(t), \hat{S}_1(t)\}\rightarrow _p \varOmega \{t;S_{00}(t), S_{10}(t)\}.\) Thus, it is clear that \(\hat{\theta }_G \rightarrow _p \theta _G\) as \(n \rightarrow \infty \). This establishes consistency.
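The plug-in form of the estimate, \(\hat{\theta }_G=G^{-1}\left[ \int G\{\hat{h}_1(t)/\hat{h}_0(t)\}\varOmega \{t;\hat{S}_0(t),\hat{S}_1(t)\}dt\right] \), can be sketched numerically as below. The sketch assumes the hazards and the weight have already been evaluated on a common time grid, normalizes the weight to integrate to one, and defaults to \(G=\log \). The default transformation, the normalization step, and the function names are illustrative assumptions, not choices fixed by the theorem.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for values y on the grid x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def weighted_hazard_ratio(times, h1, h0, weight, G=np.log, G_inv=np.exp):
    """Plug-in theta_G = G^{-1}( int G(h1 / h0) * weight dt ).

    times  : increasing grid on [0, tau]
    h1, h0 : hazard estimates for the two arms on that grid (h0 > 0)
    weight : values of the weight function on the grid
    """
    ratio = np.asarray(h1, dtype=float) / np.asarray(h0, dtype=float)
    w = np.asarray(weight, dtype=float)
    w = w / trapezoid(w, times)        # assumption: rescale weight to integrate to 1
    return float(G_inv(trapezoid(G(ratio) * w, times)))

# Usage: hazard ratio rising linearly from 1 to 3, uniform weight.
times = np.linspace(0.0, 1.0, 101)
h0 = np.ones_like(times)
h1 = 1.0 + 2.0 * times
print(weighted_hazard_ratio(times, h1, h0, np.ones_like(times)))
```

With a constant hazard ratio every strictly increasing \(G\) returns that constant, so different transformations only disagree when the ratio varies over time.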

To derive the asymptotic distribution of \(\hat{\theta }_G\), by the mean-value theorem, we obtain

$$\begin{aligned}&n^{1/2}(\hat{\theta }_G-\theta _G)\\&\quad =n^{1/2} \left( (G^{-1})'\left[ \int G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \varOmega \{t; S_{00}(t), S_{10}(t)\}dt\right] +o_p(1)\right) \\&\qquad \times \left[ \!\int \! G\left\{ \frac{\hat{h}_{1}(t)}{\hat{h}_{0}(t)}\right\} \varOmega \{t; \hat{S}_{0}(t), \hat{S}_{1}(t)\}dt\!-\! \int \! G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\!\right\} \varOmega \{t; S_{00}(t), S_{10}(t)\}dt\!\right] \!. \end{aligned}$$

Using the mean-value theorem again, we have

$$\begin{aligned}&G\left\{ \frac{\hat{h}_{1}(t)}{\hat{h}_{0}(t)}\right\} \varOmega \{t; \hat{S}_{0}(t), \hat{S}_{1}(t)\}- G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \varOmega \{t; S_{00}(t), S_{10}(t)\}\\&\quad =\left[ G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \varOmega \{t; S_{00}(t), S_{10}(t)\}+o_p(1)\right] \\&\qquad \times \left[ \frac{\hat{h}_1(t)-h_{10}(t)}{h_{00}(t)}-\frac{h_{10}(t)\{\hat{h}_0(t)-h_{00}(t)\}}{h_{00}(t)^2}\right] \\&\qquad +\left[ G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \frac{\partial }{\partial S_0(t)} \varOmega \{t; S_{00}(t), S_{10}(t)\}+o_p(1)\right] \{\hat{S}_0(t)-S_{00}(t)\}\\&\qquad +\left[ G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} \frac{\partial }{\partial S_1(t)} \varOmega \{t; S_{00}(t), S_{10}(t)\}+o_p(1)\right] \{\hat{S}_1(t)-S_{10}(t)\}, \end{aligned}$$

where \(o_p(1)\) is a random element converging in probability to zero uniformly in \(t\in [0,\tau ]\).

For convenience, we denote \(H=(G^{-1})'\{\int Q_0(t)Q_1(t)dt\}\), \(Q_0(t)= G\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} , Q_1(t)= \varOmega \{t; S_{00}(t), S_{10}(t)\}\), \(Q_2(t)=\frac{\partial }{\partial S_0(t)} \varOmega \{t; S_{00}(t), S_{10}(t)\}\), and \(Q_3(t)=\frac{\partial }{\partial S_1(t)} \varOmega \{t; S_{00}(t), S_{10}(t)\}\). Then, after combining the above results, we obtain

$$\begin{aligned}&n^{1/2}(\hat{\theta }_G-\theta _G)\\&\quad =n^{1/2} \left\{ H+o_p(1)\right\} \left[ \int G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)\{\hat{h}_1(t)-h_{10}(t)\}/h_{00}(t) dt\right] \\&\qquad -\,n^{1/2} \left\{ H+o_p(1)\right\} \left[ \int G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)h_{10}(t)\{\hat{h}_0(t)-h_{00}(t)\}/h_{00}(t)^2 dt\right] \\&\qquad +\,n^{1/2} \left\{ H+o_p(1)\right\} \left[ \int Q_0(t)Q_3(t)\{\hat{S}_1(t)-S_{10}(t)\} dt\right] \\&\qquad +\,n^{1/2} \left\{ H+o_p(1)\right\} \left[ \int Q_0(t)Q_2(t)\{\hat{S}_0(t)-S_{00}(t)\} dt\right] . \end{aligned}$$

The last two terms on the right-hand side both take the form \(n^{1/2} \int [A(t)+o_p(1)](\hat{S}_k(t)-S_{k0}(t))dt\) for some bounded function \(A(t)\), so they converge to a normal distribution by condition (C.2). Thus, we focus only on the first two terms on the right-hand side, denoted (I) and (II). Using the definition of \(\hat{h}_1(t)\), we can rewrite the first term (I) as

$$\begin{aligned}&n^{1/2} \left\{ H+o_p(1)\right\} \int G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)\left\{ -\int K_{a_n}(s-t)d\log \hat{S}_1(s) \right. \\&\left. +\int K_{a_n}(s-t)d\log S_{10}(s)- \int K_{a_n}(s-t)d\log S_{10}(s)-h_{10}(t)\right\} /h_{00}(t) dt. \end{aligned}$$

Since

$$\begin{aligned}&- \int K_{a_n}(s-t)d\log S_{10}(s)-h_{10}(t)=\int K_{a_n}(s-t)h_{10}(s)ds-h_{10}(t) \\&\quad = \int K(x)\left\{ h_{10}(t+a_nx)-h_{10}(t)\right\} dx=O(a_n^2) \end{aligned}$$

and \(na_n^4\rightarrow 0\), (I) becomes

$$\begin{aligned}&n^{1/2} \left\{ H+o_p(1)\right\} \int G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t) \\&\quad \times \left[ -\int K_{a_n}(s-t)d\{\log \hat{S}_1(s) -\log S_{10}(s)\}\right] /h_{00}(t) dt+o_p(1), \end{aligned}$$

which is also equal to

$$\begin{aligned}&-n^{1/2} \left\{ H+o_p(1)\right\} \int _s d\{\log \hat{S}_1(s) -\log S_{10}(s)\} \\&\quad \times \int _t G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)K_{a_n}(s-t)/h_{00}(t)dt+o_p(1). \end{aligned}$$

However, since

$$\begin{aligned} \int _t G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)K_{a_n}(s-t)/h_{00}(t)dt =G'\left\{ \frac{h_{10}(s)}{h_{00}(s)}\right\} Q_1(s)/h_{00}(s)+O(a_n^2), \end{aligned}$$

it follows that

$$\begin{aligned} (I)=-n^{1/2} H \int _s G'\left\{ \frac{h_{10}(s)}{h_{00}(s)}\right\} Q_1(s)/h_{00}(s)d\{\log \hat{S}_1(s) -\log S_{10}(s)\}+o_p(1). \end{aligned}$$
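The \(O(a_n^2)\) remainders used in the derivation of (I) can be checked by a second-order Taylor expansion. As a sketch, assume the kernel is normalized so that \(\int K(x)dx=1\) (a normalization not stated explicitly in (C.4)) and let \(t^*\) denote an intermediate point between \(t\) and \(t+a_nx\). Then

$$\begin{aligned} \int K(x)\{h_{10}(t+a_nx)-h_{10}(t)\}dx&= a_nh_{10}'(t)\int xK(x)dx+\frac{a_n^2}{2}\int x^2K(x)h_{10}''(t^*)dx\\&= O(a_n^2), \end{aligned}$$

where the first-order term vanishes because \(K\) is symmetric about 0 by (C.4), and \(h_{10}''\) is bounded because \(S_{10}\) is thrice-continuously differentiable and bounded away from zero by (C.3). Multiplying by \(n^{1/2}\) gives a term of order \((na_n^4)^{1/2}\), which is negligible under (C.5). The same expansion, applied to the function \(G'\{h_{10}(t)/h_{00}(t)\}Q_1(t)/h_{00}(t)\) under the smoothness provided by (C.1) and (C.3), gives the remainder in the kernel convolution step.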

Similarly, we obtain

$$\begin{aligned} (II)\!=\!n^{1/2} H \int _s G'\left\{ \frac{h_{10}(s)}{h_{00}(s)}\right\} Q_1(s)h_{10}(s)/h_{00}(s)^2d\{\log \hat{S}_0(s)\! -\! \log S_{00}(s)\}\!+\!o_p(1). \end{aligned}$$

Thus, we have

$$\begin{aligned}&n^{1/2}(\hat{\theta }_G-\theta _G)\\&\quad =n^{1/2} \int A_1(t)d \{\log \hat{S}_1(t)-\log S_{10}(t)\} +n^{1/2} \int A_0(t)d\{\log \hat{S}_0(t)-\log S_{00}(t)\}\\&\qquad +\,n^{1/2} \int B_1(t) \{\hat{S}_1(t)-S_{10}(t)\}dt+n^{1/2} \int B_0(t)\{\hat{S}_0(t)-S_{00}(t)\}dt+o_p(1), \end{aligned}\tag{15}$$

where \(A_0(t)=H G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)h_{10}(t)/h_{00}(t)^2\), \(A_1(t)=-H G'\left\{ \frac{h_{10}(t)}{h_{00}(t)}\right\} Q_1(t)/h_{00}(t)\), \(B_0(t) =H Q_0(t)Q_2(t)\), and \(B_1(t)=H Q_0(t)Q_3(t)\). Hence, Theorem 1 follows from condition (C.2). \(\square \)

Proof of Theorem 2

Let the weight function \(\varOmega (t)\) be independent of \(S_0(t)\) and \(S_1(t)\) and satisfy \(\int \varOmega (t) dt=1\). We write the hazard ratio of the treatment arm relative to the control arm as \(h_1(t)/h_0(t)=1+\epsilon \lambda (t)\), where \(\lambda (t)\) is a function of \(t\). When \(h_1(t)\) is in the neighborhood of the null hypothesis \(h_1(t)=h_0(t)\), i.e., \(\epsilon \) is close to zero, the Taylor series expansion of \(\theta _a\) is given by

$$\begin{aligned} \theta _a&= G_a^{-1}\left\{ \int _0^\tau G_a(1) \varOmega (t) dt\right\} + (G_a^{-1})' \left\{ \int _0^\tau G_a(1) \varOmega (t) dt \right\} \times \int _0^\tau G_a'(1) \varOmega (t) \lambda (t) dt \times \epsilon + o(\epsilon ) \\&= 1+ (G_a^{-1})'\{G_a(1)\} \times G_a'(1)\int _0^\tau \varOmega (t) \lambda (t) dt \times \epsilon + o(\epsilon ). \end{aligned}$$

Moreover, according to (15), the variance of \(\hat{\theta }_a\) around the null hypothesis can be written as \((G_a'(t))^2 B\), where \(B\) is a positive value independent of \(a\). Therefore, the local power of \(\theta _a\) can be written as

$$\begin{aligned}&P\left( \left| \frac{\hat{\theta }_{a}-1}{sd(\hat{\theta }_{a})} \right| > Z_{1-\alpha /2} \mid h_0(t)=h_1(t) \right) \\&\quad = P\left( \left| (G_a^{-1})'\{G_a(1)\} \epsilon \int _0^\tau \varOmega (t) \lambda (t) dt + o(\epsilon )\right| > B Z_{1-\alpha /2}\right) . \end{aligned}$$

For the transformation family in (2) with \(a \in [-1,1]\), we can optimize the local power by maximizing \(|(G_a^{-1})'\{G_a(1)\}|=|(1+a)^{1+a}|\) for \(a \in [-1,1]\), resulting in an optimal value at \(a=1\).

When \(\int \varOmega (t) \lambda (t) dt=0\) for the crossing hazards case, we need a higher order Taylor series expansion, given by

$$\begin{aligned} \theta _a&= G_a^{-1}\left\{ \int _0^\tau G_a(1) \varOmega (t) dt\right\} + (G_a^{-1})' \left\{ \int _0^\tau G_a(1) \varOmega (t) dt \right\} \times \int _0^\tau G_a'(1) \varOmega (t) \lambda (t) dt \times \epsilon \\&\quad +\, (G_a^{-1})'' \left\{ \int _0^\tau G_a(1) \varOmega (t) dt \right\} \times \left\{ \int _0^\tau G_a'(1) \varOmega (t) \lambda (t) dt\right\} ^2 \times \epsilon ^2 \\&\quad +\, (G_a^{-1})' \left\{ \int _0^\tau G_a(1) \varOmega (t) dt \right\} \times \left\{ \int _0^\tau G_a''(1) \varOmega (t) \lambda ^2(t) dt \right\} \times \epsilon ^2 + o(\epsilon ^2) \\&= 1+ (G_a^{-1})'\{G_a(1)\} \times G_a''(1) \times \int _0^\tau \varOmega (t) \lambda ^2(t) dt \times \epsilon ^2 + o(\epsilon ^2). \end{aligned}$$

In this case, the local power of \(\theta _a\) is given by

$$\begin{aligned} P\left( \left| (G_a^{-1})'\{G_a(1)\} \times \frac{G_a''(1)}{G_a'(1)} \times \int _0^\tau \varOmega (t) \lambda ^2(t) dt \times \epsilon ^2 + o(\epsilon ^2)\right| > B Z_{1-\alpha /2}\right) \end{aligned}$$

where \(G_a''(x)=-\left( \frac{a+1}{a+x} \right) G_a'(x)\). Again, this local power is maximized at \(a=1\). \(\square \)

Proof of Theorem 3

The proof is based on the same linearization given in equation (15), except that on the right-hand side of equation (15) the expressions \(A_1(t), A_0(t), B_1(t)\), and \(B_0(t)\) are indexed by \(a \in [0,1]\). Additionally, \(o_p(1)\) on the right-hand side of (15) converges in probability to zero uniformly in \(a\). It is easy to check from the explicit expressions of \(A_1, A_0, B_1, B_0\) that they all belong to a bounded set in \(BV[0,\tau ]\) for any \(a \in [0,1]\). Thus, condition (C.2) and the results in Gill and van der Vaart (1993) yield that \(n^{1/2} (\hat{\theta }_a-\theta _a)\), as a stochastic process indexed by \(a \in [0,1]\), converges weakly to a Gaussian process. Theorem 3 thus follows from the continuous mapping theorem. \(\square \)

Cite this article

Chen, Q., Zeng, D., Ibrahim, J.G. et al. Quantifying the average of the time-varying hazard ratio via a class of transformations. Lifetime Data Anal 21, 259–279 (2015). https://doi.org/10.1007/s10985-014-9301-0

  • DOI: https://doi.org/10.1007/s10985-014-9301-0
