A flexible semiparametric transformation model for survival data

Scheike, Thomas H.

doi:10.1007/s10985-006-9021-1

Published: 20 September 2006

Volume 12, pages 461–480, (2006)

Lifetime Data Analysis

Abstract

I suggest an extension of the semiparametric transformation model that specifies a time-varying regression structure for the transformation, and thus allows time-varying structure in the data. Special cases include a stratified version of the usual semiparametric transformation model. The model can be thought of as specifying a first order Taylor expansion of a completely flexible baseline. Large sample properties are derived and estimators of the asymptotic variances of the regression coefficients are given. The method is illustrated by a worked example and a small simulation study. A goodness of fit procedure for testing if the regression effects lead to a satisfactory fit is also suggested.

References

Bagdonavicius V, Nikulin M (1999) Generalised proportional hazards model based on modified partial likelihood. Lifetime Data Anal 5:329–350
Bagdonavicius V, Nikulin M (2001) Accelerated life models: modelling and statistical analysis. Chapman and Hall, London
Bennett S (1983) Analysis of survival data by the proportional odds model. Statist Med 2:273–277
Cai T, Wei LJ, Wilcox M (2000) Semiparametric regression analysis for clustered failure time data. Biometrika 87:867–878
Chen K, Jin Z, Ying Z (2002) Semiparametric analysis of transformation models with censored data. Biometrika 89:659–668
Cheng SC, Wei LJ, Ying Z (1995) Analysis of transformation models with censored data. Biometrika 82:835–845
Cheng SC, Wei LJ, Ying Z (1997) Prediction of survival probabilities with semi-parametric transformation models. J Am Statist Assoc 92:227–235
Cox DR (1972) Regression models and life tables (with discussion). J Roy Statist Soc Ser B 34:187–220
Dabrowska DM (1997) Smoothed Cox regression. Ann Statist 25:1510–1540
Dabrowska DM (2005) Quantile regression in transformation models. arXiv:math.ST 05115082v1:1–34
Dabrowska DM (2006) Estimation in a class of semiparametric transformation models. arXiv:math.ST 0511506v2:1–48
Dabrowska DM, Doksum KA (1988) Partial likelihood in transformation models with censored data. Scand J Statist 15:1–24
Fine J, Ying Z, Wei LJ (1998) On the linear transformation model with censored data. Biometrika 85:980–986
Gill RD, Johansen S (1990) A survey of product-integration with a view towards application in survival analysis. Ann Statist 18:1501–1555
Jensen GV, Torp-Pedersen C, Hildebrandt P, Kober L, Nielsen FE, Melchior T, Joen T, Andersen PK (1997) Does in-hospital ventricular fibrillation affect prognosis after myocardial infarction? Eur Heart J 18:919–924
Kosorok MR, Lee BL, Fine JP (2004) Robust inference for univariate proportional hazards frailty regression models. Ann Statist 32:1448–1491
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572
Murphy S, Rossini A, Van Der Vaart A (1997) Maximum likelihood estimation in the proportional odds model. J Am Statist Assoc 92:968–976
Scheike TH, Zhang MJ (2002) An additive-multiplicative Cox–Aalen model. Scand J Statist 28:75–88
Scheike TH, Zhang MJ (2003) Extensions and applications of the Cox–Aalen survival model. Biometrics 59:1033–1045

Acknowledgements

I would like to thank Martin Jacobsen for pointing me in the direction of the product integration formulae. I also appreciate discussions with Torben Martinussen. Part of this work was done while the author visited the Center for Advanced Study in Oslo, and it was partly supported by an NIH grant. I would also like to thank two referees and the associate editor: one for a detailed reading and several suggestions for an improved presentation, and one for providing me with several recent references that deal with the asymptotic analysis. Their comments helped improve the manuscript.

Author information

Authors and Affiliations

Department of Biostatistics, University of Copenhagen, Øster Farimagsgade 5 B, 2099, Copenhagen K, DK-1014, Denmark
Thomas H. Scheike

Corresponding author

Correspondence to Thomas H. Scheike.

Appendix

I here sketch the main arguments of the proof that establishes the asymptotic variances, extending that of Bagdonavicius and Nikulin (1999, 2001) and Chen et al. (2002) to more than one dimension. A detailed consistency proof for the standard transformation model can be found in Dabrowska (2005), and it appears that these arguments can be extended to our setting, but this is no trivial exercise. The key to the multivariate extension is the use of product-limit integration formulae, and I here focus on establishing the key formulae that give expressions for the standard deviations of the estimated quantities.

Assume that subjects are i.i.d. with covariates that are uniformly bounded. Define

$$ \begin{aligned} {\varvec D}^{**}(\beta,A) &= \hbox{diag}\left(\exp(2 Z_i^{T}\beta)\,\dot{\lambda}_0(H({t-}|X_i)\exp(Z_i^{T}\beta))\right), \\ {\varvec S}^{**}(\beta, A) &= n^{-1}{\varvec X}^T(t){\varvec D}^{**}(t,\beta,A) {\varvec X}(t), \\ {\varvec S}_j^{**}(\beta, A) &= n^{-1}{\varvec X}^{T}(t){\varvec D}(X_{ij}){\varvec D}^{**}(t,\beta,A){\varvec X}(t), \\ {\varvec S}^{Z}(\beta, A) &= n^{-1}{\varvec Z}^T(t){\varvec D}(t,\beta,A) {\varvec X}(t), \\ {\varvec S}_j^{**Z}(\beta, A) &= n^{-1}{\varvec Z}^{T}(t) {\varvec D}(X_{ij}){\varvec D}^{**}(t,\beta,A){\varvec X}(t). \end{aligned} $$

One of the needed assumptions is that all S matrices converge to deterministic matrices, uniformly in both time and the parameter space for β. Let Θ denote an open ball around the true parameter value β₀. It is assumed that the limits of all upper-case S's exist, and they are denoted by the corresponding lower-case s's, so that, for example, $\lim_p {\varvec S}_j^{**Z}(\beta, A) = s_j^{**Z}$ uniformly in $\Theta\times [0,\tau]$. Define the limit of $E_j^{**}(t) = {\varvec S}_0^{-1}(t,\beta){\varvec S}_j^{**}(t,\beta)$ as $e_j^{**}(t,\beta)$.

I start by making the observation that (up to $o_p(n^{-1})$)

$$ \begin{aligned} \tilde{A}(\beta_0,t) - A_0(t) &= \int_{0}^{t}{\varvec X}^{-}(\beta_0,\tilde{A})\,\hbox{d}{\varvec M} \\ &- \int_{0}^{t} \left\{{\varvec X}^{-}(\beta_0,A_0) - {\varvec X}^{-}(\beta_0,\tilde{A})\right\} {\varvec D}(\beta_0,A_0) {\varvec X}\,\hbox{d}A_0(s) \end{aligned} $$

where the time arguments were omitted for notational simplicity. Note that the first term, $V_n$, can be shown by the martingale CLT to converge to a Gaussian martingale process $V(t)$ with variance $\sigma^2(t) = \int_{0}^{t}s_{0}^{-1}\,\hbox{d}A_0$. Note also that $V_n(t)$ can be written as

$$ V_n(t) = \sum_i\int_{0}^{t}{\varvec S}_0^{-1}(t,\beta_0) X_i\,\hbox{d}M_{i}. $$

The integrand in the last integral above can be Taylor expanded by the use of matrix derivatives thus leading to (up to $o_p(n^{-1/2})$)

$$ \begin{aligned} &\left({\varvec S}_0^{-1}(t,\beta_0,\tilde{A}) - {\varvec S}_0^{-1}(t,\beta_0,A_0)\right) {\varvec S}_0(\beta_0,A_0) \\ &\quad = - \sum_{j=1}^p {\varvec S}_0^{-1}(t,\beta_0,A_0){\varvec S}_j^{**}(t,\beta_0,A_0)(\tilde{A}_j({t-}) - A_{0j}({t-})). \end{aligned} $$
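The step behind this expansion can be spelled out as follows. This is only a sketch: it assumes that ${\varvec S}_0 = n^{-1}{\varvec X}^T{\varvec D}{\varvec X}$ and that ${\varvec S}_j^{**}$ is the derivative of ${\varvec S}_0$ with respect to $A_j$, neither of which is restated in this appendix. For invertible matrices $P$ and $Q$ (symbols introduced here for illustration) the identity $P^{-1} - Q^{-1} = -P^{-1}(P - Q)Q^{-1}$ holds exactly, so that

$$ \left({\varvec S}_0^{-1}(\beta_0,\tilde{A}) - {\varvec S}_0^{-1}(\beta_0,A_0)\right){\varvec S}_0(\beta_0,A_0) = -{\varvec S}_0^{-1}(\beta_0,\tilde{A})\left\{{\varvec S}_0(\beta_0,\tilde{A}) - {\varvec S}_0(\beta_0,A_0)\right\}. $$

Under the stated assumptions, a first-order expansion gives ${\varvec S}_0(\beta_0,\tilde{A}) - {\varvec S}_0(\beta_0,A_0) \approx \sum_{j=1}^p {\varvec S}_j^{**}(\beta_0,A_0)(\tilde{A}_j({t-}) - A_{0j}({t-}))$, and replacing ${\varvec S}_0^{-1}(\beta_0,\tilde{A})$ by ${\varvec S}_0^{-1}(\beta_0,A_0)$ in the leading term yields the display above.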

This implies that $\sqrt{n}(\tilde{A}(\beta_0,t) - A_0(t))$ converges in distribution to a p-dimensional process W that satisfies the integral equation

$$W(t) = \sum_{j=1}^p \int_{0}^{t} - W_j(s-) e_j^{**}(s,\beta_0)\,\hbox{d}A_0(s) + V(t).$$

Denote the elements of $e_{j}^{**}(t)$ as $e_{j,{km}}^{**}(t)$. With the $p \times p$ matrix $F(t)$ with elements $F_{jk}(t) = \sum_{m=1}^p e_{j,{km}}^{**}(t)\alpha_{0,m}(t)$, the integral equation can be written in differential form as

$$\hbox{d}W(t) = -W({t-})F\,\hbox{d}t +\hbox{d}V(t). $$

Using the product integration formulae (Gill and Johansen 1990) leads to the solution

$$\begin{aligned} W(t) &= \int_{0}^{t}\,\hbox{d}V(s)\,\mathcal{F}(s,t), \\ \mathcal{F}(s,t) &= \prod_{]s,t]} (I - F(u)\,\hbox{d}u), \end{aligned} \tag{13}$$

where $\mathcal{F}(s,t)$ is the product integral of $F$. In the one-dimensional (commutative) case this equals $\exp(-\int_{s}^{t} e^{**}(u,\beta_0)\,\hbox{d}A_0(u))$. Note that the matrix $\mathcal{F}(s,t)$ is estimated consistently by the product of (the atoms)

$$ \hat{\mathcal{F}}(s,t) = \prod_{]s,t]}(I - \hbox{d}\hat{F}) \tag{14} $$

where $\hbox{d}\hat{F}_{jk}(t) = \sum_{m=1}^p E_{j,{km}}^{**}(t)\,\hbox{d}\hat{A}_m(t).$ Using the product integral rather than the exponential in the one-dimensional case yields an estimator with the same asymptotic properties.
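The finite product in (14) is straightforward to compute once the increments are available. The following Python sketch is an illustration only, not the author's implementation: the function name `product_integral` and the inputs `times` (ordered jump times) and `dF` (one $p \times p$ increment matrix per jump time) are hypothetical and assumed to have been computed elsewhere. Factors are multiplied on the right in increasing time order, matching the row-vector convention of $\hbox{d}W(t) = -W({t-})F\,\hbox{d}t + \hbox{d}V(t)$.

```python
import numpy as np

def product_integral(times, dF, s, t):
    """Finite product over jump times in ]s, t] of (I - dF_k).

    times : ordered jump times, shape (K,)
    dF    : increment matrices at those times, shape (K, p, p)
    """
    times = np.asarray(times, dtype=float)
    dF = np.asarray(dF, dtype=float)
    p = dF.shape[1]
    out = np.eye(p)
    # Later factors are multiplied on the right (time-ordered product).
    for k in np.flatnonzero((times > s) & (times <= t)):
        out = out @ (np.eye(p) - dF[k])
    return out

# One-dimensional check with a constant rate c on a fine grid:
# the product should be close to exp(-c * T).
K, c, T = 20000, 0.7, 2.0
times = np.linspace(T / K, T, K)
dF_scalar = np.full((K, 1, 1), c * T / K)
approx = product_integral(times, dF_scalar, 0.0, T)[0, 0]
exact = np.exp(-c * T)
```

In the scalar case the product and the exponential agree up to the grid resolution, which illustrates the remark above that either form may be used in one dimension. In the matrix case the increments generally do not commute, so the order of multiplication matters.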

I now turn to the estimating function for $\beta$, $\tilde{U}(\beta)$. By the martingale decomposition,

$$ \begin{aligned} \tilde{U}(\beta_0) &= \int{\left\{{\varvec Z}^T -{\varvec Z}^{T}{\varvec D}(\beta_0,\tilde{A}){\varvec X}{\varvec S}_0^{-1}(\beta_0,\tilde{A}) {\varvec X}^T \right\}}\,\hbox{d}{\varvec M} \\ &+ \int{\left\{{\varvec Z}^{T} {\varvec D}(\beta_0, A_0){\varvec X} - {\varvec Z}^{T}{\varvec D}(\beta_0, \tilde{A}){\varvec X}\right\}}\,\hbox{d}A_0 \\ &+ \int{{\varvec Z}^{T}{\varvec D}(\beta_0,\tilde{A}){\varvec X}\left\{{\varvec S}_0^{-1}(\beta_0,A_0) - {\varvec S}_0^{-1}(\beta_0,\tilde{A})\right\} {\varvec S}_0(\beta_0,A_0)}\,\hbox{d}A_0. \end{aligned} $$

By a Taylor expansion, as before, it may be shown that (up to $o_p(n^{-1/2})$)

$$ {\varvec Z}^{T} {\varvec D}(\beta_0, A_0) {\varvec X} - {\varvec Z}^{T} {\varvec D}(\beta_0, \tilde{A}) {\varvec X} = -\sum_{j=1}^p {\varvec S}_j^{**Z}(\beta_0, A_0)(\tilde{A}_j({t-}) - A_{0j}({t-})). $$

The last two terms therefore equal (up to $o_p(n^{-1/2})$)

$$ \sum_{j=1}^p \int{\left\{s_j^{**z} - s^z s_0^{-1} s_j^{**}\right\}}W_j\,\hbox{d}A_0$$

that can be written as

$$ \int{\left\{G_1(t) - G_2(t)\right\}} W(t)\,\hbox{d}t$$

with $G_1(t)$ and $G_2(t)$ being $q \times p$ matrices with elements $G_{1,kj}(t) = \sum_{m=1}^p s_{j,km}^{**z}\alpha_{0,m}(t)$ and $G_{2,kj}(t) = \sum_{m=1}^p \left\{s^z s_0^{-1}s_j^{**}\right\}_{km} \alpha_{0,m}(t)$. Now changing the order of integration and using the structure of $W(t)$ one gets

$$ \begin{aligned} \int_{0}^{\tau}\left\{G_1(t)-G_2(t)\right\}W(t)\,\hbox{d}t &= \int_{0}^{\tau}\left\{G_1(t)-G_2(t)\right\}\int_{0}^{t}\mathcal{F}(s,t)\,\hbox{d}V(s)\,\hbox{d}t \\ &= \int_{0}^{\tau} g(s)\,\hbox{d}V(s) = \sum_{i}\int_{0}^{\tau} g(s) s_0^{-1}(s,\beta) X_i \,\hbox{d}M_i(s), \end{aligned} $$

where

$$ g(s) = \int_{s}^{\tau}\left\{G_1(t)-G_2(t)\right\}\mathcal{F}(s,t)\,\hbox{d}t. $$
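The interchange of integration that produces $g(s)$ can be written out explicitly. The following is a sketch, writing $G = G_1 - G_2$ (a shorthand introduced here) and taking $W(t)$ as the integral of $\mathcal{F}(s,t)$ against $\hbox{d}V(s)$ over $]0,t]$, as in (13), with the matrix ordering of the surrounding displays. The region of integration is $\{0 \le s \le t \le \tau\}$, so that

$$ \int_{0}^{\tau} G(t)\int_{0}^{t}\mathcal{F}(s,t)\,\hbox{d}V(s)\,\hbox{d}t = \int_{0}^{\tau}\left\{\int_{s}^{\tau} G(t)\mathcal{F}(s,t)\,\hbox{d}t\right\}\hbox{d}V(s) = \int_{0}^{\tau} g(s)\,\hbox{d}V(s). $$

The exchange is of stochastic Fubini type. It seems unproblematic here provided $G$ and $\mathcal{F}$ are deterministic and bounded on $[0,\tau]$, although the appendix does not spell out these conditions.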

Define

$$ \begin{aligned} q_{1i}(t,\beta) &= Z_i - s^{z}(t,\beta)s_0^{-1}(t,\beta)X_i, \\ q_{2i}(t,\beta) &= g(t) s_0^{-1}(t,\beta)X_i, \\ q_i(t,\beta) &= q_{1i}(t,\beta) + q_{2i}(t,\beta). \end{aligned} \tag{15} $$

Then one can write the estimating function as follows (up to an $o_p(1)$ term)

$$ n^{-1/2}\tilde{U}(\beta_0) = n^{-1/2}\sum_{i}\int_{0}^{\tau}\left\{q_{1i}(t,\beta_0) +q_{2i}(t,\beta_0)\right\}\,\hbox{d}M_i(t). $$

This is a sum of i.i.d. terms (or a martingale) and therefore converges to a normal distribution with variance that is estimated by the robust estimator given in (11).
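Because the estimating function is asymptotically a sum of i.i.d. terms, its variance can be estimated from the empirical second moment of per-subject contributions. The Python sketch below illustrates that generic construction; it is not the estimator (11) itself, which is not reproduced in this appendix. The array `xi` (row $i$ standing in for an estimate of $\int q_i\,\hbox{d}M_i$) and the matrix `info` (standing in for an estimate of $I(\tau)$) are hypothetical inputs assumed to be computed elsewhere, and the sandwich form for $\hat\beta$ is an assumption based on the usual expansion of an estimating equation.

```python
import numpy as np

def second_moment(xi):
    """Empirical second moment n^{-1} sum_i xi_i xi_i^T of per-subject terms."""
    xi = np.asarray(xi, dtype=float)
    return xi.T @ xi / xi.shape[0]

def sandwich(xi, info):
    """Sandwich form info^{-1} * second_moment(xi) * info^{-T}."""
    info_inv = np.linalg.inv(np.asarray(info, dtype=float))
    return info_inv @ second_moment(xi) @ info_inv.T

# Toy inputs, purely illustrative (not data from the article).
xi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
info = 2.0 * np.eye(2)
sigma_hat = second_moment(xi)          # variance of n^{-1/2} U
cov_root_n_beta = sandwich(xi, info)   # variance of sqrt(n) (beta_hat - beta_0)
```

Working with per-subject terms rather than a martingale compensator is what makes the estimate robust in the sense used above: it relies only on the i.i.d. structure of the summands.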

Based on a Taylor series expansion (up to an $o_p(1)$ term) we find that

$$ \begin{aligned} \sqrt{n}(\hat{A}(t) - A_0(t)) &= n^{1/2} \int_{0}^{t} \frac{\partial}{\partial\beta}X^{-}(t,\beta_0)\,\hbox{d}{\varvec N}(\hat{\beta} - \beta_0) + n^{-1/2}\sum_{i}\int_{0}^{t}\mathcal{F}(s,t) s_0^{-1}(s,\beta_0) X_i(s)\,\hbox{d}M_i(s) \\ &= n^{-1/2} \sum_{i}L_i(t,\beta_0) \end{aligned} $$

where

$$ \begin{aligned} L_i(t,\beta) &= P(t,\beta_0)I^{-1}(\tau)\int_{0}^{\tau} Y_i(t) q_i(t,\beta_0)\,\hbox{d}M_i(t) \\ &+ \int_{0}^{t} Y_i(s) \mathcal{F}(s,t) s_0^{-1}(s,\beta_0) X_i(s)\,\hbox{d}M_i(s), \end{aligned} \tag{16} $$

and with P(t,β) the limit of $n^{-1}\tilde{P}(t,\hat{\beta})$ defined in (8), with I(τ) the limit of $n^{-1} \mathcal{I}(\tau,\beta_0)$, and with the product integral $\mathcal{F}(s,t)$ defined in (13).

Note that an estimator of $\mathcal{F}(s,t)$ is given in (14).

About this article

Cite this article

Scheike, T. A flexible semiparametric transformation model for survival data. Lifetime Data Anal 12, 461–480 (2006). https://doi.org/10.1007/s10985-006-9021-1

Received: 09 February 2006
Accepted: 28 July 2006
Published: 20 September 2006
Issue Date: December 2006
DOI: https://doi.org/10.1007/s10985-006-9021-1
