Abstract
I suggest an extension of the semiparametric transformation model that specifies a time-varying regression structure for the transformation, and thus allows time-varying structure in the data. Special cases include a stratified version of the usual semiparametric transformation model. The model can be thought of as specifying a first-order Taylor expansion of a completely flexible baseline. Large sample properties are derived and estimators of the asymptotic variances of the regression coefficients are given. The method is illustrated by a worked example and a small simulation study. A goodness-of-fit procedure for testing whether the regression effects lead to a satisfactory fit is also suggested.
References
Bagdonavicius V, Nikulin M (1999) Generalised proportional hazards model based on modified partial likelihood. Lifetime Data Anal 5:329–350
Bagdonavicius V, Nikulin M (2001) Accelerated life models: modelling and statistical analysis. Chapman and Hall, London
Bennett S (1983) Analysis of survival data by the proportional odds model. Statist Med 2:273–277
Cai T, Wei LJ, Wilcox M (2000) Semiparametric regression analysis for clustered failure time data. Biometrika 87:867–878
Chen K, Jin Z, Ying Z (2002) Semiparametric analysis of transformation models with censored data. Biometrika 89:659–668
Cheng SC, Wei LJ, Ying Z (1995) Analysis of transformation models with censored data. Biometrika 82:835–845
Cheng SC, Wei LJ, Ying Z (1997) Prediction of survival probabilities with semi-parametric transformation models. J Am Statist Assoc 92:227–235
Cox DR (1972) Regression models and life tables (with discussion). J Roy Statist Soc Ser B 34:187–220
Dabrowska DM (1997) Smoothed Cox regression. Ann Statist 25:1510–1540
Dabrowska DM (2005) Quantile regression in transformation models. arXiv:math.ST 05115082v1:1–34
Dabrowska DM (2006) Estimation in a class of semiparametric transformation models. arXiv:math.ST 0511506v2:1–48
Dabrowska DM, Doksum KA (1988) Partial likelihood in transformation models with censored data. Scand J Statist 15:1–24
Fine J, Ying Z, Wei LJ (1998) On the linear transformation model with censored data. Biometrika 85:980–986
Gill RD, Johansen S (1990) A survey of product-integration with a view towards application in survival analysis. Ann Statist 18:1501–1555
Jensen GV, Torp-Pedersen C, Hildebrandt P, Kober L, Nielsen FE, Melchior T, Joen T, Andersen PK (1997) Does in-hospital ventricular fibrillation affect prognosis after myocardial infarction? Eur Heart J 18:919–924
Kosorok MR, Lee BL, Fine JP (2004) Robust inference for univariate proportional hazards frailty regression models. Ann Statist 32:1448–1491
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572
Murphy S, Rossini A, Van Der Vaart A (1997) Maximum likelihood estimation in the proportional odds model. J Am Statist Assoc 92:968–976
Scheike TH, Zhang MJ (2002) An additive-multiplicative Cox–Aalen model. Scand J Statist 28:75–88
Scheike TH, Zhang MJ (2003) Extensions and applications of the Cox–Aalen survival model. Biometrics 59:1033–1045
Acknowledgements
I would like to thank Martin Jacobsen for pointing me to the product integration formulae. I also appreciate discussions with Torben Martinussen. Part of this work was done while the author visited the Center for Advanced Study in Oslo, and was partly supported by an NIH grant. I would also like to thank two referees and the associate editor: one for a detailed reading and several suggestions for an improved presentation, and one for providing me with several recent references that deal with the asymptotic analysis. Their comments helped improve the manuscript.
Appendix
I here sketch the main arguments of the proof that establishes the asymptotic variances, extending those of Bagdonavicius and Nikulin (1999, 2001) and Chen et al. (2002) to more than one dimension. A detailed consistency proof for the standard transformation model can be found in Dabrowska (2005), and it appears that these arguments can be extended to our setting, although this is no trivial exercise. The key to the multivariate extension is the use of product integration formulae, and I here focus on establishing the key formulae that give expressions for the standard deviations of the estimated quantities.
Assume that subjects are i.i.d. with covariates that are uniformly bounded. Define \({\varvec D}^{**}(\beta,A) = \hbox{diag}(\exp(2 Z_i^{T}\beta)\dot{\lambda}_0(H({t-}|X_i)\exp(Z_i^{T}\beta))), {\varvec S}^{**}(\beta, A) = n^{-1}{\varvec X}^T(t){\varvec D}^{**}(t,\beta,A) {\varvec X}(t), {\varvec S}_j^{**}(\beta, A) = n^{-1}{\varvec X}^{T}(t){\varvec D}(X_{ij}){\varvec D}^{**}(t,\beta,A){\varvec X}(t), {\varvec S}^{Z}(\beta, A) = n^{-1}{\varvec Z}^T(t){\varvec D}(t,\beta,A) {\varvec X}(t)\), and \({\varvec S}_j^{**Z}(\beta, A) = n^{-1}{\varvec Z}^{T}(t) {\varvec D}(X_{ij}){\varvec D}^{**}(t,\beta,A){\varvec X}(t)\).
One of the needed assumptions is that all S matrices converge uniformly, in both time and the parameter space for β, to deterministic matrices. Let Θ denote an open ball around the true parameter value \(\beta_0\). It is assumed that the limits of all capital S's exist, and they are denoted by the corresponding lower-case s's, so that, for example, \(\lim_p {\varvec S}_j^{**Z}(\beta, A) = s_j^{**Z}\) uniformly in \(\Theta\times [0,\tau]\). Define the limit of \(E_j^{**}(t) = {\varvec S}_0^{-1}(t,\beta){\varvec S}_j^{**}(t,\beta)\) as \(e_j^{**}(t,\beta)\).
I start by making the observation that (up to \(o_p(n^{-1})\))
where the time arguments are omitted for notational simplicity. By the martingale CLT, the first term, \(V_n\), can be shown to converge to a Gaussian martingale process \(V(t)\) with variance \(\sigma^2(t) = \int_{0}^{t}s_{0}^{-1}\,\hbox{d}A_0\). Note also that \(V_n(t)\) can be written as
The integrand in the last integral above can be Taylor expanded using matrix derivatives, leading to (up to \(o_p(n^{-1/2})\))
This implies that \(\sqrt{n}(\tilde{A}(\beta_0,t) - A_0(t))\) converges in distribution to a p-dimensional process W that satisfies the integral equation
Denote the elements of \(e_{j}^{**}(t)\) as \(e_{j,{km}}^{**}(t)\). With the p × p matrices F(t) with elements \(F_{jk}(t) = \sum_{m=1}^p e_{j,{km}}^{**}(t)\alpha_{0,m}(t)\) the solution can be written as
Using the product integration formulae (Gill and Johansen 1990) leads to the solution
and where \(\mathcal{F}(s,t)\) is the product integral of F. In the one-dimensional (commutative) case this equals \(\exp(-\int_{s}^{t} e^{**}(u,\beta_0)\,\hbox{d}A_0(u))\). Note that the matrix \(\mathcal{F}(s,t)\) is estimated consistently by the product of (the atoms)
where \(\hbox{d}\hat{F}_{jk}(t) = \sum_{m=1}^p E_{j,{km}}^{**}(t)\,\hbox{d}\hat{A}_m(t).\) Using the product integral rather than the exponential in the one-dimensional case yields an estimator with the same asymptotic properties.
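The role of the product integral can be illustrated numerically. The following Python sketch is not the estimator of the paper; it rests on stated assumptions: the atoms are taken to be of the form \(I - \hbox{d}\hat{F}(t)\), with the sign chosen to agree with the one-dimensional expression \(\exp(-\int_{s}^{t} e^{**}\,\hbox{d}A_0)\), and later atoms are multiplied on the left. The scalar equation \(W(t) = V(t) - \int_0^t c\,W(s)\,\hbox{d}s\) with the deterministic driving function \(V(t) = t\) is a hypothetical stand-in for the integral equation above, and all function names are illustrative.

```python
import math

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    p = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(p)]
            for i in range(p)]

def product_integral(dF_atoms):
    """Ordered product of (I - dF) over increments given in increasing time order.

    Assumption: atoms are I - dF, and later atoms multiply on the left.
    """
    p = len(dF_atoms[0])
    result = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for dF in dF_atoms:
        atom = [[(1.0 if i == j else 0.0) - dF[i][j] for j in range(p)]
                for i in range(p)]
        result = mat_mul(atom, result)
    return result

def scalar_check(c=1.5, t_end=1.0, n_steps=20000):
    """Return the scalar product integral and exp(-c * t_end) for comparison."""
    h = t_end / n_steps
    atoms = [[[c * h]] for _ in range(n_steps)]
    return product_integral(atoms)[0][0], math.exp(-c * t_end)

def volterra_scalar(c=2.0, t_end=1.0, n_steps=10000):
    """Solve W(t) = V(t) - int_0^t c W(s) ds with V(t) = t in two ways."""
    h = t_end / n_steps
    # (a) Euler recursion W_{k+1} = (1 - c h) W_k + dV_k, with dV_k = h.
    w_euler = 0.0
    for _ in range(n_steps):
        w_euler = (1.0 - c * h) * w_euler + h
    # (b) Product-integral form: sum over k of [prod_{j>k} (1 - c h)] dV_k,
    #     accumulated backwards in time starting from t_end.
    w_prod, weight = 0.0, 1.0
    for _ in range(n_steps):
        w_prod += weight * h
        weight *= 1.0 - c * h
    # Closed-form solution of W' = 1 - c W with W(0) = 0.
    closed_form = (1.0 - math.exp(-c * t_end)) / c
    return w_euler, w_prod, closed_form
```

In the scalar case the ordered product agrees with the exponential up to discretisation error, whereas for non-commuting matrix increments the order of multiplication changes the result, which is why the product integral rather than a matrix exponential appears in the multivariate case.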
I now turn to the estimating function for \(\beta\), \(\tilde{U}(\beta)\). By the martingale decomposition
By a Taylor expansion, as before, it may be shown that the difference (up to \(o_p(n^{-1/2})\))
The last two terms therefore equal (up to \(o_p(n^{-1/2})\))
which can be written as
with \(G_1(t)\) and \(G_2(t)\) being q × p matrices with elements \(G_{1,kj}(t) = \sum_{m=1}^p s_{j,km}^{**Z}\alpha_{0,m}(t)\) and \(G_{2,kj}(t) = \sum_{m=1}^p \left\{s^Z s_0^{-1}s_j^{**}\right\}_{km} \alpha_{0,m}(t)\). Changing the order of integration and using the structure of W(t), one gets
where
Define
Then one can write the estimating function as follows (up to an \(o_p(1)\) term)
This is a sum of i.i.d. terms (or a martingale) and therefore converges to a normal distribution with variance that is estimated by the robust estimator given in (11).
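As a rough illustration of how the variance of a sum of i.i.d. mean-zero terms is typically estimated, the Python sketch below computes the empirical outer-product matrix \(n^{-1}\sum_i w_i w_i^{T}\). It is a generic construction and is not claimed to reproduce the estimator in (11); the name `outer_product_variance` and the input `terms` (one estimated i.i.d. contribution per subject) are hypothetical.

```python
def outer_product_variance(terms):
    """Return n^{-1} * sum_i w_i w_i^T for a list of equal-length vectors w_i.

    Assumption: each w_i is the (estimated) mean-zero i.i.d. contribution of
    subject i to the estimating function.
    """
    n = len(terms)
    q = len(terms[0])
    return [[sum(w[a] * w[b] for w in terms) / n for b in range(q)]
            for a in range(q)]
```

Because only outer products of the individual terms enter, the estimate does not rely on a model-based expression for the variance, which is the usual sense in which such an estimator is called robust.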
Based on a Taylor series expansion (up to an \(o_p(1)\) term) we find that
where
and with P(t,β) the limit of \(n^{-1}\tilde{P}(t,\hat{\beta})\) defined in (8), with I(τ) the limit of \(n^{-1} \mathcal{I}(\tau,\beta_0)\), and with the product integral \(\mathcal{F}(s,t)\) defined in (13).
Note that an estimator of \(\mathcal{F}(s,t)\) is given in (14).
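Expansions of this type usually lead to a sandwich form for the variance of the regression estimator. The Python sketch below assumes, without this being confirmed by the text above, that the variance takes the form \(I^{-1}\Sigma I^{-1}\), where \(I\) stands for an estimate of the information-type matrix and \(\Sigma\) for an estimate of the variance of the estimating function. The functions `invert` and `sandwich_variance` and their arguments are hypothetical names introduced for illustration only.

```python
def invert(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [x / piv for x in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sandwich_variance(info, meat):
    """Return info^{-1} * meat * info^{-1} (assumed sandwich form)."""
    info_inv = invert(info)
    return mat_mul(mat_mul(info_inv, meat), info_inv)
```

When `meat` equals `info`, the sandwich collapses to the inverse of `info`, which gives a quick sanity check on an implementation.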
Cite this article
Scheike, T. A flexible semiparametric transformation model for survival data. Lifetime Data Anal 12, 461–480 (2006). https://doi.org/10.1007/s10985-006-9021-1
DOI: https://doi.org/10.1007/s10985-006-9021-1