Abstract
Covariate measurement error occurs commonly in survival analysis. Under the proportional hazards model, measurement error effects have been well studied, and various inference methods have been developed to correct for error effects under such a model. In contrast, error-contaminated survival data under the additive hazards model have received less attention. In this paper, we investigate this problem by exploring measurement error effects on parameter estimation and on the change of the hazard function. New insights into measurement error effects are revealed, in contrast to the well-documented results for the Cox proportional hazards model. We propose a class of bias correction estimators that includes certain existing estimators as special cases. In addition, we use the regression calibration method to reduce measurement error effects. Theoretical results for the developed methods are established, and numerical assessments are conducted to illustrate the finite sample performance of our methods.
References
Breslow NE, Day NE (1980) Statistical methods in cancer research, vol 1: the design and analysis of case–control studies. IARC, Lyon
Buzas JS (1998) Unbiased scores in proportional hazards regression with covariate measurement error. J Stat Plan Inference 67:247–257
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective, 2nd edn. Chapman & Hall/CRC, Boca Raton
Cox DR (1972) Regression models and life-tables (with Discussion). J R Stat Soc Ser B 34:187–220
Cox DR, Oakes D (1984) Analysis of survival data. Chapman & Hall/CRC, Boca Raton
Fuchs HJ, Borowitz DS, Christiansen DH, Morris EM, Nash ML, Ramsey BW, Rosenstein BJ, Smith AL, Wohl ME (1994) Effect of aerosolized recombinant human DNase on exacerbations of respiratory symptoms and on pulmonary function in patients with cystic fibrosis. N Engl J Med 331:637–642
Horn RA, Johnson CR (1985) Matrix analysis. Cambridge University Press, New York
Hu C, Lin DY (2004) Semiparametric failure time regression with replicates of mismeasured covariates. J Am Stat Assoc 99:105–118
Hu P, Tsiatis AA, Davidian M (1998) Estimating the parameters in the Cox model when covariates are measured with error. Biometrics 54:1407–1419
Huang Y, Wang CY (2000) Cox regression with accurate covariates unascertainable: a nonparametric-correction approach. J Am Stat Assoc 95:1209–1219
Jiang J, Zhou H (2007) Additive hazard regression with auxiliary covariates. Biometrika 94:359–369
Kalbfleisch JD, Prentice RL (2002) The statistical analysis of failure time data, 2nd edn. Wiley, Hoboken
Kulich M, Lin DY (2000) Additive hazards regression with covariate measurement error. J Am Stat Assoc 95:238–248
Li Y, Lin X (2003) Functional inference in frailty measurement error models for clustered survival data using the SIMEX approach. J Am Stat Assoc 98:191–203
Li Y, Ryan L (2004) Survival analysis with heterogeneous covariate measurement error. J Am Stat Assoc 99:724–735
Lin DY, Ying Z (1994) Semiparametric analysis of the additive risk model. Biometrika 81:61–71
Nakamura T (1992) Proportional hazards model with covariates subject to measurement error. Biometrics 48:829–838
Pollard D (1990) Empirical processes: theory and applications. IMS, Hayward
Prentice RL (1982) Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika 69:331–342
Song X, Huang Y (2005) On corrected score approach for proportional hazards model with covariate measurement error. Biometrics 61:702–714
Sun L, Zhang Z, Sun J (2006) Additive hazards regression of failure time data with covariate measurement errors. Stat Neerlandica 60:497–509
van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, New York
Wang CY, Hsu L, Feng ZD, Prentice RL (1997) Regression calibration in failure time regression. Biometrics 53:131–145
Yan Y, Yi GY (2015) A class of functional methods for error-contaminated survival data under additive hazards models with replicate measurements. J Am Stat Assoc. doi:10.1080/01621459.2015.1034317
Yi GY, Lawless JF (2007) A corrected likelihood method for the proportional hazards model with covariates subject to measurement error. J Stat Plan Inference 137:1816–1828
Yi GY, Reid N (2010) A note on mis-specified estimating functions. Stat Sinica 20:1749–1769
Zucker DM, Spiegelman D (2008) Corrected score estimation in the proportional hazards model with misclassified discrete covariates. Stat Med 27:1911–1933
Appendix
In the following, for an \(m\times 1\) vector \(a=(a_{(1)},a_{(2)},\cdots , a_{(m)})^T\), we use the Euclidean norm \(||a||=(\sum a_{(i)}^2)^{1/2}\). For a matrix A, define \(||A||=\sup _{i,j}|A_{(i)(j)}|\). When we say that the vector process \(A_n(t)\) converges almost surely to A(t) uniformly in t, we mean that \(\sup _{0\le t\le \tau }||A_n(t)-A(t)||\mathop {\rightarrow }\limits ^{a.s.} 0\), as \(n\rightarrow \infty \).
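As a small illustration of these conventions, the following sketch computes the Euclidean vector norm, the maximum-absolute-entry matrix norm, and an approximation to the uniform distance \(\sup _{0\le t\le \tau }||A_n(t)-A(t)||\). The function names and the use of a finite grid in place of the supremum over \([0,\tau ]\) are assumptions made for illustration only.

```python
import numpy as np


def vec_norm(a):
    """Euclidean norm ||a|| = (sum_i a_(i)^2)^(1/2) of a vector."""
    a = np.asarray(a, dtype=float)
    return float(np.sqrt(np.sum(a ** 2)))


def mat_norm(A):
    """Matrix norm ||A|| = sup_{i,j} |A_(i)(j)| (largest absolute entry)."""
    A = np.asarray(A, dtype=float)
    return float(np.max(np.abs(A)))


def sup_distance(A_n, A, grid):
    """Approximate sup_t ||A_n(t) - A(t)|| by a maximum over a finite grid.

    A_n and A are callables returning vectors; the grid is an assumption
    standing in for the whole interval [0, tau].
    """
    return max(vec_norm(np.asarray(A_n(t)) - np.asarray(A(t))) for t in grid)
```

For example, with \(A_n(t)=(t+1/n,\,0)^T\) and \(A(t)=(t,\,0)^T\) the uniform distance is \(1/n\) for every grid, which tends to zero as \(n\rightarrow \infty \).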
1.1 Appendix 1: regularity conditions
R1. \(\{N_i(\cdot ),Y_i(\cdot ),Z_i(\cdot )\}, i=1,\cdots ,n\) are independent and identically distributed.
R2. \(\Pr \{Y_1(\tau )=1\}>0\).
R3. \(\Lambda _0(\tau )<\infty \).
R4. \(\sup _{t\in [0,\tau ]}||E[Z_1^{\otimes 2}(t)]||<\infty \).
R5. Bounded variation condition: for \(i=1,\cdots , n\), \(j=1,\cdots , p+q\),
$$\begin{aligned} |Z_{i(j)}(0)|+\int _0^{\tau } |dZ_{i(j)}(u)|\le K \end{aligned}$$
holds almost surely for all sample paths, where K is a constant.
R6. All the \(n_i\), \(i=1,\cdots , n\), are bounded above by a constant \(N_0\), and
$$\begin{aligned} \lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n I\{n_i=j\} \end{aligned}$$
exists for each \(j=1,\cdots ,N_0\).
R7. \(||E(\epsilon _{11}^{\otimes 2})||<\infty \).
R8. \(\int _0^{\tau }E\left[ Y_1(t)\{{Z}_1(t)-e(t)\}^{\otimes 2}\right] dt\), \(\Sigma _{nv}\), and \(\Sigma _{rc}^*\) are positive definite.
1.2 Appendix 2: Proof of Theorem 1
Since each component of the vector \(n^{-1}\sum _{i=1}^n Y_i(t)Z_i(t)\) is of bounded variation by Condition R5, it can be written as the difference of two nondecreasing functions. Thus, \(n^{-1}\sum _{i=1}^n Y_i(t)Z_i(t)\) is manageable (Pollard 1990, p. 38), and \(n^{-1}\sum _{i=1}^n Y_i(t)\hat{Z}_i(t)\) is manageable. Together with Conditions R4, R6 and R7, the two conditions of the uniform strong law of large numbers (USLLN) (Pollard 1990, p. 41) are thus verified, and we obtain that as \(n\rightarrow \infty \), \(n^{-1}\sum _{i=1}^n Y_i(t)\hat{Z}_i(t)\mathop {\rightarrow }\limits ^{a.s.} E[Y_i(t)Z_i(t)]\) uniformly in t. Similarly, \(n^{-1}\sum _{i=1}^n Y_i(t)\mathop {\rightarrow }\limits ^{a.s.} E[Y_i(t)]\), and \(n^{-1}\sum _{i=1}^n N_i(t)\mathop {\rightarrow }\limits ^{a.s.} E[N_i(t)]\) uniformly in t. By USLLN together with Condition R2, \(\sum _{i=1}^n Y_i(t)\hat{Z}_i(t)/\sum _{i=1}^n Y_i(t)\mathop {\rightarrow }\limits ^{a.s.} e(t)\) uniformly in t. Thus, \(n^{-1}\sum _{i=1}^n\int _0^{\tau } \tilde{Z}(t)dN_i(t)\mathop {\rightarrow }\limits ^{a.s.}\int _0^{\tau }e(t)dE[N_i(t)]\). Similarly,
\(n^{-1}\sum _{i=1}^n\int _0^{\tau } \hat{Z}_i(t)dN_i(t)\mathop {\rightarrow }\limits ^{a.s.}\int _0^{\tau }E\left[ Z_i(t)dN_i(t)\right] .\) Note that \(E[\hat{Z}_i^{\otimes 2}(t)]=E[Z_i^{\otimes 2}(t)]+\Sigma _1/n_i\). Thus,
After some algebra, these results imply that \(n^{-1} U_{nv}(\beta )\mathop {\longrightarrow }\limits ^{a.s.}\mathcal {U}_{nv}(\beta )\), as \(n\rightarrow \infty \). Thus, \(\hat{\beta }_{nv}\mathop {\rightarrow }\limits ^{a.s.}\beta _{nv}^*\), as \(n\rightarrow \infty \).
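The second-moment identity \(E[\hat{Z}_i^{\otimes 2}(t)]=E[Z_i^{\otimes 2}(t)]+\Sigma _1/n_i\) used in this argument can be checked numerically. The sketch below is a simulation under stated assumptions, not part of the proof: it takes a classical additive error model in which \(\hat{Z}_i\) is the average of \(n_i\) replicates \(Z_i+\epsilon _{ij}\), with time-independent, mean-zero Gaussian \(Z_i\) and errors of covariance \(\Sigma _1\) independent of \(Z_i\). The function name and all parameter values are hypothetical.

```python
import numpy as np


def second_moment_of_replicate_mean(n_subjects, n_rep, sigma_z, sigma_1, seed=0):
    """Empirical E[Z_hat Z_hat^T] when Z_hat averages n_rep error-prone replicates.

    Assumed model (illustrative): W_ij = Z_i + eps_ij, Z_i ~ N(0, sigma_z),
    eps_ij ~ N(0, sigma_1), all independent; Z_hat_i = mean_j W_ij.
    """
    rng = np.random.default_rng(seed)
    p = sigma_z.shape[0]
    # True covariates, one row per subject: shape (n_subjects, p).
    Z = rng.multivariate_normal(np.zeros(p), sigma_z, size=n_subjects)
    # Replicate errors: shape (n_subjects, n_rep, p).
    eps = rng.multivariate_normal(np.zeros(p), sigma_1, size=(n_subjects, n_rep))
    # Averaging replicates leaves Z unchanged and averages the errors.
    Z_hat = Z + eps.mean(axis=1)
    return Z_hat.T @ Z_hat / n_subjects
```

With mean-zero \(Z_i\) we have \(E[Z_i^{\otimes 2}]=\Sigma _z\), so for a large number of subjects the returned matrix should be close to \(\Sigma _z+\Sigma _1/n_i\); the excess \(\Sigma _1/n_i\) shrinks as more replicates are averaged.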
Now we show that
Note that \(U_{nv}(\beta _{nv}^*)=U_1^*-U_2^*\), where
Similar to the proof of Theorem 1 of Kulich and Lin (2000), we obtain that
To prove (13), it remains to prove that
Observe that
Plugging (15) into \(n^{-1/2}U_2^*\), we obtain (14), and thus (13) is proved.
By the Taylor series expansion, \(0=n^{-1/2}U_{nv}(\hat{\beta }_{nv})= n^{-1/2}U_{nv}(\beta _{nv}^*)+ \left[ n^{-1}\frac{\partial U_{nv}(\beta )}{\partial \beta }\right] n^{1/2}(\hat{\beta }_{nv}-\beta _{nv}^*)\), leading to
It is straightforward that
Let \(U_{nv,i}=\int _0^{\tau }\left\{ \hat{Z}_i(t)-e(t)\right\} d \tilde{M}_i(t;\beta )- \int _0^{\tau }Y_i(t)\left\{ \hat{Z}_i(t)-e(t)\right\} ^{\otimes 2}(\beta _{nv}^*-\beta )dt,i=1,\cdots ,n\). By Condition R6, \(E[||n^{-1/2}U_{nv,i}||^2I\{||n^{-1/2}U_{nv,i}||>\epsilon \}]\) can only take at most \(N_0\) possible values for fixed \(\epsilon >0\). Without loss of generality, suppose when \(i=1\), it achieves the maximum value. It follows from the Markov inequality that \(\Pr \{||n^{-1/2}U_{nv,1}||>\epsilon \}\le n^{-1}E[||U_{nv,1}||^2]/\epsilon ^2\rightarrow 0\) as \(n\rightarrow \infty \), and thus
Consequently, the Lindeberg condition (van der Vaart 1998, p. 20) is verified. By the multivariate Lindeberg–Feller central limit theorem (van der Vaart 1998, p. 20), we obtain that \(n^{-1/2}U_{nv}(\beta _{nv}^*)\) is asymptotically normal with mean 0 and covariance matrix \(\Sigma _{nv}=\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^nE( U_{nv,i})^{\otimes 2}\). It follows from (16) that \(\hat{\beta }_{nv}\) is asymptotically normal with mean \(\beta _{nv}^*\) and covariance matrix \(\mathcal {D}_{nv}^{-1}\Sigma _{nv}\mathcal {D}_{nv}^{-1}\). The matrices \(\Sigma _{nv}\) and \(\mathcal {D}_{nv}\) can be consistently estimated by their empirical counterparts.
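The sandwich form \(\mathcal {D}_{nv}^{-1}\Sigma _{nv}\mathcal {D}_{nv}^{-1}\) can be assembled from empirical quantities along the following lines. This is a minimal sketch under assumptions: the rows of `U` stand for estimated per-subject contributions \(U_{nv,i}\), `D` stands for an estimate of \(\mathcal {D}_{nv}\), and the result is read as the covariance of \(\sqrt{n}(\hat{\beta }_{nv}-\beta _{nv}^*)\). How these inputs are computed from the data is not shown, and the function name is hypothetical.

```python
import numpy as np


def sandwich_covariance(U, D):
    """Empirical sandwich matrix D^{-1} Sigma D^{-1}.

    U : (n, p) array, row i is an estimate of the contribution U_{nv,i}
        (assumed to be supplied by the caller).
    D : (p, p) array, an estimate of D_nv (assumed supplied and invertible).
    """
    U = np.asarray(U, dtype=float)
    D = np.asarray(D, dtype=float)
    n = U.shape[0]
    # Empirical counterpart of Sigma_nv = lim n^{-1} sum_i E(U_{nv,i})^{(x)2}.
    Sigma = U.T @ U / n
    D_inv = np.linalg.inv(D)
    # The transpose is redundant when D is symmetric; it is kept for generality.
    return D_inv @ Sigma @ D_inv.T
```

Under the \(\sqrt{n}\) scaling assumed here, dividing the returned matrix by \(n\) and taking square roots of its diagonal gives approximate standard errors for the components of \(\hat{\beta }_{nv}\).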
1.3 Appendix 3: asymptotic equivalence of \(\hat{\beta }_{bc}\) and \(\hat{\beta }_{szs}\)
Note that
Thus,
where the third equation comes from (15). Together with error bounds for the inverse matrices (Horn and Johnson 1985), we obtain \(\sqrt{n}(\hat{B}_{bc,1}^{-1}-\hat{B}_{szs,1}^{-1})=o_p(1)\). Note also that \(\hat{B}_{bc,1}+\hat{B}_{bc,2}=\hat{B}_{szs,1}+\hat{B}_{szs,2}\). It then follows that
Thus, \(\hat{\beta }_{bc}\) and \(\hat{\beta }_{szs}\) are asymptotically equivalent, and they have the same asymptotic distribution.
1.4 Appendix 4
Following Carroll et al. (2006), we estimate \({\Sigma }_{xx}\), \({\Sigma }_{xv}\), \({\Sigma }_{vv}\), \(\mu _x\), and \(\mu _{v}\) by \(\hat{\Sigma }_{xx}\), \(\hat{\Sigma }_{xv}\), \(\hat{\Sigma }_{vv}\), \(\bar{W}_{\cdot \cdot }={\sum _{i=1}^n\sum _{j=1}^{n_i}W_{ij}}/{\sum _{i=1}^n n_i}\) and \(\bar{V}_{\cdot }(t)\), respectively, where
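To illustrate moment estimation from replicate measurements, the sketch below computes the pooled mean \(\bar{W}_{\cdot \cdot }\) defined above together with method-of-moments estimates of the measurement error covariance and of \(\Sigma _{xx}\). The covariance formulas are the standard replicate-data moment estimators, used here as an assumption; they may differ in detail from the definitions intended in this appendix, and the estimators \(\hat{\Sigma }_{xv}\), \(\hat{\Sigma }_{vv}\) and \(\bar{V}_{\cdot }(t)\) are not covered. The function name and the input layout are hypothetical.

```python
import numpy as np


def replicate_moments(W_list):
    """Moment estimates from replicate surrogates W_ij = X_i + error_ij.

    W_list : list of n arrays; W_list[i] has shape (n_i, p) and holds the
             n_i replicates for subject i (at least one subject needs n_i >= 2).
    Returns (grand_mean, sigma_err, sigma_xx).
    """
    n = len(W_list)
    k = np.array([w.shape[0] for w in W_list], dtype=float)   # the n_i
    means = np.stack([w.mean(axis=0) for w in W_list])         # subject means
    # Pooled mean: sum_i sum_j W_ij / sum_i n_i.
    grand_mean = (k[:, None] * means).sum(axis=0) / k.sum()

    # Within-subject scatter estimates the error covariance.
    p = means.shape[1]
    within = np.zeros((p, p))
    for w, m in zip(W_list, means):
        r = w - m
        within += r.T @ r
    sigma_err = within / (k - 1.0).sum()

    # Between-subject scatter, corrected for the error it still contains.
    dev = means - grand_mean
    between = (k[:, None] * dev).T @ dev
    nu = k.sum() - (k ** 2).sum() / k.sum()
    sigma_xx = (between - (n - 1) * sigma_err) / nu
    return grand_mean, sigma_err, sigma_xx
```

The correction term \((n-1)\) times the error covariance removes the part of the between-subject scatter that is due to measurement error, so in small samples `sigma_xx` need not be positive definite.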
1.5 Appendix 5: Proof of Theorem 2
First, we consider the case where all \(n_i\) are equal. Note that
It follows that \(\hat{\beta }_{rc}\mathop {\longrightarrow }\limits ^{a.s.}\beta _{rc}^*\), as \(n\rightarrow \infty \). Together with the fact that
we obtain
Since the asymptotic expansion of \(\sqrt{n}({A}_{rc,1}^{-1})^T(\hat{\beta }_{nv}-{\beta }_{nv}^*)\) is obtained in Theorem 1, to obtain the asymptotic expansion of \(\sqrt{n}(\hat{\beta }_{rc}-{\beta }_{rc}^*)\), by (17) we need only examine \(\sqrt{n}\{(\hat{A}_{rc,1}^{-1})^T-({A}_{rc,1}^{-1})^T\}{\beta }_{nv}^*\). This can be done, in principle, by a Taylor series expansion. In the following, we first study the asymptotic expansion for the univariate case, and then the multivariate case.
By considering a Taylor series expansion, we obtain
Noting that
together with (17), we obtain that \(\sqrt{n}(\hat{\beta }_{rc}-{\beta }_{rc}^*)\) is asymptotically a sum of independent terms, and thus \(\hat{\beta }_{rc}\) is asymptotically normal with mean \({\beta }_{rc}^*\) and variance \(\Sigma _{rc}^*\), which can be obtained easily from (17).
For the multivariate case, \(\hat{\beta }_{rc}\) is still asymptotically normal with mean \({\beta }_{rc}^*\) and covariance matrix \(\Sigma _{rc}^*\), whose form is complicated. We suggest using only the first term of (17) to obtain an approximate variance of \(\hat{\beta }_{rc}\): \(\Sigma _{rc}^*\approx [{A}_{rc,1}^{-1}]^T\mathcal {D}_{nv}^{-1}\Sigma _{nv}\mathcal {D}_{nv}^{-1}[{A}_{rc,1}^{-1}]\), which can be consistently estimated by its empirical counterpart.
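The approximation \(\Sigma _{rc}^*\approx [{A}_{rc,1}^{-1}]^T\mathcal {D}_{nv}^{-1}\Sigma _{nv}\mathcal {D}_{nv}^{-1}[{A}_{rc,1}^{-1}]\) is a plain matrix product once estimates of its three ingredients are available. The sketch below assumes that estimates of \(A_{rc,1}\), \(\mathcal {D}_{nv}\) and \(\Sigma _{nv}\) are supplied by the caller; the function name is hypothetical, and the terms of (17) beyond the first are ignored, as in the text.

```python
import numpy as np


def rc_approx_covariance(A, D, Sigma):
    """First-term approximation [A^{-1}]^T D^{-1} Sigma D^{-1} [A^{-1}].

    A, D, Sigma : (p, p) arrays standing for estimates of A_{rc,1}, D_nv
                  and Sigma_nv (assumed given; A and D must be invertible).
    """
    A_inv = np.linalg.inv(np.asarray(A, dtype=float))
    D_inv = np.linalg.inv(np.asarray(D, dtype=float))
    Sigma = np.asarray(Sigma, dtype=float)
    return A_inv.T @ D_inv @ Sigma @ D_inv @ A_inv
```

When `A` is the identity matrix the result reduces to \(\mathcal {D}_{nv}^{-1}\Sigma _{nv}\mathcal {D}_{nv}^{-1}\), the covariance form given for the naive estimator in Theorem 1.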
For the general case where the \(n_i\) are not necessarily equal, it can be shown that \(\hat{\beta }_{rc}\) is still asymptotically normal, with a complicated form for the asymptotic covariance.
Cite this article
Yan, Y., Yi, G.Y. Analysis of error-prone survival data under additive hazards models: measurement error effects and adjustments. Lifetime Data Anal 22, 321–342 (2016). https://doi.org/10.1007/s10985-015-9340-1
DOI: https://doi.org/10.1007/s10985-015-9340-1