Estimation of complier causal treatment effects with informatively interval-censored failure time data

Annals of the Institute of Statistical Mathematics

Abstract

Estimation of complier causal treatment effects has been discussed by many authors under different situations, but only limited literature exists for interval-censored failure time data, which often occur in areas such as longitudinal or periodic follow-up studies. In particular, there does not seem to exist a method that can deal with informative interval censoring, which can arise naturally and makes the analysis much more challenging. Also, it has been shown that when informative censoring exists, an analysis that does not take it into account can yield biased or misleading results. To address this, we propose an estimated sieve maximum likelihood approach with the use of instrumental variables. The asymptotic properties of the resulting estimators of regression parameters are established, and a simulation study suggests that the approach works well. Finally, the approach is applied to a set of real data that motivated this study.

References

  • Abadie, A., Angrist, J., Imbens, G. (2002). Instrumental variables estimates of the effect of subsidized training on the quantiles of trainee earnings. Econometrica, 70, 91–117.

  • Angrist, J. D., Imbens, G. W., Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444–455.

  • Baiocchi, M., Cheng, J., Small, D. S. (2014). Instrumental variable methods for causal inference. Statistics in Medicine, 33, 2297–2340.

  • Baker, S. G. (1998). Analysis of survival data from a randomized trial with all-or-none compliance: Estimating the cost-effectiveness of a cancer screening program. Journal of the American Statistical Association, 93, 929–934.

  • Chen, X., Fan, Y., Tsyrennikov, V. (2006). Efficient estimation of semiparametric multivariate copula models. Journal of the American Statistical Association, 101, 1228–1240.

  • Cheng, J., Small, D. S., Tan, Z., Ten Have, T. R. (2009). Efficient nonparametric estimation of causal effects in randomized trials with noncompliance. Biometrika, 96, 19–36.

  • Cuzick, J., Sasieni, P., Myles, J., Tyrer, J. (2007). Estimating the effect of treatment in a proportional hazards model in the presence of non-compliance and contamination. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69, 565–588.

  • Du, M., Zhou, Q., Zhao, S., Sun, J. (2021). Regression analysis of case-cohort studies in the presence of dependent interval censoring. Journal of Applied Statistics, 48, 846–865.

  • Hammer, S. M., Sobieszczyk, M. E., Janes, H., Karuna, S. T., Mulligan, M. J., Grove, D., Koblin, B. A., Buchbinder, S. P., Keefer, M. C., Tomaras, G. D., et al. (2013). Efficacy trial of a DNA/rAd5 HIV-1 preventive vaccine. New England Journal of Medicine, 369, 2083–2092.

  • Huang, X., Wolfe, R. A. (2002). A frailty model for informative censoring. Biometrics, 58, 510–520.

  • Huling, J. D., Yu, M., O’Malley, A. J. (2019). Instrumental variable based estimation under the semiparametric accelerated failure time model. Biometrics, 75, 516–527.

  • Janes, H. E., Cohen, K. W., Frahm, N., De Rosa, S. C., Sanchez, B., Hural, J., Magaret, C. A., Karuna, S., Bentley, C., Gottardo, R., et al. (2017). Higher T-cell responses induced by DNA/rAd5 HIV-1 preventive vaccine are associated with lower HIV-1 infection risk in an efficacy trial. The Journal of Infectious Diseases, 215, 1376–1385.

  • Kalbfleisch, J. D., Prentice, R. L. (2011). The statistical analysis of failure time data. New York: Wiley.

  • Li, G., Lu, X. (2015). A Bayesian approach for instrumental variable analysis with censored time-to-event outcome. Statistics in Medicine, 34, 664–684.

  • Li, S., Gray, R. J. (2016). Estimating treatment effect in a proportional hazards model in randomized clinical trials with all-or-nothing compliance. Biometrics, 72, 742–750.

  • Li, S., Peng, L. (2021). Instrumental variable estimation of complier causal treatment effect with interval-censored data. Biometrics, 79, 253–263.

  • Lin, H., Li, Y., Jiang, L., Li, G. (2014). A semiparametric linear transformation model to estimate causal effects for survival data. Canadian Journal of Statistics, 42, 18–35.

  • Lorentz, G. G. (1986). Bernstein polynomials (2nd ed.). New York: Chelsea Publishing Co.

  • Ma, L., Hu, T., Sun, J. (2015). Sieve maximum likelihood regression analysis of dependent current status data. Biometrika, 102, 731–738.

  • Ma, L., Hu, T., Sun, J. (2016). Cox regression analysis of dependent interval-censored failure time data. Computational Statistics & Data Analysis, 103, 79–90.

  • Nie, H., Cheng, J., Small, D. S. (2011). Inference for the effect of treatment on survival probability in randomized trials with noncompliance and administrative censoring. Biometrics, 67, 1397–1405.

  • O’Malley, A. J., Cotterill, P., Schermerhorn, M. L., Landon, B. E. (2011). Improving observational study estimates of treatment effects using joint modeling of selection effects and outcomes: The case of AAA repair. Medical Care, 49, 1126.

  • Shen, X., Wong, W. H. (1994). Convergence rate of sieve estimates. Annals of Statistics, 22, 580–615.

  • Sun, J. (1999). A nonparametric test for current status data with unequal censoring. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61, 243–250.

  • Sun, J. (2006). The statistical analysis of interval-censored failure time data. New York: Springer.

  • van der Vaart, A. W., Wellner, J. A. (1996). Weak convergence and empirical processes. New York: Springer.

  • Wang, P., Zhao, H., Sun, J. (2016). Regression analysis of case k interval-censored failure time data in the presence of informative censoring. Biometrics, 72, 1103–1112.

  • Youyi, F., Shen, X., Ashley, V. C., Aaron, D., Seaton, K. E., Yu, C., Grant, S. P., Guido, F., Decamp, A. C., Bailer, R. T. (2018). Modification of the association between T-cell immune responses and human immunodeficiency virus type 1 infection risk by vaccine-induced antibody responses in the HVTN 505 trial. Journal of Infectious Diseases, 217, 1280–1288.

  • Yu, W., Chen, K., Sobel, M. E., Ying, Z. (2015). Semiparametric transformation models for causal inference in time to event studies with all-or-nothing compliance. Journal of the Royal Statistical Society Series B, Statistical Methodology, 77, 397–415.

  • Zeng, D. (2012). Estimating treatment effects with treatment switching via semicompeting risks models: An application to a colorectal cancer study. Biometrika, 99, 167–184.

  • Zhang, Y., Hua, L., Huang, J. (2010). A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scandinavian Journal of Statistics, 37, 338–354.

  • Zhang, Z., Sun, J., Sun, L. (2005). Statistical analysis of current status data with informative observation times. Statistics in Medicine, 24, 1399–1407.

  • Zhang, Z., Sun, L., Sun, J., Finkelstein, D. M. (2007). Regression analysis of failure time data with informative interval censoring. Statistics in Medicine, 26, 2533–2546.

  • Zhou, Q., Hu, T., Sun, J. (2016). A sieve semiparametric maximum likelihood approach for regression analysis of bivariate interval-censored failure time data. Journal of the American Statistical Association, 112, 664–672.

Acknowledgements

The authors wish to thank the Associate Editor and two reviewers for their many helpful comments and suggestions that greatly improved the paper. This work was partially supported by the Natural Science Foundation of Jilin Province (Grant No. 20230101002JC) and the National Natural Science Foundation of China (Grant No. 11801212, Grant No. 12071176).

Author information

Corresponding author

Correspondence to Peijie Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proofs of Theorems 1–3

In this Appendix, we will sketch the proofs for the asymptotic properties of \({\hat{\xi }}_n\) given in Theorems 1–3. To establish the asymptotic properties of \({\hat{\xi }}_n\), we need the following regularity conditions, which are commonly used in the studies of interval-censored data and usually satisfied in practice (Huang and Wolfe 2002; Ma et al. 2016; Zhang et al. 2010; Zhou et al. 2016).

(A.1):

The true value for \(\eta\), denoted as \(\eta _0\), is in the interior of a compact set \(\mathcal {B}\) in \(R^{p_{\eta }}\), \(\Vert \eta _0\Vert \le B\) for a constant \(B>0\), and \(P(R-L>\varepsilon )=1\) for some \(\varepsilon >0\).

(A.2):

The distribution of the covariate X has a bounded support in \(R^p\) and is not concentrated on any proper subspace of \(R^p\).

(A.3):

The first derivatives of \(\Lambda _{t0}(\cdot )\) and \(\Lambda _{w0}(\cdot )\), denoted by \(\Lambda _{t0}^{(1)}(\cdot )\) and \(\Lambda _{w0}^{(1)}(\cdot )\), are Hölder continuous with exponent \(s\in (0, 1]\). That is, there exists a constant \(K > 0\) such that \(|\Lambda _{t0}^{(1)}(t_1)- \Lambda _{t0}^{(1)}(t_2)|\le K|t_1-t_2|^{s}\) for all \(t_1\), \(t_2\in [\sigma _1,\tau _1]\), where \(0<\sigma _1<\tau _1<\infty\), and \(\Lambda _{w0}^{(1)}(\cdot )\) satisfies the analogous condition. Let \(v=1+s\).

(A.4):

There exists a constant \(K >0\) such that for every \(\xi\) in a neighborhood of \(\xi _0\), \(P\{l(\xi ,O)-l(\xi _0,O)\}\preceq -Kd^2(\xi ,\xi _0)\), where O is the observation data and \(\preceq\) means ‘smaller than, up to a constant.’

(A.5):

The matrix \(E({l^*(\eta _0,O)}^{\otimes 2})\) is finite and positive definite, where \(a^{\otimes 2}=aa^{T}\) for a vector a, and \(l^*(\eta ,O)\) is the efficient score for \(\eta\) based on the observation O and given in the proof of Theorem 3.

For the proof, we will mainly employ empirical process theory and some nonparametric techniques. Let \(Pf=\int f(y)dP\) denote the expectation of f(Y) under the probability measure P, and \(P_nf = n^{-1}\sum _{i=1}^nf(Y_i)\) the expectation of f(Y) under the empirical measure \(P_n\). Consider the class \(\mathcal {L}_n =\{l(\xi ,O ):\xi \in \Psi _n\}\), where \(l(\xi ,O)\) is the log-likelihood function based on a single observation O. For any \(\epsilon >0\), define the covering number \(N(\epsilon ,\mathcal {L}_n, L_1(P_n))\) of this class as the smallest positive integer \(\kappa\) for which there exists \(\{\xi ^{(1)},\ldots ,\xi ^{(\kappa )}\}\) such that

$$\begin{aligned} \min _{j\in \{1,\ldots ,\kappa \}}\frac{1}{n}\sum _{i=1}^n|l(\xi ,O_i)-l(\xi ^{(j)},O_i)|<\epsilon \end{aligned}$$

for all \(\xi \in \Psi _n\), where \(\{O_1,\ldots ,O_n\}\) represent the observed data and for \(j =1,\ldots ,\kappa\), \(\xi ^{(j)}=(\eta ^{(j)},\Lambda _t^{(j)},\Lambda _w^{(j)})\in \Psi _n\). If no such \(\kappa\) exists, define \(N(\epsilon ,\mathcal {L}_n, L_1(P_n))=\infty\). Also for the proof, we need first to establish the following two lemmas, whose proofs are similar to those for Lemmas 1 and 2 in Zhou et al. (2016).

Lemma 1

Assume that the regularity conditions (A.1)-(A.3) given above hold. Then, we have that the covering number of the class \(\mathcal {L}_n =\{l(\xi ,O):\xi \in \Psi _n\}\) satisfies

$$\begin{aligned} N\Big (\epsilon ,\mathcal {L}_n, L_1(P_n)\Big )\le KM_n^{2(m+1)}\epsilon ^{-(p_{\eta }+2(m+1))} \end{aligned}$$

for a constant K, where \(m=o(n^q)\) with \(q\in (0, 1)\) is the degree of Bernstein polynomials, and \(M_n = O(n^a)\) with \(a>0\) controls the size of the sieve space \(\Psi _n\).

Lemma 2

Assume that the regularity conditions (A.1)–(A.3) given above hold. Then, we have that

$$\begin{aligned} \sup _{\xi \in \Psi _n}|P_nl(\xi ,O)-Pl(\xi ,O)|\rightarrow 0 \end{aligned}$$

almost surely.
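As a heuristic reading of how the bound in Lemma 1 relates to the uniform convergence in Lemma 2 (a sketch under the stated orders of \(m\) and \(M_n\) and for a fixed \(\epsilon \in (0,1)\), not a reproduction of the argument in Zhou et al. 2016), taking logarithms in Lemma 1 gives

$$\begin{aligned} \log N\Big (\epsilon ,\mathcal {L}_n, L_1(P_n)\Big )\le \log K+2(m+1)\log M_n+\{p_{\eta }+2(m+1)\}\log (1/\epsilon )=O(n^{q}\log n)=o(n), \end{aligned}$$

since \(m=o(n^q)\) with \(q\in (0,1)\) and \(\log M_n=O(\log n)\) when \(M_n=O(n^a)\). The entropy of the sieve therefore grows more slowly than the sample size, which is the kind of condition under which uniform laws of large numbers are typically obtained.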

Now, we are ready to prove Theorems 1–3.

Proof of Theorem 1

We first prove the strong consistency of \({\hat{\xi }}_n\). Let \(l(\xi ,O)\) denote the log-likelihood function based on a given single observation O and consider the class of functions \(\mathcal {L}_n =\{l(\xi ,O ):\xi \in \Psi _n\}\). By Lemma 1, the covering number of \(\mathcal {L}_n\) satisfies

$$\begin{aligned} N\Big (\epsilon ,\mathcal {L}_n, L_1(P_n)\Big )\le KM_n^{2(m+1)}\epsilon ^{-(p_{\eta }+2m+2)}. \end{aligned}$$

Furthermore, by Lemma 2, we have

$$\begin{aligned} \sup _{\xi \in \Psi _n}|P_nl(\xi ,O)-Pl(\xi ,O)|\rightarrow 0 \ \text {almost surely}. \end{aligned} \tag{5}$$

Note that \(Pl(\xi ,O)=P\{pl(\xi ,O)\}=Pl(\xi ,O)\) and \(\xi _0\) maximizes \(Pl(\xi ,O)\). Let \(M(\xi ,O)=-l(\xi ,O)\), and define \(K_{\epsilon }=\{\xi :d(\xi ,\xi _0)\ge \epsilon ,\xi \in \Psi _n\}\) for \(\epsilon >0\) and

$$\begin{aligned} \zeta _{1n}=\sup _{\xi \in \Psi _n}|P_nM(\xi ,O)-PM(\xi ,O)|,\zeta _{2n}=P_nM(\xi _0,O)-PM(\xi _0,O). \end{aligned}$$

Then,

$$\begin{aligned} \inf _{K_{\epsilon }}PM(\xi ,O)&=\inf _{K_{\epsilon }}\Big \{PM(\xi ,O)-P_nM(\xi ,O)+P_nM(\xi ,O)\Big \}\\&\le \zeta _{1n}+\inf _{K_{\epsilon }}P_nM(\xi ,O). \end{aligned} \tag{6}$$

If \({\hat{\xi }}_n\in K_{\epsilon }\), then we have

$$\begin{aligned} \inf _{K_{\epsilon }}P_nM(\xi ,O)=P_nM({\hat{\xi }}_n,O)\le P_nM(\xi _0,O)=\zeta _{2n}+PM(\xi _0,O). \end{aligned} \tag{7}$$

Define \(\delta _{\epsilon }=\inf _{K_{\epsilon }}PM(\xi ,O)-PM(\xi _0,O)\). Under Condition (A.4), we have \(\delta _{\epsilon } > 0\). It follows from (6) and (7) that

$$\begin{aligned} \inf _{K_{\epsilon }}PM(\xi ,O)\le \zeta _{1n}+\zeta _{2n}+PM(\xi _0,O)=\zeta _n+PM(\xi _0,O), \end{aligned}$$

with \(\zeta _n = \zeta _{1n} + \zeta _{2n}\), and hence, \(\zeta _n\ge \delta _{\epsilon }\). This gives \(\{{\hat{\xi }}_n\in K_{\epsilon }\}\subseteq \{\zeta _n\ge \delta _{\epsilon }\}\), and by (5) and the strong law of large numbers, we have both \(\zeta _{1n}\rightarrow 0\) and \(\zeta _{2n}\rightarrow 0\) almost surely. Therefore, \(\bigcap _{k=1}^{\infty }\bigcup _{n=k}^{\infty }\{{\hat{\xi }}_n\in K_{\epsilon }\}\subseteq \bigcap _{k=1}^{\infty }\bigcup _{n=k}^{\infty }\{\zeta _n\ge \delta _{\epsilon }\}\), and the event on the right-hand side has probability zero because \(\zeta _n\rightarrow 0\) almost surely and \(\delta _{\epsilon }>0\). Since \(\epsilon >0\) is arbitrary, this proves that \(d({\hat{\xi }}_n,\xi _0)\rightarrow 0\) almost surely. \(\square\)

Proof of Theorem 2

We will establish the convergence rate of \({\hat{\xi }}_n\) by using Theorem 3.4.1 of van der Vaart and Wellner (1996). First note from Theorem 1.6.2 of Lorentz (1986) that there exist Bernstein polynomials \(\Lambda _{tn0}\) and \(\Lambda _{wn0}\) such that \(\Vert \Lambda _{tn0}-\Lambda _{t0}\Vert _{\infty }=O(m^{-v/2})\) and \(\Vert \Lambda _{wn0}-\Lambda _{w0}\Vert _{\infty }=O(m^{-v/2})\). Then, writing \(\xi _{n0}=(\eta _0,\Lambda _{tn0},\Lambda _{wn0})\), we have \(d(\xi _{n0},\xi _0)=O(n^{-vq/2})\). For any \(s>0\), define the class of functions \(\mathcal {F}_s=\{l(\xi ,O)-l(\xi _{n0},O):\xi \in \Psi _n,s/2<d(\xi ,\xi _{n0})\le s\}\). One can easily show that \(P\{l(\xi _0,O)-l(\xi _{n0},O)\}\le Kd^2(\xi _0,\xi _{n0})\le Kn^{-vq}\). Hence, under Condition (A.4), we have for large n, \(P\{l(\xi ,O)-l(\xi _{n0},O)\}=P\{l(\xi ,O)-l(\xi _0,O)\}+P\{l(\xi _0,O)-l(\xi _{n0},O)\}\le -Ks^2+Kn^{-vq}=-Ks^2\), for any \(l(\xi ,O)-l(\xi _{n0},O)\in \mathcal {F}_s\).
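To make the Bernstein approximation step concrete, the following minimal Python sketch builds the classical degree-\(m\) Bernstein polynomial of a function on an interval and reports its sup-norm error on a grid. Everything in it is an illustrative assumption rather than the sieve construction of the paper: the toy cumulative hazard \(\Lambda (t)=t^2\), the interval \([0.5, 2]\), and the helper names `bernstein_approx` and `sup_error` are hypothetical.

```python
from math import comb


def bernstein_approx(f, m, lo, hi):
    """Return the classical degree-m Bernstein polynomial of f on [lo, hi]."""
    # Coefficients are the values of f at the m + 1 equally spaced nodes.
    coefs = [f(lo + (hi - lo) * k / m) for k in range(m + 1)]

    def poly(t):
        x = (t - lo) / (hi - lo)  # rescale t to [0, 1]
        return sum(
            c * comb(m, k) * x ** k * (1 - x) ** (m - k)
            for k, c in enumerate(coefs)
        )

    return poly


def sup_error(f, m, lo, hi, grid=401):
    """Approximate the sup-norm error of the Bernstein polynomial on a grid."""
    p = bernstein_approx(f, m, lo, hi)
    pts = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    return max(abs(f(t) - p(t)) for t in pts)


def toy_cumulative_hazard(t):
    # Hypothetical smooth, increasing function standing in for Lambda_{t0}.
    return t * t


errors = {m: sup_error(toy_cumulative_hazard, m, 0.5, 2.0) for m in (5, 10, 20, 40)}
for m, err in errors.items():
    print(m, err)
```

For this quadratic toy function the error of the Bernstein operator is known in closed form, \((\tau _1-\sigma _1)^2x(1-x)/m\) in the rescaled variable \(x\), so the sup-norm error decays like \(1/m\), which matches the order \(m^{-v/2}\) with \(v=2\).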

Following the calculations in Shen and Wong (1994), we can establish that for \(0<\epsilon <s\), \(\log N_{[]}(\epsilon , \mathcal {F}_s, L_2(P))\le KN\log (s/\epsilon )\) with \(N = 2(m+1)\). Moreover, some algebraic manipulations yield that \(P\{l(\xi ,O)-l(\xi _{n0},O)\}^2\le Ks^2\) for any \(l(\xi ,O)-l(\xi _{n0},O)\in \mathcal {F}_s\). It is easy to see that \(\mathcal {F}_s\) is uniformly bounded. Therefore, by Lemma 3.4.2 of van der Vaart and Wellner (1996), we obtain

$$\begin{aligned} E_{P}\left\| n^{1 / 2}\left( P_{n}-P\right) \right\| _{\mathcal {F}_{s}} \le K J_{[]}\left( s, \mathcal {F}_{s}, L_{2}(P)\right) \left\{ 1+\frac{J_{[]}\left( s, \mathcal {F}_{s}, L_{2}(P)\right) }{s^{2} n^{1 / 2}}\right\} , \end{aligned}$$

where \(J_{[]}\left( s, \mathcal {F}_{s}, L_{2}(P)\right) =\int _0^s\{1+\log N_{[]}(\epsilon , \mathcal {F}_s, L_2(P))\}^{1/2}d\epsilon \le KN^{1/2}s\). This yields \(\phi _n(s)=N^{1/2}s+N/n^{1/2}\). It is easy to see that \(\phi _n(s)/s\) is decreasing in s, and \(v_n^2\phi _n(1/v_n)=v_nN^{1/2}+v_n^2N/n^{1/2}\le Kn^{1/2}\), where \(v_n=N^{-1/2}n^{1/2}=n^{(1-q)/2}\).
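As a quick check of the last bound, taking \(v_n=N^{-1/2}n^{1/2}\) exactly as defined above and substituting \(s=1/v_n\) into \(\phi _n\) gives

$$\begin{aligned} v_n^2\phi _n(1/v_n)&=v_n^2\left( N^{1/2}v_n^{-1}+Nn^{-1/2}\right) =v_nN^{1/2}+v_n^2Nn^{-1/2}\\&=n^{1/2}+\frac{n}{N}\cdot \frac{N}{n^{1/2}}=2n^{1/2}, \end{aligned}$$

so the inequality holds with \(K=2\). Likewise, \(\phi _n(s)/s=N^{1/2}+N/(n^{1/2}s)\), which is decreasing in s.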

Finally, note that \(P_n\{l({\hat{\xi }}_n,O)-l(\xi _{n0},O)\}\ge 0\) and \(d({\hat{\xi }}_n,\xi _{n0})\le d({\hat{\xi }}_n,\xi _{0})+d(\xi _0,\xi _{n0})\rightarrow 0\) in probability. Thus by applying Theorem 3.4.1 of van der Vaart and Wellner (1996), we have \(n^{(1-q)/2}d({\hat{\xi }}_n,\xi _{n0})= O_p(1)\). This together with \(d(\xi _{n0},\xi _{0}) = O(n^{-vq/2})\) yields that \(d({\hat{\xi }}_n,\xi _{0})=O_p(n^{-(1-q)/2}+n^{-vq/2})\) and the proof is completed. \(\square\)
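To see how the degree of the sieve trades off the two terms in this rate, the following short calculation, offered as an illustration rather than as part of the proof and treating \(q\) as a free choice, balances the two exponents:

$$\begin{aligned} \frac{1-q}{2}=\frac{vq}{2}\iff q=\frac{1}{1+v},\qquad \text {in which case}\quad d({\hat{\xi }}_n,\xi _{0})=O_p\left( n^{-v/\{2(1+v)\}}\right) . \end{aligned}$$

For instance, \(s=1\) gives \(v=2\), \(q=1/3\), and the rate \(n^{-1/3}\).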

Proof of Theorem 3

Now, we will prove the asymptotic normality of \({\hat{\eta }}_n\). Let V denote the linear span of \(\Psi -\xi _0\) and define the Fisher inner product for \(u,\tilde{u} \in V\) as \(<u,\tilde{u}>=P\{\dot{l}(\xi _0,O)[u]\dot{l}(\xi _0,O)[\tilde{u}]\}\) and the Fisher norm for \(u\in V\) as \(||u||^2=<u,u>\), where

$$\begin{aligned} \dot{l}(\xi _0,O)[u]=\frac{dl(\xi _0+su,O)}{ds}\big |_{s=0} \end{aligned}$$

denotes the first-order directional derivative of \(l(\xi ,O)\) at the direction \(u\in V\) (evaluated at \(\xi _0\)). Also, let \(\bar{V}\) be the closed linear span of V under the Fisher norm. Then, \((\bar{V}, ||\cdot ||)\) is a Hilbert space. Furthermore, for a \(p_{\eta }\)-dimensional vector b with \(||b||\le 1\) and for any \(u\in V\), define a smooth function of \(\xi\) as \(h(\xi )=b^{T}\eta\) and

$$\begin{aligned} \dot{h}(\xi _0)[u]=\frac{dh(\xi _0+su)}{ds}\big |_{s=0} \end{aligned}$$

whenever the right-hand side limit is well defined. Then by the Riesz representation theorem, there exists \(u^{*}\in \bar{V}\) such that \(\dot{h}(\xi _0)[u]=<u,u^{*}>\) for all \(u\in \bar{V}\) and \(||u^{*}||=||\dot{h}(\xi _0)||\). Also, note that \(h(\xi )-h(\xi _0) = \dot{h}(\xi _0)[\xi -\xi _0]\). It thus follows from the Cramér–Wold device that to prove the asymptotic normality for \({\hat{\eta }}_n\), i.e., \(n^{1/2}({\hat{\eta }}_n-\eta _0)\rightarrow N(0, I^{-1}(\eta _0))\) in distribution, it suffices to show that

$$\begin{aligned} n^{1/2}<{\hat{\xi }}_n-\xi _0,u^{*}>\rightarrow _{d}N(0,b^{T}I^{-1}(\eta _0)b) \end{aligned}$$

since \(b^{T}({\hat{\eta }}_n-\eta _0)=h({\hat{\xi }}_n)-h(\xi _0)=\dot{h}(\xi _0)[{\hat{\xi }}_n-\xi _0]=<{\hat{\xi }}_n-\xi _0,u^{*}>\). In fact, the above holds since one can show that \(n^{1/2}<{\hat{\xi }}_n-\xi _0,u^{*}>\rightarrow _{d}N(0,||u^{*}||^2)\) and \(||u^{*}||^2=b^{T}I^{-1}(\eta _0)b\).

We first prove that \(n^{1/2}<{\hat{\xi }}_n-\xi _0,u^{*}>\rightarrow _{d}N(0,||u^{*}||^2)\). Let \(\delta _n=n^{-\min \{(1-q)/2,vq/2\}}\) denote the rate of convergence obtained in Theorem 2, and for any \(\xi \in \Psi\) such that \(d(\xi ,\xi _0)\le \delta _n\), define the first-order directional derivative of \(l(\xi ,O)\) at the direction \(u\in V\) as

$$\begin{aligned} \dot{l}(\xi ,O)[u]=\frac{dl(\xi +su,O)}{ds}\big |_{s=0} \end{aligned}$$

and the second-order directional derivative at the direction \(u,\tilde{u} \in V\) as

$$\begin{aligned} {\ddot{l}}(\xi ,O)[u,\tilde{u}]=\frac{d^2l(\xi +su+\tilde{s}\tilde{u},O)}{d\tilde{s}ds} \big |_{s=0}\big |_{\tilde{s}=0}=\frac{d\dot{l}(\xi +\tilde{s}\tilde{u},O)[u]}{d\tilde{s}} \big |_{\tilde{s}=0}. \end{aligned}$$

Note that by Condition (A.3) and Theorem 1.6.2 of Lorentz (1986), there exists \(\Pi _n u^{*}\in \Psi -\xi _0\) such that \(||\Pi _n u^{*}-u^{*}||=O(n^{-qv})\). Furthermore, under the assumption \(q>1/(2v)\), we have \(\delta _n ||\Pi _n u^{*}-u^{*}||=o(n^{-1/2})\). Define \(v[\xi -\xi _0,O]=l(\xi ,O)-l(\xi _0,O)-\dot{l}(\xi _0,O)[\xi -\xi _0]\) and let \(\varepsilon _n\) be any positive sequence satisfying \(\varepsilon _n=o(n^{-1/2})\). Then by the definition of \({\hat{\xi }}_n\), we have

$$\begin{aligned} \begin{aligned} 0 \le&P_{n}\left[ l\left( {\hat{\xi }}_{n}, O\right) -l\left( {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}, O\right) \right] \\ =&\pm \varepsilon _{n} P_{n} \dot{l}\left( \xi _{0}, O\right) \left[ \Pi _{n} u^{*}\right] +\left( P_{n}-P\right) \left\{ v\left[ {\hat{\xi }}_{n}-\xi _{0}, O\right] -v\left[ {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}-\xi _{0}, O\right] \right\} \\&+P\left\{ v\left[ {\hat{\xi }}_{n}-\xi _{0}, O\right] -v\left[ {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}-\xi _{0}, O\right] \right\} \\ =&\pm \varepsilon _{n} P_{n} \dot{l}\left( \xi _{0}, O\right) \left[ u^{*}\right] \pm \varepsilon _{n} P_{n} \dot{l}\left( \xi _{0}, O\right) \left[ \Pi _{n} u^{*}-u^{*}\right] +\left( P_{n}-P\right) \left\{ v\left[ {\hat{\xi }}_{n}-\xi _{0}, O\right] \right. \\&\left. -v\left[ {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}-\xi _{0}, O\right] \right\} +P\left\{ v\left[ {\hat{\xi }}_{n}-\xi _{0}, O\right] -v\left[ {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}-\xi _{0}, O\right] \right\} \\ =&\pm \varepsilon _{n} P_{n} \dot{l}\left( \xi _{0}, O\right) \left[ u^{*}\right] \pm I_{1}+I_{2}+I_{3}. \end{aligned} \end{aligned}$$

We now investigate the asymptotic behavior of \(I_1\), \(I_2\) and \(I_3\). For \(I_1\), it follows from Conditions (A.1)–(A.3), Chebyshev's inequality and \(||\Pi _{n} u^{*}-u^{*}||=o(1)\) that \(I_1 =\varepsilon _n \times o_p(n^{-1/2})\). For \(I_2\), by the mean value theorem, we obtain that

$$\begin{aligned} \begin{aligned} I_2&=(P_n-P)\left\{ l({\hat{\xi }}_n,O)-l({\hat{\xi }}_n\pm \varepsilon _{n}\Pi _{n} u^{*},O)\pm \varepsilon _n\dot{l}(\xi _0,O)[\Pi _{n} u^{*}]\right\} \\&=\mp \varepsilon _n(P_n-P)\left\{ (\dot{l}({\tilde{\xi }},O)-\dot{l}(\xi _0,O))[\Pi _{n} u^{*}]\right\} , \end{aligned} \end{aligned}$$

where \({\tilde{\xi }}\) lies between \({\hat{\xi }}_n\) and \({\hat{\xi }}_n\pm \varepsilon _n\Pi _{n} u^{*}\). By Theorem 2.8.3 of van der Vaart and Wellner (1996), we know that \(\{\dot{l}(\xi ,O)[\Pi _{n} u^{*}]:||\xi -\xi _0||\le \delta _n\}\) is a Donsker class. Therefore, by Theorem 2.11.23 of van der Vaart and Wellner (1996), we have \(I_2 = \varepsilon _n\times o_p(n^{-1/2})\). For \(I_3\), note that

$$\begin{aligned} \begin{aligned} P\left( v\left[ \xi -\xi _{0}, O\right] \right) =&P\left\{ l(\xi , O)-l\left( \xi _{0}, O\right) -\dot{l}\left( \xi _{0}, O\right) \left[ \xi -\xi _{0}\right] \right\} \\ =&2^{-1} P\left\{ \ddot{l}({\tilde{\xi }}, O)\left[ \xi -\xi _{0}, \xi -\xi _{0}\right] -\ddot{l}\left( \xi _{0}, O\right) \left[ \xi -\xi _{0}, \xi -\xi _{0}\right] \right\} \\&+2^{-1} P\left\{ \ddot{l}\left( \xi _{0}, O\right) \left[ \xi -\xi _{0}, \xi -\xi _{0}\right] \right\} \\ =&2^{-1} P\left\{ \ddot{l}\left( \xi _{0}, O\right) \left[ \xi -\xi _{0}, \xi -\xi _{0}\right] \right\} +\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) , \end{aligned} \end{aligned}$$

where \({\tilde{\xi }}\) lies between \(\xi _0\) and \(\xi\) and the last equation follows from Taylor expansion and Conditions (A.1)-(A.3). Therefore,

$$\begin{aligned} \begin{aligned} I_{3}&=-2^{-1}\left\{ \left\| {\hat{\xi }}_{n}-\xi _{0}\right\| ^{2}-\left\| {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}-\xi _{0}\right\| ^{2}\right\} +\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) \\&=\pm \varepsilon _{n}<{\hat{\xi }}_{n}-\xi _{0}, \Pi _{n} u^{*}>+2^{-1}\left\| \varepsilon _{n} \Pi _{n} u^{*}\right\| ^{2}+\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) \\&=\pm \varepsilon _{n}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>+2^{-1}\left\| \varepsilon _{n} \Pi _{n} u^{*}\right\| ^{2}+\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) \\&=\pm \varepsilon _{n}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>+\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) , \end{aligned} \end{aligned}$$

where the last equality holds because of the fact that \(\delta _n ||\Pi _n u^{*}-u^{*}||=o(n^{-1/2})\), the Cauchy–Schwarz inequality and \(||\Pi _nu^{*}||^2\rightarrow ||u^{*}||^2\). Combining the above facts, together with \(P\dot{l}(\xi _0,O)[u^{*}]=0\), we can establish that

$$\begin{aligned} \begin{aligned} 0&\le P_{n}\left\{ l\left( {\hat{\xi }}_{n}, O\right) -l\left( {\hat{\xi }}_{n} \pm \varepsilon _{n} \Pi _{n} u^{*}, O\right) \right\} \\&=\pm \varepsilon _{n} P_{n} \dot{l}\left( \xi _{0}, O\right) \left[ u^{*}\right] \pm \varepsilon _{n}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>+\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) \\&=\pm \varepsilon _{n}\left( P_{n}-P\right) \{\dot{l}\left( \xi _{0}, O\right) \left[ u^{*}\right] \} \pm \varepsilon _{n}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>+\varepsilon _{n} \times o_{p}\left( n^{-1 / 2}\right) . \end{aligned} \end{aligned}$$

Therefore, we obtain \(\pm n^{1/2}(P_{n}-P)\{{\dot{l}}(\xi _{0}, O)[u^{*}]\} \pm n^{1/2}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>+o_p(1)\ge 0\) and then \(n^{1/2}<{\hat{\xi }}_{n}-\xi _{0}, u^{*}>=n^{1/2}\left( P_{n}-P\right) \left\{ {\dot{l}}\left( \xi _{0}, O\right) [u^{*}]\right\} +o_p(1)\rightarrow _{d}N(0,||u^{*}||^2)\) by the central limit theorem and \(||u^{*}||^2=||\dot{l}\left( \xi _{0}, O\right) \left[ u^{*}\right] ||^2\).

Next we will prove that \(||u^{*}||^2= b^{T}I^{-1}(\eta _0)b\). For each component \(\eta _j\), \(j=1,2,\ldots ,p_{\eta }\), we denote by \(\phi ^{*}_j=(b^{*}_{1j}, b^{*}_{2j})\) the value of \(\phi _j = (b_{1j}, b_{2j})\) minimizing

$$\begin{aligned} E \left\{ l_{\eta }\cdot e_j-l_{b_1}[b_{1j}]-l_{b_2}[b_{2j}]\right\} ^2, \end{aligned}$$

where \(l_{\eta }\) is the score function for \(\eta\), \(l_{b_1}\) and \(l_{b_2}\) are the score operators for \(\Lambda _t\) and \(\Lambda _w\), respectively, and \(e_j\) is a \(p_{\eta }\)-dimensional vector of zeros except for the j-th element, which equals 1.

Define the j-th element of \(l^{*}(\eta ,O)\) as \(l_{\eta }\cdot e_j-l_{b_1}[b^{*}_{1j}]-l_{b_2}[b^{*}_{2j}]\), \(j=1,2,\ldots ,p_{\eta }\), and \(I(\eta )\) as \(E(\{l^{*}(\eta ,O)\}^{\otimes 2})\). By Condition (A.5), the matrix \(I(\eta _0)\) is positive definite. Furthermore, by following similar calculations in Chen et al. (2006), we obtain

$$\begin{aligned} ||u^{*}||^2=||\dot{h}(\xi _0)||^2=\sup _{u\in \bar{V}:||u||>0}\frac{|\dot{h}(\xi _0)[u]|^2}{||u||^2}=b^{T}[E(\{l^{*}(\eta _0,O)\}^{\otimes 2})]^{-1}b=b^{T}I^{-1}(\eta _0)b. \end{aligned}$$

Thus, we have shown that \(n^{1/2}({\hat{\eta }}_n-\eta _0)\rightarrow N(0, I^{-1}(\eta _0))\) in distribution for the estimator \({\hat{\eta }}_n\).\(\square\)
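The identity \(||u^{*}||^2=b^{T}I^{-1}(\eta _0)b\) can be illustrated in a toy finite-dimensional analogue. The Python sketch below assumes a scalar \(\eta\), a single scalar nuisance parameter in place of \((\Lambda _t,\Lambda _w)\), and a hypothetical \(2\times 2\) Fisher information matrix; the function names and numerical values are illustrative and do not come from the paper. It computes the efficient information from the least-squares projection of the \(\eta\)-score onto the nuisance score, and compares its inverse with a grid-search value of \(\sup _u|\dot{h}(\xi _0)[u]|^2/||u||^2\).

```python
def efficient_information(i_ee, i_el, i_ll):
    """Efficient information for eta in a toy model with one nuisance parameter.

    The hypothetical 2x2 Fisher information is [[i_ee, i_el], [i_el, i_ll]].
    """
    # Least-squares coefficient projecting the eta-score on the nuisance score.
    beta_star = i_el / i_ll
    # Second moment of the residual score l_eta - beta_star * l_lambda.
    return i_ee - 2.0 * beta_star * i_el + beta_star ** 2 * i_ll


def sup_ratio(i_ee, i_el, i_ll, b=1.0, grid=20001, half_width=10.0):
    """Grid search for sup_u |h'(u)|^2 / ||u||^2 with h'(u) = b * u_eta."""
    best = 0.0
    for k in range(grid):
        # Directions u = (1, t): the ratio is scale invariant, and any
        # direction with u_eta = 0 gives a ratio of zero.
        t = -half_width + 2.0 * half_width * k / (grid - 1)
        fisher_norm_sq = i_ee + 2.0 * i_el * t + i_ll * t * t
        best = max(best, b * b / fisher_norm_sq)
    return best


info = efficient_information(2.0, 0.5, 1.0)
ratio = sup_ratio(2.0, 0.5, 1.0)
print(info, ratio, 1.0 / info)
```

In this toy setting the supremum is attained at the direction \(u=(1,-\beta ^{*})\), the finite-dimensional counterpart of subtracting the projected nuisance scores \(l_{b_1}[b^{*}_{1j}]\) and \(l_{b_2}[b^{*}_{2j}]\) from \(l_{\eta }\).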

About this article

Cite this article

Ma, Y., Wang, P. & Sun, J. Estimation of complier causal treatment effects with informatively interval-censored failure time data. Ann Inst Stat Math 75, 1039–1062 (2023). https://doi.org/10.1007/s10463-023-00874-6

  • DOI: https://doi.org/10.1007/s10463-023-00874-6
