Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random

Fang, Jianglin

doi:10.1007/s00362-019-01128-5

Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random

Regular Article
Published: 07 August 2019

Volume 62, pages 1141–1166, (2021)
Cite this article

Statistical Papers Aims and scope Submit manuscript

Jianglin Fang¹

298 Accesses
6 Altmetric
1 Mention
Explore all metrics

Abstract

In modern statistical applications, the dimension of covariates can be much larger than the sample size, and extensive research has been done on screening methods which can effectively reduce the dimensionality. However, the existing feature screening procedure can not be used to handle the ultrahigh-dimensional survival data problems when failure indicators are missing at random. This motivates us to develop a feature screening procedure to handle this case. In this paper, we propose a feature screening procedure by sieved nonparametric maximum likelihood technique for ultrahigh-dimensional survival data with failure indicators missing at random. The proposed method has several desirable advantages. First, it does not rely on any model assumption and works well for nonlinear survival regression models. Second, it can be used to handle the incomplete survival data with failure indicators missing at random. Third, the proposed method is invariant under the monotone transformation of the response and satisfies the sure screening property. Simulation studies are conducted to examine the performance of our approach, and a real data example is also presented for illustration.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Nonparametric independence feature screening for ultrahigh-dimensional survival data

Article 25 April 2018

Feature screening based on distance correlation for ultrahigh-dimensional censored data with covariate measurement error

Article 12 October 2020

Feature Screening for High-Dimensional Survival Data via Censored Quantile Correlation

Article 07 November 2020

References

Bitouzé D, Laurent B, Massart P (1999) A Dvoretzky-Kiefer-Wolfowitz type inequality for the Kaplan-Meier estimator. Annals de I’Institut Henri Poincare B 35:735–763
Article MathSciNet Google Scholar
Candes E, Tao T (2007) The Dantzig selector: statistical estimation when $p$ is much larger than $n$. Ann Stat 35:2313–2351
MathSciNet MATH Google Scholar
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Article MathSciNet Google Scholar
Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Ser B 35:2313–2351
MATH Google Scholar
Fan J, Song R (2010) Sure independence screening in generalized linear models with NP-dimensionality. J Am Stat Assoc 38:3567–3604
MathSciNet MATH Google Scholar
Fan J, Feng Y, Wu Y (2010) High-dimensional variable selection for Cox’s proportional hazards model. Statistics 2:70–86
MathSciNet Google Scholar
Gill R (1981) Testing with replacement and the product limit estimator. Ann Stat 9:853–860
Article MathSciNet Google Scholar
Gill R (1983) Large sample behaviour of the product-limit estimator on the whole line. Ann Stat 11:49–58
Article MathSciNet Google Scholar
González S, Rueda M, Arcos A (2008) An improved estimator to analyse missing data. Stat Pap 49:791–792
Article MathSciNet Google Scholar
He X, Wang L, Hong H (2013) Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann Stat 41:342–369
MathSciNet MATH Google Scholar
Li R, Zhong W, Zhu L (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1140
Article MathSciNet Google Scholar
Li G, Peng H, Zhang J, Zhu L (2012) Robust rank correlation based screening. Ann Stat 40:846–877
MathSciNet MATH Google Scholar
Lin W, Lv J (2013) High-dimensional sparse additive hazards regression. J Am Stat Assoc 108:247–264
Article MathSciNet Google Scholar
Little R, Rubin D (2002) Statistical analysis with missing data. Wiley, Hoboken
Book Google Scholar
Qin J, Shao J, Zhang B (2008) ANOVA for longitudinal data with missing values. J Am Stat Assoc 103:797–810
Article Google Scholar
Rosenwald A, Wright G, Wiestner A, Chan W et al (2003) The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 3:185–197
Article Google Scholar
Shen Y, Liang H (2018) Quantile regression and its empirical likelihood with missing response at random. Stat Pap 59:685–707
Article MathSciNet Google Scholar
Song R, Lu W, Ma S, Jeng X (2014) Censored rank independence screening for high-dimensional survival data. Biometrika 101:799–814
Article MathSciNet Google Scholar
Tibshirani R (1997) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B 58:267–288
MathSciNet MATH Google Scholar
van der Laan M (1996) Efficient estimation in the bivariate censoring model and repairing NPMLE. Ann Stat 24:596–627
MathSciNet MATH Google Scholar
van der Laan M, Mckeague I (1998) Efficient estimation from right-censored data when failure indicators are missing at random. Ann Stat 26:164–182
MathSciNet MATH Google Scholar
Wang J (1987) A note on the uniform consistency of the Kaplan-Meier estimator. Ann Stat 15:1313–1316
Article MathSciNet Google Scholar
Wang Q, Rao J (2002) Empirical likelihood-based inference under imputation for missing response data. Ann Stat 30:894–924
Article MathSciNet Google Scholar
Wu Y, Yin G (2015) Conditional quantile screening in ultrahigh-dimensional heterogeneous data. Biometrika 102:65–76
Article MathSciNet Google Scholar
Zhang H, Lu W (2007) Adaptive Lasso for Cox’s proportional hazards model. Biometrika 94:691–703
Article MathSciNet Google Scholar
Zhao S, Li Y (2012) Principled sure independence screening for Cox models with ultra-high-dimensional covariates. J Multivar Anal 105:397–411
Article MathSciNet Google Scholar
Zhang J, Liu Y, Wu Y (2017) Correlation rank screening for ultrahigh-dimensional survival data. Comput Stat Data Anal 108:121–132
Article MathSciNet Google Scholar
Zhu L, Li L, Li R, Zhu L (2011) Model-free feature screening for ultrahigh-dimensional data. J Am Stat Assoc 106:1464–1475
Article MathSciNet Google Scholar

Download references

Acknowledgements

Fang’s research is supported by Project supported by Provincial Natural Science Foundation of Hunan (Grant No. 2018JJ2078) and Scientific Research Fund of Hunan Provincial Education Department (Grant No. 17C0392).

Author information

Authors and Affiliations

College of Science, Hunan Institute of Engineering, Xiangtan, 411104, Hunan, China
Jianglin Fang

Authors

Jianglin Fang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jianglin Fang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Firstly, we introduce the following Lemma which is useful for proving Theorem 2.1.

Lemma 4.1

Under Assumptions 1–5, for any positive $\varepsilon $, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{t\in (0,L]}P\left( \sqrt{n}|\{1-V(t)\} \{\tilde{G}(t)-G(t)\}|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}, \end{aligned}$$

(7)

and

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{t\in (0,L]}P\left( \sqrt{n}|\{1-\tilde{V}(t)\} \{\tilde{G}(t)-G(t)\}|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}. \end{aligned}$$

(8)

Proof

Let $B^{0}$ denote a Brownian bridge. With the representation of that process from a Brownian bridge given by Bitouzé et al. (1999), we have

$$\begin{aligned} P\left( \sup _{t\in [0,L]}|B^{0}(t)|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}. \end{aligned}$$

(9)

Let

$$\begin{aligned} \varphi (t)=\int _{0}^{t}\frac{dG(s)}{\{1-G(s)\}^{2}\{1-V(s)\}}~~~\text {and}~~~K(t) =\frac{\varphi (t)}{1+\varphi (t)}. \end{aligned}$$

Based on Theorem 5.1 in van der Laan (1996), the estimator $\tilde{G}$ of G is efficient for the reduced data. Therefore, the result of Theorem 1.1 in Gill (1983) imply that

$$\begin{aligned} \sqrt{n}\frac{1-K}{1-G}(\tilde{G}-G){\mathop {\rightarrow }\limits ^{d}} B^{0}(K) ~~\text {in}~~ D[0,L] ~~\text {as}~~ n\rightarrow \infty , \end{aligned}$$

(10)

where ${\mathop {\rightarrow }\limits ^{d}}$ represents the convergence in distribution, and D[0, L] is the cadlag function space of real-valued functions on [0, L] endowed with the supremum norm. Hence it follows from (10) that

$$\begin{aligned}&\lim _{n\rightarrow \infty }\sup _{t\in [0,L]}P\left( \sqrt{n}\left| (1-V(t)) (\tilde{G}(t)-G(t))\right|>\varepsilon \right) \\&\quad =\lim _{n\rightarrow \infty }\sup _{t\in [0,L]}P\left( \left| \sqrt{n}\frac{1-K(t)}{1-G(t)} (\tilde{G}(t)-G(t))\right| \left| \frac{(1-G(t))(1-V(t))}{1-K(t)} \right|>\varepsilon \right) \\&\quad =\sup _{t\in [0,L]}P\left( \left| \frac{\{1-G(t)\}\{1-V(t)\}}{1-K(t)}\right| \left| B^{0}\{K(t)\}\right| >\varepsilon \right) . \end{aligned}$$

Because $|[\{1-G(t)\}\{1-V(t)\}]/\{1-K(t)\}|\le 1$, combining (9), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{t\in (0,L]}P\left( \sqrt{n}|\{1-V(t)\}\{\tilde{G}(t) -G(t)\}|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}. \end{aligned}$$

The first part of Lemma 4.1 is proved, and then we begin to prove the second part of it. By simple calculation, we have

$$\begin{aligned} \sqrt{n}\{1-\tilde{V}(t)\}\{\tilde{G}(t)-G(t)\}= & {} \sqrt{n}\{1-V(t)\} \{\tilde{G}(t)-G(t)\}+\sqrt{n}\{\tilde{V}(t)\nonumber \\&-V(t)\}\{\tilde{G}(t)-G(t)\}, \end{aligned}$$

(11)

and

$$\begin{aligned} \sqrt{n}\{\tilde{V}(t)-V(t)\}\{\tilde{G}(t)-G(t)\}=\frac{\tilde{G}(t)-G(t)}{1-G(t)} \sqrt{n}\{1-G(t)\}\{\tilde{V}(t)-V(t)\}. \end{aligned}$$

(12)

Because the $\tilde{V}(t)$ is obtained by switching the role of failure time and censoring time in (2), similar to the proof of (7), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{t\in (0,L]}P\left( \sqrt{n}|\{1-G(t)\}\{\tilde{V}(t) -V(t)\}|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}. \end{aligned}$$

(13)

Moreover, we also have

$$\begin{aligned} \sup _{t\in (0,L]}\sqrt{n}\{\tilde{V}(t)-V(t)\}\{\tilde{G}(t)-G(t)\}\le & {} \sup _{t\in (0,L]}\sqrt{n}|\{1-G(t)\}\{\tilde{V}(t)\nonumber \\&-V(t)\}|\sup _{t\in (0,L]}\left| \frac{\tilde{G}(t)-G(t)}{1-G(t)}\right| . \end{aligned}$$

(14)

If we can show

$$\begin{aligned} \sup _{t\in (0,L]}\left| \tilde{G}(t)-G(t)\right| {\mathop {\rightarrow }\limits ^{p}} 0, \end{aligned}$$

(15)

where ${\mathop {\rightarrow }\limits ^{p}}$ represents the convergence in probability. Then, we can obtain, according to Lemma 2.6 in Gill (1983), that $\sup _{t\in (0,L]}|(\tilde{G}(t)-G(t))/(1-G(t))|$ is bounded in probability. Therefore, it follows by combining (7) and (11)–(14)

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{t\in (0,L]}P\left( \sqrt{n}|\{1-\tilde{V}(t)\} \{\tilde{G}(t)-G(t)\}|>\varepsilon \right) \le 2\exp \{-2\varepsilon ^{2}\}. \end{aligned}$$

Next, we begin to prove (15). For the distribution function G(t), the cumulative hazard function H(t) can be defined as

$$\begin{aligned} H(t)=\int _{(0,t]}\frac{F(dy,1)}{1-F_{Y}(y)}, \end{aligned}$$

and its estimator $\hat{H}_{n}(t)$ is given by

$$\begin{aligned} \hat{H}_{n}(t)=\int _{(0,t]}\frac{F_{n}(dy,1)}{1-F_{Y,n}(y)}. \end{aligned}$$

By simple calculation, we have

$$\begin{aligned} \hat{H}_{n}(t)-H(t)&=\int _{(0,t]}\left( \frac{F_{n}(dy,1)}{1-F_{Y,n}(y)}- \frac{F(dy,1)}{1-F_{Y}(y)}\right) \\&=\int _{(0,t]}\frac{F_{n}(dy,1)(1-F_{Y}(y))-F(dy,1)(1-F_{Y,n}(y))}{(1-F_{Y}(y)) (1-F_{Y,n}(y))}\\&=\int _{(0,t]}\frac{F_{n}(dy,1)-F(dy,1)}{(1-F_{Y}(y))(1-F_{Y,n}(y))}\\&\quad +\int _{(0,t]} \frac{(F(dy,1)-F_{n}(dy,1))F_{Y,n}(y)}{(1-F_{Y}(y))(1-F_{Y,n}(y))}\\&\quad +\int _{(0,t]}\frac{F_{n}(dy,1)(F_{Y}(y)-F_{Y,n}(y))}{(1-F_{Y}(y)) (1-F_{Y,n}(y))}\\&=R_{1}+R_{2}+R_{3}. \end{aligned}$$

Based on Lemma 4.2 in van der Laan (1996) and Assumption 2, we can obtain

$$\begin{aligned} \sup _{t\in (0,L]}|F_{n}(dy,1)-F(dy,1)|{\mathop {\rightarrow }\limits ^{p}} 0. \end{aligned}$$

Moreover, similar to the proof of Theorem 1 in Wang (1987), we can show that

$$\begin{aligned} \sup _{t\in (0,L]}|F_{Y,n}(y)-F_{Y}(y)|{\mathop {\rightarrow }\limits ^{p}} 0. \end{aligned}$$

It follows form Assumptions 3 and 6 that $1/((1-F_{Y}(y))(1-F_{Y,n}(y)))$ is bounded. Therefore, we have

$$\begin{aligned} \sup _{t\in (0,L]}\left| R_{i}\right| {\mathop {\rightarrow }\limits ^{p}} 0,~~~~i=1,2,3, \end{aligned}$$

and

$$\begin{aligned} \sup _{t\in (0,L]}|\hat{H}_{n}(t)-H(t)|\le \left\{ \sup _{t\in (0,L]}\left| R_{1}\right| + \sup _{t\in (0,L]}\left| R_{2}\right| +\sup _{t\in (0,L]}\left| R_{3}\right| \right\} {\mathop {\rightarrow }\limits ^{p}}0. \end{aligned}$$

Now by using Lemma 2 in Gill (1981) we get

$$\begin{aligned} \sup _{t\in (0,L]}\left| \tilde{G}(t)-G(t)\right| {\mathop {\rightarrow }\limits ^{p}} 0. \end{aligned}$$

So (15) follows. The proof of Lemma 4.1 is completed. $\square $

Proof of Theorem 2.1

Let $\omega ^{*}_{k}=\{1/n\sum _{i=1}^{n}X_{ki}u(Y_{i})G_{k}(Y_{i})\}^{2}$, $k=1,\cdots ,p$. we have

$$\begin{aligned} |\hat{\omega }_{k}-\omega _{k}|&\le |\hat{\omega }_{k}-\tilde{\omega }_{k}|+ |\tilde{\omega }_{k}-\omega _{k}|\nonumber \\&\le |\hat{\omega }_{k}-\tilde{\omega }_{k}|+ |\tilde{\omega }_{k}-\omega ^{*}_{k}| +|\omega ^{*}_{k}-\omega _{k}|. \end{aligned}$$

(16)

Combining the strong law of large numbers and Assumption 1, we can obtain that

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^{n}(X_{ki})^{2}{\mathop {\longrightarrow }\limits ^{a.s.}}O(1), ~~~k=1,\cdots ,p, \end{aligned}$$

where ${\mathop {\longrightarrow }\limits ^{a.s.}}$ represents almost sure convergence. Based on the Cauchy−Schwarz inequality and the boundedness of $\tilde{G}(t)$, G(t), $\tilde{u}(t)$ and u(t), combining Assumption 4, we can obtain that

$$\begin{aligned}&\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\tilde{u}(Y_{i})\tilde{G}(Y_{i})\right|< +\infty ,~~~~~~\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})\tilde{G} (Y_{i})\right|<+\infty ,\\&\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})G(Y_{i})\right|<+\infty ~~~~ \text {and}~~~~E\left| X_{k}u(Y)G(Y)\right| <+\infty . \end{aligned}$$

There exists positive constants $c_{1}$, $c_{2}$ and $c_{3}$ such that

$$\begin{aligned} |\hat{\omega }_{k}-\tilde{\omega }_{k}|&=\left| \left\{ \frac{1}{n}\sum _{i=1}^{n}X_{ki} \tilde{u}(Y_{i})\tilde{G}(Y_{i})\right\} ^{2}-\left\{ \frac{1}{n} \sum _{i=1}^{n}X_{ki}u(Y_{i})\tilde{G}(Y_{i})\right\} ^{2}\right| \nonumber \\&=\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\tilde{u}(Y_{i})\tilde{G}(Y_{i})+\frac{1}{n} \sum _{i=1}^{n}X_{ki}u(Y_{i})\tilde{G}(Y_{i})\right| \nonumber \\&\quad \left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\tilde{G}(Y_{i})\{\tilde{u}(Y_{i})-u(Y_{i})\} \right| \nonumber \\&\le c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\tilde{G}(Y_{i})\{\tilde{u}(Y_{i}) -u(Y_{i})\}\right| \nonumber \\&\le c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-\tilde{G}(Y_{i})\}\{\tilde{u} (Y_{i})-u(Y_{i})\}\right| \nonumber \\&\quad +c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{\tilde{u}(Y_{i})-u(Y_{i})\}\right| , \end{aligned}$$

(17)

$$\begin{aligned} |\tilde{\omega }_{k}-\omega ^{*}_{k}|&=\left| \left\{ \frac{1}{n}\sum _{i=1}^{n} X_{ki}u(Y_{i})\tilde{G}(Y_{i})\right\} ^{2}-\left\{ \frac{1}{n} \sum _{i=1}^{n}X_{ki}u(Y_{i})G(Y_{i})\right\} ^{2}\right| \nonumber \\&\le c_{2}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})\{\tilde{G}(Y_{i}) -G(Y_{i})\}\right| \end{aligned}$$

(18)

and

$$\begin{aligned} |\omega ^{*}_{k}-\omega _{k}|&=\left| \left\{ \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i}) G(Y_{i})\right\} ^{2}-\left[ E\left\{ X_{k}u(Y)G(Y)\right\} \right] ^{2}\right| \nonumber \\&\le c_{3}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})G(Y_{i})- E\left\{ X_{k}u(Y)G(Y)\right\} \right| . \end{aligned}$$

(19)

Based on $V(t)=1-u(t)$ and $\tilde{V}(t)=1-\tilde{u}(t)$, we have

$$\begin{aligned}&\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-\tilde{G}(Y_{i})\}\{\tilde{u}(Y_{i}) -u(Y_{i})\}\right| \\&\quad =\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-\tilde{G}(Y_{i})\}\{\tilde{V}(Y_{i}) -V(Y_{i})\}\right| \nonumber \\&\quad \le c_{4}\left( \frac{1}{n}\sum _{i=1}^{n}\left[ \{1-\tilde{G}(Y_{i})\}\{\tilde{V} (Y_{i})-V(Y_{i})\}\right] ^{2}\right) ^{1/2}\nonumber \\&\quad \le c_{4}\sup _{0\le t\le L}\left| \{1-\tilde{G}(t)\}\{\tilde{V}(t)-V(t)\} \right| \end{aligned}$$

Because the estimator $\tilde{V}(t)$ is obtained by switching the role of failure time and censoring time and combining (2), we can show that, by using (8) in Lemma 4.1 and switching the role of $\tilde{V}(t)$ and $\tilde{G}(t)$,

$$\begin{aligned}&\lim _{n\rightarrow \infty }P\left( c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1- \tilde{G}(Y_{i})\}\{\tilde{u}(Y_{i})-u(Y_{i})\}\right| \ge Mn^{-\alpha }\right) \\&\quad \le \lim _{n\rightarrow \infty }P\left( c_{1}c_{4}\sup _{0\le t\le L}\left| \{1 -\tilde{G}(t)\}\{\tilde{V}(t)-V(t)\}\right| \ge Mn^{-\alpha }\right) \\&\quad =\lim _{n\rightarrow \infty }P\left( \sup _{0\le t\le L}\sqrt{n}\left| \{1 -\tilde{G}(t)\}\{\tilde{V}(t)-V(t)\}\right| \ge (c_{1}c_{4})^{-1}Mn^{1/2- \alpha }\right) \\&\quad \le 2\exp \{-2(c_{1}c_{4})^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

Without loss of generality, when n is large enough, we can show that

$$\begin{aligned}&P\left( c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-\tilde{G}(Y_{i})\} \{\tilde{u}(Y_{i})-u(Y_{i})\}\right| \ge Mn^{-\alpha }\right) \nonumber \\&\quad \le 2\exp \{-2(c_{1}c_{4})^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

(20)

Combining Assumption 6 and the Cauchy–Schwarz inequality, we have

$$\begin{aligned} \left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{\tilde{u}(Y_{i})-u(Y_{i})\}\right|&\le \frac{1}{\rho _{1}}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-G(Y_{i})\}\{\tilde{u} (Y_{i})-u(Y_{i})\}\right| \\&=\frac{1}{\rho _{1}}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{1-G(Y_{i})\} \{\tilde{V}(Y_{i})-V(Y_{i})\}\right| \\&\le \left\{ \frac{1}{n}\sum _{i=1}^{n}X_{ki}^{2}\frac{1}{n}\sum _{i=1}^{n} \{1-G(Y_{i})\}\{\tilde{V}(Y_{i})-V(Y_{i})\}^{2}\right\} ^{1/2}\\&\le c_{5}\left\{ \frac{1}{n}\sum _{i=1}^{n}\{1-G(Y_{i})\}\{\tilde{V}(Y_{i})- V(Y_{i})\}^{2}\right\} ^{1/2}\\&\le c_{5}\sup _{0\le t\le L}\left| \{1-G(t)\}\{\tilde{V}(t)-V(t)\}\right| , \end{aligned}$$

where $c_{5}$ is a positive constant. Therefore, based on (7) in Lemma 4.1, we can obtain that

$$\begin{aligned} P\left( c_{1}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}\{\tilde{u}(Y_{i})-u(Y_{i})\} \right| \ge Mn^{-\alpha }\right) \le 2\exp \{-2(c_{1}c_{5})^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

(21)

Similarly, there exists a positive constant $c_{6}$ such that

$$\begin{aligned} P\left( c_{2}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})\{\tilde{G}(Y_{i})- G(Y_{i})\}\right| \ge Mn^{-\alpha }\right) \le 2\exp \{-2(c_{2}c_{6})^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

(22)

Note that $\{X_{ki}u(Y_{i})G(Y_{i}):~i=1,\cdots ,n\}$ are independent identically sample form $X_{k}u(Y)G(Y)$, then, based on Hoeffding’s inequality, we have

$$\begin{aligned}&P\left( c_{3}\left| \frac{1}{n}\sum _{i=1}^{n}X_{ki}u(Y_{i})G(Y_{i})-E\left\{ X_{k}u (Y)G(Y)\right\} \right| \ge Mn^{-\alpha }\right) \nonumber \\&\quad \le 2\exp \left\{ -\frac{2(c_{3})^{-2}M^{2}n^{2-2\alpha }}{nc_{7}^{2}}\right\} =2\exp \{-2(c_{3}c_{7})^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

(23)

According to (16), it is easy to show that

$$\begin{aligned}&P(|\hat{\omega }_{k}-\omega _{k}|\ge 3Mn^{-\alpha })\le P(|\hat{\omega }_{k}- \tilde{\omega }_{k}|+|\tilde{\omega }_{k}-\omega _{k}|\ge 3Mn^{-\alpha })\nonumber \\&\quad =P(|\hat{\omega }_{k}-\tilde{\omega }_{k}|+|\tilde{\omega }_{k}-\omega _{k}| \ge 3Mn^{-\alpha },|\hat{\omega }_{k}-\tilde{\omega }_{k}|\ge Mn^{-\alpha })\nonumber \\&\qquad +P(|\hat{\omega }_{k}-\tilde{\omega }_{k}|+|\tilde{\omega }_{k}-\omega _{k}| \ge 3Mn^{-\alpha }, |\hat{\omega }_{k}-\tilde{\omega }_{k}|< Mn^{-\alpha })\nonumber \\&\quad \le P(|\hat{\omega }_{k}-\tilde{\omega }_{k}|\ge Mn^{-\alpha })+P(| \tilde{\omega }_{k}-\omega _{k}|\ge 2Mn^{-\alpha })\nonumber \\&\quad \le P(|\hat{\omega }_{k}-\tilde{\omega }_{k}|\ge Mn^{-\alpha })+P(| \tilde{\omega }_{k}-\omega ^{*}_{k}|\ge Mn^{-\alpha })+P(|\omega ^{*}_{k}- \omega _{k}|\ge Mn^{-\alpha })\nonumber \\ \end{aligned}$$

(24)

Therefore, from (17)–(24), there exists a positive constant $\gamma $ such that

$$\begin{aligned} P(|\hat{\omega }_{k}-\omega _{k}|\ge 3Mn^{-\alpha })&\le 2\exp \{-2(c_{1}c_{4})^{-2} M^{2}n^{1-2\alpha }\}\nonumber \\&\quad +2\exp \{-2(c_{1}c_{5})^{-2}M^{2}n^{1-2\alpha }\}\nonumber \\&\quad +2\exp \{-2(c_{2}c_{6})^{-2}M^{2}n^{1-2\alpha }\}\nonumber \\&\quad +2\exp \{-2(c_{3}c_{7})^{-2}M^{2}n^{1-2\alpha }\}\nonumber \\&\le 8\exp \{-2\gamma ^{-2}M^{2}n^{1-2\alpha }\}. \end{aligned}$$

(25)

Immediately, we have

$$\begin{aligned} P(\max _{1\le k\le p}|\hat{\omega }_{k}-\omega _{k}|>Mn^{-\alpha })\le 8p\exp \left\{ \frac{-2\gamma ^{-2}M^{2}n^{1-2\alpha }}{9}\right\} . \end{aligned}$$

(26)

On the other hand, similar to the proof of the second part of Theorem 2.1 in Zhang et al. (2017), we can prove that

$$\begin{aligned} P(\mathscr {A}\subseteq \hat{\mathscr {A}})\ge 1-O\left( d\exp \left\{ \frac{-2\gamma ^{-2}M^{2}n^{1-2\alpha }}{9}\right\} \right) , \end{aligned}$$

where $d=|\mathscr {A}|$ is the cardinality of $\mathscr {A}$. The proof of Theorem 2.1 is completed. $\square $

Proof of Theorem 2.2

Let $\eta =\min \limits _{k\in \mathscr {A}}\{\omega _{k}\}-\max \limits _{k\not \in \mathscr {A}}\{\omega _{k}\}$. It follows from Assumption 5 that $\eta >0$. Similar to the proof of (24) and (25) in Theorem 2.1, we can show that

$$\begin{aligned}&P\left( \min _{k\in \mathscr {A}}\hat{\omega }_{k}<\max _{k\not \in \mathscr {A}} \hat{\omega }_{k}\right) \nonumber \\&\quad =P\left( (\max _{k\not \in \mathscr {A}}\hat{\omega }_{k} -\max _{k\not \in \mathscr {A}}\omega _{k})-(\min _{k\in \mathscr {A}}\hat{\omega }_{k}- \min _{k\in \mathscr {A}}\omega _{k})>\eta \right) \nonumber \\&\quad \le P\left( \max _{k\not \in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|+\min _{k\in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|>\eta \right) \nonumber \\&\quad =P\left( \max _{k\not \in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|+\min _{k\in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|>\eta , \max _{k\not \in \mathscr {A}}| \hat{\omega }^{*}_{k}-\omega _{k}|\ge \frac{\eta }{2}\right) \nonumber \\&\qquad +P\left( \max _{k\not \in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|+\min _{k\in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|>\eta , \max _{k\not \in \mathscr {A}}| \hat{\omega }^{*}_{k}-\omega _{k}|<\frac{\eta }{2}\right) \nonumber \\&\quad \le P\left( \max _{k\not \in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|\ge \frac{\eta }{2}\right) +P\left( \min _{k\in \mathscr {A}}|\hat{\omega }_{k}- \omega _{k}|>\frac{\eta }{2}\right) \end{aligned}$$

(27)

Therefore, combining (26) and (27), when n is large enough, we can obtain that

$$\begin{aligned} P\left( \min _{k\in \mathscr {A}}\hat{\omega }_{k}<\max _{k\not \in \mathscr {A}} \hat{\omega }_{k}\right)&\le P\left( \max _{k\not \in \mathscr {A}}|\hat{\omega }_{k}-\omega _{k}|\ge Mn^{-\alpha }\right) \\&\quad +P\left( \min _{k\in \mathscr {A}}|\hat{\omega }_{k}- \omega _{k}|>Mn^{-\alpha }\right) \\&\le 2P(\max _{1\le k\le p}|\hat{\omega }_{k}-\omega _{k}|\ge Mn^{-\alpha })\\&\le O\left( p\exp \left\{ \frac{-2\gamma ^{-2}M^{2}n^{1-2\alpha }}{9}\right\} \right) , \end{aligned}$$

and

$$\begin{aligned} P(\max _{k\not \in \mathcal {A}}\hat{\omega }^{*}_{k}<\min _{k\in \mathcal {A}} \hat{\omega }^{*}_{k})\ge 1-O\left( p\exp \left\{ \frac{-2\gamma ^{-2}M^{2}n^{1-2\alpha }}{9}\right\} \right) . \end{aligned}$$

Thus, the proof of Theorem 2.2 is completed $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Fang, J. Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random. Stat Papers 62, 1141–1166 (2021). https://doi.org/10.1007/s00362-019-01128-5

Download citation

Received: 31 May 2018
Revised: 12 May 2019
Published: 07 August 2019
Issue Date: June 2021
DOI: https://doi.org/10.1007/s00362-019-01128-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random

Abstract

Access this article

Similar content being viewed by others

Nonparametric independence feature screening for ultrahigh-dimensional survival data

Feature screening based on distance correlation for ultrahigh-dimensional censored data with covariate measurement error

Feature Screening for High-Dimensional Survival Data via Censored Quantile Correlation

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Lemma 4.1

Proof

Proof of Theorem 2.1

Proof of Theorem 2.2

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random

Abstract

Access this article

Similar content being viewed by others

Nonparametric independence feature screening for ultrahigh-dimensional survival data

Feature screening based on distance correlation for ultrahigh-dimensional censored data with covariate measurement error

Feature Screening for High-Dimensional Survival Data via Censored Quantile Correlation

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

Lemma 4.1

Proof

Proof of Theorem 2.1

Proof of Theorem 2.2

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation