
Extended Bayesian information criterion in the Cox model with a high-dimensional feature space

  • Published: Annals of the Institute of Statistical Mathematics

Abstract

Variable selection in the Cox proportional hazards model (the Cox model) has proven important in many microarray-based genetic studies. However, theoretical results on variable selection procedures in the Cox model with a high-dimensional feature space are rare because of the model's complicated data structure. In this paper, we consider the extended Bayesian information criterion (EBIC) for variable selection in the Cox model and establish its selection consistency when the feature space is high dimensional. The EBIC is adopted to select the best model from a model sequence generated by the SIS-ALasso procedure. Simulation studies and a real data analysis are carried out to demonstrate the merits of the EBIC.
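As a concrete illustration of the criterion (not the authors' implementation), the following minimal Python sketch computes \(\mathrm{EBIC}_{\gamma}(s) = -2\,l_n(\hat{{\varvec{\beta }}}(s)) + |s|\ln n + 2\gamma \ln \tau({\fancyscript{S}}_{|s|})\) with \(\tau({\fancyscript{S}}_j)=\binom{p}{j}\) for a candidate model \(s\), assuming a Breslow-type partial likelihood with no tied event times; all function names are hypothetical.

```python
import numpy as np
from math import lgamma

def cox_partial_loglik(beta, Z, time, status):
    """Breslow-type Cox log partial likelihood, assuming no tied event times.

    beta: (k,) coefficients; Z: (n, k) covariates;
    time: (n,) observed times; status: (n,) 1 = event, 0 = censored.
    """
    eta = Z @ beta
    order = np.argsort(time)                 # sort subjects by observed time
    eta, status = eta[order], np.asarray(status)[order]
    # risk set at the i-th smallest time = subjects i, i+1, ..., n-1
    log_risk = np.log(np.cumsum(np.exp(eta)[::-1])[::-1])
    return float(np.sum(status * (eta - log_risk)))

def ebic_cox(beta_hat, support, Z, time, status, gamma, p_total):
    """EBIC_gamma(s) = -2 l_n(beta_hat(s)) + |s| ln n + 2 gamma ln C(p, |s|)."""
    n, k = Z.shape[0], len(support)
    log_binom = lgamma(p_total + 1) - lgamma(k + 1) - lgamma(p_total - k + 1)
    ll = cox_partial_loglik(beta_hat, Z[:, support], time, status)
    return -2.0 * ll + k * np.log(n) + 2.0 * gamma * log_binom
```

Among the fitted models of a sequence such as the SIS-ALasso path, one would select the \(s\) minimizing this quantity; a larger \(\gamma\) penalizes large model spaces more heavily.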


References

  • Andersen, P., Gill, R. (1982). Cox's regression model for counting processes: A large sample study. The Annals of Statistics, 10(4), 1100–1120.

  • Barabási, A., Gulbahce, N., Loscalzo, J. (2011). Network medicine: A network-based approach to human disease. Nature Reviews Genetics, 12(1), 56–68.

  • Bogdan, M., Ghosh, J. K., Doerge, R. (2004). Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci. Genetics, 167(2), 989–999.

  • Broman, K. W., Speed, T. P. (2002). A model selection approach for the identification of quantitative trait loci in experimental crosses. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4), 641–656.

  • Chen, J., Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95(3), 759–771.

  • Chen, J., Chen, Z. (2012). Extended BIC for small-n-large-P sparse GLM. Statistica Sinica, 22(2), 555.

  • Cookson, W., Liang, L., Abecasis, G., Moffatt, M., Lathrop, M. (2009). Mapping complex disease traits with global gene expression. Nature Reviews Genetics, 10(3), 184–194.

  • Du, P., Ma, S., Liang, H. (2010). Penalized variable selection procedure for Cox models with semiparametric relative risk. The Annals of Statistics, 38(4), 2092.

  • Fan, J., Li, R. (2002). Variable selection for Cox's proportional hazards model and frailty model. The Annals of Statistics, 30(1), 74–99.

  • Fan, J., Li, G., Li, R. (2005). An overview on variable selection for survival analysis. Contemporary multivariate analysis and design of experiments (p. 315). New Jersey: World Scientific.

  • Fan, J., Feng, Y., Wu, Y. (2010). High-dimensional variable selection for Cox's proportional hazards model. Borrowing strength: Theory powering applications—a Festschrift for Lawrence D. Brown, Vol. 6 (pp. 70–86). Beachwood: IMS Collections.

  • Fill, J. (1983). Convergence rates related to the strong law of large numbers. The Annals of Probability, 11(1), 123–142.

  • Fleming, T., Harrington, D. (1991). Counting processes and survival analysis, Vol. 8. New York: Wiley.

  • Gui, J., Li, H. (2005). Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics, 21(13), 3001–3008.

  • Luo, S., Chen, Z. (2013a). Extended BIC for linear regression models with diverging number of relevant features and high or ultra-high feature spaces. Journal of Statistical Planning and Inference, 143, 494–504.

  • Luo, S., Chen, Z. (2013b). Selection consistency of EBIC for GLIM with non-canonical links and diverging number of parameters. Statistics and Its Interface, 6, 275–284.

  • Rosenwald, A., Wright, G., Chan, W., Connors, J., Campo, E., Fisher, R., et al. (2002). The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. New England Journal of Medicine, 346(25), 1937–1947.

  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

  • Sha, N., Tadesse, M., Vannucci, M. (2006). Bayesian variable selection for the analysis of microarray data with censored outcomes. Bioinformatics, 22(18), 2262–2268.

  • Siegmund, D. (2004). Model selection in irregular problems: Application to mapping quantitative trait loci. Biometrika, 91, 785–800.

  • Tibshirani, R., et al. (1997). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16(4), 385–395.

  • Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., et al. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics, 17(6), 520–525.

  • Van de Geer, S. (1995). Exponential inequalities for martingales, with application to maximum likelihood estimation for counting processes. The Annals of Statistics, 23(5), 1779–1801.

  • Zhang, H., Lu, W. (2007). Adaptive Lasso for Cox's proportional hazards model. Biometrika, 94(3), 691–703.

  • Zou, H. (2008). A note on path-based variable selection in the penalized proportional hazards model. Biometrika, 95(1), 241–247.


Author information

Correspondence to Shan Luo.

Appendices

Appendix A: remarks on the assumptions

1.1 Remark on assumption A1

Note that \(S_n, S_n^{(1)}\) and \(S_n^{(2)}\) are summations of i.i.d. random variables. It is verified in Fill (1983) that, when the associated random variable satisfies A1.1 (for instance, when the components in \({\varvec{Z}}\) are bounded or Gaussian random variables), there exist positive constants \(C_0, C_1\) such that

$$\begin{aligned} P\left( \sup _{t\in [0,1]}\left| S_n({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \ge \frac{C_1u_n}{\sqrt{n}}\right)&\le \frac{C_0}{u_n} \exp \left( -\frac{u_n^2}{2}\right) ,\\ P\left( \sup _{t\in [0,1]}\left| S^{(1)}_{nj}({\varvec{\beta }}_0,t)-s^{(1)}_j({\varvec{\beta }}_0,t) \right| \ge \dfrac{C_1u_n}{\sqrt{n}}\right)&\le \dfrac{C_0}{u_n}\exp \left( -\dfrac{u_n^2}{2}\right) ,\\ P\left( \sup _{t\in [0,1]}\left| S^{(2)}_{nij}({\varvec{\beta }}_0,t)-s^{(2)}_{ij} ({\varvec{\beta }}_0,t)\right| \ge \dfrac{C_1u_n}{\sqrt{n}}\right)&\le \dfrac{C_0}{u_n}\exp \left( -\dfrac{u_n^2}{2}\right) \end{aligned}$$

hold for any positive \(u_n\) such that \(u_n\rightarrow +\infty \) and \(n^{-1/6}u_n\rightarrow 0\) as \(n\rightarrow +\infty \). These inequalities and A1.2 are similar to Conditions (2.2) and (2.5) in Section 8.2 of Fleming and Harrington (1991). However, it is worth noting that they assume the convergence of \(S_n, S_n^{(l)}\) to \(s, s^{(l)}\) holds over a neighborhood \({\fancyscript{B}}\) of \({\varvec{\beta }}_0\); that is, \(\sup _{t\in [0,1],{\varvec{\beta }}\in {\fancyscript{B}}}\Vert S_n({\varvec{\beta }}, t)-s({\varvec{\beta }}, t)\Vert \rightarrow 0\) and \(\sup _{t\in [0,1],{\varvec{\beta }}\in {\fancyscript{B}}}\Vert S_n^{(l)}({\varvec{\beta }}, t)-s^{(l)}({\varvec{\beta }}, t)\Vert \rightarrow 0\) in probability for \(l= 1, 2\), and similarly for the boundedness of \(s, s^{(l)}\). Our assumptions, in contrast, are made only at the true value \({\varvec{\beta }}_0\). Moreover, with condition A1.2, it can be deduced that

$$\begin{aligned} P\left( \sup _{t\in [0,1]}\left| E_{nj}({\varvec{\beta }}_0,t)-e_j({\varvec{\beta }}_0,t)\right| \ge \frac{C_1u_n}{\sqrt{n}}\right) \le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}\right) \end{aligned}$$
(5)

and

$$\begin{aligned} P\left( \sup _{t\in [0,1]}\left| \dfrac{I_{ij}\left( {\varvec{\beta }}_0,t\right) }{n}-\varSigma _{ij}\left( {\varvec{\beta }}_0,t \right) \right| \ge \frac{C_1u_n}{\sqrt{n}}\right) \le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}\right) . \end{aligned}$$
(6)

The detailed proofs of inequalities (5) and (6) are provided in Appendix B. A1.3 is Condition (2.6) in Section 8.2 of Fleming and Harrington (1991), and A1.4 is assumed in Theorem 4.1 of Andersen and Gill (1982); these are regularity conditions in counting process theory.
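The uniform \(1/\sqrt{n}\) concentration asserted above can be illustrated numerically. The following is a minimal Monte Carlo sketch, assuming \({\varvec{\beta }}_0=0\) (so that \(S_n({\varvec{\beta }}_0,t)\) reduces to the empirical at-risk proportion) and hypothetical exponential survival and censoring distributions:

```python
import numpy as np

def sup_risk_deviation(n, rng):
    """Sup-norm deviation of the empirical at-risk proportion S_n(0, t)
    from its limit s(0, t) over t in [0, 1].

    Hypothetical setup: survival T ~ Exp(rate 1), censoring C ~ Exp(rate 0.5),
    so the observed time X = min(T, C) ~ Exp(rate 1.5) and s(0, t) = exp(-1.5 t).
    """
    T = rng.exponential(scale=1.0, size=n)
    C = rng.exponential(scale=2.0, size=n)   # scale 2.0 <=> rate 0.5
    X = np.minimum(T, C)
    grid = np.linspace(0.0, 1.0, 501)
    S_n = (X[:, None] >= grid[None, :]).mean(axis=0)   # Y_i(t) averaged over i
    return float(np.max(np.abs(S_n - np.exp(-1.5 * grid))))
```

With a fixed seed, the sup-deviation shrinks at roughly the \(n^{-1/2}\) rate that the exponential inequalities predict.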

1.2 Remark on assumption A2

Under Assumption A2, for any positive \(u_n\) such that \(u_n\rightarrow +\infty \) and \(n^{-1/6}u_n\rightarrow 0\) as \(n\rightarrow +\infty \), there exists a positive constant \(C_0\) such that

$$\begin{aligned} P\left( \left| \sum \limits _{i=1}^n\sum _{j\in s}{\varvec{a}}_j\xi _{ij}\right| \ge \sqrt{n}u_n\right) \le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}\right) . \end{aligned}$$
(7)

Without loss of generality, we assume all the diagonal elements of \(\varSigma \left( {\varvec{\beta }}_0,1\right) \) are 1. Then, when \({\varvec{a}}_j=1\) for a fixed \(j\) and \({\varvec{a}}_k=0\) for \(k\ne j\), (7) reduces to

$$\begin{aligned} P\left( \left| \sum \limits _{i=1}^n\xi _{ij}\right| \ge \sqrt{n}u_n\right) \le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}\right) , \;\forall j\in \{1,2,\ldots ,p_n\}. \end{aligned}$$

Now let us see how A2 is related to A1.1. Denote \(\xi _{ij}(t)=\int _0^t \left( Z_{ij}(u)-e_j({\varvec{\beta }}_0,u)\right) \mathrm{d}M_i(u)\); it can be shown as follows that \(Cov(\xi _{ij}(t), \xi _{ik}(t) )= \left[ \varSigma \left( {\varvec{\beta }}_0,t\right) \right] _{jk} \):

$$\begin{aligned} <\xi _{ij}, \xi _{ik}> (t) \!&= \! \int _0^t (Z_{ij}-e_j({\varvec{\beta }}_0, u))(Z_{ik}-e_k({\varvec{\beta }}_0, u)) d <M_i, M_i> (u)\\ \!&= \! \int _0^t (Z_{ij}-e_j({\varvec{\beta }}_0, u))(Z_{ik}-e_k({\varvec{\beta }}_0, u)) Y_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) h_0(u)\mathrm{d}u;\\ E<\xi _{1j}, \xi _{1k}> (t) \!&= \! \int _0^t EZ_{ij}Z_{ik}Y_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) h_0(u)\mathrm{d}u\\&-\int _0^t e_j({\varvec{\beta }}_0, u) EZ_{ik}Y_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) h_0(u)\mathrm{d}u\\&-\int _0^t e_k({\varvec{\beta }}_0, u) EZ_{ij}Y_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) h_0(u)\mathrm{d}u\\&+ \int _0^t e_j({\varvec{\beta }}_0, u)e_k({\varvec{\beta }}_0, u)EY_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) h_0(u)\mathrm{d}u\\ \!&= \! \int _0^t \left[ \dfrac{E \left[ Z_{ij}Z_{ik}Y_i(u)\exp \left( {\varvec{z}}_i^{\tau }{\varvec{\beta }}_0\right) \right] }{s({\varvec{\beta }}_0, u)}-e_j({\varvec{\beta }}_0, u)e_k({\varvec{\beta }}_0, u)\right] \nonumber \\&\times s({\varvec{\beta }}_0, u)h_0(u)\mathrm{d}u\\ \!&= \! \left[ \varSigma \left( {\varvec{\beta }}_0,t\right) \right] _{jk}. \end{aligned}$$

For any fixed set \(s\), denote \(\xi _i(s)=(\xi _{ij})_{j\in s}\), and note that \({\varvec{var}}(\sum \nolimits _{i=1}^n\sum _{j\in s}{\varvec{a}}_j\xi _{ij}/\sqrt{n})=1\) implies \({\varvec{a}}^{\tau }\varSigma ({\varvec{\beta }}_0(s),1){\varvec{a}}=1\). Let \(\lambda _{\min }\) denote the smallest eigenvalue. For \(u>0\), we have

$$\begin{aligned} E \exp \left( u\sum _{j\in s}{\varvec{a}}_j\xi _{ij}\right)&\le E \exp (u \Vert {\varvec{a}}\Vert _2\Vert \xi _i(s)\Vert _2) \nonumber \\&\le \lambda ^{-1/2}_{\min }(\varSigma ({\varvec{\beta }}_0(s),1)) |s|\max _{j} E \exp (u |\xi _{ij}|). \end{aligned}$$

Therefore, when \(\lambda _{\min }(\varSigma ({\varvec{\beta }}_0(s),1))\) is bounded from below, \(|s|\) is bounded from above, and \(E \exp ( u|\xi _{ij}|)<+\infty \) for all \(j\), inequality (1) holds.

1.3 Remark on assumption A3

The stricter counterpart of A3.1 in linear regression models is the sparse Riesz condition. Similar conditions were also assumed in Chen and Chen (2012) for generalized linear models. Just as the condition has been technically relaxed for linear regression models, a weaker version of A3.1 can be expected for the Cox model.

Appendix B: proofs of the main results

Proof of inequality (5)

By definition, for a fixed \(j\),

$$\begin{aligned} E_{nj}({\varvec{\beta }}_0,t)-e_j({\varvec{\beta }}_0,t)&= \dfrac{S^{(1)}_{nj}( {\varvec{\beta }}_0,t)}{S_{n}({\varvec{\beta }}_0,t)}-\dfrac{s^{(1)}_{j}( {\varvec{\beta }}_0,t)}{s({\varvec{\beta }}_0,t)}\\&= \dfrac{1}{S_{n}({\varvec{\beta }}_0,t)}\left( S^{(1)}_{nj}( {\varvec{\beta }}_0,t)-s^{(1)}_{j}({\varvec{\beta }}_0,t)\right) \\&-\dfrac{s^{(1)}_{j}({\varvec{\beta }}_0,t)}{S_{n}({\varvec{\beta }}_0,t)s({\varvec{\beta }}_0,t)} \left( S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right) \\&= {\fancyscript{I}}_1(t)-{\fancyscript{I}}_2(t). \end{aligned}$$

Assumption A1.2 implies \(\sup _{t\in [0,1]}\small \left| \dfrac{s^{(1)}_{j}({\varvec{\beta }}_0,t)}{s({\varvec{\beta }}_0,t)}\right| \) and \(\sup _{t\in [0,1]}\small \left| \dfrac{1}{s({\varvec{\beta }}_0,t)}\right| \) are bounded from above.

Note that \(\sup _{t\in [0,1]}\small \left| \dfrac{1}{S_{n}({\varvec{\beta }}_0,t)}\right| \) is bounded from above when

$$\begin{aligned} \sup _{t\in [0,1]}\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \le \dfrac{C_1u_n}{\sqrt{n}} \end{aligned}$$

and \(n\) is sufficiently large. That is, under this condition, there exist constants \(c_1> 0,c_2>0\) such that

$$\begin{aligned} |{\fancyscript{I}}_1(t)|\le c_1\left| S^{(1)}_{nj}({\varvec{\beta }}_0,t)-s^{(1)}_{j}({\varvec{\beta }}_0,t) \right| ;\;|{\fancyscript{I}}_2(t)|\le c_2\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| . \end{aligned}$$

Hence,

$$\begin{aligned}&P\left( \sup _{t\in [0,1]}\left| E_{nj}({\varvec{\beta }}_0,t)-e_j({\varvec{\beta }}_0,t)\right| \ge \frac{C_1u_n}{\sqrt{n}}\right) \\&\quad \le P\left( \sup _{t\in [0,1]}\left| E_{nj}({\varvec{\beta }}_0,t)-e_j({\varvec{\beta }}_0,t)\right| \!\ge \! \frac{C_1u_n}{\sqrt{n}},\sup _{t\in [0,1]}\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \!\le \! \dfrac{C_1u_n}{\sqrt{n}}\right) \\&\quad \quad + P\left( \sup _{t\in [0,1]}\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \ge \dfrac{C_1u_n}{\sqrt{n}}\right) \\&\quad \le P\left( \sup _{t\in [0,1]}\left| S^{(1)}_{nj}({\varvec{\beta }}_0,t)-s^{(1)}_{j}({\varvec{\beta }}_0,t)\right| \ge \dfrac{C_1u_n}{2c_1\sqrt{n}} \right) \\&\quad \quad +P\left( \sup _{t\in [0,1]}\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \ge \dfrac{C_1u_n}{2c_2\sqrt{n}} \right) \\&\quad \quad +P\left( \sup _{t\in [0,1]}\left| S_{n}({\varvec{\beta }}_0,t)-s({\varvec{\beta }}_0,t)\right| \ge \dfrac{C_1u_n}{\sqrt{n}}\right) \le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}\right) . \end{aligned}$$

\(\square \)

Proof of inequality (6)

By definition, for fixed \(i,j\),

$$\begin{aligned}&V_{ij}\left( {\varvec{\beta }}_0,t\right) S_n\left( {\varvec{\beta }}_0,t \right) -v_{ij}\left( {\varvec{\beta }}_0,t\right) s\left( {\varvec{\beta }}_0,t\right) \\&\quad =[S_{nij}^{(2)}\left( {\varvec{\beta }}_0,t\right) -s_{ij}^{(2)} \left( {\varvec{\beta }}_0,t\right) ]-[E_{ni}\left( {\varvec{\beta }}_0,t\right) S_{nj}^{(1)} \left( {\varvec{\beta }}_0,t\right) -e_{i}\left( {\varvec{\beta }}_0,t\right) s_{j}^{(1)} \left( {\varvec{\beta }}_0,t\right) ]\\&\quad =[S_{nij}^{(2)}\left( {\varvec{\beta }}_0,t\right) -s_{ij}^{(2)} \left( {\varvec{\beta }}_0,t\right) ]-[E_{ni}\left( {\varvec{\beta }}_0,t\right) -e_{i} \left( {\varvec{\beta }}_0,t\right) ]S_{nj}^{(1)}\left( {\varvec{\beta }}_0,t\right) \\&\quad -e_{i}\left( {\varvec{\beta }}_0,t\right) [S_{nj}^{(1)}\left( {\varvec{\beta }}_0,t \right) -s_{j}^{(1)}\left( {\varvec{\beta }}_0,t\right) ]. \end{aligned}$$

By following the steps in the proof of inequality (5), we can obtain inequality (6). \(\square \)

Proof of Theorem 1

Here we decompose the \(j\)th component of the score function \(U({\varvec{\beta }}_0,t)\) defined in Sect. 2 as

$$\begin{aligned} U_j({\varvec{\beta }}_0,t)&= \sum \limits _{i=1}^n\int _0^t\left( {\varvec{z}}_{ij}-e_j \left( {\varvec{\beta }}_0,u\right) \right) \mathrm{d}M_i(u)-\sum \limits _{i=1}^n\int _0^t \left( E_{nj}\left( {\varvec{\beta }}_0,u\right) \right. \nonumber \\&\left. -e_j\left( {\varvec{\beta }}_0,u\right) \right) \mathrm{d}M_i(u)\\&= \xi _{1j}(t)-\xi _{2j}(t). \end{aligned}$$

To avoid confusion, let \(\xi _j=\xi _j(1),\;\xi _{1j}=\xi _{1j}(1),\;\xi _{2j}=\xi _{2j}(1)\). For any fixed \(s\in {\fancyscript{A}}_0\), note that for any \(j\in s\), \(E_{nj}\left( {\varvec{\beta }}_0,u\right) =E_{nj}\left( {\varvec{\beta }}_0(s),u\right) \) and \(e_j\left( {\varvec{\beta }}_0,u\right) =e_j\left( {\varvec{\beta }}_0(s),u\right) \). For any unit vector \({\varvec{u}}\), let \({\varvec{a}}={\varvec{u}}^{\tau }\varSigma ^{-1/2}({\varvec{\beta }}_0(s),1)\). Then

$$\begin{aligned} {\varvec{u}}^{\tau }\varSigma ^{-1/2}\left( {\varvec{\beta }}_0(s),1\right) U\left( {\varvec{\beta }}_0(s),1 \right) =\sum _{j\in s}{\varvec{a}}_j\xi _{1j}-\sum _{j\in s}{\varvec{a}}_j\xi _{2j}. \end{aligned}$$

Also, from the remark on Assumption A2, we have \({\varvec{var}}(\sum \limits _{j\in s}{\varvec{a}}_j\xi _{1j}/\sqrt{n})=1\) and \(\Vert {\varvec{a}}\Vert _2^2\le \lambda ^{-1}_{\min }\left( \varSigma ({\varvec{\beta }}_0(s),1)\right) \). Let \(u_n\) satisfy \(n^{-1/6}u_n\rightarrow 0\) and \(u_n(\ln n)^{-1/2}\rightarrow +\infty \) as \(n\rightarrow +\infty \), and note that for any positive constant \(c\in (0,1)\) independent of \(n\),

$$\begin{aligned} P(|\sum _{j\in s}{\varvec{a}}_j\xi _{1j}-\sum _{j\in s}{\varvec{a}}_j\xi _{2j}|&> \sqrt{n}u_n) \le P\left( |\sum _{j\in s}{\varvec{a}}_j\xi _{1j}|>c\sqrt{n}u_n\right) \\&+P\left( |\sum _{j\in s}{\varvec{a}}_j\xi _{2j}|>(1-c)\sqrt{n}u_n\right) , \end{aligned}$$

The large deviation result for \(\sum _{j\in s}{\varvec{a}}_j\xi _{1j}\) is already given in the remark on Assumption A2; that is, there exists a constant \(C_0\) such that

$$\begin{aligned} P\left( |\sum _{j\in s}{\varvec{a}}_j\xi _{1j}|>c\sqrt{n}u_n\right) \le C_0\exp \left( -\frac{c^2u_n^2}{2}-\ln u_n\right) . \end{aligned}$$
(8)

Now it suffices to establish a large deviation bound for \(\sum _{j\in s}{\varvec{a}}_j\xi _{2j}\). Let \(C_1\) be a positive constant and denote

$$\begin{aligned} {\fancyscript{C}}&= \left\{ \Vert \sup _{u\in [0,1]}[E_n\left( {\varvec{\beta }}_0,u\right) -e\left( {\varvec{\beta }}_0,u\right) ]\Vert _{+\infty }\le \dfrac{C_1u_n}{\sqrt{n}},\;\sup _{u\in [0,1]}|S_n({\varvec{\beta }}_0,u)\right. \nonumber \\&\left. -s({\varvec{\beta }}_0,u)| \le \dfrac{C_1u_n}{\sqrt{n}}\right\} , \end{aligned}$$

then

$$\begin{aligned}&P\left( \left| \sum _{j\in s}{\varvec{a}}_j\xi _{2j}\right| >(1-c)\sqrt{n}u_n\right) \\&\quad \le P\left( \Vert \sup _{u\in [0,1]}[E_n\left( {\varvec{\beta }}_0,u\right) -e\left( {\varvec{\beta }}_0,u\right) ]\Vert _{+\infty }\ge \dfrac{C_1u_n}{\sqrt{n}}\right) \\&\quad \quad +P\left( \sup _{u\in [0,1]}|S_n({\varvec{\beta }}_0,u)-s({\varvec{\beta }}_0,u)|\ge \dfrac{C_1u_n}{\sqrt{n}}\right) \\&\quad \quad +P\left( \left| \sum _{j\in s}{\varvec{a}}_j\xi _{2j}\right| >(1-c)\sqrt{n}u_n\mid {\fancyscript{C}}\right) \\&\quad \equiv P_{2,1}+P_{2,2,1}+P_{2,2,2}. \end{aligned}$$

Inequality (5) and the remark on Assumption A1 show that there exists a positive constant \(C_0\) such that

$$\begin{aligned} P_{2,1}\le C_0\exp \left( -\dfrac{u_n^2}{2}+\kappa \ln n-\ln u_n\right) \;;\;P_{2,2,1}\le C_0\exp \left( -\dfrac{u_n^2}{2}-\ln u_n\right) . \end{aligned}$$
(9)

In the following, we verify that, conditional on \({\fancyscript{C}}\), the martingale \(\sum _{j\in s}{\varvec{a}}_j\xi _{2j}(t)\) has bounded jumps, following the steps in the proof of Theorem 3.1 in ?. Let \(\bar{M}(t)=\sum \limits _{i=1}^nM_i(t)\) and \(\bar{N}(t)=\sum \limits _{i=1}^nN_i(t)\); then \(|\triangle (\bar{M}(t))|=|\triangle (\bar{N}(t))|\le 1\).

First,

$$\begin{aligned} \left| \triangle \left( n^{-1/2}\xi _{2j}(t)\right) \right| \!\le \! n^{-1/2}\Vert \sup _{u\in [0,1]}[E_n\left( {\varvec{\beta }}_0,u\right) -e\left( {\varvec{\beta }}_0,u\right) ]\Vert _{+\infty } \!\equiv \! n^{-1/2}c_n \!\le \! \dfrac{C_1u_n}{n}; \end{aligned}$$

therefore,

$$\begin{aligned} \left| \triangle \left( n^{-1/2}\sum _{j\in s}{\varvec{a}}_j\xi _{2j}(t)\right) \right| \le \sum _{j\in s}|{\varvec{a}}_j|\left| \triangle \left( n^{-1/2}\xi _{2j}(t)\right) \right| \le \dfrac{|s|C_1u_n}{n}. \end{aligned}$$
(10)

Second, the predictable quadratic variation of \(n^{-1/2}\xi _{2j}(t)\), denoted by \(\left<n^{-1/2}\xi _{2j}(t)\right>\), is bilinear, and for all \(j\in \{1,2,\ldots ,p_n\}\),

$$\begin{aligned} \left<n^{-1/2}\xi _{2j}(t)\right>&= n^{-1}\int _0^t\left( E_{nj} \left( {\varvec{\beta }}_0,u\right) -e_j\left( {\varvec{\beta }}_0,u\right) \right) ^2\mathrm{d} \left<\bar{M}(u)\right>\\&= \int _0^t\left( E_{nj}\left( {\varvec{\beta }}_0,u\right) -e_j\left( {\varvec{\beta }}_0,u\right) \right) ^2S_n({\varvec{\beta }}_0,u)h_0(u)\mathrm{d}u\\&\le \Vert \sup _{u\in [0,1]}[E_n\left( {\varvec{\beta }}_0,u\right) -e\left( {\varvec{\beta }}_0,u \right) ]\Vert _{+\infty }^2\int _0^tS_n({\varvec{\beta }}_0,u)h_0(u)\mathrm{d}u\\&\equiv b_n^2(t).\\ \left<n^{-1/2}\sum _{j\in s}{\varvec{a}}_j\xi _{2j}(t)\right>&\le |s|\sum _{j\in s}{\varvec{a}}_j^2\left<n^{-1/2}\xi _{2j}(t)\right>\le |s|^2b_n^2(t). \end{aligned}$$

Obviously, \(b_n^2(t)\le b_n^2(1)\le c_n^2\int _0^1S_n({\varvec{\beta }}_0,u)h_0(u)\mathrm{d}u\). Note that

$$\begin{aligned} \int _0^1S_n({\varvec{\beta }}_0,u)h_0(u)\mathrm{d}u&\le \int _0^1s({\varvec{\beta }}_0,u)h_0(u)\mathrm{d}u\nonumber \\&+ \sup _{u\in [0,1]}|S_n({\varvec{\beta }}_0,u)-s({\varvec{\beta }}_0,u)|\int _0^1h_0(u)\mathrm{d}u. \end{aligned}$$

Assumption A1.2 and Eq. (10) imply that

$$\begin{aligned} \sup _{t\in [0,1]}b_n^2(t)\le c_n^2\left( C_1+C_2\dfrac{C_1u_n}{\sqrt{n}}\right) \le C\frac{u_n^2}{n}. \end{aligned}$$

That is, when \(|s|=O(1)\), conditional on \({\fancyscript{C}}\), there exist constants \(b^2=O(\frac{u_n^2}{n}),\;K=O(\frac{u_n}{n})\) such that

$$\begin{aligned} \left| \triangle \left( n^{-1/2}\sum _{j\in s}{\varvec{a}}_j\xi _{2j}(t)\right) \right| \le K;\;\left<n^{-1/2}\sum _{j\in s}{\varvec{a}}_j\xi _{2j}(t)\right>\le b^2. \end{aligned}$$

According to Lemma 2.1 in Van de Geer (1995), we have

$$\begin{aligned} P_{2,2,2}&\le 2\exp \left( -\dfrac{(1-c)^2u_n^2}{2(K(1-c)u_n+b^2)}\right) \nonumber \\&= 2\exp \left( -\dfrac{u_n^2}{2(K(1-c)^{-1}u_n+(1-c)^{-2}b^2)}\right) , \end{aligned}$$

Since \(u_n^2/n\rightarrow 0\), both \(Ku_n\) and \(b^2\) tend to 0; hence, when \(n\) is sufficiently large, there exists an arbitrarily large positive constant \(M\) such that

$$\begin{aligned} P_{2,2,2}\le 2\exp (-Mu_n^2). \end{aligned}$$

Hence, together with (8) and (9), and because of the arbitrariness of \(c\), there exist a positive constant \(c_0\) independent of \(j\) and an arbitrarily small positive \(\varepsilon \) such that

$$\begin{aligned} P\left( \left| {\varvec{u}}^{\tau }\varSigma ^{-1/2}\left( {\varvec{\beta }}_0(s),1\right) U \left( {\varvec{\beta }}_0(s),1\right) \right| >\sqrt{n}u_n\right) \le c_0\exp \left( -\dfrac{(1-\varepsilon )u_n^2}{2}\right) . \end{aligned}$$

When \({\varvec{a}}_j=1\) for a fixed \(j\) and \({\varvec{a}}_k=0\) for \(k\ne j\), we have

$$\begin{aligned} P\left( |U_j\left( {\varvec{\beta }}_0,1\right) |>\sqrt{n}u_n\right) \le c_0\exp \left( -\dfrac{(1-\varepsilon )u_n^2}{2}\right) \end{aligned}$$

uniformly over \(j\in \{1,2,\ldots ,p_n\}\). \(\square \)

Proof of Theorem 2

For any unit vector \({\varvec{w}}(s)\), let \({\varvec{\beta }}(s)={\varvec{\beta }}_0(s)+\psi _n{\varvec{w}}(s)\) where \(\psi _n\) satisfies (4). Under Assumption A3, for all \(s\in {\fancyscript{A}}_0\), the mean value theorem implies that there exists \(\tilde{{\varvec{\beta }}}(s)\) satisfying \(\Vert \tilde{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\Vert _2\le \Vert \psi _n{\varvec{w}}(s)\Vert _2\) such that

$$\begin{aligned} l_n({\varvec{\beta }}(s))-l_n({\varvec{\beta }}_0(s))&= \psi _n{\varvec{w}}^{\tau }(s)U({\varvec{\beta }}_0(s),1)- \dfrac{1}{2}\psi _n^2{\varvec{w}}(s)^{\tau }\{I(\tilde{{\varvec{\beta }}}(s),1)\}{\varvec{w}}(s)\\&\le \psi _n{\varvec{w}}^{\tau }(s)U({\varvec{\beta }}_0(s),1)-\frac{1-\varepsilon }{2}\lambda _{1,n} \psi _n^2\\&\le \psi _n\sqrt{{\varvec{w}}^{\tau }(s){\varvec{w}}(s)}\sqrt{U^{\tau }({\varvec{\beta }}_0(s),1)U({\varvec{\beta }}_0(s),1)}-\frac{1-\varepsilon }{2}\lambda _{1,n}\psi _n^2\\&\le \psi _n\sqrt{k_n}\max _{j\in s,s\in {\fancyscript{A}}_0}\left| U_j({\varvec{\beta }}_0(s),1)\right| -\frac{1-\varepsilon }{2}\lambda _{1,n}\psi _n^2. \end{aligned}$$

Hence, we have

$$\begin{aligned}&P(l_n({\varvec{\beta }}(s))-l_n({\varvec{\beta }}_0(s))>0:\;\; \text {for some}\;\;{\varvec{w}}(s))\nonumber \\&\quad \le P\left( \max _{j\in s,s\in {\fancyscript{A}}_0}\left| U_j({\varvec{\beta }}_0(s),1)\right| \ge \frac{1-\varepsilon }{2\sqrt{k_n}}\lambda _{1,n}\psi _n\right) . \end{aligned}$$

By noting that \(k_n=O(1)\) and \(p_n=O(n^{\kappa })\), and letting \(u_n=\frac{1-\varepsilon }{2\sqrt{nk_n}}\lambda _{1,n}\psi _n\), we have \(n^{-1/6}u_n\rightarrow 0\) and \(u_n(\ln n)^{-1/2}\rightarrow +\infty \). According to (2), it follows that

$$\begin{aligned}&P\left( \max _{j\in s,s\in {\fancyscript{A}}_0}\left| U_j({\varvec{\beta }}_0(s),1)\right| \ge \frac{1-\varepsilon }{2\sqrt{k_n}}\lambda _{1,n}\psi _n\right) \nonumber \\&\quad \le \sum \limits _{j\in s,s\in {\fancyscript{A}}_0}P\left( \left| U_j({\varvec{\beta }}_0(s),1)\right| \ge \frac{1-\varepsilon }{2\sqrt{k_n}}\lambda _{1,n}\psi _n\right) \\&\quad \le k_np_n^{k_n} C_0\exp \left( -C_1\frac{\lambda _{1,n}^2\psi _n^2}{n}\right) \\&\quad \le \tilde{C}_0\exp \left( -C_1\frac{\lambda _{1,n}^2\psi _n^2}{n}+C_2\kappa \ln n\right) \end{aligned}$$

for some positive constants \(C_0,C_1,C_2,\tilde{C}_0\); this bound converges to 0 as \(n\) goes to infinity. Because \(l_n\left( {\varvec{\beta }}(s)\right) \) is concave in \({\varvec{\beta }}(s)\), we obtain the desired result. \(\square \)

Proof of Theorem 3

Note that \(\{s:s\ne s_0, |s|\le Cp_0\}={\fancyscript{A}}_1 \cup {\fancyscript{A}}_0\). If we can prove that, when \(\gamma > 1- \frac{1}{2\kappa }\), as \(n\rightarrow +\infty \),

$$\begin{aligned} P\left( \min _{s: s \in {\fancyscript{A}}_1}\mathrm{EBIC}_{\gamma }(s)\le \mathrm{EBIC}_{\gamma }(s_{0})\right) \rightarrow 0, \end{aligned}$$
(11)

and

$$\begin{aligned} P\left( \min _{s: s \in {\fancyscript{A}}_0}\mathrm{EBIC}_{\gamma }(s)\le \mathrm{EBIC}_{\gamma }(s_{0})\right) \rightarrow 0, \end{aligned}$$
(12)

then the proof will be complete. Since, asymptotically, \(\ln \tau ({\fancyscript{S}}_j)=j\kappa \ln n(1+o(1))\),

$$\begin{aligned} \text{ EBIC }_{\gamma }(s_{0n})-\text{ EBIC }_{\gamma }(s) \!=\! 2 \left( l_n(\hat{{\varvec{\beta }}}(s))-l_n(\hat{{\varvec{\beta }}}(s_{0n}))\right) \!+\! (1+2\gamma \kappa )\left( |s_{0n}|-|s|\right) \ln n, \end{aligned}$$

so \(\text{ EBIC }_{\gamma }(s)\le \text{ EBIC }_{\gamma }(s_{0n})\) implies

$$\begin{aligned} l_n (\hat{{\varvec{\beta }}}(s))-l_n (\hat{{\varvec{\beta }}}(s_{0n}))\ge -\dfrac{1+2\gamma \kappa }{2}\left( |s_{0n}|-|s|\right) \ln n. \end{aligned}$$
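The asymptotic count \(\ln \tau ({\fancyscript{S}}_j)=j\kappa \ln n(1+o(1))\) invoked above can be checked numerically; the following is a minimal sketch, assuming the usual EBIC convention \(\tau ({\fancyscript{S}}_j)=\binom{p_n}{j}\) with \(p_n = n^{\kappa}\):

```python
from math import comb, log

def lntau_ratio(n, kappa, j):
    """Ratio of ln tau(S_j) to j * kappa * ln n, with p_n = n^kappa, assuming
    tau(S_j) = C(p_n, j), the number of models of size j among p_n features."""
    p = round(n ** kappa)
    return log(comb(p, j)) / (j * kappa * log(n))
```

For fixed \(j\) the ratio approaches 1 as \(n\) grows, which is the \(1+o(1)\) factor in the display above.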
  1. When \(s\in {\fancyscript{A}}_1\), note that

    $$\begin{aligned} -\dfrac{1+2\gamma \kappa }{2}\left( |s_{0n}|-|s|\right) \ln n\ge -\dfrac{1+2\gamma \kappa }{2}|s_{0n}|\ln n\ge -C\ln n \end{aligned}$$

    for some positive constant \(C\) when \(-\dfrac{1}{2\kappa }<\gamma \le 1 \) and \(\kappa \) is a positive constant. Therefore, if we can show that

    $$\begin{aligned} P(\sup \{l_n (\hat{{\varvec{\beta }}}(s))-l_n (\hat{{\varvec{\beta }}}(s_{0n})): s\in {\fancyscript{A}}_1\}\ge -C\ln n)\rightarrow 0, \end{aligned}$$
    (13)

    then we will have (11). Now, consider \(\tilde{s}=s\cup s_{0n}\) and \({\varvec{\beta }}(\tilde{s})\) near \({\varvec{\beta }}_0(\tilde{s})\). Taylor expansion shows that

    $$\begin{aligned} l_n\left( {\varvec{\beta }}(\tilde{s})\right) -l_n\left( {\varvec{\beta }}_0(\tilde{s})\right)&\le \left( {\varvec{\beta }}(\tilde{s})-{\varvec{\beta }}_0(\tilde{s}) \right) ^{\tau }U({\varvec{\beta }}_0(s))\nonumber \\&-\dfrac{(1-\varepsilon )\lambda _{1,n}}{2} \left\| {\varvec{\beta }}(\tilde{s})-{\varvec{\beta }}_0(\tilde{s})\right\| _2^2. \end{aligned}$$

    Let \(\breve{{\varvec{\beta }}}(\tilde{s})\) be \(\hat{{\varvec{\beta }}}(s)\) augmented with the components in \(\tilde{s}\cap s^c\) set to 0; then \(l_n\left( \hat{{\varvec{\beta }}}(s)\right) =l_n\left( \breve{{\varvec{\beta }}}(\tilde{s})\right) \) and \(\Vert \breve{{\varvec{\beta }}}(\tilde{s})-{\varvec{\beta }}_0(\tilde{s})\Vert _2\ge |{\varvec{\beta }}_{0,\min }|\), where \(|{\varvec{\beta }}_{0,\min }|=\min \left\{ |{\varvec{\beta }}_{0,j}|:j\in s_{0n}\right\} \). The concavity of \(l_n\left( {\varvec{\beta }}(s)\right) \) implies

    $$\begin{aligned} {\fancyscript{M}}_n&= \sup \left\{ l_n\left( {\varvec{\beta }}(\tilde{s})\right) -l_n \left( {\varvec{\beta }}_0(\tilde{s})\right) :s\in {\fancyscript{A}}_1, \Vert {\varvec{\beta }}(\tilde{s})- {\varvec{\beta }}_0(\tilde{s})\Vert _2\ge |{\varvec{\beta }}_{0,\min }|\right\} \\&\le \sup \left\{ l_n\left( {\varvec{\beta }}(\tilde{s})\right) -l_n \left( {\varvec{\beta }}_0(\tilde{s})\right) :s\in {\fancyscript{A}}_1, \Vert {\varvec{\beta }}(\tilde{s})-{\varvec{\beta }}_0(\tilde{s})\Vert _2=|{\varvec{\beta }}_{0,\min }|\right\} . \end{aligned}$$

    For any fixed \(\tilde{s}\), when \(\Vert {\varvec{\beta }}(\tilde{s})-{\varvec{\beta }}_0(\tilde{s})\Vert _2=|{\varvec{\beta }}_{0,\min }|\), we have

    $$\begin{aligned} l_n\left( {\varvec{\beta }}(\tilde{s})\right) -l_n\left( {\varvec{\beta }}_0(\tilde{s})\right) \le |{\varvec{\beta }}_{0,\min }|\Vert U_j({\varvec{\beta }}_0(\tilde{s}))\Vert _{+\infty }-{\varvec{\beta }}_{0,\min }^2\dfrac{(1-\varepsilon )\lambda _{1,n}}{2}. \end{aligned}$$

    Therefore,

    $$\begin{aligned} P\left( {\fancyscript{M}}_n\ge -{\varvec{\beta }}_{0,\min }^2\dfrac{(1-\varepsilon )\lambda _{1,n}}{4}\right)&\le k_np_n^{k_n}P(\Vert U_j({\varvec{\beta }}_0(\tilde{s}))\Vert _{+\infty }\nonumber \\&\ge \dfrac{|{\varvec{\beta }}_{0,\min }|(1-\varepsilon )\lambda _{1,n}}{4}). \end{aligned}$$

    When \(n^{1/6-\delta }=O(\lambda _{1,n}/\sqrt{n})\) for some \(0<\delta <1/6\),

    $$\begin{aligned}&P(\sup \{l_n (\hat{{\varvec{\beta }}}(s))-l_n (\hat{{\varvec{\beta }}}(s_{0n})): s\in {\fancyscript{A}}_1\}\ge -C\ln n)\\&\quad \le P\left( {\fancyscript{M}}_n\ge -C\ln n\right) \le P\left( {\fancyscript{M}}_n\ge -{\varvec{\beta }}_{0,\min }^2\dfrac{(1-\varepsilon )\lambda _{1,n}}{4}\right) \\&\quad \le k_np_n^{k_n}P(\Vert U_j({\varvec{\beta }}_0(\tilde{s}))\Vert _{+\infty }\ge \sqrt{n}n^{1/6-\delta })\le c_0\exp \left( -c_1n^{1/3-2\delta }+\kappa \ln n\right) . \end{aligned}$$

    This bound converges to 0 as \(n\rightarrow \infty \); inequality (13) is thus obtained.

  2. When \(s\in {\fancyscript{A}}_0\) and \(s\ne s_{0n}\), let \(m=|s|-|s_{0n}|\); then \(\text{ EBIC }_{\gamma }(s)\le \text{ EBIC }_{\gamma }(s_{0n})\) if and only if

    $$\begin{aligned} l_n(\hat{{\varvec{\beta }}}(s))-l_n(\hat{{\varvec{\beta }}}(s_{0n}))\ge m[0.5\ln n+\gamma \ln p_n] \approx \dfrac{m(1+2\gamma \kappa )\ln n}{2}. \end{aligned}$$

    From the assumptions, we can see that

    $$\begin{aligned} l_n(\hat{{\varvec{\beta }}}(s))-l_n(\hat{{\varvec{\beta }}}(s_{0n}))&\le l_n(\hat{{\varvec{\beta }}}(s))-l_n({\varvec{\beta }}(s_{0n}))=l_n\left( \hat{{\varvec{\beta }}}(s)\right) -l_n\left( {\varvec{\beta }}_0(s)\right) \\&\le \left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) ^{\tau }U\left( {\varvec{\beta }}_0(s),1\right) \\&\quad -\dfrac{1}{2}\left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) ^{\tau }I\left( \widetilde{{\varvec{\beta }}}(s),1\right) \left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) \\&\le \left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) ^{\tau }U\left( {\varvec{\beta }}_0(s),1\right) \\&\quad -\dfrac{1-\varepsilon }{2}\left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) ^{\tau }I\left( {\varvec{\beta }}_0(s),1\right) \left( \hat{{\varvec{\beta }}}(s)-{\varvec{\beta }}_0(s)\right) \\&\le \max _{{\varvec{\beta }}}\left[ {\varvec{\beta }}^{\tau }U\left( {\varvec{\beta }}_0(s),1\right) -\dfrac{1-\varepsilon }{2}{\varvec{\beta }}^{\tau }I\left( {\varvec{\beta }}_0(s),1\right) {\varvec{\beta }}\right] \\&= \dfrac{1}{2}\left[ {\varvec{\beta }}^{\tau }U\left( {\varvec{\beta }}_0(s),1\right) \right] |_{{\varvec{\beta }}=[(1-\varepsilon )I\left( {\varvec{\beta }}_0(s),1\right) ]^{-1}U\left( {\varvec{\beta }}_0(s),1\right) }\\&= \frac{1}{2n(1-\varepsilon )}U^{\tau }\left( {\varvec{\beta }}_0(s),1\right) \left[ \frac{I\left( {\varvec{\beta }}_0(s),1\right) }{n}\right] ^{-1}U\left( {\varvec{\beta }}_0(s),1\right) , \end{aligned}$$

    where \(\varepsilon \) is an arbitrary positive value. Note that \(m\) is finite; therefore, if we can show that for any fixed positive integer \(m\), when \(\gamma >1-\frac{1}{2\kappa }\),

    $$\begin{aligned}&P\left( \max \limits _{s\in {\fancyscript{A}}_0, |s|=m+|s_{0n}|}\frac{1}{2n(1-\varepsilon )}U^{\tau }\left( {\varvec{\beta }}_0(s),1 \right) \left[ \frac{I\left( {\varvec{\beta }}_0(s),1 \right) }{n}\right] ^{-1}U\left( {\varvec{\beta }}_0(s),1\right) \right. \nonumber \\&\left. \quad \ge \dfrac{m(1+2\gamma \kappa )\ln n}{2}\right) \rightarrow 0, \end{aligned}$$
    (14)

    then we will have (12). Denote

    $$\begin{aligned} {\fancyscript{T}}_1&= \left\{ \max _{s\in {\fancyscript{A}}_0}\Vert [\frac{I\left( {\varvec{\beta }}_0(s),1\right) }{n}]^{-1}- \varSigma ^{-1}\left( {\varvec{\beta }}_0(s),1\right) \Vert _{+\infty }\le \dfrac{C_1u_n}{\sqrt{n}}\right\} \\ {\fancyscript{T}}_2&= \left\{ \max _{s\in {\fancyscript{A}}_0}\frac{U^{\tau }\left( {\varvec{\beta }}_0(s),1\right) U \left( {\varvec{\beta }}_0(s),1\right) }{|s|}\le nu_n^2\right\} . \end{aligned}$$

    Inequalities (6) and (2) show that

    $$\begin{aligned} P\left( {\fancyscript{T}}_1^c\right)&\le \frac{C_0}{u_n}\exp \left( -\frac{u_n^2}{2}+2\kappa \ln n\right) ;\;P\left( {\fancyscript{T}}_2^c\right) \nonumber \\&\le c_0\exp \left( -\frac{(1-\varepsilon )u_n^2}{2}+\kappa \ln n\right) . \end{aligned}$$
    (15)
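    As a sanity check on these exponential bounds (an illustrative choice, not one made in the original argument), taking \(u_n=\sqrt{C\ln n}\) with \(C>4\kappa \) drives both right-hand sides of (15) to 0, since for \(\varepsilon <1/2\),

    $$\begin{aligned} -\frac{u_n^2}{2}+2\kappa \ln n=\left( 2\kappa -\frac{C}{2}\right) \ln n\rightarrow -\infty ,\qquad -\frac{(1-\varepsilon )u_n^2}{2}+\kappa \ln n=\left( \kappa -\frac{(1-\varepsilon )C}{2}\right) \ln n\rightarrow -\infty , \end{aligned}$$

    while \(n^{-1/6}u_n\rightarrow 0\), as needed for the \(o(n\ln n)\) step below (14).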

    Therefore, we have

    $$\begin{aligned}&P\left( \max \limits _{s\in {\fancyscript{A}}_0,|s|=m+|s_{0n}|}\right. \left. \frac{1}{2n(1-\varepsilon )}U^{\tau }\left( {\varvec{\beta }}_0(s),1\right) \left[ \frac{I\left( {\varvec{\beta }}_0(s),1\right) }{n}\right] ^{-1}U \left( {\varvec{\beta }}_0(s),1\right) \right. \\&\quad \left. \ge \dfrac{m(1+2\gamma \kappa )\ln n}{2}\right) \\&\quad \le P\left( \max _{s\in {\fancyscript{A}}_0,|s|=m+|s_{0n}|}U^{\tau }\left( {\varvec{\beta }}_0(s),1 \right) \left[ \frac{I\left( {\varvec{\beta }}_0(s),1\right) }{n}\right] ^{-1}U \left( {\varvec{\beta }}_0(s),1\right) \right. \\&\quad \left. \ge mn(1-\varepsilon )(1+2\gamma \kappa )\ln n \mid {\fancyscript{T}}_1,{\fancyscript{T}}_2\right) \\&\quad \quad + P\left( {\fancyscript{T}}_1^c\right) + P\left( {\fancyscript{T}}_2^c\right) . \end{aligned}$$

    Note that, under \({\fancyscript{T}}_1\) and \({\fancyscript{T}}_2\),

    $$\begin{aligned}&\max _{s\in {\fancyscript{A}}_0,|s|=m+|s_{0n}|}\left[ U^{\tau }\left( {\varvec{\beta }}_0(s),1\right) \left| [\frac{I\left( {\varvec{\beta }}_0(s),1\right) }{n}]^{-1}-\varSigma ^{-1}\left( {\varvec{\beta }}_0(s),1\right) \right| U\left( {\varvec{\beta }}_0(s),1\right) \right] \\&\quad \le C\sqrt{n}u_n^3=C\frac{(n^{-1/6}u_n)^3}{\ln n}(n\ln n)=o(n\ln n), \end{aligned}$$

    the two terms in (15) both converge to 0 as \(n\) goes to \(+\infty \), and consequently

    $$\begin{aligned}&P\left( \max _{s\in {\fancyscript{A}}_0,|s|=m+|s_{0n}|}U^{\tau }\left( {\varvec{\beta }}_0(s),1 \right) \varSigma ^{-1}\left( {\varvec{\beta }}_0(s),1\right) U\left( {\varvec{\beta }}_0(s),1 \right) \right. \\&\quad \left. \ge mn(1-\varepsilon )(1+2\gamma \kappa )\ln n \mid {\fancyscript{T}}_1,{\fancyscript{T}}_2\right) \\&\quad \le CP\left( \max _{s\in {\fancyscript{A}}_0,|s|=m+|s_{0n}|}{\varvec{u}}^{\tau }\varSigma ^{-1/2}\left( {\varvec{\beta }}_0(s),1 \right) U\left( {\varvec{\beta }}_0(s),1\right) \right. \\&\quad \left. \ge (1-\delta )\sqrt{mn(1-\varepsilon )(1+2\gamma \kappa )\ln n} \mid {\fancyscript{T}}_1,{\fancyscript{T}}_2\right) , \end{aligned}$$

    where \(\Vert {{\varvec{u}}}\Vert _2=1\) and \(\delta \) is an arbitrary positive value. According to (3), this probability can be further bounded by \(c_0^{\star }\exp \left[ -\dfrac{1-\varepsilon ^{\star }}{2}(1+2\gamma \kappa )m\ln n+m\kappa \ln n\right] \), where \(c_0^{\star }\) is a positive constant and \(\varepsilon ^{\star }\) is an arbitrary positive value. This bound converges to 0 when \(\gamma >\frac{1}{1-\varepsilon ^{\star }}-\frac{1}{2\kappa }\); inequality (14) is thus obtained.
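    The threshold on \(\gamma \) follows from a routine rearrangement, recorded here for completeness: the exponent in the bound above is negative precisely when

    $$\begin{aligned} -\dfrac{1-\varepsilon ^{\star }}{2}(1+2\gamma \kappa )m\ln n+m\kappa \ln n<0\;\Longleftrightarrow \;1+2\gamma \kappa >\dfrac{2\kappa }{1-\varepsilon ^{\star }}\;\Longleftrightarrow \;\gamma >\dfrac{1}{1-\varepsilon ^{\star }}-\dfrac{1}{2\kappa }. \end{aligned}$$

    Since \(\varepsilon ^{\star }\) can be taken arbitrarily small, any \(\gamma >1-\frac{1}{2\kappa }\) suffices, matching the condition stated earlier in this part of the proof.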

\(\square \)

About this article

Cite this article

Luo, S., Xu, J. & Chen, Z. Extended Bayesian information criterion in the Cox model with a high-dimensional feature space. Ann Inst Stat Math 67, 287–311 (2015). https://doi.org/10.1007/s10463-014-0448-y
