Abstract
In this paper, we study model selection and model averaging for quantile regression with a randomly right-censored response. We consider a semiparametric censored quantile regression model that requires no distributional assumptions. Under general conditions, a focused information criterion and a frequentist model averaging estimator are proposed, and the theoretical properties of the proposed methods are established. The performance of the procedures is illustrated by extensive simulations and an application to the primary biliary cirrhosis data.
References
Akaike H (1973) Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60:255–265
Bang H, Tsiatis A (2002) Median regression with censored cost data. Biometrics 58:643–649
Behl P, Claeskens G, Dette H (2014) Focused model selection in quantile regression. Stat Sin 24:601–624
Claeskens G, Croux C, Van Kerckhoven J (2006) Variable selection for logistic regression using a prediction-focused information criterion. Biometrics 62:972–979
Claeskens G, Carroll RJ (2007) An asymptotic theory for model selection inference in general semiparametric problems. Biometrika 94:249–265
Claeskens G, Hjort NL (2003) The focused information criterion (with discussion). J Am Stat Assoc 98:900–916
Deng GH, Liang H (2010) Model averaging for semiparametric additive partial linear models. Sci China Math 53:1363–1376
Du J, Zhang ZZ, Xie TF (2013) Focused information criterion and model averaging in quantile regression. Commun Stat Theory Methods 42:3716–3734
Hansen BE (2007) Least squares model averaging. Econometrica 75:1175–1189
Hansen BE (2008) Least squares forecast averaging. J Econ 146:342–350
Hjort NL, Claeskens G (2003) Frequentist model average estimators (with discussion). J Am Stat Assoc 98:879–945
Hjort NL, Claeskens G (2006) Focused information criteria and model averaging for the Cox hazard regression model. J Am Stat Assoc 101:1449–1464
Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481
Kitagawa T, Muris C (2016) Model averaging in semiparametric estimation of treatment effects. J Econ 193:271–289
Koenker R (2004) Quantile regression for longitudinal data. J Multivar Anal 91:74–89
Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge
Liu C (2015) Distribution theory of the least squares averaging estimator. J Econ 186:142–159
Pang L, Lu W, Wang H (2012) Variance estimation in censored quantile regression via induced smoothing. Comput Stat Data Anal 56:785–796
Peng L, Huang Y (2008) Survival analysis with quantile regression models. J Am Stat Assoc 103:637–649
Pircalabelu E, Claeskens G, Waldorp L (2015) A focused information criterion for graphical models. Stat Comput 25:1071–1092
Pollard D (1991) Asymptotics for least absolute deviation regression estimators. Econom Theory 7:186–199
Powell JL (1984) Least absolute deviations estimation for the censored regression model. J Econ 25:303–325
Powell JL (1986) Censored regression quantiles. J Econ 32:143–155
Qian J, Peng L (2010) Censored quantile regression with partially functional effects. Biometrika 97:839–850
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Shows J, Lu W, Zhang H (2010) Sparse estimation and inference for censored median regression. J Stat Plann Inference 140:1903–1917
Therneau TM, Grambsch PM (2000) Modeling survival data: extending the Cox model. Springer, New York
Tibshirani R (1997) The LASSO method for variable selection in the Cox model. Stat Med 16:385–395
Wang H (2009) Inference on quantile regression for heteroscedastic mixed models. Stat Sin 19:1247–1261
Wang H, Fygenson M (2009) Inference for censored quantile regression models in longitudinal studies. Ann Stat 37:756–781
Wu Y, Liu Y (2009) Variable selection in quantile regression. Stat Sin 19:801–817
Xu J, Leng C, Ying Z (2010) Rank-based variable selection in the accelerated failure time model. Stat Comput 20:165–176
Xu G, Wang S, Huang JZ (2014) Focused information criterion and model averaging based on weighted composite quantile regression. Scand J Stat 41:365–381
Ying Z, Jung SH, Wei LJ (1995) Survival analysis with median regression models. J Am Stat Assoc 90:178–184
Zeng D, Lin DY (2008) Efficient resampling methods for non-smooth estimating functions. Biostatistics 9:355–363
Zhang X, Wan ATK, Zhou SZ (2012) Focused information criteria, model selection and model averaging in a Tobit model with a non-zero threshold. J Bus Econ Stat 30:132–142
Zhang X, Zou G, Liang H (2013) Choice of weights in FMA estimators under general parametric models. Sci China Math 56:443–457
Zhang X, Liang H (2011) Focused information criterion and model averaging for generalized additive partial linear models. Ann Stat 39:174–200
Zhang H, Lu W (2007) Adaptive LASSO for Cox’s proportional hazards model. Biometrika 94:1–13
Acknowledgements
The authors would like to thank Prof. Norbert Henze and the anonymous reviewer for their valuable suggestions, which improved the presentation and the results of the paper.
Additional information
Du’s research was supported by Grants from the National Natural Science Foundation of China (No. 11261025) and Program for Rixin Talents in Beijing University of Technology (No. 006000514116003). Zhang’s research was supported by the National Natural Science Foundation of China (No. 11271039) and Research Fund of Beijing Education Committee (No. 00600054K1002). Xie’s research was supported by Grants from the National Natural Science Foundation of China (No. 11571340) and the Science and Technology Project of Beijing Municipal Education Commission (KM201710005032).
Appendix
We first state some regularity conditions.
Regularity Conditions
- C1: \(\varepsilon _1,\ldots , \varepsilon _n\) are independent and have a common continuous conditional probability density function \(f(\cdot |X = x)\) satisfying \(f(0|X=x)\ge b_0>0\), \(|\dot{f}(0|X = x)| \le B_0\), and \(\sup _s |\dot{f}(s|X = x)| \le B_0\) for all possible values x of X, where \(b_0\) and \(B_0\) are positive constants and \(\dot{f}\) denotes the derivative of f.
- C2: The covariate vectors \(X_1,\ldots , X_n\) are independent and have a common compact support, and the parameter \(\beta _0\) belongs to the interior of a known compact set \(\mathcal {B}_0\).
- C3: \(P(t \le T \le C) \ge \zeta _0 > 0\) for any \(t \in [0, c]\), where \(\zeta _0\) is a positive constant and c is the maximum follow-up time.
These regularity conditions guarantee asymptotic normality of the proposed estimator.
In order to prove the theorems, we make a linear approximation to \(\rho _\tau (\varepsilon _i-t)\) through \(D_i=(1-\tau )I_{\{\varepsilon _i<0\}} -\tau I_{\{\varepsilon _i\ge 0\}}\); intuitively, \(D_i\) is the first derivative of \( \rho _\tau (\varepsilon _i-t)\) with respect to t at \(t=0\) (Pollard 1991). The assumption that \(\varepsilon _i\) has \(\tau \)th quantile zero implies \(E(D_i)=0\) and \(Var(D_i)=\tau (1-\tau )\). We begin by stating an auxiliary lemma that plays an important role in the proofs of our main theorems.
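The two moments of \(D_i\) quoted above can be checked numerically. The following sketch (an illustration, not part of the paper; it uses a uniform \(\varepsilon\) on \((-\tau, 1-\tau)\), whose \(\tau\)th quantile is zero) verifies \(E(D_i)\approx 0\) and \(Var(D_i)\approx \tau(1-\tau)\) for several values of \(\tau\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
for tau in (0.25, 0.5, 0.75):
    # epsilon with tau-th quantile zero: uniform on (-tau, 1 - tau)
    eps = rng.uniform(-tau, 1.0 - tau, n)
    # D_i = (1 - tau) I{eps < 0} - tau I{eps >= 0}
    D = (1.0 - tau) * (eps < 0) - tau * (eps >= 0)
    # sample mean and variance should match E(D) = 0, Var(D) = tau (1 - tau)
    print(tau, D.mean(), D.var(), tau * (1.0 - tau))
```

Since \(D_i\) takes only the two values \(1-\tau\) and \(-\tau\) with probabilities \(\tau\) and \(1-\tau\), the moments also follow by direct calculation.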
Lemma 1
Denote
Under conditions C1–C3, for fixed \(\beta _S\) and \(\beta _0\), we have
where \(W_{n1}=\frac{1}{\sqrt{n}}\sum \limits _{i=1}^n\frac{\delta _i}{G_0(Y_i)}D_i X_i\), and \(\Sigma _0\) is defined in the following proof.
Proof of Lemma 1
Denote \(M(t)=E_X[\frac{\delta }{G_0(Y)}(\rho _\tau (\varepsilon -t)-\rho _\tau (\varepsilon ))]\), and note that \(M(t)=E_X(\rho _\tau (\varepsilon -t)-\rho _\tau (\varepsilon ))\); hereinafter, \(E_X\) denotes the conditional expectation given the \(X_i\)s. Under condition C1, it is easy to show that M(t) has a unique minimizer at zero and that its Taylor expansion at the origin has the form \(M(t)=\frac{f(0)}{2}t^2+o(t^2)\). Hence, for large n, we have
where
Invoking the law of large numbers and conditions C1 and C2, one has
almost surely, where \(\Sigma _0\) is a \(d\times d\) positive definite matrix.
Therefore, combining this with condition C2, one has \(A_n=O(1)\) almost surely, uniformly on the compact parameter space. Then, we have
\(G_{n}(\beta _S)\) can be rewritten as
where
By routine calculation, we get
Due to the cancellation of the cross-product terms, by conditions C2 and C3, we obtain
where \(\parallel \cdot \parallel \) denotes the Euclidean norm. By condition C2, the last term converges to zero almost surely, because
and
almost surely, where \(\Sigma =E(X_1X_1^T).\) Therefore, \(\sum \limits _{i=1}^n[R_{i,n,S} -E(R_{i,n,S})]=o_p(1)\) uniformly on the compact parameter space. Further, one has
This completes the proof. \(\square \)
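The local quadratic behavior \(M(t)=\frac{f(0)}{2}t^2+o(t^2)\) used in the proof above can also be seen numerically. The sketch below (illustrative only; it takes \(\tau = 0.5\) and standard normal errors, so \(f(0)=1/\sqrt{2\pi}\)) compares a Monte Carlo estimate of \(M(t)=E[\rho_\tau(\varepsilon-t)-\rho_\tau(\varepsilon)]\) with the quadratic approximation for small t:

```python
import numpy as np

rng = np.random.default_rng(2)
tau = 0.5
# N(0,1) errors: the tau-th quantile is 0 when tau = 0.5
eps = rng.standard_normal(2_000_000)

def rho(u):
    """Check (pinball) loss rho_tau(u) = u * (tau - I{u < 0})."""
    return u * (tau - (u < 0))

f0 = 1.0 / np.sqrt(2.0 * np.pi)  # density of eps at zero
for t in (-0.2, -0.1, 0.1, 0.2):
    # Monte Carlo estimate of M(t), compared with f(0) t^2 / 2
    M_hat = np.mean(rho(eps - t) - rho(eps))
    print(t, M_hat, 0.5 * f0 * t * t)
```

The positivity of the estimates on both sides of zero also illustrates that M(t) is minimized at the origin.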
Lemma 2
Denote
Suppose that conditions C1–C3 hold. Then we have the following asymptotic representations:
where W is a d-dimensional normal random vector with mean \(\varvec{0}\).
Proof of Lemma 2
It is easy to show that \(G_{n}({\widehat{G}},\beta _S)\) can be written as
where
First, we consider \(I_{n1}\). By a Taylor expansion (Shows et al. 2010), we have
where \(y(s)=\lim \limits _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^nI(Y_i\ge s), M_i^C(s)=(1-\delta _i)I(Y_i\le s)-\int _0^c I(Y_i\ge s)d\Lambda _C(s),\) and \(\Lambda _C(s)\) is the cumulative hazard function of the censoring time C. This leads to
Similarly, we get
Therefore, arguing as in the proof of Lemma 1, one has
where
Combining this with Lemma 1, one has
where \( W_{n}=W_{n1}+W_{n2}.\)
Notice that
where \(s_i=\frac{\delta _i}{G_0(Y_i)}X_iD_i-\int _0^c \frac{h(s)}{y(s)}dM_i^C(s)\) and \(h(s)=\lim \limits _{n\rightarrow \infty } \frac{1}{n}\sum _{i=1}^n \frac{\delta _i D_i X_i}{G_0(Y_i)} I(Y_i\ge s)\). By the martingale central limit theorem, \(\frac{1}{\sqrt{n}}\sum \limits _{i=1}^ns_i\) converges in distribution to a d-dimensional normal vector W with mean \(\varvec{0}\) and variance-covariance matrix \(\Sigma _1=E(s_1s_1^T)\). As a consequence, one has
\(\square \)
Lemma 3
(Convexity Lemma) Let \(\{T_n(\theta ):\theta \in \Theta \}\) be a sequence of random convex functions defined on a convex, open subset \(\Theta \) of \(R^p\). Suppose \(T(\cdot )\) is a real-valued function on \(\Theta \) for which \(T_n(\theta )\rightarrow T(\theta )\) in probability for each \(\theta \) in \(\Theta \). Then for each compact subset K of \(\Theta \),
in probability, and the function \(T(\cdot )\) is necessarily convex on \(\Theta \).
Proof of Lemma 3
There are many versions of the proof of this well-known Convexity Lemma. To save space, we omit it; interested readers are referred to Pollard (1991). \(\square \)
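The content of the Convexity Lemma, that pointwise convergence of convex functions upgrades to uniform convergence on compacts, can be illustrated numerically. The sketch below (an illustration under assumed standard normal errors, not the paper's code) takes the random convex function \(T_n(\theta)=\frac{1}{n}\sum_i|\varepsilon_i-\theta|\), whose pointwise limit is \(T(\theta)=E|Z-\theta|=2\phi(\theta)+\theta(2\Phi(\theta)-1)\), and measures the gap over a compact grid:

```python
from math import erf, exp, pi, sqrt

import numpy as np

rng = np.random.default_rng(3)
eps = rng.standard_normal(500_000)

def T_n(theta):
    # random convex function: sample mean of the absolute loss
    return np.mean(np.abs(eps - theta))

def T(theta):
    # pointwise limit: E|Z - theta| for Z ~ N(0, 1)
    phi = exp(-theta ** 2 / 2.0) / sqrt(2.0 * pi)
    Phi = 0.5 * (1.0 + erf(theta / sqrt(2.0)))
    return 2.0 * phi + theta * (2.0 * Phi - 1.0)

# sup-norm gap over a compact set [-2, 2]
grid = np.linspace(-2.0, 2.0, 81)
gap = max(abs(T_n(th) - T(th)) for th in grid)
print(gap)  # small: pointwise LLN plus convexity yields uniformity on compacts
```

For non-convex \(T_n\), pointwise convergence would not suffice; convexity is what makes the supremum over K controllable by finitely many points.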
Proof of Theorem 1
Observe that \({\widehat{\beta }}_S(\tau )=\arg \min \limits _{\beta _S}\sum \limits _{i=1}^n \frac{\delta _i}{{\widehat{G}}(Y_i)}\rho _\tau (Y_i-(\Pi _SX)_i^T\beta _S) \), so \({\widehat{\beta }}_S(\tau )\) minimizes
This completes the proof. \(\square \)
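As a sanity check on the inverse-probability-of-censoring weighting \(\delta_i/{\widehat{G}}(Y_i)\) that appears in the objective function above, the following sketch (an illustration with an assumed exponential toy model, not the paper's implementation; ties and the left-limit convention for \({\widehat{G}}\) are ignored) estimates \({\widehat{G}}\) by the Kaplan–Meier method and verifies that the weighted \(\tau\)th quantile of the observed times recovers the \(\tau\)th quantile of the latent survival time T:

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau = 20000, 0.5
T = rng.exponential(1.0, n)        # latent survival times, Exp(1)
C = rng.exponential(1.0 / 0.3, n)  # independent censoring times, Exp(0.3)
Y = np.minimum(T, C)               # observed times
delta = (T <= C).astype(int)       # failure indicator

# Kaplan-Meier estimate of the censoring survival G(t) = P(C > t):
# here the "events" are the censored observations (delta == 0)
order = np.argsort(Y)
Ys, ds = Y[order], delta[order]
at_risk = np.arange(n, 0, -1)                    # size of risk set at each time
G_hat = np.cumprod(1.0 - (ds == 0) / at_risk)    # Ghat evaluated at Y_(i)

# inverse-probability-of-censoring weights delta_i / Ghat(Y_i)
w = np.zeros(n)
w[ds == 1] = 1.0 / G_hat[ds == 1]

# the weighted tau-th quantile minimizes sum_i w_i * rho_tau(Y_i - b)
cum = np.cumsum(w)
b_hat = Ys[np.searchsorted(cum, tau * cum[-1])]
print(b_hat)  # should be close to the true median of T, log(2) ~ 0.693
```

Without the weights, the plain sample median of Y would be biased toward zero, since censoring truncates the large survival times.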
The proofs of Theorems 2 and 3 are similar to those of Theorems 2 and 3 in Zhang and Liang (2011), respectively, and we omit them.
Cite this article
Du, J., Zhang, Z. & Xie, T. Focused information criterion and model averaging in censored quantile regression. Metrika 80, 547–570 (2017). https://doi.org/10.1007/s00184-017-0616-1