Single index quantile regression for censored data


Quantile regression (QR) has become a popular method of data analysis, especially when the error term is heteroscedastic. It is particularly relevant for the analysis of censored survival data as an alternative to proportional hazards and the accelerated failure time models. Such data occur frequently in biostatistics, environmental sciences, social sciences and econometrics. There is a large body of work for linear/nonlinear QR models for censored data, but it is only recently that the single index quantile regression (SIQR) model has received some attention. However, the only existing method for fitting the SIQR model for censored data uses an iterative algorithm and no asymptotic theory for the resulting estimator of the parametric component is given. We propose a non-iterative estimation algorithm and derive the asymptotic distribution of the proposed estimator under heteroscedasticity. Results from simulation studies evaluating the finite sample performance of the proposed estimator are reported.



  1. Akritas MG (1996) On the use of nonparametric regression techniques for fitting parametric regression models. Biometrics 52(4):1342–1362

  2. Bücher A, El Ghouch A, Van Keilegom I (2014) Single-index quantile regression models for censored data. Working paper.

  3. Christou E, Akritas MG (2016) Single index quantile regression for heteroscedastic data. J Multivar Anal 150:169–182

  4. Christou E, Akritas MG (2018) Variable selection in heteroscedastic single-index quantile regression. Commun Stat Theory Methods 47(24):6019–6033

  5. Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B 34(2):187–220

  6. Efron B (1967) The two-sample problem with censored data. In: Le Cam L, Neyman J (eds) Proceedings of the 5th Berkeley symposium in mathematical statistics, vol IV. Prentice-Hall, Upper Saddle River, pp 831–853

  7. Gannoun A, Saracco J, Yu K (2007) Comparison of kernel estimators of conditional distribution function and quantile regression under censoring. Stat Model 7(4):329–344

  8. Gonzalez-Manteiga W, Cadarso-Suarez C (1994) Asymptotic properties of a generalized Kaplan–Meier estimator with some applications. J Nonparametr Stat 4:65–78

  9. Hansen B (2008) Uniform convergence rates for kernel estimation with dependent data. Econ Theory 24:726–748

  10. Hjort NL, Pollard D (1993) Asymptotics for minimisers of convex processes. Unpublished manuscript.

  11. Honoré C, Khan S, Powell J (2002) Quantile regression under random censoring. J Econ 109:67–105

  12. Hosmer D, Lemeshow S (1999) Applied survival analysis: regression modelling of time to event data. Wiley, New York

  13. Koenker R, Bassett G (1978) Regression quantiles. Econometrica 46(1):33–50

  14. Koenker R, Geling O (2001) Reappraising medfly longevity: a quantile regression survival analysis. J Am Stat Assoc 96(454):458–468

  15. Koenker R, Park BJ (1994) An interior point algorithm for nonlinear quantile regression. J Econ 71(1–2):265–283

  16. Kong E, Xia Y (2012) A single-index quantile regression model and its estimation. Econ Theory 28:730–768

  17. Kong E, Linton O, Xia Y (2013) Global Bahadur representation for nonparametric censored regression quantiles and its applications. Econ Theory 29(05):941–968

  18. Li K-C, Duan N (1989) Regression analysis under link violation. Ann Stat 17(3):1009–1052

  19. Li K-C, Wang J-L, Chen C-H (1999) Dimension reduction for censored regression data. Ann Stat 27(1):1–23

  20. Ma S, He X (2016) Inference for single-index quantile regression models with profile optimization. Ann Stat 44(3):1234–1268

  21. Mezzetti M, Giudici P (1999) Monte Carlo methods for nonparametric survival model determination. J Ital Statist Soc 1:49–60

  22. Powell JL (1984) Least absolute deviations estimation for the censored regression model. J Econ 25:303–325

  23. Powell JL (1986) Censored regression quantiles. J Econ 32:143–155

  24. Wang HJ, Wang L (2009) Locally weighted censored quantile regression. J Am Stat Assoc 104(487):1117–1128

  25. Wu TZ, Yu K, Yu Y (2010) Single index quantile regression. J Multivar Anal 101(7):1607–1621

  26. Ying Z, Jung SH, Wei LJ (1995) Survival analysis with median regression models. J Am Stat Assoc 90(429):178–184


The authors wish to thank the two referees, whose comments led to improvements in the presentation of this paper.

Author information



Corresponding author

Correspondence to Eliana Christou.


Appendix A: Notations and assumptions


  1. Define \(f(\mathbf {x})\) to be the marginal probability density function of \(\mathbf {X}\) and, for any \(\mathbf {x} \in \mathcal {X}_{0}\), denote by \(f_{C|\mathbf {X}}(\cdot |\mathbf {x})\), \(f_{Y|\mathbf {X}}(\cdot |\mathbf {x})\) and \(f_{\epsilon |\mathbf {X}}(\cdot |\mathbf {x})\) the conditional probability density functions of C, Y and \(\epsilon \) given \(\mathbf {X}=\mathbf {x}\), respectively.

  2. We say a function \(m(\cdot ): \mathbb {R}^{d} \rightarrow \mathbb {R}\) has the order of smoothness s on \(\mathcal {X}_{0}\), denoted by \(m(\cdot ) \in H_{s}(\mathcal {X}_{0})\), if it is differentiable up to order [s], where [s] is the lowest integer part of s, and there exists a constant \(L>0\), such that, for all \(\mathbf {v}=(v_{1},\ldots ,v_{d})^{\top }\) with \(|\mathbf {v}|=v_{1}+\dots +v_{d}=[s]\), all \(\tau \) in the interval \([\underline{\tau }, \overline{\tau }]\), and all \(\mathbf {x}\), \(\mathbf {x}'\) in \(\mathcal {X}_{0}\),

    $$\begin{aligned} |D^{\mathbf {v}}m(\mathbf {x})-D^{\mathbf {v}}m(\mathbf {x}')| \le L \left\| \mathbf {x}-\mathbf {x}' \right\| ^{s-[s]}, \end{aligned}$$

    where \(D^{\mathbf {v}}m(\mathbf {x})\) denotes the partial derivative \(\partial ^{|\mathbf {v}|}m(\mathbf {x})/\partial x_{1}^{v_{1}}\dots \partial x_{d}^{v_{d}}\).

  3. Define two classes of functions. Let \(\mathbb {Z}\) be a class of positive, bounded and bounded away from zero functions \(\zeta : \mathbb {R}^{d+1} \rightarrow \mathbb {R}\), whose value at \((t,\mathbf {x}) \in \mathbb {R}^{d+1}\) can be written as \(\zeta (t|\mathbf {x})\), in the non-separable space \(l^{\infty }(t,\mathbf {x})=\{\zeta : \mathbb {R}^{d+1} \rightarrow \mathbb {R}: \left\| \zeta \right\| _{(t,\mathbf {x})}:= \sup _{(t,\mathbf {x}^{\top })^{\top } \in \mathbb {R}^{d+1}} |\zeta (t|\mathbf {x})|<\infty \}\), where the quantity \(G(t|\mathbf {X})/\zeta (t|\mathbf {X})\) has bounded expectation uniformly in \(\zeta \). Thus, \(\mathbb {Z}\) includes \(G(t|\mathbf {x})\) and, according to Lemma B.1, includes \(\widehat{G}_{n}(t|\mathbf {x})\) for n large enough, almost surely. Moreover, define H to be a class of bounded functions \(\eta : \mathbb {R}^{d} \rightarrow \mathbb {R}\), whose value at \((t,\varvec{\beta }^{\top })^{\top } \in \mathbb {R}^{d}\) can be written as \(\eta (t|\varvec{\beta })\), in the non-separable space \(l^{\infty }(t,\varvec{\beta })\), and having bounded and continuous partial derivatives, where the first and second derivatives with respect to t exist and are bounded. Hence, H includes \(g(t|\varvec{\beta })\) and, according to Proposition 3.1, includes \(\widehat{g}_{CD}^{NW}(t|\varvec{\beta })\) for n large enough, almost surely.


Assumption A1

Let \(F_{\epsilon | \mathbf {X}}(\cdot |\mathbf {x})\) denote the distribution function of \(\epsilon \) given \(\mathbf {X}=\mathbf {x}\), which satisfies \(F_{\epsilon | \mathbf {X}}(0|\mathbf {x})=P(\epsilon \le 0 | \mathbf {X}=\mathbf {x})=\tau \) for \(0<\tau <1\), and has a density \(f_{\epsilon |\mathbf {X}}(t|\mathbf {x})\) which is continuous in \(\mathbf {x}\) for each t.

Assumption A2

The density \(f_{\mathbf {b}}(t)\) of \(\mathbf {b}^{\top }_{1}\mathbf {X}\) (usually called the ‘marginal density of \(\mathbf {X}\) in direction \(\mathbf {b}_{1}\)’) is uniformly bounded for \(t \in \mathfrak {T}_{\mathbf {b}}=\{t: t=\mathbf {b}^{\top }_{1}\mathbf {x}, \mathbf {x} \in \mathcal {X}_{0}\}\) and all \(\mathbf {b} \in \varTheta \), that is \(\sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} f_{\mathbf {b}}(t)< \infty \), and bounded away from 0, uniformly in \(\mathfrak {T}_{\mathbf {b}}\) and \(\varTheta \), that is \(\inf _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} f_{\mathbf {b}}(t)>0\). Moreover, the density function of \(\mathbf {b}^{\top }_{1}\mathbf {X}\) is uniformly continuous for \(\mathbf {b}\) in a neighborhood of \(\varvec{\beta }\).

Assumption A3

Let \(K(t): \mathbb {R} \rightarrow \mathbb {R}\) be a univariate, symmetric, second order kernel function, for which K(t) satisfies:

  (a) \( |K(t)| \le \overline{K}< \infty , \ \ \int _{\mathbb {R}}|K(t)|dt< \infty , \ \ \int _{\mathbb {R}} |t|^2 |K(t)|dt < \infty \), and

  (b) for some \(\varLambda _{1} < \infty \) and \(\varLambda _{2} < \infty \), either \(K(t)=0\) for \(|t|>\varLambda _{2}\) and for all \(t,t' \in \mathbb {R}\), \(|K(t)-K(t')| \le \varLambda _{1}|t-t'|\), or K(t) is differentiable, \(|(\partial / \partial t)K(t)| \le \varLambda _{1}\), and for some \(\nu >1\), \(|(\partial / \partial t)K(t)| \le \varLambda _{1}|t|^{-\nu }\) for \(|t|>\varLambda _{2}\).
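As a concrete illustration of Assumption A3, the Epanechnikov kernel is symmetric, bounded, compactly supported and Lipschitz, so it falls under the first alternative in (b) with \(\varLambda _{2}=1\) and \(\varLambda _{1}=1.5\). The paper does not prescribe a particular kernel; this choice, the grid, and all variable names below are assumptions made only for this sketch, which checks the conditions numerically rather than proving them.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2) on [-1, 1], zero elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

# Numerical sanity checks on a fine grid (illustrative, not a proof).
grid = np.linspace(-2.0, 2.0, 400001)
dt = grid[1] - grid[0]
k = epanechnikov(grid)

integral = np.sum(k) * dt                  # close to 1
first_moment = np.sum(grid * k) * dt       # close to 0 by symmetry
second_moment = np.sum(grid**2 * k) * dt   # finite, close to 1/5
sup_k = k.max()                            # the bound K-bar, here 0.75
# Largest slope between neighbouring grid points approximates the
# Lipschitz constant Lambda_1, which is 1.5 for this kernel.
lipschitz = np.max(np.abs(np.diff(k))) / dt
```

Kernels with unbounded support, such as the Gaussian, would instead be covered by the second alternative in (b), which bounds the derivative in the tails.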

Assumption A4

The first two derivatives of \(f_{\mathbf {b}}(t)\) and \(\varPsi (t|\mathbf {b})=f_{\mathbf {b}}(t)g(t|\mathbf {b})\) are uniformly continuous and are bounded uniformly in \(\mathbf {b}\).

Assumption A5

The bandwidth \(h=o(1)\) satisfies \(\log {n}/(nh)=o(1)\).

Assumption A6

The function \(Q_{\tau ,\varvec{\beta }_{1}}(Y|t)\), defined in (1.3), is smooth in a neighborhood of \(\varvec{\beta }\), for \(t \in \mathfrak {T}_{\varvec{\beta }}\), such that the first and second derivatives with respect to t exist and are bounded.

Assumption A7

Define, for \(\zeta (\cdot |\mathbf {x})\) in the class of functions \(\mathbb {Z}\), the function \( \varphi _{\zeta }(t|\mathbf {x})=E\left[ \{\varDelta / \zeta (Z|\mathbf {X})\} \rho _{\tau }(Z-t)|\mathbf {X}=\mathbf {x}\right] ,\) for which the expectation and differentiation can be interchanged, and let the first, second, and third derivatives of \(\varphi _{\zeta }(t|\mathbf {x})\) with respect to t exist, such that \(\varphi _{\zeta }(t|\mathbf {x})\), \(\varphi '_{\zeta }(t|\mathbf {x})\), \(\varphi ''_{\zeta }(t|\mathbf {x})\), as functions of \(\mathbf {x}\), are bounded and continuous in a neighborhood of \(\mathbf {x}\) for all small t, uniformly in \(\zeta \). Also, let \(\varphi '''_{\zeta }(t|\mathbf {x})\) be bounded as a function of t, \(\varphi ''_{\zeta }(t|\mathbf {x})\) be continuous as a function of t in a neighborhood of \(Q_{\tau ,\varvec{\beta }_{1}}(Y|\varvec{\beta }^{\top }_{1}\mathbf {x})\) uniformly in \(\mathbf {x}\), and \(\varphi _{\zeta }\{Q_{\tau ,\varvec{\beta }_{1}}(Y|\varvec{\beta }^{\top }_{1}\mathbf {X})|\mathbf {x}\} \ne 0\), uniformly in \(\zeta \).
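The quantity inside the expectation in Assumption A7 is a check loss reweighted by \(\varDelta / \zeta (Z|\mathbf {X})\). The Python sketch below computes its unconditional sample average, assuming the usual Koenker and Bassett (1978) check function \(\rho _{\tau }(u)=u\{\tau -I(u<0)\}\), which this excerpt uses but does not restate. The function names and the argument `zeta_at_z` (the values \(\zeta (Z_{i}|\mathbf {X}_{i})\)) are hypothetical, and the conditioning on \(\mathbf {X}=\mathbf {x}\) is deliberately ignored.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0.0))

def weighted_check_loss(t, z, delta, zeta_at_z, tau):
    """Sample average of {Delta / zeta(Z|X)} * rho_tau(Z - t)."""
    w = np.asarray(delta, dtype=float) / np.asarray(zeta_at_z, dtype=float)
    return np.mean(w * check_loss(np.asarray(z, dtype=float) - t, tau))

# Without censoring (all Delta = 1 and zeta = 1) the weighted loss reduces
# to the ordinary check loss, whose minimizer is a sample quantile.
z = np.arange(1.0, 10.0)                  # observations 1, 2, ..., 9
ones = np.ones_like(z)
losses = [weighted_check_loss(t, z, ones, ones, 0.5) for t in z]
best_t = z[int(np.argmin(losses))]        # the sample median
```

Because \(\rho _{\tau }\) has a kink at 0, the sample criterion is not differentiable, which is why the smoothness conditions are placed on the expectation \(\varphi _{\zeta }\) instead.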

Assumption B1

  (a) \(f(\cdot )\) is positive on \(\mathcal {X}_{0}\) and \(f(\cdot ) \in H_{s_{1}}(\mathcal {X}_{0})\), for some \(s_{1}>0\).

  (b) The conditional quantile function \(Q_{\tau }(Y|\mathbf {x}) \in H_{s_{2}}(\mathcal {X}_{0})\), for some \(s_{2}>0\).

  (c) \(f_{\epsilon |\mathbf {X}}(t|\mathbf {x})\), when seen as a function of \(\mathbf {x}\), belongs to \(H_{s_{3}}(\mathcal {X}_{0})\) for some \(s_{3}>0\), uniformly in t in a neighborhood of zero. Moreover, \(f_{\epsilon |\mathbf {X}}(0|\mathbf {x})\) is bounded away from zero uniformly in \(\mathbf {x} \in \mathcal {X}_{0}\), and its first order derivative with respect to t exists and is continuous in a neighborhood of zero for all \(\mathbf {x} \in \mathcal {X}_{0}\).

Assumption B2

The censoring variable C is conditionally independent of Y given \(\mathbf {X}\) and, for any \(\mathbf {x} \in \mathcal {X}_{0}\), there exists some finite \(\pi _{0}\), which might depend on \(\mathbf {x}\) and is within the support of the conditional distribution of Y given \(\mathbf {X}=\mathbf {x}\), such that \(G(\pi _{0}|\mathbf {x})=0\) and \(\inf _{\mathbf {x}} P(C=\pi _{0}|\mathbf {x})>0\).

Assumption B3

The functions \(f_{Y|\mathbf {X}}(t|\mathbf {x})\) and \(f_{C|\mathbf {X}}(t|\mathbf {x})\) as functions of \(\mathbf {x}\), both belong to \(H_{s_{4}}(\mathcal {X}_{0})\), for some \(s_{4}>0\), uniformly in t.

Assumption B4

  (a) The bandwidth \(h_{1}\) in the local Kaplan–Meier estimator in (2.7) is chosen such that \(nh_{1}^{2s_{4}+d}/\log {n} \rightarrow 0\), \(nh_{1}^{3d}/\log {n} \rightarrow 0\), and \(nh_{1}^{d+4}/\log {n}<\infty \).

  (b) The relationship between the two smoothing parameters \(h_{1}\) and \(h^{*}\) is such that \(h^{*}=o(h_{1})\) and \(nh_{1}^{2d}/(h^{*d}\log {n}) \rightarrow \infty \).

Comments on assumptions

Assumption A1 allows for possible dependence between the covariate \(\mathbf {X}\) and the error term \(\epsilon \). Assumptions A2–A5 come from the work of Hansen (2008) and ensure the uniform convergence of the density estimator \(\widehat{f}_{\mathbf {b}}(t)\) and of \(\widehat{\varPsi }(t|\mathbf {b})\); see the proof of Proposition 3.1. Assumption A6 is a common assumption for the link function, and Assumption A7 imposes smoothness conditions on \(\varphi _{\zeta }(\cdot |\mathbf {x})\), since \(\rho _{\tau }(\cdot )\) itself is not differentiable at 0. Assumption B1 is standard in local polynomial estimation for QR. Assumption B2 implies that, given \(\mathbf {X}_{i}\), there is a positive mass on the upper boundary of the support of the censoring variable, which guarantees that \(\varDelta _{i}/\widehat{G}_{n}(Z_{i}|\mathbf {X}_{i})\) is uniformly finite in large samples. Assumptions B3 and B4(a) ensure the almost sure representation of \(\widehat{G}_{n}(\cdot |\mathbf {x})\) defined in (2.7), and Assumption B4(b) ensures that the higher-order remainder term in the almost sure representation of \(\widehat{G}_{n}(\cdot |\mathbf {x})\) is negligible.
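To make the weights \(\varDelta _{i}/\widehat{G}_{n}(Z_{i}|\mathbf {X}_{i})\) more tangible, the following Python sketch computes inverse-censoring weights from a plain Kaplan–Meier estimate of the censoring survival function. This is a deliberate simplification: it ignores the covariates, whereas the paper uses the local (kernel-weighted) Kaplan–Meier estimator in (2.7), which is not reproduced in this excerpt. It reads G as the survival function of C, consistent with \(G(\pi _{0}|\mathbf {x})=0\) in Assumption B2, and assumes there are no ties among the observed times. The function names are hypothetical.

```python
import numpy as np

def km_censoring_survival(z, delta):
    """Kaplan-Meier estimate of G(t) = P(C > t), ignoring covariates.

    z     : observed times Z_i = min(Y_i, C_i), assumed to have no ties
    delta : 1 if Y_i was observed (uncensored), 0 if it was censored
    Returns sorted times, sorted indicators, and G evaluated at each time.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    order = np.argsort(z)
    z_sorted, d_sorted = z[order], delta[order]
    n = len(z_sorted)
    at_risk = n - np.arange(n)              # n, n-1, ..., 1
    censor_event = 1 - d_sorted             # an "event" for C is a censored point
    g = np.cumprod(1.0 - censor_event / at_risk)
    return z_sorted, d_sorted, g

def ipcw_weights(z, delta):
    """Weights Delta_i / G(Z_i); censored observations get weight 0."""
    z_sorted, d_sorted, g = km_censoring_survival(z, delta)
    w = np.zeros(len(z_sorted))
    uncensored = d_sorted == 1
    w[uncensored] = 1.0 / g[uncensored]
    return z_sorted, w

times, weights = ipcw_weights([3.0, 1.0, 4.0, 2.0], [1, 1, 0, 0])
```

In this toy sample the estimate drops to zero only at the largest, censored time, so every uncensored observation still receives a finite weight; Assumption B2 plays the analogous role of keeping the weights bounded in the conditional setting.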

Appendix B: Proof of main results

For the study of the asymptotic properties of the parametric component, we will consider an equivalent objective function. Observe that by adding and subtracting the quantity \(\widehat{g}_{CD}^{NW}(\varvec{\beta }^{\top }_{1}\mathbf {X}_{i}|\varvec{\beta })\) in the objective function \(\widehat{S}_{n}^{CD}(\tau ,\mathbf {b})\) in (2.12), we get

$$\begin{aligned} \widehat{S}^{CD}_{n}(\tau ,\mathbf {b})=\sum _{i=1}^{n}\frac{\varDelta _{i}}{\widehat{G}_{n}(Z_{i}|\mathbf {X}_{i})} \rho _{\tau }\left\{ Z_{i}^{*}-\widetilde{g}(\mathbf {X}_{i}|\mathbf {b},\varvec{\beta })\right\} , \end{aligned}$$

where \(Z_{i}^{*}=Z_{i}-\widehat{g}_{CD}^{NW}(\varvec{\beta }^{\top }_{1}\mathbf {X}_{i}|\varvec{\beta })\) and, for any \(\varvec{\gamma } \in \mathbb {R}^{d-1}\) such that \(\varvec{\gamma }+\varvec{\beta } \in \varTheta \), \(\widetilde{g}(\mathbf {X}_{i}|\varvec{\gamma }+\varvec{\beta },\varvec{\beta })=\widehat{g}_{CD}^{NW}\{(\varvec{\gamma }+\varvec{\beta })^{\top }_{1}\mathbf {X}_{i}|\varvec{\gamma }+\varvec{\beta }\}-\widehat{g}_{CD}^{NW}(\varvec{\beta }^{\top }_{1}\mathbf {X}_{i}|\varvec{\beta })\), where, according to the convention used, \((\varvec{\gamma }+\varvec{\beta })_{1}=(1,(\varvec{\gamma }+\varvec{\beta })^{\top })^{\top }\). For the sake of convenience in the derivation of the asymptotic results, we define the new objective function

$$\begin{aligned} A^{CD}_{n}(\tau , \varvec{\gamma })=\sum _{i=1}^{n}\frac{\varDelta _{i}}{\widehat{G}_{n}(Z_{i}|\mathbf {X}_{i})} \big [\rho _{\tau }\{Z_{i}^{*}-\widetilde{g}(\mathbf {X}_{i}|\varvec{\gamma }/\sqrt{n}+\varvec{\beta },\varvec{\beta })\}-\rho _{\tau }(Z_{i}^{*}) \big ],\qquad \text {(B.1)} \end{aligned}$$

where \(\varvec{\gamma }=\sqrt{n}(\mathbf {b}-\varvec{\beta })\).

Some lemmas

Lemma B.1

Let \(\widehat{G}_{n}(t|\mathbf {x})\) be as defined in (2.7) and suppose that Assumptions B2–B4 given in Appendix A hold and that \(G(\cdot |\mathbf {x})\) is smooth enough to have derivatives up to order \(\kappa _{1}\), for \(\kappa _{1}=[s_{4}]\). Then, with probability one,

$$\begin{aligned} \sup _{\mathbf {x} \in \mathcal {X}_{0}} \sup _{t} \left| \widehat{G}_{n}(t|\mathbf {x})-G(t|\mathbf {x}) \right| =O \left\{ \left( \frac{\log {n}}{nh_{1}^{d}} \right) ^{1/2} \right\} , \end{aligned}$$

where \(h_{1}\) is the bandwidth used in connection with (2.7).

Lemma B.1 follows from relationship (9) in Lemma 4.1 of Kong et al. (2013).

Lemma B.2

Let \(\widehat{Q}_{\tau }^{CD}(Y|\mathbf {x})\) be as defined in connection with (2.10) and suppose that Assumptions B1–B4 given in Appendix A hold, with \(s_{1}, s_{2}, s_{3}>0\). For \(k=[s_{2}]\) and bandwidth \(h^{*}\) [used in (2.9)] chosen such that

$$\begin{aligned} h^{*} \propto n^{-\kappa }, \quad \text {for some} \quad \frac{1}{2s_{2}+d} \le \kappa < \frac{1}{d}, \end{aligned}$$

we have, with probability one,

$$\begin{aligned} \sup _{1 \le i \le n} \left| \widehat{Q}_{\tau }^{CD}(Y|\mathbf {X}_{i})-Q_{\tau }(Y|\mathbf {X}_{i}) \right| =O \left\{ \left( n^{1-\kappa d}/\log {n} \right) ^{-1/2} \right\} =O(\widetilde{a}_{n}). \end{aligned}$$

Lemma B.2 follows from Theorem 4.2 of Kong et al. (2013).

Remark B.3

The uniformity part of Lemma B.2 can be extended over the whole set \(\mathcal {X}_{0}\); see Kong et al. (2013).

Lemma B.4

Let \(\widehat{g}_{CD}^{NW}(t|\mathbf {b})\) be as defined in (2.8). Assume that for some \(r>2\), \(E|Q_{\tau }(Y|\mathbf {X})|^{r} < \infty \) and \(\sup _{t \in \mathfrak {T}_{\mathbf {b}}} E\{|Q_{\tau }(Y|\mathbf {X})|^{r}|\mathbf {b}^{\top }_{1}\mathbf {X}=t\}f_{\mathbf {b}}(t) < \infty \) holds for all \(\mathbf {b} \in \varTheta \), where \(\mathfrak {T}_{\mathbf {b}}=\{t: t=\mathbf {b}^{\top }_{1}\mathbf {x}, \mathbf {x} \in \mathcal {X}_{0}\}\), \(\mathcal {X}_{0}\) is the compact support of \(\mathbf {X}\), and \(f_{\mathbf {b}}\) is the density of \(\mathbf {b}^{\top }_{1}\mathbf {X}\). Then, under Assumptions A1–A5 and B1–B4 given in Appendix A, with \(s_{1}, s_{2}, s_{3}>0\), \(k=[s_{2}]\), the bandwidth \(h^{*}\) [used in (2.9)] chosen such that

$$\begin{aligned} h^{*} \propto n^{-\kappa }, \quad \text {for some} \quad \frac{1}{2s_{2}} \le \kappa < \frac{1}{4+3d}, \end{aligned}$$

and the condition \(nh^4=o(1)\), where h is the bandwidth in (2.8), we have that for any \(\mathbf {b}\),

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{i=1}^{n} \left\{ \widehat{g}_{CD}^{NW}\left( \mathbf {b}^{\top }_{1}\mathbf {X}_{i}|\mathbf {b}\right) -g\left( \mathbf {b}^{\top }_{1}\mathbf {X}_{i}|\mathbf {b}\right) \right\} =o_{p}(1). \end{aligned}$$


Proof. See Supplementary Material. \(\square \)
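To see how Assumption A5 interacts with the extra condition \(nh^{4}=o(1)\) of Lemma B.4, consider, purely as an illustration, a polynomial bandwidth \(h=cn^{-a}\) with constants \(c>0\) and \(a>0\). This parametrization is an assumption made here and not a choice stated in the paper. Then \(h=o(1)\) holds automatically and

$$\begin{aligned} \frac{\log {n}}{nh}=\frac{\log {n}}{c\,n^{1-a}} \rightarrow 0 \iff a<1, \qquad nh^{4}=c^{4}n^{1-4a} \rightarrow 0 \iff a>\frac{1}{4}. \end{aligned}$$

Under this parametrization both conditions hold exactly when \(1/4<a<1\). In particular, a bandwidth of order \(n^{-1/5}\) would satisfy Assumption A5 but not \(nh^{4}=o(1)\), so the lemma calls for a bandwidth that shrinks faster than that rate.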

Lemma B.5

Let \(\varphi _{\zeta }(\cdot |\mathbf {x})\) be as defined in Assumption A7. Then, under the Assumptions of Lemma B.1, and Assumptions A1 and A7 given in Appendix A, we have

$$\begin{aligned} \varphi _{\zeta }(t|\mathbf {x})= & {} E \left\{ \frac{G(Y|\mathbf {X})}{\zeta (Y|\mathbf {X})}\rho _{\tau }(Y-t) \Big | \mathbf {X}=\mathbf {x} \right\} , \\ \varphi '_{\zeta }(t|\mathbf {x})= & {} -E \left\{ \frac{G(Y|\mathbf {X})}{\zeta (Y|\mathbf {X})}\rho '_{\tau }(Y-t) \Big | \mathbf {X}=\mathbf {x} \right\} , \\ \varphi '_{\widehat{G}_{n}}(t|\mathbf {x})= & {} -\Big [\tau -F_{\epsilon |\mathbf {X}}\Big \{t-g(\varvec{\beta }^{\top }_{1}\mathbf {x}|\varvec{\beta })\Big |\mathbf {x}\Big \}\Big ]+o_{p}(1), \\ \varphi ''_{\widehat{G}_{n}}(t|\mathbf {x})= & {} \Big [f_{\epsilon |\mathbf {X}}\Big \{t-g(\varvec{\beta }^{\top }_{1}\mathbf {x}|\varvec{\beta })\Big |\mathbf {x}\Big \}+o_{p}(1)\Big ]I(t<\pi _0), \end{aligned}$$

where \(\pi _0\) is defined in Assumption B2.


Proof. See Supplementary Material. \(\square \)

Lemma B.6

Let \(A_{n}^{CD}(\tau ,\varvec{\gamma })\) be as defined in (B.1) for \(\varvec{\gamma } \in \mathbb {R}^{d-1}\). Then, under the Assumptions of Proposition 3.2, we have the following quadratic approximation uniformly in \(\varvec{\gamma }\) in a compact set

$$\begin{aligned} A_{n}^{CD}\left( \tau ,\varvec{\gamma } \right) =\frac{1}{2}\varvec{\gamma }^{\top }\mathbb {V}\varvec{\gamma }+\mathbf {W}^{\top }_{n}\varvec{\gamma }+o_{p}(1), \end{aligned}$$

where \(\mathbb {V}\) is defined in (3.1) and

$$\begin{aligned} \mathbf {W}_{n}=-\frac{1}{\sqrt{n}} \sum _{i=1}^{n}\frac{\varDelta _{i}}{\widehat{G}_{n}\left( Z_{i}| \mathbf {X}_{i}\right) }\rho '_{\tau }(Z_{i}^{*})g' \left( \varvec{\beta }^{\top }_{1} \mathbf {X}_{i}|\varvec{\beta }\right) \left\{ \mathbf {X}_{i,-1}-E\left( \mathbf {X}_{-1}| \varvec{\beta }^{\top }_{1}\mathbf {X}\right) \right\} , \qquad \text {(B.2)} \end{aligned}$$

for \(Z_{i}^{*}=Z_{i}-\widehat{g}_{CD}^{NW}(\varvec{\beta }^{\top }_{1}\mathbf {X}_{i}|\varvec{\beta })\), \(g'(t|\mathbf {b})=(\partial / \partial t)g(t|\mathbf {b})\), and \(\mathbf {X}_{-1}\) the \((d-1)\)-dimensional vector after removing the first coordinate.


Proof. See Supplementary Material. \(\square \)

Lemma B.7

Let \(\mathbf {W}^{*}_{n}=-n^{-1/2}\mathbf {W}_{n}\), where \(\mathbf {W}_{n}\) is defined in (B.2). Then, under the assumptions of Lemma B.6,

$$\begin{aligned} P \left( \sqrt{n}\varSigma ^{-1}\mathbf {W}_{n}^{*} \le t | \mathbb {X} \right) =\varPhi (t)+o_{p}(1), \end{aligned}$$

where \(\varSigma \) is defined in (3.2) and \(\varPhi (t)\) denotes the standard normal cumulative distribution function.


Proof. See Supplementary Material. \(\square \)

Proof of Proposition 3.1

We need to prove that \(\widehat{g}^{NW}_{CD}(t| \mathbf {b})\) is a consistent estimator of \(g(t| \mathbf {b})\), uniformly in \(\mathbf {b} \in \varTheta \) and in \(t \in \mathfrak {T}_{\mathbf {b}}\). Let \(K_{h}(\cdot )=K(\cdot /h)\), and write \(\widehat{g}^{NW}_{CD}(t|\mathbf {b})=\widehat{\varPsi }(t| \mathbf {b})/ \widehat{f}_{\mathbf {b}}(t)\), where \(\widehat{\varPsi }(t| \mathbf {b})=(nh)^{-1}\sum _{i=1}^{n}\widehat{Q}^{CD}_{\tau }(Y|\mathbf {X}_{i})K_{h} \left( t-\mathbf {b}^{\top }_{1}\mathbf {X}_{i} \right) \) and \(\widehat{f}_{\mathbf {b}}(t)=(nh)^{-1}\sum _{i=1}^{n}K_{h} \left( t-\mathbf {b}^{\top }_{1}\mathbf {X}_{i} \right) \). To prove the uniform consistency, we consider the numerator and the denominator separately.

For the denominator, we use Theorem 6 of Hansen (2008) [take his \(\beta =\infty \) and the mixing coefficients as \(\alpha _{m}=0\)] to obtain

$$\begin{aligned} \sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \left| \widehat{f}_{\mathbf {b}}(t)-f_{\mathbf {b}}(t) \right| =O_{p} \left\{ \left( \frac{\log {n}}{nh} \right) ^{1/2}+h^{2} \right\} =O_{p}\left( a_{n}+h^{2}\right) . \qquad \text {(B.3)} \end{aligned}$$

Next, we will show that \(\widehat{\varPsi }(t| \mathbf {b})\) is a consistent estimator of \(\varPsi (t| \mathbf {b})=g(t|\mathbf {b})f_{\mathbf {b}}(t)\), uniformly in \(\mathbf {b} \in \varTheta \) and \(t \in \mathfrak {T}_{\mathbf {b}}\), and determine its rate of convergence. Let

$$\begin{aligned} \varPsi ^{*}(t| \mathbf {b})=\frac{1}{nh}\sum _{i=1}^{n} Q_{\tau }(Y|\mathbf {X}_{i})K_{h} \left( t-\mathbf {b}^{\top }_{1}\mathbf {X}_{i} \right) \end{aligned}$$

and note that

$$\begin{aligned} |\widehat{\varPsi }(t| \mathbf {b})-\varPsi ^{*}(t| \mathbf {b})|= & {} \left| \frac{1}{nh}\sum _{i=1}^{n} \left\{ \widehat{Q}^{CD}_{\tau }(Y|\mathbf {X}_{i})-Q_{\tau }(Y|\mathbf {X}_{i})\right\} K_{h} \left( t-\mathbf {b}^{\top }_{1}\mathbf {X}_{i} \right) \right| \\\le & {} \sup _{1 \le i \le n} \left| \widehat{Q}^{CD}_{\tau }(Y|\mathbf {X}_{i})-Q_{\tau }(Y|\mathbf {X}_{i}) \right| \frac{1}{nh}\sum _{i=1}^{n} K_{h} \left( t-\mathbf {b}^{\top }_{1}\mathbf {X}_{i} \right) \\= & {} O_{p}\left\{ \left( n^{1-\kappa d}/\log {n} \right) ^{-1/2} \right\} \widehat{f}_{\mathbf {b}}(t), \end{aligned}$$

where the last equality follows from Lemma B.2. Therefore, by (B.3) and Assumption A2,

$$\begin{aligned} \sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \left| \widehat{\varPsi }(t| \mathbf {b})-\varPsi ^{*}(t| \mathbf {b}) \right|\le & {} O_{p}\left( \widetilde{a}_{n} \right) \sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \widehat{f}_{\mathbf {b}}(t) \\= & {} O_{p}(\widetilde{a}_{n}) \left\{ \sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} f_{\mathbf {b}}(t)+O_{p} \left( a_{n}+h^{2} \right) \right\} \\= & {} O_{p}\left( \widetilde{a}_{n} \right) . \qquad \text {(B.4)} \end{aligned}$$

Next, Theorem 2 of Hansen (2008) yields \(\sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \left| \varPsi ^{*}(t| \mathbf {b})-E\{\varPsi ^{*}(t| \mathbf {b})\} \right| =O_{p} \left( a_{n} \right) \), where, recalling the notation \(\varPsi (t| \mathbf {b})=g(t|\mathbf {b})f_{\mathbf {b}}(t)\), and using Assumption A4, \(E\{\varPsi ^{*}(t| \mathbf {b})\}=\varPsi (t|\mathbf {b})+O(h^{2})\). Thus, \(\sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \left| \varPsi ^{*}(t|\mathbf {b})-\varPsi (t|\mathbf {b}) \right| =O_{p}(a_{n}+h^{2})\) which, together with (B.4) yields

$$\begin{aligned} \sup _{\mathbf {b} \in \varTheta , t \in \mathfrak {T}_{\mathbf {b}}} \left| \widehat{\varPsi }(t| \mathbf {b})-\varPsi (t| \mathbf {b}) \right| =O_{p}\left( \widetilde{a}_{n}+a_{n}+h^{2} \right) . \qquad \text {(B.5)} \end{aligned}$$

Therefore, using (B.3), (B.5), and Assumption A2, we get

$$\begin{aligned} \left| \frac{\widehat{\varPsi }(t|\mathbf {b})}{\widehat{f}_{\mathbf {b}}(t)}-g(t|\mathbf {b}) \right|\le & {} \left| \frac{\widehat{\varPsi }(t|\mathbf {b})}{\widehat{f}_{\mathbf {b}}(t)}-\frac{\widehat{\varPsi }(t|\mathbf {b})}{f_{\mathbf {b}}(t)} \right| +\left| \frac{\widehat{\varPsi }(t|\mathbf {b})}{f_{\mathbf {b}}(t)}-g(t|\mathbf {b}) \right| \\= & {} \left| \frac{\widehat{\varPsi }(t|\mathbf {b})}{\widehat{f}_{\mathbf {b}}(t)} \right| \left| \frac{\widehat{f}_{\mathbf {b}}(t)-f_{\mathbf {b}}(t)}{f_{\mathbf {b}}(t)} \right| + \left| \frac{\widehat{\varPsi }(t| \mathbf {b})-\varPsi (t| \mathbf {b})}{f_{\mathbf {b}}(t)} \right| \\= & {} O_{p} \left( a_{n}+h^{2} \right) +O_{p}\left( \widetilde{a}_{n}+a_{n}+h^{2} \right) \\= & {} O_{p}\left( \widetilde{a}_{n}+a_{n}+h^{2} \right) \end{aligned}$$

uniformly in \(\mathbf {b} \in \varTheta \) and \(t \in \mathfrak {T}_{\mathbf {b}}\).
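The ratio form \(\widehat{\varPsi }(t|\mathbf {b})/\widehat{f}_{\mathbf {b}}(t)\) used in this proof can be sketched in a few lines of Python. The sketch takes the index values \(\mathbf {b}^{\top }_{1}\mathbf {X}_{i}\) and the first-step quantile estimates \(\widehat{Q}^{CD}_{\tau }(Y|\mathbf {X}_{i})\) as given inputs; the Epanechnikov kernel, the function names, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, used here only as an example of K."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nw_index_estimate(t, index, q_hat, h, kernel=epanechnikov):
    """Return (Psi_hat / f_hat, f_hat, Psi_hat) at the point t.

    index : values b_1' X_i of the single index
    q_hat : first-step estimates of Q_tau(Y | X_i)
    h     : bandwidth, with K_h(u) = K(u / h) as in the proof
    """
    w = kernel((t - np.asarray(index, dtype=float)) / h)
    n = len(w)
    f_hat = w.sum() / (n * h)                                  # denominator
    psi_hat = (w * np.asarray(q_hat, dtype=float)).sum() / (n * h)  # numerator
    return psi_hat / f_hat, f_hat, psi_hat

# Toy check: an equally spaced index on [-1, 1] and a linear target 2 + 3t.
index = np.linspace(-1.0, 1.0, 201)
q_hat = 2.0 + 3.0 * index
g_at_zero, f_at_zero, _ = nw_index_estimate(0.0, index, q_hat, h=0.3)
```

At an interior point of a symmetric design the weighted average of a linear target recovers the target exactly, and the denominator is close to the design density 0.5. Dividing by \(\widehat{f}_{\mathbf {b}}(t)\) is only safe where the density is bounded away from zero, which is what Assumption A2 provides.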

Proof of Proposition 3.2

To prove the \(\sqrt{n}\)-consistency of \(\widehat{\varvec{\beta }}\), it is enough to show that for any given \(\delta >0\), there exists a constant C such that

$$\begin{aligned} P \left\{ \inf _{\left\| \varvec{\gamma } \right\| \ge C} A^{CD}_{n}\left( \tau ,\varvec{\gamma }\right) >A^{CD}_{n}(\tau ,\mathbf {0}) \right\} \ge 1-\delta , \qquad \text {(B.6)} \end{aligned}$$

where \(A^{CD}_{n}\left( \tau ,\varvec{\gamma }\right) \) is defined in (B.1). This implies that, with probability at least \(1-\delta \), there exists a local minimum in the ball \(\{\varvec{\beta }+\varvec{\gamma }/\sqrt{n}: \left\| \varvec{\gamma } \right\| \le C\}\). The quadratic approximation derived in Lemma B.6 yields

$$\begin{aligned} A^{CD}_{n}(\tau ,\varvec{\gamma })-A^{CD}_{n}(\tau ,\mathbf {0})=\frac{1}{2}\varvec{\gamma }^{\top }\mathbb {V}\varvec{\gamma }+\mathbf {W}^{\top }_{n}\varvec{\gamma }+o_{p}(1), \qquad \text {(B.7)} \end{aligned}$$

where \(\mathbb {V}\) and \(\mathbf {W}_{n}\) are defined in (3.1) and (B.2) respectively, for any \(\varvec{\gamma }\) in a compact subset of \(\mathbb {R}^{d-1}\). Therefore, the difference (B.7) is dominated by the quadratic term \((1/2)\varvec{\gamma }^{\top }\mathbb {V}\varvec{\gamma }\) for \(\left\| \varvec{\gamma } \right\| \) greater than or equal to a sufficiently large C. Hence, (B.6) follows.

Proof of Theorem 3.3

The proof follows from the Quadratic Approximation Lemma (Hjort and Pollard 1993). Consider the scaled difference \(\varvec{\gamma }=\sqrt{n}(\mathbf {b}-\varvec{\beta })\) and let \(\widehat{\varvec{\gamma }}\) be the minimizer of the objective function \(A^{CD}_{n}\left( \tau ,\varvec{\gamma } \right) \) defined in (B.1). From the \(\sqrt{n}\)-consistency of \(\widehat{\varvec{\beta }}\), the quadratic approximation derived in Lemma B.6 holds uniformly in \(\varvec{\gamma }\) in a compact set. Using the convexity assumption, the minimizer \(\widehat{\varvec{\gamma }}\) of \(A^{CD}_{n}\left( \tau ,\varvec{\gamma } \right) \) is only \(o_{p}(1)\) away from the minimizer \(\widehat{\varvec{\gamma }}^{*}=-\mathbb {V}^{-1}\mathbf {W}_{n}\) of the quadratic approximation. Thus, \(\widehat{\varvec{\gamma }}-\widehat{\varvec{\gamma }}^{*}=o_{p}(1)\) and therefore,

$$\begin{aligned} \sqrt{n}\left( \widehat{\varvec{\beta }}-\varvec{\beta }-S_{n}\right) =o_{p}(1), \qquad \text {(B.8)} \end{aligned}$$

where

$$\begin{aligned} S_{n}=-\frac{1}{\sqrt{n}}\mathbb {V}^{-1}\mathbf {W}_{n}=\mathbb {V}^{-1}\mathbf {W}_{n}^{*}. \end{aligned}$$

The asymptotic normality of \(\mathbf {W}_{n}^{*}\) was derived in Lemma B.7 and it follows that

$$\begin{aligned} \sqrt{n}S_{n}=\sqrt{n}\mathbb {V}^{-1}\mathbf {W}_{n}^{*} {\mathop {\rightarrow }\limits ^{d}}N\left( 0,\mathbb {V}^{-1}\varSigma \mathbb {V}^{-1}\right) , \qquad \text {(B.9)} \end{aligned}$$

where \(\varSigma \) is defined in (3.2). Therefore, using (B.8) and (B.9) we get,

$$\begin{aligned} \sqrt{n}\left( \widehat{\varvec{\beta }}-\varvec{\beta } \right) {\mathop {\rightarrow }\limits ^{d}}N \left( 0, \mathbb {V}^{-1}\varSigma \mathbb {V}^{-1} \right) . \end{aligned}$$
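Two algebraic steps in this proof are easy to check numerically: the quadratic \(\frac{1}{2}\varvec{\gamma }^{\top }\mathbb {V}\varvec{\gamma }+\mathbf {W}^{\top }_{n}\varvec{\gamma }\) is minimized at \(-\mathbb {V}^{-1}\mathbf {W}_{n}\), and the limiting covariance has the sandwich form \(\mathbb {V}^{-1}\varSigma \mathbb {V}^{-1}\). The Python sketch below uses small made-up matrices standing in for \(\mathbb {V}\), \(\varSigma \) and \(\mathbf {W}_{n}\); their values are arbitrary assumptions, not estimates from the paper, and the function names are hypothetical.

```python
import numpy as np

def quadratic_objective(gamma, V, W):
    """Value of 0.5 * gamma' V gamma + W' gamma."""
    return 0.5 * gamma @ V @ gamma + W @ gamma

def quadratic_minimizer(V, W):
    """Minimizer -V^{-1} W, valid for symmetric positive definite V."""
    return -np.linalg.solve(V, W)

def sandwich_covariance(V, Sigma):
    """Sandwich form V^{-1} Sigma V^{-1}."""
    V_inv = np.linalg.inv(V)
    return V_inv @ Sigma @ V_inv

# Arbitrary illustrative inputs (symmetric positive definite V and Sigma).
V = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma = np.array([[1.0, 0.2], [0.2, 0.5]])
W = np.array([1.0, -1.0])

gamma_star = quadratic_minimizer(V, W)
gradient_at_min = V @ gamma_star + W          # should vanish at the minimizer
cov = sandwich_covariance(V, Sigma)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse for the minimizer; the explicit inverse is kept in the covariance only to mirror the formula in the text.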

Supplementary material

Appendix C: Proofs of Lemmas B.4, B.5, B.6, and B.7 are given in the online Supplementary Material file.

Cite this article

Christou, E., Akritas, M.G. Single index quantile regression for censored data. Stat Methods Appl 28, 655–678 (2019).



  • Censored data
  • Dimension reduction
  • Index model
  • Quantile regression