Central quantile subspace

Abstract

Quantile regression (QR) is becoming increasingly popular due to its relevance in many scientific investigations. There is a great amount of work on linear and nonlinear QR models. In particular, nonparametric estimation of conditional quantiles has received special attention due to its model flexibility. However, nonparametric QR techniques are limited in the number of covariates they can handle. Dimension reduction offers a solution to this problem by allowing low-dimensional smoothing without specifying any parametric or nonparametric regression relation. Existing dimension reduction techniques focus on the entire conditional distribution. We, on the other hand, turn our attention to dimension reduction techniques for conditional quantiles and introduce a new method for reducing the dimension of the predictor \(\mathbf {X}\). The novelty of this paper is threefold. We start by considering a single-index quantile regression model, which assumes that the conditional quantile depends on \(\mathbf {X}\) through a single linear combination of the predictors; we then extend to a multi-index quantile regression model; and finally, we generalize the proposed methodology to any statistical functional of the conditional distribution. The performance of the methodology is demonstrated through simulation examples and real data applications. Our results suggest that the method has good finite-sample performance and often outperforms the existing methods.

References

  1. Alkenani, A., Yu, K.: Penalized single-index quantile regression. Int. J. Stat. Probab. 2(3), 12–30 (2013)

  2. Breiman, L., Friedman, J.H.: Estimating optimal transformations for multiple regression and correlation. J. Am. Stat. Assoc. 80(391), 580–598 (1985)

  3. Brillinger, D.R.: A generalized linear model with ‘Gaussian’ regressor variables. In: Bickel, P.J., Doksum, K.A., Hodges, J.L. (eds.) A Festschrift for Erich L. Lehmann. CRC Press, Wadsworth, Belmont, CA (1983)

  4. Bura, E., Cook, R.D.: Extending sliced inverse regression: the weighted chi-squared test. J. Am. Stat. Assoc. 96(455), 996–1003 (2001)

  5. Chaudhuri, P.: Nonparametric estimates of regression quantiles and their local Bahadur representation. Ann. Stat. 19(2), 760–777 (1991)

  6. Chaudhuri, P., Doksum, K., Samarov, A.: On average derivative quantile regression. Ann. Stat. 25, 715–744 (1997)

  7. Chiaromonte, F., Cook, R.D., Li, B.: Sufficient dimension reduction in regressions with categorical predictors. Ann. Stat. 30, 475–497 (2002)

  8. Christou, E.: Robust dimension reduction using sliced inverse median regression. Stat. Pap. (2018). https://doi.org/10.1007/s00362-018-1007-z

  9. Christou, E., Akritas, M.G.: Single index quantile regression for heteroscedastic data. J. Multivar. Anal. 150, 169–182 (2016)

  10. Christou, E., Akritas, M.G.: Variable selection in heteroscedastic single index quantile regression. Commun. Stat. Theory Methods 47, 6019–6033 (2018)

  11. Christou, E., Grabchak, M.: Estimation of value-at-risk using single index quantile regression. J. Appl. Stat. 46(13), 2418–2433 (2019)

  12. Cont, R.: Empirical properties of asset returns: stylized facts and statistical issues. Quant. Finance 1(2), 223–236 (2001)

  13. Cook, R.D.: Regression Graphics: Ideas for Studying Regressions Through Graphics. Wiley, New York (1998)

  14. Cook, R.D., Li, B.: Dimension reduction for conditional mean in regression. Ann. Stat. 30(2), 455–474 (2002)

  15. Cook, R.D., Nachtsheim, C.J.: Reweighting to achieve elliptically contoured covariates in regression. J. Am. Stat. Assoc. 89(426), 592–599 (1994)

  16. Cook, R.D., Weisberg, S.: Comment on “Sliced inverse regression for dimension reduction” by K.-C. Li. J. Am. Stat. Assoc. 86, 328–332 (1991)

  17. Diaconis, P., Freedman, D.: Asymptotics of graphical projection pursuit. Ann. Stat. 12, 793–815 (1984)

  18. Dong, Y., Li, B.: Dimension reduction for non-elliptically distributed predictors: second-order methods. Biometrika 97, 279–294 (2010)

  19. Fan, Y., Härdle, W.K., Wang, W., Zhu, L.: Single-index-based CoVaR with very high-dimensional covariates. J. Bus. Econ. Stat. 36(2), 212–226 (2018)

  20. De Gooijer, J.G., Zerom, D.: On additive conditional quantiles with high-dimensional covariates. J. Am. Stat. Assoc. 98(461), 135–146 (2003)

  21. Grocer, S.: Beware the risks of the bitcoin: winklevii outline the downside. Wall Street J. (2013). https://blogs.wsj.com/moneybeat/2013/07/02/beware-the-risks-of-the-bitcoin-winklevii-outline-the-downside/

  22. Guerre, E., Sabbah, C.: Uniform bias study and Bahadur representation for local polynomial estimators of the conditional quantile function. Econom. Theory 28(01), 87–129 (2012)

  23. Harrison, D., Rubinfeld, D.L.: Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)

  24. Hristache, M., Juditsky, A., Polzehl, J., Spokoiny, V.: Structure adaptive approach for dimension reduction. Ann. Stat. 29(6), 1537–1566 (2001)

  25. Jiang, R., Zhou, Z.-G., Qian, W.-M., Chen, Y.: Two step composite quantile regression for single-index models. Comput. Stat. Data Anal. 64, 180–191 (2013)

  26. Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46, 33–50 (1978)

  27. Kong, E., Xia, Y.: A single-index quantile regression model and its estimation. Econom. Theory 28, 730–768 (2012)

  28. Kong, E., Xia, Y.: An adaptive composite quantile approach to dimension reduction. Ann. Stat. 42(4), 1657–1688 (2014)

  29. Kong, E., Linton, O., Xia, Y.: Uniform Bahadur representation for local polynomial estimates of M-regression and its application to the additive model. Econom. Theory 26, 1529–1564 (2010)

  30. Li, K.-C.: Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86(414), 316–327 (1991)

  31. Li, K.-C.: On Principal Hessian directions for data visualization and dimension reduction: another application of Stein’s Lemma. J. Am. Stat. Assoc. 87(420), 1025–1039 (1992)

  32. Li, B., Dong, Y.: Dimension reduction for nonelliptically distributed predictors. Ann. Stat. 37, 1272–1298 (2009)

  33. Li, K.-C., Duan, N.: Regression analysis under link violation. Ann. Stat. 17(3), 1009–1052 (1989)

  34. Li, B., Wang, S.: On directional regression for dimension reduction. J. Am. Stat. Assoc. 102(479), 997–1008 (2007)

  35. Li, B., Zha, H., Chiaromonte, F.: Contour regression: a general approach to dimension reduction. Ann. Stat. 33(4), 1580–1616 (2005)

  36. Luo, W., Li, B., Yin, X.: On efficient dimension reduction with respect to a statistical functional of interest. Ann. Stat. 42(1), 382–412 (2014)

  37. Ma, Y., Zhu, L.: A semiparametric approach to dimension reduction. J. Am. Stat. Assoc. 107(497), 168–179 (2012)

  38. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). Available online. https://bitcoin.org/bitcoin.pdf

  39. Pollard, D.: Asymptotics for least absolute deviation regression estimators. Econom. Theory 7(2), 186–199 (1991)

  40. Shin, S.J., Artemiou, A.: Penalized principal logistic regression for sparse sufficient dimension reduction. Comput. Stat. Data Anal. 111, 48–58 (2017)

  41. Wang, H., Xia, Y.: Sliced regression for dimension reduction. J. Am. Stat. Assoc. 103, 811–821 (2008)

  42. Wu, T.Z., Yu, K., Yu, Y.: Single index quantile regression. J. Multivar. Anal. 101(7), 1607–1621 (2010)

  43. Xia, Y., Tong, H., Li, W.K., Zhu, L.-X.: An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 64, 363–410 (2002)

  44. Ye, Z., Weiss, R.E.: Using the bootstrap to select one of a new class of dimension reduction methods. J. Am. Stat. Assoc. 98(464), 968–979 (2003)

  45. Yin, X., Cook, R.D.: Dimension reduction for the conditional \(k\)th moment in regression. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 64, 159–175 (2002)

  46. Yin, X., Li, B.: Sufficient dimension reduction based on an ensemble of minimum average variance estimators. Ann. Stat. 39, 3392–3416 (2011)

  47. Yu, K., Jones, M.C.: Local linear quantile regression. J. Am. Stat. Assoc. 93(441), 228–238 (1998)

  48. Yu, K., Lu, Z.: Local linear additive quantile regression. Scand. J. Stat. 31, 333–346 (2004)

  49. Zhang, L.-M., Zhu, L.-P., Zhu, L.-X.: Sufficient dimension reduction in regressions through cumulative Hessian directions. Stat. Comput. 21(3), 325–334 (2011)

  50. Zhu, L.-P., Zhu, L.-X.: Dimension reduction for conditional variance in regressions. Stat. Sin. 19, 869–883 (2009)

  51. Zhu, L.-P., Zhu, L.-X., Feng, Z.-H.: Dimension reduction in regression through cumulative slicing estimation. J. Am. Stat. Assoc. 105(492), 1455–1466 (2010)

  52. Zhu, X., Guo, X., Zhu, L.: An adaptive-to-model test for partially parametric single-index models. Stat. Comput. 27(5), 1193–1204 (2017)

Acknowledgements

We would like to thank Professors Michael Akritas and Bing Li from the Pennsylvania State University for useful discussions regarding this paper. We would also like to thank Mr. Mark Hamrick for help in running some of the simulations, and the two anonymous referees, whose comments led to improvements in the presentation of this paper.

Author information

Correspondence to Eliana Christou.

Appendices

Appendix A: notation and assumptions

Notation We say that a function \(m(\cdot ): \mathbb {R}^{p} \rightarrow \mathbb {R}\) has the order of smoothness s on the support \(\mathcal {X}_{0}\), denoted by \(m(\cdot ) \in H_{s}(\mathcal {X}_{0})\), if (a) it is differentiable up to order [s], where [s] denotes the integer part of s (the largest integer not exceeding s), and (b) there exists a constant \(L>0\) such that, for all multi-indices \(\mathbf {u}=(u_{1}, \ldots , u_{p})^\top \) of nonnegative integers with \(|\mathbf {u}|=u_{1}+ \cdots +u_{p}=[s]\), all \(\tau \) in an interval \([\underline{\tau }, \overline{\tau }]\), where \(0<\underline{\tau } \le \overline{\tau } <1\), and all \(\mathbf {x}\), \(\mathbf {x}'\) in \(\mathcal {X}_{0}\),

$$\begin{aligned} |D^{\mathbf {u}}m(\mathbf {x})-D^{\mathbf {u}}m(\mathbf {x'})| \le L \left\| \mathbf {x}-\mathbf {x}' \right\| ^{s-[s]}, \end{aligned}$$

where \(D^{\mathbf {u}}m(\mathbf {x})\) denotes the partial derivative

$$\begin{aligned} \partial ^{|\mathbf {u}|}m(\mathbf {x})/\partial x_{1}^{u_{1}} \ldots x_{p}^{u_{p}} \end{aligned}$$

and \(\left\| \cdot \right\| \) denotes the Euclidean norm.
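For concreteness (an illustration we add here, not part of the original notation), take \(p=1\) and \(s \in (1,2]\): then \([s]=1\), and membership in \(H_{s}(\mathcal {X}_{0})\) amounts to a Hölder condition on the first derivative,

$$\begin{aligned} |m'(x)-m'(x')| \le L |x-x'|^{s-1} \quad \text {for all } x, x' \in \mathcal {X}_{0}. \end{aligned}$$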

Assumptions

  1. A1

    The following moment conditions are satisfied

    $$\begin{aligned} E \left\| \mathbf {X}\mathbf {X}^{\top } \right\|< \infty , \ \ E |Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})|^2< & {} \infty ,\\ E\left\{ Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})^2 \left\| \mathbf {X}\mathbf {X}^{\top } \right\| \right\}< & {} \infty , \end{aligned}$$

    for a given \(\tau \in (0,1)\).

  2. A2

    The distribution of \(\mathbf {A}^{\top }\mathbf {X}\) has a probability density function \(f_{\mathbf {A}}(\cdot )\) with respect to the Lebesgue measure, which is strictly positive and continuously differentiable over the support \(\mathcal {X}_{0}\) of \(\mathbf {X}\).

  3. A3

    The cumulative distribution function \(F_{Y| \mathbf {A}}(\cdot |\cdot )\) of Y given \(\mathbf {A}^{\top }\mathbf {X}\) has a continuous probability density function \(f_{Y|\mathbf {A}}(y|\mathbf {A}^{\top }\mathbf {x})\) with respect to the Lebesgue measure, which is strictly positive for y in \(\mathbb {R}\) and \(\mathbf {A}^{\top }\mathbf {x}\), for \(\mathbf {x}\) in \(\mathcal {X}_{0}\). The partial derivative \(\partial F_{Y| \mathbf {A}}(y| \mathbf {A}^{\top }\mathbf {x})/ \partial \mathbf {A}^{\top }\mathbf {x}\) is continuous. There exists \(L_{0}>0\) such that

    $$\begin{aligned}&|f_{Y|\mathbf {A}}(y|\mathbf {A}^{\top }\mathbf {x})-f_{Y|\mathbf {A}}(y'|\mathbf {A}^{\top }\mathbf {x}')|\\&\quad \le L_{0} \left\| (\mathbf {A}^{\top }\mathbf {x},y)-(\mathbf {A}^{\top }\mathbf {x}',y') \right\| \ \end{aligned}$$

    for all \((\mathbf {x},y), (\mathbf {x}',y')\) in \(\mathcal {X}_{0} \times \mathbb {R}\).

  4. A4

    The nonnegative kernel function \(K(\cdot )\), used in (5), is Lipschitz over \(\mathbb {R}^{d}\), \(d \ge 1\), and satisfies \(\int K(\mathbf {z})d \mathbf {z}=1\). For some \(\underline{K}>0\), \(K(\mathbf {z}) \ge \underline{K} I\{\mathbf {z} \in B(0,1)\}\), where B(0, 1) is the closed unit ball. The associated bandwidth h, used in the estimation procedure, is in \([\underline{h},\overline{h}]\) with \(0< \underline{h} \le \overline{h} < \infty \), \(\lim _{n \rightarrow \infty } \overline{h}=0\) and \(\lim _{n \rightarrow \infty } (\ln {n})/(n \underline{h}^{d})=0\).

  5. A5

    \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) is in \(H_{s_{\tau }}(\mathcal {T}_{\mathbf {A}})\) for some \(s_{\tau }\) with \([s_{\tau }] \le 1\), where \(\mathcal {T}_{\mathbf {A}}=\{\mathbf {z} \in \mathbb {R}^{d}: \mathbf {z}=\mathbf {A}^{\top }\mathbf {x}, \mathbf {x} \in \mathcal {X}_{0}\}\), and \(\mathcal {X}_{0}\) is the support of \(\mathbf {X}\).
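To make Assumptions A4–A5 and the local linear estimator in (5) concrete, the following is a minimal Python sketch (ours, not the author's code) of a local linear \(\tau \)th conditional quantile fit at a point. The function names (`kernel`, `local_linear_quantile`) and the linear-programming route are our own illustrative choices; (5) itself is the standard weighted check-loss minimization.

```python
import numpy as np
from scipy.optimize import linprog

def kernel(z):
    """Epanechnikov-type weight with support ||z|| <= 2: nonnegative,
    Lipschitz, and strictly positive on the closed unit ball, in the
    spirit of A4.  The normalizing constant is omitted, since rescaling
    the weights does not change the minimizer below."""
    z = np.atleast_2d(z)
    return np.maximum(1.0 - np.sum(z**2, axis=1) / 4.0, 0.0)

def local_linear_quantile(Z, Y, z0, tau, h):
    """Local linear tau-th conditional quantile estimate at z0.

    Minimizes sum_i w_i * rho_tau(Y_i - a - b'(Z_i - z0)) over (a, b),
    with w_i = K((Z_i - z0) / h), cast as a linear program through
    rho_tau(r) = tau * u + (1 - tau) * v, where r = u - v, u, v >= 0.
    """
    n, d = Z.shape
    D = Z - z0                                   # centered predictors
    w = kernel(D / h)
    # Decision vector: [a, b (d entries), u (n entries), v (n entries)]
    c = np.concatenate([np.zeros(1 + d), tau * w, (1.0 - tau) * w])
    A_eq = np.hstack([np.ones((n, 1)), D, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * (1 + d) + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=Y, bounds=bounds, method="highs")
    return res.x[0]                              # intercept = Q_hat at z0

# Toy usage on simulated single-index data (d = 1):
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
Y = np.sin(X @ beta) + 0.2 * rng.normal(size=200)
Z = (X @ beta).reshape(-1, 1)                    # plays the role of A'X
q_hat = local_linear_quantile(Z, Y, Z[0], tau=0.5, h=0.5)
```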

Appendix B: Proof of main results

Appendix B.1: Some lemmas

Lemma 1

Under Assumptions A2–A5 given in “Appendix A”, and assuming that \(\widehat{\mathbf {A}}\) is a \(\sqrt{n}\)-consistent estimate of the directions of the CS, we have

$$\begin{aligned} \sup _{\mathbf {x} \in \mathcal {X}_{0}} |\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})-Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})|=O_{p}(1), \end{aligned}$$

where \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) denotes the local linear conditional quantile estimate of \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\), given in (5).

Proof

Observe that

$$\begin{aligned}&\sup _{\mathbf {x} \in \mathcal {X}_{0}} |\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})-Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})|\\&\quad \le \sup _{\mathbf {x} \in \mathcal {X}_{0}} |\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})-\widehat{Q}_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})|\\&\qquad +\sup _{\mathbf {x} \in \mathcal {X}_{0}} |\widehat{Q}_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})-Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})|\\&\quad =O_{p}(1). \end{aligned}$$

The first term follows from the Bahadur representation of \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})-\widehat{Q}_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) (see Guerre and Sabbah 2012) and the \(\sqrt{n}\)-consistency of \(\widehat{\mathbf {A}}\). The second term follows from Corollary 1 (ii) of Guerre and Sabbah (2012). \(\square \)

Note For the study of the asymptotic properties of \(\widehat{\varvec{\beta }}_{\tau }\), defined in (4), we consider an equivalent objective function. Observe that minimizing \(\sum _{i=1}^{n}\{\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})-a_{\tau }-\mathbf {b}_{\tau }^{\top }\mathbf {X}_{i}\}^2\) with respect to \((a_{\tau },\mathbf {b}_{\tau })\) is equivalent to minimizing

$$\begin{aligned} \widehat{S}_{n}(a_{\tau },\mathbf {b}_{\tau })= & {} \frac{1}{2}\sum _{i=1}^{n}\{\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})-a_{\tau }-\mathbf {b}_{\tau }^{\top }\mathbf {X}_{i}\}^2 \nonumber \\&-\frac{1}{2}\sum _{i=1}^{n}\{\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})\}^2 \end{aligned}$$
(10)

with respect to \((a_{\tau },\mathbf {b}_{\tau })\). By expanding the square, (10) can be written as

$$\begin{aligned} \widehat{S}_{n}(a_{\tau },\mathbf {b}_{\tau })= & {} -(a_{\tau },\mathbf {b}^{\top }_{\tau })^{\top }\sum _{i=1}^{n}\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})(1,\mathbf {X}_{i})\nonumber \\&+\frac{1}{2}(a_{\tau },\mathbf {b}^{\top }_{\tau })^{\top }\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }(a_{\tau },\mathbf {b}_{\tau }).\nonumber \\ \end{aligned}$$
(11)
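At the sample level, the minimizer of (10) is an ordinary least-squares fit of the estimated conditional quantiles on \((1,\mathbf {X}_{i})\). The following is a minimal numpy sketch of this step (our illustration, not the author's code), assuming the fitted values \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})\) are already stored in a vector `q_hat`, for instance from the local linear fit sketched after the assumptions in “Appendix A”:

```python
import numpy as np

def beta_tau_ols(X, q_hat):
    """Least-squares minimizer (a_tau, b_tau) of (10): regress the
    fitted conditional quantiles on (1, X_i).  The slope part is the
    estimate beta_hat_tau of (4)."""
    design = np.hstack([np.ones((X.shape[0], 1)), X])  # rows (1, X_i')
    coef, *_ = np.linalg.lstsq(design, q_hat, rcond=None)
    return coef[0], coef[1:]                           # (a_hat, b_hat)
```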

Lemma 2

Let \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))\) be as defined in (11), where \(\varvec{\gamma }_{\tau }=\sqrt{n} \{(a_{\tau },\mathbf {b}_{\tau })-(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau })\}\) and \((\alpha ^*_{\tau }, \varvec{\beta }_{\tau }^*)\) is defined in (3). Then, under the assumptions of Lemma 1 and additionally Assumption A1 of “Appendix A”, we have the following quadratic approximation, uniformly in \(\varvec{\gamma }_{\tau }\) in a compact set,

$$\begin{aligned} \widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))= \frac{1}{2}\varvec{\gamma }_{\tau }^{\top }\mathbb {V}\varvec{\gamma }_{\tau }+\mathbf {W}_{\tau ,n}^{\top }\varvec{\gamma }_{\tau }+C_{\tau ,n}+o_{p}(1), \end{aligned}$$

where \(\mathbb {V}=E\{(1,\mathbf {X})(1,\mathbf {X}^{\top })^{\top }\}\),

$$\begin{aligned} \mathbf {W}_{\tau ,n}=-\frac{1}{\sqrt{n}}\sum _{i=1}^{n} \widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})(1,\mathbf {X}_{i}), \end{aligned}$$
(12)

and

$$\begin{aligned} C_{\tau ,n}= & {} -\sum _{i=1}^{n}\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }) \nonumber \\&+\frac{1}{2}(\alpha ^*_{\tau },\varvec{\beta }^{*\top }_{\tau })^{\top }\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }). \end{aligned}$$
(13)

Proof

Observe that

$$\begin{aligned}&\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))\\&\quad = \frac{1}{2n}\varvec{\gamma }_{\tau }^{\top }\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top } \varvec{\gamma }_{\tau }\\&\qquad -\,\frac{1}{\sqrt{n}}\sum _{i=1}^{n} \widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }\varvec{\gamma }_{\tau }\\&\qquad -\,\sum _{i=1}^{n}\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau })\\&\qquad +\,\frac{1}{2}(\alpha ^*_{\tau },\varvec{\beta }^{*\top }_{\tau })^{\top }\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau })\\&\quad =\frac{1}{2}\varvec{\gamma }^{\top }_{\tau }\mathbb {V}_{n}\varvec{\gamma }_{\tau }+\mathbf {W}_{\tau ,n}^{\top }\varvec{\gamma }_{\tau }+C_{\tau ,n}, \end{aligned}$$

where \(\mathbb {V}_{n}=n^{-1}\sum _{i=1}^{n}(1,\mathbf {X}_{i})(1,\mathbf {X}^{\top }_{i})^{\top }\), and \(\mathbf {W}_{\tau ,n}\) and \(C_{\tau ,n}\) are defined in (12) and (13), respectively. It is easy to see that \(\mathbb {V}_{n}=\mathbb {V}+o_{p}(1)\), and therefore,

$$\begin{aligned}&\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))=\frac{1}{2}\varvec{\gamma }_{\tau }^{\top }\mathbb {V}\varvec{\gamma }_{\tau }\\&\quad +\,\mathbf {W}_{\tau ,n}^{\top }\varvec{\gamma }_{\tau }+C_{\tau ,n}+o_{p}(1). \end{aligned}$$

Provided that \(\mathbf {W}_{\tau ,n}\) is stochastically bounded, it follows from the convexity lemma (Pollard 1991) that the quadratic approximation to the convex function \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }))\) holds uniformly for \(\varvec{\gamma }_{\tau }\) in a compact set. It remains to prove that \(\mathbf {W}_{\tau ,n}\) is stochastically bounded.

Since \(\mathbf {W}_{\tau ,n}\) involves the quantity \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {X}_{i})\), which is data dependent rather than a deterministic function, we define

$$\begin{aligned} \mathbf {W}_{\tau ,n}(\phi _{\tau })=-\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X}_{i})(1,\mathbf {X}_{i}), \end{aligned}$$

where \(\phi _{\tau }: \mathbb {R}^{d+1} \rightarrow \mathbb {R}\), with value \(\phi _{\tau }(y|\mathbf {A}^{\top }\mathbf {x})\) at \((y,\mathbf {A}^\top \mathbf {x}) \in \mathbb {R}^{d+1}\), is a function in the class \(\Phi _{\tau }\) of elements of the non-separable space \(l^{\infty }(\mathbb {R}^{d+1})=\{\phi : \mathbb {R}^{d+1} \rightarrow \mathbb {R}: \left\| \phi \right\| _{\infty }:= \sup _{(y,\mathbf {A}^{\top }\mathbf {x}) \in \mathbb {R}^{d+1}} |\phi (y|\mathbf {A}^{\top }\mathbf {x})|<\infty \}\) satisfying \(E|\phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})|^2 < \infty \) and

$$\begin{aligned} E \left\| \phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})^2 \mathbf {X}\mathbf {X}^{\top } \right\| < \infty . \end{aligned}$$

Since \(\Phi _{\tau }\) includes \(Q_{\tau }(Y|\mathbf {A}^{\top }\mathbf {x})\) and, according to Lemma 1, includes \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) for n large enough, almost surely, we will prove that \(\mathbf {W}_{\tau ,n}(\phi _{\tau })\) is stochastically bounded, uniformly over \(\phi _{\tau } \in \Phi _{\tau }\).

Observe that

$$\begin{aligned}&\sup _{\phi _{\tau } \in \Phi _{\tau }} \left\| E \left\{ \mathbf {W}_{\tau ,n}(\phi _{\tau }) \mathbf {W}^{\top }_{\tau ,n}(\phi _{\tau }) \right\} \right\| \\&\quad \le \sup _{\phi _{\tau } \in \Phi _{\tau }} \frac{1}{n} \sum _{i=1}^{n} E \left\{ \phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X}_{i})^2 \left\| (1,\mathbf {X}_{i})(1,\mathbf {X}_{i}^{\top })^{\top } \right\| \right\} \\&\quad = O\left[ E \left\{ \phi _{\tau }(Y|\mathbf {A}^{\top }\mathbf {X})^2 \left\| (1,\mathbf {X})(1,\mathbf {X}^{\top })^{\top } \right\| \right\} \right] =O(1), \end{aligned}$$

which follows from the properties of the class \(\Phi _{\tau }\) defined above. A bounded second moment implies that \(\mathbf {W}_{\tau , n}(\phi _{\tau })\) is stochastically bounded. Since

  1. the result was proven uniformly over \(\phi _{\tau }\), and

  2. the class \(\Phi _{\tau }\) includes \(\widehat{Q}_{\tau }(Y|\widehat{\mathbf {A}}^{\top }\mathbf {x})\) for n large enough, almost surely,

the proof follows. \(\square \)

Appendix B.2: Proof of Theorem 4

To prove the \(\sqrt{n}\)-consistency of \(\widehat{\varvec{\beta }}_{\tau }\), it is enough to show that, for any given \(\delta _{\tau }>0\), there exists a constant \(C_{\tau }\) such that

$$\begin{aligned} \Pr \left\{ \inf _{\left\| \varvec{\gamma }_{\tau } \right\| \ge C_{\tau }} \widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))>\widehat{S}_{n}(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }) \right\} \ge 1-\delta _{\tau }, \end{aligned}$$
(14)

where \(\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha ^*_{\tau },\varvec{\beta }^*_{\tau }))\) is defined in (10). Inequality (14) implies that, with probability at least \(1-\delta _{\tau }\), there exists a local minimum in the ball \(\{\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }): \left\| \varvec{\gamma }_{\tau } \right\| \le C_{\tau }\}\). This in turn implies that there exists a local minimizer such that \(\left\| (\widehat{\alpha }_{\tau },\widehat{\varvec{\beta }}_{\tau })-(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }) \right\| =O_{p}\left( n^{-1/2} \right) \). The quadratic approximation derived in Lemma 2 yields that

$$\begin{aligned}&\widehat{S}_{n}(\varvec{\gamma }_{\tau }/\sqrt{n}+(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }))-\widehat{S}_{n}(\alpha _{\tau }^*,\varvec{\beta }^*_{\tau }) \nonumber \\&\quad =\frac{1}{2}\varvec{\gamma }_{\tau }^\top \mathbb {V}\varvec{\gamma }_{\tau }+\mathbf {W}^\top _{\tau ,n}\varvec{\gamma }_{\tau }+o_{p}(1), \end{aligned}$$
(15)

for any \(\varvec{\gamma }_{\tau }\) in a compact subset of \(\mathbb {R}^{p+1}\). Therefore, the difference (15) is dominated by the quadratic term \((1/2)\varvec{\gamma }_{\tau }^\top \mathbb {V}\varvec{\gamma }_{\tau }\) for \(\left\| \varvec{\gamma }_{\tau }\right\| \ge C_{\tau }\) with \(C_{\tau }\) sufficiently large. Hence, (14) follows. \(\square \)

Appendix B.3: Proof of Theorem 8

Let \(\widehat{\mathbf {V}}_{\tau }=(\widehat{\varvec{\beta }}_{\tau ,0}, \dots , \widehat{\varvec{\beta }}_{\tau ,p-1})\) be a \(p \times p\) matrix, where \(\widehat{\varvec{\beta }}_{\tau ,0}=\widehat{\varvec{\beta }}_{\tau }\), defined in (4), and \(\widehat{\varvec{\beta }}_{\tau ,j}=E_{n}\{\widehat{Q}_{\tau }(Y|\widehat{\varvec{\beta }}_{\tau ,j-1}^{\top }\mathbf {X})\mathbf {X}\}\) for \(j=1,\dots ,p-1\). Moreover, let \(\mathbf {V}_{\tau }\) denote the population-level counterpart of \(\widehat{\mathbf {V}}_{\tau }\). It is easy to see that \(\widehat{\mathbf {V}}_{\tau }\) converges to \(\mathbf {V}_{\tau }\) at the \(\sqrt{n}\)-rate; this follows from the central limit theorem and Lemma 1. Then, with \(\left\| \cdot \right\| \) denoting the Frobenius norm,

$$\begin{aligned} \left\| \widehat{\mathbf {V}}_{\tau } \widehat{\mathbf {V}}_{\tau }^{\top } - \mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top } \right\|\le & {} \left\| \widehat{\mathbf {V}}_{\tau } \widehat{\mathbf {V}}_{\tau }^{\top } - \widehat{\mathbf {V}}_{\tau } \mathbf {V}_{\tau }^{\top } \right\| \\&+ \left\| \widehat{\mathbf {V}}_{\tau } \mathbf {V}_{\tau }^{\top } - \mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top } \right\| \\= & {} O_{p}(n^{-1/2}), \end{aligned}$$

and the eigenvectors of \(\widehat{\mathbf {V}}_{\tau } \widehat{\mathbf {V}}_{\tau }^{\top }\) converge to the corresponding eigenvectors of \(\mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top }\). Finally, the subspace spanned by the \(d_{\tau }\) leading eigenvectors of \(\mathbf {V}_{\tau } \mathbf {V}_{\tau }^{\top }\) is contained in \(\mathcal {S}_{Q_{\tau }(Y|\mathbf {X})}\), and the proof is complete. \(\square \)
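The construction in this proof is directly computable. Below is a minimal Python sketch of the steps just described (our illustration, not released code): starting from \(\widehat{\varvec{\beta }}_{\tau ,0}=\widehat{\varvec{\beta }}_{\tau }\), each subsequent column of \(\widehat{\mathbf {V}}_{\tau }\) is the sample moment \(E_{n}\{\widehat{Q}_{\tau }(Y|\widehat{\varvec{\beta }}_{\tau ,j-1}^{\top }\mathbf {X})\mathbf {X}\}\), and the estimated basis of \(\mathcal {S}_{Q_{\tau }(Y|\mathbf {X})}\) consists of the leading \(d_{\tau }\) eigenvectors of \(\widehat{\mathbf {V}}_{\tau }\widehat{\mathbf {V}}_{\tau }^{\top }\). The helper `fit_quantile` stands in for any consistent conditional quantile estimator, e.g. the local linear fit sketched in “Appendix A”, and `beta0` for the estimate from (4).

```python
import numpy as np

def cqs_basis(X, Y, beta0, tau, d_tau, fit_quantile):
    """Estimated basis of the tau-th central quantile subspace.

    fit_quantile(z, Y, tau) must return the vector of fitted
    conditional quantiles Q_hat(Y | z_i) at the sample points z_i.
    """
    n, p = X.shape
    B = np.empty((p, p))
    B[:, 0] = beta0                     # beta_hat_{tau,0} from (4)
    for j in range(1, p):
        q = fit_quantile(X @ B[:, j - 1], Y, tau)
        B[:, j] = (X * q[:, None]).mean(axis=0)   # E_n{Q_hat * X}
    M = B @ B.T                         # V_hat V_hat'
    eigval, eigvec = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvec[:, ::-1][:, :d_tau]   # leading d_tau eigenvectors
```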

Cite this article

Christou, E. Central quantile subspace. Stat Comput 30, 677–695 (2020). https://doi.org/10.1007/s11222-019-09915-8

Keywords

  • Dimension reduction
  • Multi-index
  • Quantile regression
  • Single index
  • Statistical functional